Shortcuts

torch.cuda.comm.broadcast_coalesced

torch.cuda.comm.broadcast_coalesced(tensors, devices, buffer_size=10485760)[source]

Broadcasts a sequence tensors to the specified GPUs. Small tensors are first coalesced into a buffer to reduce the number of synchronizations.

Parameters
  • tensors (sequence) – tensors to broadcast. Must be on the same device, either CPU or GPU.

  • devices (Iterable[torch.device, str or int]) – an iterable of GPU devices, among which to broadcast.

  • buffer_size (int) – maximum size of the buffer used for coalescing

Returns

A tuple containing copies of tensor, placed on devices.

Docs

Access comprehensive developer documentation for PyTorch

View Docs

Tutorials

Get in-depth tutorials for beginners and advanced developers

View Tutorials

Resources

Find development resources and get your questions answered

View Resources