torch.cuda.comm.reduce_add(inputs, destination=None)

Sums tensors from multiple GPUs.

All inputs should have the same shape, dtype, and layout. The output tensor has that same shape, dtype, and layout.

Parameters
• inputs (Iterable[Tensor]) – an iterable of tensors to add.

• destination (int, optional) – a device on which the output will be placed (default: current device).

Returns

A tensor containing an elementwise sum of all inputs, placed on the destination device.
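
Example

The following is a minimal sketch of how the call might be used, not an official example. The helper name `sum_on_device` and the sample values are hypothetical. It assumes, to the best of my understanding, that `reduce_add` expects its inputs to live on CUDA devices, so when no GPU is visible the sketch falls back to an ordinary CPU sum purely so that it still runs.

```python
import torch
import torch.cuda.comm as comm


def sum_on_device(values, destination=0):
    """Hypothetical helper: spread `values` over the visible GPUs and sum them.

    `destination` is forwarded unchanged to reduce_add as the output device index.
    """
    n_gpus = torch.cuda.device_count()
    if n_gpus == 0:
        # Assumption: reduce_add needs CUDA tensors, so with no GPU available
        # we only illustrate the arithmetic with a plain CPU sum.
        return torch.stack([torch.tensor(v) for v in values]).sum(dim=0)

    # One tensor per entry, placed round-robin on the available GPUs.
    # All tensors share the same shape, dtype, and layout.
    inputs = [
        torch.tensor(v, device=f"cuda:{i % n_gpus}")
        for i, v in enumerate(values)
    ]
    # Elementwise sum of all inputs, placed on the destination device.
    return comm.reduce_add(inputs, destination=destination)


result = sum_on_device([[1.0, 2.0], [10.0, 20.0], [100.0, 200.0]])
print(result.cpu().tolist())
```

On a machine with several GPUs, the returned tensor should report the device given by `destination`; omitting `destination` leaves the result on the current device, as described above.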