Template Function torch::nn::parallel::data_parallel

Function Documentation

template<typename ModuleType>
Tensor torch::nn::parallel::data_parallel(ModuleType module, Tensor input, std::optional<std::vector<Device>> devices = std::nullopt, std::optional<Device> output_device = std::nullopt, int64_t dim = 0)

Evaluates module(input) in parallel across the given devices.

If devices is not supplied, the invocation is parallelized across all available CUDA devices. If output_device is supplied, the final, combined tensor will be placed on this device. If not, it defaults to the first device in devices.

In detail, this function performs the following four distinct steps:

  1. Scatter the input to the given devices,

  2. Replicate (deep clone) the model on each device,

  3. Evaluate each module with its input on its device,

  4. Gather the outputs of each replica into a single output tensor, located on the output_device.
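
The four steps can be illustrated with a toy model that uses only the C++ standard library. This is a minimal sketch, not the libtorch implementation: `ToyTensor`, `ToyScale`, `scatter`, and `toy_data_parallel` are hypothetical names introduced here, a "device" is just a chunk index, the split is assumed to be along dimension 0 of a flat vector, and the replicas are evaluated one after another rather than in parallel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor: a flat vector split along dim 0.
using ToyTensor = std::vector<float>;

// Hypothetical stand-in for a module: a copyable callable with one parameter.
struct ToyScale {
  float factor;
  ToyTensor operator()(const ToyTensor& x) const {
    ToyTensor y;
    y.reserve(x.size());
    for (float v : x) y.push_back(v * factor);
    return y;
  }
};

// Step 1: split the input into contiguous chunks, at most one per device.
std::vector<ToyTensor> scatter(const ToyTensor& input, std::size_t num_devices) {
  assert(num_devices > 0);
  std::vector<ToyTensor> chunks;
  // Ceiling division so that every element lands in some chunk.
  const std::size_t chunk = (input.size() + num_devices - 1) / num_devices;
  for (std::size_t begin = 0; begin < input.size(); begin += chunk) {
    const std::size_t end = std::min(begin + chunk, input.size());
    chunks.emplace_back(input.begin() + static_cast<std::ptrdiff_t>(begin),
                        input.begin() + static_cast<std::ptrdiff_t>(end));
  }
  return chunks;
}

template <typename Module>
ToyTensor toy_data_parallel(const Module& module, const ToyTensor& input,
                            std::size_t num_devices) {
  // 1. Scatter the input.
  const std::vector<ToyTensor> chunks = scatter(input, num_devices);

  // 2. Replicate: one independent copy of the module per chunk.
  const std::vector<Module> replicas(chunks.size(), module);

  // 3. Evaluate each replica on its own chunk (sequentially in this toy).
  std::vector<ToyTensor> outputs;
  outputs.reserve(chunks.size());
  for (std::size_t i = 0; i < chunks.size(); ++i) {
    outputs.push_back(replicas[i](chunks[i]));
  }

  // 4. Gather: concatenate the per-replica outputs in their original order.
  ToyTensor result;
  for (const ToyTensor& out : outputs) {
    result.insert(result.end(), out.begin(), out.end());
  }
  return result;
}
```

For example, `toy_data_parallel(ToyScale{2.0f}, {1, 2, 3, 4, 5}, 2)` splits the input into chunks of three and two elements, scales each chunk with its own copy of the module, and concatenates the results into `{2, 4, 6, 8, 10}`, the same values a single call to the module on the whole input would produce.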
