auto_dataloader

ignite.distributed.auto.auto_dataloader(dataset, **kwargs)

Helper method to create a dataloader adapted for non-distributed and distributed configurations (supporting all available backends from available_backends()).

Internally, we create a dataloader with provided kwargs while applying the following updates:

  • batch size is scaled by world size: batch_size / world_size, if it is greater than or equal to the world size.

  • number of workers is scaled by the number of local processes: num_workers / nprocs, if it is greater than or equal to the world size.

  • if no sampler is provided by the user, a torch DistributedSampler is set up.

  • if a torch DistributedSampler is provided by the user, it is used without wrapping it.

  • if another sampler is provided, it is wrapped by DistributedProxySampler.

  • if the default device is ‘cuda’, pin_memory is automatically set to True.
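
The two scaling rules at the top of the list above can be sketched in plain Python. This is a minimal illustration, not ignite's implementation: `scale_loader_kwargs` is a hypothetical helper, and the rounding (floor for the batch size, round-up for the workers) is an assumption, since the text only gives the ratios.

```python
def scale_loader_kwargs(kwargs, world_size, nprocs):
    """Return a copy of kwargs with per-process batch size and worker count.

    Hypothetical helper that mirrors the documented rules as worded;
    it is not part of ignite.
    """
    scaled = dict(kwargs)
    # Batch size: divide by world size only when it is at least the world size.
    if scaled.get("batch_size", 0) >= world_size:
        scaled["batch_size"] = scaled["batch_size"] // world_size  # assumed floor division
    # Workers: divide by the number of local processes, under the same threshold.
    if scaled.get("num_workers", 0) >= world_size:
        scaled["num_workers"] = -(-scaled["num_workers"] // nprocs)  # assumed round-up
    return scaled


# Example: 4 processes on a single node (world_size == nprocs == 4).
per_process = scale_loader_kwargs(
    {"batch_size": 32, "num_workers": 4, "shuffle": True}, world_size=4, nprocs=4
)
print(per_process)  # {'batch_size': 8, 'num_workers': 1, 'shuffle': True}
```

With a requested batch size of 32 across 4 processes, each process loads batches of 8, so the total number of samples per step stays at 32.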

Warning

A custom batch sampler is not adapted for the distributed configuration. Please make sure that the provided batch sampler is compatible with the distributed configuration.

Parameters
  • dataset (Dataset) – input torch dataset. If the input dataset is a torch IterableDataset, the dataloader is created without any distributed sampling. Please make sure that the dataset itself produces different data on different ranks.

  • kwargs (Any) – keyword arguments for torch DataLoader.
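
Because an IterableDataset is loaded without distributed sampling, the dataset itself has to split its stream across ranks. One common way to do that is strided slicing, sketched here in plain Python. `shard_for_rank` is a hypothetical helper; in a real dataset, the rank and world size would presumably come from `idist.get_rank()` and `idist.get_world_size()` inside `__iter__`.

```python
from itertools import islice


def shard_for_rank(stream, rank, world_size):
    """Return an iterator over every world_size-th item of stream, starting at index rank.

    Hypothetical helper; it is not part of ignite or torch.
    """
    return islice(stream, rank, None, world_size)


# Example: 10 samples split across 2 ranks.
print(list(shard_for_rank(range(10), rank=0, world_size=2)))  # [0, 2, 4, 6, 8]
print(list(shard_for_rank(range(10), rank=1, world_size=2)))  # [1, 3, 5, 7, 9]
```

The shards are disjoint and together cover the whole stream. If num_workers is greater than zero, each worker process receives its own copy of the dataset, so the stream would need a further split per worker.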

Returns

torch DataLoader or XLA MpDeviceLoader for XLA devices

Return type

Union[DataLoader, _MpDeviceLoader]

Examples

import ignite.distributed as idist

train_loader = idist.auto_dataloader(
    train_dataset,
    batch_size=32,
    num_workers=4,
    shuffle=True,
    pin_memory="cuda" in idist.device().type,
    drop_last=True,
)