DistributedReadingService
- class torchdata.dataloader2.DistributedReadingService(timeout: int = 1800)
DistributedReadingService handles distributed sharding on the graph of DataPipe and guarantees consistent randomness by sharing the same seed across the distributed processes.
- Parameters:
timeout – Timeout, in seconds, for operations executed against the process group. Defaults to 1800 (30 minutes).
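The reason a shared seed matters for sharding can be sketched in plain Python. This is a conceptual illustration only, not torchdata code: `shard_for_rank` is a hypothetical helper, and the round-robin split is an assumed sharding scheme. If every rank shuffles with the same seed, all ranks see the same order, so taking every `world_size`-th element yields disjoint shards that together cover the dataset.

```python
import random


def shard_for_rank(data, rank, world_size, seed):
    """Hypothetical helper: shuffle with a given seed, then keep this rank's slice."""
    order = list(data)
    # A dedicated RNG seeded identically on every rank produces the same order.
    random.Random(seed).shuffle(order)
    # Round-robin split (assumed scheme): rank r keeps items r, r + world_size, ...
    return order[rank::world_size]


world_size = 4
data = range(10)

# Every simulated rank uses the same seed, so the shards never overlap.
shards = [shard_for_rank(data, rank, world_size, seed=123) for rank in range(world_size)]
seen = sorted(item for shard in shards for item in shard)
```

If each rank used its own seed instead, the shuffled orders would differ and the same sample could land in several shards while others were dropped.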
- finalize() → None
Clean up the distributed process group.
- initialize(datapipe: Union[IterDataPipe, MapDataPipe]) → Union[IterDataPipe, MapDataPipe]
Launches the gloo-backend distributed process group. Carries out distributed sharding on the graph of DataPipe and returns the graph with a FullSyncIterDataPipe attached at the end.
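The effect of a synchronization step at the end of the graph can be approximated in plain Python. The sketch below assumes that synchronization ends iteration on every rank as soon as the shortest shard is exhausted, so that no rank waits forever on a peer that has already finished. `full_sync` is a hypothetical name used for illustration and is not part of torchdata.

```python
def full_sync(shards):
    """Hypothetical helper: truncate every shard to the length of the shortest one."""
    shortest = min(len(shard) for shard in shards)
    return [shard[:shortest] for shard in shards]


# Ten samples split round-robin over four simulated ranks give uneven shards.
shards = [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
synced = full_sync(shards)
```

In this sketch, ranks 0 and 1 each give up their last sample so that all four ranks run the same number of steps.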
- initialize_iteration(seed_generator: SeedGenerator, iter_reset_fn: Optional[Callable[[Union[IterDataPipe, MapDataPipe]], Union[IterDataPipe, MapDataPipe]]] = None) → Optional[Callable[[Union[IterDataPipe, MapDataPipe]], Union[IterDataPipe, MapDataPipe]]]
Shares the same seed from rank 0 with the other ranks across the distributed processes and applies the random seed to the DataPipe graph.