.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_apps_compute_world_size_main.py:

Compute World Size Example
==========================

This is a minimal "hello world" style example application that uses
PyTorch Distributed to compute the world size. It does not do ML training,
but it does initialize process groups and perform a single
collective operation (``all_reduce``), which is enough to validate the
infrastructure and scheduler setup.

As simple as this application is, the actual ``compute_world_size()``
function is split into a separate submodule
(``.module.util.compute_world_size``) so that it doubles as an end-to-end (E2E)
test for the workspace patching logic, which typically diff-patches a full project
directory rather than a single file. This application also uses
`Hydra <https://hydra.cc/>`_ configs as an expository example of how to use
Hydra configs in an application that launches with TorchX.

Run it with the ``dist.ddp`` builtin component as a validation application
to ensure that the stack has been set up properly before running more serious
distributed training jobs.

.. code-block:: default

    import hydra
    from omegaconf import DictConfig, OmegaConf
    from torch.distributed.elastic.multiprocessing.errors import record
    from torchx.examples.apps.compute_world_size.module.util import compute_world_size


    @record
    def run(cfg: DictConfig) -> None:
        # print the resolved config so it shows up in the job logs
        print(OmegaConf.to_yaml(cfg))

        if cfg.main.throws:
            raise RuntimeError(f"raising error because cfg.main.throws={cfg.main.throws}")
        compute_world_size(cfg)


    if __name__ == "__main__":
        # Use the compose API to make this compatible with IPython notebooks.
        # The config directory is initialized as a module so that it does not
        # depend on the relative path (PWD) or the absolute path (torchx install dir).
        # see: https://hydra.cc/docs/advanced/jupyter_notebooks/
        with hydra.initialize_config_module(
            config_module="torchx.examples.apps.compute_world_size.config"
        ):
            cfg: DictConfig = hydra.compose(config_name="defaults")
            run(cfg)
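The body of ``compute_world_size()`` is not shown on this page. As a rough illustration of why a single ``all_reduce`` is enough to recover the world size, the sketch below simulates the idea with standard-library threads only: every rank contributes the value ``1``, and a sum reduction hands the total back to all ranks. The names ``simulated_all_reduce_sum`` and ``compute_world_size_simulated`` are hypothetical and are not part of TorchX or PyTorch. The real application would go through ``torch.distributed`` process groups rather than threads.

```python
import threading
from typing import List


def simulated_all_reduce_sum(
    contributions: List[int],
    rank: int,
    value: int,
    barrier: threading.Barrier,
) -> int:
    """Hypothetical stand-in for a sum all_reduce across ranks.

    Each rank writes its value into a shared slot, waits until every
    rank has done the same, then reads back the sum of all slots.
    """
    contributions[rank] = value
    # Block until all ranks have written their contribution.
    barrier.wait()
    return sum(contributions)


def compute_world_size_simulated(num_ranks: int) -> List[int]:
    """Run num_ranks simulated workers; return the world size seen by each."""
    contributions = [0] * num_ranks
    results = [0] * num_ranks
    barrier = threading.Barrier(num_ranks)

    def worker(rank: int) -> None:
        # Every rank contributes 1, so the reduced sum equals the rank count.
        results[rank] = simulated_all_reduce_sum(contributions, rank, 1, barrier)

    threads = [
        threading.Thread(target=worker, args=(rank,)) for rank in range(num_ranks)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    # Each simulated rank should report the same world size.
    print(compute_world_size_simulated(4))
```

Because every rank receives the same reduced value, a mismatch between that value and the number of replicas requested from the scheduler would point to a setup problem, which is what makes this kind of check usable as an infrastructure smoke test.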