Application Examples

This section contains example applications that demonstrate how to use TorchX for various styles of applications (e.g. single node, distributed). These apps can be launched on their own or as part of a pipeline. Note that TorchX’s job is only to launch the apps: the apps themselves are implemented without any TorchX imports.
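
To make the "no TorchX imports" point concrete, here is a minimal sketch of what such an app can look like. The script, its `--name` flag, and the greeting are hypothetical and are not part of the TorchX repository; the only thing it is meant to show is that an app is an ordinary Python program with a command-line entry point.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    # Hypothetical flag, used only for illustration.
    parser = argparse.ArgumentParser(description="hypothetical standalone app")
    parser.add_argument("--name", default="world", help="who to greet")
    return parser.parse_args(argv)


def main(argv: Optional[List[str]] = None) -> str:
    # Nothing here knows about TorchX: a launcher only needs to start
    # this script as a regular Python process and pass it arguments.
    args = parse_args(argv)
    message = f"hello {args.name}"
    print(message)
    return message


if __name__ == "__main__":
    main()
```

Because the script has no launcher-specific code, it can be run directly with `python` during development and handed to a launcher unchanged.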

See the Pipelines Examples for how to use the components in a pipeline.

Prerequisites

Before running the examples, install TorchX and the dependencies that the examples need:

$ pip install torchx
$ git clone https://github.com/pytorch/torchx.git
$ cd torchx/examples/apps
$ TORCHX_VERSION=$(torchx --version | sed 's/torchx-//')
$ git checkout v$TORCHX_VERSION
$ pip install -r dev-requirements.txt

Compute World Size Example

This is a minimal “hello world” style example application that uses PyTorch Distributed to compute the world size. It only initializes the torch.distributed process group and performs a single collective operation (all_reduce), which is enough to validate the infrastructure and scheduler setup.

This example is compatible with the dist.ddp built-in. To run it from the CLI:

$ cd $torchx-git-repo-root/torchx/examples/apps
$ torchx run dist.ddp --script compute_world_size/main.py -j 1x2
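
The idea behind the app can be sketched as follows: every rank contributes a tensor holding the value 1, and a sum all_reduce leaves the world size on every rank. This is an illustrative sketch, not the contents of compute_world_size/main.py. The helper name `dist_env`, the `gloo` backend, and the single-process defaults (including port 29500) are assumptions made here so the sketch can also run outside a launcher; under a launcher, `RANK`, `WORLD_SIZE`, `MASTER_ADDR`, and `MASTER_PORT` are expected to be provided in the environment.

```python
import os
from typing import Dict, Mapping


def dist_env(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return rendezvous settings, defaulting to one local process."""
    # Assumed defaults for a standalone run; a launcher normally sets these.
    defaults = {
        "RANK": "0",
        "WORLD_SIZE": "1",
        "MASTER_ADDR": "127.0.0.1",
        "MASTER_PORT": "29500",
    }
    return {key: environ.get(key, default) for key, default in defaults.items()}


def compute_world_size() -> int:
    # Imported lazily so dist_env() stays usable without PyTorch installed.
    import torch
    import torch.distributed as dist

    env = dist_env()
    os.environ.update(env)  # the default env:// rendezvous reads these values
    dist.init_process_group(
        backend="gloo",  # CPU backend, chosen here for portability
        rank=int(env["RANK"]),
        world_size=int(env["WORLD_SIZE"]),
    )
    try:
        ones = torch.ones(1)
        # In-place sum across all ranks: each rank adds its 1.
        dist.all_reduce(ones, op=dist.ReduceOp.SUM)
        return int(ones.item())
    finally:
        dist.destroy_process_group()
```

Calling `compute_world_size()` in every process of a job should return the same number on each rank, which is what makes it a cheap end-to-end check of the rendezvous and the collective path.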

Data Preprocessing Example

This is a simple TorchX app that downloads some data via HTTP, normalizes the images via torchvision, and then re-uploads the result via fsspec.

This example has two Python files: the app, which does the actual preprocessing, and the component definition, which can be used with TorchX to launch the app.
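
For readers unfamiliar with the normalization step, the arithmetic is per channel: output = (input - mean) / std, which is what torchvision's `Normalize` transform applies. The sketch below reproduces that arithmetic in plain Python on nested lists so it can be read without torchvision. The `normalize` function, the `Image` alias, and the mean and std values in the usage lines are illustrative assumptions, not values taken from the example app.

```python
from typing import List, Sequence

# Layout assumed here: image[channel][row][column], values scaled to [0, 1].
Image = List[List[List[float]]]


def normalize(image: Image, mean: Sequence[float], std: Sequence[float]) -> Image:
    """Apply (pixel - mean) / std independently to each channel."""
    if not (len(image) == len(mean) == len(std)):
        raise ValueError("need one mean and one std per channel")
    return [
        [[(pixel - m) / s for pixel in row] for row in channel]
        for channel, m, s in zip(image, mean, std)
    ]


# A 2-channel, 1x2 image with hypothetical statistics.
tiny = [[[0.5, 1.0]], [[0.0, 0.5]]]
print(normalize(tiny, mean=[0.5, 0.5], std=[0.5, 0.5]))
# → [[[0.0, 1.0]], [[-1.0, 0.0]]]
```

In the real app this step would operate on tensors rather than lists; the list version only shows which value goes where.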

Lightning Trainer Example

This example consists of model training and interpretability apps that use PyTorch Lightning. The apps share logic, so they are split across several files.

The trainer and interpret apps do not have any TorchX-isms and are simply torchvision and Captum applications. TorchX helps you run these applications on various schedulers and on localhost. The trainer app is a distributed data parallel style application and is launched with the dist.ddp built-in. The interpret app is a single node application and is launched as a regular Python process with the utils.python built-in.

For instructions on how to run these apps with TorchX, refer to the documentation in their respective main modules: train.py and interpret.py.