TorchX

TorchX is an SDK for quickly building and deploying ML applications from R&D to production. It offers various builtin components that encode MLOps best practices and make advanced features such as distributed training and hyperparameter optimization accessible to all. Users can get started with TorchX with no added setup cost, since it supports popular ML schedulers and pipeline orchestrators that are already widely adopted and deployed in production.

No two production environments are the same. To accommodate a wide range of use cases, TorchX’s core APIs allow extensive customization at well-defined extension points, so that even the most unique applications can be supported without customizing the whole vertical stack.

GETTING STARTED? First learn the basic concepts and follow the quickstart guide.

In 1-2-3

01 DEFINE OR CHOOSE Start by writing a component – a Python function that returns an AppDef object for your application. Or you can choose one of the builtin components.
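
For illustration, a minimal component might look like the sketch below. The module path my_project.train, the container image, and the lr parameter are hypothetical placeholders; only AppDef and Role come from TorchX.

    import torchx.specs as specs

    def trainer(image: str, lr: float = 0.01) -> specs.AppDef:
        """Hypothetical training component: a single-replica role that runs a training script."""
        return specs.AppDef(
            name="trainer",
            roles=[
                specs.Role(
                    name="trainer",
                    image=image,  # container image with your code installed (assumed)
                    entrypoint="python",
                    args=["-m", "my_project.train", "--lr", str(lr)],
                )
            ],
        )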

02 RUN AS A JOB Once you’ve defined or chosen a component, you can run it by submitting it as a job to one of the supported Schedulers. TorchX supports several popular ones, such as Kubernetes and SLURM, out of the box.
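
As a sketch of this step, the hypothetical trainer component above can be submitted programmatically through the runner API (the torchx run CLI is the equivalent command-line path). The image name is again a placeholder, and local_cwd simply runs the job on the local machine.

    from torchx.runner import get_runner

    runner = get_runner()
    app_handle = runner.run(
        trainer(image="my_registry/trainer:latest"),  # hypothetical image
        scheduler="local_cwd",  # e.g. "kubernetes" or "slurm" in production
    )
    print(runner.status(app_handle))  # poll the job's status
    runner.wait(app_handle)           # block until it finishes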

03 CONVERT TO PIPELINE In production, components are often run as part of a workflow (aka pipeline). TorchX components can be converted to pipeline stages by passing them through the torchx.pipelines adapters. Pipelines lists the pipeline orchestrators supported out of the box.
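
To make this concrete, here is a rough sketch using the Kubeflow Pipelines (KFP v1) adapter. It reuses the hypothetical trainer component from step 01 and assumes the kfp package is installed.

    from kfp import compiler
    from torchx.pipelines.kfp.adapter import container_from_app

    def pipeline() -> None:
        # each TorchX AppDef becomes one KFP pipeline stage (a container op)
        app = trainer(image="my_registry/trainer:latest")  # hypothetical image
        container_from_app(app)

    # compile to a package that the KFP orchestrator can execute
    compiler.Compiler().compile(pipeline_func=pipeline, package_path="pipeline.yaml")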

Documentation

  Components Library
  Runtime Library

Works With

  Pipeline Adapters

Experimental

  Experimental Features
