Building from source

TorchAudio integrates PyTorch for numerical computation and third-party libraries for multimedia I/O. Building it from source requires the following tools:

PyTorch
CMake
Ninja
C++ compiler with C++17 support
- GCC (Linux)
- Clang (macOS)
- MSVC 2019 or newer (Windows)
pkg-config (Linux/macOS, if building sox extension)
CUDA toolkit and cuDNN (if building CUDA extension)

Most of these tools are available through Conda, so we recommend installing them in a Conda environment.
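
Before starting a build, it can help to confirm which of the tools listed above are visible from the current environment. The following is a minimal sketch, not part of TorchAudio: the executable names (`cmake`, `ninja`, `pkg-config`, `nvcc`) are assumptions about how each tool is usually installed, and the helper names are hypothetical.

```python
import importlib.util
import shutil

# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = ["cmake", "ninja"]
OPTIONAL_TOOLS = ["pkg-config", "nvcc"]  # sox extension / CUDA extension


def check_build_tools(tools):
    """Map each executable name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in tools}


def has_pytorch():
    """Return True if the torch package can be located, without importing it."""
    return importlib.util.find_spec("torch") is not None


if __name__ == "__main__":
    print("PyTorch found:", has_pytorch())
    for name, path in check_build_tools(REQUIRED_TOOLS + OPTIONAL_TOOLS).items():
        print(f"{name}: {path or 'not found'}")
```

This only checks that a tool can be found; it does not verify versions or C++17 support in the compiler.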

Customizing the build

TorchAudio’s integration with third-party libraries can be enabled or disabled via environment variables. Set a variable to 1 to enable the corresponding feature, or to 0 to disable it.

BUILD_SOX: Enable/disable I/O features based on libsox.
BUILD_KALDI: Enable/disable feature extraction based on Kaldi.
BUILD_RNNT: Enable/disable custom RNN-T loss function.
BUILD_CTC_DECODER: Enable/disable CTC decoder based on Flashlight Text.
USE_FFMPEG: Enable/disable I/O features based on FFmpeg libraries.
USE_ROCM: Enable/disable AMD ROCm support.
USE_CUDA: Enable/disable CUDA support.

For the latest options and their default values, see the source code: https://github.com/pytorch/audio/blob/main/tools/setup_helpers/extension.py
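
As an illustration of the 1/0 convention, the variables listed above could be prepared from Python before launching a build. This is a sketch under stated assumptions: `make_build_env` and `KNOWN_FLAGS` are hypothetical names introduced here, and the flag list simply mirrors this page rather than the current source.

```python
import os

# Flag names copied from the list above; the authoritative set lives in the source code.
KNOWN_FLAGS = {
    "BUILD_SOX",
    "BUILD_KALDI",
    "BUILD_RNNT",
    "BUILD_CTC_DECODER",
    "USE_FFMPEG",
    "USE_ROCM",
    "USE_CUDA",
}


def make_build_env(flags, base_env=None):
    """Return a copy of the environment with each flag set to "1" or "0"."""
    env = dict(os.environ if base_env is None else base_env)
    for name, enabled in flags.items():
        if name not in KNOWN_FLAGS:
            raise ValueError(f"unknown build flag: {name}")
        env[name] = "1" if enabled else "0"
    return env


if __name__ == "__main__":
    # Example: disable the libsox integration and enable CUDA support.
    env = make_build_env({"BUILD_SOX": False, "USE_CUDA": True})
    print({k: v for k, v in env.items() if k in KNOWN_FLAGS})
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when invoking whichever build command you use. Flags that are not set explicitly keep their default values.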
