torchaudio.prototype.pipelines

The pipelines subpackage contains APIs for loading models with pretrained weights, along with related utilities.

RNN-T Streaming/Non-Streaming ASR

Pretrained Models

EMFORMER_RNNT_BASE_MUSTC

Pre-trained Emformer-RNNT-based ASR pipeline, trained on the MuST-C dataset, capable of performing both streaming and non-streaming inference.

EMFORMER_RNNT_BASE_TEDLIUM3

Pre-trained Emformer-RNNT-based ASR pipeline, trained on the TED-LIUM Release 3 dataset, capable of performing both streaming and non-streaming inference.
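
As an illustration of how such a pipeline might be used for non-streaming inference, here is a minimal sketch. It assumes a torchaudio release that still ships torchaudio.prototype.pipelines and that these bundles expose the RNNTBundle interface from torchaudio.pipelines (sample_rate, get_feature_extractor(), get_decoder(), get_token_processor()). The helper name transcribe and its bundle parameter are hypothetical and exist only for this example.

```python
def transcribe(waveform, sample_rate, bundle=None, beam_width=10):
    """Transcribe a 1-D waveform with an RNN-T bundle (hypothetical helper).

    `bundle` is injectable so the data flow can be exercised without
    downloading weights; by default the TED-LIUM 3 bundle is used.
    """
    if bundle is None:
        # Assumption: the prototype subpackage is available in the installed torchaudio.
        from torchaudio.prototype.pipelines import EMFORMER_RNNT_BASE_TEDLIUM3 as bundle

    if sample_rate != bundle.sample_rate:
        # Resample only when the input rate differs from what the bundle expects.
        import torchaudio.functional as F
        waveform = F.resample(waveform, sample_rate, bundle.sample_rate)

    feature_extractor = bundle.get_feature_extractor()
    decoder = bundle.get_decoder()  # fetches pretrained weights on first use
    token_processor = bundle.get_token_processor()

    # Waveform -> features -> beam-search hypotheses -> text.
    features, length = feature_extractor(waveform)
    hypotheses = decoder(features, length, beam_width)
    # Take the token sequence of the top-ranked hypothesis.
    return token_processor(hypotheses[0][0])
```

In practice the call would typically be wrapped in torch.no_grad(). For streaming inference, the same interface provides get_streaming_feature_extractor(), and the decoder's infer() method carries state between segments.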

HiFiGAN Vocoder

Interface

HiFiGANVocoderBundle defines a HiFiGAN Vocoder pipeline that transforms mel spectrograms into waveforms.

HiFiGANVocoderBundle

Data class that bundles the information needed to use a pretrained HiFiGANVocoder.

Pretrained Models

HIFIGAN_VOCODER_V3_LJSPEECH

HiFiGAN Vocoder pipeline, trained on The LJ Speech Dataset [Ito and Johnson, 2017].
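
The mel-spectrogram-to-waveform flow can be sketched as a round trip: compute a mel spectrogram with the bundle's own transform, then feed it to the vocoder. This is a minimal sketch, assuming the bundle exposes sample_rate, get_mel_transform(), and get_vocoder(); the helper name mel_roundtrip and its bundle parameter are hypothetical.

```python
def mel_roundtrip(waveform, sample_rate, bundle=None):
    """Waveform -> mel spectrogram -> resynthesized waveform (hypothetical helper).

    `bundle` is injectable for testing; by default the LJ Speech bundle is used.
    """
    if bundle is None:
        # Assumption: the prototype subpackage is available in the installed torchaudio.
        from torchaudio.prototype.pipelines import HIFIGAN_VOCODER_V3_LJSPEECH as bundle

    if sample_rate != bundle.sample_rate:
        # The mel transform expects audio at the bundle's sample rate.
        import torchaudio.functional as F
        waveform = F.resample(waveform, sample_rate, bundle.sample_rate)

    mel_transform = bundle.get_mel_transform()
    vocoder = bundle.get_vocoder()  # fetches pretrained weights on first use

    mel_spec = mel_transform(waveform)
    return vocoder(mel_spec)
```

Using the bundle's own mel transform matters here, since the vocoder is only expected to behave well on spectrograms computed with the parameters it was trained on. For inference, the call would typically be wrapped in torch.no_grad().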

VGGish

Interface

VGGishBundle

VGGish [Hershey et al., 2017] inference pipeline ported from torchvggish and tensorflow-models.

VGGishBundle.VGGish

Implementation of the VGGish model [Hershey et al., 2017].

VGGishBundle.VGGishInputProcessor

Converts raw waveforms to batches of examples to use as inputs to VGGish.

Pretrained Models

VGGISH

Pre-trained VGGish [Hershey et al., 2017] inference pipeline ported from torchvggish and tensorflow-models.
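
The two VGGish components compose as input processor followed by model. The following is a minimal sketch, assuming the bundle exposes sample_rate, get_input_processor(), and get_model(), and that the input processor accepts a 1-D waveform at the bundle's sample rate; the helper name embed and its bundle parameter are hypothetical.

```python
def embed(waveform, sample_rate, bundle=None):
    """Compute VGGish embeddings for a 1-D waveform (hypothetical helper).

    `bundle` is injectable for testing; by default the VGGISH bundle is used.
    """
    if bundle is None:
        # Assumption: the prototype subpackage is available in the installed torchaudio.
        from torchaudio.prototype.pipelines import VGGISH as bundle

    if sample_rate != bundle.sample_rate:
        # The input processor expects audio at the bundle's sample rate.
        import torchaudio.functional as F
        waveform = F.resample(waveform, sample_rate, bundle.sample_rate)

    input_proc = bundle.get_input_processor()
    model = bundle.get_model()  # fetches pretrained weights on first use

    # Raw waveform -> batch of fixed-size examples -> one embedding per example.
    examples = input_proc(waveform)
    return model(examples)
```

A multi-channel recording would need to be reduced to a single channel (for example with waveform.squeeze(0) for mono files) before being passed in.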
