torchaudio.prototype.models

The torchaudio.prototype.models subpackage contains definitions of models for addressing common audio tasks.

Note

For models with pre-trained parameters, please refer to the torchaudio.prototype.pipelines module.

Model definitions are responsible for constructing computation graphs and executing them.

Some models have a complex structure and come in several variations. For such models, factory functions are provided.
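The factory-function pattern mentioned above can be sketched in plain Python. This is only an illustration of the idea, not torchaudio code: `TinyEncoder`, `tiny_encoder`, `tiny_encoder_base`, and `tiny_encoder_large` are hypothetical names, and the hyperparameter values are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class TinyEncoder:
    """Hypothetical stand-in for a model class with many constructor options."""

    input_dim: int
    num_layers: int
    num_heads: int
    dropout: float

    def describe(self) -> str:
        return (
            f"{self.num_layers} layers x {self.num_heads} heads, "
            f"input_dim={self.input_dim}"
        )


def tiny_encoder(
    input_dim: int, num_layers: int, num_heads: int, dropout: float = 0.1
) -> TinyEncoder:
    """Generic factory: checks the options once, then builds the model."""
    if input_dim % num_heads != 0:
        raise ValueError("input_dim must be divisible by num_heads")
    return TinyEncoder(input_dim, num_layers, num_heads, dropout)


def tiny_encoder_base() -> TinyEncoder:
    """Named variation: a small preset (values are illustrative only)."""
    return tiny_encoder(input_dim=256, num_layers=4, num_heads=4)


def tiny_encoder_large() -> TinyEncoder:
    """Named variation: a larger preset (values are illustrative only)."""
    return tiny_encoder(input_dim=512, num_layers=12, num_heads=8)
```

Callers pick a named variation such as `tiny_encoder_base()` instead of repeating every hyperparameter, while the generic factory remains available for custom configurations.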

ConformerWav2Vec2PretrainModel

Conformer Wav2Vec2 pre-train model for training from scratch.

ConvEmformer

Implements the convolution-augmented streaming transformer architecture introduced in Streaming Transformer Transducer based Speech Recognition Using Non-Causal Convolution [Shi et al., 2022].

HiFiGANVocoder

Generator part of HiFi GAN [Kong et al., 2020].

Prototype Factory Functions of Beta Models

Some model definitions are already in beta, but newer factory functions for them are still in prototype. Please check the “Prototype Factory Functions” section of each model.

Wav2Vec2Model

Acoustic model used in wav2vec 2.0 [Baevski et al., 2020].

RNNT

Recurrent neural network transducer (RNN-T) model.
