torchaudio.prototype.models¶
conformer_rnnt_model¶
- torchaudio.prototype.models.conformer_rnnt_model(*, input_dim: int, encoding_dim: int, time_reduction_stride: int, conformer_input_dim: int, conformer_ffn_dim: int, conformer_num_layers: int, conformer_num_heads: int, conformer_depthwise_conv_kernel_size: int, conformer_dropout: float, num_symbols: int, symbol_embedding_dim: int, num_lstm_layers: int, lstm_hidden_dim: int, lstm_layer_norm: int, lstm_layer_norm_epsilon: int, lstm_dropout: int, joiner_activation: str) RNNT [source]¶
Builds a Conformer-based recurrent neural network transducer (RNN-T) model.
- Parameters:
input_dim (int) – dimension of input sequence frames passed to transcription network.
encoding_dim (int) – dimension of transcription- and prediction-network-generated encodings passed to joint network.
time_reduction_stride (int) – factor by which to reduce length of input sequence.
conformer_input_dim (int) – dimension of Conformer input.
conformer_ffn_dim (int) – hidden layer dimension of each Conformer layer’s feedforward network.
conformer_num_layers (int) – number of Conformer layers to instantiate.
conformer_num_heads (int) – number of attention heads in each Conformer layer.
conformer_depthwise_conv_kernel_size (int) – kernel size of each Conformer layer’s depthwise convolution layer.
conformer_dropout (float) – Conformer dropout probability.
num_symbols (int) – cardinality of set of target tokens.
symbol_embedding_dim (int) – dimension of each target token embedding.
num_lstm_layers (int) – number of LSTM layers to instantiate.
lstm_hidden_dim (int) – output dimension of each LSTM layer.
lstm_layer_norm (bool) – if True, enables layer normalization for LSTM layers.
lstm_layer_norm_epsilon (float) – value of epsilon to use in LSTM layer normalization layers.
lstm_dropout (float) – LSTM dropout probability.
joiner_activation (str) – activation function to use in the joiner. Must be one of (“relu”, “tanh”). (Default: “relu”)
- Returns:
Conformer RNN-T model.
- Return type:
RNNT
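Example (a minimal construction and forward-pass sketch; the hyperparameter values and tensor shapes below are illustrative only, not library defaults):
>>> import torch
>>> from torchaudio.prototype.models import conformer_rnnt_model
>>> rnnt = conformer_rnnt_model(
...     input_dim=80,
...     encoding_dim=1024,
...     time_reduction_stride=4,
...     conformer_input_dim=256,
...     conformer_ffn_dim=1024,
...     conformer_num_layers=16,
...     conformer_num_heads=4,
...     conformer_depthwise_conv_kernel_size=31,
...     conformer_dropout=0.1,
...     num_symbols=1024,
...     symbol_embedding_dim=256,
...     num_lstm_layers=2,
...     lstm_hidden_dim=512,
...     lstm_layer_norm=True,
...     lstm_layer_norm_epsilon=1e-5,
...     lstm_dropout=0.3,
...     joiner_activation="tanh",
... )
>>> # Joint training forward pass over dummy features and token targets.
>>> sources = torch.rand(2, 200, 80)                            # (batch, time, input_dim)
>>> source_lengths = torch.full((2,), 200, dtype=torch.int32)
>>> targets = torch.randint(0, 1024, (2, 20), dtype=torch.int32)
>>> target_lengths = torch.full((2,), 20, dtype=torch.int32)
>>> output, src_lengths, tgt_lengths, _ = rnnt(sources, source_lengths, targets, target_lengths)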
conformer_rnnt_base¶
emformer_hubert_model¶
- torchaudio.prototype.models.emformer_hubert_model(extractor_input_dim: int, extractor_output_dim: int, extractor_use_bias: bool, extractor_stride: int, encoder_input_dim: int, encoder_output_dim: int, encoder_num_heads: int, encoder_ffn_dim: int, encoder_num_layers: int, encoder_segment_length: int, encoder_left_context_length: int, encoder_right_context_length: int, encoder_dropout: float, encoder_activation: str, encoder_max_memory_size: int, encoder_weight_init_scale_strategy: Optional[str], encoder_tanh_on_mem: bool, aux_num_out: Optional[int]) Wav2Vec2Model [source]¶
Build a custom Emformer HuBERT model.
- Parameters:
extractor_input_dim (int) – The input dimension for feature extractor.
extractor_output_dim (int) – The output dimension after feature extractor.
extractor_use_bias (bool) – If True, enables the bias parameter in the linear layer of the feature extractor.
extractor_stride (int) – Number of frames to merge for the output frame in feature extractor.
encoder_input_dim (int) – The input dimension for Emformer layer.
encoder_output_dim (int) – The output dimension after EmformerEncoder.
encoder_num_heads (int) – Number of attention heads in each Emformer layer.
encoder_ffn_dim (int) – Hidden layer dimension of feedforward network in Emformer.
encoder_num_layers (int) – Number of Emformer layers to instantiate.
encoder_segment_length (int) – Length of each input segment.
encoder_left_context_length (int) – Length of left context.
encoder_right_context_length (int) – Length of right context.
encoder_dropout (float) – Dropout probability.
encoder_activation (str) – Activation function to use in each Emformer layer’s feedforward network. Must be one of (“relu”, “gelu”, “silu”).
encoder_max_memory_size (int) – Maximum number of memory elements to use.
encoder_weight_init_scale_strategy (str or None) – Per-layer weight initialization scaling strategy. Must be one of (“depthwise”, “constant”, None).
encoder_tanh_on_mem (bool) – If True, applies tanh to memory elements.
aux_num_out (int or None) – When provided, attaches an extra linear layer on top of the encoder, which can be used for fine-tuning.
- Returns:
The resulting torchaudio.models.Wav2Vec2Model model with a torchaudio.models.Emformer encoder.
- Return type:
Wav2Vec2Model
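Example (a construction sketch; the values below are illustrative, loosely following the base configuration, and not library defaults):
>>> from torchaudio.prototype.models import emformer_hubert_model
>>> model = emformer_hubert_model(
...     extractor_input_dim=80,
...     extractor_output_dim=128,
...     extractor_use_bias=False,
...     extractor_stride=4,
...     encoder_input_dim=512,
...     encoder_output_dim=1024,
...     encoder_num_heads=8,
...     encoder_ffn_dim=2048,
...     encoder_num_layers=20,
...     encoder_segment_length=4,
...     encoder_left_context_length=30,
...     encoder_right_context_length=1,
...     encoder_dropout=0.1,
...     encoder_activation="gelu",
...     encoder_max_memory_size=0,
...     encoder_weight_init_scale_strategy="depthwise",
...     encoder_tanh_on_mem=True,
...     aux_num_out=None,
... )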
emformer_hubert_base¶
- torchaudio.prototype.models.emformer_hubert_base(extractor_input_dim: int = 80, extractor_output_dim: int = 128, encoder_dropout: float = 0.1, aux_num_out: Optional[int] = None) Wav2Vec2Model [source]¶
Build Emformer HuBERT Model with 20 Emformer layers.
- Parameters:
extractor_input_dim (int, optional) – The input dimension for feature extractor. (Default: 80)
extractor_output_dim (int, optional) – The output dimension after feature extractor. (Default: 128)
encoder_dropout (float, optional) – Dropout probability in Emformer. (Default: 0.1)
aux_num_out (int or None, optional) – Output dimension of aux layer for fine-tuning. (Default: None)
- Returns:
The resulting torchaudio.models.Wav2Vec2Model model with a torchaudio.models.Emformer encoder.
- Return type:
Wav2Vec2Model
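Example (a usage sketch; it assumes the model consumes precomputed 80-dimensional feature frames of shape (batch, frames, extractor_input_dim) rather than raw waveforms, and the aux_num_out value is illustrative):
>>> import torch
>>> from torchaudio.prototype.models import emformer_hubert_base
>>> model = emformer_hubert_base(aux_num_out=500)   # 500-way output head for fine-tuning
>>> features = torch.rand(2, 400, 80)               # assumed (batch, frames, extractor_input_dim)
>>> lengths = torch.tensor([400, 360])
>>> logits, out_lengths = model(features, lengths)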
ConvEmformer¶
- class torchaudio.prototype.models.ConvEmformer(input_dim: int, num_heads: int, ffn_dim: int, num_layers: int, segment_length: int, kernel_size: int, dropout: float = 0.0, ffn_activation: str = 'relu', left_context_length: int = 0, right_context_length: int = 0, max_memory_size: int = 0, weight_init_scale_strategy: Optional[str] = 'depthwise', tanh_on_mem: bool = False, negative_inf: float = -100000000.0, conv_activation: str = 'silu')[source]¶
Implements the convolution-augmented streaming transformer architecture introduced in Streaming Transformer Transducer based Speech Recognition Using Non-Causal Convolution [Shi et al., 2022].
- Parameters:
input_dim (int) – input dimension.
num_heads (int) – number of attention heads in each ConvEmformer layer.
ffn_dim (int) – hidden layer dimension of each ConvEmformer layer’s feedforward network.
num_layers (int) – number of ConvEmformer layers to instantiate.
segment_length (int) – length of each input segment.
kernel_size (int) – size of kernel to use in convolution modules.
dropout (float, optional) – dropout probability. (Default: 0.0)
ffn_activation (str, optional) – activation function to use in feedforward networks. Must be one of (“relu”, “gelu”, “silu”). (Default: “relu”)
left_context_length (int, optional) – length of left context. (Default: 0)
right_context_length (int, optional) – length of right context. (Default: 0)
max_memory_size (int, optional) – maximum number of memory elements to use. (Default: 0)
weight_init_scale_strategy (str or None, optional) – per-layer weight initialization scaling strategy. Must be one of (“depthwise”, “constant”, None). (Default: “depthwise”)
tanh_on_mem (bool, optional) – if True, applies tanh to memory elements. (Default: False)
negative_inf (float, optional) – value to use for negative infinity in attention weights. (Default: -1e8)
conv_activation (str, optional) – activation function to use in convolution modules. Must be one of (“relu”, “gelu”, “silu”). (Default: “silu”)
Examples
>>> conv_emformer = ConvEmformer(80, 4, 1024, 12, 16, 8, right_context_length=4)
>>> input = torch.rand(10, 200, 80)
>>> lengths = torch.randint(1, 200, (10,))
>>> output, lengths = conv_emformer(input, lengths)
>>> input = torch.rand(4, 20, 80)
>>> lengths = torch.ones(4) * 20
>>> output, lengths, states = conv_emformer.infer(input, lengths, None)
- forward(input: Tensor, lengths: Tensor) Tuple[Tensor, Tensor] ¶
Forward pass for training and non-streaming inference.
B: batch size; T: max number of input frames in batch; D: feature dimension of each frame.
- Parameters:
input (torch.Tensor) – utterance frames right-padded with right context frames, with shape (B, T + right_context_length, D).
lengths (torch.Tensor) – with shape (B,) and i-th element representing number of valid utterance frames for i-th batch element in input.
- Returns:
- Tensor
output frames, with shape (B, T, D).
- Tensor
output lengths, with shape (B,) and i-th element representing number of valid frames for i-th batch element in output frames.
- Return type:
(Tensor, Tensor)
- infer(input: Tensor, lengths: Tensor, states: Optional[List[List[Tensor]]] = None) Tuple[Tensor, Tensor, List[List[Tensor]]] ¶
Forward pass for streaming inference.
B: batch size; D: feature dimension of each frame.
- Parameters:
input (torch.Tensor) – utterance frames right-padded with right context frames, with shape (B, segment_length + right_context_length, D).
lengths (torch.Tensor) – with shape (B,) and i-th element representing number of valid frames for i-th batch element in input.
states (List[List[torch.Tensor]] or None, optional) – list of lists of tensors representing internal state generated in preceding invocation of infer. (Default: None)
- Returns:
- Tensor
output frames, with shape (B, segment_length, D).
- Tensor
output lengths, with shape (B,) and i-th element representing number of valid frames for i-th batch element in output frames.
- List[List[Tensor]]
output states; list of lists of tensors representing internal state generated in current invocation of infer.
- Return type:
(Tensor, Tensor, List[List[Tensor]])
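A sketch of chunked streaming inference with infer, using the same configuration as the class example above: each step consumes segment_length + right_context_length frames and threads the returned states into the next call. Only full chunks are shown; in practice a shorter final chunk would need right-padding.
>>> import torch
>>> from torchaudio.prototype.models import ConvEmformer
>>> segment_length, right_context_length = 16, 4
>>> model = ConvEmformer(80, 4, 1024, 12, segment_length, 8, right_context_length=right_context_length)
>>> model = model.eval()
>>> stream = torch.rand(1, 200, 80)                 # full utterance, consumed chunk by chunk
>>> chunk_size = segment_length + right_context_length
>>> states, outputs = None, []
>>> with torch.no_grad():
...     for start in range(0, stream.size(1) - chunk_size + 1, segment_length):
...         chunk = stream[:, start : start + chunk_size]
...         lengths = torch.full((1,), chunk_size)
...         out, out_lengths, states = model.infer(chunk, lengths, states)
...         outputs.append(out)
>>> streamed = torch.cat(outputs, dim=1)            # (1, num_chunks * segment_length, 80)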
ConformerWav2Vec2PretrainModel¶
- class torchaudio.prototype.models.ConformerWav2Vec2PretrainModel(wav2vec2: Wav2Vec2Model, mask_generator: Module, negative_sampler: Module)[source]¶
Conformer Wav2Vec2 pre-train model for training from scratch.
Note
To build the model, please use one of the factory functions: conformer_wav2vec2_base() or conformer_wav2vec2_large().
- Parameters:
wav2vec2 (nn.Module) – Conformer based Wav2Vec2 model, including feature extractor and conformer encoder components.
mask_generator (nn.Module) – Mask generator that generates the mask for masked prediction during training.
negative_sampler (nn.Module) – Negative sampler to apply after masking.
- forward(features: Tensor, audio_lengths: Optional[Tensor] = None) Tuple[Tensor, Optional[Tensor], Tensor, Tensor] [source]¶
- Parameters:
features (Tensor) – Tensor of audio features of shape (batch, frame, dim).
audio_lengths (Tensor or None, optional) – Tensor of valid lengths of each audio sample in the batch, with shape (batch,). (Default: None)
- Returns:
- Tensor
The masked sequences of probability distribution, of shape (batch, frame, dim).
- Tensor or None
If the lengths argument was provided, a Tensor of shape (batch,) representing valid lengths in the time axis is returned.
- Tensor
The mask indices.
- Tensor
The targets, prior to negative sampling.
- Tensor
The negative samples.
- Tensor
The indices of the negative samples.
- Return type:
(Tensor, Optional[Tensor], Tensor, Tensor, Tensor, Tensor)
conformer_wav2vec2_model¶
- torchaudio.prototype.models.conformer_wav2vec2_model(extractor_input_dim: int, extractor_output_dim: int, extractor_stride: int, encoder_embed_dim: int, encoder_projection_dropout: float, encoder_num_layers: int, encoder_num_heads: int, encoder_ff_interm_features: int, encoder_depthwise_conv_kernel_size: Union[int, List[int]], encoder_dropout: float, encoder_convolution_first: bool, encoder_use_group_norm: bool) Wav2Vec2Model [source]¶
Build a custom Conformer Wav2Vec2Model
- Parameters:
extractor_input_dim (int) – Input dimension of the features.
extractor_output_dim (int) – Output dimension after feature extraction.
extractor_stride (int) – Stride used in time reduction layer of feature extraction.
encoder_embed_dim (int) – The dimension of the embedding in the feature projection.
encoder_projection_dropout (float) – The dropout probability applied after the input feature is projected to embed_dim.
encoder_num_layers (int) – Number of Conformer layers in the encoder.
encoder_num_heads (int) – Number of heads in each Conformer layer.
encoder_ff_interm_features (int) – Hidden layer dimension of the feedforward network in each Conformer layer.
encoder_depthwise_conv_kernel_size (int or List[int]) – List of kernel sizes corresponding to each of the Conformer layers. If int is provided, all layers will have the same kernel size.
encoder_dropout (float) – Dropout probability in each Conformer layer.
encoder_convolution_first (bool) – Whether to apply the convolution module ahead of the attention module in each Conformer layer.
encoder_use_group_norm (bool) – Whether to use GroupNorm rather than BatchNorm1d in the convolution module in each Conformer layer.
- Returns:
The resulting wav2vec2 model with a conformer encoder.
- Return type:
Wav2Vec2Model
conformer_wav2vec2_base¶
- torchaudio.prototype.models.conformer_wav2vec2_base(extractor_input_dim: int = 64, extractor_output_dim: int = 256, encoder_projection_dropout: float = 0.0) Wav2Vec2Model [source]¶
Build Conformer Wav2Vec2 Model with “small” architecture from Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks [Srivastava et al., 2022]
- Parameters:
extractor_input_dim (int, optional) – Input dimension of the features. (Default: 64)
extractor_output_dim (int, optional) – Output dimension after feature extraction. (Default: 256)
encoder_projection_dropout (float, optional) – The dropout probability applied after the input feature is projected to embed_dim. (Default: 0.0)
- Returns:
The resulting wav2vec2 model with a conformer encoder and base configuration.
- Return type:
Wav2Vec2Model
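Example (a usage sketch; it assumes the model consumes feature frames of shape (batch, frames, extractor_input_dim), i.e. 64-dimensional features under the default configuration):
>>> import torch
>>> from torchaudio.prototype.models import conformer_wav2vec2_base
>>> model = conformer_wav2vec2_base()
>>> features = torch.rand(2, 400, 64)     # assumed (batch, frames, extractor_input_dim)
>>> lengths = torch.tensor([400, 320])
>>> encoded, out_lengths = model(features, lengths)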
conformer_wav2vec2_pretrain_model¶
- torchaudio.prototype.models.conformer_wav2vec2_pretrain_model(extractor_input_dim: int, extractor_output_dim: int, extractor_stride: int, encoder_embed_dim: int, encoder_projection_dropout: float, encoder_num_layers: int, encoder_num_heads: int, encoder_ff_interm_features: int, encoder_depthwise_conv_kernel_size: int, encoder_dropout: float, encoder_convolution_first: bool, encoder_use_group_norm: bool, mask_prob: float, mask_selection: str, mask_other: float, mask_length: int, no_mask_overlap: bool, mask_min_space: int, mask_channel_prob: float, mask_channel_selection: str, mask_channel_other: float, mask_channel_length: int, no_mask_channel_overlap: bool, mask_channel_min_space: int, num_negatives: int, cross_sample_negatives: int) ConformerWav2Vec2PretrainModel [source]¶
Build a custom Conformer Wav2Vec2 Model for pre-training
- Parameters:
extractor_input_dim (int) – Input dimension of the features.
extractor_output_dim (int) – Output dimension after feature extraction.
extractor_stride (int) – Stride used in time reduction layer of feature extraction.
encoder_embed_dim (int) – The dimension of the embedding in the feature projection.
encoder_projection_dropout (float) – The dropout probability applied after the input feature is projected to embed_dim.
encoder_num_layers (int) – Number of Conformer layers in the encoder.
encoder_num_heads (int) – Number of heads in each Conformer layer.
encoder_ff_interm_features (int) – Hidden layer dimension of the feedforward network in each Conformer layer.
encoder_depthwise_conv_kernel_size (int or List[int]) – List of kernel sizes corresponding to each of the Conformer layers. If int is provided, all layers will have the same kernel size.
encoder_dropout (float) – Dropout probability in each Conformer layer.
encoder_convolution_first (bool) – Whether to apply the convolution module ahead of the attention module in each Conformer layer.
encoder_use_group_norm (bool) – Whether to use GroupNorm rather than BatchNorm1d in the convolution module in each Conformer layer.
mask_prob (float) – Probability for each token to be chosen as start of the span to be masked.
mask_selection (str) – How to choose the mask length. Options: [static, uniform, normal, poisson].
mask_other (float) – Secondary mask argument (used for more complex distributions).
mask_length (int) – The lengths of the mask.
no_mask_overlap (bool) – Whether to allow masks to overlap.
mask_min_space (int) – Minimum space between spans (if no overlap is enabled).
mask_channel_prob (float) – The probability of replacing a feature with 0.
mask_channel_selection (str) – How to choose the mask length for channel masking. Options: [static, uniform, normal, poisson].
mask_channel_other (float) – Secondary mask argument for channel masking (used for more complex distributions).
mask_channel_length (int) – The length of the mask for channel masking.
no_mask_channel_overlap (bool) – Whether to allow channel masks to overlap.
mask_channel_min_space (int) – Minimum space between spans for channel masking (if no overlap is enabled).
num_negatives (int) – Number of negatives to sample.
cross_sample_negatives (int) – Number of cross sampled negatives.
- Returns:
The resulting model.
- Return type:
ConformerWav2Vec2PretrainModel
conformer_wav2vec2_pretrain_base¶
- torchaudio.prototype.models.conformer_wav2vec2_pretrain_base(extractor_input_dim: int = 64, extractor_output_dim: int = 256, encoder_projection_dropout: float = 0.0, mask_prob: float = 0.3, mask_length: int = 3, num_negatives: int = 100, cross_sample_negatives: int = 0) ConformerWav2Vec2PretrainModel [source]¶
Build Conformer Wav2Vec2 Model for pre-training with “small” architecture from Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks [Srivastava et al., 2022]
- Parameters:
extractor_input_dim (int, optional) – Input dimension of the features. (Default: 64)
extractor_output_dim (int, optional) – Output dimension after feature extraction. (Default: 256)
encoder_projection_dropout (float, optional) – The dropout probability applied after the input feature is projected to embed_dim. (Default: 0.0)
mask_prob (float, optional) – Probability for each token to be chosen as start of the span to be masked. (Default: 0.3)
mask_length (int, optional) – The lengths of the mask. (Default: 3)
num_negatives (int, optional) – Number of sampled negatives. (Default: 100)
cross_sample_negatives (int, optional) – Number of cross sampled negatives. (Default: 0)
- Returns:
The resulting model.
- Return type:
ConformerWav2Vec2PretrainModel
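Example (a pre-training forward-pass sketch; the input shape (batch, frames, extractor_input_dim) is assumed, and the returned tuple contains the masked encodings, valid lengths, mask indices, targets, and negatives as described under ConformerWav2Vec2PretrainModel.forward() above):
>>> import torch
>>> from torchaudio.prototype.models import conformer_wav2vec2_pretrain_base
>>> model = conformer_wav2vec2_pretrain_base(mask_prob=0.3, mask_length=3)
>>> features = torch.rand(2, 400, 64)     # assumed (batch, frames, extractor_input_dim)
>>> lengths = torch.tensor([400, 320])
>>> outputs = model(features, lengths)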
conformer_wav2vec2_pretrain_large¶
- torchaudio.prototype.models.conformer_wav2vec2_pretrain_large(extractor_input_dim: int = 64, extractor_output_dim: int = 256, encoder_projection_dropout: float = 0.0, mask_prob: float = 0.3, mask_length: int = 3, num_negatives: int = 100, cross_sample_negatives: int = 0) ConformerWav2Vec2PretrainModel [source]¶
Build Conformer Wav2Vec2 Model for pre-training with “large” architecture from Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks [Srivastava et al., 2022]
- Parameters:
extractor_input_dim (int, optional) – Input dimension of the features. (Default: 64)
extractor_output_dim (int, optional) – Output dimension after feature extraction. (Default: 256)
encoder_projection_dropout (float, optional) – The dropout probability applied after the input feature is projected to embed_dim. (Default: 0.0)
mask_prob (float, optional) – Probability for each token to be chosen as start of the span to be masked. (Default: 0.3)
mask_length (int, optional) – The lengths of the mask. (Default: 3)
num_negatives (int, optional) – Number of sampled negatives. (Default: 100)
cross_sample_negatives (int, optional) – Number of cross sampled negatives. (Default: 0)
- Returns:
The resulting model.
- Return type:
ConformerWav2Vec2PretrainModel
HiFiGANVocoder¶
- class torchaudio.prototype.models.HiFiGANVocoder(in_channels: int, upsample_rates: Tuple[int, ...], upsample_initial_channel: int, upsample_kernel_sizes: Tuple[int, ...], resblock_kernel_sizes: Tuple[int, ...], resblock_dilation_sizes: Tuple[Tuple[int, ...], ...], resblock_type: int, lrelu_slope: float)[source]¶
Generator part of HiFi GAN [Kong et al., 2020]. Source: https://github.com/jik876/hifi-gan/blob/4769534d45265d52a904b850da5a622601885777/models.py#L75
Note
To build the model, please use one of the factory functions: hifigan_vocoder(), hifigan_vocoder_v1(), hifigan_vocoder_v2(), or hifigan_vocoder_v3().
- Parameters:
in_channels (int) – Number of channels in the input features.
upsample_rates (tuple of int) – Factors by which each upsampling layer increases the time dimension.
upsample_initial_channel (int) – Number of channels in the input feature tensor.
upsample_kernel_sizes (tuple of int) – Kernel size for each upsampling layer.
resblock_kernel_sizes (tuple of int) – Kernel size for each residual block.
resblock_dilation_sizes (tuple of tuples of int) – Dilation sizes for each 1D convolutional layer in each residual block. For resblock type 1 the inner tuples should have length 3, because there are 3 convolutions in each layer; for resblock type 2 they should have length 2.
resblock_type (int, 1 or 2) – Determines whether ResBlock1 or ResBlock2 will be used.
lrelu_slope (float) – Slope of leaky ReLUs in activations.
hifigan_vocoder¶
- torchaudio.prototype.models.hifigan_vocoder(in_channels: int, upsample_rates: Tuple[int, ...], upsample_initial_channel: int, upsample_kernel_sizes: Tuple[int, ...], resblock_kernel_sizes: Tuple[int, ...], resblock_dilation_sizes: Tuple[Tuple[int, ...], ...], resblock_type: int, lrelu_slope: float) HiFiGANVocoder [source]¶
Builds HiFi GAN Vocoder [Kong et al., 2020].
- Parameters:
in_channels (int) – See HiFiGANVocoder.
upsample_rates (tuple of int) – See HiFiGANVocoder.
upsample_initial_channel (int) – See HiFiGANVocoder.
upsample_kernel_sizes (tuple of int) – See HiFiGANVocoder.
resblock_kernel_sizes (tuple of int) – See HiFiGANVocoder.
resblock_dilation_sizes (tuple of tuples of int) – See HiFiGANVocoder.
resblock_type (int, 1 or 2) – See HiFiGANVocoder.
lrelu_slope (float) – See HiFiGANVocoder.
- Returns:
generated model.
- Return type:
HiFiGANVocoder
hifigan_vocoder_v1¶
- torchaudio.prototype.models.hifigan_vocoder_v1() HiFiGANVocoder [source]¶
Builds HiFiGAN Vocoder with V1 architecture [Kong et al., 2020].
- Returns:
generated model.
- Return type:
HiFiGANVocoder
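Example (a usage sketch; it assumes the V1 configuration takes an 80-band mel-spectrogram of shape (batch, channels, frames) and returns an upsampled waveform of shape (batch, 1, time)):
>>> import torch
>>> from torchaudio.prototype.models import hifigan_vocoder_v1
>>> vocoder = hifigan_vocoder_v1().eval()
>>> mel = torch.rand(1, 80, 200)          # assumed (batch, in_channels, frames)
>>> with torch.no_grad():
...     waveform = vocoder(mel)           # (1, 1, frames * prod(upsample_rates))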
hifigan_vocoder_v2¶
- torchaudio.prototype.models.hifigan_vocoder_v2() HiFiGANVocoder [source]¶
Builds HiFiGAN Vocoder with V2 architecture [Kong et al., 2020].
- Returns:
generated model.
- Return type:
HiFiGANVocoder
hifigan_vocoder_v3¶
- torchaudio.prototype.models.hifigan_vocoder_v3() HiFiGANVocoder [source]¶
Builds HiFiGAN Vocoder with V3 architecture [Kong et al., 2020].
- Returns:
generated model.
- Return type:
HiFiGANVocoder