torchaudio.prototype.pipelines¶
The pipelines subpackage contains APIs for models with pretrained weights, along with related utilities.
RNN-T Streaming/Non-Streaming ASR¶
Pretrained Models¶
Two pretrained Emformer-RNNT-based ASR pipelines are listed, each capable of performing both streaming and non-streaming inference.
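As a rough sketch of how such a pipeline is typically driven for non-streaming inference, the helper below wires together a feature extractor, a beam-search decoder, and a token processor. It assumes the bundle exposes `get_feature_extractor()`, `get_decoder()`, and `get_token_processor()`, as `torchaudio.pipelines.RNNTBundle` does. The helper name `transcribe` and the default beam width are illustrative and not part of the library.

```python
def transcribe(bundle, waveform, beam_width=10):
    """Non-streaming transcription with an RNNTBundle-style object.

    Assumption: `waveform` is a 1-D tensor sampled at `bundle.sample_rate`.
    `transcribe` is a hypothetical helper, not a torchaudio function.
    """
    feature_extractor = bundle.get_feature_extractor()
    decoder = bundle.get_decoder()
    token_processor = bundle.get_token_processor()

    # Convert raw audio into model features and the feature length.
    features, length = feature_extractor(waveform)
    # Run beam search; each hypothesis starts with its token sequence.
    hypotheses = decoder(features, length, beam_width)
    # Turn the tokens of the first hypothesis into text.
    return token_processor(hypotheses[0][0])
```

With torchaudio installed, `bundle` would be one of the pretrained pipelines in this section and `waveform` would be audio resampled to `bundle.sample_rate`. Streaming inference follows the same pattern but uses `get_streaming_feature_extractor()` and feeds audio segment by segment while carrying decoder state between calls.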
HiFiGAN Vocoder¶
Interface¶
HiFiGANVocoderBundle defines a HiFiGAN vocoder pipeline capable of transforming mel spectrograms into waveforms.
Data class that bundles the information needed to use a pretrained HiFiGAN vocoder.
Pretrained Models¶
HiFiGAN Vocoder pipeline, trained on The LJ Speech Dataset [Ito and Johnson, 2017].
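A minimal sketch of the mel-spectrogram-to-waveform flow, assuming the bundle provides `get_mel_transform()` and `get_vocoder()` as `HiFiGANVocoderBundle` does. The helper name `resynthesize` is hypothetical, and the input is assumed to be a tensor of shape (batch, time) already sampled at `bundle.sample_rate`.

```python
def resynthesize(bundle, waveform):
    """Round-trip audio through a mel transform and a vocoder.

    Assumption: `waveform` is already at `bundle.sample_rate`.
    `resynthesize` is a hypothetical helper, not a torchaudio function.
    """
    mel_transform = bundle.get_mel_transform()  # waveform -> mel spectrogram
    vocoder = bundle.get_vocoder()              # mel spectrogram -> waveform

    mel_spec = mel_transform(waveform)
    return vocoder(mel_spec)
```

The mel transform returned by the bundle matters here: a vocoder is trained against one specific spectrogram configuration, so spectrograms computed with different settings are unlikely to produce clean audio. The output is a reconstruction, not a sample-exact copy of the input.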
Squim Objective¶
Interface¶
SquimObjectiveBundle defines a speech quality and intelligibility measurement (SQUIM) pipeline that can predict objective metric scores given an input waveform.
Data class that bundles the information needed to use a pretrained SQUIM objective model.
Pretrained Models¶
SquimObjective pipeline trained using the approach described in [Kumar et al., 2023] on the DNS 2020 Dataset [Reddy et al., 2020].
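The following sketch shows one way the objective pipeline might be called. It assumes the bundle exposes `get_model()` and that the model returns three scores in the order STOI, PESQ, SI-SDR, which is how the torchaudio tutorial unpacks them. The helper name `objective_scores` is hypothetical.

```python
def objective_scores(bundle, waveform):
    """Predict reference-free objective metrics for a waveform.

    Assumptions: `waveform` has shape (batch, time) at `bundle.sample_rate`,
    and the model returns scores in the order (STOI, PESQ, SI-SDR).
    `objective_scores` is a hypothetical helper, not a torchaudio function.
    """
    model = bundle.get_model()
    stoi, pesq, si_sdr = model(waveform)
    return {"stoi": stoi, "pesq": pesq, "si_sdr": si_sdr}
```

Returning a dictionary keeps the three scores labeled, which avoids mixing them up when they are logged or compared later.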
Squim Subjective¶
Interface¶
SquimSubjectiveBundle defines a speech quality and intelligibility measurement (SQUIM) pipeline that can predict subjective metric scores given an input waveform.
Data class that bundles the information needed to use a pretrained SQUIM subjective model.
Pretrained Models¶
SquimSubjective pipeline trained as described in [Manocha and Kumar, 2022] and [Kumar et al., 2023] on the BVCC [Cooper and Yamagishi, 2021] and DAPS [Mysore, 2014] datasets.
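A minimal sketch of calling the subjective pipeline, assuming the bundle exposes `get_model()` and `sample_rate`, and that the model takes the waveform under test together with a clean reference recording that does not need to match it in content. The helper name `subjective_mos` and its `sample_rate` argument are hypothetical.

```python
def subjective_mos(bundle, waveform, reference, sample_rate):
    """Estimate a subjective quality score for `waveform`.

    Assumptions: both inputs have shape (batch, time) and share
    `sample_rate`; `reference` is clean speech used only as an anchor.
    `subjective_mos` is a hypothetical helper, not a torchaudio function.
    """
    if sample_rate != bundle.sample_rate:
        raise ValueError(
            f"expected audio at {bundle.sample_rate} Hz, got {sample_rate} Hz"
        )
    model = bundle.get_model()
    return model(waveform, reference)
```

The explicit sample-rate check is a design choice for this sketch: a mismatched rate would not raise an error inside the model, it would just yield scores that are hard to trust.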