HUBERT_ASR_XLARGE

torchaudio.pipelines.HUBERT_ASR_XLARGE

HuBERT model (“extra large” architecture), pre-trained on 60,000 hours of unlabeled audio from the Libri-Light dataset [Kahn et al., 2020] and fine-tuned for ASR on 960 hours of transcribed audio from the LibriSpeech dataset [Panayotov et al., 2015] (the combination of “train-clean-100”, “train-clean-360”, and “train-other-500”).

Originally published by the authors of HuBERT [Hsu et al., 2021] under the MIT License and redistributed under the same license. [License, Source]

See torchaudio.pipelines.Wav2Vec2ASRBundle for usage.
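As a rough illustration of that usage, the sketch below loads the bundle, resamples the input to the bundle's sample rate, and applies a simple greedy CTC decode to the emission. The helper names `greedy_ctc_decode` and `transcribe` are hypothetical and not part of torchaudio. The decoder assumes the blank token sits at index 0 of the labels and that "|" marks word boundaries, which should be checked against the output of `bundle.get_labels()`.

```python
from itertools import groupby


def greedy_ctc_decode(indices, labels, blank=0):
    """Collapse repeated indices, drop blanks, and map the rest to characters.

    Assumes `blank` is the index of the CTC blank token in `labels`
    and that "|" is the word separator (an assumption, verify with get_labels()).
    """
    collapsed = [idx for idx, _ in groupby(indices)]
    chars = [labels[idx] for idx in collapsed if idx != blank]
    return "".join(chars).replace("|", " ").strip()


def transcribe(path):
    """Transcribe one audio file (hypothetical helper).

    The first call to get_model() downloads the pre-trained weights,
    which are large for this architecture.
    """
    import torch
    import torchaudio

    bundle = torchaudio.pipelines.HUBERT_ASR_XLARGE
    model = bundle.get_model().eval()
    labels = bundle.get_labels()

    waveform, sample_rate = torchaudio.load(path)
    # Mix down to mono so the model sees a single-item batch.
    waveform = waveform.mean(dim=0, keepdim=True)
    if sample_rate != bundle.sample_rate:
        waveform = torchaudio.functional.resample(
            waveform, sample_rate, bundle.sample_rate
        )

    with torch.inference_mode():
        emission, _ = model(waveform)  # shape: (batch, frames, num_labels)

    # Pick the most likely label per frame, then collapse with CTC rules.
    indices = emission[0].argmax(dim=-1).tolist()
    return greedy_ctc_decode(indices, labels)
```

Calling `transcribe("speech.wav")`, where the path is a placeholder for a local audio file, would return the decoded text. Greedy decoding is the simplest option and is used here only for brevity; a beam-search decoder with a language model would usually give better transcripts.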
