Wav2vec 2.0 model (“large-lv60k” architecture), pre-trained on 60,000 hours of unlabeled audio from the Libri-Light dataset [Kahn et al., 2020], not fine-tuned.
Originally published by the authors of wav2vec 2.0 [Baevski et al., 2020] under the MIT License and redistributed with the same license. [License, Source]
Please refer to torchaudio.pipelines.Wav2Vec2Bundle for the usage.
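As a rough illustration of that usage, the sketch below extracts intermediate features with a bundle of this kind. It assumes the bundle is exposed as torchaudio.pipelines.WAV2VEC2_LARGE_LV60K and uses a second of silence as placeholder input. The helper names extract_features and demo are hypothetical and not part of torchaudio. Because the model is not fine-tuned, it yields feature tensors rather than transcriptions.

```python
def extract_features(bundle, waveform, sample_rate):
    """Return the list of per-layer features for `waveform`.

    `bundle` is expected to behave like torchaudio.pipelines.Wav2Vec2Bundle:
    it exposes `sample_rate` and `get_model()`. This helper is illustrative
    only and is not part of torchaudio.
    """
    if sample_rate != bundle.sample_rate:
        # Resample only when needed; imported lazily for that reason.
        import torchaudio.functional as F

        waveform = F.resample(waveform, sample_rate, bundle.sample_rate)
    model = bundle.get_model()
    # The model's extract_features returns (list_of_features, lengths_or_None).
    features, _ = model.extract_features(waveform)
    return features


def demo():
    """Run the helper on the real bundle (downloads the pre-trained weights)."""
    import torch
    import torchaudio

    # Assumption: this is the bundle described above.
    bundle = torchaudio.pipelines.WAV2VEC2_LARGE_LV60K
    # Placeholder input: one second of silence, shape (batch, time).
    waveform = torch.zeros(1, bundle.sample_rate)
    with torch.inference_mode():
        features = extract_features(bundle, waveform, bundle.sample_rate)
    print(len(features), features[0].shape)
```

Calling demo() fetches the weights on first use, so it needs network access and some disk space. In practice the silent waveform would be replaced by audio loaded with torchaudio.load, with its sample rate passed through so that resampling happens when required.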