WavLM Base model (“base” architecture), pre-trained on 960 hours of unlabeled audio from the LibriSpeech dataset [Panayotov et al., 2015], not fine-tuned.
Originally published by the authors of WavLM [Chen et al., 2022] under the MIT License and redistributed with the same license. [License, Source]
Please refer to torchaudio.pipelines.Wav2Vec2Bundle for the usage.