Shortcuts

TACOTRON2_GRIFFINLIM_PHONE_LJSPEECH

torchaudio.pipelines.TACOTRON2_GRIFFINLIM_PHONE_LJSPEECH

Phoneme-based TTS pipeline with Tacotron2 trained on LJSpeech [Ito and Johnson, 2017] for 1,500 epochs and GriffinLim as vocoder.

The text processor encodes the input texts based on phoneme. It uses DeepPhonemizer to convert graphemes to phonemes. The model (en_us_cmudict_forward) was trained on CMUDict.

You can find the training script here. The text processor is set to the “english_phonemes”.

Please refer to torchaudio.pipelines.Tacotron2TTSBundle() for the usage.

Example - “Hello world! T T S stands for Text to Speech!”

Spectrogram generated by Tacotron2

Example - “The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired,”

Spectrogram generated by Tacotron2

Docs

Access comprehensive developer documentation for PyTorch

View Docs

Tutorials

Get in-depth tutorials for beginners and advanced developers

View Tutorials

Resources

Find development resources and get your questions answered

View Resources