CMUDict

class torchaudio.datasets.CMUDict(root: Union[str, Path], exclude_punctuations: bool = True, *, download: bool = False, url: str = 'http://svn.code.sf.net/p/cmusphinx/code/trunk/cmudict/cmudict-0.7b', url_symbols: str = 'http://svn.code.sf.net/p/cmusphinx/code/trunk/cmudict/cmudict-0.7b.symbols')

CMU Pronouncing Dictionary [Weide, 1998] (CMUDict) dataset.

Parameters:
  • root (str or Path) – Path to the directory where the dataset is found or downloaded.

  • exclude_punctuations (bool, optional) – When enabled, exclude the pronunciations of punctuation entries, such as !EXCLAMATION-POINT and #HASH-MARK. (default: True)

  • download (bool, optional) – Whether to download the dataset if it is not found at the root path. (default: False)

  • url (str, optional) – The URL to download the dictionary from. (default: "http://svn.code.sf.net/p/cmusphinx/code/trunk/cmudict/cmudict-0.7b")

  • url_symbols (str, optional) – The URL to download the list of symbols from. (default: "http://svn.code.sf.net/p/cmusphinx/code/trunk/cmudict/cmudict-0.7b.symbols")
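As a minimal sketch of how the parameters above fit together, the snippet below creates the target directory and then builds the dataset with download enabled. The helper names `prepare_root` and `load_cmudict` and the `./data/cmudict` path are hypothetical, chosen only for illustration. Creating the directory up front is an assumption made here for safety, not a documented requirement.

```python
from pathlib import Path
from typing import Union


def prepare_root(root: Union[str, Path]) -> Path:
    """Create the dataset directory if it is missing and return it as a Path."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    return root


def load_cmudict(root: Union[str, Path], exclude_punctuations: bool = True):
    """Build a CMUDict dataset, downloading the files if they are not found at root."""
    # Imported lazily so the helpers can be imported without torchaudio installed.
    from torchaudio.datasets import CMUDict

    return CMUDict(
        prepare_root(root),
        exclude_punctuations=exclude_punctuations,
        download=True,  # keyword-only in the signature above
    )


if __name__ == "__main__":
    # Hypothetical location; requires torchaudio and network access on first run.
    dataset = load_cmudict("./data/cmudict")
    print(dataset[0])
```

Passing `exclude_punctuations=False` to the same helper would keep entries such as !EXCLAMATION-POINT in the dataset.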

Properties

symbols

property CMUDict.symbols: List[str]

A list of phoneme symbols, such as "AA", "AE", "AH".

Type:

list[str]
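One possible use of this list is to assign each phoneme an integer ID, for example when preparing model targets. The sketch below uses a short stand-in list rather than the real property value. The helper name `build_symbol_index` is hypothetical, and in practice the input would be `dataset.symbols`.

```python
from typing import Dict, List


def build_symbol_index(symbols: List[str]) -> Dict[str, int]:
    """Map each phoneme symbol to its position in the list."""
    return {symbol: index for index, symbol in enumerate(symbols)}


# Stand-in for `dataset.symbols`; these three entries are the examples quoted above.
sample_symbols = ["AA", "AE", "AH"]
symbol_to_id = build_symbol_index(sample_symbols)
print(symbol_to_id)  # {'AA': 0, 'AE': 1, 'AH': 2}
```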

Methods

__getitem__

CMUDict.__getitem__(n: int) → Tuple[str, List[str]]

Load the n-th sample from the dataset.

Parameters:

n (int) – The index of the sample to be loaded.

Returns:

Tuple of a word and its phonemes

  • str – Word

  • List[str] – Phonemes
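The returned tuple can be unpacked directly. The sketch below uses a small hand-written list in the same (word, phonemes) shape in place of a real dataset, so the sample entries are illustrative and not read from the dictionary files. The helper `find_pronunciations` is hypothetical. It returns a list of matches on the assumption that a word may appear more than once when the dictionary holds alternate pronunciations.

```python
from typing import Iterable, List, Tuple

Entry = Tuple[str, List[str]]


def find_pronunciations(entries: Iterable[Entry], word: str) -> List[List[str]]:
    """Return every phoneme list whose word matches `word`, ignoring case."""
    target = word.upper()
    return [phonemes for entry_word, phonemes in entries if entry_word.upper() == target]


# Illustrative sample data in the (word, phonemes) shape returned by __getitem__.
sample_entries: List[Entry] = [
    ("HELLO", ["HH", "AH0", "L", "OW1"]),
    ("WORLD", ["W", "ER1", "L", "D"]),
]

word, phonemes = sample_entries[0]  # same unpacking as `word, phonemes = dataset[0]`
print(word, phonemes)
print(find_pronunciations(sample_entries, "hello"))
```

The linear scan keeps the sketch simple. For repeated lookups, building a dictionary keyed by word once would likely be the better choice.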
