LIBRISPEECH
- class torchaudio.datasets.LIBRISPEECH(root: Union[str, Path], url: str = 'train-clean-100', folder_in_archive: str = 'LibriSpeech', download: bool = False)
LibriSpeech [Panayotov et al., 2015] dataset.
- Parameters:
root (str or Path) – Path to the directory where the dataset is found or downloaded.
url (str, optional) – The URL to download the dataset from, or the type of the dataset to download. Allowed type values are "dev-clean", "dev-other", "test-clean", "test-other", "train-clean-100", "train-clean-360", and "train-other-500". (default: "train-clean-100")
folder_in_archive (str, optional) – The top-level directory of the dataset. (default: "LibriSpeech")
download (bool, optional) – Whether to download the dataset if it is not found at root path. (default: False)
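The constructor parameters can be exercised with a small wrapper. The following is a minimal sketch, not part of torchaudio: `open_librispeech` and `LIBRISPEECH_SUBSETS` are hypothetical names introduced here, and the check only covers the short subset names listed for `url`, not full download URLs.

```python
from pathlib import Path
from typing import Union

# Subset names listed in the description of the `url` parameter.
LIBRISPEECH_SUBSETS = (
    "dev-clean",
    "dev-other",
    "test-clean",
    "test-other",
    "train-clean-100",
    "train-clean-360",
    "train-other-500",
)


def open_librispeech(
    root: Union[str, Path],
    subset: str = "train-clean-100",
    download: bool = False,
):
    """Hypothetical helper: validate the subset name, then build the dataset."""
    if subset not in LIBRISPEECH_SUBSETS:
        raise ValueError(
            f"unknown subset {subset!r}; expected one of {LIBRISPEECH_SUBSETS}"
        )
    # Imported lazily so the name check above works without torchaudio installed.
    from torchaudio.datasets import LIBRISPEECH

    return LIBRISPEECH(root=root, url=subset, download=download)
```

A call such as `open_librispeech("./data", "dev-clean", download=True)` would fetch the subset into the placeholder directory `./data` if it is not already there. Catching a misspelled subset name before construction is a design choice of this sketch, not documented behavior of the class.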
__getitem__
- LIBRISPEECH.__getitem__(n: int) -> Tuple[Tensor, int, str, int, int, int]
Load the n-th sample from the dataset.
- Parameters:
n (int) – The index of the sample to be loaded
- Returns:
Tuple of the following items:
- Tensor:
Waveform
- int:
Sample rate
- str:
Transcript
- int:
Speaker ID
- int:
Chapter ID
- int:
Utterance ID
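To illustrate the tuple layout listed above, here is a minimal sketch that unpacks one sample and derives its duration. `describe_sample` is a hypothetical helper, and it assumes the waveform's last dimension counts audio frames (the usual channels-by-frames layout).

```python
from typing import Any, Dict, Tuple


def describe_sample(sample: Tuple[Any, int, str, int, int, int]) -> Dict[str, Any]:
    """Hypothetical helper: summarize one item returned by dataset[n]."""
    # Field order follows the Returns list for __getitem__.
    waveform, sample_rate, transcript, speaker_id, chapter_id, utterance_id = sample
    # Assumption: the last dimension of the waveform is the number of frames.
    num_frames = waveform.shape[-1]
    return {
        "duration_seconds": num_frames / sample_rate,
        "transcript": transcript,
        "speaker_id": speaker_id,
        "chapter_id": chapter_id,
        "utterance_id": utterance_id,
    }
```

With a dataset instance named `dataset`, `describe_sample(dataset[0])` would summarize the first sample.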
get_metadata
- LIBRISPEECH.get_metadata(n: int) -> Tuple[str, int, str, int, int, int]
Get metadata for the n-th sample from the dataset. Returns filepath instead of waveform, but otherwise returns the same fields as __getitem__().
- Parameters:
n (int) – The index of the sample to be loaded
- Returns:
Tuple of the following items:
- str:
Path to audio
- int:
Sample rate
- str:
Transcript
- int:
Speaker ID
- int:
Chapter ID
- int:
Utterance ID
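Because get_metadata returns a file path rather than a waveform, it suits bookkeeping that does not need the audio itself. The sketch below counts utterances per speaker from metadata tuples; `utterances_per_speaker` and `MetadataRow` are hypothetical names, and the field order is taken from the Returns list above.

```python
from collections import Counter
from typing import Iterable, Tuple

# (path, sample_rate, transcript, speaker_id, chapter_id, utterance_id)
MetadataRow = Tuple[str, int, str, int, int, int]


def utterances_per_speaker(rows: Iterable[MetadataRow]) -> Counter:
    """Hypothetical helper: count metadata rows per speaker ID."""
    counts: Counter = Counter()
    for _path, _rate, _transcript, speaker_id, _chapter_id, _utterance_id in rows:
        counts[speaker_id] += 1
    return counts
```

With a dataset instance named `dataset`, the rows could be produced lazily with `(dataset.get_metadata(i) for i in range(len(dataset)))`.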