LIBRISPEECH

class torchaudio.datasets.LIBRISPEECH(root: Union[str, Path], url: str = 'train-clean-100', folder_in_archive: str = 'LibriSpeech', download: bool = False)[source]

LibriSpeech [Panayotov et al., 2015] dataset.

Parameters:

root (str or Path) – Path to the directory where the dataset is found or downloaded.
url (str, optional) – The URL to download the dataset from, or the type of the dataset to download. Allowed type values are "dev-clean", "dev-other", "test-clean", "test-other", "train-clean-100", "train-clean-360" and "train-other-500". (default: "train-clean-100")
folder_in_archive (str, optional) – The top-level directory of the dataset. (default: "LibriSpeech")
download (bool, optional) – Whether to download the dataset if it is not found at root path. (default: False)
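
As a minimal sketch of how these parameters fit together, the hypothetical helper below (open_librispeech and LIBRISPEECH_SUBSETS are illustrative names, not part of torchaudio) checks the subset name against the allowed type values listed above before constructing the dataset. It assumes url is given as a subset name rather than a full URL, and it imports torchaudio lazily so that the validation step can run without it.

```python
from pathlib import Path

# Subset names taken from the `url` parameter description above.
LIBRISPEECH_SUBSETS = (
    "dev-clean", "dev-other", "test-clean", "test-other",
    "train-clean-100", "train-clean-360", "train-other-500",
)


def open_librispeech(root, subset="dev-clean", download=False):
    """Hypothetical helper: validate `subset`, then build the dataset."""
    if subset not in LIBRISPEECH_SUBSETS:
        raise ValueError(
            f"unknown subset {subset!r}; expected one of {LIBRISPEECH_SUBSETS}"
        )
    # Imported lazily so the check above does not require torchaudio.
    import torchaudio

    if download:
        # Assumption: creating the root directory up front is harmless.
        Path(root).mkdir(parents=True, exist_ok=True)
    return torchaudio.datasets.LIBRISPEECH(root=root, url=subset, download=download)
```

With download=False (the default), the dataset is expected to already be present under root.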

__getitem__

LIBRISPEECH.__getitem__(n: int) → Tuple[Tensor, int, str, int, int, int][source]

Load the n-th sample from the dataset.

Parameters:

n (int) – The index of the sample to be loaded

Returns:

Tuple of the following items:

Tensor: Waveform
int: Sample rate
str: Transcript
int: Speaker ID
int: Chapter ID
int: Utterance ID
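
The six-item tuple can be unpacked positionally. The sketch below uses a hypothetical helper, describe_sample, that summarizes one sample; it assumes the waveform has shape (channels, frames), so the last dimension divided by the sample rate gives the duration in seconds.

```python
def describe_sample(sample):
    """Hypothetical helper: summarize one item returned by dataset[n]."""
    waveform, sample_rate, transcript, speaker_id, chapter_id, utterance_id = sample
    # Assumption: waveform is laid out as (channels, frames).
    num_frames = waveform.shape[-1]
    return {
        "duration_s": num_frames / sample_rate,
        "num_words": len(transcript.split()),
        "speaker_id": speaker_id,
        "chapter_id": chapter_id,
        "utterance_id": utterance_id,
    }
```

Given a dataset instance named dataset, describe_sample(dataset[0]) would summarize the first sample.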

get_metadata

LIBRISPEECH.get_metadata(n: int) → Tuple[Tensor, int, str, int, int, int][source]

Get metadata for the n-th sample from the dataset. Returns filepath instead of waveform, but otherwise returns the same fields as __getitem__().

Parameters:

n (int) – The index of the sample to be loaded

Returns:

Tuple of the following items:

str: Path to audio
int: Sample rate
str: Transcript
int: Speaker ID
int: Chapter ID
int: Utterance ID
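
Because get_metadata returns a file path in place of the waveform, it suits passes over the dataset that only need transcripts or IDs. The sketch below groups transcripts by speaker; transcripts_by_speaker is a hypothetical helper, and it assumes the dataset object supports len() and get_metadata(n) as described here.

```python
from collections import defaultdict


def transcripts_by_speaker(dataset):
    """Hypothetical helper: map speaker ID -> list of transcripts."""
    grouped = defaultdict(list)
    for n in range(len(dataset)):
        # Only the transcript and speaker ID are used; the path is not opened.
        _path, _sample_rate, transcript, speaker_id, _chapter_id, _utterance_id = (
            dataset.get_metadata(n)
        )
        grouped[speaker_id].append(transcript)
    return dict(grouped)
```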
