VoxCeleb1Identification
- class torchaudio.datasets.VoxCeleb1Identification(root: Union[str, Path], subset: str = 'train', meta_url: str = 'https://www.robots.ox.ac.uk/~vgg/data/voxceleb/meta/iden_split.txt', download: bool = False)
VoxCeleb1 [Nagrani et al., 2017] dataset for the speaker identification task.
Each data sample contains the waveform, sample rate, speaker ID, and file ID.
- Parameters:
root (str or Path) – Path to the directory where the dataset is found or downloaded.
subset (str, optional) – Subset of the dataset to use. Options: ["train", "dev", "test"]. (Default: "train")
meta_url (str, optional) – The URL of the meta file that contains the list of subset labels and file paths. The format of each row is "subset file_path". For example: "1 id10006/nLEBBc9oIFs/00003.wav". The labels 1, 2, and 3 denote the train, dev, and test subsets, respectively. (Default: "https://www.robots.ox.ac.uk/~vgg/data/voxceleb/meta/iden_split.txt")
download (bool, optional) – Whether to download the dataset if it is not found at the root path. (Default: False)
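The row format of the meta file described above can be illustrated with a short sketch. The helper `parse_meta_row` and the constant `SUBSET_LABELS` are hypothetical names introduced here for illustration; they are not part of torchaudio, and the class may parse the file differently internally.

```python
from typing import Tuple

# Label-to-subset mapping as described for the meta file:
# 1, 2, and 3 denote train, dev, and test.
SUBSET_LABELS = {"1": "train", "2": "dev", "3": "test"}


def parse_meta_row(row: str) -> Tuple[str, str]:
    """Split one "subset file_path" row into (subset_name, file_path).

    Hypothetical helper for illustration; not part of torchaudio.
    """
    # The label and the path are separated by whitespace.
    label, file_path = row.strip().split(maxsplit=1)
    return SUBSET_LABELS[label], file_path


example = parse_meta_row("1 id10006/nLEBBc9oIFs/00003.wav")
print(example)
```

A row with an unknown label raises `KeyError` in this sketch, which may or may not match how the library handles malformed rows.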
Note
The file structure of VoxCeleb1Identification dataset is as follows:
└─ root/
    └─ wav/
        └─ speaker_id folders
Users who pre-downloaded the "vox1_dev_wav.zip" and "vox1_test_wav.zip" files need to move the extracted files into the same root directory.
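Before constructing the dataset from pre-downloaded archives, it can help to check that the extracted files follow the layout shown in the note. The following is a minimal sketch under that assumption; `list_speaker_dirs` is a hypothetical helper, and the speaker folder names in the demonstration are made up.

```python
import tempfile
from pathlib import Path
from typing import List


def list_speaker_dirs(root: Path) -> List[str]:
    """Return the sorted names of the speaker folders found under root/wav."""
    wav_dir = Path(root) / "wav"
    if not wav_dir.is_dir():
        # The expected "wav" folder is missing, so the layout does not match.
        return []
    return sorted(p.name for p in wav_dir.iterdir() if p.is_dir())


# Demonstration on a throwaway directory with made-up speaker folders.
with tempfile.TemporaryDirectory() as tmp:
    for speaker in ("id10006", "id10001"):
        (Path(tmp) / "wav" / speaker).mkdir(parents=True)
    found = list_speaker_dirs(Path(tmp))

print(found)
```

An empty result suggests that the extracted files were not moved into `root/wav`, or that `root` points to the wrong directory.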
__getitem__
get_metadata
- VoxCeleb1Identification.get_metadata(n: int) -> Tuple[str, int, int, str]
Get metadata for the n-th sample from the dataset. Returns filepath instead of waveform, but otherwise returns the same fields as __getitem__().
- Parameters:
n (int) – The index of the sample
- Returns:
Tuple of the following items:
- str:
Path to audio
- int:
Sample rate
- int:
Speaker ID
- str:
File ID
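Because get_metadata returns the file path rather than the waveform, it can be used to gather statistics without loading audio. The sketch below counts samples per speaker for any object that exposes `get_metadata(n)` with the tuple layout listed above and supports `len()`; that the dataset supports `len()` is an assumption here. `utterances_per_speaker` and `FakeDataset` are hypothetical names, and the paths, IDs, and sample rate in the stand-in are made up.

```python
from collections import Counter
from typing import Dict, List, Tuple


def utterances_per_speaker(dataset) -> Dict[int, int]:
    """Count samples per speaker ID using only get_metadata (no audio is loaded)."""
    counts: Counter = Counter()
    for n in range(len(dataset)):
        # Tuple layout: (path to audio, sample rate, speaker ID, file ID).
        _path, _sample_rate, speaker_id, _file_id = dataset.get_metadata(n)
        counts[speaker_id] += 1
    return dict(counts)


class FakeDataset:
    """Stand-in with the same get_metadata tuple layout; all values are made up."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, int, int, str]] = [
            ("wav/id10001/aaa/00001.wav", 16000, 10001, "id10001-aaa-00001"),
            ("wav/id10001/aaa/00002.wav", 16000, 10001, "id10001-aaa-00002"),
            ("wav/id10006/bbb/00001.wav", 16000, 10006, "id10006-bbb-00001"),
        ]

    def __len__(self) -> int:
        return len(self._rows)

    def get_metadata(self, n: int) -> Tuple[str, int, int, str]:
        return self._rows[n]


counts = utterances_per_speaker(FakeDataset())
print(counts)
```

With a real dataset, the same function could be called on an instance of VoxCeleb1Identification in place of the stand-in.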