VoxCeleb1Verification
- class torchaudio.datasets.VoxCeleb1Verification(root: Union[str, Path], meta_url: str = 'https://www.robots.ox.ac.uk/~vgg/data/voxceleb/meta/veri_test.txt', download: bool = False)
VoxCeleb1 [Nagrani et al., 2017] dataset for the speaker verification task.
Each sample contains a pair of waveforms, the sample rate, a label indicating whether the two waveforms come from the same speaker, and the two file IDs.
- Parameters:
root (str or Path) – Path to the directory where the dataset is found or downloaded.
meta_url (str, optional) – The URL of the meta file that contains a list of utterance pairs and the corresponding labels. Each row has the format "label file_path1 file_path2", for example "1 id10270/x6uYqmx31kE/00001.wav id10270/8jEAjG6SegY/00008.wav". A label of 1 means the two utterances are from the same speaker; 0 means they are not. (Default: "https://www.robots.ox.ac.uk/~vgg/data/voxceleb/meta/veri_test.txt")
download (bool, optional) – Whether to download the dataset if it is not found at the root path. (Default: False)
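The row format of the meta file can be illustrated with a small parser. This is only a sketch of the format described above, not torchaudio's own parsing code; the helper name parse_pair_row is hypothetical.

```python
from typing import Tuple


def parse_pair_row(row: str) -> Tuple[int, str, str]:
    """Split one meta-file row into (label, file_path1, file_path2).

    Hypothetical helper for illustration; not part of torchaudio.
    """
    label_text, file_path1, file_path2 = row.split()
    label = int(label_text)
    if label not in (0, 1):
        raise ValueError(f"unexpected label: {label}")
    return label, file_path1, file_path2


# The example row quoted in the parameter description.
row = "1 id10270/x6uYqmx31kE/00001.wav id10270/8jEAjG6SegY/00008.wav"
label, path1, path2 = parse_pair_row(row)
print(label, path1, path2)
```

Here label is 1, so the two listed utterances are from the same speaker.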
Note
The file structure of the VoxCeleb1Verification dataset is as follows:
└─ root/
    └─ wav/
        └─ speaker_id folders
Users who pre-downloaded the "vox1_dev_wav.zip" and "vox1_test_wav.zip" files need to move the extracted files into the same root directory.
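Before constructing the dataset from pre-downloaded files, it may help to confirm that the directory matches the layout in the note. The sketch below uses only the standard library; list_speaker_dirs is a hypothetical helper, and the speaker folder names in the demo are placeholders created in a temporary directory.

```python
import tempfile
from pathlib import Path
from typing import List


def list_speaker_dirs(root: str) -> List[str]:
    """Return the sorted speaker folder names found under root/wav.

    Hypothetical helper; raises if the wav/ directory is missing.
    """
    wav_dir = Path(root) / "wav"
    if not wav_dir.is_dir():
        raise FileNotFoundError(f"expected a 'wav' directory inside {root}")
    return sorted(p.name for p in wav_dir.iterdir() if p.is_dir())


# Demo on a throwaway directory that mimics root/wav/<speaker_id>.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "wav" / "id10270").mkdir(parents=True)
    (Path(tmp) / "wav" / "id10271").mkdir(parents=True)
    speakers = list_speaker_dirs(tmp)
    print(speakers)
```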
__getitem__
- VoxCeleb1Verification.__getitem__(n: int) → Tuple[Tensor, Tensor, int, int, str, str]
Load the n-th sample from the dataset.
- Parameters:
n (int) – The index of the sample to be loaded.
- Returns:
Tuple of the following items:
- Tensor: Waveform of speaker 1
- Tensor: Waveform of speaker 2
- int: Sample rate
- int: Label
- str: File ID of speaker 1
- str: File ID of speaker 2
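As a sketch of how the returned tuple might be consumed, the helper below unpacks the six fields in the documented order. summarize_pair is a hypothetical name. The stand-in sample uses placeholder values so the block runs without the dataset on disk, and the code assumes the last dimension of each waveform is time.

```python
from types import SimpleNamespace
from typing import Any, Dict, Tuple


def summarize_pair(sample: Tuple[Any, Any, int, int, str, str]) -> Dict[str, Any]:
    """Unpack one sample in the documented field order and summarize it."""
    waveform1, waveform2, sample_rate, label, file_id1, file_id2 = sample
    return {
        "file_ids": (file_id1, file_id2),
        # Per the meta-file description, 1 means same speaker and 0 means not.
        "same_speaker": label == 1,
        # Assumption: the last dimension of each waveform counts time samples.
        "seconds": (
            waveform1.shape[-1] / sample_rate,
            waveform2.shape[-1] / sample_rate,
        ),
    }


# Stand-in sample with placeholder shapes, sample rate, and file IDs.
fake_sample = (
    SimpleNamespace(shape=(1, 32000)),
    SimpleNamespace(shape=(1, 48000)),
    16000,
    1,
    "utt_a",
    "utt_b",
)
summary = summarize_pair(fake_sample)
print(summary)
```

With the real dataset, one would pass dataset[n] instead of fake_sample, where dataset is a VoxCeleb1Verification instance built with the constructor shown at the top of this page.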
get_metadata
- VoxCeleb1Verification.get_metadata(n: int) → Tuple[str, str, int, int, str, str]
Get metadata for the n-th sample from the dataset. Returns file paths instead of waveforms, but otherwise returns the same fields as __getitem__().
- Parameters:
n (int) – The index of the sample
- Returns:
Tuple of the following items:
- str: Path to audio file of speaker 1
- str: Path to audio file of speaker 2
- int: Sample rate
- int: Label
- str: File ID of speaker 1
- str: File ID of speaker 2
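Because get_metadata returns file paths rather than waveforms, it suits quick passes over the trial list, such as counting same-speaker and different-speaker pairs. The sketch below assumes the dataset object supports len() and get_metadata(n); count_labels and FakeDataset are hypothetical names, and the stand-in rows are placeholders rather than real entries.

```python
from collections import Counter
from typing import Any, List, Tuple


def count_labels(dataset: Any) -> Counter:
    """Count labels using metadata only, without touching waveforms."""
    counts: Counter = Counter()
    for n in range(len(dataset)):
        # Field order follows the documented return tuple.
        _path1, _path2, _sample_rate, label, _id1, _id2 = dataset.get_metadata(n)
        counts[label] += 1
    return counts


class FakeDataset:
    """Stand-in exposing only __len__ and get_metadata, for illustration."""

    def __init__(self, rows: List[Tuple[str, str, int, int, str, str]]) -> None:
        self._rows = rows

    def __len__(self) -> int:
        return len(self._rows)

    def get_metadata(self, n: int) -> Tuple[str, str, int, int, str, str]:
        return self._rows[n]


fake = FakeDataset(
    [
        ("a/1.wav", "a/2.wav", 16000, 1, "a-1", "a-2"),
        ("a/1.wav", "b/1.wav", 16000, 0, "a-1", "b-1"),
        ("b/1.wav", "b/2.wav", 16000, 1, "b-1", "b-2"),
    ]
)
counts = count_labels(fake)
print(counts)
```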