DatasetFolder¶
- class torchvision.datasets.DatasetFolder(root: str, loader: Callable[[str], Any], extensions: Optional[Tuple[str, ...]] = None, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, is_valid_file: Optional[Callable[[str], bool]] = None)[source]¶
A generic data loader.
This default directory structure can be customized by overriding the
find_classes()
method.- Parameters:
root (string) – Root directory path.
loader (callable) – A function to load a sample given its path.
extensions (tuple[string]) – A list of allowed extensions. both extensions and is_valid_file should not be passed.
transform (callable, optional) – A function/transform that takes in a sample and returns a transformed version. E.g,
transforms.RandomCrop
for images.target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
is_valid_file – A function that takes path of a file and check if the file is a valid file (used to check of corrupt files) both extensions and is_valid_file should not be passed.
- find_classes(directory: str) Tuple[List[str], Dict[str, int]] [source]¶
Find the class folders in a dataset structured as follows:
directory/ ├── class_x │ ├── xxx.ext │ ├── xxy.ext │ └── ... │ └── xxz.ext └── class_y ├── 123.ext ├── nsdf3.ext └── ... └── asd932_.ext
This method can be overridden to only consider a subset of classes, or to adapt to a different dataset directory structure.
- Parameters:
directory (str) – Root directory path, corresponding to
self.root
- Raises:
FileNotFoundError – If
dir
has no class folders.- Returns:
List of all classes and dictionary mapping each class to an index.
- Return type:
- static make_dataset(directory: str, class_to_idx: Dict[str, int], extensions: Optional[Tuple[str, ...]] = None, is_valid_file: Optional[Callable[[str], bool]] = None) List[Tuple[str, int]] [source]¶
Generates a list of samples of a form (path_to_sample, class).
This can be overridden to e.g. read files from a compressed zip file instead of from the disk.
- Parameters:
directory (str) – root dataset directory, corresponding to
self.root
.class_to_idx (Dict[str, int]) – Dictionary mapping class name to class index.
extensions (optional) – A list of allowed extensions. Either extensions or is_valid_file should be passed. Defaults to None.
is_valid_file (optional) – A function that takes path of a file and checks if the file is a valid file (used to check of corrupt files) both extensions and is_valid_file should not be passed. Defaults to None.
- Raises:
ValueError – In case
class_to_idx
is empty.ValueError – In case
extensions
andis_valid_file
are None or both are not None.FileNotFoundError – In case no valid file was found for any class.
- Returns:
samples of a form (path_to_sample, class)
- Return type: