DatasetFolder

class torchvision.datasets.DatasetFolder(root: Union[str, Path], loader: Callable[[str], Any], extensions: Optional[Tuple[str, ...]] = None, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, is_valid_file: Optional[Callable[[str], bool]] = None, allow_empty: bool = False)

A generic data loader.

The default directory structure, one subfolder per class under root, can be customized by overriding the find_classes() method.

Parameters:
  • root (str or pathlib.Path) – Root directory path.

  • loader (callable) – A function to load a sample given its path.

  • extensions (tuple[str], optional) – A tuple of allowed file extensions. extensions and is_valid_file should not both be passed.

  • transform (callable, optional) – A function/transform that takes in a sample and returns a transformed version, e.g. transforms.RandomCrop for images.

  • target_transform (callable, optional) – A function/transform that takes in the target and transforms it.

  • is_valid_file (callable, optional) – A function that takes the path of a file and checks whether it is a valid file (used to check for corrupt files). extensions and is_valid_file should not both be passed.

  • allow_empty (bool, optional) – If True, empty folders are considered to be valid classes. An error is raised on empty folders if False (default).
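As an illustration of how the parameters above fit together, here is a minimal sketch that builds a small text dataset on disk and wraps it in DatasetFolder. The class names ("cat", "dog"), the helper names load_text, make_toy_tree and demo, and the ".txt" extension are all hypothetical choices made for this example. The torchvision import sits inside demo() so that the helpers can be used without torchvision installed.

```python
import tempfile
from pathlib import Path


def load_text(path: str) -> str:
    """Loader: called with the path of one sample file, returns the sample."""
    return Path(path).read_text(encoding="utf-8")


def make_toy_tree(root: Path) -> None:
    """Create root/<class>/<n>.txt for two hypothetical classes."""
    contents = {"cat": ["meow", "purr"], "dog": ["woof"]}
    for class_name, texts in contents.items():
        (root / class_name).mkdir(parents=True, exist_ok=True)
        for i, text in enumerate(texts):
            (root / class_name / f"{i}.txt").write_text(text, encoding="utf-8")


def demo() -> None:
    # Imported lazily so the helpers above do not require torchvision.
    from torchvision.datasets import DatasetFolder

    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        make_toy_tree(root)
        # Pass extensions OR is_valid_file, not both.
        dataset = DatasetFolder(root=root, loader=load_text, extensions=(".txt",))
        print(dataset.classes)       # class names found under root
        print(dataset.class_to_idx)  # class name -> class index
        sample, target = dataset[0]  # loader output and its class index
        print(len(dataset), sample, target)
```

Calling demo() requires torchvision. Swapping extensions=(".txt",) for an is_valid_file callable would let the same sketch filter on something other than the file suffix.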

find_classes(directory: Union[str, Path]) → Tuple[List[str], Dict[str, int]]

Find the class folders in a dataset structured as follows:

directory/
├── class_x
│   ├── xxx.ext
│   ├── xxy.ext
│   └── ...
│       └── xxz.ext
└── class_y
    ├── 123.ext
    ├── nsdf3.ext
    └── ...
    └── asd932_.ext

This method can be overridden to only consider a subset of classes, or to adapt to a different dataset directory structure.

Parameters:

directory (str or pathlib.Path) – Root directory path, corresponding to self.root.

Raises:

FileNotFoundError – If directory has no class folders.

Returns:

List of all classes and dictionary mapping each class to an index.

Return type:

(Tuple[List[str], Dict[str, int]])
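The contract described above (class folders in, a list of names plus a name-to-index mapping out) can be sketched with the standard library alone. This is not torchvision's implementation. The function name find_classes_sketch and its keep argument are hypothetical, and assigning indices in sorted order of folder name is an assumption of this sketch.

```python
import os
from typing import Dict, Iterable, List, Optional, Tuple


def find_classes_sketch(
    directory: str, keep: Optional[Iterable[str]] = None
) -> Tuple[List[str], Dict[str, int]]:
    """Return (classes, class_to_idx) for the subfolders of `directory`.

    `keep` is a hypothetical extra argument that restricts the result
    to a subset of class names.
    """
    # Only directories count as classes; stray files are ignored.
    classes = sorted(e.name for e in os.scandir(directory) if e.is_dir())
    if keep is not None:
        wanted = set(keep)
        classes = [name for name in classes if name in wanted]
    if not classes:
        raise FileNotFoundError(f"No class folders found in {directory}.")
    # Assumption: indices follow the sorted order of the folder names.
    class_to_idx = {name: i for i, name in enumerate(classes)}
    return classes, class_to_idx
```

A DatasetFolder subclass that only wants some classes could return a result like this from its own find_classes() override.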

static make_dataset(directory: Union[str, Path], class_to_idx: Dict[str, int], extensions: Optional[Tuple[str, ...]] = None, is_valid_file: Optional[Callable[[str], bool]] = None, allow_empty: bool = False) → List[Tuple[str, int]]

Generates a list of samples of the form (path_to_sample, class).

This can be overridden to e.g. read files from a compressed zip file instead of from the disk.
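To make the zip-file idea concrete, the following standard-library sketch lists (member_name, class_index) pairs from an archive whose top-level folders are class names. The function name samples_from_zip and the assumed archive layout (class_name/file.ext) are hypothetical, and a matching loader that reads members out of the archive would also be needed.

```python
import zipfile
from typing import Dict, List, Tuple


def samples_from_zip(
    archive_path: str,
    class_to_idx: Dict[str, int],
    extensions: Tuple[str, ...],
) -> List[Tuple[str, int]]:
    """List (member_name, class_index) pairs for files inside a zip archive."""
    samples: List[Tuple[str, int]] = []
    with zipfile.ZipFile(archive_path) as archive:
        for member in sorted(archive.namelist()):
            if member.endswith("/"):
                continue  # skip directory entries
            # Assumed layout: the first path component is the class name.
            class_name, _, rest = member.partition("/")
            if not rest or class_name not in class_to_idx:
                continue  # top-level file or unknown class
            if member.lower().endswith(extensions):
                samples.append((member, class_to_idx[class_name]))
    return samples
```

The returned "paths" are archive member names rather than filesystem paths, so the loader passed to the dataset would have to open the archive itself.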

Parameters:
  • directory (str) – root dataset directory, corresponding to self.root.

  • class_to_idx (Dict[str, int]) – Dictionary mapping class name to class index.

  • extensions (tuple[str], optional) – A tuple of allowed file extensions. Exactly one of extensions or is_valid_file should be passed. Defaults to None.

  • is_valid_file (callable, optional) – A function that takes the path of a file and checks whether it is a valid file (used to check for corrupt files). extensions and is_valid_file should not both be passed. Defaults to None.

  • allow_empty (bool, optional) – If True, empty folders are considered to be valid classes. An error is raised on empty folders if False (default).

Raises:
  • ValueError – In case class_to_idx is empty.

  • ValueError – In case extensions and is_valid_file are both None or both not None.

  • FileNotFoundError – In case no valid file was found for any class.

Returns:

Samples of the form (path_to_sample, class).

Return type:

List[Tuple[str, int]]
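The argument checks and error cases listed above can be sketched in plain Python. This is an illustrative approximation, not torchvision's code. The name make_dataset_sketch is hypothetical, and the recursive walk into subfolders, the case-insensitive extension match, and the sorted ordering are assumptions of this sketch.

```python
import os
from typing import Callable, Dict, List, Optional, Tuple


def make_dataset_sketch(
    directory: str,
    class_to_idx: Dict[str, int],
    extensions: Optional[Tuple[str, ...]] = None,
    is_valid_file: Optional[Callable[[str], bool]] = None,
    allow_empty: bool = False,
) -> List[Tuple[str, int]]:
    """Collect (path_to_sample, class_index) pairs under `directory`."""
    if not class_to_idx:
        raise ValueError("class_to_idx must not be empty.")
    # Exactly one of the two filters must be given.
    if (extensions is None) == (is_valid_file is None):
        raise ValueError("Pass exactly one of extensions or is_valid_file.")

    if extensions is not None:
        suffixes = tuple(ext.lower() for ext in extensions)

        def check(path: str) -> bool:
            # Assumption: extensions are matched case-insensitively.
            return path.lower().endswith(suffixes)
    else:
        check = is_valid_file

    samples: List[Tuple[str, int]] = []
    empty_classes: List[str] = []
    for class_name in sorted(class_to_idx):
        class_dir = os.path.join(directory, class_name)
        found = False
        # Assumption: files in nested subfolders belong to the class too.
        for dirpath, _, filenames in sorted(os.walk(class_dir)):
            for filename in sorted(filenames):
                path = os.path.join(dirpath, filename)
                if check(path):
                    samples.append((path, class_to_idx[class_name]))
                    found = True
        if not found:
            empty_classes.append(class_name)

    if empty_classes and not allow_empty:
        raise FileNotFoundError(
            f"No valid files found for classes: {', '.join(empty_classes)}."
        )
    return samples
```

With allow_empty=True the sketch returns whatever samples it found and keeps classes without files, which mirrors the behavior the allow_empty parameter describes.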
