AISFileLister

class torchdata.datapipes.iter.AISFileLister(source_datapipe: IterDataPipe[str], url: str, length: int = -1)

Iterable DataPipe that lists files from AIStore backends with the given URL prefixes (functional name: list_files_by_ais). Acceptable prefixes include, but are not limited to, ais://bucket-name and ais://bucket-name/.
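To make the prefix format concrete, here is a minimal sketch of how a prefix such as ais://bucket-name/folder/ breaks down into a provider, a bucket, and an object prefix. The helper name split_prefix is hypothetical and uses only the Python standard library; it is not part of torchdata or the AIStore SDK, and the library's own parsing may differ.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_prefix(url_prefix: str) -> Tuple[str, str, str]:
    """Split 'provider://bucket/prefix' into (provider, bucket, prefix).

    Hypothetical helper for illustration only; not a torchdata or AIStore API.
    """
    parsed = urlparse(url_prefix)
    # A well-formed prefix needs both a scheme (provider) and a bucket name.
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"expected provider://bucket[/prefix], got {url_prefix!r}")
    # Drop the leading slash so the remainder reads as an object-name prefix.
    return parsed.scheme, parsed.netloc, parsed.path.lstrip("/")


provider, bucket, prefix = split_prefix("ais://bucket-name/folder/")
```

Under these assumptions, a string without the `//` separator after the provider (for example `aws:bucket-name/folder/`) has no bucket component and is rejected with a ValueError.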

Note

  • This DataPipe also supports files from multiple backends (aws://.., gcp://.., azure://.., etc.).

  • The input must be a list; direct URLs are not supported.

  • length is -1 by default, and all calls to len() are invalid because not all items are iterated at the start.

  • This internally uses the AIStore Python SDK.
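The note on length can be illustrated with a small stand-in class. This is a sketch of the described behaviour only, assuming that a length of -1 is treated as "unknown" and makes len() raise; LazyLister is a hypothetical name and is not the torchdata implementation.

```python
from typing import Iterable, Iterator


class LazyLister:
    """Hypothetical stand-in that mimics the documented length semantics."""

    def __init__(self, source: Iterable[str], length: int = -1) -> None:
        self.source = source
        self.length = length

    def __iter__(self) -> Iterator[str]:
        # Items are produced lazily, so the total count is unknown up front.
        for item in self.source:
            yield item

    def __len__(self) -> int:
        # -1 is the sentinel for "length not provided".
        if self.length == -1:
            raise TypeError(f"{type(self).__name__} instance doesn't have valid length")
        return self.length


lister = LazyLister(["ais://bucket-name/a", "ais://bucket-name/b"])
urls = [url for url in lister]  # iteration works without a known length
```

In this sketch, passing an explicit length (for example `LazyLister(source, length=2)`) is what makes len() usable; iteration itself never depends on it.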

Parameters:
  • source_datapipe (IterDataPipe[str]) – a DataPipe that contains URLs/URL prefixes to objects on AIS

  • url (str) – AIStore endpoint

  • length (int) – length of the datapipe (default: -1)

Example

>>> from torchdata.datapipes.iter import IterableWrapper, AISFileLister
>>> ais_prefixes = IterableWrapper(['gcp://bucket-name/folder/', 'aws://bucket-name/folder/', 'ais://bucket-name/folder/', ...])
>>> dp_ais_urls = AISFileLister(url='localhost:8080', source_datapipe=ais_prefixes)
>>> for url in dp_ais_urls:
...     pass
>>> # Functional API
>>> dp_ais_urls = ais_prefixes.list_files_by_ais(url='localhost:8080')
>>> for url in dp_ais_urls:
...     pass
