Decompressor

class torchdata.datapipes.iter.Decompressor(source_datapipe: IterDataPipe[Tuple[str, IOBase]], file_type: Optional[Union[str, CompressionType]] = None)

Takes tuples of path and compressed stream of data, and returns tuples of path and decompressed stream of data (functional name: decompress). The input compression format can either be specified explicitly through file_type or be detected automatically from each file's extension.

Parameters:
  • source_datapipe – IterDataPipe containing tuples of path and compressed stream of data

  • file_type – Optional string or CompressionType that specifies the compression format of the inputs

Example

>>> from torchdata.datapipes.iter import Decompressor, FileLister, FileOpener
>>> # self.temp_dir.name is a directory that contains the "*.tar" files to read
>>> tar_file_dp = FileLister(self.temp_dir.name, "*.tar")
>>> tar_load_dp = FileOpener(tar_file_dp, mode="b")
>>> tar_decompress_dp = Decompressor(tar_load_dp, file_type="tar")
>>> for _, stream in tar_decompress_dp:
...     print(stream.read())
b'0123456789abcdef'
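
To make the extension-based detection more concrete, the following is a minimal standard-library sketch of the same idea: it takes (path, compressed stream) tuples and yields (path, decompressed stream) tuples. It is not torchdata's implementation. The function name decompress_pairs, the _OPENERS table, and the set of supported extensions are assumptions made for illustration only.

```python
import bz2
import gzip
import io
import lzma
from typing import BinaryIO, Callable, Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical extension table for this sketch; not torchdata's internal mapping.
_OPENERS: Dict[str, Callable[[BinaryIO], BinaryIO]] = {
    "gz": lambda f: gzip.GzipFile(fileobj=f),
    "bz2": lambda f: bz2.BZ2File(f),
    "xz": lambda f: lzma.LZMAFile(f),
}


def decompress_pairs(
    pairs: Iterable[Tuple[str, BinaryIO]],
    file_type: Optional[str] = None,
) -> Iterator[Tuple[str, BinaryIO]]:
    """Yield (path, decompressed stream) for each (path, compressed stream)."""
    for path, stream in pairs:
        # Use the explicit format if given, otherwise fall back to the extension.
        kind = file_type or path.rsplit(".", 1)[-1].lower()
        if kind not in _OPENERS:
            raise ValueError(f"unsupported compression format for {path!r}: {kind!r}")
        yield path, _OPENERS[kind](stream)


raw = b"0123456789abcdef"
pairs = [
    ("a.gz", io.BytesIO(gzip.compress(raw))),
    ("b.bz2", io.BytesIO(bz2.compress(raw))),
    ("c.xz", io.BytesIO(lzma.compress(raw))),
]
# Each stream is decompressed lazily when read() is called.
results = [(path, stream.read()) for path, stream in decompress_pairs(pairs)]
```

Passing file_type (for example file_type="gz") in this sketch bypasses the extension lookup, which mirrors the role the file_type parameter plays in the description above for inputs whose names carry no recognizable extension.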
