IoPathSaver
- class torchdata.datapipes.iter.IoPathSaver(source_datapipe: IterDataPipe[Tuple[Any, Union[bytes, bytearray, str]]], mode: str = 'w', filepath_fn: Optional[Callable] = None, *, pathmgr=None)
Takes in a DataPipe of tuples of metadata and data, saves the data to the target path that filepath_fn generates from the metadata, and yields the resulting path in iopath format (functional name: save_by_iopath).
- Parameters:
source_datapipe – Iterable DataPipe with tuples of metadata and data
mode – Mode in which the file will be opened to write the data ("w" by default)
filepath_fn – Function that takes in metadata and returns the target path of the new file
pathmgr – Custom iopath.PathManager. If not specified, a default PathManager is created.
Note
Default PathManager currently supports local file paths, normal HTTP URLs and OneDrive URLs. S3 URLs are supported only with iopath>=0.1.9.
Example
>>> from torchdata.datapipes.iter import IterableWrapper
>>> # S3URL is not defined in this snippet; it is assumed to be a string prefix set elsewhere.
>>> def filepath_fn(name: str) -> str:
...     return S3URL + name
>>> name_to_data = {"1.txt": b"DATA1", "2.txt": b"DATA2", "3.txt": b"DATA3"}
>>> source_dp = IterableWrapper(sorted(name_to_data.items()))
>>> iopath_saver_dp = source_dp.save_by_iopath(filepath_fn=filepath_fn, mode="wb")
>>> res_file_paths = list(iopath_saver_dp)
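To make the save-and-yield behaviour described above concrete, the following is a minimal pure-Python sketch restricted to local paths. It is not the library's implementation: save_pairs, local_filepath_fn and target_dir are hypothetical names introduced only for illustration, the built-in open stands in for the PathManager, and using the metadata itself as the path when filepath_fn is omitted is an assumption.

```python
import os
import tempfile
from typing import Callable, Iterable, Iterator, Optional, Tuple, Union

Data = Union[bytes, bytearray, str]


def save_pairs(
    pairs: Iterable[Tuple[str, Data]],
    mode: str = "w",
    filepath_fn: Optional[Callable[[str], str]] = None,
) -> Iterator[str]:
    """Hypothetical stand-in for save_by_iopath, limited to local files."""
    for meta, data in pairs:
        # Assumption: without filepath_fn, the metadata is used as the path.
        path = meta if filepath_fn is None else filepath_fn(meta)
        # Built-in open() replaces the PathManager in this sketch.
        with open(path, mode) as f:
            f.write(data)
        yield path


# Hypothetical local target directory instead of an S3 prefix.
target_dir = tempfile.mkdtemp()


def local_filepath_fn(name: str) -> str:
    # Map the metadata (a file name) to a path under target_dir.
    return os.path.join(target_dir, name)


name_to_data = {"1.txt": b"DATA1", "2.txt": b"DATA2", "3.txt": b"DATA3"}
saved_paths = list(
    save_pairs(sorted(name_to_data.items()), mode="wb", filepath_fn=local_filepath_fn)
)
```

Because the data here are bytes, the sketch opens each file with mode="wb"; with the built-in open, writing bytes to a file opened in text mode "w" raises a TypeError. The generator in this sketch is lazy, so nothing is written until it is iterated, which is why the result is wrapped in list.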