FSSpecSaver
class torchdata.datapipes.iter.FSSpecSaver(source_datapipe: IterDataPipe[Tuple[Any, Union[bytes, bytearray, str]]], mode: str = 'w', filepath_fn: Optional[Callable] = None, *, kwargs_for_open: Optional[Dict] = None, **kwargs)
Takes in a DataPipe of tuples of metadata and data, saves the data to the target path (generated by the filepath_fn and metadata), and yields the resulting fsspec path (functional name: save_by_fsspec).
Parameters:
source_datapipe – Iterable DataPipe with tuples of metadata and data
mode – Mode in which the file will be opened to write the data ("w" by default)
filepath_fn – Function that takes in metadata and returns the target path of the new file
kwargs_for_open – Optional Dict to specify kwargs for opening files (fs.open())
kwargs – Extra options that are used to establish a particular storage connection, e.g. host, port, username, password, etc.
Example:
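    from torchdata.datapipes.iter import IterableWrapper

    # file_prefix is not defined in this snippet; set it to the target
    # directory or URL prefix before running.
    def filepath_fn(name: str) -> str:
        return file_prefix + name

    name_to_data = {"1.txt": b"DATA1", "2.txt": b"DATA2", "3.txt": b"DATA3"}
    source_dp = IterableWrapper(sorted(name_to_data.items()))
    # The payloads are bytes, so the files are opened in binary mode.
    fsspec_saver_dp = source_dp.save_by_fsspec(filepath_fn=filepath_fn, mode="wb")
    # Files are written as the DataPipe is iterated.
    res_file_paths = list(fsspec_saver_dp)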
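The behavior described above can be approximated with a standard-library sketch. It is not the torchdata implementation: the hypothetical function save_locally uses the built-in open() instead of fsspec, so it only handles local paths and ignores kwargs_for_open and the connection kwargs. It also assumes that the metadata itself serves as the path when filepath_fn is None. The point is only to show the per-item flow: derive a path from the metadata, write the data in the given mode, and yield the path.

```python
import os
import tempfile
from typing import Any, Callable, Iterable, Iterator, Optional, Tuple, Union

Data = Union[bytes, bytearray, str]


def save_locally(
    source: Iterable[Tuple[Any, Data]],
    mode: str = "w",
    filepath_fn: Optional[Callable[[Any], str]] = None,
) -> Iterator[str]:
    """Hypothetical local-only stand-in for the saver's iteration loop."""
    for meta, data in source:
        # Assumption: without filepath_fn, the metadata itself is the path.
        path = meta if filepath_fn is None else filepath_fn(meta)
        with open(path, mode) as f:
            f.write(data)
        yield path


target_dir = tempfile.mkdtemp()  # scratch directory for the demo
name_to_data = {"1.txt": b"DATA1", "2.txt": b"DATA2", "3.txt": b"DATA3"}
saved_paths = list(
    save_locally(
        sorted(name_to_data.items()),
        mode="wb",  # bytes payloads need a binary mode; use "w" for str
        filepath_fn=lambda name: os.path.join(target_dir, name),
    )
)
```

Because save_locally is a generator, nothing is written until it is consumed, which is why the sketch wraps the call in list().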