HuggingFaceHubReader
- class torchdata.datapipes.iter.HuggingFaceHubReader(dataset: str, **config_kwargs)
Takes a dataset name and returns an iterable HuggingFace dataset. Please refer to https://huggingface.co/docs/datasets/loading for the meaning and type of each argument. Unlike the HuggingFace implementation, the default behavior differs in the following:
streaming is set to True
- Parameters:
dataset – path or name of the dataset
**config_kwargs – additional arguments for datasets.load_dataset()
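As a rough illustration of how the streaming default interacts with **config_kwargs, the sketch below merges caller-supplied arguments over a default of streaming=True. The helper name resolve_load_kwargs is hypothetical and is not part of torchdata or datasets. The sketch also assumes that an explicit streaming value from the caller takes precedence, which this page does not confirm.

```python
from typing import Any, Dict


def resolve_load_kwargs(**config_kwargs: Any) -> Dict[str, Any]:
    """Hypothetical helper: apply the streaming=True default, then caller overrides."""
    # Start from the default described above (streaming enabled).
    merged: Dict[str, Any] = {"streaming": True}
    # Assumption: caller-supplied keyword arguments win over the default.
    merged.update(config_kwargs)
    return merged


# Only revision is given, so the streaming default is kept.
default_kwargs = resolve_load_kwargs(revision="main")

# The caller sets streaming explicitly, so the default is replaced.
explicit_kwargs = resolve_load_kwargs(revision="main", streaming=False)
```

Under these assumptions, the merged dictionary is what would be forwarded to datasets.load_dataset() alongside the dataset name.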
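Example:
    from torchdata.datapipes.iter import HuggingFaceHubReader

    # Stream the dataset from the Hub and read its first record.
    huggingface_reader_dp = HuggingFaceHubReader("lhoestq/demo1", revision="main")
    elem = next(iter(huggingface_reader_dp))
    assert elem["package_name"] == "com.mantz_it.rfanalyzer"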
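Because the reader yields records lazily, standard iterator tools can be used to look at only the first few elements. The sketch below shows this with itertools.islice. To stay runnable offline it uses a stand-in generator, fake_records, which is hypothetical and yields placeholder data rather than real records from the Hub.

```python
from itertools import islice
from typing import Dict, Iterator


def fake_records() -> Iterator[Dict[str, str]]:
    """Hypothetical stand-in for a streaming DataPipe; yields placeholder records."""
    for i in range(1000):
        yield {"package_name": f"example.package_{i}"}


# Take only the first three records; the rest of the stream is never produced.
first_three = list(islice(fake_records(), 3))
names = [record["package_name"] for record in first_three]
```

With a real pipeline, the same pattern would presumably be islice(huggingface_reader_dp, 3), since the DataPipe is an ordinary Python iterable.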