OnlineReader
- class torchdata.datapipes.iter.OnlineReader(source_datapipe: IterDataPipe[str], *, timeout: Optional[float] = None)
Takes file URLs (either HTTP URLs pointing to files or URLs to GDrive files) and yields tuples of file URL and IO stream (functional name: read_from_remote).
Parameters:
source_datapipe – a DataPipe that contains URLs
timeout – timeout in seconds for HTTP request
Example
>>> from torchdata.datapipes.iter import IterableWrapper, OnlineReader
>>> file_url = "https://raw.githubusercontent.com/pytorch/data/main/LICENSE"
>>> online_reader_dp = OnlineReader(IterableWrapper([file_url]))
>>> reader_dp = online_reader_dp.readlines()
>>> it = iter(reader_dp)
>>> path, line = next(it)
>>> path
https://raw.githubusercontent.com/pytorch/data/main/LICENSE
>>> line
b'BSD 3-Clause License'
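The shape of the data flowing through this example, (URL, stream) tuples that become (URL, line) tuples after readlines(), can be illustrated without network access. The following is a minimal standard-library sketch, not torchdata's implementation: sketch_online_reader, sketch_readlines, fake_open, FAKE_REMOTE, and the example.com URL are all hypothetical names introduced for illustration, and stripping the trailing newline is an assumption made to match the output shown in the example.

```python
import io
from typing import BinaryIO, Callable, Iterable, Iterator, Tuple

# Hypothetical in-memory "remote" so the sketch runs offline.
FAKE_REMOTE = {"https://example.com/notes.txt": b"first line\nsecond line\n"}


def fake_open(url: str) -> BinaryIO:
    """Stand-in for an HTTP request: return a binary stream for the URL."""
    return io.BytesIO(FAKE_REMOTE[url])


def sketch_online_reader(
    urls: Iterable[str], open_url: Callable[[str], BinaryIO]
) -> Iterator[Tuple[str, BinaryIO]]:
    """Yield one (url, stream) tuple per input URL."""
    for url in urls:
        yield url, open_url(url)


def sketch_readlines(
    pairs: Iterable[Tuple[str, BinaryIO]]
) -> Iterator[Tuple[str, bytes]]:
    """Turn (url, stream) tuples into (url, line) tuples of raw bytes."""
    for url, stream in pairs:
        for raw in stream:
            # Assumption: drop the line terminator, as in the example output.
            yield url, raw.rstrip(b"\r\n")


pairs = list(sketch_readlines(sketch_online_reader(list(FAKE_REMOTE), fake_open)))
for url, line in pairs:
    print(url, line)
```

Each yielded line is paired with the URL it came from, so downstream steps can still tell which file a line belongs to when several URLs are read in one pipeline.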