HttpReader

class torchdata.datapipes.iter.HttpReader(source_datapipe: IterDataPipe[str], timeout: Optional[float] = None, skip_on_error: bool = False, **kwargs: Optional[Dict[str, Any]])

Takes HTTP URLs pointing to files and yields tuples of (file URL, IO stream) (functional name: read_from_http).

Parameters:
  • source_datapipe – a DataPipe that contains URLs

  • timeout – timeout in seconds for each HTTP request

  • skip_on_error – whether to skip URLs that cause errors; if False, the exception is raised instead

  • **kwargs – optional keyword arguments forwarded to requests. For the full list, see https://docs.python-requests.org/en/master/api/

Example:

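from torchdata.datapipes.iter import IterableWrapper, HttpReader

file_url = "https://raw.githubusercontent.com/pytorch/data/main/LICENSE"
# Despite the name, these are keyword arguments forwarded to requests, not URL query parameters.
query_params = {"auth": ("fake_username", "fake_password"), "allow_redirects": True}
timeout = 120  # seconds
http_reader_dp = HttpReader(IterableWrapper([file_url]), timeout=timeout, **query_params)
# readlines() turns each (url, stream) pair into one (url, line) pair per line of the file.
reader_dp = http_reader_dp.readlines()
it = iter(reader_dp)
path, line = next(it)  # first line of the first file
print((path, line))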

Output:

('https://raw.githubusercontent.com/pytorch/data/main/LICENSE', b'BSD 3-Clause License')
