torchtext.utils

reporthook

torchtext.utils.reporthook(t)[source]

Progress hook for downloads that updates the tqdm progress bar t. See https://github.com/tqdm/tqdm.
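This page gives only the tqdm link for reporthook, so the following is a minimal sketch of the reporthook pattern described in the tqdm documentation, not a confirmed copy of the torchtext implementation. The names make_reporthook and FakeBar are hypothetical; FakeBar stands in for a tqdm bar so the example runs without tqdm or network access. The returned callback takes the (block number, block size, total size) arguments that urllib.request.urlretrieve passes to its reporthook.

```python
class FakeBar:
    """Hypothetical stand-in for a tqdm bar: exposes `total` and `update(n)`."""

    def __init__(self):
        self.total = None
        self.n = 0

    def update(self, n):
        self.n += n


def make_reporthook(t):
    """Return a callback that advances progress bar `t` (hypothetical helper)."""
    last_b = [0]  # number of blocks seen at the previous call

    def inner(b=1, bsize=1, tsize=None):
        # b: blocks transferred so far, bsize: block size, tsize: total size
        if tsize is not None:
            t.total = tsize
        # Advance by the amount transferred since the previous call.
        t.update((b - last_b[0]) * bsize)
        last_b[0] = b

    return inner


bar = FakeBar()
hook = make_reporthook(bar)
# Simulate three 1024-byte blocks of a 4096-byte download.
for block in range(1, 4):
    hook(block, 1024, 4096)
```

With a real tqdm bar in place of FakeBar, a callback of this shape could be passed as the reporthook argument of urllib.request.urlretrieve.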

download_from_url

torchtext.utils.download_from_url(url, path=None, root='.data', overwrite=False, hash_value=None, hash_type='sha256')[source]

Download file, with logic (from tensor2tensor) for Google Drive. Returns the path to the downloaded file.

Parameters
  • url – the URL of the file to download

  • path – path where the file will be saved

  • root – download folder in which the file is stored (.data)

  • overwrite – overwrite existing files (False)

  • hash_value (str, optional) – hash for url (Default: None).

  • hash_type (str, optional) – hash type, among “sha256” and “md5” (Default: "sha256").

Examples

>>> import torchtext
>>> url = 'http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/validation.tar.gz'
>>> torchtext.utils.download_from_url(url)
'.data/validation.tar.gz'
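The hash_value and hash_type parameters suggest that the downloaded file is checked against a known digest. This page does not show how the check is done, so the following is only a sketch of one way to verify a file with the standard hashlib module. The helper file_matches_hash is hypothetical and not part of torchtext; the sample file is created locally so the example needs no network access.

```python
import hashlib
import os
import tempfile


def file_matches_hash(path, hash_value, hash_type="sha256"):
    """Hypothetical helper: compare a file's digest with an expected hex string."""
    if hash_type not in ("sha256", "md5"):
        raise ValueError("hash_type must be 'sha256' or 'md5'")
    digest = hashlib.new(hash_type)
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large downloads are not loaded into memory.
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest() == hash_value


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.bin")
    with open(sample, "wb") as f:
        f.write(b"hello")
    expected = hashlib.sha256(b"hello").hexdigest()
    ok = file_matches_hash(sample, expected)   # digest matches
    bad = file_matches_hash(sample, "0" * 64)  # digest does not match
```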

unicode_csv_reader

torchtext.utils.unicode_csv_reader(unicode_csv_data, **kwargs)[source]

Since the standard csv library does not handle unicode in Python 2, we need a wrapper. Borrowed and slightly modified from the Python docs: https://docs.python.org/2/library/csv.html#csv-examples

Parameters

unicode_csv_data – unicode csv data (see example below)

Examples

>>> from torchtext.utils import unicode_csv_reader
>>> import io
>>> data_path = 'data.csv'  # hypothetical path to a UTF-8 encoded CSV file
>>> with io.open(data_path, encoding="utf8") as f:
...     reader = unicode_csv_reader(f)
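In Python 3 the standard csv module reads unicode text streams directly, so the rows a reader produces can be illustrated without torchtext. The sketch below uses csv.reader on an in-memory stream; it assumes, without confirmation from this page, that unicode_csv_reader yields rows in the same list-of-strings form and forwards **kwargs such as delimiter to csv.reader. The sample data is made up for illustration.

```python
import csv
import io

# Made-up CSV content with non-ASCII text and a quoted field containing a comma.
data = io.StringIO('id,text\n1,"café, naïve"\n2,日本語\n')

# Each row comes back as a list of str, one entry per column.
rows = list(csv.reader(data, delimiter=","))
header, records = rows[0], rows[1:]
```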

extract_archive

torchtext.utils.extract_archive(from_path, to_path=None, overwrite=False)[source]

Extract archive.

Parameters
  • from_path – the path of the archive.

  • to_path – the root path of the extracted files (default: the directory of from_path)

  • overwrite – overwrite existing files (False)

Returns

List of paths to the extracted files, including files that already existed and were not overwritten.

Examples

>>> import torchtext
>>> url = 'http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/validation.tar.gz'
>>> from_path = './validation.tar.gz'
>>> to_path = './'
>>> torchtext.utils.download_from_url(url, from_path)
>>> torchtext.utils.extract_archive(from_path, to_path)
['./val.de', './val.en']
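The documented behaviour (return every extracted path, even for files that were left in place because overwrite is False) can be sketched with the standard tarfile module. The helper extract_tar below is hypothetical and is not the torchtext implementation; it handles only regular files in tar archives and does no path sanitising, so it should not be used on untrusted archives. The archive is built locally so the example needs no download.

```python
import os
import shutil
import tarfile
import tempfile


def extract_tar(from_path, to_path=None, overwrite=False):
    """Hypothetical helper: extract regular files and return their paths."""
    if to_path is None:
        to_path = os.path.dirname(from_path)
    files = []
    with tarfile.open(from_path, "r") as tar:  # "r" auto-detects compression
        for member in tar:
            if not member.isfile():
                continue
            file_path = os.path.join(to_path, member.name)
            files.append(file_path)  # reported even if extraction is skipped
            if os.path.exists(file_path) and not overwrite:
                continue
            os.makedirs(os.path.dirname(file_path) or ".", exist_ok=True)
            with tar.extractfile(member) as src, open(file_path, "wb") as dst:
                shutil.copyfileobj(src, dst)
    return files


with tempfile.TemporaryDirectory() as tmp:
    # Build a small archive containing two files.
    archive = os.path.join(tmp, "validation.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        for name in ("val.de", "val.en"):
            src_file = os.path.join(tmp, name)
            with open(src_file, "w") as f:
                f.write(name)
            tar.add(src_file, arcname=name)

    out_dir = os.path.join(tmp, "out")
    os.makedirs(out_dir)
    first = extract_tar(archive, out_dir)   # files are written
    second = extract_tar(archive, out_dir)  # files exist, nothing is rewritten
    names = [os.path.basename(p) for p in first]
```

Both calls return the same list, which mirrors the "even if not overwritten" wording in the Returns section.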
