torchtext.models

RobertaModelBundle

class torchtext.models.RobertaModelBundle(_params: torchtext.models.RobertaEncoderParams, _path: Optional[str] = None, _head: Optional[torch.nn.Module] = None, transform: Optional[Callable] = None)

Example - Pretrained base XLM-R encoder

>>> import torch, torchtext
>>> from torchtext.functional import to_tensor
>>> xlmr_base = torchtext.models.XLMR_BASE_ENCODER
>>> model = xlmr_base.get_model()
>>> transform = xlmr_base.transform()
>>> input_batch = ["Hello world", "How are you!"]
>>> model_input = to_tensor(transform(input_batch), padding_value=1)
>>> output = model(model_input)
>>> output.shape
torch.Size([2, 6, 768])
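
The middle dimension of the output shape above is the padded sequence length: the two inputs tokenize to different lengths, and `to_tensor` pads the shorter one with `padding_value` so the batch is rectangular. The following is a minimal pure-Python sketch of that padding step, not the torchtext implementation. The helper name `pad_batch` and the token ids are hypothetical placeholders, and right-padding is assumed.

```python
from typing import List

PAD_ID = 1  # same value as padding_value=1 in the example above


def pad_batch(batch: List[List[int]], padding_value: int) -> List[List[int]]:
    """Right-pad every sequence to the length of the longest one."""
    max_len = max(len(seq) for seq in batch)
    return [seq + [padding_value] * (max_len - len(seq)) for seq in batch]


# Placeholder token ids for illustration only, not real vocabulary ids.
token_ids = [
    [0, 101, 102, 2],            # shorter sequence: 4 tokens
    [0, 201, 202, 203, 204, 2],  # longer sequence: 6 tokens
]

padded = pad_batch(token_ids, PAD_ID)
shape = (len(padded), len(padded[0]))
print(shape)  # (2, 6)
```

The encoder then adds one embedding vector per token position, which is where the trailing dimension of 768 in the base configuration comes from.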

Example - Pretrained large XLM-R encoder attached to an uninitialized classification head

>>> import torch, torchtext
>>> from torchtext.models import RobertaClassificationHead
>>> from torchtext.functional import to_tensor
>>> xlmr_large = torchtext.models.XLMR_LARGE_ENCODER
>>> classifier_head = RobertaClassificationHead(num_classes=2, input_dim=1024)
>>> model = xlmr_large.get_model(head=classifier_head)
>>> transform = xlmr_large.transform()
>>> input_batch = ["Hello world", "How are you!"]
>>> model_input = to_tensor(transform(input_batch), padding_value=1)
>>> output = model(model_input)
>>> output.shape
torch.Size([2, 2])

Example - User-specified configuration and checkpoint

>>> from torchtext.models import RobertaEncoderConf, RobertaModelBundle, RobertaClassificationHead
>>> model_weights_path = "https://download.pytorch.org/models/text/xlmr.base.encoder.pt"
>>> encoder_conf = RobertaEncoderConf(vocab_size=250002)
>>> classifier_head = RobertaClassificationHead(num_classes=2, input_dim=768)
>>> model = RobertaModelBundle.build_model(encoder_conf=encoder_conf, head=classifier_head, checkpoint=model_weights_path)

get_model(head: Optional[torch.nn.Module] = None, load_weights: bool = True, freeze_encoder: bool = False, *, dl_kwargs=None) → torchtext.models.RobertaModel

Parameters

head (nn.Module) – A module to attach to the encoder to perform a specific task. If provided, it replaces the default member head. (Default: None)
load_weights (bool) – Indicates whether or not to load weights if available. (Default: True)
freeze_encoder (bool) – Indicates whether or not to freeze the encoder weights. (Default: False)
dl_kwargs (dictionary of keyword arguments) – Passed to torch.hub.load_state_dict_from_url(). (Default: None)
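
As a rough illustration of what `freeze_encoder=True` is meant to achieve, the sketch below freezes a stand-in encoder and confirms that only the head receives gradients. It assumes that freezing amounts to setting `requires_grad = False` on the encoder parameters. The `encoder` and `head` modules are small hypothetical placeholders, not the torchtext classes, so nothing is downloaded.

```python
import torch
from torch import nn

# Hypothetical stand-ins for a pretrained encoder and a task head.
encoder = nn.Linear(8, 8)
head = nn.Linear(8, 2)

# Assumed effect of freeze_encoder=True: encoder parameters stop
# receiving gradients, so an optimizer would leave them unchanged.
for param in encoder.parameters():
    param.requires_grad = False

model = nn.Sequential(encoder, head)

# One backward pass on random input.
loss = model(torch.randn(4, 8)).sum()
loss.backward()

encoder_grads = [p.grad for p in encoder.parameters()]
head_grads = [p.grad for p in head.parameters()]

# Count the parameters that would still be trained (the head only).
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # 18 = 8 * 2 weights + 2 biases
```

Freezing in this way is a common choice when the task dataset is small: the head is trained on top of fixed pretrained features, which reduces both compute and the risk of overfitting.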

XLMR_BASE_ENCODER

torchtext.models.XLMR_BASE_ENCODER

XLM-R Encoder with Base configuration

The XLM-RoBERTa model was proposed in Unsupervised Cross-lingual Representation Learning at Scale <https://arxiv.org/abs/1911.02116>. It is a large multi-lingual language model, trained on 2.5TB of filtered CommonCrawl data and based on the RoBERTa model architecture.

Originally published by the authors of XLM-RoBERTa under MIT License and redistributed with the same license. [License, Source]

Please refer to torchtext.models.RobertaModelBundle() for the usage.

XLMR_LARGE_ENCODER

torchtext.models.XLMR_LARGE_ENCODER

XLM-R Encoder with Large configuration

The XLM-RoBERTa model was proposed in Unsupervised Cross-lingual Representation Learning at Scale <https://arxiv.org/abs/1911.02116>. It is a large multi-lingual language model, trained on 2.5TB of filtered CommonCrawl data and based on the RoBERTa model architecture.

Originally published by the authors of XLM-RoBERTa under MIT License and redistributed with the same license. [License, Source]

Please refer to torchtext.models.RobertaModelBundle() for the usage.

ROBERTA_BASE_ENCODER

torchtext.models.ROBERTA_BASE_ENCODER

RoBERTa Encoder with Base configuration

RoBERTa iterates on BERT’s pretraining procedure, including training the model longer, with bigger batches over more data; removing the next sentence prediction objective; training on longer sequences; and dynamically changing the masking pattern applied to the training data.

The RoBERTa model was pretrained on the union of five datasets: BookCorpus, English Wikipedia, CC-News, OpenWebText, and STORIES. Together these datasets contain over 160GB of text.

Originally published by the authors of RoBERTa under MIT License and redistributed with the same license. [License, Source]

Please refer to torchtext.models.RobertaModelBundle() for the usage.

ROBERTA_LARGE_ENCODER

torchtext.models.ROBERTA_LARGE_ENCODER

RoBERTa Encoder with Large configuration

RoBERTa iterates on BERT’s pretraining procedure, including training the model longer, with bigger batches over more data; removing the next sentence prediction objective; training on longer sequences; and dynamically changing the masking pattern applied to the training data.

The RoBERTa model was pretrained on the union of five datasets: BookCorpus, English Wikipedia, CC-News, OpenWebText, and STORIES. Together these datasets contain over 160GB of text.

Originally published by the authors of RoBERTa under MIT License and redistributed with the same license. [License, Source]

Please refer to torchtext.models.RobertaModelBundle() for the usage.
