EmbeddingBag
- class torch.nn.EmbeddingBag(num_embeddings, embedding_dim, max_norm=None, norm_type=2.0, scale_grad_by_freq=False, mode='mean', sparse=False, _weight=None, include_last_offset=False, padding_idx=None, device=None, dtype=None)
Computes sums or means of ‘bags’ of embeddings, without instantiating the intermediate embeddings.
For bags of constant length, no per_sample_weights, no indices equal to padding_idx, and with 2D inputs, this class with mode="sum" is equivalent to Embedding followed by torch.sum(dim=1), with mode="mean" is equivalent to Embedding followed by torch.mean(dim=1), and with mode="max" is equivalent to Embedding followed by torch.max(dim=1).
However, EmbeddingBag is much more time and memory efficient than using a chain of these operations.
EmbeddingBag also supports per-sample weights as an argument to the forward pass. This scales the output of the Embedding before performing a weighted reduction as specified by mode. If per_sample_weights is passed, the only supported mode is "sum", which computes a weighted sum according to per_sample_weights.
- Parameters:
num_embeddings (int) – size of the dictionary of embeddings
embedding_dim (int) – the size of each embedding vector
max_norm (float, optional) – If given, each embedding vector with norm larger than max_norm is renormalized to have norm max_norm.
norm_type (float, optional) – The p of the p-norm to compute for the max_norm option. Default: 2.
scale_grad_by_freq (bool, optional) – If given, this will scale gradients by the inverse of frequency of the words in the mini-batch. Default: False. Note: this option is not supported when mode="max".
mode (str, optional) – "sum", "mean" or "max". Specifies the way to reduce the bag. "sum" computes the weighted sum, taking per_sample_weights into consideration. "mean" computes the average of the values in the bag, and "max" computes the max value over each bag. Default: "mean".
sparse (bool, optional) – If True, the gradient w.r.t. the weight matrix will be a sparse tensor. See Notes for more details regarding sparse gradients. Note: this option is not supported when mode="max".
include_last_offset (bool, optional) – If True, offsets has one additional element, where the last element is equivalent to the size of indices. This matches the CSR format.
padding_idx (int, optional) – If specified, the entries at padding_idx do not contribute to the gradient; therefore, the embedding vector at padding_idx is not updated during training, i.e. it remains as a fixed "pad". For a newly constructed EmbeddingBag, the embedding vector at padding_idx will default to all zeros, but can be updated to another value to be used as the padding vector. Note that the embedding vector at padding_idx is excluded from the reduction.
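The include_last_offset option can be illustrated with a short sketch. It assumes PyTorch is installed; the seed, the variable names, and the copy of weights through load_state_dict are illustrative choices, not part of the documented API contract. The same eight indices are split into two bags, once with start offsets only and once with a CSR-style offsets tensor whose last element equals the number of indices.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, used only to make the run repeatable

indices = torch.tensor([1, 2, 4, 5, 4, 3, 2, 9], dtype=torch.long)

# Default behaviour: offsets lists only where each bag starts.
bag = nn.EmbeddingBag(10, 3, mode="sum")
start_offsets = torch.tensor([0, 4], dtype=torch.long)

# CSR-style behaviour: one extra trailing offset equal to len(indices).
bag_csr = nn.EmbeddingBag(10, 3, mode="sum", include_last_offset=True)
bag_csr.load_state_dict(bag.state_dict())  # share the same weights for comparison
csr_offsets = torch.tensor([0, 4, len(indices)], dtype=torch.long)

with torch.no_grad():
    out_default = bag(indices, start_offsets)
    out_csr = bag_csr(indices, csr_offsets)

# Both calls describe the same two bags, so the outputs should agree.
print(out_csr.shape)
print(torch.allclose(out_default, out_csr))
```

With include_last_offset=True the number of bags is one less than the length of offsets, which is why three offsets still yield two output rows.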
- Variables:
weight (Tensor) – the learnable weights of the module of shape (num_embeddings, embedding_dim) initialized from $\mathcal{N}(0, 1)$.
Examples:
>>> import torch
>>> import torch.nn as nn
>>> # an EmbeddingBag module containing 10 tensors of size 3
>>> embedding_sum = nn.EmbeddingBag(10, 3, mode='sum')
>>> # a batch of 2 samples of 4 indices each
>>> input = torch.tensor([1, 2, 4, 5, 4, 3, 2, 9], dtype=torch.long)
>>> offsets = torch.tensor([0, 4], dtype=torch.long)
>>> embedding_sum(input, offsets)
tensor([[-0.8861, -5.4350, -0.0523],
        [ 1.1306, -2.5798, -1.0044]])

>>> # Example with padding_idx
>>> embedding_sum = nn.EmbeddingBag(10, 3, mode='sum', padding_idx=2)
>>> input = torch.tensor([2, 2, 2, 2, 4, 3, 2, 9], dtype=torch.long)
>>> offsets = torch.tensor([0, 4], dtype=torch.long)
>>> embedding_sum(input, offsets)
tensor([[ 0.0000,  0.0000,  0.0000],
        [-0.7082,  3.2145, -2.6251]])

>>> # An EmbeddingBag can be loaded from an Embedding like so
>>> embedding = nn.Embedding(10, 3, padding_idx=2)
>>> embedding_sum = nn.EmbeddingBag.from_pretrained(
...     embedding.weight,
...     padding_idx=embedding.padding_idx,
...     mode='sum')
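As a sanity check on the class description, the sketch below compares EmbeddingBag against an explicit Embedding lookup followed by a reduction over the bag dimension, for fixed-length 2D bags with no padding index. It then checks the per_sample_weights path in "sum" mode against a manually weighted sum. The seed, tensor values, and the results dictionary are assumptions made for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, used only to make the run repeatable

emb = nn.Embedding(10, 3)
idx = torch.tensor([[1, 2, 4, 5], [4, 3, 2, 9]])  # 2 bags of fixed length 4

results = {}
with torch.no_grad():
    vectors = emb(idx)  # shape (2, 4, 3): one embedding per index
    expected = {
        "sum": vectors.sum(dim=1),
        "mean": vectors.mean(dim=1),
        "max": vectors.max(dim=1).values,
    }
    for mode, reference in expected.items():
        # Reuse the Embedding weights so both paths see identical parameters.
        bag = nn.EmbeddingBag.from_pretrained(emb.weight.detach().clone(), mode=mode)
        results[mode] = torch.allclose(bag(idx), reference, atol=1e-6)

    # Per-sample weights are only supported with mode="sum".
    weights = torch.tensor([[1.0, 0.5, 0.0, 2.0], [1.0, 1.0, 1.0, 1.0]])
    bag_sum = nn.EmbeddingBag.from_pretrained(emb.weight.detach().clone(), mode="sum")
    weighted = bag_sum(idx, per_sample_weights=weights)
    reference = (vectors * weights.unsqueeze(-1)).sum(dim=1)
    results["weighted_sum"] = torch.allclose(weighted, reference, atol=1e-6)

print(results)
```

The explicit path materializes the full (B, N, embedding_dim) tensor before reducing, which is the intermediate that EmbeddingBag is designed to avoid.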
- forward(input, offsets=None, per_sample_weights=None)
Forward pass of EmbeddingBag.
- Parameters:
input (Tensor) – Tensor containing bags of indices into the embedding matrix.
offsets (Tensor, optional) – Only used when input is 1D. offsets determines the starting index position of each bag (sequence) in input.
per_sample_weights (Tensor, optional) – a tensor of float / double weights, or None to indicate all weights should be taken to be 1. If specified, per_sample_weights must have exactly the same shape as input and is treated as having the same offsets, if those are not None. Only supported for mode='sum'.
- Returns:
Tensor output of shape (B, embedding_dim).
- Return type:
Tensor
Note
A few notes about input and offsets:
input and offsets have to be of the same type, either int or long.
If input is 2D of shape (B, N), it will be treated as B bags (sequences) each of fixed length N, and this will return B values aggregated in a way depending on the mode. offsets is ignored and required to be None in this case.
If input is 1D of shape (N), it will be treated as a concatenation of multiple bags (sequences). offsets is required to be a 1D tensor containing the starting index positions of each bag in input. Therefore, for offsets of shape (B), input will be viewed as having B bags. Empty bags (i.e., having 0-length) will have returned vectors filled by zeros.
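The two input layouts described in the note can be compared directly. The sketch below is a minimal illustration, assuming PyTorch is installed; the seed and index values are arbitrary. It feeds the same indices as a 2D tensor and as a 1D tensor with offsets, then uses a repeated offset to create an empty bag in "sum" mode.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, used only to make the run repeatable

bag = nn.EmbeddingBag(10, 3, mode="sum")
flat = torch.tensor([1, 2, 4, 5, 4, 3, 2, 9], dtype=torch.long)

with torch.no_grad():
    # 2D input of shape (B, N) = (2, 4): offsets must be None.
    out_2d = bag(flat.view(2, 4))

    # 1D input: offsets gives the start position of each bag.
    out_1d = bag(flat, torch.tensor([0, 4], dtype=torch.long))

    # Repeated offset 3 makes the middle bag empty:
    # bag 0 = flat[0:3], bag 1 = flat[3:3] (empty), bag 2 = flat[3:8].
    out_ragged = bag(flat, torch.tensor([0, 3, 3], dtype=torch.long))

    first_bag_reference = bag.weight[flat[:3]].sum(dim=0)

print(torch.allclose(out_2d, out_1d))
print(out_ragged.shape)
print(out_ragged[1])  # the empty bag
```

The 1D form is the one to reach for when bags have different lengths, since a 2D tensor forces every bag to the same length N.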
- classmethod from_pretrained(embeddings, freeze=True, max_norm=None, norm_type=2.0, scale_grad_by_freq=False, mode='mean', sparse=False, include_last_offset=False, padding_idx=None)
Creates an EmbeddingBag instance from a given 2-dimensional FloatTensor.
- Parameters:
embeddings (Tensor) – FloatTensor containing weights for the EmbeddingBag. First dimension is being passed to EmbeddingBag as ‘num_embeddings’, second as ‘embedding_dim’.
freeze (bool, optional) – If True, the tensor does not get updated in the learning process. Equivalent to embeddingbag.weight.requires_grad = False. Default: True
max_norm (float, optional) – See module initialization documentation. Default: None
norm_type (float, optional) – See module initialization documentation. Default: 2.
scale_grad_by_freq (bool, optional) – See module initialization documentation. Default: False.
mode (str, optional) – See module initialization documentation. Default: "mean"
sparse (bool, optional) – See module initialization documentation. Default: False.
include_last_offset (bool, optional) – See module initialization documentation. Default: False.
padding_idx (int, optional) – See module initialization documentation. Default: None.
- Return type:
EmbeddingBag
Examples:
>>> # FloatTensor containing pretrained weights
>>> weight = torch.FloatTensor([[1, 2.3, 3], [4, 5.1, 6.3]])
>>> embeddingbag = nn.EmbeddingBag.from_pretrained(weight)
>>> # One bag holding indices 1 and 0; the default mode='mean' averages their embeddings
>>> input = torch.LongTensor([[1, 0]])
>>> embeddingbag(input)
tensor([[ 2.5000,  3.7000,  4.6500]])
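The effect of freeze can be seen by inspecting requires_grad on the resulting weight. The following sketch reuses the weights from the example above; the variable names frozen and trainable are illustrative. It also recomputes the mean by hand to show where the printed values come from.

```python
import torch
import torch.nn as nn

weight = torch.tensor([[1.0, 2.3, 3.0], [4.0, 5.1, 6.3]])

# freeze=True is the default, so the weight is excluded from gradient updates.
frozen = nn.EmbeddingBag.from_pretrained(weight)
# freeze=False keeps the pretrained values but leaves them trainable.
trainable = nn.EmbeddingBag.from_pretrained(weight, freeze=False)

print(frozen.weight.requires_grad)
print(trainable.weight.requires_grad)

# One bag with indices 1 and 0, reduced with the default mode='mean'.
out = frozen(torch.tensor([[1, 0]]))
manual_mean = weight[[1, 0]].mean(dim=0, keepdim=True)
print(torch.allclose(out, manual_mean))
```

Passing freeze=False is the usual choice when the pretrained vectors are meant as an initialization to be fine-tuned rather than as fixed features.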