RMSNorm

class torch.nn.RMSNorm(normalized_shape, eps=None, elementwise_affine=True, device=None, dtype=None)

Applies Root Mean Square Layer Normalization over a mini-batch of inputs.

This layer implements the operation described in the paper Root Mean Square Layer Normalization:

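y_i = \frac{x_i}{\sqrt{\frac{1}{n} \sum_{j=1}^{n} x_j^2 + \epsilon}} * \gamma_i

where n is the number of elements in the normalized dimensions and \gamma is the learnable weight.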

The root mean square is taken over the last D dimensions, where D is the number of dimensions in normalized_shape. For example, if normalized_shape is (3, 5) (a 2-dimensional shape), the RMS is computed over the last 2 dimensions of the input.
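As a framework-free illustration of the operation, the following minimal sketch divides each element by the square root of the mean of squares plus eps, then scales by a weight. The flat list stands in for the elements spanned by normalized_shape. The function name rms_norm_reference and its eps default of 1e-6 are assumptions made for this example and are not part of the torch.nn.RMSNorm API.

```python
import math


def rms_norm_reference(values, weight=None, eps=1e-6):
    """Normalize a flat list of floats by its root mean square.

    `values` stands in for the elements spanned by `normalized_shape`;
    `weight` plays the role of gamma and defaults to all ones.
    The function name and the eps default are assumptions for this sketch.
    """
    n = len(values)
    if weight is None:
        weight = [1.0] * n
    mean_square = sum(v * v for v in values) / n
    inv_rms = 1.0 / math.sqrt(mean_square + eps)
    return [v * inv_rms * w for v, w in zip(values, weight)]


y = rms_norm_reference([3.0, 4.0], eps=0.0)
# With eps = 0 and unit weights, the output has a mean square of 1.
mean_square_out = sum(v * v for v in y) / len(y)
```

With unit weights, the output keeps the direction of the input and only rescales its magnitude, which is the main difference from layer normalization (no mean subtraction).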

Parameters
  • normalized_shape (int or list or torch.Size) –

    input shape from an expected input of size

    [* \times \text{normalized\_shape}[0] \times \text{normalized\_shape}[1] \times \ldots \times \text{normalized\_shape}[-1]]

    If a single integer is used, it is treated as a singleton list, and this module will normalize over the last dimension which is expected to be of that specific size.

  • eps (Optional[float]) – a value added to the denominator for numerical stability. Default: torch.finfo(x.dtype).eps

  • elementwise_affine (bool) – if True, this module has a learnable per-element scale parameter (weight) initialized to ones. RMSNorm has no bias parameter. Default: True.

Shape:
  • Input: (N, *)

  • Output: (N, *) (same shape as input)

Examples:

>>> import torch
>>> import torch.nn as nn
>>> rms_norm = nn.RMSNorm([2, 3])
>>> input = torch.randn(2, 2, 3)
>>> rms_norm(input)

extra_repr()

Extra information about the module.

Return type

str

forward(x)

Runs the forward pass.

Return type

Tensor

reset_parameters()

Resets parameters based on their initialization used in __init__.
