
DoRALinear

class torchtune.modules.peft.DoRALinear(in_dim: int, out_dim: int, rank: int, alpha: float, dropout: float = 0.0, use_bias: bool = False, quantize_base: bool = False, **quantization_kwargs)[source]

DoRA linear layer as introduced in DoRA: Weight-Decomposed Low-Rank Adaptation of Large Language Models.

DoRA (Weight-Decomposed Low-Rank Adaptation) fine-tunes a layer by decomposing the pre-trained weights into two components: magnitude and direction. The magnitude component is a learnable vector that scales each output channel, while the direction component, modified via LoRA, adjusts the orientation of the weights. By scaling the LoRA update component \(BAx\) with the magnitude vector, DoRA allows the model to apply distinct scaling adjustments across different output dimensions.
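
As a sketch of this decomposition (notation assumed from the DoRA paper: \(W_0\) is the frozen pre-trained weight, \(B\) and \(A\) are the low-rank adapter matrices, \(m\) is the learnable magnitude vector, and \(\lVert\cdot\rVert_c\) is the norm taken per output channel), the adapted weight can be written as

\[
W' = m \odot \frac{W_0 + \tfrac{\alpha}{r} B A}{\lVert W_0 + \tfrac{\alpha}{r} B A \rVert_c},
\]

where \(\tfrac{\alpha}{r}\) is the usual LoRA scaling of the low-rank update.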

Parameters:
  • in_dim (int) – input dimension

  • out_dim (int) – output dimension

  • rank (int) – rank of the low-rank approximation

  • alpha (float) – scaling factor for the low-rank approximation

  • dropout (float) – dropout probability. Default: 0.0

  • use_bias (bool) – whether to include bias in the original linear layer. Default: False

  • quantize_base (bool) – Whether to quantize base linear weight or not. Default: False

  • **quantization_kwargs – Keyword arguments to pass to to_nf4 when quantizing the base linear weight. Examples of valid arguments are block_size and scaler_block_size, which control the granularity of weight quantization and scaler quantization, respectively. This is only used if quantize_base is True. Default: None

Raises:

ValueError – If quantize_base is False, but quantization kwargs are provided.
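
A minimal usage sketch (not part of the reference above; the dimensions, hyperparameters, and eager CPU construction are illustrative assumptions):

import torch

from torchtune.modules.peft import DoRALinear

# Wrap a 512 -> 512 projection with a rank-8 DoRA adapter.
layer = DoRALinear(in_dim=512, out_dim=512, rank=8, alpha=16.0, dropout=0.05)

# Derive the magnitude vector from the current base weight so the layer
# starts out equivalent to plain LoRA (see initialize_dora_magnitude() below).
layer.initialize_dora_magnitude()

x = torch.randn(2, 16, 512)  # (batch, seq_len, in_dim)
y = layer(x)                 # (batch, seq_len, out_dim)
print(y.shape)               # torch.Size([2, 16, 512])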

adapter_params() → List[str][source]

Return lora_a.weight and lora_b.weight as adapter params. If bias is enabled, also return lora_a.bias and lora_b.bias.
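
A quick way to inspect this in practice (a sketch; the sizes are arbitrary and the expected entries follow the description above):

from torchtune.modules.peft import DoRALinear

layer = DoRALinear(in_dim=256, out_dim=256, rank=4, alpha=8.0)
print(layer.adapter_params())
# Expected, per the description above, to include 'lora_a.weight' and
# 'lora_b.weight' (plus the bias entries when bias is enabled).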

forward(x: Tensor) → Tensor[source]

Parameters:

x (torch.Tensor) – input tensor with shape (..., in_dim)

Returns:

output tensor with shape (..., out_dim)

Return type:

Tensor
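
A shape-only sketch of this contract (the leading dimensions are arbitrary; the concrete sizes are assumptions for illustration):

import torch

from torchtune.modules.peft import DoRALinear

layer = DoRALinear(in_dim=64, out_dim=128, rank=4, alpha=8.0)
layer.initialize_dora_magnitude()

x = torch.randn(3, 5, 64)  # any leading dims, last dim must equal in_dim
out = layer(x)
print(out.shape)           # torch.Size([3, 5, 128])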

initialize_dora_magnitude()[source]

DoRA initializes the magnitude vector such that its outputs are initially identical to standard LoRA’s outputs.
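
A sketch of a typical call site, under the assumption that the frozen base weight has already been loaded into the layer (for example via load_state_dict):

from torchtune.modules.peft import DoRALinear

layer = DoRALinear(in_dim=128, out_dim=128, rank=4, alpha=8.0)
# ... load the pre-trained base weight into the layer here ...
layer.initialize_dora_magnitude()
# The layer's output now matches a plain LoRA layer with the same adapters,
# as noted above.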
