lora_qwen2_5_1_5b_base

torchtune.models.qwen2_5.lora_qwen2_5_1_5b_base(lora_attn_modules: List[Literal['q_proj', 'k_proj', 'v_proj', 'output_proj']], apply_lora_to_mlp: bool = False, apply_lora_to_output: bool = False, lora_rank: int = 8, lora_alpha: float = 16, lora_dropout: float = 0.0, use_dora: bool = False, quantize_base: bool = False) → TransformerDecoder

Builder for creating a Qwen2.5 1.5B base model with LoRA enabled.

The Qwen2.5 defaults are the same as in qwen2_5_1_5b_base(), while the LoRA default parameters are based on https://github.com/tloen/alpaca-lora/blob/8bb8579e403dc78e37fe81ffbb253c413007323f/finetune.py#L41-L43.

Parameters:
  • lora_attn_modules (List[LORA_ATTN_MODULES]) – list of which linear layers LoRA should be applied to in each self-attention block. Options are {"q_proj", "k_proj", "v_proj", "output_proj"}.

  • apply_lora_to_mlp (bool) – whether to apply LoRA to the MLP in each transformer layer. Default: False

  • apply_lora_to_output (bool) – whether to apply LoRA to the model's final output projection. Default: False

  • lora_rank (int) – rank of each low-rank approximation. Default: 8

  • lora_alpha (float) – scaling factor for the low-rank approximation. Default: 16

  • lora_dropout (float) – dropout probability for the low-rank approximation. Default: 0.0

  • use_dora (bool) – whether to decompose each LoRA weight into magnitude and direction components, as introduced in "DoRA: Weight-Decomposed Low-Rank Adaptation" (https://arxiv.org/abs/2402.09353). Default: False

  • quantize_base (bool) – whether to quantize the base model weights. Default: False

Returns:

Instantiation of Qwen2.5 1.5B model with LoRA applied

Return type:

TransformerDecoder

Note

The Qwen2.5 0.5B-3B model builders enable tie_word_embeddings by default (see qwen2()).

Note

The base and instruct versions have slightly different architectures for all Qwen2.5 model sizes except 0.5B and 3B. Make sure to select the correct model builder for the weights.
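
Example

A minimal usage sketch. The argument values below are illustrative choices, not recommended settings; only the call signature itself follows the reference above.

    from torchtune.models.qwen2_5 import lora_qwen2_5_1_5b_base

    # Apply LoRA to every attention projection and to each MLP; base weights
    # are left unquantized since quantize_base defaults to False.
    model = lora_qwen2_5_1_5b_base(
        lora_attn_modules=["q_proj", "k_proj", "v_proj", "output_proj"],
        apply_lora_to_mlp=True,
        lora_rank=8,
        lora_alpha=16,
        lora_dropout=0.05,
    )

The returned TransformerDecoder holds both the base weights and the LoRA adapter weights; which parameters are actually trained is typically decided by the fine-tuning recipe rather than by this builder.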
