lora_qwen2_5_1_5b_instruct¶
- torchtune.models.qwen2_5.lora_qwen2_5_1_5b_instruct(lora_attn_modules: List[Literal['q_proj', 'k_proj', 'v_proj', 'output_proj']], apply_lora_to_mlp: bool = False, apply_lora_to_output: bool = False, lora_rank: int = 8, lora_alpha: float = 16, lora_dropout: float = 0.0, use_dora: bool = False, quantize_base: bool = False) → TransformerDecoder
Builder for creating a Qwen2.5 1.5B instruct model with LoRA enabled.
The Qwen2.5 defaults are the same as in qwen2_5_1_5b_instruct(), while LoRA default params are based on https://github.com/tloen/alpaca-lora/blob/8bb8579e403dc78e37fe81ffbb253c413007323f/finetune.py#L41-L43.
- Parameters:
lora_attn_modules (List[LORA_ATTN_MODULES]) – list of which linear layers LoRA should be applied to in each self-attention block. Options are {"q_proj", "k_proj", "v_proj", "output_proj"}.
apply_lora_to_mlp (bool) – whether to apply LoRA to the MLP in each transformer layer. Default: False
apply_lora_to_output (bool) – whether to apply LoRA to the model's final output projection. Default: False
lora_rank (int) – rank of each low-rank approximation
lora_alpha (float) – scaling factor for the low-rank approximation
lora_dropout (float) – dropout probability for the low-rank approximation. Default: 0.0
use_dora (bool) – decompose the LoRA weight into magnitude and direction, as introduced in "DoRA: Weight-Decomposed Low-Rank Adaptation" (https://arxiv.org/abs/2402.09353). Default: False
quantize_base (bool) – Whether to quantize base model weights. Default: False
- Returns:
Instantiation of Qwen2.5 1.5B model with LoRA applied
- Return type:
TransformerDecoder
Note
Qwen2.5 0.5B-3B model builders will enable tie_word_embeddings by default (see qwen2()).

Note
The base and instruct versions have slightly different architectures for all Qwen2.5 model sizes except 0.5B and 3B. Make sure to select the correct model builder for the weights.
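A minimal usage sketch (not part of the generated reference above; it assumes torchtune is installed). The builder only constructs the LoRA-wrapped model; freezing the base weights is normally handled by the fine-tuning recipe, shown here with the torchtune.modules.peft helpers:

```python
# Sketch: build the Qwen2.5 1.5B instruct model with LoRA on the attention
# projections, then mark only the adapter weights as trainable, mirroring
# what torchtune's LoRA fine-tuning recipes do.
from torchtune.models.qwen2_5 import lora_qwen2_5_1_5b_instruct
from torchtune.modules.peft import get_adapter_params, set_trainable_params

model = lora_qwen2_5_1_5b_instruct(
    lora_attn_modules=["q_proj", "v_proj"],  # LoRA on query/value projections
    apply_lora_to_mlp=False,
    lora_rank=8,      # default from the alpaca-lora reference
    lora_alpha=16,    # default scaling factor
    lora_dropout=0.0,
)

# Freeze the base model; leave only the LoRA adapter parameters trainable.
set_trainable_params(model, get_adapter_params(model))

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable:,} / {total:,}")
```

The same pattern applies to the other Qwen2.5 LoRA builders; only the builder name and the checkpoint you load need to change.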