lora_llama3_2_vision_11b

torchtune.models.llama3_2_vision.lora_llama3_2_vision_11b(lora_attn_modules: List[Literal['q_proj', 'k_proj', 'v_proj', 'output_proj']], decoder_trainable: str = 'frozen', encoder_trainable: str = 'lora', fusion_trainable: str = 'lora', apply_lora_to_mlp: bool = False, apply_lora_to_output: bool = False, lora_rank: int = 8, lora_alpha: float = 16, lora_dropout: float = 0.0, use_dora: bool = False, quantize_base: bool = False, image_size: int = 560) → DeepFusionModel[source]

Return a version of Llama 3.2 Vision (an instance of DeepFusionModel()) with LoRA applied to a subset of its layers based on the passed-in configuration.

Parameters:
  • lora_attn_modules (List[LORA_ATTN_MODULES]) – list of which linear layers LoRA should be applied to in each self-attention block. Options are {"q_proj", "k_proj", "v_proj", "output_proj"}.

  • decoder_trainable (str) – Option to set decoder params as fully trainable ("full"), LoRA trainable ("lora"), or frozen ("frozen"). Default: "frozen".

  • encoder_trainable (str) – Option to set encoder params as fully trainable ("full"), LoRA trainable ("lora"), or frozen ("frozen"). Default: "lora".

  • fusion_trainable (str) – Option to set fusion params as fully trainable ("full"), LoRA trainable ("lora"), or frozen ("frozen"). Default: "lora".

  • apply_lora_to_mlp (bool) – whether to apply LoRA to the MLP in each transformer layer. Default: False

  • apply_lora_to_output (bool) – whether to apply LoRA to the model’s final output projection. Default: False

  • lora_rank (int) – rank of each low-rank approximation

  • lora_alpha (float) – scaling factor for the low-rank approximation

  • lora_dropout (float) – LoRA dropout probability. Default: 0.0

  • use_dora (bool) – Decompose the LoRA weight into magnitude and direction, as introduced in "DoRA: Weight-Decomposed Low-Rank Adaptation" (https://arxiv.org/abs/2402.09353). Default: False

  • quantize_base (bool) – Whether to quantize base model weights or not. Only applied to base weights within the linear layers LoRA is applied to. Quantization of the final output linear projection is not currently supported. Default: False

  • image_size (int) – Base image size that images will be tiled and resized to. Default: 560 for Instruct weights; use 448 for pre-trained weights.

Returns:

Instantiation of Llama3.2 vision model with LoRA applied to a subset of the attention projections in each layer.

Return type:

DeepFusionModel
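
Example

For illustration, a minimal usage sketch (not part of the original docs): it builds the 11B vision model with LoRA adapters on the query and value projections, keeps the decoder frozen, and counts the resulting trainable parameters. The builder arguments come from the signature above; the parameter-counting snippet is an illustrative assumption that relies only on the returned model being a standard torch.nn.Module. Note that calling the builder instantiates the full 11B model in memory.

from torchtune.models.llama3_2_vision import lora_llama3_2_vision_11b

# Build the 11B vision model with LoRA on the query and value projections.
# The decoder stays frozen; the encoder and fusion layers get LoRA adapters.
model = lora_llama3_2_vision_11b(
    lora_attn_modules=["q_proj", "v_proj"],
    decoder_trainable="frozen",
    encoder_trainable="lora",
    fusion_trainable="lora",
    lora_rank=8,
    lora_alpha=16,
    lora_dropout=0.0,
)

# Only LoRA parameters (and any components set to "full") require gradients.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable params: {trainable:,} of {total:,}")

Setting quantize_base=True in addition to LoRA gives a QLoRA-style setup, quantizing the frozen base weights of the adapted linear layers.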
