llama3_2_vision_11b

torchtune.models.llama3_2_vision.llama3_2_vision_11b(decoder_trainable: bool = False, encoder_trainable: bool = True, fusion_trainable: bool = True, image_size: int = 560) → DeepFusionModel

Builds the Llama 3.2 Vision 11B model.

Parameters:
  • decoder_trainable (bool) – Whether to make decoder params trainable. Default is False.

  • encoder_trainable (bool) – Whether to make encoder params trainable. Default is True.

  • fusion_trainable (bool) – Whether to make fusion params trainable. Default is True.

  • image_size (int) – Base image size that images will be tiled and resized to. Default is 560, which matches the Instruct weights; use 448 for the pre-trained weights.

Returns:

Instantiation of the Llama 3.2 Vision 11B model

Return type:

DeepFusionModel
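
Example:

The following is a minimal sketch of how the parameters above might be combined, not an official recipe. The helper names `vision_11b_kwargs` and `build_vision_11b`, and the "instruct" / "pretrained" labels, are hypothetical and introduced only for illustration. The only torchtune name used is the builder documented here. Note that calling the builder instantiates the full 11B model, so the import and call are kept in a separate function.

```python
from typing import Any, Dict

# Image sizes taken from the image_size parameter description above.
# The dictionary keys are illustrative labels, not torchtune identifiers.
IMAGE_SIZE_BY_WEIGHTS = {"instruct": 560, "pretrained": 448}


def vision_11b_kwargs(weights: str = "instruct", train_decoder: bool = False) -> Dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for llama3_2_vision_11b."""
    if weights not in IMAGE_SIZE_BY_WEIGHTS:
        raise ValueError(f"unknown weights variant: {weights!r}")
    return {
        # Matches the documented defaults unless train_decoder is set.
        "decoder_trainable": train_decoder,
        "encoder_trainable": True,
        "fusion_trainable": True,
        "image_size": IMAGE_SIZE_BY_WEIGHTS[weights],
    }


def build_vision_11b(**kwargs: Any):
    """Hypothetical wrapper: imports torchtune lazily, then calls the builder."""
    from torchtune.models.llama3_2_vision import llama3_2_vision_11b

    # Allocates the full 11B-parameter model; requires torchtune to be installed.
    return llama3_2_vision_11b(**kwargs)


if __name__ == "__main__":
    # Only prints the arguments; pass them to build_vision_11b(**kwargs) to build.
    print(vision_11b_kwargs("pretrained"))
```

Keeping the argument selection separate from the builder call makes it possible to check the configuration without loading the model.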
