torchtune.models
llama3.3
Text-only models from the 3.3 version of the Llama3 family.
Important: You need to request access on Hugging Face before downloading these models.
To download the Llama-3.3-70B-Instruct model:
tune download meta-llama/Llama-3.3-70B-Instruct --ignore-patterns "original/consolidated.00.pth" --hf-token <HF_TOKEN>
Builder for creating a Llama3.3 model initialized w/ the default 70B parameter values. |
|
Builder for creating a Llama3.3 70B model with LoRA enabled. |
|
Builder for creating a Llama3.3 70B model with QLoRA enabled. |
Note
The Llama3.3 tokenizer reuses the llama3_tokenizer class.
llama3.2
Text-only models from the 3.2 version of the Llama3 family.
Important: You need to request access on Hugging Face before downloading these models.
To download the Llama-3.2-1B-Instruct model:
tune download meta-llama/Llama-3.2-1B-Instruct --output-dir /tmp/Llama-3.2-1B-Instruct --ignore-patterns "original/consolidated.00.pth" --hf-token <HF_TOKEN>
To download the Llama-3.2-3B-Instruct model:
tune download meta-llama/Llama-3.2-3B-Instruct --output-dir /tmp/Llama-3.2-3B-Instruct --ignore-patterns "original/consolidated*" --hf-token <HF_TOKEN>
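The two commands above pass different ignore patterns: one names a single file, the other uses a wildcard. The sketch below shows the difference, assuming the patterns are matched with shell-style globbing as in Python's `fnmatch` module. The file listing is hypothetical and only serves the illustration; real repositories may contain different files.

```python
from fnmatch import fnmatch

# Hypothetical repository listing, for illustration only.
files = [
    "config.json",
    "model-00001-of-00002.safetensors",
    "original/consolidated.00.pth",
    "original/consolidated.01.pth",
    "original/tokenizer.model",
]


def kept(paths, pattern):
    """Return the paths that do NOT match the ignore pattern."""
    return [p for p in paths if not fnmatch(p, pattern)]


# An exact name skips only that one file.
exact = kept(files, "original/consolidated.00.pth")

# A trailing wildcard skips every file whose path starts with the prefix.
wildcard = kept(files, "original/consolidated*")

print(exact)
print(wildcard)
```

Under that assumption, the wildcard form is the safer choice when a checkpoint is split across several `consolidated` shards.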
Builder for creating a Llama3.2 model initialized w/ the default 1b parameter values. |
|
Builder for creating a Llama3.2 model initialized w/ the default 3b parameter values. |
|
Builder for creating a Llama3.2 1B model with LoRA enabled. |
|
Builder for creating a Llama3.2 3B model with LoRA enabled. |
|
Builder for creating a Llama3.2 1B model with QLoRA enabled. |
|
Builder for creating a Llama3.2 3B model with QLoRA enabled. |
Note
The Llama3.2 tokenizer reuses the llama3_tokenizer
class.
llama3.2 Vision
Vision-Language Models from the 3.2 version of the Llama3 family.
Important: You need to request access on Hugging Face before downloading the model.
To download the Llama-3.2-11B-Vision-Instruct model:
tune download meta-llama/Llama-3.2-11B-Vision-Instruct --output-dir /tmp/Llama-3.2-11B-Vision-Instruct --hf-token <HF_TOKEN>
Llama 3.2 Vision 11B model |
|
Data Transforms (including Tokenizer) for Llama3 Vision. |
|
Return a version of Llama3.2 vision (an instance of |
|
Builder for creating a Llama3.2 vision 11B model with QLoRA enabled. |
|
Build the decoder associated with the Llama3 model with additional fused cross attention layers. |
|
Build the Llama 3.2 vision encoder by combining the CLIP image model with an additional projection head fusion module. |
|
Build the decoder associated with the Llama3 model with additional fused cross attention layers. |
|
Build the Llama 3.2 vision encoder by combining the CLIP image model with an additional projection head fusion module. |
|
Vision encoder model for Llama 3.2 Vision. |
|
Projection transformer to adapt the output of a pretrained frozen encoder (CLIP) to a pretrained decoder model. |
|
This transform combines the transforms for the different modalities of Llama 3.2 Vision. |
Note
The Llama3.2 tokenizer reuses the llama3_tokenizer class.
llama3 & llama3.1
Models 3 and 3.1 from the Llama3 family.
Important: You need to request access on Hugging Face before downloading these models.
To download the Llama3.1-8B-Instruct model:
tune download meta-llama/Meta-Llama-3.1-8B-Instruct --output-dir /tmp/Meta-Llama-3.1-8B-Instruct --ignore-patterns "original/consolidated.00.pth" --hf-token <HF_TOKEN>
To download the Llama3.1-70B-Instruct model:
tune download meta-llama/Meta-Llama-3.1-70B-Instruct --output-dir /tmp/Meta-Llama-3.1-70B-Instruct --ignore-patterns "original/consolidated*" --hf-token <HF_TOKEN>
To download the Llama3.1-405B-Instruct model:
tune download meta-llama/Meta-Llama-3.1-405B-Instruct --ignore-patterns "original/consolidated*" --hf-token <HF_TOKEN>
To download the Llama3 weights instead, use the Meta-Llama-3-8B-Instruct and Meta-Llama-3-70B-Instruct repositories and drop the --ignore-patterns flag.
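As a minimal sketch of what the resulting Llama3 commands might look like, the helper below assembles them from the repository names mentioned above. The helper name is hypothetical, and the `--output-dir` value is an assumption that mirrors the `/tmp/<model name>` convention of the other commands on this page.

```python
def llama3_download_command(size: str) -> str:
    """Build the download command for a Llama3 Instruct checkpoint."""
    name = f"Meta-Llama-3-{size}-Instruct"
    return " ".join([
        "tune", "download", f"meta-llama/{name}",
        "--output-dir", f"/tmp/{name}",  # assumed, mirroring the paths above
        "--hf-token", "<HF_TOKEN>",      # note: no --ignore-patterns flag
    ])


for size in ("8B", "70B"):
    print(llama3_download_command(size))
```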
Build the decoder associated with the Llama3 model. |
|
Return a version of Llama3 (an instance of |
|
Builder for creating a Llama3 model initialized w/ the default 8b parameter values. |
|
Builder for creating a Llama3 8B model with LoRA enabled. |
|
Builder for creating a Llama3 8B model with QLoRA enabled. |
|
Builder for creating a Llama3 model initialized w/ the default 70B parameter values. |
|
Builder for creating a Llama3 70B model with LoRA enabled. |
|
Builder for creating a Llama3 70B model with QLoRA enabled. |
|
Tokenizer for Llama3. |
|
Build the decoder associated with the Llama3.1 model. |
|
Return a version of Llama3.1 (an instance of |
|
Builder for creating a Llama3.1 model initialized w/ the default 8b parameter values. |
|
Builder for creating a Llama3.1 8B model with LoRA enabled. |
|
Builder for creating a Llama3.1 8B model with QLoRA enabled. |
|
Builder for creating a Llama3.1 model initialized w/ the default 70B parameter values. |
|
Builder for creating a Llama3.1 70B model with LoRA enabled. |
|
Builder for creating a Llama3.1 70B model with QLoRA enabled. |
|
Builder for creating a Llama3.1 model initialized w/ the default 405B parameter values. |
|
Builder for creating a Llama3.1 405B model with LoRA enabled. |
|
Builder for creating a Llama3.1 405B model with QLoRA enabled. |
Note
The Llama3.1 tokenizer reuses the llama3.llama3_tokenizer builder.
llama2
All models from the Llama2 family.
Important: You need to request access on Hugging Face before downloading these models.
To download the Llama2-7B model:
tune download meta-llama/Llama-2-7b-hf --output-dir /tmp/Llama-2-7b-hf --hf-token <HF_TOKEN>
To download the Llama2-13B model:
tune download meta-llama/Llama-2-13b-hf --output-dir /tmp/Llama-2-13b-hf --hf-token <HF_TOKEN>
To download the Llama2-70B model:
tune download meta-llama/Llama-2-70b-hf --output-dir /tmp/Llama-2-70b-hf --hf-token <HF_TOKEN>
Build the decoder associated with the Llama2 model. |
|
Return a version of Llama2 (an instance of |
|
Builder for creating a Llama2 model initialized w/ the default 7B parameter values from https://arxiv.org/abs/2307.09288 |
|
Builder for creating a Llama2 7B model with LoRA enabled. |
|
Builder for creating a Llama2 7B model with QLoRA enabled. |
|
Builder for creating a Llama2 model initialized w/ the default 13B parameter values from https://arxiv.org/abs/2307.09288 |
|
Builder for creating a Llama2 13B model with LoRA enabled. |
|
Builder for creating a Llama2 13B model with QLoRA enabled. |
|
Builder for creating a Llama2 model initialized w/ the default 70B parameter values from https://arxiv.org/abs/2307.09288 |
|
Builder for creating a Llama2 70B model with LoRA enabled. |
|
Builder for creating a Llama2 70B model with QLoRA enabled. |
|
Tokenizer for Llama2. |
|
Builder for creating a Llama2 model initialized w/ the default 7B parameter values from https://arxiv.org/abs/2307.09288, where the output layer is a classification layer projecting to a single class for reward modelling. |
|
Builder for creating a Llama2 7B reward model with LoRA enabled. |
|
Builder for creating a Llama2 7B reward model with QLoRA enabled. |
|
Prompt template that formats chat data of human and system prompts with appropriate tags used in Llama2 pre-training. |
code llama
Models from the Code Llama family.
Important: You need to request access on Hugging Face before downloading these models.
To download the CodeLlama-7B model:
tune download meta-llama/CodeLlama-7b-hf --output-dir /tmp/CodeLlama-7b-hf --hf-token <HF_TOKEN>
Builder for creating a Code-Llama2 model initialized w/ the default 7B parameter values from https://arxiv.org/pdf/2308.12950.pdf |
|
Builder for creating a Code-Llama2 7B model with LoRA enabled. |
|
Builder for creating a Code-Llama2 7B model with QLoRA enabled. |
|
Builder for creating a Code-Llama2 model initialized w/ the default 13B parameter values from https://arxiv.org/pdf/2308.12950.pdf |
|
Builder for creating a Code-Llama2 13B model with LoRA enabled. |
|
Builder for creating a Code-Llama2 13B model with QLoRA enabled. |
|
Builder for creating a Code-Llama2 model initialized w/ the default 70B parameter values from https://arxiv.org/pdf/2308.12950.pdf |
|
Builder for creating a Code-Llama2 70B model with LoRA enabled. |
|
Builder for creating a Code-Llama2 70B model with QLoRA enabled. |
qwen-2.5
Models of size 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B from the Qwen2.5 family.
To download the Qwen2.5 1.5B model, for example:
tune download Qwen/Qwen2.5-1.5B-Instruct --output-dir /tmp/Qwen2_5-1_5B-Instruct
Builder for creating a Qwen2.5 model (base or instruct) initialized w/ the default 0.5B parameter values from https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct |
|
Builder for creating a Qwen2.5 0.5B model (base or instruct) with LoRA enabled. |
|
Builder for creating a Qwen2.5 base model initialized w/ the default 1.5B parameter values from https://huggingface.co/Qwen/Qwen2.5-1.5B |
|
Builder for creating a Qwen2.5 instruct model initialized w/ the default 1.5B parameter values from https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct |
|
Builder for creating a Qwen2.5 1.5B base model with LoRA enabled. |
|
Builder for creating a Qwen2.5 1.5B instruct model with LoRA enabled. |
|
Builder for creating a Qwen2.5 model (base or instruct) initialized w/ the default 3B parameter values from https://huggingface.co/Qwen/Qwen2.5-3B-Instruct |
|
Builder for creating a Qwen2.5 3B model (base or instruct) with LoRA enabled. |
|
Builder for creating a Qwen2.5 base model initialized w/ the default 7B parameter values from https://huggingface.co/Qwen/Qwen2.5-7B |
|
Builder for creating a Qwen2.5 instruct model initialized w/ the default 7B parameter values from https://huggingface.co/Qwen/Qwen2.5-7B-Instruct |
|
Builder for creating a Qwen2.5 7B base model with LoRA enabled. |
|
Builder for creating a Qwen2.5 7B instruct model with LoRA enabled. |
|
Builder for creating a Qwen2.5 base model initialized w/ the default 14B parameter values from https://huggingface.co/Qwen/Qwen2.5-14B |
|
Builder for creating a Qwen2.5 instruct model initialized w/ the default 14B parameter values from https://huggingface.co/Qwen/Qwen2.5-14B-Instruct |
|
Builder for creating a Qwen2.5 14B base model with LoRA enabled. |
|
Builder for creating a Qwen2.5 14B instruct model with LoRA enabled. |
|
Builder for creating a Qwen2.5 base model initialized w/ the default 32B parameter values from https://huggingface.co/Qwen/Qwen2.5-32B |
|
Builder for creating a Qwen2.5 instruct model initialized w/ the default 32B parameter values from https://huggingface.co/Qwen/Qwen2.5-32B-Instruct |
|
Builder for creating a Qwen2.5 32B base model with LoRA enabled. |
|
Builder for creating a Qwen2.5 32B instruct model with LoRA enabled. |
|
Builder for creating a Qwen2.5 base model initialized w/ the default 72B parameter values from https://huggingface.co/Qwen/Qwen2.5-72B |
|
Builder for creating a Qwen2.5 instruct model initialized w/ the default 72B parameter values from https://huggingface.co/Qwen/Qwen2.5-72B-Instruct |
|
Builder for creating a Qwen2.5 72B base model with LoRA enabled. |
|
Builder for creating a Qwen2.5 72B instruct model with LoRA enabled. |
|
Tokenizer for Qwen2.5. |
qwen-2
Models of size 0.5B, 1.5B, and 7B from the Qwen2 family.
To download the Qwen2 1.5B model, for example:
tune download Qwen/Qwen2-1.5B-Instruct --output-dir /tmp/Qwen2-1.5B-Instruct
Build the decoder associated with the Qwen2 model. |
|
Return a version of Qwen2 (an instance of |
|
Builder for creating a Qwen2 model initialized w/ the default 0.5B parameter values from https://huggingface.co/Qwen/Qwen2-0.5B-Instruct |
|
Builder for creating a Qwen2 0.5B model with LoRA enabled. |
|
Builder for creating a Qwen2 model initialized w/ the default 1.5B parameter values from https://huggingface.co/Qwen/Qwen2-1.5B-Instruct |
|
Builder for creating a Qwen2 1.5B model with LoRA enabled. |
|
Builder for creating a Qwen2 model initialized w/ the default 7B parameter values from https://huggingface.co/Qwen/Qwen2-7B-Instruct |
|
Builder for creating a Qwen2 7B model with LoRA enabled. |
|
Tokenizer for Qwen2. |
phi-3
Models from the Phi-3 mini family.
To download the Phi-3 Mini 4k instruct model:
tune download microsoft/Phi-3-mini-4k-instruct --output-dir /tmp/Phi-3-mini-4k-instruct --hf-token <HF_TOKEN>
|
|
Return a version of Phi3 (an instance of |
|
Builder for creating the Phi3 Mini 4K Instruct Model. |
|
Builder for creating a Phi3 Mini (3.8b) model with LoRA enabled. |
|
Builder for creating a Phi3 mini model with QLoRA enabled. |
|
Phi-3 Mini tokenizer. |
mistral
All models from the Mistral AI family.
Important: You need to request access on Hugging Face to download this model.
To download the Mistral 7B v0.1 model:
tune download mistralai/Mistral-7B-v0.1 --output-dir /tmp/Mistral-7B-v0.1 --ignore-patterns "*.safetensors" --hf-token <HF_TOKEN>
Build the decoder associated with the mistral model. |
|
Return a version of Mistral (an instance of |
|
Build a base mistral model with an added classification layer. |
|
Return a version of Mistral classifier (an instance of |
|
Builder for creating a Mistral 7B model initialized w/ the default 7b parameter values from https://mistral.ai/news/announcing-mistral-7b/ |
|
Builder for creating a Mistral 7B model with LoRA enabled. |
|
Builder for creating a Mistral 7B model with QLoRA enabled. |
|
Builder for creating a Mistral 7B model initialized w/ the default 7b parameter values from: https://huggingface.co/Ray2333/reward-model-Mistral-7B-instruct-Unified-Feedback where the output layer is a classification layer projecting to a single class for reward modelling. |
|
Builder for creating a Mistral reward 7B model with LoRA enabled. |
|
Builder for creating a Mistral reward 7B model with QLoRA enabled. |
|
Tokenizer for Mistral models. |
|
Formats according to Mistral's instruct model. |
gemma
Models of size 2B and 7B from the Gemma family.
Important: You need to request access on Hugging Face to use this model.
To download the Gemma 2B model (not Gemma2):
tune download google/gemma-2b --ignore-patterns "gemma-2b.gguf" --hf-token <HF_TOKEN>
To download the Gemma 7B model:
tune download google/gemma-7b --ignore-patterns "gemma-7b.gguf" --hf-token <HF_TOKEN>
Build the decoder associated with the gemma model. |
|
Return a version of Gemma with LoRA applied based on the passed in configuration. |
|
Builder for creating a Gemma 2B model initialized w/ the default 2b parameter values from: https://blog.google/technology/developers/gemma-open-models/ |
|
Builder for creating a Gemma 2B model with LoRA enabled. |
|
Builder for creating a Gemma 2B model with QLoRA enabled. |
|
Builder for creating a Gemma 7B model initialized w/ the default 7b parameter values from: https://blog.google/technology/developers/gemma-open-models/ |
|
Builder for creating a Gemma 7B model with LoRA enabled. |
|
Builder for creating a Gemma 7B model with QLoRA enabled. |
|
Tokenizer for Gemma. |
gemma2
Models of size 2B, 9B, and 27B from the Gemma2 family.
Important: You need to request access on Hugging Face to use this model.
To download the Gemma2 2B, 9B, or 27B model:
tune download google/gemma-2-<MODEL_SIZE>b --ignore-patterns "gemma-2-<MODEL_SIZE>b.gguf" --hf-token <HF_TOKEN>
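The `<MODEL_SIZE>` placeholder appears twice in the command above and must be replaced with the same value in both places. The following sketch expands it for the three sizes listed in this section. The helper name is hypothetical, and `<HF_TOKEN>` is deliberately left as a placeholder.

```python
# Template copied from the command above, with <MODEL_SIZE> as a format field.
TEMPLATE = (
    "tune download google/gemma-2-{size}b "
    '--ignore-patterns "gemma-2-{size}b.gguf" --hf-token <HF_TOKEN>'
)


def gemma2_download_command(size: int) -> str:
    """Fill in MODEL_SIZE for one of the sizes listed in this section."""
    if size not in (2, 9, 27):
        raise ValueError(f"unsupported Gemma2 size: {size}")
    return TEMPLATE.format(size=size)


for size in (2, 9, 27):
    print(gemma2_download_command(size))
```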
Build the decoder associated with the gemma2 model. |
|
Return a version of Gemma2 with LoRA applied based on the passed in configuration. |
|
Builder for creating a Gemma2 2B model initialized w/ the default 2b parameter values from: https://github.com/google/gemma_pytorch/blob/main/gemma/config.py |
|
Builder for creating a Gemma2 2B model with LoRA enabled. |
|
Builder for creating a Gemma2 2B model with QLoRA enabled. |
|
Builder for creating a Gemma2 9B model initialized w/ the default 9b parameter values from: https://github.com/google/gemma_pytorch/blob/main/gemma/config.py |
|
Builder for creating a Gemma2 9B model with LoRA enabled. |
|
Builder for creating a Gemma2 9B model with QLoRA enabled. |
|
Builder for creating a Gemma2 27B model initialized w/ the default 27b parameter values from: https://github.com/google/gemma_pytorch/blob/main/gemma/config.py |
|
Builder for creating a Gemma2 27B model with LoRA enabled. |
|
Builder for creating a Gemma2 27B model with QLoRA enabled. |
|
Tokenizer for Gemma. |
clip
Vision components to support multimodality using CLIP encoder.
Builds the vision encoder associated with the clip model. |
|
Token positional embedding for images, different for every token in an image. |
|
Token positional embedding for tiled images, different for every tile, different for every token. |
|
Positional embedding for tiles, different for every tile, same for every token within a tile. |
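The three positional embeddings above differ only in what they are indexed by. The sketch below is a conceptual illustration in plain Python, not torchtune's implementation: the sizes, table names, and function names are all hypothetical, and the first embedding is assumed to be shared across tiles. Each stored value is unique, so two lookups return equal vectors only when they read the same table entry.

```python
from itertools import count

# Every stored value is a unique integer, so equal vectors can only
# come from sharing the same table entry.
_next_value = count()


def make_table(*shape):
    """Build nested lists of the given shape filled with unique values."""
    if len(shape) == 1:
        return [next(_next_value) for _ in range(shape[0])]
    return [make_table(*shape[1:]) for _ in range(shape[0])]


# Hypothetical sizes chosen for illustration.
N_TILES, N_TOKENS, DIM = 4, 3, 2

token_table = make_table(N_TOKENS, DIM)                 # indexed by token
tiled_token_table = make_table(N_TILES, N_TOKENS, DIM)  # indexed by tile and token
tile_table = make_table(N_TILES, DIM)                   # indexed by tile


def token_embedding(tile, token):
    # Assumed here to be shared by all tiles: depends on the token only.
    return token_table[token]


def tiled_token_embedding(tile, token):
    # Different for every tile and every token.
    return tiled_token_table[tile][token]


def tile_embedding(tile, token):
    # Different for every tile, same for every token within a tile.
    return tile_table[tile]


print(tile_embedding(1, 0) == tile_embedding(1, 2))                # True
print(tiled_token_embedding(1, 0) == tiled_token_embedding(1, 2))  # False
```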