torchtune.models

llama3 & llama3.1

All models from the Llama3 family.

To download the Llama3-8B-Instruct model:

tune download meta-llama/Meta-Llama-3-8B-Instruct --hf-token <HF_TOKEN>

To download the Llama3-70B-Instruct model:

tune download meta-llama/Meta-Llama-3-70B-Instruct --hf-token <HF_TOKEN> --ignore-patterns "original/consolidated*"

To download the Llama3.1 versions of the above models, download from Meta-Llama-3.1-8B-Instruct or Meta-Llama-3.1-70B-Instruct instead.
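
For example, the 8B variant follows the same command pattern as above:

tune download meta-llama/Meta-Llama-3.1-8B-Instruct --hf-token <HF_TOKEN>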

llama3.llama3

Build the decoder associated with the Llama3 model.

llama3.lora_llama3

Return a version of Llama3 (an instance of TransformerDecoder()) with LoRA applied based on the passed-in configuration.

llama3.llama3_8b

Builder for creating a Llama3 model initialized with the default 8B parameter values.

llama3.lora_llama3_8b

Builder for creating a Llama3 8B model with LoRA enabled.

llama3.qlora_llama3_8b

Builder for creating a Llama3 8B model with QLoRA enabled.

llama3.llama3_70b

Builder for creating a Llama3 model initialized with the default 70B parameter values.

llama3.lora_llama3_70b

Builder for creating a Llama3 70B model with LoRA enabled.

llama3.qlora_llama3_70b

Builder for creating a Llama3 70B model with QLoRA enabled.

llama3.llama3_tokenizer

Tokenizer for Llama3.

llama3.Llama3Tokenizer

tiktoken tokenizer configured with Llama3 Instruct's special tokens, as described in https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3
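
As a rough sketch of how these builders fit together in Python (the tokenizer path and the lora_attn_modules choice below are illustrative assumptions, not values taken from this page):

from torchtune.models.llama3 import llama3_8b, lora_llama3_8b, llama3_tokenizer

# Full-precision 8B decoder (a TransformerDecoder instance)
model = llama3_8b()

# LoRA variant; the adapted attention projections here are an example choice
lora_model = lora_llama3_8b(lora_attn_modules=["q_proj", "v_proj"])

# Tokenizer built from the downloaded tokenizer.model file (placeholder path)
tokenizer = llama3_tokenizer("/tmp/Meta-Llama-3-8B-Instruct/original/tokenizer.model")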

llama3_1.llama3_1

Build the decoder associated with the Llama3.1 model.

llama3_1.lora_llama3_1

Return a version of Llama3.1 (an instance of TransformerDecoder()) with LoRA applied based on the passed-in configuration.

llama3_1.llama3_1_8b

Builder for creating a Llama3.1 model initialized with the default 8B parameter values.

llama3_1.lora_llama3_1_8b

Builder for creating a Llama3.1 8B model with LoRA enabled.

llama3_1.qlora_llama3_1_8b

Builder for creating a Llama3.1 8B model with QLoRA enabled.

llama3_1.llama3_1_70b

Builder for creating a Llama3.1 model initialized with the default 70B parameter values.

llama3_1.lora_llama3_1_70b

Builder for creating a Llama3.1 70B model with LoRA enabled.

llama3_1.qlora_llama3_1_70b

Builder for creating a Llama3.1 70B model with QLoRA enabled.

Note

The Llama3.1 tokenizer reuses the llama3.llama3_tokenizer builder.
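
The Llama3.1 builders mirror their Llama3 counterparts; a minimal sketch, again with an illustrative lora_attn_modules choice and a placeholder tokenizer path:

from torchtune.models.llama3 import llama3_tokenizer
from torchtune.models.llama3_1 import llama3_1_8b, qlora_llama3_1_8b

model = llama3_1_8b()
# QLoRA variant: same LoRA arguments as the LoRA builder, with quantized base weights
qlora_model = qlora_llama3_1_8b(lora_attn_modules=["q_proj", "v_proj"])
# The tokenizer comes from the llama3 module, per the note above
tokenizer = llama3_tokenizer("/tmp/Meta-Llama-3.1-8B-Instruct/original/tokenizer.model")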

llama2

All models from the Llama2 family.

To download the Llama2-7B model:

tune download meta-llama/Llama-2-7b-hf --hf-token <HF_TOKEN>

To download the Llama2-13B model:

tune download meta-llama/Llama-2-13b-hf --hf-token <HF_TOKEN>

To download the Llama2-70B model:

tune download meta-llama/Llama-2-70b-hf --hf-token <HF_TOKEN>

llama2.llama2

Build the decoder associated with the Llama2 model.

llama2.lora_llama2

Return a version of Llama2 (an instance of TransformerDecoder()) with LoRA applied based on the passed-in configuration.

llama2.llama2_7b

Builder for creating a Llama2 model initialized with the default 7B parameter values from https://arxiv.org/abs/2307.09288

llama2.lora_llama2_7b

Builder for creating a Llama2 7B model with LoRA enabled.

llama2.qlora_llama2_7b

Builder for creating a Llama2 7B model with QLoRA enabled.

llama2.llama2_13b

Builder for creating a Llama2 model initialized with the default 13B parameter values from https://arxiv.org/abs/2307.09288

llama2.lora_llama2_13b

Builder for creating a Llama2 13B model with LoRA enabled.

llama2.qlora_llama2_13b

Builder for creating a Llama2 13B model with QLoRA enabled.

llama2.llama2_70b

Builder for creating a Llama2 model initialized with the default 70B parameter values from https://arxiv.org/abs/2307.09288

llama2.lora_llama2_70b

Builder for creating a Llama2 70B model with LoRA enabled.

llama2.qlora_llama2_70b

Builder for creating a Llama2 70B model with QLoRA enabled.

llama2.llama2_tokenizer

Tokenizer for Llama2.

llama2.Llama2Tokenizer

Llama2's implementation of the SentencePiece tokenizer.
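
A minimal usage sketch for the Llama2 builders (the tokenizer path and lora_attn_modules choice are placeholder assumptions):

from torchtune.models.llama2 import llama2_7b, lora_llama2_7b, llama2_tokenizer

# Full-precision 7B decoder and a LoRA variant
model = llama2_7b()
lora_model = lora_llama2_7b(lora_attn_modules=["q_proj", "v_proj"])

# SentencePiece tokenizer from the downloaded checkpoint (placeholder path)
tokenizer = llama2_tokenizer("/tmp/Llama-2-7b-hf/tokenizer.model")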

code llama

Models from the Code Llama family.

To download the CodeLlama-7B model:

tune download codellama/CodeLlama-7b-hf --hf-token <HF_TOKEN>

code_llama2.code_llama2_7b

Builder for creating a Code-Llama2 model initialized with the default 7B parameter values from https://arxiv.org/pdf/2308.12950.pdf

code_llama2.lora_code_llama2_7b

Builder for creating a Code-Llama2 7B model with LoRA enabled.

code_llama2.qlora_code_llama2_7b

Builder for creating a Code-Llama2 7B model with QLoRA enabled.

code_llama2.code_llama2_13b

Builder for creating a Code-Llama2 model initialized with the default 13B parameter values from https://arxiv.org/pdf/2308.12950.pdf

code_llama2.lora_code_llama2_13b

Builder for creating a Code-Llama2 13B model with LoRA enabled.

code_llama2.qlora_code_llama2_13b

Builder for creating a Code-Llama2 13B model with QLoRA enabled.

code_llama2.code_llama2_70b

Builder for creating a Code-Llama2 model initialized with the default 70B parameter values from https://arxiv.org/pdf/2308.12950.pdf

code_llama2.lora_code_llama2_70b

Builder for creating a Code-Llama2 70B model with LoRA enabled.

code_llama2.qlora_code_llama2_70b

Builder for creating a Code-Llama2 70B model with QLoRA enabled.
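
A minimal usage sketch for the Code-Llama2 builders (the lora_attn_modules choice is illustrative; this page does not list a dedicated Code Llama tokenizer builder):

from torchtune.models.code_llama2 import code_llama2_7b, lora_code_llama2_7b

# Full-precision 7B decoder and a LoRA variant
model = code_llama2_7b()
lora_model = lora_code_llama2_7b(lora_attn_modules=["q_proj", "v_proj"])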

phi-3

Models from the Phi-3 mini family.

To download the Phi-3 Mini 4k instruct model:

tune download microsoft/Phi-3-mini-4k-instruct --hf-token <HF_TOKEN> --ignore-patterns ""

phi3.phi3

Build the decoder associated with the Phi3 model.

phi3.lora_phi3

Return a version of Phi3 (an instance of TransformerDecoder()) with LoRA applied based on the passed-in configuration.

phi3.phi3_mini

Builder for creating the Phi3 Mini 4K Instruct Model.

phi3.lora_phi3_mini

Builder for creating a Phi3 Mini (3.8B) model with LoRA enabled.

phi3.qlora_phi3_mini

Builder for creating a Phi3 mini model with QLoRA enabled.

phi3.phi3_mini_tokenizer

Phi-3 Mini tokenizer.

phi3.Phi3MiniTokenizer

SentencePiece tokenizer configured with Phi3 Mini's special tokens.
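
A minimal usage sketch for the Phi-3 Mini builders (the tokenizer path and lora_attn_modules choice are placeholder assumptions):

from torchtune.models.phi3 import phi3_mini, lora_phi3_mini, phi3_mini_tokenizer

# Full-precision Phi-3 Mini decoder and a LoRA variant
model = phi3_mini()
lora_model = lora_phi3_mini(lora_attn_modules=["q_proj", "v_proj"])

# Tokenizer from the downloaded checkpoint (placeholder path)
tokenizer = phi3_mini_tokenizer("/tmp/Phi-3-mini-4k-instruct/tokenizer.model")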

mistral

All models from the Mistral AI family.

To download the Mistral 7B v0.1 model:

tune download mistralai/Mistral-7B-v0.1 --hf-token <HF_TOKEN>

mistral.mistral

Build the decoder associated with the mistral model.

mistral.lora_mistral

Return a version of Mistral (an instance of TransformerDecoder()) with LoRA applied based on the passed-in configuration.

mistral.mistral_classifier

Build a base mistral model with an added classification layer.

mistral.lora_mistral_classifier

Return a version of the Mistral classifier (an instance of TransformerDecoder()) with LoRA applied to some of the linear layers in its self-attention modules.

mistral.mistral_7b

Builder for creating a Mistral 7B model initialized with the default 7B parameter values from https://mistral.ai/news/announcing-mistral-7b/

mistral.lora_mistral_7b

Builder for creating a Mistral 7B model with LoRA enabled.

mistral.qlora_mistral_7b

Builder for creating a Mistral model with QLoRA enabled.

mistral.mistral_classifier_7b

Builder for creating a Mistral 7B classifier model initialized with the default 7B parameter values from: https://huggingface.co/Ray2333/reward-model-Mistral-7B-instruct-Unified-Feedback

mistral.lora_mistral_classifier_7b

Builder for creating a Mistral classifier 7B model with LoRA enabled.

mistral.qlora_mistral_classifier_7b

Builder for creating a Mistral classifier model with QLoRA enabled.

mistral.mistral_tokenizer

Tokenizer for Mistral models.

mistral.MistralTokenizer

Mistral's implementation of the SentencePiece tokenizer.
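
A minimal usage sketch for the Mistral builders (the tokenizer path is a placeholder assumption):

from torchtune.models.mistral import mistral_7b, mistral_classifier_7b, mistral_tokenizer

# Base 7B decoder
model = mistral_7b()

# Base model with an added classification head, e.g. for reward modeling
classifier = mistral_classifier_7b()

# SentencePiece tokenizer from the downloaded checkpoint (placeholder path)
tokenizer = mistral_tokenizer("/tmp/Mistral-7B-v0.1/tokenizer.model")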

gemma

Models of size 2B and 7B from the Gemma family.

To download the Gemma 2B model:

tune download google/gemma-2b --hf-token <HF_TOKEN> --ignore-patterns ""

To download the Gemma 7B model:

tune download google/gemma-7b --hf-token <HF_TOKEN> --ignore-patterns "gemma-7b.gguf"

gemma.gemma

Build the decoder associated with the gemma model.

gemma.lora_gemma

Return a version of Gemma with LoRA applied based on the passed in configuration.

gemma.gemma_2b

Builder for creating a Gemma 2B model initialized with the default 2B parameter values from: https://blog.google/technology/developers/gemma-open-models/

gemma.lora_gemma_2b

Builder for creating a Gemma 2B model with LoRA enabled.

gemma.qlora_gemma_2b

Builder for creating a Gemma model with QLoRA enabled.

gemma.gemma_7b

Builder for creating a Gemma 7B model initialized with the default 7B parameter values from: https://blog.google/technology/developers/gemma-open-models/

gemma.lora_gemma_7b

Builder for creating a Gemma 7B model with LoRA enabled.

gemma.qlora_gemma_7b

Builder for creating a Gemma model with QLoRA enabled.

gemma.gemma_tokenizer

Tokenizer for Gemma.

gemma.GemmaTokenizer

Gemma's implementation of the SentencePiece tokenizer.
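
A minimal usage sketch for the Gemma builders (the tokenizer path and lora_attn_modules choice are placeholder assumptions):

from torchtune.models.gemma import gemma_2b, lora_gemma_2b, gemma_tokenizer

# Full-precision 2B decoder and a LoRA variant
model = gemma_2b()
lora_model = lora_gemma_2b(lora_attn_modules=["q_proj", "v_proj"])

# SentencePiece tokenizer from the downloaded checkpoint (placeholder path)
tokenizer = gemma_tokenizer("/tmp/gemma-2b/tokenizer.model")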
