torchtune.modules

Modeling Components and Building Blocks

CausalSelfAttention

Multi-headed grouped query self-attention (GQA) layer introduced in https://arxiv.org/pdf/2305.13245v1.pdf.
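
The grouping idea behind GQA can be sketched in a few lines. This is an illustration only: the helper name kv_head_for_query_head and the head counts are assumptions, not part of torchtune's API. With num_heads query heads and num_kv_heads key/value heads, each consecutive group of num_heads // num_kv_heads query heads shares one key/value head.

```python
def kv_head_for_query_head(q_head: int, num_heads: int, num_kv_heads: int) -> int:
    """Return the index of the key/value head shared by query head `q_head`.

    Hypothetical helper for illustration; not a torchtune function.
    """
    if num_heads % num_kv_heads != 0:
        raise ValueError("num_heads must be divisible by num_kv_heads")
    group_size = num_heads // num_kv_heads
    return q_head // group_size


# Illustrative sizes (assumed): 8 query heads sharing 2 key/value heads.
mapping = [kv_head_for_query_head(h, num_heads=8, num_kv_heads=2) for h in range(8)]
print(mapping)
```

Setting num_kv_heads equal to num_heads recovers standard multi-head attention, and setting it to 1 recovers multi-query attention.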

FeedForward

This class implements the feed-forward network derived from Llama2.

KVCache

Standalone nn.Module containing a KV cache that stores past keys and values during inference.
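
As a rough sketch of what such a cache does, the toy class below preallocates a fixed number of slots and fills one per decoded token, so earlier keys and values are not recomputed. SimpleKVCache and its update method are hypothetical names for illustration; the real KVCache is an nn.Module operating on tensors and its interface may differ.

```python
class SimpleKVCache:
    """Toy cache: preallocates max_seq_len slots and fills them as tokens are decoded.

    Hypothetical class for illustration; not torchtune's KVCache.
    """

    def __init__(self, max_seq_len: int):
        self.keys = [None] * max_seq_len
        self.values = [None] * max_seq_len
        self.size = 0  # number of positions filled so far

    def update(self, k, v):
        """Store the key/value for the next position and return all cached entries."""
        if self.size >= len(self.keys):
            raise ValueError("cache is full")
        self.keys[self.size] = k
        self.values[self.size] = v
        self.size += 1
        return self.keys[: self.size], self.values[: self.size]


cache = SimpleKVCache(max_seq_len=4)
cache.update([1.0, 0.0], [0.5, 0.5])                 # first decoded token
keys, values = cache.update([0.0, 1.0], [0.2, 0.8])  # second decoded token
print(len(keys), len(values))
```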

get_cosine_schedule_with_warmup

Create a learning rate schedule that linearly increases the learning rate from 0.0 to lr over num_warmup_steps, then decreases it to 0.0 on a cosine schedule over the remaining num_training_steps - num_warmup_steps steps (assuming num_cycles = 0.5).
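
The shape of this schedule can be sketched as a multiplier applied to the base learning rate. The function below follows the description above; its name, cosine_warmup_multiplier, and the example step counts are assumptions, and the library's exact implementation may differ in details.

```python
import math


def cosine_warmup_multiplier(step, num_warmup_steps, num_training_steps, num_cycles=0.5):
    """Factor in [0, 1] that scales the base learning rate at `step`.

    Hypothetical helper for illustration; not the torchtune function itself.
    """
    if step < num_warmup_steps:
        # Linear warmup from 0.0 to 1.0.
        return step / max(1, num_warmup_steps)
    # Fraction of the post-warmup phase completed so far.
    progress = (step - num_warmup_steps) / max(1, num_training_steps - num_warmup_steps)
    # With num_cycles = 0.5 this traces half a cosine period, from 1.0 down to 0.0.
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * 2.0 * num_cycles * progress)))


lr = 3e-4  # assumed base learning rate
for step in (0, 50, 100, 550, 1000):
    print(step, lr * cosine_warmup_multiplier(step, 100, 1000))
```

In this example the multiplier reaches 1.0 at the end of warmup (step 100), is about 0.5 halfway through the decay (step 550), and returns to 0.0 at the final step.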

RotaryPositionalEmbeddings

This class implements Rotary Positional Embeddings (RoPE) proposed in https://arxiv.org/abs/2104.09864.
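
A minimal sketch of the rotation at the heart of RoPE, written in plain Python rather than with tensors. The function name apply_rope, the base of 10000, and the choice to rotate consecutive pairs of dimensions are assumptions taken from the paper's formulation; implementations can pair dimensions differently.

```python
import math


def apply_rope(x, pos, base=10000.0):
    """Rotate consecutive pairs (x[2i], x[2i+1]) by the angle pos * base**(-2i/d).

    Hypothetical helper for illustration; not torchtune's module.
    """
    d = len(x)
    if d % 2 != 0:
        raise ValueError("embedding dimension must be even")
    out = []
    for i in range(d // 2):
        theta = pos * base ** (-2.0 * i / d)
        a, b = x[2 * i], x[2 * i + 1]
        out.append(a * math.cos(theta) - b * math.sin(theta))
        out.append(a * math.sin(theta) + b * math.cos(theta))
    return out


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]
# Both pairs of positions have the same relative offset (3).
s1 = dot(apply_rope(q, 5), apply_rope(k, 2))
s2 = dot(apply_rope(q, 15), apply_rope(k, 12))
print(abs(s1 - s2) < 1e-9)
```

Because each pair is rotated by an angle proportional to its position, the dot product between a rotated query and a rotated key depends only on their relative offset, which is the property the comparison of s1 and s2 checks.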

RMSNorm

Implements Root Mean Square Normalization introduced in https://arxiv.org/pdf/1910.07467.pdf.
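
The normalization itself is short enough to sketch directly. The function name rms_norm, the eps value, and the placement of eps inside the square root are assumptions for illustration: each element is divided by the root mean square of the vector and then multiplied by a learned per-dimension scale.

```python
import math


def rms_norm(x, scale, eps=1e-6):
    """Divide x by its root mean square, then apply a per-dimension scale.

    Hypothetical helper for illustration; not torchtune's module.
    """
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [s * v / rms for v, s in zip(x, scale)]


x = [3.0, 4.0]
y = rms_norm(x, scale=[1.0, 1.0])
# With a unit scale, the output has a mean square close to 1.
print(sum(v * v for v in y) / len(y))
```

Unlike layer normalization, no mean is subtracted, so the operation only rescales the input.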

TransformerDecoderLayer

Transformer layer derived from the Llama2 model.

TransformerDecoder

Transformer Decoder derived from the Llama2 architecture.

Tokenizers

tokenizers.SentencePieceTokenizer

A wrapper around SentencePieceProcessor.

tokenizers.TikTokenTokenizer

A wrapper around tiktoken Encoding.

PEFT Components

peft.LoRALinear

LoRA linear layer as introduced in LoRA: Low-Rank Adaptation of Large Language Models.
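
The computation a LoRA linear layer performs can be sketched without any framework. Following the paper, the frozen weight W is augmented by a low-rank product scaled by alpha / rank, so y = W x + (alpha / rank) * B (A x). The names lora_forward and matvec, and all the numbers below, are assumptions for illustration and are not torchtune's API.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(x, w, a, b, alpha):
    """Compute W x + (alpha / rank) * B (A x).

    Hypothetical helper for illustration; not torchtune's LoRALinear.
    """
    rank = len(a)  # A has shape (rank, in_dim); B has shape (out_dim, rank)
    base = matvec(w, x)
    update = matvec(b, matvec(a, x))
    return [y + (alpha / rank) * u for y, u in zip(base, update)]


w = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]  # frozen weight: out_dim=2, in_dim=3
a = [[0.1, 0.2, 0.3]]                    # trainable A with rank=1
b_zero = [[0.0], [0.0]]                  # B initialized to zero
x = [1.0, 2.0, 3.0]
print(lora_forward(x, w, a, b_zero, alpha=2.0))
```

With B initialized to zero, the layer's output matches the frozen base layer, so fine-tuning starts from the pretrained behavior and only A and B need gradients.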

peft.AdapterModule

Interface for an nn.Module containing adapter weights.

peft.get_adapter_params

Return the subset of parameters from a model that correspond to an adapter.

peft.set_trainable_params

Set trainable parameters for an nn.Module based on a state dict of adapter parameters.

Module Utilities

Utilities that are common to all modules.

common_utils.reparametrize_as_dtype_state_dict_post_hook

A state_dict hook that replaces NF4 tensors with their restored higher-precision weight and optionally offloads the restored weight to CPU.
