torchtune.modules¶
Modeling Components and Building Blocks¶
Multi-headed grouped query self-attention (GQA) layer introduced in https://arxiv.org/pdf/2305.13245v1.pdf.
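The sketch below illustrates the core idea of grouped-query attention, where several query heads share each key/value head; it is a simplified stand-in for the torchtune layer, and the sizes and attribute names (num_kv_heads, q_proj, and so on) are illustrative rather than the actual signature.

```python
import torch.nn.functional as F
from torch import nn

class GroupedQueryAttention(nn.Module):
    """Illustrative GQA sketch: 8 query heads sharing 2 key/value heads."""

    def __init__(self, embed_dim=512, num_heads=8, num_kv_heads=2):
        super().__init__()
        self.num_heads, self.num_kv_heads = num_heads, num_kv_heads
        self.head_dim = embed_dim // num_heads
        self.q_proj = nn.Linear(embed_dim, num_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(embed_dim, num_kv_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(embed_dim, num_kv_heads * self.head_dim, bias=False)
        self.out_proj = nn.Linear(embed_dim, embed_dim, bias=False)

    def forward(self, x):
        b, s, _ = x.shape
        q = self.q_proj(x).view(b, s, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(b, s, self.num_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, s, self.num_kv_heads, self.head_dim).transpose(1, 2)
        # Each group of query heads attends to the same shared key/value head.
        k = k.repeat_interleave(self.num_heads // self.num_kv_heads, dim=1)
        v = v.repeat_interleave(self.num_heads // self.num_kv_heads, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.out_proj(out.transpose(1, 2).reshape(b, s, -1))
```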
This class implements the feed-forward network derived from Llama2.
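For reference, a minimal version of the Llama2-style gated feed-forward (SwiGLU) looks like the sketch below; the w1/w2/w3 naming and the hidden dimension are illustrative, not necessarily the exact torchtune signature.

```python
import torch.nn.functional as F
from torch import nn

class FeedForward(nn.Module):
    """Minimal sketch of a Llama2-style gated feed-forward block."""

    def __init__(self, dim=512, hidden_dim=2048):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)  # gate projection
        self.w3 = nn.Linear(dim, hidden_dim, bias=False)  # up projection
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)  # down projection

    def forward(self, x):
        # SwiGLU: gate with SiLU, multiply elementwise, project back down.
        return self.w2(F.silu(self.w1(x)) * self.w3(x))
```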
Standalone nn.Module containing a kv-cache to cache past keys and values during inference.
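A minimal kv-cache of this kind can be sketched as below; the buffer layout and the update() signature are assumptions for illustration, not the torchtune API.

```python
import torch
from torch import nn

class KVCache(nn.Module):
    """Sketch of a standalone kv-cache: preallocated buffers plus a write position."""

    def __init__(self, batch_size, max_seq_len, num_kv_heads, head_dim, dtype=torch.float32):
        super().__init__()
        shape = (batch_size, num_kv_heads, max_seq_len, head_dim)
        self.register_buffer("k_cache", torch.zeros(shape, dtype=dtype), persistent=False)
        self.register_buffer("v_cache", torch.zeros(shape, dtype=dtype), persistent=False)
        self.cache_pos = 0

    def update(self, k, v):
        # k, v: [batch, num_kv_heads, new_seq_len, head_dim] for the tokens just processed.
        seq_len = k.shape[2]
        self.k_cache[:, :, self.cache_pos : self.cache_pos + seq_len] = k
        self.v_cache[:, :, self.cache_pos : self.cache_pos + seq_len] = v
        self.cache_pos += seq_len
        # Return everything cached so far, so attention covers past and current tokens.
        return self.k_cache[:, :, : self.cache_pos], self.v_cache[:, :, : self.cache_pos]
```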
Create a learning rate schedule that linearly increases the learning rate from 0.0 to lr over num_warmup_steps, then decreases it to 0.0 on a cosine schedule over the remaining num_training_steps - num_warmup_steps steps (assuming num_cycles = 0.5).
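The described schedule can be reproduced with a plain LambdaLR as in this sketch; the argument names mirror the description above, but the actual torchtune helper may differ in details.

```python
import math
from torch.optim.lr_scheduler import LambdaLR

def cosine_schedule_with_warmup(optimizer, num_warmup_steps, num_training_steps, num_cycles=0.5):
    def lr_lambda(step):
        if step < num_warmup_steps:
            # Linear warmup from 0.0 up to the optimizer's base lr.
            return step / max(1, num_warmup_steps)
        # Cosine decay from the base lr down to 0.0 over the remaining steps.
        progress = (step - num_warmup_steps) / max(1, num_training_steps - num_warmup_steps)
        return max(0.0, 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress)))

    return LambdaLR(optimizer, lr_lambda)
```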
This class implements Rotary Positional Embeddings (RoPE) proposed in https://arxiv.org/abs/2104.09864.
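The rotation RoPE applies can be sketched as follows on a [batch, seq_len, num_heads, head_dim] tensor; this paraphrases the math from the paper rather than reproducing the torchtune module.

```python
import torch

def apply_rope(x, base=10000):
    # x: [batch, seq_len, num_heads, head_dim]
    b, s, h, d = x.shape
    theta = 1.0 / (base ** (torch.arange(0, d, 2, dtype=torch.float32) / d))  # [d/2]
    pos = torch.arange(s, dtype=torch.float32)
    freqs = torch.outer(pos, theta)  # per-position, per-pair rotation angles: [s, d/2]
    cos = freqs.cos()[None, :, None, :]
    sin = freqs.sin()[None, :, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]
    # Rotate each (even, odd) pair of features by the position-dependent angle.
    out = torch.stack([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
    return out.flatten(-2).type_as(x)
```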
Implements Root Mean Square Normalization introduced in https://arxiv.org/pdf/1910.07467.pdf.
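A minimal implementation of the formula is shown below; the eps default and the learnable scale follow common practice and are not necessarily the exact torchtune defaults.

```python
import torch
from torch import nn

class RMSNorm(nn.Module):
    """Sketch of RMSNorm: scale by the inverse root mean square of the features."""

    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.eps = eps
        self.scale = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        # Unlike LayerNorm, there is no mean subtraction, only RMS rescaling.
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.scale
```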
Transformer layer derived from the Llama2 model.
Transformer Decoder derived from the Llama2 architecture.
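The two components above fit together roughly as in this sketch: each layer wraps attention and a feed-forward block in pre-norm residual connections, and the decoder stacks layers between a token embedding and an output projection. Class and attribute names here are illustrative, not the torchtune classes.

```python
from torch import nn

class DecoderLayer(nn.Module):
    """One pre-norm transformer block: attention and MLP, each with a residual connection."""

    def __init__(self, dim, attn, mlp, norm_cls):
        super().__init__()
        self.attn, self.mlp = attn, mlp
        self.attn_norm, self.mlp_norm = norm_cls(dim), norm_cls(dim)

    def forward(self, x):
        x = x + self.attn(self.attn_norm(x))  # pre-norm attention + residual
        x = x + self.mlp(self.mlp_norm(x))    # pre-norm feed-forward + residual
        return x

class Decoder(nn.Module):
    """Token embedding -> stacked layers -> final norm -> vocabulary projection."""

    def __init__(self, vocab_size, dim, layers, norm):
        super().__init__()
        self.tok_embeddings = nn.Embedding(vocab_size, dim)
        self.layers = nn.ModuleList(layers)
        self.norm = norm
        self.output = nn.Linear(dim, vocab_size, bias=False)

    def forward(self, tokens):
        h = self.tok_embeddings(tokens)
        for layer in self.layers:
            h = layer(h)
        return self.output(self.norm(h))  # logits over the vocabulary
```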
Tokenizers¶
A wrapper around SentencePieceProcessor.
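For context, the underlying SentencePieceProcessor can be used directly as below; "tokenizer.model" is a placeholder path, not a file shipped with torchtune.

```python
from sentencepiece import SentencePieceProcessor

# Load a trained SentencePiece model and round-trip a string through it.
spm = SentencePieceProcessor(model_file="tokenizer.model")
ids = spm.encode("Hello world", out_type=int)  # list of token ids
text = spm.decode(ids)                         # back to a string
```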
A wrapper around tiktoken Encoding.
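Similarly, a tiktoken Encoding can be used directly; "cl100k_base" is simply a readily available example encoding, not necessarily the one a given model uses.

```python
import tiktoken

# Load an encoding and round-trip a string through it.
enc = tiktoken.get_encoding("cl100k_base")
ids = enc.encode("Hello world")  # list of token ids
text = enc.decode(ids)           # back to a string
```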
PEFT Components¶
LoRA linear layer as introduced in LoRA: Low-Rank Adaptation of Large Language Models.
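A minimal LoRA linear following the paper looks like the sketch below: a frozen dense weight plus a trainable low-rank update scaled by alpha / rank. The attribute names and initialization choices are illustrative rather than the exact torchtune implementation.

```python
import math
import torch
from torch import nn

class LoRALinear(nn.Module):
    """Sketch of a LoRA linear: frozen base weight plus trainable low-rank A/B factors."""

    def __init__(self, in_dim, out_dim, rank=8, alpha=16.0):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=False)
        self.linear.weight.requires_grad_(False)  # frozen pretrained weight
        self.lora_a = nn.Parameter(torch.empty(rank, in_dim))
        self.lora_b = nn.Parameter(torch.zeros(out_dim, rank))  # zero-init so the update starts at 0
        nn.init.kaiming_uniform_(self.lora_a, a=math.sqrt(5))
        self.scaling = alpha / rank

    def forward(self, x):
        return self.linear(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)
```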
Interface for an nn.Module containing adapter weights.
Return the subset of parameters from a model that correspond to an adapter.
Set trainable parameters for an nn.Module based on a state dict of adapter parameters.
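The adapter interface and the two helpers described above can be sketched together as follows; the adapter_params() method name and the helpers' exact signatures are assumptions for illustration, not the torchtune API.

```python
import torch
from torch import nn

class LoRAAdapter(nn.Module):
    """An nn.Module whose adapter_params() names the parameters that belong to the adapter."""

    def __init__(self, in_dim, out_dim, rank=8):
        super().__init__()
        self.lora_a = nn.Parameter(torch.zeros(rank, in_dim))
        self.lora_b = nn.Parameter(torch.zeros(out_dim, rank))

    def adapter_params(self):
        return ["lora_a", "lora_b"]

def get_adapter_params(model: nn.Module) -> dict:
    """Return the subset of model parameters owned by adapter modules, keyed by full name."""
    params = {}
    for module_name, module in model.named_modules():
        if hasattr(module, "adapter_params"):
            local = dict(module.named_parameters(recurse=False))
            for param_name in module.adapter_params():
                full_name = f"{module_name}.{param_name}" if module_name else param_name
                params[full_name] = local[param_name]
    return params

def set_trainable_params(model: nn.Module, adapter_params: dict) -> None:
    """Freeze every parameter except those present in adapter_params."""
    for name, param in model.named_parameters():
        param.requires_grad_(name in adapter_params)
```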
Module Utilities¶
These utilities are common to all modules and can be used by any of them.
A state_dict hook that replaces NF4 tensors with their restored higher-precision weight and optionally offloads the restored weight to CPU.
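A hedged sketch of such a hook is shown below. The NF4 detection and dequantization are approximated with a plain isinstance check and .to() call, and the registration uses the private nn.Module._register_state_dict_hook; none of this is the exact torchtune/torchao call sequence.

```python
from functools import partial

import torch
from torch import nn

def reparametrize_state_dict_hook(module, state_dict, prefix, local_metadata,
                                  dtype=torch.bfloat16, offload_to_cpu=True):
    # Walk the populated state_dict and swap each (quantized) tensor for a
    # higher-precision copy, optionally moving the restored copy to CPU.
    for key, value in state_dict.items():
        if isinstance(value, torch.Tensor):     # stand-in for "is this an NF4 tensor?"
            restored = value.to(dtype)          # stand-in for dequantizing to higher precision
            state_dict[key] = restored.cpu() if offload_to_cpu else restored
    return state_dict

# Usage sketch: attach the hook so calls to model.state_dict() return restored weights.
model = nn.Linear(4, 4)
model._register_state_dict_hook(partial(reparametrize_state_dict_hook, dtype=torch.float32))
sd = model.state_dict()
```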