Shortcuts

torchtune.data

Text templates

Templates for instruct prompts and chat prompts. Includes some specific formatting for difference datasets and models.

InstructTemplate

Interface for instruction templates.

AlpacaInstructTemplate

Prompt template for Alpaca-style datasets.

GrammarErrorCorrectionTemplate

Prompt template for grammar correction datasets.

SummarizeTemplate

Prompt template to format datasets for summarization tasks.

StackExchangedPairedTemplate

Prompt template for preference datasets similar to StackExchangedPaired.

ChatFormat

Interface for chat formats.

ChatMLFormat

OpenAI's Chat Markup Language used by their chat models.

Llama2ChatFormat

Chat format that formats human and system prompts with appropriate tags used in Llama2 pre-training.

MistralChatFormat

Formats according to Mistral's instruct model.

Types

Message

This dataclass represents individual messages in an instruction or chat dataset.

Converters

Converts data from common JSON formats into a torchtune Message.

get_sharegpt_messages

Convert a chat sample adhering to the ShareGPT json structure to torchtune's Message structure.

get_openai_messages

Convert a chat sample adhering to the OpenAI API json structure to torchtune's Message structure.

Helper funcs

Miscellaneous helper functions used in modifying data.

validate_messages

Given a list of messages, ensure that messages form a valid back-and-forth conversation.

truncate

Truncate a list of tokens to a maximum length.

Docs

Access comprehensive developer documentation for PyTorch

View Docs

Tutorials

Get in-depth tutorials for beginners and advanced developers

View Tutorials

Resources

Find development resources and get your questions answered

View Resources