Shortcuts

Message

class torchtune.data.Message(role: Literal['system', 'user', 'assistant', 'ipython'], content: Union[str, List[Dict[str, Any]]], masked: bool = False, ipython: bool = False, eot: bool = True)[source]

This class represents individual messages in a fine-tuning dataset. It supports text-only content, text with interleaved images, and tool calls. The ModelTokenizer will tokenize the content of the message using tokenize_messages and attach the appropriate special tokens based on the flags set in this class.

Parameters:
  • role (Role) – role of the message writer. Can be “system” for system prompts, “user” for human prompts, “assistant” for model responses, or “ipython” for tool call returns.

  • content (Union[str, List[Dict[str, Any]]]) –

    content of the message. If it is text only content, you can pass in a string. If it is multimodal content, pass in a list of dictionaries formatted as follows:

    [
        {"type": "image", "content": <PIL.Image.Image>},
        {"type": "text", "content": "What is in this image?"},
    ]
    

  • masked (bool) – whether the message is masked in the sample. If True, do not use in loss calculation. Default: False

  • ipython (bool) – whether the message is a tool call. Default: False

  • eot (bool) –

    whether the message corresponds to the end of a turn, where control is handed over to the assistant from the user or the user from the assistant. Default: True. Should be true in most cases except for:

    • For multiple consecutive assistant messages (i.e., tool calls by assistant), only the last assistant message will have eot=True

    • All ipython messages (tool call returns) should set eot=False.

Note

Message class expects any image content to be in PIL Image format.

property contains_media: bool

Returns whether the message contains media.

classmethod from_dict(d: dict) Message[source]

Construct a Message from a dictionary.

Parameters:

d (dict) – dictionary containing the fields of the Message.

Returns:

constructed Message.

Return type:

Message

get_media() List[PIL.Image.Image][source]

Returns media content of the message.

property text_content: str

Returns text-only content of the message.

Docs

Access comprehensive developer documentation for PyTorch

View Docs

Tutorials

Get in-depth tutorials for beginners and advanced developers

View Tutorials

Resources

Find development resources and get your questions answered

View Resources