format_content_with_images

torchtune.data.format_content_with_images(content: str, *, image_tag: str, images: List[PIL.Image.Image]) → List[Dict[str, Any]]

Given a raw text string, split it on the specified image_tag and build the list of dictionaries used as the content field of a Message. The returned list is the "content" value in a structure like:

[
    {
        "role": "system" | "user" | "assistant",
        "content":
            [
                {"type": "image", "content": <PIL.Image.Image>},
                {"type": "text", "content": "This is a sample image."},
            ],
    },
    ...
]
Parameters:
  • content (str) – raw message text

  • image_tag (str) – string to split the text by

  • images (List["PIL.Image.Image"]) – list of images to be used in the content

Raises:

ValueError – If the number of images does not match the number of image tags in the content

Examples

>>> content = format_content_with_images(
...     "<|image|>hello <|image|>world",
...     image_tag="<|image|>",
...     images=[<PIL.Image.Image>, <PIL.Image.Image>]
... )
>>> print(content)
[
    {"type": "image", "content": <PIL.Image.Image>},
    {"type": "text", "content": "hello "},
    {"type": "image", "content": <PIL.Image.Image>},
    {"type": "text", "content": "world"}
]
Returns:

list of dictionaries to be used in the Message content field

Return type:

List[Dict[str, Any]]
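
The behavior described above can be sketched in plain Python. This is a minimal illustration, not torchtune's implementation: the name split_on_image_tag is hypothetical, plain objects stand in for PIL.Image.Image instances, and dropping empty text segments is an assumption inferred from the example output, which has no empty text entry before the first image.

```python
from typing import Any, Dict, List


def split_on_image_tag(
    content: str, *, image_tag: str, images: List[Any]
) -> List[Dict[str, Any]]:
    """Hypothetical stand-in that mirrors the documented behavior."""
    # The documented ValueError: image count must equal tag count.
    num_tags = content.count(image_tag)
    if len(images) != num_tags:
        raise ValueError(
            f"Got {len(images)} images but {num_tags} image tags in content"
        )

    pieces = content.split(image_tag)
    remaining = iter(images)
    result: List[Dict[str, Any]] = []
    for i, piece in enumerate(pieces):
        # Assumption: empty text segments are skipped.
        if piece:
            result.append({"type": "text", "content": piece})
        # Every split boundary except the last marks an image tag.
        if i < len(pieces) - 1:
            result.append({"type": "image", "content": next(remaining)})
    return result


# Placeholder objects stand in for PIL.Image.Image instances.
img_a, img_b = object(), object()
parts = split_on_image_tag(
    "<|image|>hello <|image|>world",
    image_tag="<|image|>",
    images=[img_a, img_b],
)
print([p["type"] for p in parts])
```

The sketch interleaves images in the order they are passed, so the first image in the list fills the first tag in the text. Passing a list whose length differs from the number of tags raises ValueError, as in the Raises entry above.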
