ChosenRejectedToMessages

class torchtune.data.ChosenRejectedToMessages(train_on_input: Optional[bool] = None, column_map: Optional[Dict[str, str]] = None, new_system_prompt: Optional[str] = None, masking_strategy: Optional[str] = 'train_on_assistant')

Transform that converts a single sample from a dataset whose “chosen” and “rejected” columns each contain a conversation into two lists of messages, one for the chosen conversation and one for the rejected conversation. For example:

|  chosen                                |  rejected                              |
|----------------------------------------|----------------------------------------|
| [{"role": "user", "content": Q1},      | [{"role": "user", "content": Q1},      |
|  {"role": "assistant", "content": A1}] |  {"role": "assistant", "content": A2}] |

will be converted to:

chosen = [
    Message(role="user", content="Q1"),
    Message(role="assistant", content="A1"),
]
rejected = [
    Message(role="user", content="Q1"),
    Message(role="assistant", content="A2"),
]

A single sample typically consists of one optional system prompt followed by one or more turns of user and assistant messages.
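The conversion above can be sketched in plain Python. This is a minimal illustration and not torchtune's implementation: `SketchMessage`, `to_messages`, and `chosen_rejected_to_messages` are hypothetical stand-ins, and returning a dict keyed by "chosen" and "rejected" is an assumption. The `new_system_prompt` argument mirrors the parameter of the same name documented on this page.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class SketchMessage:
    """Hypothetical stand-in for a message; only role and content are modeled."""
    role: str
    content: str


def to_messages(
    conversation: List[Dict[str, str]],
    new_system_prompt: Optional[str] = None,
) -> List[SketchMessage]:
    """Convert one conversation (a list of role/content dicts) to messages."""
    messages: List[SketchMessage] = []
    if new_system_prompt is not None:
        # Prepend the new system prompt.
        messages.append(SketchMessage("system", new_system_prompt))
    for turn in conversation:
        # A new system prompt overrides system messages found in the data.
        if turn["role"] == "system" and new_system_prompt is not None:
            continue
        messages.append(SketchMessage(turn["role"], turn["content"]))
    return messages


def chosen_rejected_to_messages(
    sample: Dict[str, Any],
    new_system_prompt: Optional[str] = None,
) -> Dict[str, List[SketchMessage]]:
    """Assumed output shape: one message list per column."""
    return {
        "chosen": to_messages(sample["chosen"], new_system_prompt),
        "rejected": to_messages(sample["rejected"], new_system_prompt),
    }


sample = {
    "chosen": [
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "Q1"},
        {"role": "assistant", "content": "A1"},
    ],
    "rejected": [
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "Q1"},
        {"role": "assistant", "content": "A2"},
    ],
}

converted = chosen_rejected_to_messages(sample)
overridden = chosen_rejected_to_messages(sample, new_system_prompt="Be thorough.")
```

In this sketch the two conversations share the same prompt turns and differ only in the final assistant reply, which is the usual shape of preference data.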

Parameters:

train_on_input (Optional[bool]) – whether the model is trained on the user prompt. This parameter is deprecated and will be removed in a future release. Default is None.
column_map (Optional[Dict[str, str]]) – a mapping to change the expected “chosen” and “rejected” column names to the actual column names in the dataset. Keys should be “chosen” and “rejected” and values should be the actual column names. Default is None, keeping the default column names.
new_system_prompt (Optional[str]) – if specified, prepend a system message. This can serve as instructions to guide the model response. Setting this will OVERRIDE any system messages already present in the dataset. Default is None.
masking_strategy (Optional[str]) – masking strategy to use for model training. Must be one of: train_on_all, train_on_assistant, train_on_last. Default is “train_on_assistant”.
- train_on_all: both user and assistant messages are unmasked
- train_on_assistant: user messages are masked, only assistant messages are unmasked
- train_on_last: only the last assistant message is unmasked
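
The three masking strategies can be illustrated with a small pure-Python sketch. It follows only the descriptions above and is not torchtune's code: `masked_flags` is a hypothetical helper, it assumes a masked message is one excluded from the training loss, and it covers only user and assistant roles because the list above does not say how system messages are treated.

```python
from typing import List

VALID_STRATEGIES = ("train_on_all", "train_on_assistant", "train_on_last")


def masked_flags(roles: List[str], masking_strategy: str = "train_on_assistant") -> List[bool]:
    """Return one flag per message; True means the message is masked."""
    if masking_strategy not in VALID_STRATEGIES:
        raise ValueError(
            f"masking_strategy must be one of {VALID_STRATEGIES}, got {masking_strategy!r}"
        )
    if masking_strategy == "train_on_all":
        # User and assistant messages are all unmasked.
        return [False for _ in roles]
    if masking_strategy == "train_on_assistant":
        # Everything except assistant messages is masked.
        return [role != "assistant" for role in roles]
    # train_on_last: only the final assistant message stays unmasked.
    assistant_positions = [i for i, role in enumerate(roles) if role == "assistant"]
    last = assistant_positions[-1] if assistant_positions else None
    return [i != last for i in range(len(roles))]


roles = ["user", "assistant", "user", "assistant"]
for strategy in VALID_STRATEGIES:
    print(strategy, masked_flags(roles, strategy))
```

For the two-turn conversation in the sketch, train_on_assistant masks both user messages, while train_on_last additionally masks the first assistant reply.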

Raises:

ValueError – If column_map is provided and does not contain both the “chosen” and “rejected” keys.
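
As a rough illustration of this check, the sketch below validates a column_map the way the description suggests. `resolve_column_map` is a hypothetical helper, the error message wording is invented, and the column names "preferred" and "dispreferred" are made up for the example.

```python
from typing import Dict, Optional


def resolve_column_map(column_map: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return the mapping from expected names to actual dataset column names."""
    if column_map is None:
        # No mapping given: keep the default column names.
        return {"chosen": "chosen", "rejected": "rejected"}
    for key in ("chosen", "rejected"):
        if key not in column_map:
            raise ValueError(
                f"Expected a key of '{key}' in column_map but found {list(column_map)}."
            )
    return column_map


# Keys are the expected names; values are the names used by the dataset.
mapping = resolve_column_map({"chosen": "preferred", "rejected": "dispreferred"})
print(mapping["chosen"], mapping["rejected"])
```

A mapping that supplies only one of the two keys, such as `{"chosen": "preferred"}`, raises ValueError in this sketch.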
