InputOutputToMessages¶
- class torchtune.data.InputOutputToMessages(train_on_input: bool = False, column_map: Optional[Dict[str, str]] = None, new_system_prompt: Optional[str] = None)[source]¶
Message transform class that converts a single sample with “input” and “output” fields, (or equivalent fields specified in column_map) to user and assistant messages, respectively. This is useful for datasets that have two columns, one containing the user prompt string and the other containing the model response string:
| input | output | |-----------------|------------------| | "user prompt" | "model response" |
- Parameters:
train_on_input (bool) – Whether the model is trained on the user prompt or not. Default is False.
column_map (Optional[Dict[str, str]]) – a mapping to change the expected “input” and “output” column names to the actual column names in the dataset. Keys should be “input” and “output” and values should be the actual column names. Default is None, keeping the default “input” and “output” column names.
new_system_prompt (Optional[str]) – if specified, prepend a system message. This can serve as instructions to guide the model response. Default is None.
- Raises:
ValueError – If
column_map
is provided andinput
not incolumn_map
, oroutput
not incolumn_map
.