.. _preference_dataset_usage_label:

===================
Preference Datasets
===================

Preference datasets are used for reward modelling, where the downstream task is to fine-tune a base model
to capture some underlying human preferences. Currently, these datasets are used in torchtune with the
Direct Preference Optimization (DPO) `recipe <https://github.com/pytorch/torchtune/blob/main/recipes/lora_dpo_single_device.py>`_.

The ground truth in preference datasets is usually the outcome of a binary comparison between two completions
for the same prompt, where a human annotator has indicated that one completion is preferable to the other,
according to some pre-set criterion. These prompt-completion pairs could be instruct style (single-turn, optionally
with a system prompt), chat style (multi-turn), or some other set of interactions between a user and model
(e.g. free-form text completion).

The primary entry point for fine-tuning with preference datasets in torchtune with the DPO recipe is
:func:`~torchtune.datasets.preference_dataset`.

Example local preference dataset
--------------------------------

.. code-block:: bash

    # my_preference_dataset.json
    [
        {
            "chosen_conversations": [
                {
                    "content": "What do I do when I have a hole in my trousers?",
                    "role": "user"
                },
                {
                    "content": "Fix the hole.",
                    "role": "assistant"
                }
            ],
            "rejected_conversations": [
                {
                    "content": "What do I do when I have a hole in my trousers?",
                    "role": "user"
                },
                {
                    "content": "Take them off.",
                    "role": "assistant"
                }
            ]
        }
    ]

.. code-block:: python

    from torchtune.models.mistral import mistral_tokenizer
    from torchtune.datasets import preference_dataset

    m_tokenizer = mistral_tokenizer(
        path="/tmp/Mistral-7B-v0.1/tokenizer.model",
        prompt_template="torchtune.models.mistral.MistralChatTemplate",
        max_seq_len=8192,
    )
    column_map = {
        "chosen": "chosen_conversations",
        "rejected": "rejected_conversations"
    }
    ds = preference_dataset(
        tokenizer=m_tokenizer,
        source="json",
        column_map=column_map,
        data_files="my_preference_dataset.json",
        train_on_input=False,
        split="train",
    )
    tokenized_dict = ds[0]
    print(m_tokenizer.decode(tokenized_dict["rejected_input_ids"]))
    # user\n\nWhat do I do when I have a hole in my trousers?assistant\n\nTake them off.
    print(tokenized_dict["rejected_labels"])
    # [-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100, -100,-100,\
    # -100,-100,-100,-100,-100,128006,78191,128007,271,18293,1124,1022,13,128009,-100]

This can also be accomplished via the yaml config:

.. code-block:: yaml

    # In config
    tokenizer:
      _component_: torchtune.models.mistral.mistral_tokenizer
      path: /tmp/Mistral-7B-v0.1/tokenizer.model
      prompt_template: torchtune.models.mistral.MistralChatTemplate
      max_seq_len: 8192

    dataset:
      _component_: torchtune.datasets.preference_dataset
      source: json
      data_files: my_preference_dataset.json
      column_map:
        chosen: chosen_conversations
        rejected: rejected_conversations
      train_on_input: False
      split: train

In this example, we've also shown how ``column_map`` can be used when the column names in your dataset
differ from the default "chosen" and "rejected" column names.

Preference dataset format
-------------------------

Preference datasets are expected to have two columns: *"chosen"*, which contains the human annotator's preferred
response, and *"rejected"*, which contains the human annotator's dis-preferred response. Each of these columns
should contain a list of messages that share an identical prompt. The list of messages could include a system
prompt, an instruction, multiple turns between user and assistant, or tool calls/returns.
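For illustration, here is a minimal sketch (plain Python, reusing the example data from above) of the structure
each sample is expected to have once any ``column_map`` has been applied: a "chosen" and a "rejected" list of
messages that share the same prompt and differ only in the final assistant response.

.. code-block:: python

    # Sketch of a single preference sample. The two message lists are identical
    # except for the final assistant turn.
    sample = {
        "chosen": [
            {"role": "user", "content": "What do I do when I have a hole in my trousers?"},
            {"role": "assistant", "content": "Fix the hole."},
        ],
        "rejected": [
            {"role": "user", "content": "What do I do when I have a hole in my trousers?"},
            {"role": "assistant", "content": "Take them off."},
        ],
    }

    # Illustrative sanity check: everything except the last assistant message
    # should match across the two columns.
    assert sample["chosen"][:-1] == sample["rejected"][:-1]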
Let's take a look at Anthropic's helpfulness/harmlessness dataset `on Hugging Face
<https://huggingface.co/datasets/RLHFlow/HH-RLHF-Helpful-standard>`_ as an example of a multi-turn chat-style format:

.. code-block:: text

    | chosen                                | rejected                              |
    |---------------------------------------|---------------------------------------|
    |[{                                     |[{                                     |
    | "role": "user",                       | "role": "user",                       |
    | "content": "helping my granny with her| "content": "helping my granny with her|
    | mobile phone issue"                   | mobile phone issue"                   |
    | },                                    | },                                    |
    | {                                     | {                                     |
    | "role": "assistant",                  | "role": "assistant",                  |
    | "content": "I see you are chatting    | "content": "Well, the best choice here|
    | with your grandmother about an issue  | could be helping with so-called 'self-|
    | with her mobile phone. How can I      | management behaviors'. These are      |
    | help?"                                | things your grandma can do on her own |
    | },                                    | to help her feel more in control."    |
    | {                                     | }]                                    |
    | "role": "user",                       |                                       |
    | "content": "her phone is not turning  |                                       |
    | on"                                   |                                       |
    | },                                    |                                       |
    | {...},                                |                                       |
    |]                                      |                                       |

Currently, only JSON-format conversations are supported, as shown in the example above. You can use this dataset
out-of-the-box in torchtune through :func:`~torchtune.datasets.hh_rlhf_helpful_dataset`.

Loading preference datasets from Hugging Face
---------------------------------------------

To load a preference dataset from Hugging Face, you'll need to pass the dataset repo name to ``source``.
For most HF datasets, you will also need to specify the ``split``.

.. code-block:: python

    from torchtune.models.gemma import gemma_tokenizer
    from torchtune.datasets import preference_dataset

    g_tokenizer = gemma_tokenizer("/tmp/gemma-7b/tokenizer.model")
    ds = preference_dataset(
        tokenizer=g_tokenizer,
        source="hendrydong/preference_700K",
        split="train",
    )

.. code-block:: yaml

    # Tokenizer is passed into the dataset in the recipe so we don't need it here
    dataset:
      _component_: torchtune.datasets.preference_dataset
      source: hendrydong/preference_700K
      split: train

Built-in preference datasets
----------------------------

- :func:`~torchtune.datasets.hh_rlhf_helpful_dataset` (see the example config below)
- :func:`~torchtune.datasets.stack_exchange_paired_dataset`
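As a sketch of how one of these built-in builders might be used, the config entry below follows the same pattern as
:func:`~torchtune.datasets.preference_dataset` above. The builder's defaults (such as its Hugging Face ``source``
and column mapping) are assumed here; check the API reference for the exact arguments.

.. code-block:: yaml

    # Sketch: built-in helpful/harmless preference dataset in a config.
    # Assumes the builder's defaults cover the source and column names.
    dataset:
      _component_: torchtune.datasets.hh_rlhf_helpful_dataset
      split: train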