alpaca_cleaned_dataset
- torchtune.datasets.alpaca_cleaned_dataset(tokenizer: ModelTokenizer, *, source: str = 'yahma/alpaca-cleaned', column_map: Optional[Dict[str, str]] = None, train_on_input: bool = True, packed: bool = False, filter_fn: Optional[Callable] = None, split: str = 'train', **load_dataset_kwargs: Dict[str, Any]) → Union[SFTDataset, PackedDataset]
Builder for a variant of Alpaca-style datasets that uses the cleaned version of the original Alpaca dataset, yahma/alpaca-cleaned. See the dataset page and alpaca_dataset() for more details.
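As a rough illustration of the signature above, the following sketch wraps the builder in a small helper. The helper name `build_cleaned_alpaca` is hypothetical and not part of torchtune. The sketch assumes that a `ModelTokenizer` instance has already been constructed elsewhere, and that `source` and any extra keyword arguments are forwarded to the underlying dataset loader, as the `**load_dataset_kwargs` parameter suggests.

```python
from typing import Any


def build_cleaned_alpaca(
    tokenizer: Any,
    *,
    train_on_input: bool = True,
    split: str = "train",
    **load_dataset_kwargs: Any,
):
    """Hypothetical convenience wrapper around alpaca_cleaned_dataset.

    `tokenizer` is assumed to be a torchtune ModelTokenizer built elsewhere.
    The keyword defaults mirror those shown in the signature above.
    """
    # Imported lazily so that defining this helper does not require torchtune.
    from torchtune.datasets import alpaca_cleaned_dataset

    return alpaca_cleaned_dataset(
        tokenizer,
        source="yahma/alpaca-cleaned",    # default source from the signature
        train_on_input=train_on_input,    # forwarded unchanged to the builder
        packed=False,                     # per the signature, an SFTDataset rather than a PackedDataset
        split=split,
        **load_dataset_kwargs,            # extra loader options, passed through
    )
```

A call such as `build_cleaned_alpaca(tokenizer, train_on_input=False)` would then return the dataset object produced by the builder. Whether a particular `split` other than `'train'` exists depends on the dataset named by `source`.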