mistral_tokenizer
- torchtune.models.mistral.mistral_tokenizer(path: str, max_seq_len: Optional[int] = None, prompt_template: Optional[Union[str, Dict[Literal['system', 'user', 'assistant', 'ipython', 'tool'], Tuple[str, str]]]] = 'torchtune.models.mistral.MistralChatTemplate', truncation_type: str = 'right') → MistralTokenizer [source]
Tokenizer for Mistral models.
- Parameters:
path (str) – path to the tokenizer
max_seq_len (Optional[int]) – maximum sequence length for tokenizing a single list of messages, after which the input will be truncated. Default is None.
prompt_template (Optional[_TemplateType]) – optional specified prompt template. If a string, it is assumed to be the dotpath of a PromptTemplateInterface class. If a dictionary, it is assumed to be a custom prompt template mapping role to the prepend/append tags. Default is MistralChatTemplate.
truncation_type (str) – type of truncation to apply, either "left" or "right". Default is "right".
- Returns:
Instantiation of the Mistral tokenizer
- Return type:
MistralTokenizer
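The truncation_type parameter controls which end of an over-long token sequence is dropped once max_seq_len is exceeded. A minimal sketch of that semantics (illustrative only, not torchtune's implementation — the helper name `truncate` is hypothetical):

```python
# Illustrative sketch of truncation_type semantics ("left" vs "right").
# NOT torchtune's implementation; it only shows which end of an
# over-long token sequence is discarded.

def truncate(tokens, max_seq_len, truncation_type="right"):
    """Truncate a token list to max_seq_len from the chosen side."""
    if max_seq_len is None or len(tokens) <= max_seq_len:
        return list(tokens)
    if truncation_type == "right":
        # Keep the beginning of the sequence, drop the tail.
        return list(tokens[:max_seq_len])
    if truncation_type == "left":
        # Keep the end of the sequence, drop the head.
        return list(tokens[-max_seq_len:])
    raise ValueError(
        f"truncation_type must be 'left' or 'right', got {truncation_type!r}"
    )

tokens = [1, 2, 3, 4, 5, 6]
print(truncate(tokens, 4, "right"))  # [1, 2, 3, 4]
print(truncate(tokens, 4, "left"))   # [3, 4, 5, 6]
```

In actual use, the builder is called with the path to a downloaded tokenizer model file, e.g. mistral_tokenizer("/path/to/tokenizer.model", max_seq_len=4096), and the returned MistralTokenizer applies this kind of truncation when tokenizing messages.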