parse_hf_tokenizer_json

torchtune.modules.tokenizers.parse_hf_tokenizer_json(tokenizer_json_path: str) → Dict[str, int]

Parse the tokenizer.json file from a Hugging Face model and extract the mapping from each special token string to its ID.

Parameters:

tokenizer_json_path (str) – Path to the tokenizer.json file.

Returns:

A mapping from each special token string to its integer ID.

Return type:

Dict[str, int]
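
Example:

The following is a minimal sketch of the kind of parsing described above, not the library's actual implementation. It assumes the special tokens are listed under the top-level `added_tokens` key of tokenizer.json, with each entry carrying `content` and `id` fields, which is how Hugging Face tokenizers typically serialize them. The helper name `sketch_parse_special_tokens` and the sample file contents are hypothetical. Whether the real function filters on each entry's `special` flag is not stated here, so this sketch keeps every entry.

```python
import json
import os
import tempfile
from typing import Dict


def sketch_parse_special_tokens(tokenizer_json_path: str) -> Dict[str, int]:
    """Hypothetical stand-in: map each added token's string to its ID."""
    with open(tokenizer_json_path, "r", encoding="utf-8") as f:
        tokenizer_json = json.load(f)
    # Assumption: special tokens live under "added_tokens" as
    # {"id": ..., "content": ...} entries.
    return {
        token["content"]: token["id"]
        for token in tokenizer_json.get("added_tokens", [])
    }


# Made-up tokenizer.json contents, for illustration only.
example = {
    "added_tokens": [
        {"id": 0, "content": "<s>", "special": True},
        {"id": 1, "content": "</s>", "special": True},
    ]
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tokenizer.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(example, f)
    mapping = sketch_parse_special_tokens(path)

print(mapping)
```

With an installed torchtune, the documented function would be called the same way, passing the path to a real tokenizer.json in place of the temporary file.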
