torchtune.utils
Checkpointing
TorchTune offers checkpointers to allow seamless transitioning between checkpoint formats for training and interoperability with the rest of the ecosystem. For a comprehensive overview of checkpointing, please see the checkpointing deep-dive.
Checkpointer which reads and writes checkpoints in HF's format.
Checkpointer which reads and writes checkpoints in Meta's format.
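As a rough illustration, here is a minimal sketch of driving the HF-format checkpointer, assuming it is exposed as torchtune.utils.FullModelHFCheckpointer and takes checkpoint_dir, checkpoint_files, model_type, and output_dir arguments as in the recipe configs; the paths, filenames, and model type below are placeholders.

```python
from torchtune.utils import FullModelHFCheckpointer

# Placeholder paths and filenames; point these at a real HF checkpoint directory.
checkpointer = FullModelHFCheckpointer(
    checkpoint_dir="/tmp/llama2-7b-hf",
    checkpoint_files=["pytorch_model-00001-of-00002.bin",
                      "pytorch_model-00002-of-00002.bin"],
    model_type="LLAMA2",
    output_dir="/tmp/finetune-output",
)

# load_checkpoint() returns a state-dict payload that a recipe loads into the model;
# save_checkpoint() writes the (possibly finetuned) weights back out in HF format.
checkpoint_dict = checkpointer.load_checkpoint()
```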
Distributed
Utilities for enabling and working with distributed training.
Initialize torch.distributed.
Function that gets the current world size (aka total number of ranks) and rank number of the current trainer.
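A minimal sketch, assuming init_distributed forwards its keyword arguments to torch.distributed.init_process_group and that get_world_size_and_rank reports (1, 0) in a single-process run.

```python
from torchtune import utils

# Under torchrun this sets up the default process group; in a plain
# single-process run there is nothing to initialize.
utils.init_distributed(backend="nccl")

# Total number of ranks and this trainer's rank; (1, 0) when not distributed.
world_size, rank = utils.get_world_size_and_rank()
print(f"rank {rank} / world size {world_size}")
```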
Reduced Precision
Utilities for working in a reduced precision setting.
Get the torch.dtype corresponding to the given precision string.
Return a list of supported dtypes for finetuning.
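For example (assuming "bf16" is among the supported precision strings; list_dtypes shows the full set):

```python
import torch.nn as nn
from torchtune import utils

print(utils.list_dtypes())         # the precision strings get_dtype understands

dtype = utils.get_dtype("bf16")    # maps the string to torch.bfloat16
model = nn.Linear(16, 16).to(dtype=dtype)
```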
Memory Management
Utilities to reduce memory consumption during training.
Utility to set up activation checkpointing and wrap the model for checkpointing.
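A small sketch, assuming auto_wrap_policy accepts a set of module types to wrap (the recipes typically pass {TransformerDecoderLayer}); nn.Linear here is only a stand-in.

```python
import torch.nn as nn
from torchtune import utils

model = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 32))

# Wrap every nn.Linear so its activations are recomputed in the backward pass
# instead of being stored, trading compute for memory.
utils.set_activation_checkpointing(model, auto_wrap_policy={nn.Linear})
```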
Performance and Profiling
TorchTune provides utilities to profile and debug the performance of your finetuning job.
Utility component that wraps around torch.profiler to profile a model's operators.
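The wrapper's exact interface is not reproduced here; for orientation, this is what the underlying torch.profiler call that it wraps around looks like on its own.

```python
import torch
from torch.profiler import ProfilerActivity, profile

model = torch.nn.Linear(128, 128)
inputs = torch.randn(4, 128)

# Profile operator-level CPU time for a single forward pass.
with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    model(inputs)

print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```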
Metric Logging
Various logging utilities.
Logger for use with the Weights & Biases application (https://wandb.ai/).
Logger for use with PyTorch's implementation of TensorBoard (https://pytorch.org/docs/stable/tensorboard.html).
Logger to standard output.
Logger to disk.
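A minimal sketch of the shared logger interface, assuming the classes live in torchtune.utils.metric_logging and expose log, log_dict, and close; the log directory is a placeholder.

```python
from torchtune.utils.metric_logging import DiskLogger, StdoutLogger

logger = StdoutLogger()
logger.log("loss", 1.23, step=0)                     # a single scalar
logger.log_dict({"loss": 1.1, "lr": 2e-5}, step=1)   # several at once
logger.close()

# DiskLogger writes the same records to a file under log_dir instead of stdout.
disk_logger = DiskLogger(log_dir="/tmp/torchtune-logs")
disk_logger.log("loss", 1.05, step=2)
disk_logger.close()
```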
Data
Utilities for working with data and datasets.
Pad a batch of sequences to the longest sequence length in the batch, and convert integer lists to tensors.
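For example, assuming the collate utility is padded_collate and takes a list of samples with "tokens" and "labels" integer lists (as torchtune's SFT datasets produce) plus padding_idx and ignore_idx arguments:

```python
from torchtune.utils import padded_collate

batch = [
    {"tokens": [1, 2, 3], "labels": [1, 2, 3]},
    {"tokens": [4, 5], "labels": [4, 5]},
]

# Tokens are padded with padding_idx and labels with ignore_idx so the loss
# skips the padded positions; the integer lists become tensors of shape [2, 3].
padded = padded_collate(batch, padding_idx=0, ignore_idx=-100)
print(padded["tokens"].shape, padded["labels"].shape)
```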
Miscellaneous
A helpful utility subclass of argparse.ArgumentParser.
Get a logger with a stream handler.
Function that takes a device or device string, verifies that it is valid and available given the machine and distributed settings, and returns a torch.device.
Function that sets the seed for pseudo-random number generators across commonly used libraries.
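A short sketch using three of these helpers together; the specific argument values (the seed, the log level string, calling get_device with no argument) are assumptions about their signatures.

```python
from torchtune import utils

utils.set_seed(42)              # seed the commonly used RNGs in one call
device = utils.get_device()     # pick and validate an appropriate torch.device
log = utils.get_logger("INFO")  # stdlib logger with a stream handler attached
log.info(f"Running on {device}")
```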