TiedLinear

class torchtune.modules.TiedLinear(tied_module: Module)

A tied linear layer, without bias, that shares the same weight as another linear layer. This is useful for models that use tied weights, such as qwen2_0_5b(), qwen2_1_5b(), and all of the gemma() and llama3_2() models.

It takes the nn.Module itself as input, rather than that module's weight tensor, so that it can work with FSDP. When FSDP wraps a model, the memory pointer to the weight changes, but the nn.Module stays the same object. Passing the module instead of the weight therefore keeps the weights tied even after sharding.
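For example, a model's output projection can be tied to its token embedding. A minimal usage sketch (the vocabulary and embedding sizes below are illustrative, not taken from any real model config):

```python
import torch
from torch import nn
from torchtune.modules import TiedLinear

# Illustrative sizes, not a real model config.
vocab_size, embed_dim = 32_000, 1_024

# Tie the output projection to the token embedding, as tied-weight
# models such as Qwen2-0.5B do. Only the embedding's weight is used.
tok_embeddings = nn.Embedding(vocab_size, embed_dim)
output_proj = TiedLinear(tok_embeddings)

hidden = torch.randn(2, 16, embed_dim)  # (batch, seq_len, embed_dim)
logits = output_proj(hidden)            # (batch, seq_len, vocab_size)
print(logits.shape)                     # torch.Size([2, 16, 32000])
```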

Parameters:

tied_module (nn.Module) – The module whose weight is shared. Only the weight is used; the bias is ignored.

Raises:

AttributeError – If the provided module does not have an attribute ‘weight’.
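To make the design concrete, here is a minimal sketch of how such a layer could be written. This is a hypothetical illustration of the module-reference pattern described above, not torchtune's actual implementation:

```python
import torch
import torch.nn.functional as F
from torch import nn


class TiedLinearSketch:
    """Hypothetical sketch of a bias-free tied linear layer."""

    def __init__(self, tied_module: nn.Module):
        if not hasattr(tied_module, "weight"):
            raise AttributeError(
                "Provided module does not have attribute 'weight': "
                f"got {type(tied_module).__name__}"
            )
        # Keep a reference to the module, not to its weight tensor.
        self.tied_module = tied_module

    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        # The weight is looked up on every call, so if FSDP swaps the
        # underlying parameter storage, the projection stays tied.
        return F.linear(x, self.tied_module.weight)
```

Because the weight is re-read at call time rather than captured once in the constructor, resharding the tied module never breaks the tie.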
