TiedLinear

class torchtune.modules.TiedLinear(tied_module: Module)

A tied linear layer, without bias, that shares its weight with another linear layer. This is useful for models that tie weights, such as qwen2_0_5b(), qwen2_1_5b(), and all of the gemma() models. It takes the nn.Module itself as input, rather than the module's weight tensor, so that it works with FSDP: a reference to the bare weight would go stale once FSDP reshards the parameters, whereas the module reference continues to resolve to the live weight.

Parameters:

tied_module (nn.Module) – The module whose weight is shared. Only the weight is used; any bias on the module is ignored.

Raises:

AttributeError – If the provided module does not have an attribute ‘weight’.
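Example:

A minimal usage sketch showing an output projection tied to a token embedding. The vocabulary size, embedding dimension, and tensor shapes below are illustrative, not taken from any torchtune model builder:

import torch
import torch.nn as nn
from torchtune.modules import TiedLinear

# Token embedding whose weight will also serve as the output projection.
embed = nn.Embedding(num_embeddings=1000, embedding_dim=64)

# TiedLinear holds the module itself, not embed.weight, so the tie
# survives FSDP resharding the underlying parameter storage.
output_proj = TiedLinear(embed)

x = torch.randn(2, 8, 64)   # (batch, seq_len, embed_dim)
logits = output_proj(x)     # (batch, seq_len, num_embeddings)
print(logits.shape)         # torch.Size([2, 8, 1000])

No new parameters are allocated: the projection is computed directly against the embedding's weight, which is why only modules exposing a weight attribute are accepted (see Raises above).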
