LMHeadActorValueOperator
- class torchrl.modules.tensordict_module.LMHeadActorValueOperator(*args, **kwargs)
Builds an actor-value operator from a Hugging Face-like *LMHeadModel.
This class:
    - takes a Hugging Face-like *LMHeadModel as input;
    - extracts the final linear layer, uses it as the base layer of the actor head, and adds the sampling layer;
    - uses the transformer as the common (shared) module;
    - adds a linear critic.
- Parameters:
base_model – a PyTorch model composed of a .transformer module and a .lm_head linear layer (for example, GPT2LMHeadModel from the transformers library exposes both attributes)
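
The four steps listed above can be illustrated with a minimal plain-PyTorch sketch. It is not the TorchRL implementation: `ToyTransformer`, `ToyLMHeadModel` and `ActorValueSketch` are hypothetical names introduced here, the toy trunk is a stand-in for a real transformer, and reading the hidden state at the last sequence position is an assumption made for the example. The TensorDict keys used by the real operator are not modeled.

```python
import torch
from torch import nn


class ToyTransformer(nn.Module):
    """Hypothetical stand-in for the `.transformer` trunk of an *LMHeadModel."""

    def __init__(self, vocab_size: int, hidden_size: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.mix = nn.Linear(hidden_size, hidden_size)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        # (batch, seq_len) -> (batch, seq_len, hidden_size)
        return torch.tanh(self.mix(self.embed(input_ids)))


class ToyLMHeadModel(nn.Module):
    """Hypothetical base_model exposing `.transformer` and `.lm_head`."""

    def __init__(self, vocab_size: int = 11, hidden_size: int = 8):
        super().__init__()
        self.transformer = ToyTransformer(vocab_size, hidden_size)
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)


class ActorValueSketch(nn.Module):
    """Illustrative actor-value wiring; not the library's class."""

    def __init__(self, base_model: nn.Module):
        super().__init__()
        # Shared trunk: the transformer is used as the common module.
        self.common = base_model.transformer
        # Actor head: the final linear layer of the language model is reused.
        self.actor_head = base_model.lm_head
        # Critic: a new linear layer mapping hidden features to a scalar value.
        self.critic = nn.Linear(self.actor_head.in_features, 1, bias=False)

    def forward(self, input_ids: torch.Tensor):
        hidden = self.common(input_ids)
        # Assumption for this sketch: use the features of the last position.
        last = hidden[:, -1, :]
        logits = self.actor_head(last)
        # Sampling layer: draw a token id from a categorical distribution.
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        return action, dist.log_prob(action), self.critic(last)


torch.manual_seed(0)
base_model = ToyLMHeadModel()
model = ActorValueSketch(base_model)
input_ids = torch.randint(0, 11, (4, 5))  # batch of 4 sequences, 5 tokens each
action, log_prob, state_value = model(input_ids)
print(action.shape, log_prob.shape, state_value.shape)
```

Because the actor head and the critic read the same trunk output, the transformer runs once per forward pass and both heads share its parameters, which is the usual motivation for an actor-value layout.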