VDNMixer
class torchrl.modules.VDNMixer(n_agents: int, device: Union[device, str, int])[source]
Value-Decomposition Network mixer.
Mixes the local Q values of the agents into a global Q value by summing them together. From the paper https://arxiv.org/abs/1706.05296.
It transforms the local value of each agent’s chosen action, of shape (*B, self.n_agents, 1), into a global value of shape (*B, 1). Used with torchrl.objectives.QMixerLoss. See examples/multiagent/qmix_vdn.py for examples.
Parameters:
n_agents (int) – number of agents.
device (str or torch.device) – torch device for the network.
Examples
>>> import torch
>>> from tensordict import TensorDict
>>> from tensordict.nn import TensorDictModule
>>> from torchrl.modules.models.multiagent import VDNMixer
>>> n_agents = 4
>>> vdn = TensorDictModule(
...     module=VDNMixer(
...         n_agents=n_agents,
...         device="cpu",
...     ),
...     in_keys=[("agents","chosen_action_value")],
...     out_keys=["chosen_action_value"],
... )
>>> td = TensorDict({"agents": TensorDict({"chosen_action_value": torch.zeros(32, n_agents, 1)}, [32, n_agents])}, [32])
>>> td
TensorDict(
    fields={
        agents: TensorDict(
            fields={
                chosen_action_value: Tensor(shape=torch.Size([32, 4, 1]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([32, 4]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([32]),
    device=None,
    is_shared=False)
>>> vdn(td)
TensorDict(
    fields={
        agents: TensorDict(
            fields={
                chosen_action_value: Tensor(shape=torch.Size([32, 4, 1]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([32, 4]),
            device=None,
            is_shared=False),
        chosen_action_value: Tensor(shape=torch.Size([32, 1]), device=cpu, dtype=torch.float32, is_shared=False)},
    batch_size=torch.Size([32]),
    device=None,
    is_shared=False)
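Because VDN mixing is simply a sum over the agent dimension, the mixed value can be checked against a direct reduction. The following is a minimal sketch (assuming the mixer, being a regular nn.Module, can also be called directly on the chosen-action-value tensor rather than through a TensorDictModule):

>>> import torch
>>> from torchrl.modules.models.multiagent import VDNMixer
>>> n_agents = 4
>>> mixer = VDNMixer(n_agents=n_agents, device="cpu")
>>> local_q = torch.randn(32, n_agents, 1)   # per-agent values of the chosen actions, shape (*B, n_agents, 1)
>>> global_q = mixer(local_q)                # mixed global value, shape (*B, 1)
>>> # VDN decomposition: the global value is the sum of the local ones
>>> torch.testing.assert_close(global_q, local_q.sum(dim=-2))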