MultiAgentConvNet¶
- class torchrl.modules.MultiAgentConvNet(n_agents: int, centralised: bool, share_params: bool, *, in_features: int | None = None, device: DEVICE_TYPING | None = None, num_cells: Sequence[int] | None = None, kernel_sizes: Union[Sequence[Union[int, Sequence[int]]], int] = 5, strides: Union[Sequence, int] = 2, paddings: Union[Sequence, int] = 0, activation_class: Type[nn.Module] = <class 'torch.nn.modules.activation.ELU'>, **kwargs)[source]¶
Multi-agent CNN.
In MARL settings, agents may or may not share the same policy for their actions: we say that the parameters can be shared or not. Similarly, a network may take the entire observation space (across agents) or on a per-agent basis to compute its output, which we refer to as “centralized” and “non-centralized”, respectively.
It expects inputs with shape
(*B, n_agents, channels, x, y)
.- Parameters:
n_agents (int) – number of agents.
centralised (bool) – If
True
, each agent will use the inputs of all agents to compute its output, resulting in input of shape(*B, n_agents * channels, x, y)
. Otherwise, each agent will only use its data as input.share_params (bool) – If
True
, the sameConvNet
will be used to make the forward pass for all agents (homogeneous policies). Otherwise, each agent will use a differentConvNet
to process its input (heterogeneous policies).
- Keyword Arguments:
in_features (int, optional) – the input feature dimension. If left to
None
, a lazy module is used.device (str or torch.device, optional) – device to create the module on.
num_cells (int or Sequence[int], optional) – number of cells of every layer in between the input and output. If an integer is provided, every layer will have the same number of cells. If an iterable is provided, the linear layers
out_features
will match the content ofnum_cells
.kernel_sizes (int, Sequence[Union[int, Sequence[int]]]) – Kernel size(s) of the convolutional network. Defaults to
5
.strides (int or Sequence[int]) – Stride(s) of the convolutional network. If iterable, the length must match the depth, defined by the num_cells or depth arguments. Defaults to
2
.activation_class (Type[nn.Module]) – activation class to be used. Default to
torch.nn.ELU
.**kwargs – for
ConvNet
can be passed to customize the ConvNet.
Examples
>>> import torch >>> from torchrl.modules import MultiAgentConvNet >>> batch = (3,2) >>> n_agents = 7 >>> channels, x, y = 3, 100, 100 >>> obs = torch.randn(*batch, n_agents, channels, x, y) >>> # First lets consider a centralised network with shared parameters. >>> cnn = MultiAgentConvNet( ... n_agents, ... centralised = True, ... share_params = True ... ) >>> print(cnn) MultiAgentConvNet( (agent_networks): ModuleList( (0): ConvNet( (0): LazyConv2d(0, 32, kernel_size=(5, 5), stride=(2, 2)) (1): ELU(alpha=1.0) (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2)) (3): ELU(alpha=1.0) (4): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2)) (5): ELU(alpha=1.0) (6): SquashDims() ) ) ) >>> result = cnn(obs) >>> # The final dimension of the resulting tensor would be determined based on the layer definition arguments and the shape of input 'obs'. >>> print(result.shape) torch.Size([3, 2, 7, 2592]) >>> # Since both observations and parameters are shared, we expect all agents to have identical outputs (eg. for a value function) >>> print(all(result[0,0,0] == result[0,0,1])) True
>>> # Alternatively, a local network with parameter sharing (eg. decentralised weight sharing policy) >>> cnn = MultiAgentConvNet( ... n_agents, ... centralised = False, ... share_params = True ... ) >>> print(cnn) MultiAgentConvNet( (agent_networks): ModuleList( (0): ConvNet( (0): Conv2d(4, 32, kernel_size=(5, 5), stride=(2, 2)) (1): ELU(alpha=1.0) (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2)) (3): ELU(alpha=1.0) (4): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2)) (5): ELU(alpha=1.0) (6): SquashDims() ) ) ) >>> print(result.shape) torch.Size([3, 2, 7, 2592]) >>> # Parameters are shared but not observations, hence each agent has a different output. >>> print(all(result[0,0,0] == result[0,0,1])) False
>>> # Or multiple local networks identical in structure but with differing weights. >>> cnn = MultiAgentConvNet( ... n_agents, ... centralised = False, ... share_params = False ... ) >>> print(cnn) MultiAgentConvNet( (agent_networks): ModuleList( (0-6): 7 x ConvNet( (0): Conv2d(4, 32, kernel_size=(5, 5), stride=(2, 2)) (1): ELU(alpha=1.0) (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2)) (3): ELU(alpha=1.0) (4): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2)) (5): ELU(alpha=1.0) (6): SquashDims() ) ) ) >>> print(result.shape) torch.Size([3, 2, 7, 2592]) >>> print(all(result[0,0,0] == result[0,0,1])) False
>>> # Or where inputs are shared but not parameters. >>> cnn = MultiAgentConvNet( ... n_agents, ... centralised = True, ... share_params = False ... ) >>> print(cnn) MultiAgentConvNet( (agent_networks): ModuleList( (0-6): 7 x ConvNet( (0): Conv2d(28, 32, kernel_size=(5, 5), stride=(2, 2)) (1): ELU(alpha=1.0) (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2)) (3): ELU(alpha=1.0) (4): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2)) (5): ELU(alpha=1.0) (6): SquashDims() ) ) ) >>> print(result.shape) torch.Size([3, 2, 7, 2592]) >>> print(all(result[0,0,0] == result[0,0,1])) False