VmasEnv
- torchrl.envs.VmasEnv(*args, **kwargs)
VMAS environment wrapper.
GitHub: https://github.com/proroklab/VectorizedMultiAgentSimulator
Paper: https://arxiv.org/abs/2207.03530
- Parameters:
scenario (str or vmas.simulator.scenario.BaseScenario) – the VMAS scenario to build. Must be one of `available_envs`. For a description and rendering of the available scenarios, see the README.
- Keyword Arguments:
num_envs (int) – number of vectorized simulation environments. VMAS performs vectorized simulations using PyTorch. This argument indicates the number of vectorized environments that should be simulated in a batch; it also determines the batch size of the environment (see the sketch after this list).
device (torch.device, optional) – device for simulation. Defaults to the default device. All the tensors created by VMAS will be placed on this device.
continuous_actions (bool, optional) – whether to use continuous actions. Defaults to `True`. If `False`, actions will be discrete. The number of actions and their size will depend on the chosen scenario. See the VMAS repository for more info.
max_steps (int, optional) – horizon of the task. Defaults to `None` (infinite horizon). Each VMAS scenario can be terminating or not. If `max_steps` is specified, the scenario is also terminated (and the `"terminated"` flag is set) whenever this horizon is reached. Unlike gym’s `TimeLimit` transform or torchrl’s `StepCounter`, this argument will not set the `"truncated"` entry in the tensordict.
categorical_actions (bool, optional) – if the environment actions are discrete, whether to transform them to categorical or one-hot. Defaults to `True`.
group_map (MarlGroupMapType or Dict[str, List[str]], optional) – how to group agents in tensordicts for input/output. By default, if the agent names follow the `"<name>_<int>"` convention, they will be grouped by `"<name>"`; if they do not follow this convention, they will all be put in one group named `"agents"`. Otherwise, a group map can be specified or selected from some premade options. See `MarlGroupMapType` for more info.
**kwargs (Dict, optional) – additional arguments that can be passed to the VMAS scenario constructor (e.g., number of agents, reward sparsity). The available arguments will vary based on the chosen scenario. To see the available arguments for a specific scenario, see the constructor in its file in the scenario folder.
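As a hedged sketch of how `num_envs` and the scenario kwargs interact (the "balance" scenario and `n_agents=4` are illustrative choices, not defaults):

>>> from torchrl.envs import VmasEnv
>>> env = VmasEnv(
...     scenario="balance",      # any entry of available_envs works here
...     num_envs=8,              # eight simulations stepped as one batch
...     continuous_actions=True,
...     device="cpu",
...     n_agents=4,              # scenario kwarg forwarded to the VMAS constructor
... )
>>> env.batch_size               # num_envs sets the vectorized batch size
torch.Size([8])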
- Variables:
group_map (Dict[str, List[str]]) – how to group agents in tensordicts for input/output (illustrated in the sketch after this list). See `MarlGroupMapType` for more info.
agent_names (list of str) – names of the agents in the environment.
agent_names_to_indices_map (Dict[str, int]) – dictionary mapping agent names to their index in the environment.
unbatched_action_spec (TensorSpec) – version of the spec without the vectorized dimension.
unbatched_observation_spec (TensorSpec) – version of the spec without the vectorized dimension.
unbatched_reward_spec (TensorSpec) – version of the spec without the vectorized dimension.
het_specs (bool) – whether the environment has any lazy spec.
het_specs_map (Dict[str, bool]) – dictionary mapping each group to a flag representing whether the group has lazy specs.
available_envs (List[str]) – the list of the scenarios available to build.
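To make these attributes concrete, a minimal sketch using the same "flocking" scenario as the example below; the comments describe what each attribute holds rather than exact values, which depend on the scenario:

>>> env = VmasEnv(scenario="flocking", num_envs=8, n_agents=5)
>>> env.group_map              # e.g. {"agents": [...]}, matching the "agents" key in the rollout below
>>> env.agent_names            # the five agent names, in index order
>>> env.unbatched_action_spec  # action spec without the leading [8] vectorized dim
>>> env.het_specs              # whether any group stacks heterogeneous (lazy) specs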
Warning

VMAS returns a single `done` flag which does not distinguish between the env reaching `max_steps` and termination. If you deem the `truncation` signal necessary, set `max_steps` to `None` and use a `StepCounter` transform.
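Following this warning, a minimal sketch of the suggested pattern, assuming `TransformedEnv` and `StepCounter` from torchrl and an illustrative 200-step horizon; `StepCounter` writes the `"truncated"` entry that VMAS itself does not provide:

>>> from torchrl.envs import TransformedEnv, VmasEnv
>>> from torchrl.envs.transforms import StepCounter
>>> # Keep max_steps=None so VMAS never folds truncation into "done" ...
>>> base_env = VmasEnv(scenario="flocking", num_envs=32, max_steps=None, n_agents=5)
>>> # ... and let StepCounter set "truncated" once 200 steps have elapsed.
>>> env = TransformedEnv(base_env, StepCounter(max_steps=200))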
Examples

>>> env = VmasEnv(
...     scenario="flocking",
...     num_envs=32,
...     continuous_actions=True,
...     max_steps=200,
...     device="cpu",
...     seed=None,
...     # Scenario kwargs
...     n_agents=5,
... )
>>> print(env.rollout(10))
TensorDict(
    fields={
        agents: TensorDict(
            fields={
                action: Tensor(shape=torch.Size([32, 10, 5, 2]), device=cpu, dtype=torch.float32, is_shared=False),
                info: TensorDict(
                    fields={
                        agent_collision_rew: Tensor(shape=torch.Size([32, 10, 5, 1]), device=cpu, dtype=torch.float32, is_shared=False),
                        agent_distance_rew: Tensor(shape=torch.Size([32, 10, 5, 1]), device=cpu, dtype=torch.float32, is_shared=False)},
                    batch_size=torch.Size([32, 10, 5]),
                    device=cpu,
                    is_shared=False),
                observation: Tensor(shape=torch.Size([32, 10, 5, 18]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([32, 10, 5]),
            device=cpu,
            is_shared=False),
        done: Tensor(shape=torch.Size([32, 10, 1]), device=cpu, dtype=torch.bool, is_shared=False),
        next: TensorDict(
            fields={
                agents: TensorDict(
                    fields={
                        info: TensorDict(
                            fields={
                                agent_collision_rew: Tensor(shape=torch.Size([32, 10, 5, 1]), device=cpu, dtype=torch.float32, is_shared=False),
                                agent_distance_rew: Tensor(shape=torch.Size([32, 10, 5, 1]), device=cpu, dtype=torch.float32, is_shared=False)},
                            batch_size=torch.Size([32, 10, 5]),
                            device=cpu,
                            is_shared=False),
                        observation: Tensor(shape=torch.Size([32, 10, 5, 18]), device=cpu, dtype=torch.float32, is_shared=False),
                        reward: Tensor(shape=torch.Size([32, 10, 5, 1]), device=cpu, dtype=torch.float32, is_shared=False)},
                    batch_size=torch.Size([32, 10, 5]),
                    device=cpu,
                    is_shared=False),
                done: Tensor(shape=torch.Size([32, 10, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                terminated: Tensor(shape=torch.Size([32, 10, 1]), device=cpu, dtype=torch.bool, is_shared=False)},
            batch_size=torch.Size([32, 10]),
            device=cpu,
            is_shared=False),
        terminated: Tensor(shape=torch.Size([32, 10, 1]), device=cpu, dtype=torch.bool, is_shared=False)},
    batch_size=torch.Size([32, 10]),
    device=cpu,
    is_shared=False)