VmasEnv

torchrl.envs.VmasEnv(*args, **kwargs)

VMAS environment wrapper.

GitHub: https://github.com/proroklab/VectorizedMultiAgentSimulator

Paper: https://arxiv.org/abs/2207.03530

Parameters:

scenario (str or vmas.simulator.scenario.BaseScenario) – the VMAS scenario to build. If given as a string, it must be one of available_envs. For a description and rendering of the available scenarios, see the README.

Keyword Arguments:
  • num_envs (int) – Number of vectorized simulation environments. VMAS performs vectorized simulations using PyTorch. This argument sets the number of environments simulated together in a batch. It also determines the batch size of the environment.

  • device (torch.device, optional) – Device for simulation. Defaults to the default device. All the tensors created by VMAS will be placed on this device.

  • continuous_actions (bool, optional) – Whether to use continuous actions. Defaults to True. If False, actions will be discrete. The number of actions and their size will depend on the chosen scenario. See the VMAS repository for more info.

  • max_steps (int, optional) – Horizon of the task. Defaults to None (infinite horizon). Each VMAS scenario can be terminating or not. If max_steps is specified, the scenario is also terminated (and the "terminated" flag is set) whenever this horizon is reached. Unlike gym’s TimeLimit wrapper or torchrl’s StepCounter transform, this argument will not set the "truncated" entry in the tensordict.

  • categorical_actions (bool, optional) – If the environment actions are discrete, whether to represent them as categorical (True) or one-hot (False). Defaults to True.

  • group_map (MarlGroupMapType or Dict[str, List[str]], optional) – How to group agents in tensordicts for input/output. By default, if the agent names follow the "<name>_<int>" convention, they will be grouped by "<name>". If they do not follow this convention, they will all be put in one group named "agents". Alternatively, a group map can be specified explicitly or selected from the premade options. See MarlGroupMapType for more info.

  • **kwargs (Dict, optional) – Additional arguments passed to the VMAS scenario constructor (e.g., number of agents, reward sparsity). The available arguments vary with the chosen scenario. To see the arguments available for a specific scenario, see the constructor in its file in the scenario folder.
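
The default grouping rule described for group_map can be sketched in plain Python. This is only an illustration of the naming convention, not the TorchRL implementation: the helper name default_group_map is hypothetical, and the sketch assumes that a single agent name that does not match "<name>_<int>" sends every agent to the fallback "agents" group.

```python
import re
from collections import defaultdict
from typing import Dict, List


def default_group_map(agent_names: List[str]) -> Dict[str, List[str]]:
    """Hypothetical helper mirroring the documented default grouping rule."""
    # "<name>_<int>": anything, then an underscore, then one or more digits.
    pattern = re.compile(r"^(?P<name>.+)_(?P<index>\d+)$")
    matches = [pattern.match(name) for name in agent_names]

    # Assumption: if any name breaks the convention, all agents share one group.
    if not all(matches):
        return {"agents": list(agent_names)}

    groups = defaultdict(list)
    for name, match in zip(agent_names, matches):
        groups[match.group("name")].append(name)
    return dict(groups)


grouped = default_group_map(["agent_0", "agent_1", "adversary_0"])
fallback = default_group_map(["leader", "follower"])
```

Under these assumptions, grouped contains one entry per name prefix ("agent" and "adversary"), while fallback contains a single "agents" group holding both names.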

Variables:
  • group_map (Dict[str, List[str]]) – how to group agents in tensordicts for input/output. See MarlGroupMapType for more info.

  • agent_names (list of str) – names of the agents in the environment

  • agent_names_to_indices_map (Dict[str, int]) – dictionary mapping agent names to their index in the environment

  • unbatched_action_spec (TensorSpec) – version of the spec without the vectorized dimension

  • unbatched_observation_spec (TensorSpec) – version of the spec without the vectorized dimension

  • unbatched_reward_spec (TensorSpec) – version of the spec without the vectorized dimension

  • het_specs (bool) – whether the environment has any lazy spec

  • het_specs_map (Dict[str, bool]) – dictionary mapping each group to a flag indicating whether the group has lazy specs

  • available_envs (List[str]) – the list of the scenarios available to build.

Warning

VMAS returns a single done flag, which does not distinguish between reaching max_steps and termination. If you need the truncation signal, set max_steps to None and use a StepCounter transform.
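
A minimal sketch of that workaround follows. It assumes torchrl and vmas are installed, reuses the "flocking" scenario and n_agents=5 from the example on this page, and picks a horizon of 200 purely for illustration. The function name make_vmas_env_with_truncation is hypothetical.

```python
def make_vmas_env_with_truncation(num_envs: int = 32, horizon: int = 200):
    """Build a VmasEnv whose time limit is handled by StepCounter (sketch)."""
    # Imported lazily so this module can be loaded without torchrl/vmas installed.
    from torchrl.envs import StepCounter, TransformedEnv, VmasEnv

    base_env = VmasEnv(
        scenario="flocking",
        num_envs=num_envs,
        max_steps=None,  # let VMAS report only genuine termination
        n_agents=5,  # scenario kwarg, as in the example on this page
    )
    # StepCounter counts steps and marks the horizon as truncation.
    return TransformedEnv(base_env, StepCounter(max_steps=horizon))
```

With this setup the horizon is expected to show up in a "truncated" entry written by StepCounter, while "terminated" is left to the scenario itself.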

Examples

>>> from torchrl.envs import VmasEnv
>>> env = VmasEnv(
...     scenario="flocking",
...     num_envs=32,
...     continuous_actions=True,
...     max_steps=200,
...     device="cpu",
...     seed=None,
...     # Scenario kwargs
...     n_agents=5,
... )
>>> # Collect 10 steps in each of the 32 vectorized environments
>>> print(env.rollout(10))
TensorDict(
    fields={
        agents: TensorDict(
            fields={
                action: Tensor(shape=torch.Size([32, 10, 5, 2]), device=cpu, dtype=torch.float32, is_shared=False),
                info: TensorDict(
                    fields={
                        agent_collision_rew: Tensor(shape=torch.Size([32, 10, 5, 1]), device=cpu, dtype=torch.float32, is_shared=False),
                        agent_distance_rew: Tensor(shape=torch.Size([32, 10, 5, 1]), device=cpu, dtype=torch.float32, is_shared=False)},
                    batch_size=torch.Size([32, 10, 5]),
                    device=cpu,
                    is_shared=False),
                observation: Tensor(shape=torch.Size([32, 10, 5, 18]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([32, 10, 5]),
            device=cpu,
            is_shared=False),
        done: Tensor(shape=torch.Size([32, 10, 1]), device=cpu, dtype=torch.bool, is_shared=False),
        next: TensorDict(
            fields={
                agents: TensorDict(
                    fields={
                        info: TensorDict(
                            fields={
                                agent_collision_rew: Tensor(shape=torch.Size([32, 10, 5, 1]), device=cpu, dtype=torch.float32, is_shared=False),
                                agent_distance_rew: Tensor(shape=torch.Size([32, 10, 5, 1]), device=cpu, dtype=torch.float32, is_shared=False)},
                            batch_size=torch.Size([32, 10, 5]),
                            device=cpu,
                            is_shared=False),
                        observation: Tensor(shape=torch.Size([32, 10, 5, 18]), device=cpu, dtype=torch.float32, is_shared=False),
                        reward: Tensor(shape=torch.Size([32, 10, 5, 1]), device=cpu, dtype=torch.float32, is_shared=False)},
                    batch_size=torch.Size([32, 10, 5]),
                    device=cpu,
                    is_shared=False),
                done: Tensor(shape=torch.Size([32, 10, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                terminated: Tensor(shape=torch.Size([32, 10, 1]), device=cpu, dtype=torch.bool, is_shared=False)},
            batch_size=torch.Size([32, 10]),
            device=cpu,
            is_shared=False),
        terminated: Tensor(shape=torch.Size([32, 10, 1]), device=cpu, dtype=torch.bool, is_shared=False)},
    batch_size=torch.Size([32, 10]),
    device=cpu,
    is_shared=False)
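
The shapes in the printout above follow a regular layout: the root batch size is (num_envs, steps), per-agent entries add an agent dimension, and each leaf adds its feature size. The plain-Python sketch below reproduces that layout for the values in this example. The helper expected_rollout_shapes is hypothetical and is only meant as a reading aid, not as part of TorchRL.

```python
from typing import Dict, Tuple, Union

Key = Union[str, Tuple[str, ...]]


def expected_rollout_shapes(
    num_envs: int, n_steps: int, n_agents: int, obs_dim: int, action_dim: int
) -> Dict[Key, Tuple[int, ...]]:
    """Hypothetical helper: shape layout of the rollout printed above."""
    root_batch = (num_envs, n_steps)  # batch size of the root tensordict
    agent_batch = root_batch + (n_agents,)  # batch size of the "agents" group
    return {
        "batch_size": root_batch,
        ("agents", "observation"): agent_batch + (obs_dim,),
        ("agents", "action"): agent_batch + (action_dim,),
        ("next", "agents", "reward"): agent_batch + (1,),
        "done": root_batch + (1,),
    }


# Values taken from the example: 32 envs, 10 steps, 5 agents,
# 18-dimensional observations and 2-dimensional actions.
shapes = expected_rollout_shapes(32, 10, 5, obs_dim=18, action_dim=2)
```

The observation and action sizes (18 and 2) are read off the printed output for the "flocking" scenario and will differ for other scenarios.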
