
RewardSum

class torchrl.envs.transforms.RewardSum(in_keys: Sequence[NestedKey] | None = None, out_keys: Sequence[NestedKey] | None = None, reset_keys: Sequence[NestedKey] | None = None, *, reward_spec: bool = False)[source]

Tracks episode cumulative rewards.

This transform accepts a list of tensordict reward keys (i.e. in_keys) and tracks their cumulative value along the time dimension for each episode.

When called, the transform writes, for each in_key, a new tensordict entry named episode_{in_key} that holds the cumulative value.

Parameters:
  • in_keys (list of NestedKey, optional) – Input reward keys. All in_keys should be part of the environment reward_spec. If no in_keys are specified, this transform assumes "reward" to be the input key. However, multiple rewards (e.g. "reward1" and "reward2") can also be specified.

  • out_keys (list of NestedKey, optional) – The output sum keys; there should be one per input key (see the custom-key sketch after the example below).

  • reset_keys (list of NestedKey, optional) – the list of reset_keys to be used when the parent environment cannot be found. If provided, this value takes precedence over the environment's reset_keys.

Keyword Arguments:

reward_spec (bool, optional) – if True, the new reward entry will be registered in the reward spec. Defaults to False (the entry is registered in the observation spec).

Examples

>>> import torch
>>> from torchrl.envs.transforms import RewardSum, TransformedEnv
>>> from torchrl.envs.libs.gym import GymEnv
>>> env = TransformedEnv(GymEnv("CartPole-v1"), RewardSum())
>>> env.set_seed(0)
>>> torch.manual_seed(0)
>>> td = env.reset()
>>> print(td["episode_reward"])
tensor([0.])
>>> td = env.rollout(3)
>>> print(td["next", "episode_reward"])
tensor([[1.],
        [2.],
        [3.]])
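
The in_keys and out_keys arguments make it possible to rename the cumulative entry. The snippet below is a minimal sketch along the lines of the example above; the "cumulative_reward" out key is an arbitrary name chosen for illustration, not a predefined TorchRL key.

>>> env = TransformedEnv(
...     GymEnv("CartPole-v1"),
...     RewardSum(in_keys=["reward"], out_keys=["cumulative_reward"]),
... )
>>> td = env.rollout(3)
>>> print(td["next", "cumulative_reward"])
tensor([[1.],
        [2.],
        [3.]])
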
forward(tensordict: TensorDictBase) → TensorDictBase[source]

Reads the input tensordict, and for the selected keys, applies the transform.

transform_input_spec(input_spec: TensorSpec) → TensorSpec[source]

Transforms the input spec such that the resulting spec matches the transform mapping.

Parameters:

input_spec (TensorSpec) – spec before the transform

Returns:

expected spec after the transform

transform_observation_spec(observation_spec: TensorSpec) → TensorSpec[source]

Transforms the observation spec, adding the new keys generated by RewardSum.
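
As a hedged illustration of the default behavior (reward_spec=False): the new entry should show up in the transformed observation spec. The check below assumes the default "episode_reward" out key.

>>> env = TransformedEnv(GymEnv("CartPole-v1"), RewardSum())
>>> "episode_reward" in env.observation_spec.keys()
True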

transform_reward_spec(reward_spec: TensorSpec) → TensorSpec[source]

Transforms the reward spec such that the resulting spec matches the transform mapping.

Parameters:

reward_spec (TensorSpec) – spec before the transform

Returns:

expected spec after the transform
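
Conversely, when the transform is built with reward_spec=True, the cumulative entry is registered in the reward spec rather than the observation spec. The sketch below is an assumption-laden illustration; env.full_reward_spec is only available in recent TorchRL versions.

>>> env = TransformedEnv(GymEnv("CartPole-v1"), RewardSum(reward_spec=True))
>>> "episode_reward" in env.full_reward_spec.keys()
True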
