OnlineDTLoss

class torchrl.objectives.OnlineDTLoss(*args, **kwargs)[source]

TorchRL implementation of the Online Decision Transformer loss.

Presented in “Online Decision Transformer” (https://arxiv.org/abs/2202.05607).

Parameters:

actor_network (ProbabilisticActor) – stochastic actor

Keyword Arguments:
  • alpha_init (float, optional) – initial entropy multiplier. Default is 1.0.

  • min_alpha (float, optional) – min value of alpha. Default is None (no minimum value).

  • max_alpha (float, optional) – max value of alpha. Default is None (no maximum value).

  • fixed_alpha (bool, optional) – if True, alpha will be fixed to its initial value. Otherwise, alpha will be optimized to match the ‘target_entropy’ value. Default is False.

  • target_entropy (float or str, optional) – Target entropy for the stochastic policy. Default is “auto”, where target entropy is computed as -prod(n_actions).

  • samples_mc_entropy (int) – number of Monte Carlo samples used to estimate the entropy of the policy distribution when it cannot be computed analytically.

  • reduction (str, optional) – Specifies the reduction to apply to the output: "none" | "mean" | "sum". "none": no reduction will be applied, "mean": the sum of the output will be divided by the number of elements in the output, "sum": the output will be summed. Default: "mean".
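
Examples:

The following is a minimal construction sketch, not a documented recipe: the actor architecture (a small MLP with a TanhNormal head), the key names ("observation", "loc", "scale", "action"), and the dimensions are illustrative assumptions. target_entropy is passed here as an explicit float equal to -act_dim, mirroring the “auto” formula above, so that no action spec is required.

>>> import torch
>>> from torch import nn
>>> from tensordict.nn import TensorDictModule
>>> from tensordict.nn.distributions import NormalParamExtractor
>>> from torchrl.modules import ProbabilisticActor, TanhNormal
>>> from torchrl.objectives import OnlineDTLoss
>>> obs_dim, act_dim = 8, 4  # illustrative dimensions
>>> # Backbone emitting the distribution parameters (loc, scale) of the policy
>>> backbone = nn.Sequential(
...     nn.Linear(obs_dim, 64),
...     nn.ReLU(),
...     nn.Linear(64, 2 * act_dim),
...     NormalParamExtractor(),
... )
>>> policy_module = TensorDictModule(
...     backbone, in_keys=["observation"], out_keys=["loc", "scale"]
... )
>>> actor = ProbabilisticActor(
...     module=policy_module,
...     in_keys=["loc", "scale"],
...     out_keys=["action"],
...     distribution_class=TanhNormal,
...     return_log_prob=True,
... )
>>> # target_entropy given as a float (-act_dim) instead of "auto"
>>> loss_module = OnlineDTLoss(actor, alpha_init=1.0, target_entropy=-float(act_dim))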

forward(tensordict: TensorDictBase = None) → TensorDictBase[source]

Compute the loss for the Online Decision Transformer.
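
A hedged sketch of a forward call, continuing the construction example above. The batch/sequence shape, the input keys ("observation", "action"), and the names of the returned loss entries follow the usual TorchRL loss-module conventions and are assumptions here; consult the module’s in_keys/out_keys for the authoritative list.

>>> from tensordict import TensorDict
>>> batch, seq_len = 2, 10  # illustrative batch and context length
>>> data = TensorDict(
...     {
...         "observation": torch.randn(batch, seq_len, obs_dim),
...         # target actions whose log-likelihood the policy maximizes
...         "action": torch.randn(batch, seq_len, act_dim).tanh(),
...     },
...     batch_size=[batch, seq_len],
... )
>>> loss_td = loss_module(data)
>>> # sum every "loss_*" entry returned by the module and backpropagate
>>> total_loss = sum(v for k, v in loss_td.items() if k.startswith("loss_"))
>>> total_loss.backward()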
