PrioritizedSampler
- class torchrl.data.replay_buffers.PrioritizedSampler(max_capacity: int, alpha: float, beta: float, eps: float = 1e-08, dtype: dtype = torch.float32, reduction: str = 'max')
Prioritized sampler for replay buffers.
Presented in Schaul, T.; Quan, J.; Antonoglou, I.; and Silver, D. 2015. “Prioritized experience replay.” (https://arxiv.org/abs/1511.05952)
- Parameters:
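max_capacity (int) – maximum capacity of the buffer.
alpha (float) – exponent α that determines how much prioritization is used, with α = 0 corresponding to uniform sampling.
beta (float) – exponent β of the importance-sampling weights, applied as a negative power: w_i = (N · P(i))^(−β). With β = 1 the weights fully compensate for the non-uniform sampling.
eps (float, optional) – small constant added to the priorities to ensure that the buffer does not contain zero priorities. Defaults to 1e-8.
reduction (str, optional) – the reduction method used to obtain a single priority for multidimensional tensordicts (i.e. stored trajectories). Can be one of “max”, “min”, “median” or “mean”. Defaults to “max”.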
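To make the roles of alpha, beta and eps concrete, here is a minimal pure-Python sketch of the formulas from the paper cited above: P(i) = p_i^α / Σ_k p_k^α and w_i = (N · P(i))^(−β), with the weights rescaled by their maximum as the paper suggests. The helper names `sampling_probabilities` and `importance_weights` are hypothetical and are not part of torchrl; the point at which eps is added is an assumption, and the real class works on tensors rather than Python lists.

```python
import random


def sampling_probabilities(priorities, alpha, eps=1e-8):
    """P(i) = (p_i + eps)^alpha / sum_k (p_k + eps)^alpha (illustrative helper)."""
    scaled = [(p + eps) ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def importance_weights(probs, beta):
    """w_i = (N * P(i))^(-beta), rescaled so the largest weight is 1."""
    n = len(probs)
    raw = [(n * p) ** (-beta) for p in probs]
    max_w = max(raw)
    return [w / max_w for w in raw]


priorities = [1.0, 2.0, 4.0]

# alpha = 1: probabilities are proportional to the priorities.
probs = sampling_probabilities(priorities, alpha=1.0, eps=0.0)

# beta = 1: weights fully undo the sampling bias (up to the rescaling).
weights = importance_weights(probs, beta=1.0)

# alpha = 0: every item is equally likely, whatever its priority.
uniform = sampling_probabilities(priorities, alpha=0.0)

# Draw a batch of indices according to the prioritized distribution.
rng = random.Random(0)
batch = rng.choices(range(len(priorities)), weights=probs, k=4)
```

In this toy setting the highest-priority item is sampled most often and therefore receives the smallest importance weight, which is the trade-off that alpha and beta control.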
- update_priority(index: Union[int, Tensor], priority: Union[float, Tensor]) → None
Updates the priority of the data pointed to by the index.
- Parameters:
index (int or torch.Tensor) – indices of the priorities to be updated.
priority (Number or torch.Tensor) – new priorities of the indexed elements.
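The calling pattern of update_priority can be sketched with a toy stand-in. `ToyPriorityStore` below is hypothetical and is not torchrl's implementation: it keeps priorities in a plain list, accepts either a single index or a sequence of indices, and assumes that a scalar priority is broadcast to every index and that eps and alpha are applied at write time.

```python
class ToyPriorityStore:
    """Hypothetical stand-in for illustration only; not torchrl code."""

    def __init__(self, max_capacity, alpha, eps=1e-8):
        self.alpha = alpha
        self.eps = eps
        # One scaled priority per slot; 0.0 means "never written".
        self.scaled = [0.0] * max_capacity

    def update_priority(self, index, priority):
        # Accept a single index or a sequence of indices.
        indices = [index] if isinstance(index, int) else list(index)
        # Accept a single priority (broadcast) or one priority per index.
        if isinstance(priority, (int, float)):
            values = [priority] * len(indices)
        else:
            values = list(priority)
        if len(values) != len(indices):
            raise ValueError("index and priority must have matching lengths")
        for i, p in zip(indices, values):
            # Assumption: eps is added before raising to the power alpha.
            self.scaled[i] = (p + self.eps) ** self.alpha


store = ToyPriorityStore(max_capacity=4, alpha=0.5, eps=0.0)
store.update_priority(0, 4.0)                # single index, single priority
store.update_priority([1, 2], [9.0, 16.0])   # one priority per index
store.update_priority([3], 1.0)              # scalar priority, broadcast
```

In a typical training loop the new priorities would be derived from the absolute TD errors of the sampled batch, and the indices would be the ones returned when that batch was sampled.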