LazyStackedTensorDict

class tensordict.LazyStackedTensorDict(*tensordicts: T, stack_dim: int = 0, hook_out: callable | None = None, hook_in: callable | None = None, batch_size: Sequence[int] | None = None, device: torch.device | None = None, names: Sequence[str] | None = None, stack_dim_name: str | None = None, strict_shape: bool = False)

A Lazy stack of TensorDicts.

When stacking TensorDicts together, the default behaviour is to put them in a stack that is not instantiated. This makes it possible to work seamlessly with stacks of tensordicts, using operations that affect the original tensordicts.

Parameters:

*tensordicts (TensorDict instances) – a list of tensordicts with the same batch size.
stack_dim (int) – a dimension (between -td.ndimension() and td.ndimension()-1) along which the stack should be performed.
hook_out (callable, optional) – a callable to execute after get().
hook_in (callable, optional) – a callable to execute before set().
stack_dim_name (str, optional) – the name of the stack dimension. Defaults to None.

Examples

>>> from tensordict import TensorDict
>>> import torch
>>> tds = [TensorDict({'a': torch.randn(3, 4)}, batch_size=[3])
...     for _ in range(10)]
>>> td_stack = torch.stack(tds, -1)
>>> print(td_stack.shape)
torch.Size([3, 10])
>>> print(td_stack.get("a").shape)
torch.Size([3, 10, 4])
>>> print(td_stack[:, 0] is tds[0])
True

abs() → T: Computes the absolute value of each element of the TensorDict.

abs_() → T: Computes the absolute value of each element of the TensorDict in-place.

acos() → T: Computes the acos() value of each element of the TensorDict.

acos_() → T: Computes the acos() value of each element of the TensorDict in-place.

add(other: tensordict.base.TensorDictBase | torch.Tensor, *, alpha: float | None = None, default: str | torch.Tensor | None = None) → TensorDictBase

Adds other, scaled by alpha, to self.

$\text{out}_i = \text{input}_i + \text{alpha} \times \text{other}_i$

Parameters:

other (TensorDictBase or torch.Tensor) – the tensor or TensorDict to add to self.

Keyword Arguments:

alpha (Number, optional) – the multiplier for other.
default (torch.Tensor or str, optional) – the default value to use for exclusive entries. If none is provided, the two tensordicts key list must match exactly. If default="intersection" is passed, only the intersecting key sets will be considered and other keys will be ignored. In all other cases, default will be used for all missing entries on both sides of the operation.

add_(other: tensordict.base.TensorDictBase | torch.Tensor | float, *, alpha: Optional[float] = None): In-place version of add().

Note

In-place add does not support the default keyword argument.

addcdiv(other1: tensordict.base.TensorDictBase | torch.Tensor, other2: tensordict.base.TensorDictBase | torch.Tensor, value: float | None = 1)

Performs the element-wise division of other1 by other2, multiplies the result by the scalar value and adds it to self.

$\text{out}_i = \text{input}_i + \text{value} \times \frac{\text{other1}_i}{\text{other2}_i}$

The shapes of the elements of self, other1, and other2 must be broadcastable.

For inputs of type FloatTensor or DoubleTensor, value must be a real number, otherwise an integer.

Parameters:

other1 (TensorDict or Tensor) – the numerator tensordict (or tensor)
other2 (TensorDict or Tensor) – the denominator tensordict (or tensor)

Keyword Arguments:

value (Number, optional) – multiplier for $\text{other1} / \text{other2}$

addcdiv_(other1, other2, *, value: float | None = 1): The in-place version of addcdiv().

addcmul(other1: tensordict.base.TensorDictBase | torch.Tensor, other2: tensordict.base.TensorDictBase | torch.Tensor, *, value: float | None = 1)

Performs the element-wise multiplication of other1 by other2, multiplies the result by the scalar value and adds it to self.

$\text{out}_i = \text{input}_i + \text{value} \times \text{other1}_i \times \text{other2}_i$

The shapes of self, other1, and other2 must be broadcastable.

For inputs of type FloatTensor or DoubleTensor, value must be a real number, otherwise an integer.

Parameters:

other1 (TensorDict or Tensor) – the tensordict or tensor to be multiplied
other2 (TensorDict or Tensor) – the tensordict or tensor to be multiplied

Keyword Arguments:

value (Number, optional) – multiplier for $\text{other1} \times \text{other2}$

addcmul_(other1, other2, *, value: float | None = 1): The in-place version of addcmul().

all(dim: Optional[int] = None) → bool | tensordict.base.TensorDictBase

Checks if all values are True/non-null in the tensordict.

Parameters: dim (int, optional) – if None, returns a boolean indicating whether all tensors return tensor.all() == True. If an integer, all is called along the specified dimension if and only if this dimension is compatible with the tensordict shape.

amax(dim: int | NO_DEFAULT = _NoDefault.ZERO, keepdim: bool = False, *, reduce: bool | None = None) → TensorDictBase | torch.Tensor

Returns the maximum values of all elements in the input tensordict.

Same as max() with return_indices=False.

amin(dim: int | NO_DEFAULT = _NoDefault.ZERO, keepdim: bool = False, *, reduce: bool | None = None) → TensorDictBase | torch.Tensor

Returns the minimum values of all elements in the input tensordict.

Same as min() with return_indices=False.

any(dim: Optional[int] = None) → bool | tensordict.base.TensorDictBase

Checks if any value is True/non-null in the tensordict.

Parameters: dim (int, optional) – if None, returns a boolean indicating whether any tensor returns tensor.any() == True. If an integer, any is called along the specified dimension if and only if this dimension is compatible with the tensordict shape.

append(tensordict: T) → None

Append a TensorDict onto the stack.

Analogous to list.append. The appended TensorDict must have compatible batch_size and device. The append operation is in-place, nothing is returned.

Parameters: tensordict (TensorDictBase) – The TensorDict to be appended onto the stack.

apply(fn: Callable, *others: T, batch_size: Optional[Sequence[int]] = None, device: torch.device | None = _NoDefault.ZERO, names: Optional[Sequence[str]] = _NoDefault.ZERO, inplace: bool = False, default: Any = _NoDefault.ZERO, filter_empty: Optional[bool] = None, propagate_lock: bool = False, call_on_nested: bool = False, out: Optional[TensorDictBase] = None, **constructor_kwargs) → Optional[T]

Applies a callable to all values stored in the tensordict and sets them in a new tensordict.

The callable signature must be Callable[Tuple[Tensor, ...], Optional[Union[Tensor, TensorDictBase]]].

Parameters:

fn (Callable) – function to be applied to the tensors in the tensordict.
*others (TensorDictBase instances, optional) – if provided, these tensordict instances should have a structure matching the one of self. The fn argument should receive as many unnamed inputs as the number of tensordicts, including self. If other tensordicts have missing entries, a default value can be passed through the default keyword argument.

Keyword Arguments:

batch_size (sequence of int, optional) – if provided, the resulting TensorDict will have the desired batch_size. The batch_size argument should match the batch_size after the transformation. This is a keyword only argument.
device (torch.device, optional) – the resulting device, if any.
names (list of str, optional) – the new dimension names, in case the batch_size is modified.
inplace (bool, optional) – if True, changes are made in-place. Default is False. This is a keyword only argument.
default (Any, optional) – default value for missing entries in the other tensordicts. If not provided, missing entries will raise a KeyError.
filter_empty (bool, optional) – if True, empty tensordicts will be filtered out. This also comes with a lower computational cost as empty data structures won’t be created and destroyed. Non-tensor data is considered as a leaf and thereby will be kept in the tensordict even if left untouched by the function. Defaults to False for backward compatibility.
propagate_lock (bool, optional) – if True, a locked tensordict will produce another locked tensordict. Defaults to False.

call_on_nested (bool, optional) –

if True, the function will be called on first-level tensors and containers (TensorDict or tensorclass). In this scenario, fn is responsible for propagating its calls to nested levels. This allows fine-grained control over how calls are propagated to nested tensordicts. If False, the function will only be called on leaves, and apply will take care of dispatching the function to all leaves.

>>> from tensordict import TensorDict, is_tensor_collection
>>> td = TensorDict({"a": {"b": [0.0, 1.0]}, "c": [1.0, 2.0]})
>>> def mean_tensor_only(val):
...     if is_tensor_collection(val):
...         raise RuntimeError("Unexpected!")
...     return val.mean()
>>> td_mean = td.apply(mean_tensor_only)
>>> def mean_any(val):
...     if is_tensor_collection(val):
...         # Recurse
...         return val.apply(mean_any, call_on_nested=True)
...     return val.mean()
>>> td_mean = td.apply(mean_any, call_on_nested=True)

out (TensorDictBase, optional) –
a tensordict where to write the results. This can be used to avoid creating a new tensordict:
```
>>> td = TensorDict({"a": 0})
>>> td.apply(lambda x: x+1, out=td)
>>> assert (td==1).all()
```
Warning

If the operation executed on the tensordict requires multiple keys to be accessed for a single computation, providing an out argument equal to self can cause the operation to provide silently wrong results. For instance:
```
>>> td = TensorDict({"a": 1, "b": 1})
>>> td.apply(lambda x: x+td["a"])["b"] # Right!
tensor(2)
>>> td.apply(lambda x: x+td["a"], out=td)["b"] # Wrong!
tensor(3)
```
**constructor_kwargs – additional keyword arguments to be passed to the TensorDict constructor.

Returns:

a new tensordict with transformed tensors.

Example

>>> td = TensorDict({
...     "a": -torch.ones(3),
...     "b": {"c": torch.ones(3)}},
...     batch_size=[3])
>>> td_1 = td.apply(lambda x: x+1)
>>> assert (td_1["a"] == 0).all()
>>> assert (td_1["b", "c"] == 2).all()
>>> td_2 = td.apply(lambda x, y: x+y, td)
>>> assert (td_2["a"] == -2).all()
>>> assert (td_2["b", "c"] == 2).all()

Note

If None is returned by the function, the entry is ignored. This can be used to filter the data in the tensordict:

>>> td = TensorDict({"1": 1, "2": 2, "b": {"2": 2, "1": 1}}, [])
>>> def filter(tensor):
...     if tensor == 1:
...         return tensor
>>> td.apply(filter)
TensorDict(
    fields={
        1: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False),
        b: TensorDict(
            fields={
                1: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)

Note

The apply method will return a TensorDict instance, regardless of the input type. To keep the same type, one can execute

>>> out = td.clone(False).update(td.apply(...))

apply_(fn: Callable, *others, **kwargs)

Applies a callable to all values stored in the tensordict and re-writes them in-place.

Parameters:

fn (Callable) – function to be applied to the tensors in the tensordict.
*others (sequence of TensorDictBase, optional) – the other tensordicts to be used.

Keyword Args: See apply().

Returns: self or a copy of self with the function applied.

asin() → T: Computes the asin() value of each element of the TensorDict.

asin_() → T: Computes the asin() value of each element of the TensorDict in-place.

atan() → T: Computes the atan() value of each element of the TensorDict.

atan_() → T: Computes the atan() value of each element of the TensorDict in-place.

auto_batch_size_(batch_dims: Optional[int] = None) → T

Sets the maximum batch-size for the tensordict, up to an optional batch_dims.

Parameters: batch_dims (int, optional) – if provided, the batch-size will be at most batch_dims long.
Returns: self

Examples

>>> from tensordict import TensorDict
>>> import torch
>>> td = TensorDict({"a": torch.randn(3, 4, 5), "b": {"c": torch.randn(3, 4, 6)}}, batch_size=[])
>>> td.auto_batch_size_()
>>> print(td.batch_size)
torch.Size([3, 4])
>>> td.auto_batch_size_(batch_dims=1)
>>> print(td.batch_size)
torch.Size([3])

auto_device_() → T

Automatically sets the device, if it is unique.

Returns: self with the edited device attribute.

property batch_dims: int

Length of the tensordict batch size.

Returns: int describing the number of dimensions of the tensordict.

property batch_size: Size

Shape (or batch_size) of a TensorDict.

The shape of a tensordict corresponds to the common first N dimensions of the tensors it contains, where N is an arbitrary number. The batch-size contrasts with the “feature size”, which represents the semantically relevant shapes of a tensor. For instance, a batch of videos may have shape [B, T, C, W, H], where [B, T] is the batch-size (batch and time dimensions) and [C, W, H] are the feature dimensions (channels and spatial dimensions).

The TensorDict shape is controlled by the user upon initialization (i.e., it is not inferred from the tensor shapes).

The batch_size can be edited dynamically if the new size is compatible with the TensorDict content. For instance, setting the batch size to an empty value is always allowed.

Returns: a Size object describing the TensorDict batch size.

Examples

>>> data = TensorDict({
...     "key 0": torch.randn(3, 4),
...     "key 1": torch.randn(3, 5),
...     "nested": TensorDict({"key 0": torch.randn(3, 4)}, batch_size=[3, 4])},
...     batch_size=[3])
>>> data.batch_size = () # resets the batch-size to an empty value

bfloat16(): Casts all tensors to torch.bfloat16.

bitwise_and(other: tensordict.base.TensorDictBase | torch.Tensor, *, default: str | torch.Tensor | None = None) → TensorDictBase

Performs a bitwise AND operation between self and other.

{out}_{i} = {input}_{i} \land {other}_{i}

Parameters: other (TensorDictBase or torch.Tensor) – the tensor or TensorDict to perform the bitwise AND with.
Keyword Arguments: default (torch.Tensor or str, optional) – the default value to use for exclusive entries. If none is provided, the two tensordicts key list must match exactly. If default="intersection" is passed, only the intersecting key sets will be considered and other keys will be ignored. In all other cases, default will be used for all missing entries on both sides of the operation.

bool(): Casts all tensors to torch.bool.

bytes(*, count_duplicates: bool = True) → int

Counts the number of bytes of the contained tensors.

Keyword Arguments: count_duplicates (bool) – whether to count duplicated tensors as independent. If False, only strictly identical tensors will be discarded (same views but different ids from a common base tensor will be counted twice). Defaults to True (each tensor is assumed to be a single copy).

classmethod cat(input, dim=0, *, out=None)

Concatenates tensordicts into a single tensordict along the given dimension.

This call is equivalent to calling torch.cat() but is compatible with torch.compile.

cat_from_tensordict(dim: int = 0, *, sorted: Optional[Union[bool, List[NestedKey]]] = None, out: Optional[Tensor] = None) → Tensor

Concatenates all entries of a tensordict in a single tensor.

Parameters:

dim (int, optional) – the dimension along which the entries should be concatenated.

Keyword Arguments:

sorted (bool or list of NestedKeys) – if True, the entries will be concatenated in alphabetical order. If False (default), the dict order will be used. Alternatively, a list of key names can be provided and the tensors will be concatenated accordingly. This incurs some overhead as the list of keys will be checked against the list of leaf names in the tensordict.
out (torch.Tensor, optional) – an optional destination tensor for the cat operation.

cat_tensors(*keys: NestedKey, out_key: NestedKey, dim: int = 0, keep_entries: bool = False) → T

Concatenates entries into a new entry and possibly removes the original values.

Parameters: keys (sequence of NestedKey) – entries to concatenate.

Keyword Arguments:

out_key (NestedKey) – new key name for the concatenated inputs.
keep_entries (bool, optional) – if False, entries in keys will be deleted. Defaults to False.
dim (int, optional) – the dimension along which the concatenation must occur. Defaults to 0.

Returns: self

Examples

>>> td = TensorDict(a=torch.zeros(1), b=torch.ones(1))
>>> td.cat_tensors("a", "b", out_key="c")
>>> assert "a" not in td.keys()
>>> assert (td["c"] == torch.tensor([0, 1])).all()

ceil() → T: Computes the ceil() value of each element of the TensorDict.

ceil_() → T: Computes the ceil() value of each element of the TensorDict in-place.

chunk(chunks: int, dim: int = 0) → tuple[tensordict.base.TensorDictBase, ...]

Splits a tensordict into the specified number of chunks, if possible.

Each chunk is a view of the input tensordict.

Parameters:

chunks (int) – number of chunks to return
dim (int, optional) – dimension along which to split the tensordict. Default is 0.

Examples

>>> td = TensorDict({
...     'x': torch.arange(24).reshape(3, 4, 2),
... }, batch_size=[3, 4])
>>> td0, td1 = td.chunk(dim=-1, chunks=2)
>>> td0['x']
tensor([[[ 0,  1],
         [ 2,  3]],
        [[ 8,  9],
         [10, 11]],
        [[16, 17],
         [18, 19]]])

clamp(min: tensordict.base.TensorDictBase | torch.Tensor = None, max: tensordict.base.TensorDictBase | torch.Tensor = None, *, out=None)

Clamps all elements in self into the range [ min, max ].

Letting min_value and max_value be min and max, respectively, this returns:

$y_i = \min(\max(x_i, \text{min\_value}_i), \text{max\_value}_i)$

If min is None, there is no lower bound. Likewise, if max is None, there is no upper bound.

Note

If min is greater than max, torch.clamp(..., min, max) sets all elements in input to the value of max.
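
The formula and the note can be verified on a plain tensor with torch.clamp, which is the per-entry operation the note refers to. This sketch deliberately uses a bare tensor rather than a tensordict; the bounds are made up.

```python
import torch

x = torch.tensor([-1.0, 0.5, 3.0])
lo, hi = 0.0, 1.0

# Built-in clamp versus the explicit min(max(x, lo), hi) formula.
y = torch.clamp(x, min=lo, max=hi)
manual = torch.minimum(torch.maximum(x, torch.tensor(lo)), torch.tensor(hi))

# When min > max, every element ends up equal to max.
inverted = torch.clamp(x, min=2.0, max=1.0)
print(y, manual, inverted)
```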

clamp_max(other: tensordict.base.TensorDictBase | torch.Tensor, *, default: str | torch.Tensor | None = None) → T

Clamps the elements of self to other if they are greater than that value.

Parameters: other (TensorDict or Tensor) – the other input tensordict or tensor.
Keyword Arguments: default (torch.Tensor or str, optional) – the default value to use for exclusive entries. If none is provided, the two tensordicts key list must match exactly. If default="intersection" is passed, only the intersecting key sets will be considered and other keys will be ignored. In all other cases, default will be used for all missing entries on both sides of the operation.

clamp_max_(other: tensordict.base.TensorDictBase | torch.Tensor) → T: In-place version of clamp_max().

Note

In-place clamp_max does not support the default keyword argument.

clamp_min(other: tensordict.base.TensorDictBase | torch.Tensor, default: str | torch.Tensor | None = None) → T

Clamps the elements of self to other if they are smaller than that value.

Parameters: other (TensorDict or Tensor) – the other input tensordict or tensor.
Keyword Arguments: default (torch.Tensor or str, optional) – the default value to use for exclusive entries. If none is provided, the two tensordicts key list must match exactly. If default="intersection" is passed, only the intersecting key sets will be considered and other keys will be ignored. In all other cases, default will be used for all missing entries on both sides of the operation.

clamp_min_(other: tensordict.base.TensorDictBase | torch.Tensor) → T: In-place version of clamp_min().

Note

In-place clamp_min does not support the default keyword argument.

clear() → T: Erases the content of the tensordict.

clear_device_() → T

Clears the device of the tensordict.

Returns: self

clear_refs_for_compile_() → T

Clears the weakrefs in order for the tensordict to get out of the compile region safely.

Use this whenever you hit torch._dynamo.exc.Unsupported: reconstruct: WeakRefVariable() before returning a TensorDict.

Returns: self

clone(recurse: bool = True, **kwargs) → T

Clones a TensorDictBase subclass instance onto a new TensorDictBase subclass of the same type.

To create a TensorDict instance from any other TensorDictBase subtype, call the to_tensordict() method instead.

Parameters: recurse (bool, optional) – if True, each tensor contained in the TensorDict will be copied too. Otherwise only the TensorDict tree structure will be copied. Defaults to True.

Note

Unlike many other ops (pointwise arithmetic, shape operations, …) clone does not inherit the original lock attribute. This design choice is made such that a clone can be created to be modified, which is the most frequent usage.

complex128(): Casts all tensors to torch.complex128.

complex32(): Casts all tensors to torch.complex32.

complex64(): Casts all tensors to torch.complex64.

consolidate(filename: Optional[Union[Path, str]] = None, *, num_threads=0, device: Optional[device] = None, non_blocking: bool = False, inplace: bool = False, return_early: bool = False, use_buffer: bool = False, share_memory: bool = False, pin_memory: bool = False, metadata: bool = False) → None

Consolidates the tensordict content in a single storage for fast serialization.

Parameters:

filename (Path, optional) – an optional file path for a memory-mapped tensor to use as a storage for the tensordict.

Keyword Arguments:

num_threads (integer, optional) – the number of threads to use for populating the storage.
device (torch.device, optional) – an optional device where the storage must be instantiated.
non_blocking (bool, optional) – non_blocking argument passed to copy_().
inplace (bool, optional) – if True, the resulting tensordict is the same as self with updated values. Defaults to False.
return_early (bool, optional) – if True and num_threads>0, the method will return a future of the tensordict. The resulting tensordict can be queried using future.result().
use_buffer (bool, optional) – if True and a filename is passed, an intermediate local buffer will be created in shared memory, and the data will be copied at the storage location as a last step. This may be faster than writing directly to a distant physical memory (e.g., NFS). Defaults to False.
share_memory (bool, optional) – if True, the storage will be placed in shared memory. Defaults to False.
pin_memory (bool, optional) – whether the consolidated data should be placed in pinned memory. Defaults to False.
metadata (bool, optional) – if True, the metadata will be stored alongside the common storage. If a filename is provided, this has no effect. Storing the metadata can be useful when one wants to control how serialization is achieved, as TensorDict handles the pickling/unpickling of consolidated TDs differently depending on whether the metadata is available.

Note

If the tensordict is already consolidated, all arguments are ignored and self is returned. Call contiguous() to re-consolidate.

Examples

>>> import pickle
>>> import torch
>>> from torch.utils.benchmark import Timer
>>> from tensordict import TensorDict
>>> data = TensorDict({"a": torch.zeros(()), "b": {"c": torch.zeros(())}})
>>> data_consolidated = data.consolidate()
>>> # check that the data has a single data_ptr()
>>> assert torch.tensor([
...     v.untyped_storage().data_ptr() for v in data_consolidated.values(True, True)
... ]).unique().numel() == 1
>>> # Serializing the tensordict will be faster with data_consolidated
>>> with open("data.pickle", "wb") as f:
...     print("regular", Timer("pickle.dump(data, f)", globals=globals()).adaptive_autorange())
>>> with open("data_c.pickle", "wb") as f:
...     print("consolidated", Timer("pickle.dump(data_consolidated, f)", globals=globals()).adaptive_autorange())

contiguous() → T: Returns a new tensordict of the same type with contiguous values (or self if values are already contiguous).

copy()

Returns a shallow copy of the tensordict (i.e., copies the structure but not the data).

Equivalent to TensorDictBase.clone(recurse=False)

copy_(tensordict: T, non_blocking: bool = False) → T

See TensorDictBase.update_.

The non-blocking argument will be ignored and is just present for compatibility with torch.Tensor.copy_().

copy_at_(tensordict: T, idx: Union[None, int, slice, str, Tensor, List[Any], Tuple[Any, ...]], non_blocking: bool = False) → T: See TensorDictBase.update_at_.

cos() → T: Computes the cos() value of each element of the TensorDict.

cos_() → T: Computes the cos() value of each element of the TensorDict in-place.

cosh() → T: Computes the cosh() value of each element of the TensorDict.

cosh_() → T: Computes the cosh() value of each element of the TensorDict in-place.

cpu(**kwargs) → T

Casts a tensordict to CPU.

This function also supports all the keyword arguments of to().

create_nested(key)

Creates a nested tensordict of the same shape, device and dim names as the current tensordict.

If the value already exists, it will be overwritten by this operation. This operation is blocked in locked tensordicts.

Examples

>>> data = TensorDict({}, [3, 4, 5])
>>> data.create_nested("root")
>>> data.create_nested(("some", "nested", "value"))
>>> print(data)
TensorDict(
    fields={
        root: TensorDict(
            fields={
            },
            batch_size=torch.Size([3, 4, 5]),
            device=None,
            is_shared=False),
        some: TensorDict(
            fields={
                nested: TensorDict(
                    fields={
                        value: TensorDict(
                            fields={
                            },
                            batch_size=torch.Size([3, 4, 5]),
                            device=None,
                            is_shared=False)},
                    batch_size=torch.Size([3, 4, 5]),
                    device=None,
                    is_shared=False)},
            batch_size=torch.Size([3, 4, 5]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([3, 4, 5]),
    device=None,
    is_shared=False)

cuda(device: Optional[int] = None, **kwargs) → T

Casts a tensordict to a cuda device (if not already on it).

Parameters: device (int, optional) – if provided, the cuda device on which the tensor should be cast.

This function also supports all the keyword arguments of to().

cummax(dim: int, *, reduce: Optional[bool] = None, return_indices: bool = True) → tensordict.base.TensorDictBase | torch.Tensor

Returns the cumulative maximum values of all elements in the input tensordict.

Parameters:

dim (int) – integer representing the dimension along which to perform the cummax operation.

Keyword Arguments:

reduce (bool, optional) – if True, the reduction will occur across all TensorDict values and a single reduced tensor will be returned. Defaults to False.
return_indices (bool, optional) – cummax() returns a named tuple with values and indices when the dim argument is passed. The TensorDict equivalent of this is to return a tensorclass with entries "vals" and "indices" with identical structure within. Defaults to True.

Examples

>>> from tensordict import TensorDict
>>> import torch
>>> td = TensorDict(
...     a=torch.randn(3, 4, 5),
...     b=TensorDict(
...         c=torch.randn(3, 4, 5, 6),
...         d=torch.randn(3, 4, 5),
...         batch_size=(3, 4, 5),
...     ),
...     batch_size=(3, 4)
... )
>>> td.cummax(dim=0)
cummax(
    indices=TensorDict(
        fields={
            a: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.int64, is_shared=False),
            b: TensorDict(
                fields={
                    c: Tensor(shape=torch.Size([4, 5, 6]), device=cpu, dtype=torch.int64, is_shared=False),
                    d: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.int64, is_shared=False)},
                batch_size=torch.Size([4]),
                device=None,
                is_shared=False)},
        batch_size=torch.Size([4]),
        device=None,
        is_shared=False),
    vals=TensorDict(
        fields={
            a: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False),
            b: TensorDict(
                fields={
                    c: Tensor(shape=torch.Size([4, 5, 6]), device=cpu, dtype=torch.float32, is_shared=False),
                    d: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False)},
                batch_size=torch.Size([4]),
                device=None,
                is_shared=False)},
        batch_size=torch.Size([4]),
        device=None,
        is_shared=False),
    batch_size=torch.Size([4]),
    device=None,
    is_shared=False)
>>> td = TensorDict(
...     a=torch.randn(3, 4, 5),
...     b=TensorDict(
...         c=torch.randn(3, 4, 5),
...         d=torch.randn(3, 4, 5),
...         batch_size=(3, 4, 5),
...     ),
...     batch_size=(3, 4)
... )
>>> td.cummax(reduce=True, dim=0)
torch.return_types.cummax(...)

cummin(dim: int, *, reduce: Optional[bool] = None, return_indices: bool = True) → tensordict.base.TensorDictBase | torch.Tensor

Returns the cumulative minimum values of all elements in the input tensordict.

Parameters:

dim (int) – integer representing the dimension along which to perform the cummin operation.

Keyword Arguments:

reduce (bool, optional) – if True, the reduction will occur across all TensorDict values and a single reduced tensor will be returned. Defaults to False.
return_indices (bool, optional) – cummin() returns a named tuple with values and indices when the dim argument is passed. The TensorDict equivalent of this is to return a tensorclass with entries "vals" and "indices" with identical structure within. Defaults to True.

Examples

>>> from tensordict import TensorDict
>>> import torch
>>> td = TensorDict(
...     a=torch.randn(3, 4, 5),
...     b=TensorDict(
...         c=torch.randn(3, 4, 5, 6),
...         d=torch.randn(3, 4, 5),
...         batch_size=(3, 4, 5),
...     ),
...     batch_size=(3, 4)
... )
>>> td.cummin(dim=0)
cummin(
    indices=TensorDict(
        fields={
            a: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.int64, is_shared=False),
            b: TensorDict(
                fields={
                    c: Tensor(shape=torch.Size([4, 5, 6]), device=cpu, dtype=torch.int64, is_shared=False),
                    d: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.int64, is_shared=False)},
                batch_size=torch.Size([4]),
                device=None,
                is_shared=False)},
        batch_size=torch.Size([4]),
        device=None,
        is_shared=False),
    vals=TensorDict(
        fields={
            a: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False),
            b: TensorDict(
                fields={
                    c: Tensor(shape=torch.Size([4, 5, 6]), device=cpu, dtype=torch.float32, is_shared=False),
                    d: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False)},
                batch_size=torch.Size([4]),
                device=None,
                is_shared=False)},
        batch_size=torch.Size([4]),
        device=None,
        is_shared=False),
    batch_size=torch.Size([4]),
    device=None,
    is_shared=False)
>>> td = TensorDict(
...     a=torch.randn(3, 4, 5),
...     b=TensorDict(
...         c=torch.randn(3, 4, 5),
...         d=torch.randn(3, 4, 5),
...         batch_size=(3, 4, 5),
...     ),
...     batch_size=(3, 4)
... )
>>> td.cummin(reduce=True, dim=0)
torch.return_types.cummin(...)

property data: Returns a tensordict containing the .data attributes of the leaf tensors.

data_ptr(*, storage: bool = False)

Returns the data_ptr of the tensordict leaves.

This can be useful to check if two tensordicts share the same data_ptr().

Keyword Arguments: storage (bool, optional) – if True, tensor.untyped_storage().data_ptr() will be called instead. Defaults to False.

Examples

>>> import torch
>>> from tensordict import TensorDict
>>> td = TensorDict(a=torch.randn(2), b=torch.randn(2), batch_size=[2])
>>> td0 = td.clone(recurse=False)  # shallow copy: shares the same tensors as td
>>> assert (td0.data_ptr() == td.data_ptr()).all()

Note

LazyStackedTensorDict instances will be displayed as nested tensordicts to reflect the true data_ptr() of their leaves:

>>> td0 = TensorDict(a=torch.randn(2), b=torch.randn(2), batch_size=[2])
>>> td1 = TensorDict(a=torch.randn(2), b=torch.randn(2), batch_size=[2])
>>> td = TensorDict.lazy_stack([td0, td1])
>>> td.data_ptr()
TensorDict(
    fields={
        0: TensorDict(
            fields={
                a: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False),
                b: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False)},
            batch_size=torch.Size([]),
            device=cpu,
            is_shared=False),
        1: TensorDict(
            fields={
                a: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False),
                b: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False)},
            batch_size=torch.Size([]),
            device=cpu,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=cpu,
    is_shared=False)
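To illustrate what the storage keyword argument changes, here is a minimal sketch that uses plain torch tensors rather than a tensordict (a simplification made for this example only). A view that starts at an offset reports a different data_ptr() from its base tensor, while both report the same untyped_storage().data_ptr().

```python
import torch

base = torch.arange(6, dtype=torch.float32)
view = base[2:]  # shares memory with `base`, but starts two elements later

# Address of the first element: differs because of the offset.
same_first_element = base.data_ptr() == view.data_ptr()

# Address of the underlying storage: identical for a tensor and its views.
same_storage = (
    base.untyped_storage().data_ptr() == view.untyped_storage().data_ptr()
)

print(same_first_element, same_storage)
```

Comparing storage pointers is therefore the more permissive check when the question is whether two leaves are backed by the same memory.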

del_(key: NestedKey, **kwargs: Any) → T

Deletes a key of the tensordict.

Parameters:: key (NestedKey) – key to be deleted
Returns:: self

densify(*, layout: layout = torch.strided)

Attempts to represent the lazy stack with contiguous tensors (plain tensors or nested).

Keyword Arguments:: layout (torch.layout) – the layout of the nested tensors, if any. Defaults to strided.

property depth: int

Returns the depth - maximum number of levels - of a tensordict.

The minimum depth is 0 (no nested tensordict).

detach() → T

Detach the tensors in the tensordict.

Returns:: a new tensordict with no tensor requiring gradient.

detach_() → T

Detach the tensors in the tensordict in-place.

Returns:: self.

property device: torch.device | None

Device of a TensorDict.

If the TensorDict has a specified device, all its tensors (incl. nested ones) must live on the same device. If the TensorDict device is None, different values can be located on different devices.

Returns:: torch.device object indicating the device where the tensors are placed, or None if TensorDict does not have a device.

Examples

>>> td = TensorDict({
...     "cpu": torch.randn(3, device='cpu'),
...     "cuda": torch.randn(3, device='cuda'),
... }, batch_size=[], device=None)
>>> td['cpu'].device
device(type='cpu')
>>> td['cuda'].device
device(type='cuda')
>>> td = TensorDict({
...     "x": torch.randn(3, device='cpu'),
...     "y": torch.randn(3, device='cuda'),
... }, batch_size=[], device='cuda')
>>> td['x'].device
device(type='cuda')
>>> td['y'].device
device(type='cuda')
>>> td = TensorDict({
...     "x": torch.randn(3, device='cpu'),
...     "y": TensorDict({'z': torch.randn(3, device='cpu')}, batch_size=[], device=None),
... }, batch_size=[], device='cuda')
>>> td['x'].device
device(type='cuda')
>>> td['y'].device # nested tensordicts are also mapped onto the appropriate device.
device(type='cuda')
>>> td['y', 'z'].device
device(type='cuda')

dim() → int: See batch_dims().

div(other: tensordict.base.TensorDictBase | torch.Tensor, *, default: str | torch.Tensor | None = None) → T

Divides each element of the input self by the corresponding element of other.

{out}_{i} = \frac{{input}_{i}}{{other}_{i}}

Supports broadcasting, type promotion and integer, float, tensordict or tensor inputs. Always promotes integer types to the default scalar type.

Parameters:: other (TensorDict, Tensor or Number) – the divisor.
Keyword Arguments:: default (torch.Tensor or str, optional) – the default value to use for exclusive entries. If none is provided, the key lists of the two tensordicts must match exactly. If default="intersection" is passed, only the intersecting key sets will be considered and other keys will be ignored. In all other cases, default will be used for all missing entries on both sides of the operation.

div_(other: tensordict.base.TensorDictBase | torch.Tensor) → T: In-place version of div().

Note

In-place div does not support the default keyword argument.
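The key-matching rules described for the default keyword argument of div() can be sketched on plain, flat dictionaries of tensors. This is only an illustration of the documented behaviour, not the library's implementation, and the helper name div_with_default is hypothetical.

```python
import torch


def div_with_default(a: dict, b: dict, default=None) -> dict:
    """Hypothetical helper mimicking the documented `default` rules on flat dicts."""
    if default is None:
        # No default: both sides must expose exactly the same keys.
        if a.keys() != b.keys():
            raise KeyError("key sets must match when no default is given")
        keys = a.keys()
    elif isinstance(default, str) and default == "intersection":
        # Only keys present on both sides are kept.
        keys = a.keys() & b.keys()
    else:
        # Any other default fills in the missing entries on either side.
        keys = a.keys() | b.keys()
    # True division: integer inputs are promoted to the default float dtype.
    return {k: a.get(k, default) / b.get(k, default) for k in keys}


a = {"x": torch.tensor([2, 4]), "y": torch.tensor([1.0])}
b = {"x": torch.tensor([2, 2]), "z": torch.tensor([4.0])}

shared = div_with_default(a, b, default="intersection")
filled = div_with_default(a, b, default=torch.tensor(1.0))
```

Here shared only contains "x" (as a floating-point tensor, even though both inputs are integers), while filled contains "x", "y" and "z", with the default standing in for the missing numerator or denominator.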

double(): Casts all tensors to torch.double.

property dtype: Returns the dtype of the values in the tensordict, if it is unique.

dumps(prefix: Optional[str] = None, copy_existing: bool = False, *, num_threads: int = 0, return_early: bool = False, share_non_tensor: bool = False) → T

Saves the tensordict to disk.

This function is a proxy to memmap().

empty(recurse=False, *, batch_size=None, device=_NoDefault.ZERO, names=None) → T

Returns a new, empty tensordict with the same device and batch size.

Parameters:

recurse (bool, optional) – if True, the entire structure of the TensorDict will be reproduced without content. Otherwise, only the root will be duplicated. Defaults to False.

Keyword Arguments:

batch_size (torch.Size, optional) – a new batch-size for the tensordict.
device (torch.device, optional) – a new device.
names (list of str, optional) – dimension names.

entry_class(key: NestedKey) → type

Returns the class of an entry, possibly avoiding a call to isinstance(td.get(key), type).

This method should be preferred to type(tensordict.get(key)) whenever get() can be expensive to execute.

erf() → T: Computes the erf() value of each element of the TensorDict.

erf_() → T: Computes the erf() value of each element of the TensorDict in-place.

erfc() → T: Computes the erfc() value of each element of the TensorDict.

erfc_() → T: Computes the erfc() value of each element of the TensorDict in-place.

exclude(*keys: NestedKey, inplace: bool = False) → T

Excludes the keys of the tensordict and returns a new tensordict without these entries.

The values are not copied: in-place modifications to a tensor of either the original or the new tensordict will be reflected in both tensordicts.

Parameters:

*keys (str) – keys to exclude.
inplace (bool) – if True, the tensordict is pruned in place. Default is False.

Returns:

A new tensordict (or the same if inplace=True) without the excluded entries.

Examples

>>> from tensordict import TensorDict
>>> td = TensorDict({"a": 0, "b": {"c": 1, "d": 2}}, [])
>>> td.exclude("a", ("b", "c"))
TensorDict(
    fields={
        b: TensorDict(
            fields={
                d: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
>>> td.exclude("a", "b")
TensorDict(
    fields={
    },
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)

exp() → T: Computes the exp() value of each element of the TensorDict.

exp_() → T: Computes the exp() value of each element of the TensorDict in-place.

expand(*args: int, inplace: bool = False) → T

Expands each tensor of the tensordict according to the expand() function, ignoring the feature dimensions.

Supports iterables to specify the shape.

Examples

>>> td = TensorDict({
...     'a': torch.zeros(3, 4, 5),
...     'b': torch.zeros(3, 4, 10)}, batch_size=[3, 4])
>>> td_expand = td.expand(10, 3, 4)
>>> assert td_expand.shape == torch.Size([10, 3, 4])
>>> assert td_expand.get("a").shape == torch.Size([10, 3, 4, 5])

expand_as(other: tensordict.base.TensorDictBase | torch.Tensor) → TensorDictBase

Broadcasts the shape of the tensordict to the shape of other and expands it accordingly.

If the input is a tensor collection (tensordict or tensorclass), the leaves will be expanded on a one-to-one basis.

Examples

>>> from tensordict import TensorDict
>>> import torch
>>> td0 = TensorDict({
...     "a": torch.ones(3, 1, 4),
...     "b": {"c": torch.ones(3, 2, 1, 4)}},
...     batch_size=[3],
... )
>>> td1 = TensorDict({
...     "a": torch.zeros(2, 3, 5, 4),
...     "b": {"c": torch.zeros(2, 3, 2, 6, 4)}},
...     batch_size=[2, 3],
... )
>>> expanded = td0.expand_as(td1)
>>> assert (expanded==1).all()
>>> print(expanded)
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([2, 3, 5, 4]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([2, 3, 2, 6, 4]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([2, 3]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([2, 3]),
    device=None,
    is_shared=False)

expm1() → T: Computes the expm1() value of each element of the TensorDict.

expm1_() → T: Computes the expm1() value of each element of the TensorDict in-place.

fill_(key: NestedKey, value: float | bool) → T

Fills a tensor pointed to by the key with a given scalar value.

Parameters:

key (str or nested key) – entry to be filled.
value (Number or bool) – value to use for the filling.

Returns:

self

filter_empty_(): Filters out all empty tensordicts in-place.

filter_non_tensor_data() → T: Filters out all non-tensor-data.

flatten(start_dim=0, end_dim=-1)

Flattens all the tensors of a tensordict.

Parameters:

start_dim (int) – the first dim to flatten
end_dim (int) – the last dim to flatten

Examples

>>> td = TensorDict({
...     "a": torch.arange(60).view(3, 4, 5),
...     "b": torch.arange(12).view(3, 4)}, batch_size=[3, 4])
>>> td_flat = td.flatten(0, 1)
>>> td_flat.batch_size
torch.Size([12])
>>> td_flat["a"]
tensor([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34],
        [35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44],
        [45, 46, 47, 48, 49],
        [50, 51, 52, 53, 54],
        [55, 56, 57, 58, 59]])
>>> td_flat["b"]
tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

flatten_keys(separator: str = '.', inplace: bool = False, is_leaf: Optional[Callable[[Type], bool]] = None) → T

Converts a nested tensordict into a flat one, recursively.

The TensorDict type will be lost and the result will be a simple TensorDict instance.

Parameters:

separator (str, optional) – the separator between the nested items.
inplace (bool, optional) – if True, the resulting tensordict will have the same identity as the one where the call has been made. Defaults to False.
is_leaf (callable, optional) –
a callable over a class type returning a bool indicating if this class has to be considered as a leaf.

Note

The purpose of is_leaf is not to prevent recursive calls into nested tensordicts, but rather to mark certain types as “leaves” for the purpose of filtering when leaves_only=True. Even if is_leaf(cls) returns True, the nested structure of the tensordict will still be traversed if include_nested=True. In other words, is_leaf does not control the recursion depth, but rather provides a way to filter out certain types from the result when leaves_only=True. This means that a node in the tree can be both a leaf and a node with children. In practice, the default value of is_leaf does exclude tensordict and tensorclass instances from the leaf set.

See also

is_leaf_nontensor() and default_is_leaf().

Examples

>>> data = TensorDict({"a": 1, ("b", "c"): 2, ("e", "f", "g"): 3}, batch_size=[])
>>> data.flatten_keys(separator=" - ")
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False),
        b - c: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False),
        e - f - g: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)

This method and unflatten_keys() are particularly useful when handling state-dicts, as they make it possible to seamlessly convert flat dictionaries into data structures that mimic the structure of the model.

Examples

>>> model = torch.nn.Sequential(torch.nn.Linear(3, 4))
>>> ddp_model = torch.ao.quantization.QuantWrapper(model)
>>> state_dict = TensorDict(ddp_model.state_dict(), batch_size=[]).unflatten_keys(".")
>>> print(state_dict)
TensorDict(
    fields={
        module: TensorDict(
            fields={
                0: TensorDict(
                    fields={
                        bias: Tensor(shape=torch.Size([4]), device=cpu, dtype=torch.float32, is_shared=False),
                        weight: Tensor(shape=torch.Size([4, 3]), device=cpu, dtype=torch.float32, is_shared=False)},
                    batch_size=torch.Size([]),
                    device=None,
                    is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
>>> model_state_dict = state_dict.get("module")
>>> print(model_state_dict)
TensorDict(
    fields={
        0: TensorDict(
            fields={
                bias: Tensor(shape=torch.Size([4]), device=cpu, dtype=torch.float32, is_shared=False),
                weight: Tensor(shape=torch.Size([4, 3]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
>>> model.load_state_dict(dict(model_state_dict.flatten_keys(".")))

float(): Casts all tensors to torch.float.

float16(): Casts all tensors to torch.float16.

float32(): Casts all tensors to torch.float32.

float64(): Casts all tensors to torch.float64.

floor() → T: Computes the floor() value of each element of the TensorDict.

floor_() → T: Computes the floor() value of each element of the TensorDict in-place.

frac() → T: Computes the frac() value of each element of the TensorDict.

frac_() → T: Computes the frac() value of each element of the TensorDict in-place.

classmethod from_any(obj, *, auto_batch_size: bool = False, batch_dims: Optional[int] = None, device: Optional[device] = None, batch_size: Optional[Size] = None)

Recursively converts any object to a TensorDict.

Note

from_any is less restrictive than the regular TensorDict constructor. It can cast data structures like dataclasses or tuples to a tensordict using custom heuristics. This approach may incur some extra overhead and involves more opinionated choices in terms of mapping strategies.

Note

This method recursively converts the input object to a TensorDict. If the object is already a TensorDict (or any similar tensor collection object), it will be returned as is.

Parameters:

obj – The object to be converted.

Keyword Arguments:

auto_batch_size (bool, optional) – if True, the batch size will be computed automatically. Defaults to False.
batch_dims (int, optional) – If auto_batch_size is True, defines how many dimensions the output tensordict should have. Defaults to None (full batch-size at each level).
device (torch.device, optional) – The device on which the TensorDict will be created.
batch_size (torch.Size, optional) – The batch size of the TensorDict. Exclusive with auto_batch_size.

Returns:

A TensorDict representation of the input object.

Supported objects:

Dataclasses through from_dataclass() (dataclasses will be converted to TensorDict instances, not tensorclasses).
Namedtuples through from_namedtuple().
Dictionaries through from_dict().
Tuples through from_tuple().
NumPy’s structured arrays through from_struct_array().
HDF5 objects through from_h5().

classmethod from_dataclass(dataclass, *, auto_batch_size: bool = False, batch_dims: Optional[int] = None, as_tensorclass: bool = False, device: Optional[device] = None, batch_size: Optional[Size] = None)

Converts a dataclass into a TensorDict instance.

Parameters:

dataclass – The dataclass instance to be converted.

Keyword Arguments:

auto_batch_size (bool, optional) – If True, automatically determines and applies batch size to the resulting TensorDict. Defaults to False.
batch_dims (int, optional) – If auto_batch_size is True, defines how many dimensions the output tensordict should have. Defaults to None (full batch-size at each level).
as_tensorclass (bool, optional) – If True, delegates the conversion to the free function from_dataclass() and returns a tensor-compatible class (tensorclass()) or instance instead of a TensorDict. Defaults to False.
device (torch.device, optional) – The device on which the TensorDict will be created. Defaults to None.
batch_size (torch.Size, optional) – The batch size of the TensorDict. Defaults to None.

Returns:

A TensorDict instance derived from the provided dataclass, unless as_tensorclass is True, in which case a tensor-compatible class or instance is returned.

Raises:

TypeError – If the provided input is not a dataclass instance.

Warning

This method is distinct from the free function from_dataclass and serves a different purpose. While the free function returns a tensor-compatible class or instance, this method returns a TensorDict instance.

classmethod from_dict(input_dict: List[Dict[NestedKey, Any]], *other, auto_batch_size: bool = False, batch_size=None, device=None, batch_dims=None, stack_dim_name=None, stack_dim=0)

Returns a TensorDict created from a dictionary or another TensorDict.

If batch_size is not specified, returns the maximum batch size possible.

This function works on nested dictionaries too, or can be used to determine the batch-size of a nested tensordict.

Parameters:

input_dict (dictionary, optional) – a dictionary to use as a data source (nested keys compatible).

Keyword Arguments:

auto_batch_size (bool, optional) – if True, the batch size will be computed automatically. Defaults to False.
batch_size (iterable of int, optional) – a batch size for the tensordict.
device (torch.device or compatible type, optional) – a device for the TensorDict.
batch_dims (int, optional) – the batch_dims (i.e., the number of leading dimensions to be considered for batch_size). Exclusive with batch_size. Note that this is the maximum number of batch dims of the tensordict; a smaller number is tolerated.
names (list of str, optional) – the dimension names of the tensordict.

Examples

>>> input_dict = {"a": torch.randn(3, 4), "b": torch.randn(3)}
>>> print(TensorDict.from_dict(input_dict))
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([3, 4]), device=cpu, dtype=torch.float32, is_shared=False),
        b: Tensor(shape=torch.Size([3]), device=cpu, dtype=torch.float32, is_shared=False)},
    batch_size=torch.Size([3]),
    device=None,
    is_shared=False)
>>> # nested dict: the nested TensorDict can have a different batch-size
>>> # as long as its leading dims match.
>>> input_dict = {"a": torch.randn(3), "b": {"c": torch.randn(3, 4)}}
>>> print(TensorDict.from_dict(input_dict))
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([3]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([3, 4]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([3, 4]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([3]),
    device=None,
    is_shared=False)
>>> # we can also use this to work out the batch size of a tensordict
>>> input_td = TensorDict({"a": torch.randn(3), "b": {"c": torch.randn(3, 4)}}, [])
>>> print(TensorDict.from_dict(input_td))
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([3]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([3, 4]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([3, 4]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([3]),
    device=None,
    is_shared=False)

from_dict_instance(input_dict, *others, auto_batch_size: Optional[bool] = None, batch_size=None, device=None, batch_dims=None, names=None)

Instance method version of from_dict().

Unlike from_dict(), this method will attempt to keep the tensordict types within the existing tree (for any existing leaf).

Examples

>>> from tensordict import TensorDict, tensorclass
>>> import torch
>>>
>>> @tensorclass
... class MyClass:
...     x: torch.Tensor
...     y: int
>>>
>>> td = TensorDict({"a": torch.randn(()), "b": MyClass(x=torch.zeros(()), y=1)})
>>> print(td.from_dict_instance(td.to_dict()))
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
        b: MyClass(
            x=Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
            y=Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False),
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
>>> print(td.from_dict(td.to_dict()))
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                x: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
                y: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)

classmethod from_h5(filename, *, mode: str = 'r', auto_batch_size: bool = False, batch_dims: Optional[int] = None, batch_size: Optional[Size] = None)

Creates a PersistentTensorDict from a h5 file.

Parameters:: filename (str) – The path to the h5 file.

Keyword Arguments:

mode (str, optional) – Reading mode. Defaults to "r".
auto_batch_size (bool, optional) – If True, the batch size will be computed automatically. Defaults to False.
batch_dims (int, optional) – If auto_batch_size is True, defines how many dimensions the output tensordict should have. Defaults to None (full batch-size at each level).
batch_size (torch.Size, optional) – The batch size of the TensorDict. Defaults to None.

Returns:: A PersistentTensorDict representation of the input h5 file.

Examples

>>> td = TensorDict.from_h5("path/to/file.h5")
>>> print(td)
PersistentTensorDict(
    fields={
        key1: Tensor(shape=torch.Size([3]), device=cpu, dtype=torch.float32, is_shared=False),
        key2: Tensor(shape=torch.Size([3]), device=cpu, dtype=torch.float32, is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)

classmethod from_module(module, as_module: bool = False, lock: bool = True, use_state_dict: bool = False)

Copies the params and buffers of a module in a tensordict.

Parameters:

module (nn.Module) – the module to get the parameters from.
as_module (bool, optional) – if True, a TensorDictParams instance will be returned which can be used to store parameters within a torch.nn.Module. Defaults to False.
lock (bool, optional) – if True, the resulting tensordict will be locked. Defaults to True.
use_state_dict (bool, optional) –
if True, the state-dict from the module will be used and unflattened into a TensorDict with the tree structure of the model. Defaults to False.

Note

This is particularly useful when state-dict hooks have to be used.

Examples

>>> from torch import nn
>>> module = nn.TransformerDecoder(
...     decoder_layer=nn.TransformerDecoderLayer(nhead=4, d_model=4),
...     num_layers=1
... )
>>> params = TensorDict.from_module(module)
>>> print(params["layers", "0", "linear1"])
TensorDict(
    fields={
        bias: Parameter(shape=torch.Size([2048]), device=cpu, dtype=torch.float32, is_shared=False),
        weight: Parameter(shape=torch.Size([2048, 4]), device=cpu, dtype=torch.float32, is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)

classmethod from_modules(*modules, as_module: bool = False, lock: bool = True, use_state_dict: bool = False, lazy_stack: bool = False, expand_identical: bool = False)

Retrieves the parameters of several modules for ensemble learning/mixture-of-experts applications through vmap.

Parameters:

modules (sequence of nn.Module) – the modules to get the parameters from. If the modules differ in their structure, a lazy stack is needed (see the lazy_stack argument below).

Keyword Arguments:

as_module (bool, optional) – if True, a TensorDictParams instance will be returned which can be used to store parameters within a torch.nn.Module. Defaults to False.
lock (bool, optional) – if True, the resulting tensordict will be locked. Defaults to True.
use_state_dict (bool, optional) –
if True, the state-dict from the module will be used and unflattened into a TensorDict with the tree structure of the model. Defaults to False.

Note

This is particularly useful when state-dict hooks have to be used.
lazy_stack (bool, optional) –
whether parameters should be densely or lazily stacked. Defaults to False (dense stack).

Note

lazy_stack and as_module are exclusive features.

Warning

There is a crucial difference between lazy and non-lazy outputs in that non-lazy output will reinstantiate parameters with the desired batch-size, while lazy_stack will just represent the parameters as lazily stacked. This means that whilst the original parameters can safely be passed to an optimizer when lazy_stack=True, the new parameters need to be passed when it is set to False.

Warning

Whilst it can be tempting to use a lazy stack to keep the original parameter references, remember that a lazy stack performs a stack each time get() is called. This will require memory (N times the size of the parameters, more if a graph is built) and time to be computed. It also means that the optimizer(s) will contain more parameters, and operations like step() or zero_grad() will take longer to be executed. In general, lazy_stack should be reserved to very few use cases.
expand_identical (bool, optional) – if True and the same parameter (same identity) is being stacked to itself, an expanded version of this parameter will be returned instead. This argument is ignored when lazy_stack=True.

Examples

>>> import torch
>>> from torch import nn
>>> from tensordict import TensorDict
>>> torch.manual_seed(0)
>>> empty_module = nn.Linear(3, 4, device="meta")
>>> n_models = 2
>>> modules = [nn.Linear(3, 4) for _ in range(n_models)]
>>> params = TensorDict.from_modules(*modules)
>>> print(params)
TensorDict(
    fields={
        bias: Parameter(shape=torch.Size([2, 4]), device=cpu, dtype=torch.float32, is_shared=False),
        weight: Parameter(shape=torch.Size([2, 4, 3]), device=cpu, dtype=torch.float32, is_shared=False)},
    batch_size=torch.Size([2]),
    device=None,
    is_shared=False)
>>> # example of batch execution
>>> def exec_module(params, x):
...     with params.to_module(empty_module):
...         return empty_module(x)
>>> x = torch.randn(3)
>>> y = torch.vmap(exec_module, (0, None))(params, x)
>>> assert y.shape == (n_models, 4)
>>> # since lazy_stack = False, backprop leaves the original params untouched
>>> y.sum().backward()
>>> assert params["weight"].grad.norm() > 0
>>> assert modules[0].weight.grad is None

With lazy_stack=True, things are slightly different:

>>> params = TensorDict.from_modules(*modules, lazy_stack=True)
>>> print(params)
LazyStackedTensorDict(
    fields={
        bias: Tensor(shape=torch.Size([2, 4]), device=cpu, dtype=torch.float32, is_shared=False),
        weight: Tensor(shape=torch.Size([2, 4, 3]), device=cpu, dtype=torch.float32, is_shared=False)},
    exclusive_fields={
    },
    batch_size=torch.Size([2]),
    device=None,
    is_shared=False,
    stack_dim=0)
>>> # example of batch execution
>>> y = torch.vmap(exec_module, (0, None))(params, x)
>>> assert y.shape == (n_models, 4)
>>> y.sum().backward()
>>> assert modules[0].weight.grad is not None

classmethod from_namedtuple(named_tuple, *, auto_batch_size: bool = False, batch_dims: Optional[int] = None, device: Optional[device] = None, batch_size: Optional[Size] = None)

Converts a namedtuple to a TensorDict recursively.

Parameters:

named_tuple – The namedtuple instance to be converted.

Keyword Arguments:

auto_batch_size (bool, optional) – if True, the batch size will be computed automatically. Defaults to False.
batch_dims (int, optional) – If auto_batch_size is True, defines how many dimensions the output tensordict should have. Defaults to None (full batch-size at each level).
device (torch.device, optional) – The device on which the TensorDict will be created. Defaults to None.
batch_size (torch.Size, optional) – The batch size of the TensorDict. Defaults to None.

Returns:

A TensorDict representation of the input namedtuple.

Examples

>>> from tensordict import TensorDict
>>> import torch
>>> data = TensorDict({
...     "a_tensor": torch.zeros((3)),
...     "nested": {"a_tensor": torch.zeros((3)), "a_string": "zero!"}}, [3])
>>> nt = data.to_namedtuple()
>>> print(nt)
GenericDict(a_tensor=tensor([0., 0., 0.]), nested=GenericDict(a_tensor=tensor([0., 0., 0.]), a_string='zero!'))
>>> TensorDict.from_namedtuple(nt, auto_batch_size=True)
TensorDict(
    fields={
        a_tensor: Tensor(shape=torch.Size([3]), device=cpu, dtype=torch.float32, is_shared=False),
        nested: TensorDict(
            fields={
                a_string: NonTensorData(data=zero!, batch_size=torch.Size([3]), device=None),
                a_tensor: Tensor(shape=torch.Size([3]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([3]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([3]),
    device=None,
    is_shared=False)

classmethod from_pytree(pytree, *, batch_size: Optional[Size] = None, auto_batch_size: bool = False, batch_dims: Optional[int] = None)

Converts a pytree to a TensorDict instance.

This method is designed to keep the pytree nested structure as much as possible.

Additional non-tensor keys are added to keep track of each level’s identity, providing a built-in pytree-to-tensordict bijective transform API.

Accepted classes currently include lists, tuples, named tuples and dict.

Note

For dictionaries, non-NestedKey keys are registered separately as NonTensorData instances.

Note

Tensor-castable types (such as int, float or np.ndarray) will be converted to torch.Tensor instances. Note that this transformation is not injective: converting the tensordict back to a pytree will not recover the original types.

Examples

>>> # Create a pytree with tensor leaves, and one "weird"-looking dict key
>>> class WeirdLookingClass:
...     pass
...
>>> weird_key = WeirdLookingClass()
>>> # Make a pytree with tuple, lists, dict and namedtuple
>>> pytree = (
...     [torch.randint(10, (3,)), torch.zeros(2)],
...     {
...         "tensor": torch.randn(
...             2,
...         ),
...         "td": TensorDict({"one": 1}),
...         weird_key: torch.randint(10, (2,)),
...         "list": [1, 2, 3],
...     },
...     {"named_tuple": TensorDict({"two": torch.ones(1) * 2}).to_namedtuple()},
... )
>>> # Build a TensorDict from that pytree
>>> td = TensorDict.from_pytree(pytree)
>>> # Recover the pytree
>>> pytree_recon = td.to_pytree()
>>> # Check that the leaves match
>>> def check(v1, v2):
...     assert (v1 == v2).all()
...
>>> torch.utils._pytree.tree_map(check, pytree, pytree_recon)
>>> assert weird_key in pytree_recon[1]

classmethod from_struct_array(struct_array: ndarray, *, auto_batch_size: bool = False, batch_dims: Optional[int] = None, device: Optional[device] = None, batch_size: Optional[Size] = None) → T

Converts a structured numpy array to a TensorDict.

The resulting TensorDict will share the same memory content as the numpy array (it is a zero-copy operation). Changing values of the structured numpy array in-place will affect the content of the TensorDict.

Note

This method performs a zero-copy operation, meaning that the resulting TensorDict will share the same memory content as the input numpy array. Therefore, changing values of the numpy array in-place will affect the content of the TensorDict.

Parameters:

struct_array (np.ndarray) – The structured numpy array to be converted.

Keyword Arguments:

auto_batch_size (bool, optional) – If True, the batch size will be computed automatically. Defaults to False.
batch_dims (int, optional) – If auto_batch_size is True, defines how many dimensions the output tensordict should have. Defaults to None (full batch-size at each level).
device (torch.device, optional) –
The device on which the TensorDict will be created. Defaults to None.

Note

Changing the device (i.e., specifying any device other than None or "cpu") will transfer the data, resulting in a change to the memory location of the returned data.
batch_size (torch.Size, optional) – The batch size of the TensorDict. Defaults to None.

Returns:

A TensorDict representation of the input structured numpy array.

Examples

>>> x = np.array(
...     [("Rex", 9, 81.0), ("Fido", 3, 27.0)],
...     dtype=[("name", "U10"), ("age", "i4"), ("weight", "f4")],
... )
>>> td = TensorDict.from_struct_array(x)
>>> x_recon = td.to_struct_array()
>>> assert (x_recon == x).all()
>>> assert x_recon.shape == x.shape
>>> # Try modifying x age field and check effect on td
>>> x["age"] += 1
>>> assert (td["age"] == np.array([10, 4])).all()

classmethod from_tuple(obj, *, auto_batch_size: bool = False, batch_dims: Optional[int] = None, device: Optional[device] = None, batch_size: Optional[Size] = None)

Converts a tuple to a TensorDict.

Parameters:

obj – The tuple instance to be converted.

Keyword Arguments:

auto_batch_size (bool, optional) – If True, the batch size will be computed automatically. Defaults to False.
batch_dims (int, optional) – If auto_batch_size is True, defines how many dimensions the output tensordict should have. Defaults to None (full batch-size at each level).
device (torch.device, optional) – The device on which the TensorDict will be created. Defaults to None.
batch_size (torch.Size, optional) – The batch size of the TensorDict. Defaults to None.

Returns:

A TensorDict representation of the input tuple.

Examples

>>> my_tuple = (1, 2, 3)
>>> td = TensorDict.from_tuple(my_tuple)
>>> print(td)
TensorDict(
    fields={
        0: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False),
        1: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False),
        2: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)

classmethod fromkeys(keys: List[NestedKey], value: Any = 0)

Creates a tensordict from a list of keys and a single value.

Parameters:

keys (list of NestedKey) – An iterable specifying the keys of the new dictionary.
value (compatible type, optional) – The value for all keys. Defaults to 0.

gather(dim: int, index: Tensor, out: Optional[T] = None) → T

Gathers values along an axis specified by dim.

Parameters:

dim (int) – the dimension along which to collect the elements.
index (torch.Tensor) – a long tensor whose number of dimensions matches that of the tensordict, with only one dimension differing between the two (the gathering dimension). Its elements refer to the indices to be gathered along the required dimension.
out (TensorDictBase, optional) – a destination tensordict. It must have the same shape as the index.

Examples

>>> td = TensorDict(
...     {"a": torch.randn(3, 4, 5),
...      "b": TensorDict({"c": torch.zeros(3, 4, 5)}, [3, 4, 5])},
...     [3, 4])
>>> index = torch.randint(4, (3, 2))
>>> td_gather = td.gather(dim=1, index=index)
>>> print(td_gather)
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([3, 2, 5]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([3, 2, 5]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([3, 2, 5]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([3, 2]),
    device=None,
    is_shared=False)

Gather keeps the dimension names.

Examples

>>> td.names = ["a", "b"]
>>> td_gather = td.gather(dim=1, index=index)
>>> td_gather.names
["a", "b"]

gather_and_stack(dst: int, group: 'torch.distributed.ProcessGroup' | None = None) → T | None

Gathers tensordicts from various workers and stacks them onto self in the destination worker.

Parameters:

dst (int) – the rank of the destination worker where gather_and_stack() will be called.
group (torch.distributed.ProcessGroup, optional) – if set, the specified process group will be used for communication. Otherwise, the default process group will be used. Defaults to None.

Example

>>> from torch import multiprocessing as mp
>>> from tensordict import TensorDict
>>> import torch
>>>
>>> def client():
...     torch.distributed.init_process_group(
...         "gloo",
...         rank=1,
...         world_size=2,
...         init_method=f"tcp://localhost:10003",
...     )
...     # Create a single tensordict to be sent to server
...     td = TensorDict(
...         {("a", "b"): torch.randn(2),
...          "c": torch.randn(2)}, [2]
...     )
...     td.gather_and_stack(0)
...
>>> def server():
...     torch.distributed.init_process_group(
...         "gloo",
...         rank=0,
...         world_size=2,
...         init_method=f"tcp://localhost:10003",
...     )
...     # Creates the destination tensordict on server.
...     # The first dim must be equal to world_size-1
...     td = TensorDict(
...         {("a", "b"): torch.zeros(2),
...          "c": torch.zeros(2)}, [2]
...     ).expand(1, 2).contiguous()
...     td.gather_and_stack(0)
...     assert (td["a", "b"] != 0).all()
...     print("yuppie")
...
>>> if __name__ == "__main__":
...     mp.set_start_method("spawn")
...
...     main_worker = mp.Process(target=server)
...     secondary_worker = mp.Process(target=client)
...
...     main_worker.start()
...     secondary_worker.start()
...
...     main_worker.join()
...     secondary_worker.join()

get(key: NestedKey, *args, **kwargs) → Tensor

Gets the value stored with the input key.

Parameters:

key (str, tuple of str) – key to be queried. If tuple of str it is equivalent to chained calls of getattr.
default –
default value if the key is not found in the tensordict. Defaults to None.

Warning

Previously, if a key was not present in the tensordict and no default was passed, a KeyError was raised. From v0.7, this behaviour has been changed and a None value is returned instead (in accordance with the behaviour of dict.get). To adopt the old behaviour, set the environment variable TD_GET_DEFAULTS_TO_NONE='0' (e.g., export TD_GET_DEFAULTS_TO_NONE='0') or call tensordict.set_get_defaults_to_none(False).

Examples

>>> td = TensorDict({"x": 1}, batch_size=[])
>>> td.get("x")
tensor(1)
>>> td.get("y")
None

get_at(key: NestedKey, *args, **kwargs) → Tensor

Gets the value stored under key at the given index.

Parameters:

key (str, tuple of str) – key to be retrieved.
index (int, slice, torch.Tensor, iterable) – index of the tensor.
default (torch.Tensor) – default value to return if the key is not present in the tensordict.

Returns:

indexed tensor.

Examples

>>> td = TensorDict({"x": torch.arange(3)}, batch_size=[])
>>> td.get_at("x", index=1)
tensor(1)

get_item_shape(key)

Gets the shape of an item in the lazy stack.

Heterogeneous dimensions are returned as -1.

This implementation is inefficient as it will attempt to stack the items to compute their shape, and should only be used for printing.

get_nestedtensor(key: NestedKey, default: Any = _NoDefault.ZERO, *, layout: Optional[layout] = None) → Tensor

Returns a nested tensor when stacking cannot be achieved.

Parameters:

key (NestedKey) – the entry to nest.
default (Any, optional) –
the default value to return in case the key isn’t in all sub-tensordicts.

Note

In case the default is a tensor, this method will attempt the construction of a nestedtensor with it. Otherwise, the default value will be returned.

Keyword Arguments:

layout (torch.layout, optional) – the layout for the nested tensor.

Examples

>>> td0 = TensorDict({"a": torch.zeros(4), "b": torch.zeros(4)}, [])
>>> td1 = TensorDict({"a": torch.ones(5)}, [])
>>> td = torch.stack([td0, td1], 0)
>>> a = td.get_nestedtensor("a")
>>> # using a tensor as default uses this default to build the nested tensor
>>> b = td.get_nestedtensor("b", default=torch.ones(4))
>>> assert (a == b).all()
>>> # using anything else as default returns the default
>>> b2 = td.get_nestedtensor("b", None)
>>> assert b2 is None

get_non_tensor(key: NestedKey, default=_NoDefault.ZERO)

Gets a non-tensor value, if it exists, or default if the non-tensor value is not found.

This method is robust to tensor/TensorDict values, meaning that if the value gathered is a regular tensor it will be returned too (although this method comes with some overhead and should not be used out of its natural scope).

See set_non_tensor() for more information on how to set non-tensor values in a tensordict.

Parameters:

key (NestedKey) – the location of the NonTensorData object.
default (Any, optional) – the value to be returned if the key cannot be found.

Returns:

the content of the tensordict.tensorclass.NonTensorData, or the entry corresponding to the key if it isn’t a tensordict.tensorclass.NonTensorData (or default if the entry cannot be found).

Examples

>>> data = TensorDict({}, batch_size=[])
>>> data.set_non_tensor(("nested", "the string"), "a string!")
>>> assert data.get_non_tensor(("nested", "the string")) == "a string!"
>>> # regular `get` works but returns a NonTensorData object
>>> data.get(("nested", "the string"))
NonTensorData(
    data='a string!',
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)

property grad: Returns a tensordict containing the .grad attributes of the leaf tensors.

half(): Casts all tensors to torch.half.

insert(index: int, tensordict: T) → None

Insert a TensorDict into the stack at the specified index.

Analogous to list.insert. The inserted TensorDict must have compatible batch_size and device. Insertion is in-place, nothing is returned.

Parameters:

index (int) – The index at which the new TensorDict should be inserted.
tensordict (TensorDictBase) – The TensorDict to be inserted into the stack.

int(): Casts all tensors to torch.int.

int16(): Casts all tensors to torch.int16.

int32(): Casts all tensors to torch.int32.

int64(): Casts all tensors to torch.int64.

int8(): Casts all tensors to torch.int8.

irecv(src: int, *, group: 'torch.distributed.ProcessGroup' | None = None, return_premature: bool = False, init_tag: int = 0, pseudo_rand: bool = False) → tuple[int, list[torch.Future]] | list[torch.Future] | None

Receives the content of a tensordict and updates content with it asynchronously.

Check the example in the isend() method for context.

Parameters:

src (int) – the rank of the source worker.

Keyword Arguments:

group (torch.distributed.ProcessGroup, optional) – if set, the specified process group will be used for communication. Otherwise, the default process group will be used. Defaults to None.
return_premature (bool) – if True, returns a list of futures to wait upon until the tensordict is updated. Defaults to False, i.e. waits until the update is completed within the call.
init_tag (int) – the init_tag used by the source worker.
pseudo_rand (bool) – if True, the sequence of tags will be pseudo-random, allowing multiple pieces of data to be sent from different nodes without overlap. Notice that the generation of these pseudo-random numbers is expensive (1e-5 sec/number), meaning that it could slow down the runtime of your algorithm. This value must match the one passed to isend(). Defaults to False.

Returns:

if return_premature=True, a list of futures to wait upon until the tensordict is updated.

is_consolidated(): Checks if a TensorDict has a consolidated storage.

is_contiguous() → bool: Returns a boolean indicating if all the tensors are contiguous.

is_empty() → bool: Checks if the tensordict contains any leaf.

is_memmap() → bool

Checks if tensordict is memory-mapped.

If a TensorDict instance is memory-mapped, it is locked (entries cannot be renamed, removed or added). If a TensorDict is created with tensors that are all memory-mapped, this does not mean that is_memmap will return True (as a new tensor may or may not be memory-mapped). Only if one calls tensordict.memmap_() will the tensordict be considered as memory-mapped.

This is always True for tensordicts on a CUDA device.

is_shared() → bool

Checks if tensordict is in shared memory.

If a TensorDict instance is in shared memory, it is locked (entries cannot be renamed, removed or added). If a TensorDict is created with tensors that are all in shared memory, this does not mean that is_shared will return True (as a new tensor may or may not be in shared memory). Only if one calls tensordict.share_memory_() or places the tensordict on a device where the content is shared by default (e.g., "cuda") will the tensordict be considered in shared memory.

This is always True for tensordicts on a CUDA device.

isend(dst: int, *, group: 'torch.distributed.ProcessGroup' | None = None, init_tag: int = 0, pseudo_rand: bool = False) → int

Sends the content of the tensordict asynchronously.

Parameters:

dst (int) – the rank of the destination worker where the content should be sent.

Keyword Arguments:

group (torch.distributed.ProcessGroup, optional) – if set, the specified process group will be used for communication. Otherwise, the default process group will be used. Defaults to None.
init_tag (int) – the initial tag to be used to mark the tensors. Note that this will be incremented by as much as the number of tensors contained in the TensorDict.
pseudo_rand (bool) – if True, the sequence of tags will be pseudo-random, allowing multiple pieces of data to be sent from different nodes without overlap. Notice that the generation of these pseudo-random numbers is expensive (1e-5 sec/number), meaning that it could slow down the runtime of your algorithm. Defaults to False.

Example

>>> import torch
>>> from tensordict import TensorDict
>>> from torch import multiprocessing as mp
>>> def client():
...     torch.distributed.init_process_group(
...         "gloo",
...         rank=1,
...         world_size=2,
...         init_method=f"tcp://localhost:10003",
...     )
...
...     td = TensorDict(
...         {
...             ("a", "b"): torch.randn(2),
...             "c": torch.randn(2, 3),
...             "_": torch.ones(2, 1, 5),
...         },
...         [2],
...     )
...     td.isend(0)
...
>>>
>>> def server(queue, return_premature=True):
...     torch.distributed.init_process_group(
...         "gloo",
...         rank=0,
...         world_size=2,
...         init_method=f"tcp://localhost:10003",
...     )
...     td = TensorDict(
...         {
...             ("a", "b"): torch.zeros(2),
...             "c": torch.zeros(2, 3),
...             "_": torch.zeros(2, 1, 5),
...         },
...         [2],
...     )
...     out = td.irecv(1, return_premature=return_premature)
...     if return_premature:
...         for fut in out:
...             fut.wait()
...     assert (td != 0).all()
...     queue.put("yuppie")
...
>>>
>>> if __name__ == "__main__":
...     queue = mp.Queue(1)
...     main_worker = mp.Process(
...         target=server,
...         args=(queue, )
...         )
...     secondary_worker = mp.Process(target=client)
...
...     main_worker.start()
...     secondary_worker.start()
...     out = queue.get(timeout=10)
...     assert out == "yuppie"
...     main_worker.join()
...     secondary_worker.join()

isfinite() → T

Returns a new tensordict with boolean elements representing if each element is finite or not.

Real values are finite when they are not NaN, negative infinity, or infinity. Complex values are finite when both their real and imaginary parts are finite.

isnan() → T

Returns a new tensordict with boolean elements representing if each element of input is NaN or not.

Complex values are considered NaN when their real and/or imaginary part is NaN.

isneginf() → T: Tests if each element of input is negative infinity or not.

isposinf() → T: Tests if each element of input is positive infinity or not.

isreal() → T: Returns a new tensordict with boolean elements representing if each element of input is real-valued or not.

items(include_nested=False, leaves_only=False, is_leaf=None, *, sort: bool = False)

Returns a generator of key-value pairs for the tensordict.

Parameters:

include_nested (bool, optional) – if True, nested values will be returned. Defaults to False.
leaves_only (bool, optional) – if True, only leaves will be returned. Defaults to False.
is_leaf (callable, optional) –
a callable over a class type returning a bool indicating if this class has to be considered as a leaf.

Note

The purpose of is_leaf is not to prevent recursive calls into nested tensordicts, but rather to mark certain types as “leaves” for the purpose of filtering when leaves_only=True. Even if is_leaf(cls) returns True, the nested structure of the tensordict will still be traversed if include_nested=True. In other words, is_leaf does not control the recursion depth, but rather provides a way to filter out certain types from the result when leaves_only=True. This means that a node in the tree can be both a leaf and a node with children. In practice, the default value of is_leaf does exclude tensordict and tensorclass instances from the leaf set.

See also

is_leaf_nontensor() and default_is_leaf().

Keyword Arguments:

sort (bool, optional) – whether the keys should be sorted. For nested keys, the keys are sorted according to their joined name (i.e., ("a", "key") will be counted as "a.key" for sorting). Be mindful that sorting may incur significant overhead when dealing with large tensordicts. Defaults to False.

keys(include_nested: bool = False, leaves_only: bool = False, is_leaf: Optional[Callable[[Type], bool]] = None, *, sort: bool = False) → _LazyStackedTensorDictKeysView

Returns a generator of tensordict keys.

Warning

TensorDict keys() method returns a lazy view of the keys. If the keys are queried but not iterated over and then the tensordict is modified, iterating over the keys later will return the new configuration of the keys.

Parameters:

include_nested (bool, optional) – if True, nested values will be returned. Defaults to False.
leaves_only (bool, optional) – if True, only leaves will be returned. Defaults to False.
is_leaf (callable, optional) –
a callable over a class type returning a bool indicating if this class has to be considered as a leaf.

Note

The purpose of is_leaf is not to prevent recursive calls into nested tensordicts, but rather to mark certain types as “leaves” for the purpose of filtering when leaves_only=True. Even if is_leaf(cls) returns True, the nested structure of the tensordict will still be traversed if include_nested=True. In other words, is_leaf does not control the recursion depth, but rather provides a way to filter out certain types from the result when leaves_only=True. This means that a node in the tree can be both a leaf and a node with children. In practice, the default value of is_leaf does exclude tensordict and tensorclass instances from the leaf set.

See also

is_leaf_nontensor() and default_is_leaf().

Keyword Arguments:

sort (bool, optional) – whether the keys should be sorted. For nested keys, the keys are sorted according to their joined name (i.e., ("a", "key") will be counted as "a.key" for sorting). Be mindful that sorting may incur significant overhead when dealing with large tensordicts. Defaults to False.

Examples

>>> from tensordict import TensorDict
>>> data = TensorDict({"0": 0, "1": {"2": 2}}, batch_size=[])
>>> list(data.keys())
['0', '1']
>>> list(data.keys(leaves_only=True))
['0']
>>> list(data.keys(include_nested=True, leaves_only=True))
['0', ('1', '2')]

classmethod lazy_stack(items: Sequence[TensorDictBase], dim: int = 0, *, device: Optional[Union[device, str, int]] = None, out: Optional[T] = None, stack_dim_name: Optional[str] = None, strict_shape: bool = False) → T

Stacks tensordicts in a LazyStackedTensorDict.

Parameters:

items (Sequence of TensorDictBase instances) – A sequence of TensorDictBase instances to stack.
dim (int, optional) – the dim along which to perform the lazy stack. Defaults to 0.

Keyword Arguments:

device (torch.device, optional) – a device to set in the LazyStackedTensorDict in case it cannot be inferred from the tensordict list (e.g., the list is empty).
out (TensorDictBase, optional) – a LazyStackedTensorDict where to write the data.
stack_dim_name (str, optional) – a name for the stacked dimension.
strict_shape (bool, optional) – if True, every tensordict’s shapes must match. Defaults to False.

lerp(end: tensordict.base.TensorDictBase | torch.Tensor, weight: tensordict.base.TensorDictBase | torch.Tensor | float)

Does a linear interpolation of two tensors start (given by self) and end based on a scalar or tensor weight.

\text{out}_i = \text{start}_i + \text{weight}_i \times (\text{end}_i - \text{start}_i)

The shapes of start and end must be broadcastable. If weight is a tensor, then the shapes of weight, start, and end must be broadcastable.

Parameters:

end (TensorDict) – the tensordict with the ending points.
weight (TensorDict, tensor or float) – the weight for the interpolation formula.

lerp_(end: tensordict.base.TensorDictBase | torch.Tensor | float, weight: tensordict.base.TensorDictBase | torch.Tensor | float): In-place version of lerp().

lgamma() → T: Computes the lgamma() value of each element of the TensorDict.

lgamma_() → T: Computes the lgamma() value of each element of the TensorDict in-place.

classmethod load(prefix: str | pathlib.Path, *args, **kwargs) → T

Loads a tensordict from disk.

This class method is a proxy to load_memmap().

load_(prefix: str | pathlib.Path, *args, **kwargs)

Loads a tensordict from disk within the current tensordict.

This method is a proxy to load_memmap_().

classmethod load_memmap(prefix: str | pathlib.Path, device: Optional[device] = None, non_blocking: bool = False, *, out: Optional[TensorDictBase] = None) → T

Loads a memory-mapped tensordict from disk.

Parameters:

prefix (str or Path to folder) – the path to the folder where the saved tensordict should be fetched.
device (torch.device or equivalent, optional) – if provided, the data will be asynchronously cast to that device. Supports “meta” device, in which case the data isn’t loaded but a set of empty “meta” tensors are created. This is useful to get a sense of the total model size and structure without actually opening any file.
non_blocking (bool, optional) – if True, synchronize won’t be called after loading tensors on device. Defaults to False.
out (TensorDictBase, optional) – optional tensordict where the data should be written.

Examples

>>> from tensordict import TensorDict
>>> td = TensorDict.fromkeys(["a", "b", "c", ("nested", "e")], 0)
>>> td.memmap("./saved_td")
>>> td_load = TensorDict.load_memmap("./saved_td")
>>> assert (td == td_load).all()

This method also allows loading nested tensordicts.

Examples

>>> nested = TensorDict.load_memmap("./saved_td/nested")
>>> assert nested["e"] == 0

A tensordict can also be loaded on “meta” device or, alternatively, as a fake tensor.

Examples

>>> import tempfile
>>> td = TensorDict({"a": torch.zeros(()), "b": {"c": torch.zeros(())}})
>>> with tempfile.TemporaryDirectory() as path:
...     td.save(path)
...     td_load = TensorDict.load_memmap(path, device="meta")
...     print("meta:", td_load)
...     from torch._subclasses import FakeTensorMode
...     with FakeTensorMode():
...         td_load = TensorDict.load_memmap(path)
...         print("fake:", td_load)
meta: TensorDict(
    fields={
        a: Tensor(shape=torch.Size([]), device=meta, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([]), device=meta, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([]),
            device=meta,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=meta,
    is_shared=False)
fake: TensorDict(
    fields={
        a: FakeTensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: FakeTensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([]),
            device=cpu,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=cpu,
    is_shared=False)

load_memmap_(prefix: str | pathlib.Path)

Loads the content of a memory-mapped tensordict within the tensordict where load_memmap_ is called.

See load_memmap() for more info.

load_state_dict(state_dict: OrderedDict[str, Any], strict=True, assign=False, from_flatten=False) → T

Loads a state-dict, formatted as in state_dict(), into the tensordict.

Parameters:

state_dict (OrderedDict) – the state_dict to be copied.
strict (bool, optional) – whether to strictly enforce that the keys in state_dict match the keys returned by this tensordict’s state_dict() function. Default: True
assign (bool, optional) – whether to assign items in the state dictionary to their corresponding keys in the tensordict instead of copying them in-place into the tensordict’s current tensors. When False, the properties of the tensors in the current tensordict are preserved, whereas when True, the properties of the tensors in the state dict are preserved. Default: False
from_flatten (bool, optional) – if True, the input state_dict is assumed to be flattened. Defaults to False.

Examples

>>> data = TensorDict({"1": 1, "2": 2, "3": {"3": 3}}, [])
>>> data_zeroed = TensorDict({"1": 0, "2": 0, "3": {"3": 0}}, [])
>>> sd = data.state_dict()
>>> data_zeroed.load_state_dict(sd)
>>> print(data_zeroed["3", "3"])
tensor(3)
>>> # with flattening
>>> data_zeroed = TensorDict({"1": 0, "2": 0, "3": {"3": 0}}, [])
>>> data_zeroed.load_state_dict(data.state_dict(flatten=True), from_flatten=True)
>>> print(data_zeroed["3", "3"])
tensor(3)

lock_() → T

Locks a tensordict for non in-place operations.

Functions such as set(), __setitem__(), update(), rename_key_() or other operations that add or remove entries will be blocked.

This method can be used as a context manager.

Example

>>> from tensordict import TensorDict
>>> td = TensorDict({"a": 1, "b": 2, "c": 3}, batch_size=[])
>>> with td.lock_():
...     assert td.is_locked
...     try:
...         td.set("d", 0) # error!
...     except RuntimeError:
...         print("td is locked!")
...     try:
...         del td["d"]
...     except RuntimeError:
...         print("td is locked!")
...     try:
...         td.rename_key_("a", "d")
...     except RuntimeError:
...         print("td is locked!")
...     td.set("a", 0, inplace=True)  # No storage is added, moved or removed
...     td.set_("a", 0) # No storage is added, moved or removed
...     td.update({"a": 0}, inplace=True)  # No storage is added, moved or removed
...     td.update_({"a": 0})  # No storage is added, moved or removed
>>> assert not td.is_locked

log() → T: Computes the log() value of each element of the TensorDict.

log10() → T: Computes the log10() value of each element of the TensorDict.

log10_() → T: Computes the log10() value of each element of the TensorDict in-place.

log1p() → T: Computes the log1p() value of each element of the TensorDict.

log1p_() → T: Computes the log1p() value of each element of the TensorDict in-place.

log2() → T: Computes the log2() value of each element of the TensorDict.

log2_() → T: Computes the log2() value of each element of the TensorDict in-place.

log_() → T: Computes the log() value of each element of the TensorDict in-place.

logical_and(other: tensordict.base.TensorDictBase | torch.Tensor, *, default: str | torch.Tensor | None = None) → TensorDictBase

Performs a logical AND operation between self and other.

\text{out}_i = \text{input}_i \land \text{other}_i

Parameters:

other (TensorDictBase or torch.Tensor) – the tensor or TensorDict to perform the logical AND with.

Keyword Arguments:

default (torch.Tensor or str, optional) – the default value to use for exclusive entries. If none is provided, the two tensordicts key list must match exactly. If default="intersection" is passed, only the intersecting key sets will be considered and other keys will be ignored. In all other cases, default will be used for all missing entries on both sides of the operation.

logsumexp(dim=None, keepdim=False, *, out=None)

Returns the log of summed exponentials of each row of the input tensordict in the given dimension dim. The computation is numerically stabilized.

If keepdim is True, the output tensor is of the same size as input except in the dimension(s) dim where it is of size 1. Otherwise, dim is squeezed (see squeeze()), resulting in the output tensor having 1 (or len(dim)) fewer dimension(s).

Parameters:

dim (int or tuple of ints, optional) – the dimension or dimensions to reduce. If None, all batch dimensions of the tensordict are reduced.
keepdim (bool) – whether the output tensordict has dim retained or not.

Keyword Arguments:

out (TensorDictBase, optional) – the output tensordict.
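
To illustrate what numerical stabilization means here, the plain-Python sketch below uses the common max-subtraction trick on a list of floats. This is an assumption about the general technique, not a description of the library's implementation, and the helper name stable_logsumexp is hypothetical.

```python
import math


def stable_logsumexp(values):
    """Return log(sum(exp(v))) for a non-empty list of finite floats."""
    m = max(values)
    # Subtracting the maximum keeps every exponent <= 0, so exp() cannot overflow.
    return m + math.log(sum(math.exp(v - m) for v in values))


small = stable_logsumexp([0.0, 0.0])        # log(2)
large = stable_logsumexp([1000.0, 1000.0])  # 1000 + log(2)
print(small, large)
```

Evaluating math.exp(1000.0) directly raises an OverflowError, which is why the shift by the maximum matters for large inputs.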

make_memmap(key: NestedKey, shape: torch.Size | torch.Tensor, *, dtype: Optional[dtype] = None) → MemoryMappedTensor

Creates an empty memory-mapped tensor given a shape and possibly a dtype.

Warning

This method is not lock-safe by design. A memory-mapped TensorDict instance present on multiple nodes will need to be updated using the method memmap_refresh_().

Writing an existing entry will result in an error.

Parameters:

key (NestedKey) – the key of the new entry to write. If the key is already present in the tensordict, an exception is raised.
shape (torch.Size or equivalent, torch.Tensor for nested tensors) – the shape of the tensor to write.

Keyword Arguments:

dtype (torch.dtype, optional) – the dtype of the new tensor.

Returns:

A new memory mapped tensor.

make_memmap_from_storage(key: NestedKey, storage: UntypedStorage, shape: torch.Size | torch.Tensor, *, dtype: Optional[dtype] = None) → MemoryMappedTensor

Creates an empty memory-mapped tensor given a storage, a shape and possibly a dtype.

Warning

This method is not lock-safe by design. A memory-mapped TensorDict instance present on multiple nodes will need to be updated using the method memmap_refresh_().

Note

If the storage has a filename associated, it must match the new filename for the file. If it has no associated filename but the tensordict has an associated path, an exception is raised.

Parameters:

key (NestedKey) – the key of the new entry to write. If the key is already present in the tensordict, an exception is raised.
storage (torch.UntypedStorage) – the storage to use for the new MemoryMappedTensor. Must be a physical memory storage.
shape (torch.Size or equivalent, torch.Tensor for nested tensors) – the shape of the tensor to write.

Keyword Arguments:

dtype (torch.dtype, optional) – the dtype of the new tensor.

Returns:

A new memory mapped tensor with the given storage.

make_memmap_from_tensor(key: NestedKey, tensor: Tensor, *, copy_data: bool = True) → MemoryMappedTensor

Creates an empty memory-mapped tensor given a tensor.

Warning

This method is not lock-safe by design. A memory-mapped TensorDict instance present on multiple nodes will need to be updated using the method memmap_refresh_().

This method always copies the storage content if copy_data is True (i.e., the storage is not shared).

Parameters:

key (NestedKey) – the key of the new entry to write. If the key is already present in the tensordict, an exception is raised.
tensor (torch.Tensor) – the tensor to replicate on physical memory.

Keyword Arguments:

copy_data (bool, optional) – if False, the new tensor will share the metadata of the input such as shape and dtype, but the content will be empty. Defaults to True.

Returns:

A new memory mapped tensor with the given storage.

map(fn: Callable[[TensorDictBase], TensorDictBase | None], dim: int = 0, num_workers: int | None = None, *, out: TensorDictBase | None = None, chunksize: int | None = None, num_chunks: int | None = None, pool: mp.Pool | None = None, generator: torch.Generator | None = None, max_tasks_per_child: int | None = None, worker_threads: int = 1, index_with_generator: bool = False, pbar: bool = False, mp_start_method: str | None = None)

Maps a function to splits of the tensordict across one dimension.

This method will apply a function to a tensordict instance by chunking it in tensordicts of equal size and dispatching the operations over the desired number of workers.

The function signature should be Callable[[TensorDict], Union[TensorDict, Tensor]]. The output must support the torch.cat() operation. The function must be serializable.

Note

This method is particularly useful when working with large datasets stored on disk (e.g., memory-mapped tensordicts), where chunks are zero-copy slices of the original data that can be passed to the processes at virtually zero cost. This allows very large datasets (e.g., over a terabyte) to be processed at little cost.

Parameters:

fn (callable) – function to apply to the tensordict. Signatures similar to Callable[[TensorDict], Union[TensorDict, Tensor]] are supported.
dim (int, optional) – the dim along which the tensordict will be chunked.
num_workers (int, optional) – the number of workers. Exclusive with pool. If none is provided, the number of workers will be set to the number of cpus available.

Keyword Arguments:

out (TensorDictBase, optional) – an optional container for the output. Its batch-size along the dim provided must match self.ndim. If it is shared or memmap (is_shared() or is_memmap() returns True) it will be populated within the remote processes, avoiding data inward transfers. Otherwise, the data from the self slice will be sent to the process, collected on the current process and written inplace into out.
chunksize (int, optional) – the size of each chunk of data. A chunksize of 0 will unbind the tensordict along the desired dimension and restack it after the function is applied, whereas chunksize>0 will split the tensordict and call torch.cat() on the resulting list of tensordicts. If none is provided, the number of chunks will equal the number of workers. For very large tensordicts, such large chunks may not fit in memory, and more chunks may be needed to make the operation practically doable. This argument is exclusive with num_chunks.
num_chunks (int, optional) – the number of chunks to split the tensordict into. If none is provided, the number of chunks will equal the number of workers. For very large tensordicts, such large chunks may not fit in memory, and more chunks may be needed to make the operation practically doable. This argument is exclusive with chunksize.
pool (mp.Pool, optional) – a multiprocess Pool instance to use to execute the job. If none is provided, a pool will be created within the map method.
generator (torch.Generator, optional) –
a generator to use for seeding. A base seed will be generated from it, and each worker of the pool will be seeded with the provided seed incremented by a unique integer from 0 to num_workers. If no generator is provided, a random integer will be used as seed. To work with unseeded workers, a pool should be created separately and passed to map() directly.

Note

Caution should be taken when providing a low-valued seed, as this can cause autocorrelation between experiments. For example, if 8 workers are requested and the seed is 4, the worker seeds will range from 4 to 11. If the seed is 5, the worker seeds will range from 5 to 12. These two experiments will share 7 seeds, which can have unexpected effects on the results.

Note

The goal of seeding the workers is to have independent seed on each worker, and NOT to have reproducible results across calls of the map method. In other words, two experiments may and probably will return different results as it is impossible to know which worker will pick which job. However, we can make sure that each worker has a different seed and that the pseudo-random operations on each will be uncorrelated.
max_tasks_per_child (int, optional) – the maximum number of jobs picked by every child process. Defaults to None, i.e., no restriction on the number of jobs.
worker_threads (int, optional) – the number of threads for the workers. Defaults to 1.
index_with_generator (bool, optional) – if True, the splitting / chunking of the tensordict will be done during the query, sparing init time. Note that chunk() and split() are much more efficient than indexing (which is used within the generator) so a gain of processing time at init time may have a negative impact on the total runtime. Defaults to False.
pbar (bool, optional) – if True, a progress bar will be displayed. Requires tqdm to be available. Defaults to False.
mp_start_method (str, optional) – the start method for multiprocessing. If not provided, the default start method will be used. Accepted strings are "fork" and "spawn". Keep in mind that "cuda" tensors cannot be shared between processes with the "fork" start method. This is without effect if the pool is passed to the map method.
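The seed-overlap arithmetic from the note on generator can be checked with a few lines of plain Python. This is only a sketch of the "base seed plus worker index" scheme described above; worker_seeds is a hypothetical helper, not part of tensordict.

```python
def worker_seeds(base_seed, num_workers):
    """Seeds handed to the workers under a 'base seed + worker index' scheme."""
    return {base_seed + i for i in range(num_workers)}


run_a = worker_seeds(4, 8)  # seeds 4..11
run_b = worker_seeds(5, 8)  # seeds 5..12

# Seeds used by both experiments.
shared = run_a & run_b
print(sorted(shared))  # [5, 6, 7, 8, 9, 10, 11]
print(len(shared))     # 7

# Assumption for illustration: spacing base seeds by at least num_workers
# keeps the two seed ranges disjoint.
run_c = worker_seeds(4 + 8, 8)  # seeds 12..19
print(run_a.isdisjoint(run_c))  # True
```

Drawing the base seed from a torch.Generator, as the generator argument does, makes such near-identical base seeds unlikely in practice.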

Examples

>>> import torch
>>> from tensordict import TensorDict
>>>
>>> def process_data(data):
...     data.set("y", data.get("x") + 1)
...     return data
>>> if __name__ == "__main__":
...     data = TensorDict({"x": torch.zeros(1, 1_000_000)}, [1, 1_000_000]).memmap_()
...     data = data.map(process_data, dim=1)
...     print(data["y"][:, :10])
...
tensor([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])

map_iter(fn: Callable[[TensorDictBase], TensorDictBase | None], dim: int = 0, num_workers: int | None = None, *, shuffle: bool = False, chunksize: int | None = None, num_chunks: int | None = None, pool: mp.Pool | None = None, generator: torch.Generator | None = None, max_tasks_per_child: int | None = None, worker_threads: int = 1, index_with_generator: bool = True, pbar: bool = False, mp_start_method: str | None = None)

Maps a function to splits of the tensordict across one dimension iteratively.

This is the iterable version of map().

This method will apply a function to a tensordict instance by chunking it in tensordicts of equal size and dispatching the operations over the desired number of workers. It will yield the results one at a time.

The function signature should be Callable[[TensorDict], Union[TensorDict, Tensor]]. The function must be serializable.

Note

This method is particularly useful when working with large datasets stored on disk (e.g., memory-mapped tensordicts), where chunks are zero-copy slices of the original data that can be passed to the processes at virtually zero cost. This allows very large datasets (e.g., over a terabyte) to be processed at little cost.

Note

This function can be used to represent a dataset and load from it, in a dataloader-like fashion.

Parameters:

fn (callable) – function to apply to the tensordict. Signatures similar to Callable[[TensorDict], Union[TensorDict, Tensor]] are supported.
dim (int, optional) – the dim along which the tensordict will be chunked.
num_workers (int, optional) – the number of workers. Exclusive with pool. If none is provided, the number of workers will be set to the number of cpus available.

Keyword Arguments:

shuffle (bool, optional) – whether the indices should be globally shuffled. If True, each batch will contain non-contiguous samples. If index_with_generator=False and shuffle=True, an error will be raised. Defaults to False.
chunksize (int, optional) – the size of each chunk of data. A chunksize of 0 will unbind the tensordict along the desired dimension and restack it after the function is applied, whereas chunksize>0 will split the tensordict and call torch.cat() on the resulting list of tensordicts. If none is provided, the number of chunks will equal the number of workers. For very large tensordicts, such large chunks may not fit in memory, and more chunks may be needed to make the operation practically doable. This argument is exclusive with num_chunks.
num_chunks (int, optional) – the number of chunks to split the tensordict into. If none is provided, the number of chunks will equal the number of workers. For very large tensordicts, such large chunks may not fit in memory, and more chunks may be needed to make the operation practically doable. This argument is exclusive with chunksize.
pool (mp.Pool, optional) – a multiprocess Pool instance to use to execute the job. If none is provided, a pool will be created within the map method.
generator (torch.Generator, optional) –
a generator to use for seeding. A base seed will be generated from it, and each worker of the pool will be seeded with the provided seed incremented by a unique integer from 0 to num_workers. If no generator is provided, a random integer will be used as seed. To work with unseeded workers, a pool should be created separately and passed to map() directly.

Note

Caution should be taken when providing a low-valued seed as this can cause autocorrelation between experiments, example: if 8 workers are asked and the seed is 4, the workers seed will range from 4 to 11. If the seed is 5, the workers seed will range from 5 to 12. These two experiments will have an overlap of 7 seeds, which can have unexpected effects on the results.

Note

The goal of seeding the workers is to have independent seed on each worker, and NOT to have reproducible results across calls of the map method. In other words, two experiments may and probably will return different results as it is impossible to know which worker will pick which job. However, we can make sure that each worker has a different seed and that the pseudo-random operations on each will be uncorrelated.
max_tasks_per_child (int, optional) – the maximum number of jobs picked by every child process. Defaults to None, i.e., no restriction on the number of jobs.
worker_threads (int, optional) – the number of threads for the workers. Defaults to 1.
index_with_generator (bool, optional) –
if True, the splitting / chunking of the tensordict will be done during the query, sparing init time. Note that chunk() and split() are much more efficient than indexing (which is used within the generator) so a gain of processing time at init time may have a negative impact on the total runtime. Defaults to True.

Note

The default value of index_with_generator differs for map_iter and map and the former assumes that it is prohibitively expensive to store a split version of the TensorDict in memory.
pbar (bool, optional) – if True, a progress bar will be displayed. Requires tqdm to be available. Defaults to False.
mp_start_method (str, optional) – the start method for multiprocessing. If not provided, the default start method will be used. Accepted strings are "fork" and "spawn". Keep in mind that "cuda" tensors cannot be shared between processes with the "fork" start method. This is without effect if the pool is passed to the map method.
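To make the interplay between shuffle and chunksize concrete, here is a conceptual sketch in plain Python of how indices along dim could be grouped into batches. It is not the library's implementation: chunk_indices and its seed argument are hypothetical, and only the chunksize>0 case is covered.

```python
import random


def chunk_indices(n, chunksize, shuffle=False, seed=0):
    """Group the indices 0..n-1 into consecutive batches of `chunksize` items."""
    indices = list(range(n))
    if shuffle:
        # Global shuffle: a batch may then hold non-contiguous samples.
        random.Random(seed).shuffle(indices)
    return [indices[i:i + chunksize] for i in range(0, n, chunksize)]


contiguous = chunk_indices(10, 4)
print(contiguous)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

shuffled = chunk_indices(10, 4, shuffle=True)
print([len(batch) for batch in shuffled])  # [4, 4, 2]
```

Without shuffling, every batch is a contiguous slice, which is what allows zero-copy slices of memory-mapped data. With shuffling, each batch has to be gathered by index, which fits the requirement above that shuffle=True goes with index_with_generator=True.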

Examples

>>> import torch
>>> from tensordict import TensorDict
>>>
>>> def process_data(data):
...     data.unlock_()
...     data.set("y", data.get("x") + 1)
...     return data
>>> if __name__ == "__main__":
...     data = TensorDict({"x": torch.zeros(1, 1_000_000)}, [1, 1_000_000]).memmap_()
...     for sample in data.map_iter(process_data, dim=1, chunksize=5):
...         print(sample["y"])
...         break
...
tensor([[1., 1., 1., 1., 1.]])

masked_fill(mask: Tensor, value: float | bool) → T

Out-of-place version of masked_fill_().

Parameters:

mask (boolean torch.Tensor) – mask of values to be filled. Shape must match the tensordict batch-size.
value – value used to fill the tensors.

Returns:

self

Examples

>>> td = TensorDict(source={'a': torch.zeros(3, 4)},
...     batch_size=[3])
>>> mask = torch.tensor([True, False, False])
>>> td1 = td.masked_fill(mask, 1.0)
>>> td1.get("a")
tensor([[1., 1., 1., 1.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

masked_fill_(mask: Tensor, value: float | bool) → T

Fills the values corresponding to the mask with the desired value.

Parameters:

mask (boolean torch.Tensor) – mask of values to be filled. Shape must match the tensordict batch-size.
value – value used to fill the tensors.

Returns:

self

Examples

>>> td = TensorDict(source={'a': torch.zeros(3, 4)},
...     batch_size=[3])
>>> mask = torch.tensor([True, False, False])
>>> td.masked_fill_(mask, 1.0)
>>> td.get("a")
tensor([[1., 1., 1., 1.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

masked_select(mask: Tensor) → T

Masks all tensors of the TensorDict and returns a new TensorDict instance with similar keys pointing to masked values.

Parameters:: mask (torch.Tensor) – boolean mask to be used for the tensors. Shape must match the TensorDict batch_size.

Examples

>>> td = TensorDict(source={'a': torch.zeros(3, 4)},
...    batch_size=[3])
>>> mask = torch.tensor([True, False, False])
>>> td_mask = td.masked_select(mask)
>>> td_mask.get("a")
tensor([[0., 0., 0., 0.]])

max(dim: int | NO_DEFAULT = _NoDefault.ZERO, keepdim: bool = False, *, reduce: bool | None = None, return_indices: bool = True) → TensorDictBase | torch.Tensor

Returns the maximum values of all elements in the input tensordict.

Parameters:

dim (int, optional) – if None, returns a dimensionless tensordict containing the max value of all leaves (if this can be computed). If integer, max is called upon the dimension specified if and only if this dimension is compatible with the tensordict shape.
keepdim (bool) – whether the output tensor has dim retained or not.

Keyword Arguments:

reduce (bool, optional) – if True, the reduction will occur across all TensorDict values and a single reduced tensor will be returned. Defaults to False.
return_indices (bool, optional) – max() returns a named tuple with values and indices when the dim argument is passed. The TensorDict equivalent of this is to return a tensorclass with entries "vals" and "indices" with identical structure within. Defaults to True.

Examples

>>> from tensordict import TensorDict
>>> import torch
>>> td = TensorDict(
...     a=torch.randn(3, 4, 5),
...     b=TensorDict(
...         c=torch.randn(3, 4, 5, 6),
...         d=torch.randn(3, 4, 5),
...         batch_size=(3, 4, 5),
...     ),
...     batch_size=(3, 4)
... )
>>> td.max(dim=0)
max(
    indices=TensorDict(
        fields={
            a: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.int64, is_shared=False),
            b: TensorDict(
                fields={
                    c: Tensor(shape=torch.Size([4, 5, 6]), device=cpu, dtype=torch.int64, is_shared=False),
                    d: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.int64, is_shared=False)},
                batch_size=torch.Size([4]),
                device=None,
                is_shared=False)},
        batch_size=torch.Size([4]),
        device=None,
        is_shared=False),
    vals=TensorDict(
        fields={
            a: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False),
            b: TensorDict(
                fields={
                    c: Tensor(shape=torch.Size([4, 5, 6]), device=cpu, dtype=torch.float32, is_shared=False),
                    d: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False)},
                batch_size=torch.Size([4]),
                device=None,
                is_shared=False)},
        batch_size=torch.Size([4]),
        device=None,
        is_shared=False),
    batch_size=torch.Size([4]),
    device=None,
    is_shared=False)
>>> td.max()
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
                d: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
>>> td.max(reduce=True)
tensor(3.2942)

maximum(other: tensordict.base.TensorDictBase | torch.Tensor, *, default: str | torch.Tensor | None = None) → T

Computes the element-wise maximum of self and other.

Parameters:: other (TensorDict or Tensor) – the other input tensordict or tensor.
Keyword Arguments:: default (torch.Tensor or str, optional) – the default value to use for exclusive entries. If none is provided, the two tensordicts key list must match exactly. If default="intersection" is passed, only the intersecting key sets will be considered and other keys will be ignored. In all other cases, default will be used for all missing entries on both sides of the operation.

maximum_(other: tensordict.base.TensorDictBase | torch.Tensor) → T: In-place version of maximum().

Note

Inplace maximum does not support default keyword argument.

classmethod maybe_dense_stack(items: Sequence[TensorDictBase], dim: int = 0, out: Optional[T] = None, strict: bool = False) → T

Stacks tensors or tensordicts densely if possible, or onto a LazyStackedTensorDict otherwise.

Examples

>>> td0 = TensorDict({"a": 0}, [])
>>> td1 = TensorDict({"b": 0}, [])
>>> LazyStackedTensorDict.maybe_dense_stack([td0, td0])  # returns a TensorDict with shape [2]
>>> LazyStackedTensorDict.maybe_dense_stack([td0, td1])  # returns a LazyStackedTensorDict with shape [2]
>>> LazyStackedTensorDict.maybe_dense_stack(list(torch.randn(2)))  # returns a torch.Tensor with shape [2]

mean(dim: Union[int, Tuple[int], Literal['feature']] = _NoDefault.ZERO, keepdim: bool = _NoDefault.ZERO, *, dtype: Optional[dtype] = None, reduce: Optional[bool] = None) → tensordict.base.TensorDictBase | torch.Tensor

Returns the mean value of all elements in the input tensordict.

Parameters:

dim (int, tuple of int, str, optional) – if None, returns a dimensionless tensordict containing the mean value of all leaves (if this can be computed). If integer or tuple of integers, mean is called upon the dimension specified if and only if this dimension is compatible with the tensordict shape. Only the "feature" string is currently permitted. Using dim="feature" will achieve the reduction over all feature dimensions. If reduce=True, a tensor of the shape of the TensorDict’s batch-size will be returned. Otherwise, a new tensordict with the same structure as self with reduced feature dimensions will be returned.
keepdim (bool) – whether the output tensor has dim retained or not.

Keyword Arguments:

dtype (torch.dtype, optional) – the desired data type of returned tensor. If specified, the input tensor is cast to dtype before the operation is performed. This is useful for preventing data type overflows. Default: None.
reduce (bool, optional) – if True, the reduction will occur across all TensorDict values and a single reduced tensor will be returned. Defaults to False.

Examples

>>> from tensordict import TensorDict
>>> import torch
>>> td = TensorDict(
...     a=torch.randn(3, 4, 5),
...     b=TensorDict(
...         c=torch.randn(3, 4, 5, 6),
...         d=torch.randn(3, 4, 5),
...         batch_size=(3, 4, 5),
...     ),
...     batch_size=(3, 4)
... )
>>> td.mean(dim=0)
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([4, 5, 6]), device=cpu, dtype=torch.float32, is_shared=False),
                d: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([4, 5]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([4]),
    device=None,
    is_shared=False)
>>> td.mean()
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
                d: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
>>> td.mean(reduce=True)
tensor(-0.0547)
>>> td.mean(dim="feature")
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([3, 4]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([3, 4, 5]), device=cpu, dtype=torch.float32, is_shared=False),
                d: Tensor(shape=torch.Size([3, 4, 5]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([3, 4, 5]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([3, 4]),
    device=None,
    is_shared=False)
>>> td = TensorDict(
...     a=torch.ones(3, 4, 5),
...     b=TensorDict(
...         c=torch.ones(3, 4, 5),
...         d=torch.ones(3, 4, 5),
...         batch_size=(3, 4, 5),
...     ),
...     batch_size=(3, 4)
... )
>>> td.mean(reduce=True, dim="feature")
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
>>> td.mean(reduce=True, dim=0)
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])

memmap(prefix: Optional[str] = None, copy_existing: bool = False, *, num_threads: int = 0, return_early: bool = False, share_non_tensor: bool = False, existsok: bool = True) → T

Writes all tensors onto a corresponding memory-mapped Tensor in a new tensordict.

Parameters:

prefix (str) – directory prefix where the memory-mapped tensors will be stored. The directory tree structure will mimic the tensordict’s.
copy_existing (bool) – If False (default), an exception will be raised if an entry in the tensordict is already a tensor stored on disk with an associated file, but is not saved in the correct location according to prefix. If True, any existing Tensor will be copied to the new location.

Keyword Arguments:

num_threads (int, optional) – the number of threads used to write the memmap tensors. Defaults to 0.
return_early (bool, optional) – if True and num_threads>0, the method will return a future of the tensordict.
share_non_tensor (bool, optional) – if True, the non-tensor data will be shared between the processes and writing operation (such as inplace update or set) on any of the workers within a single node will update the value on all other workers. If the number of non-tensor leaves is high (e.g., sharing large stacks of non-tensor data) this may result in OOM or similar errors. Defaults to False.
existsok (bool, optional) – if False, an exception will be raised if a tensor already exists in the same path. Defaults to True.

The TensorDict is then locked, meaning that any write operation that isn’t in-place will throw an exception (e.g., renaming, setting or removing an entry). Once the tensordict is unlocked, the memory-mapped attribute is set to False, because cross-process identity is no longer guaranteed.

Returns:: A new tensordict with the tensors stored on disk if return_early=False, otherwise a TensorDictFuture instance.

Note

Serialising in this fashion might be slow with deeply nested tensordicts, so it is not recommended to call this method inside a training loop.

memmap_(prefix: Optional[str] = None, copy_existing: bool = False, *, num_threads: int = 0, return_early: bool = False, share_non_tensor: bool = False, existsok: bool = True) → T

Writes all tensors onto a corresponding memory-mapped Tensor, in-place.

Parameters:

prefix (str) – directory prefix where the memory-mapped tensors will be stored. The directory tree structure will mimic the tensordict’s.
copy_existing (bool) – If False (default), an exception will be raised if an entry in the tensordict is already a tensor stored on disk with an associated file, but is not saved in the correct location according to prefix. If True, any existing Tensor will be copied to the new location.

Keyword Arguments:

num_threads (int, optional) – the number of threads used to write the memmap tensors. Defaults to 0.
return_early (bool, optional) – if True and num_threads>0, the method will return a future of the tensordict. The resulting tensordict can be queried using future.result().
share_non_tensor (bool, optional) – if True, the non-tensor data will be shared between the processes and writing operation (such as inplace update or set) on any of the workers within a single node will update the value on all other workers. If the number of non-tensor leaves is high (e.g., sharing large stacks of non-tensor data) this may result in OOM or similar errors. Defaults to False.
existsok (bool, optional) – if False, an exception will be raised if a tensor already exists in the same path. Defaults to True.

The TensorDict is then locked, meaning that any write operation that isn’t in-place will throw an exception (e.g., renaming, setting or removing an entry). Once the tensordict is unlocked, the memory-mapped attribute is set to False, because cross-process identity is no longer guaranteed.

Returns:: self if return_early=False, otherwise a TensorDictFuture instance.

Note

Serialising in this fashion might be slow with deeply nested tensordicts, so it is not recommended to call this method inside a training loop.

memmap_like(prefix: Optional[str] = None, copy_existing: bool = False, *, existsok: bool = True, num_threads: int = 0, return_early: bool = False, share_non_tensor: bool = False) → T

Creates a contentless Memory-mapped tensordict with the same shapes as the original one.

Parameters:

prefix (str) – directory prefix where the memory-mapped tensors will be stored. The directory tree structure will mimic the tensordict’s.
copy_existing (bool) – If False (default), an exception will be raised if an entry in the tensordict is already a tensor stored on disk with an associated file, but is not saved in the correct location according to prefix. If True, any existing Tensor will be copied to the new location.

Keyword Arguments:

num_threads (int, optional) – the number of threads used to write the memmap tensors. Defaults to 0.
return_early (bool, optional) – if True and num_threads>0, the method will return a future of the tensordict.
share_non_tensor (bool, optional) – if True, the non-tensor data will be shared between the processes and writing operation (such as inplace update or set) on any of the workers within a single node will update the value on all other workers. If the number of non-tensor leaves is high (e.g., sharing large stacks of non-tensor data) this may result in OOM or similar errors. Defaults to False.
existsok (bool, optional) – if False, an exception will be raised if a tensor already exists in the same path. Defaults to True.

The TensorDict is then locked, meaning that any write operation that isn’t in-place will throw an exception (e.g., renaming, setting or removing an entry). Once the tensordict is unlocked, the memory-mapped attribute is set to False, because cross-process identity is no longer guaranteed.

Returns:: A new TensorDict instance with data stored as memory-mapped tensors if return_early=False, otherwise a TensorDictFuture instance.

Note

This is the recommended method to write a set of large buffers on disk, as memmap_() will copy the information, which can be slow for large content.

Examples

>>> td = TensorDict({
...     "a": torch.zeros((3, 64, 64), dtype=torch.uint8),
...     "b": torch.zeros(1, dtype=torch.int64),
... }, batch_size=[]).expand(1_000_000)  # expand does not allocate new memory
>>> buffer = td.memmap_like("/path/to/dataset")

memmap_refresh_()

Refreshes the content of the memory-mapped tensordict if it has a saved_path.

This method will raise an exception if no path is associated with it.

min(dim: int | NO_DEFAULT = _NoDefault.ZERO, keepdim: bool = False, *, reduce: bool | None = None, return_indices: bool = True) → TensorDictBase | torch.Tensor

Returns the minimum values of all elements in the input tensordict.

Parameters:

dim (int, optional) – if None, returns a dimensionless tensordict containing the min value of all leaves (if this can be computed). If integer, min is called upon the dimension specified if and only if this dimension is compatible with the tensordict shape.
keepdim (bool) – whether the output tensor has dim retained or not.

Keyword Arguments:

reduce (bool, optional) – if True, the reduction will occur across all TensorDict values and a single reduced tensor will be returned. Defaults to False.
return_indices (bool, optional) – min() returns a named tuple with values and indices when the dim argument is passed. The TensorDict equivalent of this is to return a tensorclass with entries "vals" and "indices" with identical structure within. Defaults to True.

Examples

>>> from tensordict import TensorDict
>>> import torch
>>> td = TensorDict(
...     a=torch.randn(3, 4, 5),
...     b=TensorDict(
...         c=torch.randn(3, 4, 5, 6),
...         d=torch.randn(3, 4, 5),
...         batch_size=(3, 4, 5),
...     ),
...     batch_size=(3, 4)
... )
>>> td.min(dim=0)
min(
    indices=TensorDict(
        fields={
            a: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.int64, is_shared=False),
            b: TensorDict(
                fields={
                    c: Tensor(shape=torch.Size([4, 5, 6]), device=cpu, dtype=torch.int64, is_shared=False),
                    d: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.int64, is_shared=False)},
                batch_size=torch.Size([4]),
                device=None,
                is_shared=False)},
        batch_size=torch.Size([4]),
        device=None,
        is_shared=False),
    vals=TensorDict(
        fields={
            a: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False),
            b: TensorDict(
                fields={
                    c: Tensor(shape=torch.Size([4, 5, 6]), device=cpu, dtype=torch.float32, is_shared=False),
                    d: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False)},
                batch_size=torch.Size([4]),
                device=None,
                is_shared=False)},
        batch_size=torch.Size([4]),
        device=None,
        is_shared=False),
    batch_size=torch.Size([4]),
    device=None,
    is_shared=False)
>>> td.min()
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
                d: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
>>> td.min(reduce=True)
tensor(-2.9953)

minimum(other: tensordict.base.TensorDictBase | torch.Tensor, *, default: str | torch.Tensor | None = None) → T

Computes the element-wise minimum of self and other.

Parameters:: other (TensorDict or Tensor) – the other input tensordict or tensor.
Keyword Arguments:: default (torch.Tensor or str, optional) – the default value to use for exclusive entries. If none is provided, the two tensordicts key list must match exactly. If default="intersection" is passed, only the intersecting key sets will be considered and other keys will be ignored. In all other cases, default will be used for all missing entries on both sides of the operation.

minimum_(other: tensordict.base.TensorDictBase | torch.Tensor) → T: In-place version of minimum().

Note

In-place minimum does not support the default keyword argument.

mul(other: tensordict.base.TensorDictBase | torch.Tensor, *, default: str | torch.Tensor | None = None) → T

Multiplies self by other.

{out}_{i} = {input}_{i} \times {other}_{i}

Supports broadcasting, type promotion, and integer, float, and complex inputs.

Parameters:: other (TensorDict, Tensor or Number) – the tensor or number to multiply self by.
Keyword Arguments:: default (torch.Tensor or str, optional) – the default value to use for exclusive entries. If none is provided, the two tensordicts key list must match exactly. If default="intersection" is passed, only the intersecting key sets will be considered and other keys will be ignored. In all other cases, default will be used for all missing entries on both sides of the operation.

mul_(other: tensordict.base.TensorDictBase | torch.Tensor) → T: In-place version of mul().

Note

In-place mul does not support the default keyword argument.

named_apply(fn: Callable, *others: T, nested_keys: bool = False, batch_size: Optional[Sequence[int]] = None, device: torch.device | None = _NoDefault.ZERO, names: Optional[Sequence[str]] = _NoDefault.ZERO, inplace: bool = False, default: Any = _NoDefault.ZERO, filter_empty: Optional[bool] = None, propagate_lock: bool = False, call_on_nested: bool = False, out: Optional[TensorDictBase] = None, **constructor_kwargs) → Optional[T]

Applies a key-conditioned callable to all values stored in the tensordict and sets them in a new tensordict.

The callable signature must be Callable[Tuple[str, Tensor, ...], Optional[Union[Tensor, TensorDictBase]]].

Parameters:

fn (Callable) – function to be applied to the (name, tensor) pairs in the tensordict. For each leaf, only its leaf name will be used (not the full NestedKey).
*others (TensorDictBase instances, optional) – if provided, these tensordict instances should have a structure matching the one of self. The fn argument should receive as many unnamed inputs as the number of tensordicts, including self. If other tensordicts have missing entries, a default value can be passed through the default keyword argument.
nested_keys (bool, optional) – if True, the complete path to the leaf will be used. Defaults to False, i.e. only the last string is passed to the function.
batch_size (sequence of int, optional) – if provided, the resulting TensorDict will have the desired batch_size. The batch_size argument should match the batch_size after the transformation. This is a keyword only argument.
device (torch.device, optional) – the resulting device, if any.
names (list of str, optional) – the new dimension names, in case the batch_size is modified.
inplace (bool, optional) – if True, changes are made in-place. Default is False. This is a keyword only argument.
default (Any, optional) – default value for missing entries in the other tensordicts. If not provided, missing entries will raise a KeyError.
filter_empty (bool, optional) – if True, empty tensordicts will be filtered out. This also comes with a lower computational cost as empty data structures won’t be created and destroyed. Defaults to False for backward compatibility.
propagate_lock (bool, optional) – if True, a locked tensordict will produce another locked tensordict. Defaults to False.

call_on_nested (bool, optional) –

if True, the function will be called on first-level tensors and containers (TensorDict or tensorclass). In this scenario, fn is responsible for propagating its calls to nested levels. This allows fine-grained control over how the calls are propagated to nested tensordicts. If False, the function will only be called on leaves, and apply will take care of dispatching the function to all leaves.

>>> from tensordict import TensorDict, is_tensor_collection
>>> td = TensorDict({"a": {"b": [0.0, 1.0]}, "c": [1.0, 2.0]})
>>> def mean_tensor_only(val):
...     if is_tensor_collection(val):
...         raise RuntimeError("Unexpected!")
...     return val.mean()
>>> td_mean = td.apply(mean_tensor_only)
>>> def mean_any(val):
...     if is_tensor_collection(val):
...         # Recurse
...         return val.apply(mean_any, call_on_nested=True)
...     return val.mean()
>>> td_mean = td.apply(mean_any, call_on_nested=True)

out (TensorDictBase, optional) –
a tensordict where to write the results. This can be used to avoid creating a new tensordict:
```
>>> td = TensorDict({"a": 0})
>>> td.apply(lambda x: x+1, out=td)
>>> assert (td==1).all()
```
Warning

If the operation executed on the tensordict requires multiple keys to be accessed for a single computation, providing an out argument equal to self can cause the operation to provide silently wrong results. For instance:
```
>>> td = TensorDict({"a": 1, "b": 1})
>>> td.apply(lambda x: x+td["a"])["b"] # Right!
tensor(2)
>>> td.apply(lambda x: x+td["a"], out=td)["b"] # Wrong!
tensor(3)
```
**constructor_kwargs – additional keyword arguments to be passed to the TensorDict constructor.

Returns:

a new tensordict with transformed tensors.

Example

>>> td = TensorDict({
...     "a": -torch.ones(3),
...     "nested": {"a": torch.ones(3), "b": torch.zeros(3)}},
...     batch_size=[3])
>>> def name_filter(name, tensor):
...     if name == "a":
...         return tensor
>>> td.named_apply(name_filter)
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([3]), device=cpu, dtype=torch.float32, is_shared=False),
        nested: TensorDict(
            fields={
                a: Tensor(shape=torch.Size([3]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([3]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([3]),
    device=None,
    is_shared=False)
>>> def name_filter(name, *tensors):
...     if name == "a":
...         r = 0
...         for tensor in tensors:
...             r = r + tensor
...         return r
>>> out = td.named_apply(name_filter, td)
>>> print(out)
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([3]), device=cpu, dtype=torch.float32, is_shared=False),
        nested: TensorDict(
            fields={
                a: Tensor(shape=torch.Size([3]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([3]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([3]),
    device=None,
    is_shared=False)
>>> print(out["a"])
tensor([-2., -2., -2.])

Note

If None is returned by the function, the entry is ignored. This can be used to filter the data in the tensordict:

>>> td = TensorDict({"1": 1, "2": 2, "b": {"2": 2, "1": 1}}, [])
>>> def name_filter(name, tensor):
...     if name == "1":
...         return tensor
>>> td.named_apply(name_filter)
TensorDict(
    fields={
        1: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False),
        b: TensorDict(
            fields={
                1: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)

property names

The dimension names of the tensordict.

The names can be set at construction time using the names argument.

See also refine_names() for details on how to set the names after construction.

nanmean(dim: Union[int, Tuple[int], Literal['feature']] = _NoDefault.ZERO, keepdim: bool = _NoDefault.ZERO, *, dtype: Optional[dtype] = None, reduce: Optional[bool] = None) → tensordict.base.TensorDictBase | torch.Tensor

Returns the mean of all non-NaN elements in the input tensordict.

Parameters:

dim (int, tuple of int, str, optional) – if None, returns a dimensionless tensordict containing the mean value of all leaves (if this can be computed). If integer or tuple of integers, mean is called upon the dimension specified if and only if this dimension is compatible with the tensordict shape. The only string currently permitted is "feature": using dim="feature" reduces over all feature dimensions. If reduce=True, a tensor of the shape of the TensorDict’s batch-size will be returned. Otherwise, a new tensordict with the same structure as self with reduced feature dimensions will be returned.
keepdim (bool) – whether the output tensor has dim retained or not.

Keyword Arguments:

dtype (torch.dtype, optional) – the desired data type of returned tensor. If specified, the input tensor is cast to dtype before the operation is performed. This is useful for preventing data type overflows. Default: None.
reduce (bool, optional) – if True, the reduction will occur across all TensorDict values and a single reduced tensor will be returned. Defaults to False.

Examples

>>> from tensordict import TensorDict
>>> import torch
>>> td = TensorDict(
...     a=torch.randn(3, 4, 5),
...     b=TensorDict(
...         c=torch.randn(3, 4, 5, 6),
...         d=torch.randn(3, 4, 5),
...         batch_size=(3, 4, 5),
...     ),
...     batch_size=(3, 4)
... )
>>> td.nanmean(dim=0)
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([4, 5, 6]), device=cpu, dtype=torch.float32, is_shared=False),
                d: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([4, 5]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([4]),
    device=None,
    is_shared=False)
>>> td.nanmean()
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
                d: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
>>> td.nanmean(reduce=True)
tensor(-0.0547)
>>> td.nanmean(dim="feature")
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([3, 4]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([3, 4, 5]), device=cpu, dtype=torch.float32, is_shared=False),
                d: Tensor(shape=torch.Size([3, 4, 5]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([3, 4, 5]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([3, 4]),
    device=None,
    is_shared=False)
>>> td = TensorDict(
...     a=torch.ones(3, 4, 5),
...     b=TensorDict(
...         c=torch.ones(3, 4, 5),
...         d=torch.ones(3, 4, 5),
...         batch_size=(3, 4, 5),
...     ),
...     batch_size=(3, 4)
... )
>>> td.nanmean(reduce=True, dim="feature")
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
>>> td.nanmean(reduce=True, dim=0)
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])

nansum(dim: Union[int, Tuple[int], Literal['feature']] = _NoDefault.ZERO, keepdim: bool = _NoDefault.ZERO, *, dtype: Optional[dtype] = None, reduce: Optional[bool] = None) → tensordict.base.TensorDictBase | torch.Tensor

Returns the sum of all non-NaN elements in the input tensordict.

Parameters:

dim (int, tuple of int, str, optional) – if None, returns a dimensionless tensordict containing the sum value of all leaves (if this can be computed). If integer or tuple of integers, sum is called upon the dimension specified if and only if this dimension is compatible with the tensordict shape. The only string currently permitted is "feature": using dim="feature" reduces over all feature dimensions. If reduce=True, a tensor of the shape of the TensorDict’s batch-size will be returned. Otherwise, a new tensordict with the same structure as self with reduced feature dimensions will be returned.
keepdim (bool) – whether the output tensor has dim retained or not.

Keyword Arguments:

dtype (torch.dtype, optional) – the desired data type of returned tensor. If specified, the input tensor is cast to dtype before the operation is performed. This is useful for preventing data type overflows. Default: None.
reduce (bool, optional) – if True, the reduction will occur across all TensorDict values and a single reduced tensor will be returned. Defaults to False.

Examples

>>> from tensordict import TensorDict
>>> import torch
>>> td = TensorDict(
...     a=torch.randn(3, 4, 5),
...     b=TensorDict(
...         c=torch.randn(3, 4, 5, 6),
...         d=torch.randn(3, 4, 5),
...         batch_size=(3, 4, 5),
...     ),
...     batch_size=(3, 4)
... )
>>> td.nansum(dim=0)
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([4, 5, 6]), device=cpu, dtype=torch.float32, is_shared=False),
                d: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([4, 5]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([4]),
    device=None,
    is_shared=False)
>>> td.nansum()
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
                d: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
>>> td.nansum(reduce=True)
tensor(-0.)
>>> td.nansum(dim="feature")
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([3, 4]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([3, 4, 5]), device=cpu, dtype=torch.float32, is_shared=False),
                d: Tensor(shape=torch.Size([3, 4, 5]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([3, 4, 5]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([3, 4]),
    device=None,
    is_shared=False)
>>> td = TensorDict(
...     a=torch.ones(3, 4, 5),
...     b=TensorDict(
...         c=torch.ones(3, 4, 5),
...         d=torch.ones(3, 4, 5),
...         batch_size=(3, 4, 5),
...     ),
...     batch_size=(3, 4)
... )
>>> td.nansum(reduce=True, dim="feature")
tensor([[15., 15., 15., 15.],
        [15., 15., 15., 15.],
        [15., 15., 15., 15.]])
>>> td.nansum(reduce=True, dim=0)
tensor([[9., 9., 9., 9., 9.],
        [9., 9., 9., 9., 9.],
        [9., 9., 9., 9., 9.],
        [9., 9., 9., 9., 9.]])

property ndim: int: See batch_dims().

ndimension() → int: See batch_dims().

neg() → T: Computes the neg() value of each element of the TensorDict.

neg_() → T: Computes the neg() value of each element of the TensorDict in-place.

new_empty(*size: Size, dtype: Optional[dtype] = None, device: Union[device, str, int] = _NoDefault.ZERO, requires_grad: bool = False, layout: layout = torch.strided, pin_memory: Optional[bool] = None)

Returns a TensorDict of size size with empty tensors.

By default, the returned TensorDict has the same torch.dtype and torch.device as this tensordict.

Parameters:

size (int...) – a list, tuple, or torch.Size of integers defining the shape of the output tensor.

Keyword Arguments:

dtype (torch.dtype, optional) – the desired type of returned tensordict. Default: if None, the torch.dtype will be unchanged.
device (torch.device, optional) – the desired device of returned tensordict. Default: if None, the torch.device will be unchanged.
requires_grad (bool, optional) – If autograd should record operations on the returned tensors. Default: False.
layout (torch.layout, optional) – the desired layout of returned TensorDict values. Default: torch.strided.
pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.

new_full(size: Size, fill_value, *, dtype: Optional[dtype] = None, device: Union[device, str, int] = _NoDefault.ZERO, requires_grad: bool = False, layout: layout = torch.strided, pin_memory: Optional[bool] = None)

Returns a TensorDict of size size filled with fill_value.

By default, the returned TensorDict has the same torch.dtype and torch.device as this tensordict.

Parameters:

size (sequence of int) – a list, tuple, or torch.Size of integers defining the shape of the output tensor.
fill_value (scalar) – the number to fill the output tensor with.

Keyword Arguments:

dtype (torch.dtype, optional) – the desired type of returned tensordict. Default: if None, the torch.dtype will be unchanged.
device (torch.device, optional) – the desired device of returned tensordict. Default: if None, the torch.device will be unchanged.
requires_grad (bool, optional) – If autograd should record operations on the returned tensors. Default: False.
layout (torch.layout, optional) – the desired layout of returned TensorDict values. Default: torch.strided.
pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.

new_ones(*size: Size, dtype: Optional[dtype] = None, device: Union[device, str, int] = _NoDefault.ZERO, requires_grad: bool = False, layout: layout = torch.strided, pin_memory: Optional[bool] = None)

Returns a TensorDict of size size filled with 1.

By default, the returned TensorDict has the same torch.dtype and torch.device as this tensordict.

Parameters:

size (int...) – a list, tuple, or torch.Size of integers defining the shape of the output tensor.

Keyword Arguments:

dtype (torch.dtype, optional) – the desired type of returned tensordict. Default: if None, the torch.dtype will be unchanged.
device (torch.device, optional) – the desired device of returned tensordict. Default: if None, the torch.device will be unchanged.
requires_grad (bool, optional) – If autograd should record operations on the returned tensors. Default: False.
layout (torch.layout, optional) – the desired layout of returned TensorDict values. Default: torch.strided.
pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.

new_tensor(data: torch.Tensor | tensordict.base.TensorDictBase, *, dtype: Optional[dtype] = None, device: Union[device, str, int] = _NoDefault.ZERO, requires_grad: bool = False, pin_memory: Optional[bool] = None)

Returns a new TensorDict with data as the tensor data.

By default, the returned TensorDict values have the same torch.dtype and torch.device as this tensordict.

The data can also be a tensor collection (TensorDict or tensorclass), in which case the new_tensor method iterates over the tensor pairs of self and data.

Parameters:

data (torch.Tensor or TensorDictBase) – the data to be copied.

Keyword Arguments:

dtype (torch.dtype, optional) – the desired type of returned tensordict. Default: if None, the torch.dtype will be unchanged.
device (torch.device, optional) – the desired device of returned tensordict. Default: if None, the torch.device will be unchanged.
requires_grad (bool, optional) – If autograd should record operations on the returned tensors. Default: False.
pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.

new_zeros(*size: Size, dtype: Optional[dtype] = None, device: Union[device, str, int] = _NoDefault.ZERO, requires_grad: bool = False, layout: layout = torch.strided, pin_memory: Optional[bool] = None)

Returns a TensorDict of size size filled with 0.

By default, the returned TensorDict has the same torch.dtype and torch.device as this tensordict.

Parameters:

size (int...) – a list, tuple, or torch.Size of integers defining the shape of the output tensor.

Keyword Arguments:

dtype (torch.dtype, optional) – the desired type of returned tensordict. Default: if None, the torch.dtype will be unchanged.
device (torch.device, optional) – the desired device of returned tensordict. Default: if None, the torch.device will be unchanged.
requires_grad (bool, optional) – If autograd should record operations on the returned tensors. Default: False.
layout (torch.layout, optional) – the desired layout of returned TensorDict values. Default: torch.strided.
pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.

non_tensor_items(include_nested: bool = False): Returns all non-tensor leaves, maybe recursively.

norm(*, out=None, dtype: torch.dtype | None = None)

Computes the norm of each tensor in the tensordict.

Keyword Arguments:

out (TensorDict, optional) – the output tensordict.
dtype (torch.dtype, optional) – the output dtype (torch>=2.4).

numel() → int

Total number of elements in the batch.

Lower-bounded to 1: since a stack of two tensordicts with an empty shape has two elements, a single tensordict is considered to hold at least one element.

numpy()

Converts a tensordict to a (possibly nested) dictionary of numpy arrays.

Non-tensor data is exposed as such.

Examples

>>> from tensordict import TensorDict
>>> import torch
>>> data = TensorDict({"a": {"b": torch.zeros(()), "c": "a string!"}})
>>> print(data)
TensorDict(
    fields={
        a: TensorDict(
            fields={
                b: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
                c: NonTensorData(data=a string!, batch_size=torch.Size([]), device=None)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
>>> print(data.numpy())
{'a': {'b': array(0., dtype=float32), 'c': 'a string!'}}

param_count(*, count_duplicates: bool = True) → int

Counts the number of parameters (total number of indexable items), accounting for tensors only.

Keyword Arguments:: count_duplicates (bool) – Whether to count duplicated tensor as independent or not. If False, only strictly identical tensors will be discarded (same views but different ids from a common base tensor will be counted twice). Defaults to True (each tensor is assumed to be a single copy).

permute(*args, **kwargs)

Returns a view of a tensordict with the batch dimensions permuted according to dims.

Parameters:

*dims_list (int) – the new ordering of the batch dims of the tensordict. Alternatively, a single iterable of integers can be provided.
dims (list of int) – alternative way of calling permute(…).

Returns:

a new tensordict with the batch dimensions in the desired order.

Examples

>>> tensordict = TensorDict({"a": torch.randn(3, 4, 5)}, [3, 4])
>>> print(tensordict.permute([1, 0]))
PermutedTensorDict(
    source=TensorDict(
        fields={
            a: Tensor(torch.Size([3, 4, 5]), dtype=torch.float32)},
        batch_size=torch.Size([3, 4]),
        device=cpu,
        is_shared=False),
    op=permute(dims=[1, 0]))
>>> print(tensordict.permute(1, 0))
PermutedTensorDict(
    source=TensorDict(
        fields={
            a: Tensor(torch.Size([3, 4, 5]), dtype=torch.float32)},
        batch_size=torch.Size([3, 4]),
        device=cpu,
        is_shared=False),
    op=permute(dims=[1, 0]))
>>> print(tensordict.permute(dims=[1, 0]))
PermutedTensorDict(
    source=TensorDict(
        fields={
            a: Tensor(torch.Size([3, 4, 5]), dtype=torch.float32)},
        batch_size=torch.Size([3, 4]),
        device=cpu,
        is_shared=False),
    op=permute(dims=[1, 0]))

pin_memory(num_threads: Optional[int] = None, inplace: bool = False) → T

Calls pin_memory() on the stored tensors.

Parameters:

num_threads (int or str) – if provided, the number of threads to use to call pin_memory on the leaves. Defaults to None, which sets a high number of threads in ThreadPoolExecutor(max_workers=None). To execute all the calls to pin_memory() on the main thread, pass num_threads=0.
inplace (bool, optional) – if True, the tensordict is modified in-place. Defaults to False.

pin_memory_(num_threads: int | str = 0) → T

Calls pin_memory() on the stored tensors and returns the TensorDict modified in-place.

Parameters:: num_threads (int or str) – if provided, the number of threads to use to call pin_memory on the leaves. If "auto" is passed, the number of threads is automatically determined.

pop(key: NestedKey, default: Any = _NoDefault.ZERO) → Tensor

Removes and returns a value from a tensordict.

If the value is not present and no default value is provided, a KeyError is thrown.

Parameters:

key (str or nested key) – the entry to look for.
default (Any, optional) – the value to return if the key cannot be found.

Examples

>>> td = TensorDict({"1": 1}, [])
>>> one = td.pop("1")
>>> assert one == 1
>>> none = td.pop("1", default=None)
>>> assert none is None

popitem() → Tuple[NestedKey, Tensor]

Removes the item that was last inserted into the TensorDict.

popitem will only return non-nested values.

pow(other: tensordict.base.TensorDictBase | torch.Tensor, *, default: str | torch.Tensor | None = None) → T

Raises each element in self to the power other and returns a tensordict with the result.

other can be either a single float number, a Tensor or a TensorDict.

When other is a tensor, the shapes of input and other must be broadcastable.

Parameters:: other (float, tensor or tensordict) – the exponent value
Keyword Arguments:: default (torch.Tensor or str, optional) – the default value to use for exclusive entries. If none is provided, the two tensordicts key list must match exactly. If default="intersection" is passed, only the intersecting key sets will be considered and other keys will be ignored. In all other cases, default will be used for all missing entries on both sides of the operation.

pow_(other: tensordict.base.TensorDictBase | torch.Tensor) → T: In-place version of pow().

Note

In-place pow does not support the default keyword argument.

prod(dim: Union[int, Tuple[int], Literal['feature']] = _NoDefault.ZERO, keepdim: bool = _NoDefault.ZERO, *, dtype: Optional[dtype] = None, reduce: Optional[bool] = None) → tensordict.base.TensorDictBase | torch.Tensor

Returns the product of values of all elements in the input tensordict.

Parameters:

dim (int, tuple of int, str, optional) – if None, returns a dimensionless tensordict containing the prod value of all leaves (if this can be computed). If integer or tuple of integers, prod is called upon the dimension specified if and only if this dimension is compatible with the tensordict shape. The only string currently permitted is "feature": using dim="feature" reduces over all feature dimensions. If reduce=True, a tensor of the shape of the TensorDict’s batch-size will be returned. Otherwise, a new tensordict with the same structure as self with reduced feature dimensions will be returned.
keepdim (bool) – whether the output tensor has dim retained or not.

Keyword Arguments:

dtype (torch.dtype, optional) – the desired data type of returned tensor. If specified, the input tensor is cast to dtype before the operation is performed. This is useful for preventing data type overflows. Default: None.
reduce (bool, optional) – if True, the reduction will occur across all TensorDict values and a single reduced tensor will be returned. Defaults to False.

Examples

>>> from tensordict import TensorDict
>>> import torch
>>> td = TensorDict(
...     a=torch.randn(3, 4, 5),
...     b=TensorDict(
...         c=torch.randn(3, 4, 5, 6),
...         d=torch.randn(3, 4, 5),
...         batch_size=(3, 4, 5),
...     ),
...     batch_size=(3, 4)
... )
>>> td.prod(dim=0)
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([4, 5, 6]), device=cpu, dtype=torch.float32, is_shared=False),
                d: Tensor(shape=torch.Size([4, 5]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([4, 5]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([4]),
    device=None,
    is_shared=False)
>>> td.prod()
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False),
                d: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
>>> td.prod(reduce=True)
tensor(-0.)
>>> td.prod(dim="feature")
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([3, 4]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([3, 4, 5]), device=cpu, dtype=torch.float32, is_shared=False),
                d: Tensor(shape=torch.Size([3, 4, 5]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([3, 4, 5]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([3, 4]),
    device=None,
    is_shared=False)
>>> td = TensorDict(
...     a=torch.ones(3, 4, 5),
...     b=TensorDict(
...         c=torch.ones(3, 4, 5),
...         d=torch.ones(3, 4, 5),
...         batch_size=(3, 4, 5),
...     ),
...     batch_size=(3, 4)
... )
>>> td.prod(reduce=True, dim="feature")
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
>>> td.prod(reduce=True, dim=0)
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])

qint32(): Casts all tensors to torch.qint32.

qint8(): Casts all tensors to torch.qint8.

quint4x2(): Casts all tensors to torch.quint4x2.

quint8(): Casts all tensors to torch.quint8.

reciprocal() → T: Computes the reciprocal() value of each element of the TensorDict.

reciprocal_() → T: Computes the reciprocal() value of each element of the TensorDict in-place.

record_stream(stream: Stream) → T

Marks the tensordict as having been used by this stream.

When the tensordict is deallocated, ensure the tensor memory is not reused for other tensors until all work queued on stream at the time of deallocation is complete.

See record_stream() for more information.

recv(src: int, *, group: 'torch.distributed.ProcessGroup' | None = None, init_tag: int = 0, pseudo_rand: bool = False) → int

Receives the content of a tensordict and updates content with it.

Check the example in the send method for context.

Parameters:

src (int) – the rank of the source worker.

Keyword Arguments:

group (torch.distributed.ProcessGroup, optional) – if set, the specified process group will be used for communication. Otherwise, the default process group will be used. Defaults to None.
init_tag (int) – the init_tag used by the source worker.
pseudo_rand (bool) – if True, the sequence of tags will be pseudo-random, allowing multiple pieces of data to be sent from different nodes without overlap. Note that generating these pseudo-random numbers is expensive (1e-5 sec/number), so it could slow down your algorithm. This value must match the one passed to send(). Defaults to False.

reduce(dst, op=None, async_op=False, return_premature=False, group=None) → None

Reduces the tensordict across all machines.

Only the process with rank dst is going to receive the final result.

refine_names(*names) → T

Refines the dimension names of self according to names.

Refining is a special case of renaming that “lifts” unnamed dimensions. A None dim can be refined to have any name; a named dim can only be refined to have the same name.

Because named tensors can coexist with unnamed tensors, refining names gives a nice way to write named-tensor-aware code that works with both named and unnamed tensors.

names may contain up to one Ellipsis (…). The Ellipsis is expanded greedily; it is expanded in-place to fill names to the same length as self.dim() using names from the corresponding indices of self.names.

Returns: the same tensordict with dimensions named according to the input.

Examples

>>> td = TensorDict({}, batch_size=[3, 4, 5, 6])
>>> tdr = td.refine_names(None, None, None, "d")
>>> assert tdr.names == [None, None, None, "d"]
>>> tdr = td.refine_names("a", None, None, "d")
>>> assert tdr.names == ["a", None, None, "d"]

rename(*names, **rename_map)

Returns a clone of the tensordict with dimensions renamed.

Examples

>>> td = TensorDict({}, batch_size=[1, 2, 3, 4])
>>> td.names = list("abcd")
>>> td_rename = td.rename(c="g")
>>> assert td_rename.names == list("abgd")

rename_(*names, **rename_map)

Same as rename(), but executes the renaming in-place.

Examples

>>> td = TensorDict({}, batch_size=[1, 2, 3, 4])
>>> td.names = list("abcd")
>>> _ = td.rename_(c="g")  # renames in-place; a tensordict cannot be used as a boolean in an assert
>>> assert td.names == list("abgd")

rename_key_(old_key: NestedKey, new_key: NestedKey, safe: bool = False) → T

Renames a key with a new string and returns the same tensordict with the updated key name.

Parameters:

old_key (str or nested key) – key to be renamed.
new_key (str or nested key) – new name of the entry.
safe (bool, optional) – if True, an error is thrown when the new key is already present in the TensorDict.

Returns:

self

repeat(*repeats: int) → TensorDictBase

Repeats this tensor along the specified dimensions.

Unlike expand(), this function copies the tensor’s data.

Warning

repeat() behaves differently from numpy.repeat(), but is more similar to numpy.tile(). For the operator similar to numpy.repeat(), see repeat_interleave().

Parameters:

*repeats (torch.Size, int..., tuple of int or list of int) – The number of times to repeat this tensordict along each dimension.

Examples

>>> import torch
>>>
>>> from tensordict import TensorDict
>>>
>>> td = TensorDict(
...     {
...         "a": torch.randn(3, 4, 5),
...         "b": TensorDict({
...             "c": torch.randn(3, 4, 10, 1),
...             "a string": "a string!",
...         }, batch_size=[3, 4, 10])
...     }, batch_size=[3, 4],
... )
>>> print(td.repeat(1, 2))
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([3, 8, 5]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                a string: NonTensorData(data=a string!, batch_size=torch.Size([3, 8, 10]), device=None),
                c: Tensor(shape=torch.Size([3, 8, 10, 1]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([3, 8, 10]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([3, 8]),
    device=None,
    is_shared=False)

repeat_interleave(repeats: torch.Tensor | int, dim: Optional[int] = None, *, output_size: Optional[int] = None) → TensorDictBase

Repeat elements of a TensorDict.

Warning

This is different from repeat() but similar to numpy.repeat().

Parameters:

repeats (torch.Tensor or int) – The number of repetitions for each element. repeats is broadcast to fit the shape of the given axis.
dim (int, optional) – The dimension along which to repeat values. By default, use the flattened input array, and return a flat output array.

Keyword Arguments:

output_size (int, optional) – Total output size for the given axis (e.g. sum of repeats). If given, it will avoid stream synchronization needed to calculate output shape of the tensordict.

Returns:

Repeated TensorDict which has the same shape as input, except along the given axis.

Examples

>>> import torch
>>>
>>> from tensordict import TensorDict
>>>
>>> td = TensorDict(
...     {
...         "a": torch.randn(3, 4, 5),
...         "b": TensorDict({
...             "c": torch.randn(3, 4, 10, 1),
...             "a string": "a string!",
...         }, batch_size=[3, 4, 10])
...     }, batch_size=[3, 4],
... )
>>> print(td.repeat_interleave(2, dim=0))
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([6, 4, 5]), device=cpu, dtype=torch.float32, is_shared=False),
        b: TensorDict(
            fields={
                a string: NonTensorData(data=a string!, batch_size=torch.Size([6, 4, 10]), device=None),
                c: Tensor(shape=torch.Size([6, 4, 10, 1]), device=cpu, dtype=torch.float32, is_shared=False)},
            batch_size=torch.Size([6, 4, 10]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([6, 4]),
    device=None,
    is_shared=False)

replace(*args, **kwargs)

Creates a shallow copy of the tensordict where entries have been replaced.

Accepts one unnamed argument, which must be a dictionary or a TensorDictBase subclass. Additionally, first-level entries can be updated with named keyword arguments.

Returns:

a copy of self with updated entries if the input is non-empty. If an empty dict or no dict is provided and the kwargs are empty, self is returned.

requires_grad_(requires_grad=True) → T

Changes whether autograd should record operations on this tensordict: sets the requires_grad attribute of its tensors in-place.

Returns this tensordict.

Parameters:

requires_grad (bool, optional) – whether or not autograd should record operations on this tensordict. Defaults to True.

reshape(*args, **kwargs) → T

Returns a contiguous, reshaped tensor of the desired shape.

Parameters:

*shape (int) – new shape of the resulting tensordict.

Returns:

A TensorDict with reshaped keys

Examples

>>> td = TensorDict({
...     'x': torch.arange(12).reshape(3, 4),
... }, batch_size=[3, 4])
>>> td = td.reshape(12)
>>> print(td['x'])
tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

round() → T: Computes the round() value of each element of the TensorDict.

round_() → T: Computes the round() value of each element of the TensorDict in-place.

save(prefix: Optional[str] = None, copy_existing: bool = False, *, num_threads: int = 0, return_early: bool = False, share_non_tensor: bool = False) → T

Saves the tensordict to disk.

This function is a proxy to memmap().

property saved_path

Returns the path where a memmap saved TensorDict is being stored.

This attribute vanishes as soon as is_memmap() returns False (e.g., when the tensordict is unlocked).

select(*keys: NestedKey, inplace: bool = False, strict: bool = True) → T

Selects the keys of the tensordict and returns a new tensordict with only the selected keys.

The values are not copied: an in-place modification of a tensor in either the original or the new tensordict will be reflected in both tensordicts.

Parameters:

*keys (str) – keys to select
inplace (bool) – if True, the tensordict is pruned in place. Default is False.
strict (bool, optional) – whether selecting a key that is not present will return an error or not. Default: True.

Returns:

A new tensordict (or the same if inplace=True) with the selected keys only.

Note

To select keys in a tensordict and return a version of this tensordict deprived of these keys, see the split_keys() method.

Examples

>>> from tensordict import TensorDict
>>> td = TensorDict({"a": 0, "b": {"c": 1, "d": 2}}, [])
>>> td.select("a", ("b", "c"))
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
>>> td.select("a", "b")
TensorDict(
    fields={
        a: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False),
        b: TensorDict(
            fields={
                c: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False),
                d: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int64, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
>>> td.select("this key does not exist", strict=False)
TensorDict(
    fields={
    },
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)

send(dst: int, *, group: 'torch.distributed.ProcessGroup' | None = None, init_tag: int = 0, pseudo_rand: bool = False) → None

Sends the content of a tensordict to a distant worker.

Parameters:

dst (int) – the rank of the destination worker where the content should be sent.

Keyword Arguments:

group (torch.distributed.ProcessGroup, optional) – if set, the specified process group will be used for communication. Otherwise, the default process group will be used. Defaults to None.
init_tag (int) – the initial tag to be used to mark the tensors. Note that this will be incremented by as much as the number of tensors contained in the TensorDict.
pseudo_rand (bool) – if True, the sequence of tags will be pseudo-random, which makes it possible to send multiple pieces of data from different nodes without overlap. Note that generating these pseudo-random numbers is expensive (1e-5 sec/number) and may slow down your algorithm. Defaults to False.

Example

>>> from torch import multiprocessing as mp
>>> from tensordict import TensorDict
>>> import torch
>>>
>>>
>>> def client():
...     torch.distributed.init_process_group(
...         "gloo",
...         rank=1,
...         world_size=2,
...         init_method=f"tcp://localhost:10003",
...     )
...
...     td = TensorDict(
...         {
...             ("a", "b"): torch.randn(2),
...             "c": torch.randn(2, 3),
...             "_": torch.ones(2, 1, 5),
...         },
...         [2],
...     )
...     td.send(0)
...
>>>
>>> def server(queue):
...     torch.distributed.init_process_group(
...         "gloo",
...         rank=0,
...         world_size=2,
...         init_method=f"tcp://localhost:10003",
...     )
...     td = TensorDict(
...         {
...             ("a", "b"): torch.zeros(2),
...             "c": torch.zeros(2, 3),
...             "_": torch.zeros(2, 1, 5),
...         },
...         [2],
...     )
...     td.recv(1)
...     assert (td != 0).all()
...     queue.put("yuppie")
...
>>>
>>> if __name__=="__main__":
...     queue = mp.Queue(1)
...     main_worker = mp.Process(target=server, args=(queue,))
...     secondary_worker = mp.Process(target=client)
...
...     main_worker.start()
...     secondary_worker.start()
...     out = queue.get(timeout=10)
...     assert out == "yuppie"
...     main_worker.join()
...     secondary_worker.join()

separates(*keys: NestedKey, default: Any = _NoDefault.ZERO, strict: bool = True, filter_empty: bool = True) → T

Separates the specified keys from the tensordict in-place.
