Managing Tensor Memory in C++¶
Author: Anthony Shoumikhin
Tensors are fundamental data structures in ExecuTorch, representing multi-dimensional arrays used in computations for neural networks and other numerical algorithms. In ExecuTorch, the Tensor class doesn’t own its metadata (sizes, strides, dim_order) or data, keeping the runtime lightweight. Users are responsible for supplying all these memory buffers and ensuring that the metadata and data outlive the Tensor instance. While this design is lightweight and flexible, especially for tiny embedded systems, it places a significant burden on the user. If your environment requires minimal dynamic allocations, a small binary footprint, or limited C++ standard library support, you’ll need to accept that trade-off and stick with the regular Tensor type.
Imagine you’re working with a Module interface, and you need to pass a Tensor to the forward() method. You would need to declare and maintain at least the sizes array and data separately, sometimes the strides too, often leading to the following pattern:
#include <executorch/extension/module/module.h>
using namespace executorch::aten;
using namespace executorch::extension;
SizesType sizes[] = {2, 3};
DimOrderType dim_order[] = {0, 1};
StridesType strides[] = {3, 1};
float data[] = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f};
TensorImpl tensor_impl(
    ScalarType::Float,
    std::size(sizes),
    sizes,
    data,
    dim_order,
    strides);
// ...
module.forward(Tensor(&tensor_impl));
You must ensure sizes, dim_order, strides, and data stay valid. This makes code maintenance difficult and error-prone. Users have struggled to manage lifetimes, and many have created their own ad-hoc managed tensor abstractions to hold all the pieces together, leading to a fragmented and inconsistent ecosystem.
Introducing TensorPtr¶
To alleviate these issues, ExecuTorch provides TensorPtr, a smart pointer that manages the lifecycle of both the tensor’s data and its dynamic metadata.
With TensorPtr, users no longer need to worry about metadata lifetimes separately. Data ownership is determined by how the data is supplied: data passed by pointer is not owned, while data moved into the TensorPtr as a std::vector is owned. Everything is bundled in one place and managed automatically, enabling you to focus on actual computations.
Here’s how you can use it:
#include <executorch/extension/module/module.h>
#include <executorch/extension/tensor/tensor.h>
using namespace executorch::extension;
auto tensor = make_tensor_ptr(
    {2, 3},                                 // sizes
    {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f});  // data
// ...
module.forward(tensor);
The data is now owned by the tensor instance because it’s provided as a vector. To create a non-owning TensorPtr, just pass the data by pointer. The type is deduced automatically from the data vector (float). The strides and dim_order default to values computed from the sizes unless they are specified explicitly as additional arguments.
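As an illustration of what "default values based on the sizes" means for a contiguous tensor, here is a minimal standalone sketch. It does not use ExecuTorch; default_strides and default_dim_order are hypothetical helper names, and the element types are assumptions chosen for the example. It assumes the usual row-major layout, which agrees with the strides {3, 1} used for sizes {2, 3} earlier on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Row-major (contiguous) strides: the last dimension has stride 1 and each
// earlier stride is the product of all later sizes.
std::vector<int32_t> default_strides(const std::vector<int32_t>& sizes) {
  std::vector<int32_t> strides(sizes.size(), 1);
  for (std::size_t i = sizes.size(); i-- > 1;) {
    strides[i - 1] = strides[i] * sizes[i];
  }
  return strides;
}

// Default dimension order is 0, 1, ..., rank - 1.
std::vector<uint8_t> default_dim_order(std::size_t rank) {
  assert(rank <= 255);  // the order is stored in 8-bit entries in this sketch
  std::vector<uint8_t> order(rank);
  std::iota(order.begin(), order.end(), static_cast<uint8_t>(0));
  return order;
}
```

For sizes {2, 3} this yields strides {3, 1} and dim_order {0, 1}, the same values written out by hand in the TensorImpl example.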
EValue in Module::forward() accepts TensorPtr directly, ensuring seamless integration. EValue can now be constructed implicitly with a smart pointer to any type that it can hold. This allows TensorPtr to be dereferenced implicitly when passed to forward(), and EValue will hold the Tensor that the TensorPtr points to.
API Overview¶
TensorPtr is an alias for std::shared_ptr<Tensor>, so you can copy and pass it around without duplicating the data and metadata. Each Tensor instance may either own its data or reference external data.
Creating Tensors¶
There are several ways to create a TensorPtr.
Creating Scalar Tensors¶
You can create a scalar tensor, i.e. a tensor with zero dimensions or with one of the sizes being zero.
Providing A Single Data Value
auto tensor = make_tensor_ptr(3.14);
The resulting tensor will contain a single value 3.14 of type double, which is deduced automatically.
Providing A Single Data Value with a Type
auto tensor = make_tensor_ptr(42, ScalarType::Float);
Now the integer 42 will be cast to float and the tensor will contain a single value 42 of type float.
Owning Data from a Vector¶
When you provide sizes and data vectors, TensorPtr takes ownership of both the data and the sizes.
Providing Data Vector
auto tensor = make_tensor_ptr(
    {2, 3},                                 // sizes
    {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f});  // data (float)
The type is deduced automatically as ScalarType::Float from the data vector.
Providing Data Vector with a Type
If you provide data of one type but specify a different scalar type, the data will be cast to the given type.
auto tensor = make_tensor_ptr(
    {1, 2, 3, 4, 5, 6},   // data (int)
    ScalarType::Double);  // double scalar type
In this example, even though the data vector contains integers, we specify the scalar type as Double. The integers are cast to double, and the new data vector is owned by the TensorPtr. Since the sizes argument is omitted in this example, the tensor is one-dimensional with a size equal to the length of the data vector. Note that the reverse cast, from a floating-point type to an integral type, is not allowed because it loses precision. Similarly, casting other types to Bool is disallowed.
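The two casting restrictions just described can be written down as a compile-time predicate. This is only a sketch of the rule as stated in the text, not ExecuTorch’s actual check: cast_allowed is a hypothetical name, and any conversion the text does not mention is simply treated as allowed here.

```cpp
#include <type_traits>

// Hypothetical predicate mirroring only the two restrictions named above:
//  - nothing other than bool converts to bool,
//  - floating-point sources do not convert to integral targets.
template <typename From, typename To>
constexpr bool cast_allowed() {
  return std::is_same<To, bool>::value
      ? std::is_same<From, bool>::value
      : !(std::is_floating_point<From>::value &&
          std::is_integral<To>::value);
}

// The int -> double case from the example above is a permitted widening cast.
static_assert(cast_allowed<int, double>(), "int data may be stored as double");
```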
Providing Data Vector as std::vector<uint8_t>
You can also provide raw data in the form of a std::vector<uint8_t>, specifying the sizes and scalar type. The data will be reinterpreted according to the provided type.
std::vector<uint8_t> data = /* raw data */;
auto tensor = make_tensor_ptr(
    {2, 3},            // sizes
    std::move(data),   // data as uint8_t vector
    ScalarType::Int);  // int scalar type
The data vector must be large enough to accommodate all the elements according to the provided sizes and scalar type.
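To make "large enough" concrete: the buffer needs at least (number of elements) × (bytes per element) bytes. The following standalone sketch shows that arithmetic; required_bytes and buffer_is_large_enough are hypothetical helpers, not part of the ExecuTorch API, and the element size is passed in explicitly rather than derived from a ScalarType.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of bytes needed to hold a tensor of the given sizes.
std::size_t required_bytes(const std::vector<int32_t>& sizes,
                           std::size_t element_size) {
  std::size_t element_count = 1;
  for (int32_t size : sizes) {
    assert(size >= 0);  // negative sizes are not meaningful in this sketch
    element_count *= static_cast<std::size_t>(size);
  }
  return element_count * element_size;
}

// True when a raw byte buffer can back a tensor of that shape and type.
bool buffer_is_large_enough(const std::vector<uint8_t>& buffer,
                            const std::vector<int32_t>& sizes,
                            std::size_t element_size) {
  return buffer.size() >= required_bytes(sizes, element_size);
}
```

For the {2, 3} tensor of 32-bit integers above, that works out to 6 × 4 = 24 bytes.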
Non-Owning Data from Raw Pointer¶
You can create a TensorPtr that references existing data without taking ownership.
Providing Raw Data
float data[] = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f};
auto tensor = make_tensor_ptr(
    {2, 3},              // sizes
    data,                // raw data pointer
    ScalarType::Float);  // float scalar type
The TensorPtr does not own the data, so you must ensure the data remains valid for as long as the tensor is in use.
Providing Raw Data with Custom Deleter
If you want the TensorPtr to manage the lifetime of the data, you can provide a custom deleter.
auto* data = new double[6]{1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
auto tensor = make_tensor_ptr(
    {2, 3},                              // sizes
    data,                                // data pointer
    ScalarType::Double,                  // double scalar type
    TensorShapeDynamism::DYNAMIC_BOUND,  // default dynamism
    [](void* ptr) { delete[] static_cast<double*>(ptr); });
The TensorPtr will call the custom deleter when it is destroyed, i.e., when the smart pointer is reset and no more references to the underlying Tensor exist.
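The "runs once, after the last reference goes away" behavior comes from std::shared_ptr itself, so it can be observed without ExecuTorch. The sketch below uses a plain std::shared_ptr<double> as a stand-in for a TensorPtr; run_deleter_demo is a hypothetical name introduced only for this illustration.

```cpp
#include <cassert>
#include <memory>

// Returns how many times the deleter ran after every owner was released,
// or -1 if it ran too early.
int run_deleter_demo() {
  int deleter_calls = 0;
  auto* raw = new double[6]{1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
  std::shared_ptr<double> first(raw, [&deleter_calls](double* ptr) {
    delete[] ptr;
    ++deleter_calls;
  });
  std::shared_ptr<double> second = first;  // second owner of the same buffer
  assert(first.use_count() == 2);

  first.reset();  // one owner remains, so the deleter must not run yet
  const int calls_with_one_owner = deleter_calls;

  second.reset();  // last owner released, deleter runs now
  return calls_with_one_owner == 0 ? deleter_calls : -1;
}
```

Calling run_deleter_demo() returns 1: the buffer is freed exactly once, and only after both owners have been reset.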
Sharing Existing Tensor¶
Since TensorPtr is a std::shared_ptr<Tensor>, you can easily create a TensorPtr that shares an existing Tensor. Any changes made to the shared data are reflected across all instances sharing the same data.
Sharing Existing TensorPtr
auto tensor = make_tensor_ptr({2, 3}, {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f});
auto tensor_copy = tensor;
Now tensor and tensor_copy point to the same data and metadata.
Viewing Existing Tensor¶
You can create a TensorPtr from an existing Tensor, copying its properties and referencing the same data.
Viewing Existing Tensor
Tensor original_tensor = /* some existing tensor */;
auto tensor = make_tensor_ptr(original_tensor);
Now the newly created TensorPtr references the same data as the original tensor but has its own copy of the metadata, so it can interpret or “view” the data differently. Any modifications to the data will be reflected in the original Tensor as well.
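The idea of separate metadata over shared data can be shown with a small standalone model. BufferView below is a hypothetical stand-in for a tensor view (it is not an ExecuTorch type): each view owns its own sizes but only borrows the underlying buffer, so a write through one view is visible through the other.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a tensor view: owns its sizes, borrows its data.
struct BufferView {
  std::vector<int32_t> sizes;  // per-view metadata copy
  float* data;                 // shared, non-owning pointer

  // Row-major element access for a two-dimensional view.
  float& at(int32_t row, int32_t col) {
    assert(sizes.size() == 2 && row < sizes[0] && col < sizes[1]);
    return data[static_cast<std::size_t>(row) * sizes[1] + col];
  }
};

// Writes through one view and reads the same element back through another.
float write_then_read_through_views() {
  float storage[6] = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f};
  BufferView original{{2, 3}, storage};
  BufferView reshaped{{3, 2}, storage};  // same bytes, different shape
  reshaped.at(2, 1) = 42.0f;             // flat index 2 * 2 + 1 == 5
  return original.at(1, 2);              // flat index 1 * 3 + 2 == 5
}
```

Both accesses resolve to flat index 5, so the value written through the {3, 2} view is the one read back through the {2, 3} view.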
Cloning Tensors¶
To create a new TensorPtr that owns a copy of the data from an existing tensor:
Tensor original_tensor = /* some existing tensor */;
auto tensor = clone_tensor_ptr(original_tensor);
The newly created TensorPtr has its own copy of the data, so it can modify and manage it independently.
Likewise, you can create a clone of an existing TensorPtr.
auto original_tensor = make_tensor_ptr(/* ... */);
auto tensor = clone_tensor_ptr(original_tensor);
Note that, regardless of whether the original TensorPtr owns the data or not, the newly created TensorPtr will own a copy of the data.
Resizing Tensors¶
The TensorShapeDynamism enum specifies the mutability of a tensor’s shape:
STATIC: The tensor’s shape cannot be changed.
DYNAMIC_BOUND: The tensor’s shape can be changed but cannot contain more elements than it originally had at creation based on the initial sizes.
DYNAMIC: The tensor’s shape can be changed arbitrarily. Currently, DYNAMIC is an alias for DYNAMIC_BOUND.
When resizing a tensor, you must respect its dynamism setting. Resizing is only allowed for tensors with DYNAMIC or DYNAMIC_BOUND shapes, and you cannot resize DYNAMIC_BOUND tensors to contain more elements than they had initially.
auto tensor = make_tensor_ptr(
    {2, 3},              // sizes
    {1, 2, 3, 4, 5, 6},  // data
    ScalarType::Int,
    TensorShapeDynamism::DYNAMIC_BOUND);
// Initial sizes: {2, 3}
// Number of elements: 6
resize_tensor_ptr(tensor, {2, 2});
// The tensor sizes are now {2, 2}
// Number of elements is 4 < initial 6
resize_tensor_ptr(tensor, {1, 3});
// The tensor sizes are now {1, 3}
// Number of elements is 3 < initial 6
resize_tensor_ptr(tensor, {3, 2});
// The tensor sizes are now {3, 2}
// Number of elements is 6 == initial 6
resize_tensor_ptr(tensor, {6, 1});
// The tensor sizes are now {6, 1}
// Number of elements is 6 == initial 6
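The element-count rule that the resize calls above obey can be sketched in standalone C++. This is an illustration of the rule as described on this page, not ExecuTorch’s implementation: ShapeDynamism, element_count, and resize_is_allowed are hypothetical names, DYNAMIC is treated the same as DYNAMIC_BOUND because the text says it is currently an alias, and allowing a STATIC tensor to be "resized" to its existing shape is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirror of the three dynamism levels described above.
enum class ShapeDynamism { STATIC, DYNAMIC_BOUND, DYNAMIC };

std::size_t element_count(const std::vector<int32_t>& sizes) {
  std::size_t count = 1;
  for (int32_t size : sizes) {
    assert(size >= 0);  // negative sizes are not meaningful in this sketch
    count *= static_cast<std::size_t>(size);
  }
  return count;
}

bool resize_is_allowed(ShapeDynamism dynamism,
                       const std::vector<int32_t>& initial_sizes,
                       const std::vector<int32_t>& new_sizes) {
  if (dynamism == ShapeDynamism::STATIC) {
    // Assumption: only the unchanged shape is acceptable for STATIC.
    return new_sizes == initial_sizes;
  }
  // DYNAMIC_BOUND, and DYNAMIC as its current alias: the new shape may not
  // hold more elements than the shape the tensor was created with.
  return element_count(new_sizes) <= element_count(initial_sizes);
}
```

With initial sizes {2, 3}, the shapes {2, 2}, {1, 3}, {3, 2}, and {6, 1} all pass this check, while {4, 2} (8 elements) would not.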
Convenience Helpers¶
ExecuTorch provides several helper functions to create tensors conveniently.
Creating Non-Owning Tensors with for_blob and from_blob¶
These helpers allow you to create tensors that do not own the data.
Using from_blob()
float data[] = {1.0f, 2.0f, 3.0f};
auto tensor = from_blob(
    data,                // data pointer
    {3},                 // sizes
    ScalarType::Float);  // float scalar type
Using for_blob() with Fluent Syntax
double data[] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
auto tensor = for_blob(data, {2, 3}, ScalarType::Double)
                  .strides({3, 1})
                  .dynamism(TensorShapeDynamism::STATIC)
                  .make_tensor_ptr();
Using Custom Deleter with from_blob()
int* data = new int[3]{1, 2, 3};
auto tensor = from_blob(
    data,             // data pointer
    {3},              // sizes
    ScalarType::Int,  // int scalar type
    [](void* ptr) { delete[] static_cast<int*>(ptr); });
The TensorPtr will call the custom deleter when it is destroyed.
Creating Empty Tensors¶
empty() creates an uninitialized tensor with the specified sizes.
auto tensor = empty({2, 3});
empty_like() creates an uninitialized tensor with the same sizes as an existing TensorPtr.
TensorPtr original_tensor = /* some existing tensor */;
auto tensor = empty_like(original_tensor);
And empty_strided() creates an uninitialized tensor with the specified sizes and strides.
auto tensor = empty_strided({2, 3}, {3, 1});
Creating Tensors Filled with Specific Values¶
full(), zeros() and ones() create a tensor filled with a provided value, zeros or ones respectively.
auto tensor_full = full({2, 3}, 42.0f);
auto tensor_zeros = zeros({2, 3});
auto tensor_ones = ones({3, 4});
Similarly to empty(), there are extra helper functions full_like(), full_strided(), zeros_like(), zeros_strided(), ones_like() and ones_strided() to create filled tensors with the same properties as an existing TensorPtr or with custom strides.
Creating Random Tensors¶
rand() creates a tensor filled with random values between 0 and 1.
auto tensor_rand = rand({2, 3});
randn() creates a tensor filled with random values from a normal distribution.
auto tensor_randn = randn({2, 3});
randint() creates a tensor filled with random integers between the specified min (inclusive) and max (exclusive).
auto tensor_randint = randint(0, 10, {2, 3});
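The half-open range (min inclusive, max exclusive) is easy to get wrong when generating comparable reference data with the C++ standard library, because std::uniform_int_distribution is inclusive on both ends. The sketch below is a standalone illustration and not how ExecuTorch generates its values; sample_randint and its seed parameter are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Draws `count` integers from the half-open range [low, high).
std::vector<int64_t> sample_randint(int64_t low, int64_t high,
                                    std::size_t count, uint32_t seed) {
  assert(low < high);  // an empty range has nothing to sample
  std::mt19937 generator(seed);
  // uniform_int_distribution includes both bounds, so the upper bound is
  // high - 1 to keep `high` itself excluded.
  std::uniform_int_distribution<int64_t> distribution(low, high - 1);
  std::vector<int64_t> values(count);
  std::generate(values.begin(), values.end(),
                [&] { return distribution(generator); });
  return values;
}
```

With low = 0 and high = 10, as in the randint(0, 10, {2, 3}) call above, every sampled value lies in 0 through 9.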
Creating Scalar Tensors¶
In addition to make_tensor_ptr() with a single data value, you can create a scalar tensor with scalar_tensor().
auto tensor = scalar_tensor(3.14f);
Note that the scalar_tensor() function expects a value of type Scalar. In ExecuTorch, Scalar can represent bool, int, or floating-point types, but not types like Half or BFloat16. For those, use make_tensor_ptr() instead, which bypasses the Scalar type.
Notes on EValue and Lifetime Management¶
The Module interface expects data in the form of EValue, a variant type that can hold a Tensor or other scalar types. When you pass a TensorPtr to a function expecting an EValue, the TensorPtr is dereferenced implicitly and the EValue holds the underlying Tensor.
TensorPtr tensor = /* create a TensorPtr */;
//...
module.forward(tensor);
You can also pass a vector of EValues for multiple parameters.
TensorPtr tensor = /* create a TensorPtr */;
TensorPtr tensor2 = /* create another TensorPtr */;
//...
module.forward({tensor, tensor2});
However, be cautious: EValue will not hold onto the dynamic data and metadata from the TensorPtr. It merely holds a regular Tensor, which does not own the data or metadata but refers to them using raw pointers. You need to ensure that the TensorPtr remains valid for as long as the EValue is in use.
This also applies when using functions like set_input() or set_output() that expect EValue.
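The hazard can be demonstrated safely with an analogy built from the standard library. In the sketch below, a std::shared_ptr plays the role of the TensorPtr and a std::weak_ptr plays the role of a non-owning reference such as the Tensor inside an EValue. This is only an analogy: a real EValue holds raw pointers and cannot detect that its target is gone, whereas std::weak_ptr is used here precisely so the sketch can observe the expiry without touching freed memory. observer_outlives_owner is a hypothetical name.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Returns true if the non-owning observer is left without a target once the
// only owner has been released.
bool observer_outlives_owner() {
  auto owner = std::make_shared<std::vector<float>>(6, 1.0f);
  // The observer refers to the buffer but does not extend its lifetime.
  std::weak_ptr<std::vector<float>> observer = owner;
  assert(!observer.expired());  // valid while the owner is alive

  owner.reset();  // last owner released, the buffer is destroyed
  return observer.expired();
}
```

The function returns true: once the owner is gone, the observer has nothing valid to refer to, which is why the TensorPtr must be kept alive for as long as the EValue is used.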
Interoperability with ATen¶
If your code is compiled with the preprocessor flag USE_ATEN_LIB enabled, all the TensorPtr APIs will use at:: APIs under the hood. For example, TensorPtr becomes a std::shared_ptr<at::Tensor>. This allows for seamless integration with the PyTorch ATen library.
API Equivalence Table¶
Here’s a table matching TensorPtr creation functions with their corresponding ATen APIs:
ATen | ExecuTorch |
---|---|
(table rows not preserved)
Best Practices¶
Manage Lifetimes Carefully: Even though TensorPtr handles memory management, ensure that any non-owned data (e.g., when using from_blob()) remains valid while the tensor is in use.
Use Convenience Functions: Utilize helper functions for common tensor creation patterns to write cleaner and more readable code.
Be Aware of Data Ownership: Know whether your tensor owns its data or references external data to avoid unintended side effects or memory leaks.
Ensure TensorPtr Outlives EValue: When passing tensors to modules that expect EValue, ensure that the TensorPtr remains valid as long as the EValue is in use.
Conclusion¶
The TensorPtr in ExecuTorch simplifies tensor memory management by bundling the data and dynamic metadata into a smart pointer. This design eliminates the need for users to manage multiple pieces of data separately and makes code safer and easier to maintain.
By providing interfaces similar to PyTorch’s ATen library, ExecuTorch simplifies the adoption of the new API, allowing developers to transition without a steep learning curve.