
Runtime API Reference

The ExecuTorch C++ API provides an on-device execution framework for exported PyTorch models.

For a tutorial-style introduction to the runtime API, check out the runtime tutorial and its simplified version.

For detailed information on how APIs evolve and the deprecation process, please refer to the ExecuTorch API Life Cycle and Deprecation Policy.

Model Loading and Execution


class DataLoader

Loads from a data source.

See //executorch/extension/data_loader for common implementations.

Public Functions

virtual ET_NODISCARD Result<FreeableBuffer> load(size_t offset, size_t size, const SegmentInfo &segment_info) const = 0

Loads data from the underlying data source.

NOTE: This must be thread-safe. If this call modifies common state, the implementation must do its own locking.

Parameters
  • offset – The byte offset in the data source to start loading from.

  • size – The number of bytes to load.

  • segment_info – Information about the segment being loaded.

Returns

a FreeableBuffer that owns the loaded data.

inline virtual ET_NODISCARD Error load_into(size_t offset, size_t size, const SegmentInfo &segment_info, void *buffer) const

Loads data from the underlying data source into the provided buffer.

NOTE: This must be thread-safe. If this call modifies common state, the implementation must do its own locking.

Parameters
  • offset – The byte offset in the data source to start loading from.

  • size – The number of bytes to load.

  • segment_info – Information about the segment being loaded.

  • buffer – The buffer to load data into. Must point to at least size bytes of memory.

Returns

an Error indicating if the load was successful.

virtual ET_NODISCARD Result<size_t> size() const = 0

Returns the length of the underlying data source, typically the file size.
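
The contract above can be illustrated with a minimal, self-contained sketch. BufferLoader and its members are illustrative stand-ins, not the real executorch classes: a real DataLoader returns Result<FreeableBuffer> / Error and receives a SegmentInfo, but the core obligations are the same, namely that load_into() copies size bytes starting at offset into a caller-provided buffer and bounds-checks against size().

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Simplified stand-in for executorch::runtime::Error.
enum class Error { Ok, InvalidArgument };

// Minimal in-memory loader illustrating the DataLoader contract.
class BufferLoader {
 public:
  BufferLoader(const uint8_t* data, size_t len) : data_(data), len_(len) {}

  // Length of the underlying data source.
  size_t size() const { return len_; }

  // Copies `size` bytes starting at `offset` into `buffer`,
  // rejecting reads past the end of the data source.
  Error load_into(size_t offset, size_t size, void* buffer) const {
    if (buffer == nullptr || offset > len_ || size > len_ - offset) {
      return Error::InvalidArgument;
    }
    std::memcpy(buffer, data_ + offset, size);
    return Error::Ok;
  }

 private:
  const uint8_t* data_;
  size_t len_;
};
```

Note that this sketch is stateless per call, which trivially satisfies the thread-safety requirement; an implementation that caches or seeks shared state must add its own locking.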

struct SegmentInfo

Describes the content of the segment.

Public Types

enum class Type

Represents the purpose of the segment.

Values:

enumerator Program

Data for the actual program.

enumerator Constant

Holds constant tensor data.

enumerator Backend

Data used for initializing a backend.

enumerator Mutable

Data used for initializing mutable tensors.

enumerator External

Data used for initializing external tensors.

Public Members

Type segment_type

Type of the segment.

size_t segment_index

Index of the segment within the segment list. Undefined for program segments.

const char *descriptor

An optional, null-terminated string describing the segment. For Backend segments, this is the backend ID. Null for other segment types.

class MemoryAllocator

A simple bump allocator that manages a fixed-size buffer. Each allocation request checks that enough space remains and then advances the internal cur_ pointer by the requested amount.

Simple example:

// User allocates a 100-byte buffer on the heap.
uint8_t* memory_pool = (uint8_t*)malloc(100 * sizeof(uint8_t));
// Pass the allocator object to the Executor.
MemoryAllocator allocator(100, memory_pool);

Under the hood, ExecuTorch calls allocator.allocate() to service each request, advancing the cur_ pointer accordingly.

Public Functions

inline MemoryAllocator(uint32_t size, uint8_t *base_address)

Constructs a new memory allocator of a given size, starting at the provided base_address.

Parameters
  • size[in] The size in bytes of the buffer at base_address.

  • base_address[in] The buffer to allocate from. Does not take ownership of this buffer, so it must be valid for the lifetime of the MemoryAllocator.

inline virtual void *allocate(size_t size, size_t alignment = kDefaultAlignment)

Allocates size bytes of memory.

Parameters
  • size[in] Number of bytes to allocate.

  • alignment[in] Minimum alignment for the returned pointer. Must be a power of 2.

Return values

nullptr – Not enough memory, or alignment was not a power of 2.

Returns

Aligned pointer to the allocated memory on success.

template<typename T>
inline T *allocateInstance(size_t alignment = alignof(T))

Allocates a buffer large enough for an instance of type T. Note that the memory will not be initialized.

Example:

auto p = memory_allocator->allocateInstance<MyType>();

Parameters

alignment[in] Minimum alignment for the returned pointer. Must be a power of 2. Defaults to the natural alignment of T.

Return values

nullptr – Not enough memory, or alignment was not a power of 2.

Returns

Aligned pointer to the allocated memory on success.

template<typename T>
inline T *allocateList(size_t size, size_t alignment = alignof(T))

Allocates memory for size elements of type T; that is, size * sizeof(T) bytes.

Parameters
  • size[in] Number of memory chunks to allocate.

  • alignment[in] Minimum alignment for the returned pointer. Must be a power of 2. Defaults to the natural alignment of T.

Return values

nullptr – Not enough memory, or alignment was not a power of 2.

Returns

Aligned pointer to the allocated memory on success.
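
The bump-pointer scheme behind these three methods can be sketched in a few lines. BumpAllocator is an illustrative stand-in, not the real executorch::runtime::MemoryAllocator, but it follows the documented behavior: allocate() rounds the cursor up to the requested alignment, fails with nullptr if the alignment is not a power of 2 or the buffer is exhausted, and otherwise advances the cursor; allocateInstance<T>() and allocateList<T>() are thin typed wrappers.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified bump allocator over a caller-owned buffer.
class BumpAllocator {
 public:
  BumpAllocator(uint32_t size, uint8_t* base) : cur_(base), end_(base + size) {}

  void* allocate(size_t size, size_t alignment = alignof(void*)) {
    // Alignment must be a nonzero power of 2.
    if (alignment == 0 || (alignment & (alignment - 1)) != 0) {
      return nullptr;
    }
    // Round cur_ up to the next multiple of alignment.
    uintptr_t p = reinterpret_cast<uintptr_t>(cur_);
    uintptr_t aligned = (p + alignment - 1) & ~(alignment - 1);
    if (aligned + size > reinterpret_cast<uintptr_t>(end_)) {
      return nullptr;  // not enough space left in the buffer
    }
    cur_ = reinterpret_cast<uint8_t*>(aligned + size);
    return reinterpret_cast<void*>(aligned);
  }

  // allocateInstance<T>() is allocate(sizeof(T), alignof(T)).
  template <typename T>
  T* allocateInstance() {
    return static_cast<T*>(allocate(sizeof(T), alignof(T)));
  }

  // allocateList<T>(n) is allocate(n * sizeof(T), alignof(T)).
  template <typename T>
  T* allocateList(size_t n) {
    return static_cast<T*>(allocate(n * sizeof(T), alignof(T)));
  }

 private:
  uint8_t* cur_;
  uint8_t* end_;
};
```

Note there is no free(): a bump allocator releases memory only when the whole buffer is discarded, which is why callers size the pool up front.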

Public Static Attributes

static constexpr size_t kDefaultAlignment = alignof(void*)

Default alignment of memory returned by this class. Ensures that pointer fields of structs will be aligned. Larger types like long double may not be, however, depending on the toolchain and architecture.

class HierarchicalAllocator

A group of buffers that can be used to represent a device’s memory hierarchy.

Public Functions

inline explicit HierarchicalAllocator(Span<Span<uint8_t>> buffers)

Constructs a new hierarchical allocator with the given array of buffers.

  • Memory IDs are based on the index into buffers: buffers[N] will have a memory ID of N.

  • buffers.size() must be >= MethodMeta::num_non_const_buffers().

  • buffers[N].size() must be >= MethodMeta::non_const_buffer_size(N).

inline ET_DEPRECATED HierarchicalAllocator(uint32_t n_allocators, MemoryAllocator *allocators)

DEPRECATED: Use spans instead.

inline ET_NODISCARD Result<void*> get_offset_address(uint32_t memory_id, size_t offset_bytes, size_t size_bytes)

Returns the address at the byte offset offset_bytes from the given buffer’s base address, which points to at least size_bytes of memory.

Parameters
  • memory_id[in] The ID of the buffer in the hierarchy.

  • offset_bytes[in] The offset in bytes into the specified buffer.

  • size_bytes[in] The amount of memory that should be available at the offset.

Returns

On success, the address of the requested byte offset into the specified buffer. On failure, a non-Ok Error.
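
The lookup performed by get_offset_address can be sketched as follows. Buffer and offset_address are illustrative names, not the real executorch types (which use Span<uint8_t> and Result<void*>): memory_id selects a buffer in the hierarchy, and the call succeeds only when [offset_bytes, offset_bytes + size_bytes) fits inside that buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for one Span<uint8_t> in the hierarchy.
struct Buffer {
  uint8_t* data;
  size_t size;
};

// Returns base + offset for the buffer with the given memory ID,
// or nullptr if the ID is unknown or the range does not fit.
uint8_t* offset_address(const std::vector<Buffer>& buffers,
                        uint32_t memory_id,
                        size_t offset_bytes,
                        size_t size_bytes) {
  if (memory_id >= buffers.size()) {
    return nullptr;  // no buffer with this memory ID
  }
  const Buffer& b = buffers[memory_id];
  if (offset_bytes > b.size || size_bytes > b.size - offset_bytes) {
    return nullptr;  // requested range exceeds the buffer
  }
  return b.data + offset_bytes;
}
```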

class MemoryManager

A container class for allocators used during Method load and execution.

This class consolidates all dynamic memory needs for Method load and execution. This can allow for heap-based as well as heap-less execution (relevant to some embedded scenarios), and overall provides more control over memory use.

This class, however, cannot ensure all allocation is accounted for since kernel and backend implementations are free to use a separate way to allocate memory (e.g., for things like scratch space). But we do suggest that backends and kernels use these provided allocators whenever possible.

Public Functions

inline explicit MemoryManager(MemoryAllocator *method_allocator, HierarchicalAllocator *planned_memory = nullptr, MemoryAllocator *temp_allocator = nullptr)

Constructs a new MemoryManager.

Parameters
  • method_allocator[in] The allocator to use when loading a Method and allocating its internal structures. Must outlive the Method that uses it.

  • planned_memory[in] The memory-planned buffers to use for mutable tensor data when executing a Method. Must outlive the Method that uses it. May be nullptr if the Method does not use any memory-planned tensor data. The sizes of the buffers in this HierarchicalAllocator must agree with the corresponding MethodMeta::num_memory_planned_buffers() and MethodMeta::memory_planned_buffer_size(N) values, which are embedded in the Program.

  • temp_allocator[in] The allocator to use when allocating temporary data during kernel or delegate execution. Must outlive the Method that uses it. May be nullptr if the Method does not use kernels or delegates that allocate temporary data. This allocator will be reset after every kernel or delegate call during execution.

inline ET_DEPRECATED MemoryManager(MemoryAllocator *constant_allocator, HierarchicalAllocator *non_constant_allocator, MemoryAllocator *runtime_allocator, MemoryAllocator *temporary_allocator)

DEPRECATED: Use the constructor without constant_allocator instead.

TODO(T162089316): Remove this once all users migrate to the new ctor.

inline MemoryAllocator *method_allocator() const

Returns the allocator that the runtime will use to allocate internal structures while loading a Method. Must not be used after its associated Method has been loaded.

inline HierarchicalAllocator *planned_memory() const

Returns the memory-planned buffers to use for mutable tensor data.

inline MemoryAllocator *temp_allocator() const

Returns the allocator to use for allocating temporary data during kernel or delegate execution.

This allocator will be reset after every kernel or delegate call during execution.
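
The reset-between-calls behavior described above can be sketched with a minimal scratch pool. ScratchPool is an illustrative name; the real runtime resets its temp MemoryAllocator internally after each kernel or delegate call, invalidating any scratch pointers handed out during that call and making the full pool available to the next one.

```cpp
#include <cstddef>
#include <cstdint>

// Minimal scratch pool demonstrating the temp-allocator lifecycle.
class ScratchPool {
 public:
  ScratchPool(uint8_t* base, size_t size)
      : base_(base), cur_(base), end_(base + size) {}

  // Hands out scratch memory during a single kernel/delegate call.
  void* allocate(size_t size) {
    if (size > static_cast<size_t>(end_ - cur_)) {
      return nullptr;
    }
    void* out = cur_;
    cur_ += size;
    return out;
  }

  // Called between kernel/delegate calls: previously returned
  // pointers become invalid and the whole pool is reusable.
  void reset() { cur_ = base_; }

  size_t used() const { return static_cast<size_t>(cur_ - base_); }

 private:
  uint8_t* base_;
  uint8_t* cur_;
  uint8_t* end_;
};
```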

Values

struct EValue

Public Functions

inline EValue(executorch::aten::Scalar s)

Construct an EValue using the implicit value of a Scalar.

template<typename T>
inline executorch::aten::optional<T> toOptional() const

Converts the EValue to an optional object that can represent both T and an uninitialized state.
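
The semantics of toOptional can be sketched with a tiny tagged value type. SimpleValue and its members are illustrative stand-ins, not the real EValue: a value tagged as uninitialized (None) converts to an empty optional, while a value holding a T converts to a populated one.

```cpp
#include <cstdint>
#include <optional>

// Minimal tagged value mimicking the None-vs-value distinction.
struct SimpleValue {
  enum class Tag { None, Int } tag = Tag::None;
  int64_t i = 0;

  // Analogous to toOptional<int64_t>(): None maps to nullopt.
  std::optional<int64_t> toOptionalInt() const {
    if (tag == Tag::None) {
      return std::nullopt;  // uninitialized state
    }
    return i;
  }
};
```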

union Payload
union TriviallyCopyablePayload
class Tensor

A minimal Tensor type whose API is a source compatible subset of at::Tensor.

NOTE: Instances of this class do not own the TensorImpl given to it, which means that the caller must guarantee that the TensorImpl lives longer than any Tensor instances that point to it.

See the documentation on TensorImpl for details about the return/parameter types used here and how they relate to at::Tensor.

Public Types

using SizesType = TensorImpl::SizesType

The type used for elements of sizes().

using DimOrderType = TensorImpl::DimOrderType

The type used for elements of dim_order().

using StridesType = TensorImpl::StridesType

The type used for elements of strides().

Public Functions

inline TensorImpl *unsafeGetTensorImpl() const

Returns a pointer to the underlying TensorImpl.

NOTE: Clients should be wary of operating on the TensorImpl directly instead of the Tensor. It is easy to break things.

inline size_t nbytes() const

Returns the size of the tensor in bytes.

NOTE: This returns only the size of the live data, not the total capacity of the underlying data blob.

inline ssize_t size(ssize_t dim) const

Returns the size of the tensor at the given dimension.

NOTE: size() intentionally does not return SizesType even though it returns an element of an array of SizesType. This makes calls to this method more compatible with at::Tensor and more consistent with the rest of the methods on this class and in ETensor.

inline ssize_t dim() const

Returns the tensor’s number of dimensions.

inline ssize_t numel() const

Returns the number of elements in the tensor.

inline ScalarType scalar_type() const

Returns the type of the elements in the tensor (int32, float, bool, etc).

inline ssize_t element_size() const

Returns the size in bytes of one element of the tensor.

inline const ArrayRef<SizesType> sizes() const

Returns the sizes of the tensor at each dimension.

inline const ArrayRef<DimOrderType> dim_order() const

Returns the order the dimensions are laid out in memory.

inline const ArrayRef<StridesType> strides() const

Returns the strides of the tensor at each dimension.
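
For a contiguous tensor, the accessors above are related in a fixed way: numel() is the product of sizes(), nbytes() is numel() * element_size(), and strides()[d] is the product of the sizes of all dimensions after d. These helper functions are illustrative, not part of the ExecuTorch API, and use int64_t in place of the SizesType/StridesType aliases.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// numel() is the product of the sizes of all dimensions.
size_t numel(const std::vector<int64_t>& sizes) {
  size_t n = 1;
  for (int64_t s : sizes) {
    n *= static_cast<size_t>(s);
  }
  return n;
}

// For a contiguous layout, stride of dim d is the product of
// the sizes of all dimensions after d (innermost stride is 1).
std::vector<int64_t> contiguous_strides(const std::vector<int64_t>& sizes) {
  std::vector<int64_t> strides(sizes.size(), 1);
  for (size_t d = sizes.size(); d-- > 1;) {
    strides[d - 1] = strides[d] * sizes[d];
  }
  return strides;
}
```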

inline TensorShapeDynamism shape_dynamism() const

Returns the mutability of the shape of the tensor.

template<typename T>
inline const T *const_data_ptr() const

Returns a pointer of type T to the constant underlying data blob.

inline const void *const_data_ptr() const

Returns a pointer to the constant underlying data blob.

template<typename T>
inline T *mutable_data_ptr() const

Returns a pointer of type T to the mutable underlying data blob.

inline void *mutable_data_ptr() const

Returns a pointer to the mutable underlying data blob.

template<typename T>
inline ET_DEPRECATED T *data_ptr() const

DEPRECATED: Use const_data_ptr or mutable_data_ptr instead.

inline ET_DEPRECATED void *data_ptr() const

DEPRECATED: Use const_data_ptr or mutable_data_ptr instead.

inline ET_DEPRECATED void set_data(void *ptr) const

DEPRECATED: Changes the data_ptr the tensor aliases. Does not free the previously pointed-to data and does not take ownership of the new pointer. This API does not exist in at::Tensor, so kernel developers should avoid it.
