ExecuTorch Runtime API Reference
The ExecuTorch C++ API provides an on-device execution framework for exported PyTorch models.
For a tutorial-style introduction to the runtime API, check out the runtime tutorial and its simplified version.
For detailed information on how APIs evolve and the deprecation process, please refer to the ExecuTorch API Life Cycle and Deprecation Policy.
Model Loading and Execution
-
class Program¶
A deserialized ExecuTorch program binary.
Public Types
-
enum class Verification : uint8_t¶
Types of validation that the Program can do before parsing the data.
Values:
-
enumerator Minimal¶
Do minimal verification of the data, ensuring that the header appears correct.
Has minimal runtime overhead.
-
enumerator InternalConsistency¶
Do full verification of the data, ensuring that internal pointers are self-consistent and that the data has not been truncated or obviously corrupted. May not catch all types of corruption, but should guard against illegal memory operations during parsing.
Has higher runtime overhead, scaling with the complexity of the program data.
-
enum HeaderStatus¶
Describes the presence of an ExecuTorch program header.
Values:
-
enumerator CompatibleVersion¶
An ExecuTorch program header is present, and its version is compatible with this version of the runtime.
-
enumerator IncompatibleVersion¶
An ExecuTorch program header is present, but its version is not compatible with this version of the runtime.
-
enumerator NotPresent¶
An ExecuTorch program header is not present.
-
enumerator ShortData¶
The data provided was too short to find the program header.
Public Functions
-
Result<const void*> get_constant_buffer_data(size_t buffer_idx, size_t nbytes) const¶
Get the constant buffer inside Program with index buffer_idx.
- Parameters
buffer_idx – [in] the index of the buffer in the constant_buffer.
nbytes – [in] the number of bytes to read from the buffer.
- Returns
The buffer with corresponding index.
-
size_t num_methods() const¶
Returns the number of methods in the program.
-
Result<const char*> get_method_name(size_t method_index) const¶
Returns the name of the method at particular index.
- Parameters
method_index – [in] The index of the method name to retrieve. Must be less than the value returned by num_methods().
- Returns
The name of the requested method. The pointer is owned by the Program, and has the same lifetime as the Program.
-
Result<Method> load_method(const char *method_name, MemoryManager *memory_manager, EventTracer *event_tracer = nullptr) const¶
Loads the named method and prepares it for execution.
- Parameters
method_name – [in] The name of the method to load.
memory_manager – [in] The allocators to use during initialization and execution of the loaded method. If memory_manager.temp_allocator() is null, the runtime will allocate temp memory using et_pal_allocate().
event_tracer – [in] The event tracer to use for this method run.
- Returns
The loaded method on success, or an error on failure.
-
Result<MethodMeta> method_meta(const char *method_name) const¶
Gathers metadata for the named method.
- Parameters
method_name – [in] The name of the method to get metadata for.
- ET_DEPRECATED Result< const char * > get_output_flattening_encoding (const char *method_name="forward") const
DEPRECATED: Get the pytree encoding string for the output. Deprecated as this functionality will eventually move out of the core program into a higher level structure, but that does not exist at this time.
- Parameters
method_name – [in] The name of the method to get the encoding for.
- Returns
The pytree encoding string for the output
Public Static Functions
- static ET_NODISCARD Result< Program > load (DataLoader *loader, Verification verification=Verification::Minimal)
Loads a Program from the provided loader. The Program will hold a pointer to the loader, which must outlive the returned Program instance.
- static inline ET_DEPRECATED ET_NODISCARD Result< Program > Load (DataLoader *loader, Verification verification=Verification::Minimal)
DEPRECATED: Use the lowercase load() instead.
-
static HeaderStatus check_header(const void *data, size_t size)¶
Looks for an ExecuTorch program header in the provided data.
- Parameters
data – [in] The data from the beginning of a file that might contain an ExecuTorch program.
size – [in] The size of data in bytes. Must be >= kMinHeadBytes.
- Returns
A value describing the presence of a header in the data.
Public Static Attributes
-
static constexpr size_t kMinHeadBytes = 64¶
The minimum number of bytes necessary for calls to check_header.
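The header check described above might be consumed roughly as follows. This is a minimal sketch that compiles without ExecuTorch: the enum and constant are local stand-ins that mirror the documented HeaderStatus enumerators and kMinHeadBytes value, and can_check_header and describe_header_status are hypothetical helpers, not part of the runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Local stand-in mirroring the documented Program::HeaderStatus enumerators.
// (Assumption: declared here only so the sketch is self-contained.)
enum class HeaderStatus {
  CompatibleVersion,
  IncompatibleVersion,
  NotPresent,
  ShortData,
};

// Local stand-in mirroring the documented Program::kMinHeadBytes value.
constexpr std::size_t kMinHeadBytes = 64;

// Hypothetical helper: true if `size` meets the documented precondition
// of check_header() (size must be >= kMinHeadBytes).
constexpr bool can_check_header(std::size_t size) {
  return size >= kMinHeadBytes;
}

// Hypothetical helper: maps each status to a message a caller could log
// before deciding whether to attempt a full load.
inline std::string describe_header_status(HeaderStatus status) {
  switch (status) {
    case HeaderStatus::CompatibleVersion:
      return "header present and compatible with this runtime";
    case HeaderStatus::IncompatibleVersion:
      return "header present but not compatible with this runtime";
    case HeaderStatus::NotPresent:
      return "no program header found";
    case HeaderStatus::ShortData:
      return "need more bytes before the header can be located";
  }
  return "unknown status";
}
```

A caller would typically read at least kMinHeadBytes from the start of the file, run the check, and only continue to a full load when the status is CompatibleVersion.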
-
class Method¶
An executable method of an ExecuTorch program. Maps to a Python method like forward() on the original nn.Module.
Public Functions
-
inline Method(Method &&rhs) noexcept¶
Move ctor. Takes ownership of resources previously owned by rhs, and leaves rhs in an uninitialized state.
- ET_NODISCARD Error set_input (const EValue &input_evalue, size_t input_idx)
Sets the internal input value to be equivalent to the provided value.
- Parameters
input_evalue – [in] The evalue to copy into the method input. If the evalue is a tensor, the data is copied in most cases, so the tensor passed in here does not always need to outlive this call. There is one case, however, where the Method keeps a pointer to the tensor’s data: depending on the memory plan of the method, the inputs may not have buffer space pre-allocated for them. In that case the executor aliases the memory of the tensors provided as inputs here rather than deep-copying the input into the memory-planned arena.
input_idx – [in] Zero-based index of the input to set. Must be less than the value returned by inputs_size().
- Returns
Error::Ok on success, non-Ok on failure.
- ET_NODISCARD Error set_inputs (const executorch::aten::ArrayRef< EValue > &input_evalues)
Sets the values of all method inputs.
See set_input() for a more detailed description of the behavior.
- Parameters
input_evalues – [in] The new values for all of the method inputs. The type of each element must match the type of corresponding input. If the value of an element is a tensor, attempts to allow dynamic shape, but the dtype must always agree.
- Returns
Error::Ok on success, non-Ok on failure.
- ET_NODISCARD Error set_output_data_ptr (void *buffer, size_t size, size_t output_idx)
Sets the data buffer of the specified method output to the provided value.
NOTE: Based on the memory plan of the method, the output tensors may not have buffer space pre-allocated for them. In this case the executor will point those tensors to the buffer provided here, so the caller must ensure that the lifetime of this memory outlasts the executor's forward pass.
- Parameters
buffer – [in] The block of memory to point the specified tensor at.
size – [in] the length of buffer in bytes, must be >= the nbytes of the specified tensor.
output_idx – [in] The index of the output to set the data_ptr for. Must correspond to a tensor, and that tensor must not have had a buffer allocated by the memory plan.
- Returns
Error::Ok on success, non-Ok on failure.
- ET_NODISCARD Error get_outputs (EValue *output_evalues, size_t length)
Copies the method’s outputs into the provided array.
WARNING: The output contains shallow copies of internal tensor outputs. Please do not mutate returned Tensor elements.
TODO(T139259264): Add checks to detect output mutation, or deep-copy outputs.
- Parameters
output_evalues – [in] The array to copy the outputs into. The first outputs_size() elements will be set to the corresponding output values. The rest of the array will be set to the EValue value None.
length – [in] The size of the output_evalues array in elements. Must be greater than or equal to outputs_size().
- Returns
Error::Ok on success, non-Ok on failure.
- ET_NODISCARD Error get_inputs (EValue *input_evalues, size_t length)
Copies the method’s inputs into the provided array.
WARNING: The input contains shallow copies of internal tensor inputs. Please do not mutate returned Tensor elements.
- Parameters
input_evalues – [in] The array to copy the inputs into. The first inputs_size() elements will be set to the corresponding input values. The rest of the array will be set to the EValue value None.
length – [in] The size of the input_evalues array in elements. Must be greater than or equal to inputs_size().
- Returns
Error::Ok on success, non-Ok on failure.
- ET_NODISCARD Error execute ()
Execute the method.
NOTE: Will fail if the method has been partially executed using the step() API.
- Returns
Error::Ok on success, non-Ok on failure.
- ET_EXPERIMENTAL ET_NODISCARD Error step ()
EXPERIMENTAL: Advances/executes a single instruction in the method.
- Return values
Error::Ok – step succeeded
non-Ok – step failed
Error::EndOfMethod – method finished executing successfully
- ET_DEPRECATED ET_NODISCARD Error experimental_step ()
DEPRECATED: Use step() instead.
- ET_EXPERIMENTAL ET_NODISCARD Error reset_execution ()
EXPERIMENTAL: Resets execution state to the start of the Method. For use with the step() API.
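The control flow implied by step(), reset_execution(), and execute() can be modeled with a toy class. This is only a sketch of the documented contract, not the runtime's implementation: ToyMethod, the local Error enum, and run_stepwise are hypothetical stand-ins, and both the choice of Error::InvalidState for a partially stepped execute() and the exact call on which EndOfMethod is reported are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Local stand-in for the runtime's error codes (assumption: only the
// values needed by this sketch are modeled).
enum class Error { Ok, InvalidState, EndOfMethod };

// Toy model of a method with a fixed number of instructions.
class ToyMethod {
 public:
  explicit ToyMethod(std::size_t num_instructions)
      : num_instructions_(num_instructions) {}

  // Executes one instruction, or reports EndOfMethod once none remain.
  Error step() {
    if (next_ >= num_instructions_) return Error::EndOfMethod;
    ++next_;
    return Error::Ok;
  }

  // Rewinds to the first instruction so the method can run again.
  Error reset_execution() {
    next_ = 0;
    return Error::Ok;
  }

  // Runs the whole method; fails if step() left it partially executed.
  Error execute() {
    if (next_ != 0) return Error::InvalidState;
    while (step() == Error::Ok) {
    }
    next_ = 0;
    return Error::Ok;
  }

 private:
  std::size_t num_instructions_;
  std::size_t next_ = 0;
};

// Drives a method one instruction at a time, then resets it.
// Returns the number of instructions that executed successfully.
inline std::size_t run_stepwise(ToyMethod& method) {
  std::size_t steps = 0;
  Error err = method.step();
  while (err == Error::Ok) {
    ++steps;
    err = method.step();
  }
  // Any value other than EndOfMethod here would mean a step failed.
  assert(err == Error::EndOfMethod);
  const Error reset = method.reset_execution();
  assert(reset == Error::Ok);
  (void)err;
  (void)reset;
  return steps;
}
```

The key point of the loop is that EndOfMethod signals successful completion rather than failure, so it has to be distinguished from other non-Ok values before resetting.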
- ET_DEPRECATED ET_NODISCARD Error experimental_reset_execution ()
DEPRECATED: Use reset_execution() instead.
-
MethodMeta method_meta() const¶
Returns the MethodMeta that corresponds to the calling Method.
- ET_DEPRECATED const EValue & get_input (size_t i) const
DEPRECATED: Use MethodMeta instead to access metadata, and set_input to update Method inputs.
- ET_DEPRECATED EValue & mutable_input (size_t i)
DEPRECATED: Use MethodMeta instead to access metadata, and set_input to update Method inputs.
- ET_DEPRECATED EValue & mutable_output (size_t i)
DEPRECATED: Use MethodMeta instead to access metadata, and get_output to retrieve Method outputs.
-
class MethodMeta¶
Describes a method in an ExecuTorch program.
The program used to create a MethodMeta object must outlive the MethodMeta. It is separate from Method so that this information can be accessed without paying the initialization cost of loading the full Method.
Public Functions
-
const char *name() const¶
Get the name of this method.
- Returns
The method name.
-
size_t num_inputs() const¶
Get the number of inputs to this method.
- Returns
The number of inputs.
-
Result<Tag> input_tag(size_t index) const¶
Get the tag of the specified input.
- Parameters
index – [in] The index of the input to look up.
- Returns
The tag of the input. Can only be one of [Tensor, Int, Bool, Double, String].
-
Result<TensorInfo> input_tensor_meta(size_t index) const¶
Get metadata about the specified input.
- Parameters
index – [in] The index of the input to look up.
- Returns
The metadata on success, or an error on failure. Only valid for tag::Tensor
-
size_t num_outputs() const¶
Get the number of outputs to this method.
- Returns
The number of outputs.
-
Result<Tag> output_tag(size_t index) const¶
Get the tag of the specified output.
- Parameters
index – [in] The index of the output to look up.
- Returns
The tag of the output. Can only be one of [Tensor, Int, Bool, Double, String].
-
Result<TensorInfo> output_tensor_meta(size_t index) const¶
Get metadata about the specified output.
- Parameters
index – [in] The index of the output to look up.
- Returns
The metadata on success, or an error on failure. Only valid for tag::Tensor
-
size_t num_memory_planned_buffers() const¶
Get the number of memory-planned buffers this method requires.
- Returns
The number of memory-planned buffers.
-
Result<int64_t> memory_planned_buffer_size(size_t index) const¶
Get the size in bytes of the specified memory-planned buffer.
- Parameters
index – [in] The index of the buffer to look up.
- Returns
The size in bytes on success, or an error on failure.
- inline ET_DEPRECATED size_t num_non_const_buffers () const
DEPRECATED: Use num_memory_planned_buffers() instead.
-
inline Result<int64_t> non_const_buffer_size(size_t index) const¶
DEPRECATED: Use memory_planned_buffer_size() instead.
-
class DataLoader¶
Loads from a data source.
See //executorch/extension/data_loader for common implementations.
Public Functions
- virtual ET_NODISCARD Result< FreeableBuffer > load (size_t offset, size_t size, const SegmentInfo &segment_info) const =0
Loads data from the underlying data source.
NOTE: This must be thread-safe. If this call modifies common state, the implementation must do its own locking.
- Parameters
offset – The byte offset in the data source to start loading from.
size – The number of bytes to load.
segment_info – Information about the segment being loaded.
- Returns
A FreeableBuffer that owns the loaded data.
- inline virtual ET_NODISCARD Error load_into (size_t offset, size_t size, const SegmentInfo &segment_info, void *buffer) const
Loads data from the underlying data source into the provided buffer.
NOTE: This must be thread-safe. If this call modifies common state, the implementation must do its own locking.
- Parameters
offset – The byte offset in the data source to start loading from.
size – The number of bytes to load.
segment_info – Information about the segment being loaded.
buffer – The buffer to load data into. Must point to at least size bytes of memory.
- Returns
An Error indicating whether the load was successful.
- virtual ET_NODISCARD Result< size_t > size () const =0
Returns the length of the underlying data source, typically the file size.
-
struct SegmentInfo¶
Describes the content of the segment.
Public Types
-
class MemoryAllocator¶
A class that does simple allocation based on a size and returns a pointer to the allocated memory. It manages a buffer of a fixed size: each allocation request checks the remaining space and advances the cur_ pointer.
Simple example:
// The user allocates a 100-byte memory pool on the heap.
uint8_t* memory_pool = static_cast<uint8_t*>(malloc(100 * sizeof(uint8_t)));
MemoryAllocator allocator(100, memory_pool);
// Pass the allocator object to the Executor.
Under the hood, ExecuTorch will call allocator.allocate(), which keeps advancing the cur_ pointer.
Subclassed by executorch::runtime::internal::PlatformMemoryAllocator
Public Functions
-
inline MemoryAllocator(uint32_t size, uint8_t *base_address)¶
Constructs a new memory allocator of a given size, starting at the provided base_address.
- Parameters
size – [in] The size in bytes of the buffer at base_address.
base_address – [in] The buffer to allocate from. Does not take ownership of this buffer, so it must be valid for the lifetime of the MemoryAllocator.
-
inline virtual void *allocate(size_t size, size_t alignment = kDefaultAlignment)¶
Allocates size bytes of memory.
- Parameters
size – [in] Number of bytes to allocate.
alignment – [in] Minimum alignment for the returned pointer. Must be a power of 2.
- Return values
nullptr – Not enough memory, or alignment was not a power of 2.
- Returns
Aligned pointer to the allocated memory on success.
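The allocate() contract described above (bump a cursor through a caller-owned buffer, honor a power-of-2 alignment, return nullptr on failure) can be sketched with a small stand-in class. ToyBumpAllocator and its used() accessor are hypothetical names for illustration; this is not the MemoryAllocator source, only one plausible way to satisfy the same documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy bump allocator over a caller-owned buffer. It never frees and never
// takes ownership, matching the documented lifetime rule that the buffer
// must outlive the allocator.
class ToyBumpAllocator {
 public:
  ToyBumpAllocator(std::size_t size, std::uint8_t* base_address)
      : begin_(base_address), cur_(base_address), end_(base_address + size) {}

  // Returns an aligned pointer to `size` bytes, or nullptr if the
  // alignment is not a power of 2 or the buffer has too little space left.
  void* allocate(std::size_t size, std::size_t alignment = alignof(void*)) {
    // A power of 2 has exactly one bit set.
    if (alignment == 0 || (alignment & (alignment - 1)) != 0) return nullptr;

    // Round the cursor address up to the next multiple of `alignment`.
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(cur_);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    const std::uintptr_t aligned = (addr + mask) & ~mask;
    const std::size_t padding = static_cast<std::size_t>(aligned - addr);

    // Check space without letting padding + size overflow.
    const std::size_t remaining = static_cast<std::size_t>(end_ - cur_);
    if (padding > remaining || size > remaining - padding) return nullptr;

    cur_ += padding + size;
    return cur_ - size;
  }

  // Bytes consumed so far, including alignment padding.
  std::size_t used() const { return static_cast<std::size_t>(cur_ - begin_); }

 private:
  std::uint8_t* begin_;
  std::uint8_t* cur_;
  std::uint8_t* end_;
};
```

Because a failed request leaves the cursor untouched, a caller can retry with a smaller size or a different allocator after receiving nullptr.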
-
template<typename T>
inline T *allocateInstance(size_t alignment = alignof(T))
Allocates a buffer large enough for an instance of type T. Note that the memory will not be initialized.
Example:
auto p = memory_allocator->allocateInstance<MyType>();
- Parameters
alignment – [in] Minimum alignment for the returned pointer. Must be a power of 2. Defaults to the natural alignment of T.
- Return values
nullptr – Not enough memory, or alignment was not a power of 2.
- Returns
Aligned pointer to the allocated memory on success.
-
template<typename T>
inline T *allocateList(size_t size, size_t alignment = alignof(T))
Allocates size chunks of type T, where each chunk is sizeof(T) bytes.
- Parameters
size – [in] Number of memory chunks to allocate.
alignment – [in] Minimum alignment for the returned pointer. Must be a power of 2. Defaults to the natural alignment of T.
- Return values
nullptr – Not enough memory, or alignment was not a power of 2.
- Returns
Aligned pointer to the allocated memory on success.
Public Static Attributes
-
static constexpr size_t kDefaultAlignment = alignof(void*)¶
Default alignment of memory returned by this class. Ensures that pointer fields of structs will be aligned. Larger types like long double may not be, however, depending on the toolchain and architecture.
-
class HierarchicalAllocator¶
A group of buffers that can be used to represent a device’s memory hierarchy.
Public Functions
-
inline explicit HierarchicalAllocator(Span<Span<uint8_t>> buffers)¶
Constructs a new hierarchical allocator with the given array of buffers.
Memory IDs are based on the index into buffers: buffers[N] will have a memory ID of N.
buffers.size() must be >= MethodMeta::num_non_const_buffers().
buffers[N].size() must be >= MethodMeta::non_const_buffer_size(N).
-
inline ET_DEPRECATED HierarchicalAllocator(uint32_t n_allocators, MemoryAllocator *allocators)¶
DEPRECATED: Use spans instead.
- inline ET_NODISCARD Result< void * > get_offset_address (uint32_t memory_id, size_t offset_bytes, size_t size_bytes)
Returns the address at the byte offset offset_bytes from the given buffer’s base address, which points to at least size_bytes of memory.
- Parameters
memory_id – [in] The ID of the buffer in the hierarchy.
offset_bytes – [in] The offset in bytes into the specified buffer.
size_bytes – [in] The amount of memory that should be available at the offset.
- Returns
On success, the address of the requested byte offset into the specified buffer. On failure, a non-Ok Error.
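The lookup described above amounts to a bounds-checked index into a list of buffers. The following sketch models it with standard containers: ToyHierarchy is a hypothetical stand-in, it owns its buffers for simplicity (the documented class takes spans over caller-owned memory), and it returns nullptr where the documented API returns a non-Ok Error.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a buffer hierarchy: buffer N has memory ID N.
class ToyHierarchy {
 public:
  // Creates one zero-filled buffer per requested size, in order.
  explicit ToyHierarchy(const std::vector<std::size_t>& buffer_sizes) {
    for (std::size_t size : buffer_sizes) {
      buffers_.emplace_back(size);
    }
  }

  // Returns the address `offset_bytes` into buffer `memory_id` if at least
  // `size_bytes` are available there; otherwise returns nullptr.
  std::uint8_t* get_offset_address(
      std::size_t memory_id,
      std::size_t offset_bytes,
      std::size_t size_bytes) {
    if (memory_id >= buffers_.size()) return nullptr;
    std::vector<std::uint8_t>& buffer = buffers_[memory_id];
    // Compare against the remaining space so that
    // offset_bytes + size_bytes can never overflow.
    if (offset_bytes > buffer.size() ||
        size_bytes > buffer.size() - offset_bytes) {
      return nullptr;
    }
    return buffer.data() + offset_bytes;
  }

 private:
  std::vector<std::vector<std::uint8_t>> buffers_;
};
```

In practice the buffer sizes would come from the method's metadata, so that each buffer is at least as large as the memory plan requires.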
-
class MemoryManager¶
A container class for allocators used during Method load and execution.
This class consolidates all dynamic memory needs for Method load and execution. This can allow for heap-based as well as heap-less execution (relevant to some embedded scenarios), and overall provides more control over memory use.
This class, however, cannot ensure all allocation is accounted for since kernel and backend implementations are free to use a separate way to allocate memory (e.g., for things like scratch space). But we do suggest that backends and kernels use these provided allocators whenever possible.
Public Functions
-
inline explicit MemoryManager(MemoryAllocator *method_allocator, HierarchicalAllocator *planned_memory = nullptr, MemoryAllocator *temp_allocator = nullptr)¶
Constructs a new MemoryManager.
- Parameters
method_allocator – [in] The allocator to use when loading a Method and allocating its internal structures. Must outlive the Method that uses it.
planned_memory – [in] The memory-planned buffers to use for mutable tensor data when executing a Method. Must outlive the Method that uses it. May be nullptr if the Method does not use any memory-planned tensor data. The sizes of the buffers in this HierarchicalAllocator must agree with the corresponding MethodMeta::num_memory_planned_buffers() and MethodMeta::memory_planned_buffer_size(N) values, which are embedded in the Program.
temp_allocator – [in] The allocator to use when allocating temporary data during kernel or delegate execution. Must outlive the Method that uses it. May be nullptr if the Method does not use kernels or delegates that allocate temporary data. This allocator will be reset after every kernel or delegate call during execution.
-
inline ET_DEPRECATED MemoryManager(MemoryAllocator *constant_allocator, HierarchicalAllocator *non_constant_allocator, MemoryAllocator *runtime_allocator, MemoryAllocator *temporary_allocator)¶
DEPRECATED: Use the constructor without constant_allocator instead.
TODO(T162089316): Remove this once all users migrate to the new ctor.
-
inline MemoryAllocator *method_allocator() const¶
Returns the allocator that the runtime will use to allocate internal structures while loading a Method. Must not be used after its associated Method has been loaded.
-
inline HierarchicalAllocator *planned_memory() const¶
Returns the memory-planned buffers to use for mutable tensor data.
-
inline MemoryAllocator *temp_allocator() const¶
Returns the allocator to use for allocating temporary data during kernel or delegate execution.
This allocator will be reset after every kernel or delegate call during execution.
Values
-
struct EValue¶
-
class Tensor¶
A minimal Tensor type whose API is a source compatible subset of at::Tensor.
NOTE: Instances of this class do not own the TensorImpl given to it, which means that the caller must guarantee that the TensorImpl lives longer than any Tensor instances that point to it.
See the documentation on TensorImpl for details about the return/parameter types used here and how they relate to at::Tensor.
Public Types
-
using DimOrderType = TensorImpl::DimOrderType¶
The type used for elements of dim_order().
Public Functions
-
inline TensorImpl *unsafeGetTensorImpl() const¶
Returns a pointer to the underlying TensorImpl.
NOTE: Clients should be wary of operating on the TensorImpl directly instead of the Tensor. It is easy to break things.
-
inline size_t nbytes() const¶
Returns the size of the tensor in bytes.
NOTE: Only the alive space is returned, not the total capacity of the underlying data blob.
-
inline ssize_t size(ssize_t dim) const¶
Returns the size of the tensor at the given dimension.
NOTE: size() intentionally does not return SizeType even though it returns an element of an array of SizeType. This is to help make calls of this method more compatible with at::Tensor, and more consistent with the rest of the methods on this class and in ETensor.
-
inline ssize_t dim() const¶
Returns the tensor’s number of dimensions.
-
inline ssize_t numel() const¶
Returns the number of elements in the tensor.
-
inline ScalarType scalar_type() const¶
Returns the type of the elements in the tensor (int32, float, bool, etc).
-
inline ssize_t element_size() const¶
Returns the size in bytes of one element of the tensor.
-
inline const ArrayRef<DimOrderType> dim_order() const¶
Returns the order the dimensions are laid out in memory.
-
inline const ArrayRef<StridesType> strides() const¶
Returns the strides of the tensor at each dimension.
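The relationship between sizes, dim_order(), strides(), numel(), and nbytes() can be illustrated with plain vectors. This is a sketch of the conventional dense-layout arithmetic, assuming strides are counted in elements and dim_order lists dimensions from outermost to innermost; numel_of, nbytes_of, and strides_from_dim_order are hypothetical helpers, and the integer types are chosen for simplicity rather than taken from the runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of elements: the product of all sizes (1 for a zero-dim tensor).
inline std::int64_t numel_of(const std::vector<std::int32_t>& sizes) {
  std::int64_t n = 1;
  for (std::int32_t s : sizes) n *= s;
  return n;
}

// Size in bytes of the live data: element count times bytes per element.
inline std::int64_t nbytes_of(
    const std::vector<std::int32_t>& sizes,
    std::int64_t element_size) {
  return numel_of(sizes) * element_size;
}

// Strides (in elements) of a dense tensor whose dimensions are laid out in
// memory in the order given by `dim_order`, outermost first.
inline std::vector<std::int64_t> strides_from_dim_order(
    const std::vector<std::int32_t>& sizes,
    const std::vector<std::uint8_t>& dim_order) {
  assert(sizes.size() == dim_order.size());
  // The innermost dimension keeps the initial stride of 1.
  std::vector<std::int64_t> strides(sizes.size(), 1);
  if (sizes.empty()) return strides;
  // Walk from the innermost dimension outwards; each outer stride is the
  // inner stride multiplied by the inner dimension's size.
  for (std::size_t i = dim_order.size() - 1; i > 0; --i) {
    const std::size_t inner = dim_order[i];
    const std::size_t outer = dim_order[i - 1];
    strides[outer] = strides[inner] * sizes[inner];
  }
  return strides;
}
```

For sizes {2, 3, 4, 5}, the contiguous dim order {0, 1, 2, 3} gives strides {60, 20, 5, 1}, while the channels-last order {0, 2, 3, 1} gives {60, 1, 15, 3}.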
-
inline TensorShapeDynamism shape_dynamism() const¶
Returns the mutability of the shape of the tensor.
-
template<typename T>
inline const T *const_data_ptr() const
Returns a pointer of type T to the constant underlying data blob.
-
inline const void *const_data_ptr() const¶
Returns a pointer to the constant underlying data blob.
-
template<typename T>
inline T *mutable_data_ptr() const
Returns a pointer of type T to the mutable underlying data blob.
-
inline void *mutable_data_ptr() const¶
Returns a pointer to the mutable underlying data blob.
- template<typename T> inline ET_DEPRECATED T * data_ptr () const
DEPRECATED: Use const_data_ptr or mutable_data_ptr instead.
- inline ET_DEPRECATED void * data_ptr () const
DEPRECATED: Use const_data_ptr or mutable_data_ptr instead.
- inline ET_DEPRECATED void set_data (void *ptr) const
DEPRECATED: Changes the data_ptr the tensor aliases. Does not free the previously pointed to data, does not assume ownership semantics of the new ptr. This api does not exist in at::Tensor so kernel developers should avoid it.