Torch Library API

The PyTorch C++ API provides capabilities for extending PyTorch’s core library of operators with user-defined operators and data types. Extensions implemented using the Torch Library API are available in both the PyTorch eager API and TorchScript.

For a tutorial-style introduction to the library API, see the Extending TorchScript with Custom C++ Operators tutorial.



TORCH_LIBRARY(ns, m)

Macro for defining a function that will be run at static initialization time to define a library of operators in the namespace ns (must be a valid C++ identifier, no quotes).

Use this macro when you want to define a new set of custom operators that do not already exist in PyTorch.

Example usage:

TORCH_LIBRARY(myops, m) {
  // m is a torch::Library; methods on it will define
  // operators in the myops namespace
  m.def("add", add_impl);
}

The m argument is bound to a torch::Library that is used to register operators. There may only be one TORCH_LIBRARY() for any given namespace.


TORCH_LIBRARY_IMPL(ns, k, m)

Macro for defining a function that will be run at static initialization time to define operator overrides for dispatch key k (must be an unqualified enum member of c10::DispatchKey) in namespace ns (must be a valid C++ identifier, no quotes).

Use this macro when you want to implement a preexisting set of custom operators on a new dispatch key (e.g., you want to provide CUDA implementations of already existing operators). One common usage pattern is to use TORCH_LIBRARY() to define schema for all new operators you want to define, and then use several TORCH_LIBRARY_IMPL() blocks to provide implementations of the operator for CPU, CUDA and Autograd.

In some cases, you need to define something that applies to all namespaces, not just one (usually a fallback). In that case, use the reserved namespace _ in place of ns.


Example usage:

TORCH_LIBRARY_IMPL(myops, CPU, m) {
  // m is a torch::Library; methods on it will define
  // CPU implementations of operators in the myops namespace.
  // It is NOT valid to call torch::Library::def()
  // in this context.
  m.impl("add", add_cpu_impl);
}

If add_cpu_impl is an overloaded function, use a static_cast to specify which overload you want (by providing the full type).
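The reason a cast is needed can be sketched in plain C++: a deduced template parameter, like the Func parameter of impl(), cannot be inferred from an overload set, so one overload has to be named by its full type. The sketch below does not use PyTorch; register_kernel is a hypothetical stand-in for m.impl, and the int and double overloads are illustrative only.

```cpp
#include <string>

// Two overloads that share one name (stand-ins for overloaded kernels).
int add_cpu_impl(int a, int b) { return a + b; }
double add_cpu_impl(double a, double b) { return a + b; }

// Hypothetical stand-in for m.impl(name, raw_f): Func is deduced from raw_f.
template <typename Func>
Func register_kernel(const std::string& /*name*/, Func raw_f) {
  return raw_f;
}

// register_kernel("add", add_cpu_impl);  // does not compile: Func is ambiguous

// Spelling out the full function-pointer type selects exactly one overload.
inline int call_int_overload(int a, int b) {
  auto fn = register_kernel(
      "add", static_cast<int (*)(int, int)>(&add_cpu_impl));
  return fn(a, b);
}

inline double call_double_overload(double a, double b) {
  auto fn = register_kernel(
      "add", static_cast<double (*)(double, double)>(&add_cpu_impl));
  return fn(a, b);
}
```

With a real kernel, the cast target would be the kernel's own signature; for a hypothetical two-tensor add that might look like static_cast<at::Tensor (*)(const at::Tensor&, const at::Tensor&)>(&add_cpu_impl).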


class Library

This object provides the API for defining operators and providing implementations at dispatch keys.

Typically, a torch::Library is not allocated directly; instead it is created by the TORCH_LIBRARY() or TORCH_LIBRARY_IMPL() macros.

Most methods on torch::Library return a reference to itself, supporting method chaining.

// Examples:

TORCH_LIBRARY(torchvision, m) {
   // m is a torch::Library
   m.def("roi_align", ...);
   ...
}

TORCH_LIBRARY_IMPL(aten, XLA, m) {
   // m is a torch::Library
   m.impl("add", ...);
   ...
}

Public Functions

template<typename Schema>
inline Library &def(Schema &&raw_schema, const std::vector<at::Tag> &tags = {}, _RegisterOrVerify rv = _RegisterOrVerify::REGISTER) &

Declare an operator with a schema, but don’t provide any implementations for it.

You’re expected to then provide implementations using the impl() method. All template arguments are inferred.

// Example:
TORCH_LIBRARY(myops, m) {
  m.def("add(Tensor self, Tensor other) -> Tensor");
}


raw_schema – The schema of the operator to be defined. Typically, this is a const char* string literal, but any type accepted by torch::schema() is accepted here.

inline Library &set_python_module(const char *pymodule, const char *context = "")

Declares that for all operators that are subsequently def’ed, their fake impls may be found in the given Python module (pymodule).

This registers some help text that is used if the fake impl cannot be found.


  • pymodule: The Python module that contains the fake impls.

  • context: Optional text that may be included in the error message.

inline Library &impl_abstract_pystub(const char *pymodule, const char *context = "")

Deprecated; use set_python_module instead.

template<typename NameOrSchema, typename Func>
inline Library &def(NameOrSchema &&raw_name_or_schema, Func &&raw_f, const std::vector<at::Tag> &tags = {}) &

Define an operator for a schema and then register an implementation for it.

This is typically what you would use if you aren’t planning on making use of the dispatcher to structure your operator implementation. It’s roughly equivalent to calling def() and then impl(), but if you omit the schema of the operator, we will infer it from the type of your C++ function. All template arguments are inferred.

// Example:
TORCH_LIBRARY(myops, m) {
  m.def("add", add_fn);
}

  • raw_name_or_schema – The schema of the operator to be defined, or just the name of the operator if the schema is to be inferred from raw_f. Typically a const char* literal.

  • raw_f – The C++ function that implements this operator. Any valid constructor of torch::CppFunction is accepted here; typically you provide a function pointer or lambda.

template<typename Name, typename Func>
inline Library &impl(Name name, Func &&raw_f, _RegisterOrVerify rv = _RegisterOrVerify::REGISTER) &

Register an implementation for an operator.

You may register multiple implementations for a single operator at different dispatch keys (see torch::dispatch()). Implementations must have a corresponding declaration (from def()), otherwise they are invalid. If you plan to register multiple implementations, DO NOT provide a function implementation when you def() the operator.

// Example:
TORCH_LIBRARY_IMPL(myops, CUDA, m) {
  m.impl("add", add_cuda);
}

  • name – The name of the operator to implement. Do NOT provide schema here.

  • raw_f – The C++ function that implements this operator. Any valid constructor of torch::CppFunction is accepted here; typically you provide a function pointer or lambda.

template<typename Func>
inline Library &fallback(Func &&raw_f) &

Register a fallback implementation for all operators which will be used if there is not a specific implementation for an operator available.

There MUST be a DispatchKey associated with a fallback; e.g., only call this from TORCH_LIBRARY_IMPL() with namespace _.

// Example:

TORCH_LIBRARY_IMPL(_, AutogradXLA, m) {
  // If there is not a kernel explicitly registered
  // for AutogradXLA, fallthrough to the next
  // available kernel
  m.fallback(torch::CppFunction::makeFallthrough());
}

// See aten/src/ATen/core/dispatch/backend_fallback_test.cpp
// for a full example of boxed fallback


raw_f – The function that implements the fallback. Unboxed functions typically do not work as fallback functions, as fallback functions must work for every operator (even though they have varying type signatures). Typical arguments are CppFunction::makeFallthrough() or CppFunction::makeFromBoxedFunction()
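The relationship between def(), impl() at several dispatch keys, and a per-key fallback can be modeled with a small toy registry. This is a simplified sketch in plain C++, not the PyTorch dispatcher: ToyLibrary, ToyKey, and ToyKernel are hypothetical names, kernels are reduced to int(int, int), and the toy rejects an impl() without a prior def() eagerly, which is a simplification.

```cpp
#include <functional>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

enum class ToyKey { CPU, CUDA };
using ToyKernel = std::function<int(int, int)>;

class ToyLibrary {
 public:
  // Declare an operator without providing an implementation.
  ToyLibrary& def(const std::string& name) {
    declared_.insert(name);
    return *this;
  }
  // Register an implementation for a declared operator at one key.
  ToyLibrary& impl(const std::string& name, ToyKey key, ToyKernel fn) {
    if (declared_.count(name) == 0) {
      throw std::logic_error("impl() without def(): " + name);
    }
    kernels_[std::make_pair(name, key)] = std::move(fn);
    return *this;
  }
  // Register a kernel used for every operator lacking one at this key.
  ToyLibrary& fallback(ToyKey key, ToyKernel fn) {
    fallbacks_[key] = std::move(fn);
    return *this;
  }
  // Lookup order: operator-specific kernel first, then the key's fallback.
  int call(const std::string& name, ToyKey key, int a, int b) const {
    auto k = kernels_.find(std::make_pair(name, key));
    if (k != kernels_.end()) return k->second(a, b);
    auto f = fallbacks_.find(key);
    if (f != fallbacks_.end() && declared_.count(name) != 0) {
      return f->second(a, b);
    }
    throw std::runtime_error("no kernel for " + name);
  }

 private:
  std::set<std::string> declared_;
  std::map<std::pair<std::string, ToyKey>, ToyKernel> kernels_;
  std::map<ToyKey, ToyKernel> fallbacks_;
};

// Builds a registry where "mul" has no CUDA kernel and hits the fallback.
inline ToyLibrary make_toy_library() {
  ToyLibrary m;
  m.def("add")
      .def("mul")
      .impl("add", ToyKey::CPU, [](int a, int b) { return a + b; })
      .impl("add", ToyKey::CUDA, [](int a, int b) { return a + b; })
      .impl("mul", ToyKey::CPU, [](int a, int b) { return a * b; })
      .fallback(ToyKey::CUDA, [](int, int) { return -1; });
  return m;
}
```

Note that the single fallback kernel has to cope with any operator that reaches it, which is why the real API asks for boxed or fallthrough functions rather than a kernel with one fixed signature.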

class CppFunction

Represents a C++ function that implements an operator.

Most users won’t interact directly with this class, except via error messages: the constructors of this class define the set of permissible “function”-like things you can bind via the interface.

This class erases the type of the passed in function, but durably records the type via an inferred schema for the function.
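The idea of erasing a function's type while keeping a record of its signature can be sketched as follows. This is a toy model in plain C++, not how torch::CppFunction is implemented: ToyCppFunction, ToyTypeName, and infer_toy_schema are hypothetical names, and the schema is reduced to a string of C++ type names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Maps a C++ type to the name used in the toy schema string.
template <typename T> struct ToyTypeName;
template <> struct ToyTypeName<int> {
  static std::string get() { return "int"; }
};
template <> struct ToyTypeName<double> {
  static std::string get() { return "double"; }
};

// Builds "(arg, arg, ...) -> ret" from the static type of a function pointer.
template <typename Ret, typename... Args>
std::string infer_toy_schema(Ret (*)(Args...)) {
  const std::vector<std::string> args = {ToyTypeName<Args>::get()...};
  std::string s = "(";
  for (std::size_t i = 0; i < args.size(); ++i) {
    if (i != 0) s += ", ";
    s += args[i];
  }
  return s + ") -> " + ToyTypeName<Ret>::get();
}

class ToyCppFunction {
 public:
  // The constructor sees the full type once, records it, then forgets it.
  template <typename Ret, typename... Args>
  explicit ToyCppFunction(Ret (*f)(Args...))
      : erased_(reinterpret_cast<void (*)()>(f)),
        schema_(infer_toy_schema(f)) {}

  const std::string& schema() const { return schema_; }

  // The caller must supply the original types to call through the pointer.
  template <typename Ret, typename... Args>
  Ret call(Args... args) const {
    return reinterpret_cast<Ret (*)(Args...)>(erased_)(args...);
  }

 private:
  void (*erased_)();    // type-erased function pointer
  std::string schema_;  // durable record of the erased signature
};

int toy_add(int a, int b) { return a + b; }
double toy_scale(double x, int n) { return x * n; }
```

After construction, ToyCppFunction(&toy_add) and ToyCppFunction(&toy_scale) have the same C++ type, yet each still reports its own signature through schema().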

Public Functions

template<typename Func>
inline explicit CppFunction(Func *f, std::enable_if_t<c10::guts::is_function_type<Func>::value, std::nullptr_t> = nullptr)

This overload accepts function pointers, e.g., CppFunction(&add_impl)

template<typename FuncPtr>
inline explicit CppFunction(FuncPtr f, std::enable_if_t<c10::is_compile_time_function_pointer<FuncPtr>::value, std::nullptr_t> = nullptr)

This overload accepts compile time function pointers, e.g., CppFunction(TORCH_FN(add_impl))

template<typename Lambda>
inline explicit CppFunction(Lambda &&f, std::enable_if_t<c10::guts::is_functor<std::decay_t<Lambda>>::value, std::nullptr_t> = nullptr)

This overload accepts lambdas, e.g., CppFunction([](const Tensor& self) { ... })


Public Static Functions

static inline CppFunction makeFallthrough()

This creates a fallthrough function.

Fallthrough functions immediately redispatch to the next available dispatch key, but are implemented more efficiently than a hand-written function that does the same thing.

template<c10::BoxedKernel::BoxedKernelFunction *func>
static inline CppFunction makeFromBoxedFunction()

Create a function from a boxed kernel function with signature void(const OperatorHandle&, Stack*); that is, the function receives a stack of arguments in a boxed calling convention, rather than in the native C++ calling convention.

Boxed functions are typically only used to register backend fallbacks via torch::Library::fallback().
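The difference between the two calling conventions can be illustrated with a toy model. The sketch below is plain C++ and only mimics the shape of the real signature: ToyOperatorHandle and ToyStack are hypothetical stand-ins for OperatorHandle and Stack, and values are reduced to long.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for OperatorHandle and Stack.
struct ToyOperatorHandle {
  std::string name;
  std::size_t num_args;
};
using ToyStack = std::vector<long>;

// Unboxed: a native C++ signature with a fixed number of typed arguments.
long add_unboxed(long a, long b) { return a + b; }

// Boxed: one signature for every operator. Arguments are popped from the
// stack and the result is pushed back, so arity is discovered at run time.
void sum_all_boxed(const ToyOperatorHandle& op, ToyStack* stack) {
  long total = 0;
  for (std::size_t i = 0; i < op.num_args; ++i) {
    total += stack->back();
    stack->pop_back();
  }
  stack->push_back(total);
}

// Adapter that exposes the unboxed binary kernel through the boxed convention.
void add_boxed_adapter(const ToyOperatorHandle& /*op*/, ToyStack* stack) {
  long b = stack->back();
  stack->pop_back();
  long a = stack->back();
  stack->pop_back();
  stack->push_back(add_unboxed(a, b));
}

// Runs a boxed kernel on a copy of the arguments and returns the result.
inline long run_boxed(void (*kernel)(const ToyOperatorHandle&, ToyStack*),
                      const ToyOperatorHandle& op, ToyStack args) {
  kernel(op, &args);
  return args.back();
}
```

Because sum_all_boxed never names its argument types, the same function can serve operators of any arity, which is the property a backend fallback needs.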

template<class KernelFunctor>
static inline CppFunction makeFromBoxedFunctor(std::unique_ptr<KernelFunctor> kernelFunctor)

Create a function from a boxed kernel functor which defines operator()(const OperatorHandle&, DispatchKeySet, Stack*) (receiving arguments from boxed calling convention) and inherits from c10::OperatorKernel.

Unlike makeFromBoxedFunction, functions registered in this way can also carry additional state which is managed by the functor; this is useful if you’re writing an adapter to some other implementation, e.g., a Python callable, which is dynamically associated with the registered kernel.
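A rough sketch of why a functor helps here: a plain function pointer has nowhere to keep per-registration data, whereas an object owned through std::unique_ptr can. The example is a toy in plain C++; ToyKernelBase plays the role of c10::OperatorKernel, the call operator takes only a stack (the real one also receives an OperatorHandle and a DispatchKeySet), and ScaleKernel and ToyBoxedFunction are hypothetical names.

```cpp
#include <memory>
#include <utility>
#include <vector>

using ToyStack = std::vector<long>;

// Hypothetical base class, standing in for c10::OperatorKernel.
struct ToyKernelBase {
  virtual ~ToyKernelBase() = default;
  virtual void operator()(ToyStack* stack) = 0;
};

// A boxed functor with state: the factor is chosen at registration time,
// and the call counter is updated on every invocation.
struct ScaleKernel final : ToyKernelBase {
  explicit ScaleKernel(long factor) : factor_(factor) {}
  void operator()(ToyStack* stack) override {
    stack->back() *= factor_;
    ++calls_;
  }
  long factor_;
  int calls_ = 0;
};

// Owns the functor, so the state lives exactly as long as the registration.
class ToyBoxedFunction {
 public:
  explicit ToyBoxedFunction(std::unique_ptr<ToyKernelBase> kernel)
      : kernel_(std::move(kernel)) {}
  void call(ToyStack* stack) { (*kernel_)(stack); }

 private:
  std::unique_ptr<ToyKernelBase> kernel_;
};

// Registers a ScaleKernel with the given factor and applies it to one value.
inline long scale_via_functor(long factor, long value) {
  ToyBoxedFunction fn(std::unique_ptr<ToyKernelBase>(new ScaleKernel(factor)));
  ToyStack stack{value};
  fn.call(&stack);
  return stack.back();
}
```

In an adapter to another runtime, the stored state would typically be a handle to the foreign callable rather than a number.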

template<typename FuncPtr, std::enable_if_t<c10::guts::is_function_type<FuncPtr>::value, std::nullptr_t> = nullptr>
static inline CppFunction makeFromUnboxedFunction(FuncPtr *f)

Create a function from an unboxed kernel function.

This is typically used to register common operators.

template<typename FuncPtr, std::enable_if_t<c10::is_compile_time_function_pointer<FuncPtr>::value, std::nullptr_t> = nullptr>
static inline CppFunction makeFromUnboxedFunction(FuncPtr f)

Create a function from a compile time unboxed kernel function pointer.

This is typically used to register common operators. Compile time function pointers can be used to allow the compiler to optimize (e.g. inline) calls to it.
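The distinction between a runtime and a compile-time function pointer can be sketched in plain C++. In the toy below, ToyCompileTimeFunctionPointer is a hypothetical, simplified analogue of c10::CompileTimeFunctionPointer: the pointer is a template argument, so the callee is part of the type and the compiler can see it at every call site.

```cpp
// Runtime pointer: the callee is a value, known only when the call executes.
inline long call_runtime(long (*fn)(long), long x) { return fn(x); }

// Compile-time pointer: the callee is a template argument, so each
// instantiation refers to exactly one function that the compiler may inline.
template <typename FuncType, FuncType* func_ptr>
struct ToyCompileTimeFunctionPointer {
  static long call(long x) { return func_ptr(x); }
};

inline long twice(long x) { return 2 * x; }

// Calls twice() through the compile-time wrapper.
inline long call_compile_time(long x) {
  return ToyCompileTimeFunctionPointer<long(long), &twice>::call(x);
}
```

Both paths compute the same result; the difference is only in what the optimizer is able to know about the callee.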


template<typename Func>
inline CppFunction dispatch(c10::DispatchKey k, Func &&raw_f)

Create a torch::CppFunction which is associated with a specific dispatch key.

torch::CppFunctions that are tagged with a c10::DispatchKey don’t get invoked unless the dispatcher determines that this particular c10::DispatchKey is the one that should be dispatched to.

This function is generally not used directly; instead, prefer TORCH_LIBRARY_IMPL(), which implicitly sets the c10::DispatchKey for all registration calls inside its body.

template<typename Func>
inline CppFunction dispatch(c10::DeviceType type, Func &&raw_f)

Convenience overload of dispatch() which accepts c10::DeviceType.

inline c10::FunctionSchema schema(const char *str, c10::AliasAnalysisKind k, bool allow_typevars = false)

Construct a c10::FunctionSchema from a string, with an explicitly specified c10::AliasAnalysisKind.

Ordinarily, schemas are simply passed in as strings, but if you need to specify a custom alias analysis, you can replace the string with a call to this function.

// Default alias analysis (FROM_SCHEMA)
m.def("def3(Tensor self) -> Tensor");
// Pure function alias analysis
m.def(torch::schema("def3(Tensor self) -> Tensor",
                    c10::AliasAnalysisKind::PURE_FUNCTION));

inline c10::FunctionSchema schema(const char *s, bool allow_typevars = false)

Function schemas can be directly constructed from string literals.

