Function torch::autograd::backward
Defined in File autograd.h
Function Documentation

void torch::autograd::backward(const variable_list &tensors, const variable_list &grad_tensors = {}, c10::optional<bool> retain_graph = c10::nullopt, bool create_graph = false, const variable_list &inputs = {})
Computes the sum of gradients of given tensors with respect to graph leaves.
The graph is differentiated using the chain rule. If any of tensors are non-scalar (i.e. their data has more than one element) and require gradient, then the Jacobian-vector product would be computed; in this case the function additionally requires specifying grad_tensors. It should be a sequence of matching length that contains the “vector” in the Jacobian-vector product, usually the gradient of the differentiated function w.r.t. the corresponding tensors (torch::Tensor() is an acceptable value for all tensors that don’t need gradient tensors).

This function accumulates gradients in the leaves; you might need to zero them before calling it.
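As an example, here is a minimal sketch (assuming a program that links against libtorch) of differentiating a non-scalar output; because y has more than one element, grad_tensors must supply the “vector” for the Jacobian-vector product:

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
  auto x = torch::ones({2, 2}, torch::requires_grad());
  auto y = x * x + 3;  // non-scalar (2x2) output that requires grad

  // The "vector" in the Jacobian-vector product; all-ones makes the
  // result equal to the gradient of y.sum() w.r.t. x.
  auto v = torch::ones({2, 2});

  torch::autograd::backward(/*tensors=*/{y}, /*grad_tensors=*/{v});

  std::cout << x.grad() << "\n";  // d(x*x + 3)/dx = 2*x, so all 2s
}
```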
Parameters

tensors – Tensors of which the derivative will be computed.

grad_tensors – The “vector” in the Jacobian-vector product, usually gradients w.r.t. each element of the corresponding tensors. torch::Tensor() values can be specified for scalar Tensors or for ones that don’t require grad. If a torch::Tensor() value would be acceptable for all grad_tensors, then this argument is optional.

retain_graph – If false, the graph used to compute the grad will be freed. Note that in nearly all cases setting this option to true is not needed and often can be worked around in a much more efficient way (see the sketch after this list). Defaults to the value of create_graph.

create_graph – If true, the graph of the derivative will be constructed, allowing higher-order derivative products to be computed. Defaults to false.

inputs – Inputs w.r.t. which the gradient will be accumulated into at::Tensor::grad. All other Tensors will be ignored. If not provided, the gradient is accumulated into all the leaf Tensors that were used to compute tensors.
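A second sketch, under the same assumptions, illustrates retain_graph and the accumulation behaviour noted above: the first call must retain the graph for backward to run again, and the leaf’s gradient grows with each call until it is zeroed:

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
  auto x = torch::tensor({3.0}, torch::requires_grad());
  auto y = x * x;  // single-element output, so grad_tensors may be empty

  // Keep the graph alive so backward can be called a second time.
  torch::autograd::backward({y}, /*grad_tensors=*/{}, /*retain_graph=*/true);
  std::cout << x.grad() << "\n";  // dy/dx = 2*x = 6

  // Gradients accumulate in the leaf: the second call adds another 6.
  torch::autograd::backward({y});
  std::cout << x.grad() << "\n";  // 12

  // Zero the accumulated grad before reusing x in a new computation.
  x.mutable_grad().zero_();
}
```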