Function torch::autograd::backward
Defined in File autograd.h
Function Documentation
- void torch::autograd::backward(const variable_list &tensors, const variable_list &grad_tensors = {}, std::optional<bool> retain_graph = std::nullopt, bool create_graph = false, const variable_list &inputs = {})
Computes the sum of gradients of given tensors with respect to graph leaves.

The graph is differentiated using the chain rule. If any of tensors are non-scalar (i.e. their data has more than one element) and require gradient, then the Jacobian-vector product is computed; in this case the function additionally requires specifying grad_tensors. It should be a sequence of matching length that contains the “vector” in the Jacobian-vector product, usually the gradient of the differentiated function w.r.t. the corresponding tensors (torch::Tensor() is an acceptable value for all tensors that don’t need gradient tensors).

This function accumulates gradients in the leaves - you might need to zero them before calling it.
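A minimal sketch of a call on a non-scalar output, assuming a standard libtorch program (the tensor names are illustrative only): the explicit grad_tensors entry supplies the “vector” mentioned above, and the final line zeroes the accumulated gradient before any further backward pass.

    #include <torch/torch.h>
    #include <iostream>

    int main() {
      // Leaf tensor that requires grad.
      auto x = torch::randn({3}, torch::requires_grad());
      // Non-scalar output, so a "vector" for the Jacobian-vector product is needed.
      auto y = x * 2;

      // grad_tensors supplies that vector; with ones, every element of x.grad() becomes 2.
      torch::autograd::backward(/*tensors=*/{y}, /*grad_tensors=*/{torch::ones_like(y)});
      std::cout << x.grad() << "\n";

      // Gradients accumulate in the leaves, so zero them before backpropagating again.
      x.grad().zero_();
      return 0;
    }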
- Parameters

  - tensors – Tensors of which the derivative will be computed.
  - grad_tensors – The “vector” in the Jacobian-vector product, usually gradients w.r.t. each element of the corresponding tensors. torch::Tensor() values can be specified for scalar Tensors or for ones that don’t require grad. If a torch::Tensor() value would be acceptable for all grad_tensors, then this argument is optional.
  - retain_graph – If false, the graph used to compute the grad will be freed. Note that in nearly all cases setting this option to true is not needed and often can be worked around in a much more efficient way. Defaults to the value of create_graph.
  - create_graph – If true, the graph of the derivative will be constructed, allowing higher-order derivative products to be computed. Defaults to false.
  - inputs – Inputs w.r.t. which the gradient will be accumulated into at::Tensor::grad. All other Tensors will be ignored. If not provided, the gradient is accumulated into all the leaf Tensors that were used to compute tensors (see the sketch following this list).
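A minimal sketch of the inputs argument, again assuming a standard libtorch program (the names a, b, and loss are illustrative only): the gradient of a scalar loss is accumulated into a while b is ignored, even though both require grad.

    #include <torch/torch.h>
    #include <iostream>
    #include <optional>

    int main() {
      auto a = torch::randn({2, 2}, torch::requires_grad());
      auto b = torch::randn({2, 2}, torch::requires_grad());
      // Scalar output, so grad_tensors can be left empty.
      auto loss = (a * b).sum();

      // Accumulate only into `a`; `b` is ignored even though it requires grad.
      torch::autograd::backward(/*tensors=*/{loss},
                                /*grad_tensors=*/{},
                                /*retain_graph=*/std::nullopt,
                                /*create_graph=*/false,
                                /*inputs=*/{a});

      std::cout << a.grad() << "\n";            // equals b
      std::cout << b.grad().defined() << "\n";  // prints 0: nothing was accumulated into b
      return 0;
    }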