torch.linalg.lu(A, *, pivot=True, out=None)

Computes the LU decomposition with partial pivoting of a matrix.

Letting $\mathbb{K}$ be $\mathbb{R}$ or $\mathbb{C}$, the LU decomposition with partial pivoting of a matrix $A \in \mathbb{K}^{m \times n}$, with $k = \min(m, n)$, is defined as

$$A = PLU \qquad P \in \mathbb{K}^{m \times m},\ L \in \mathbb{K}^{m \times k},\ U \in \mathbb{K}^{k \times n}$$

where $P$ is a permutation matrix, $L$ is lower triangular with ones on the diagonal, and $U$ is upper triangular.
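The definition can be made concrete with a small pure-Python sketch of elimination with partial pivoting. It is only an illustration of the factorization, not PyTorch's implementation: the names `lu_partial_pivot` and `matmul` are hypothetical helpers, and exact `Fraction` arithmetic is used so the identity $A = PLU$ can be checked without rounding.

```python
from fractions import Fraction

def matmul(X, Y):
    # Plain matrix product of two lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lu_partial_pivot(A):
    """Return (P, L, U) with A = P L U for an m x n matrix given as a list of rows."""
    m, n = len(A), len(A[0])
    k = min(m, n)
    W = [[Fraction(x) for x in row] for row in A]  # working copy, overwritten in place
    perm = list(range(m))  # perm[i] = original index of the row now at position i

    for j in range(k):
        # Partial pivoting: choose the row at or below j with the largest |entry| in column j.
        p = max(range(j, m), key=lambda i: abs(W[i][j]))
        if W[p][j] == 0:
            continue  # nothing to eliminate in this column
        W[j], W[p] = W[p], W[j]
        perm[j], perm[p] = perm[p], perm[j]
        for i in range(j + 1, m):
            W[i][j] /= W[j][j]  # multiplier, stored in place of the eliminated entry
            for c in range(j + 1, n):
                W[i][c] -= W[i][j] * W[j][c]

    # Unpack the compact storage into L (m x k, unit diagonal) and U (k x n).
    L = [[W[i][j] if i > j else Fraction(int(i == j)) for j in range(k)] for i in range(m)]
    U = [[W[i][j] if j >= i else Fraction(0) for j in range(n)] for i in range(k)]
    # Row i of L U is row perm[i] of A, so P has a one at position (perm[i], i).
    P = [[Fraction(int(perm[c] == r)) for c in range(m)] for r in range(m)]
    return P, L, U

A = [[1, 2], [3, 4], [5, 6]]  # m = 3, n = 2, so k = 2
P, L, U = lu_partial_pivot(A)
assert matmul(P, matmul(L, U)) == A
```

Here `P` is 3 x 3, `L` is 3 x 2 and `U` is 2 x 2, matching the shapes in the definition.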

If pivot=False and A is on GPU, then the LU decomposition without pivoting is computed

$$A = LU \qquad L \in \mathbb{K}^{m \times k},\ U \in \mathbb{K}^{k \times n}$$

When pivot=False, the returned matrix P will be empty. The LU decomposition without pivoting may not exist if any of the leading principal submatrices of A is singular. In this case, the output matrices may contain inf or NaN.
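A minimal sketch of why the unpivoted factorization can fail, assuming a CPU tensor and a hand-computed elimination multiplier (this is an illustration of the arithmetic, not what the GPU routine runs internally). The 1 x 1 leading block of the matrix below is zero, so the first elimination step divides by zero, while the pivoted factorization swaps rows first and succeeds.

```python
import torch

A = torch.tensor([[0.0, 1.0],
                  [1.0, 0.0]])

# With partial pivoting (the default), the rows are swapped and the factorization exists.
P, L, U = torch.linalg.lu(A)
assert torch.allclose(P @ L @ U, A)

# Without pivoting, the first step would divide by A[0, 0] == 0.
multiplier = A[1, 0] / A[0, 0]
print(multiplier)  # tensor(inf)
```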

Supports input of float, double, cfloat and cdouble dtypes. Also supports batches of matrices, and if A is a batch of matrices then the output has the same batch dimensions.
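As a brief illustration of the batching behaviour, the sketch below factors a batch of four 3 x 5 matrices (the sizes are arbitrary choices for the example) and checks that every output keeps the leading batch dimension.

```python
import torch

A = torch.randn(4, 3, 5, dtype=torch.double)  # batch of 4 matrices, each 3 x 5
P, L, U = torch.linalg.lu(A)

# m = 3, n = 5, so k = min(m, n) = 3.
print(P.shape, L.shape, U.shape)
# torch.Size([4, 3, 3]) torch.Size([4, 3, 3]) torch.Size([4, 3, 5])

# The matrix products broadcast over the batch dimension.
assert torch.allclose(P @ L @ U, A)
```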

See also

torch.linalg.solve() solves a system of linear equations using the LU decomposition with partial pivoting.


The LU decomposition is almost never unique, as often there are different permutation matrices that can yield different LU decompositions. As such, different platforms, like SciPy, or inputs on different devices, may produce different valid decompositions.


Gradient computations are only supported if the input matrix is full-rank. If this condition is not met, no error will be thrown, but the gradient may not be finite. This is because the LU decomposition with pivoting is not differentiable at these points.
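For a full-rank input the gradient is well defined. The sketch below, using a small invertible matrix chosen for the example, differentiates $\log\lvert\det A\rvert$ computed from the diagonal of $U$ and compares the result with the known closed form $A^{-T}$.

```python
import torch

A = torch.tensor([[4.0, 3.0],
                  [6.0, 3.0]], dtype=torch.double, requires_grad=True)
P, L, U = torch.linalg.lu(A)

# |det A| equals the product of |U_ii|, since |det P| = 1 and L has a unit diagonal.
log_abs_det = U.diagonal().abs().log().sum()
log_abs_det.backward()

# The gradient of log|det A| with respect to A is the inverse transpose of A.
expected = torch.linalg.inv(A.detach()).T
assert torch.isfinite(A.grad).all()
assert torch.allclose(A.grad, expected)
```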

Parameters

  • A (Tensor) – tensor of shape (*, m, n) where * is zero or more batch dimensions.

  • pivot (bool, optional) – Controls whether to compute the LU decomposition with partial pivoting or no pivoting. Default: True.

Keyword Arguments

out (tuple, optional) – output tuple of three tensors. Ignored if None. Default: None.


Returns

A named tuple (P, L, U).


>>> A = torch.randn(3, 2)
>>> P, L, U = torch.linalg.lu(A)
>>> P
tensor([[0., 1., 0.],
        [0., 0., 1.],
        [1., 0., 0.]])
>>> L
tensor([[1.0000, 0.0000],
        [0.5007, 1.0000],
        [0.0633, 0.9755]])
>>> U
tensor([[0.3771, 0.0489],
        [0.0000, 0.9644]])
>>> torch.dist(A, P @ L @ U)

>>> A = torch.randn(2, 5, 7, device="cuda")
>>> P, L, U = torch.linalg.lu(A, pivot=False)
>>> P
tensor([], device='cuda:0')
>>> torch.dist(A, L @ U)
tensor(1.0376e-06, device='cuda:0')

