svd(input, some=True, compute_uv=True, *, out=None) -> (Tensor, Tensor, Tensor)
Computes the singular value decomposition of either a matrix or batch of matrices
input. The singular value decomposition is represented as a namedtuple (U, S, V), such that
input = U diag(S) Vᴴ, where Vᴴ is the transpose of V for real-valued inputs, or the conjugate transpose of V for complex-valued inputs. If
input is a batch of matrices, then U, S, and V are also batched with the same batch dimensions as input.
If some is True (default), the method returns the reduced singular value decomposition, i.e., if the last two dimensions of
input are m and n, then the returned U and V matrices will contain only min(n, m) orthonormal columns.
If compute_uv is False, the returned U and V will be zero-filled matrices of shape (m × m) and (n × n) respectively, on the same device as
input. The some argument has no effect when compute_uv is False.
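For illustration, a short sketch of how some and compute_uv affect the returned shapes (the shapes shown follow the rules above):

>>> import torch
>>> a = torch.randn(5, 3)  # m = 5, n = 3, so min(m, n) = 3
>>> u, s, v = torch.svd(a, some=True)   # reduced decomposition (default)
>>> u.shape, s.shape, v.shape
(torch.Size([5, 3]), torch.Size([3]), torch.Size([3, 3]))
>>> u, s, v = torch.svd(a, some=False)  # full decomposition
>>> u.shape, s.shape, v.shape
(torch.Size([5, 5]), torch.Size([3]), torch.Size([3, 3]))
>>> u, s, v = torch.svd(a, compute_uv=False)  # U and V come back zero-filled
>>> u.abs().max(), v.abs().max()
(tensor(0.), tensor(0.))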
Supports input of float, double, cfloat and cdouble data types. The dtypes of U and V are the same as
input's. S will always be real-valued, even if input is complex.
The some argument is the opposite of torch.linalg.svd()'s
full_matrices argument. Note that the default value for both is
True, so the default behavior is effectively the opposite.
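A hedged sketch of that correspondence (note that torch.linalg.svd returns Vᴴ rather than V):

>>> import torch
>>> a = torch.randn(5, 3)
>>> u1, s1, v1 = torch.svd(a, some=True)
>>> u2, s2, vh2 = torch.linalg.svd(a, full_matrices=False)
>>> torch.allclose(s1, s2)   # the singular values agree
True
>>> # Singular vectors are only unique up to sign/phase, so compare reconstructions:
>>> torch.allclose(u1 @ torch.diag(s1) @ v1.t(), u2 @ torch.diag(s2) @ vh2)
True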
The singular values are returned in descending order. If
input is a batch of matrices, then the singular values of each matrix in the batch are returned in descending order.
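A quick check of this ordering:

>>> import torch
>>> _, s, _ = torch.svd(torch.randn(4, 6))
>>> torch.all(s[:-1] >= s[1:])   # non-increasing order
tensor(True)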
The implementation of SVD on CPU uses the LAPACK routine ?gesdd (a divide-and-conquer algorithm) instead of ?gesvd for speed. Analogously, the SVD on GPU uses the cuSOLVER routines gesvdj and gesvdjBatched on CUDA 10.1.243 and later, and uses the MAGMA routine gesdd on earlier versions of CUDA.
The returned matrix U will be transposed, i.e. with strides U.contiguous().transpose(-2, -1).stride().
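A quick way to observe this layout; the exact strides are an implementation detail and may differ across versions and backends, so the outputs below are only typical:

>>> import torch
>>> u, s, v = torch.svd(torch.randn(5, 3))
>>> u.stride()        # e.g. (1, 5): column-major (transposed) layout
(1, 5)
>>> u.is_contiguous() # typically False; call u.contiguous() if row-major data is needed
False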
Gradients computed using U and V may be unstable if
input is not full rank or has non-unique singular values.
When some is False, the gradients on
U[..., :, min(m, n):] and
V[..., :, min(m, n):] will be ignored in the backward pass, as those vectors can be arbitrary bases of the corresponding subspaces.
The S tensor can only be used to compute gradients if compute_uv is True.
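A minimal autograd sketch of this restriction (the sum loss is arbitrary, and the exact error type raised in the failing case is an assumption):

>>> import torch
>>> a = torch.randn(5, 3, requires_grad=True)
>>> _, s, _ = torch.svd(a)   # compute_uv=True by default
>>> s.sum().backward()       # gradients flow through S
>>> a.grad.shape
torch.Size([5, 3])
>>> b = torch.randn(5, 3, requires_grad=True)
>>> _, s, _ = torch.svd(b, compute_uv=False)
>>> try:
...     s.sum().backward()   # expected to fail, per the note above
... except RuntimeError:
...     print("backward through S requires compute_uv=True")
backward through S requires compute_uv=True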
With complex-valued inputs, the backward operation works correctly only for gauge-invariant loss functions. Please look at Gauge problem in AD for more details.
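As a sketch, here is one gauge-invariant loss: the reconstruction U diag(S) Vᴴ is unchanged by the arbitrary phase factors on the singular vectors, so differentiating through it is well-defined:

>>> import torch
>>> a = torch.randn(4, 4, dtype=torch.cfloat)
>>> x = a.clone().requires_grad_(True)
>>> u, s, v = torch.svd(x)
>>> recon = u @ torch.diag(s).to(torch.cfloat) @ v.conj().transpose(-2, -1)
>>> loss = (recon - a).abs().pow(2).sum()   # gauge invariant: phases cancel in U diag(S) Vᴴ
>>> loss.backward()
>>> x.grad.dtype
torch.complex64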
Since the U and V of an SVD are not unique, each singular vector can be multiplied by an arbitrary phase factor while the SVD result remains correct. Different platforms, like NumPy, or inputs on different device types, may therefore produce different U and V tensors.
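A small illustration of this non-uniqueness: flipping the sign (a real phase factor) of matching columns of U and V leaves the product unchanged:

>>> import torch
>>> a = torch.randn(5, 3)
>>> u, s, v = torch.svd(a)
>>> u2, v2 = u.clone(), v.clone()
>>> u2[:, 0] *= -1   # flip the first left singular vector...
>>> v2[:, 0] *= -1   # ...and the first right singular vector together
>>> torch.allclose(u @ torch.diag(s) @ v.t(), u2 @ torch.diag(s) @ v2.t())
True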
Parameters
input (Tensor) – the input tensor of size (*, m, n) where * is zero or more batch dimensions consisting of (m × n) matrices.
some (bool, optional) – controls whether to compute the reduced or full decomposition, and consequently the shape of returned U and V. Defaults to True.
compute_uv (bool, optional) – controls whether to compute U and V. Defaults to True.
Keyword Arguments
out (tuple, optional) – the output tuple of tensors
Example:
>>> a = torch.randn(5, 3)
>>> a
tensor([[ 0.2364, -0.7752,  0.6372],
        [ 1.7201,  0.7394, -0.0504],
        [-0.3371, -1.0584,  0.5296],
        [ 0.3550, -0.4022,  1.5569],
        [ 0.2445, -0.0158,  1.1414]])
>>> u, s, v = torch.svd(a)
>>> u
tensor([[ 0.4027,  0.0287,  0.5434],
        [-0.1946,  0.8833,  0.3679],
        [ 0.4296, -0.2890,  0.5261],
        [ 0.6604,  0.2717, -0.2618],
        [ 0.4234,  0.2481, -0.4733]])
>>> s
tensor([2.3289, 2.0315, 0.7806])
>>> v
tensor([[-0.0199,  0.8766,  0.4809],
        [-0.5080,  0.4054, -0.7600],
        [ 0.8611,  0.2594, -0.4373]])
>>> torch.dist(a, torch.mm(torch.mm(u, torch.diag(s)), v.t()))
tensor(8.6531e-07)
>>> a_big = torch.randn(7, 5, 3)
>>> u, s, v = torch.svd(a_big)
>>> torch.dist(a_big, torch.matmul(torch.matmul(u, torch.diag_embed(s)), v.transpose(-2, -1)))
tensor(2.6503e-06)