Singular Value Decomposition (SVD)
The Singular Value Decomposition (SVD) represents the pinnacle of matrix factorizations. While the eigendecomposition is restricted to square matrices (and further limited by diagonalizability), the SVD is universal. Every linear operator $A \in \mathbb{F}^{m \times n}$ (where $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$) possesses an SVD, providing profound insights into the geometry and numerical stability of the transformation.
1. The SVD Theorem
Let $A \in \mathbb{C}^{m \times n}$ be a matrix of rank $r$. There exists a factorization of the form:
$$A = U \Sigma V^*$$
where:
- $U \in \mathbb{C}^{m \times m}$ is a unitary matrix ($U^*U = UU^* = I_m$). Its columns $u_i$ are called the left singular vectors.
- $V \in \mathbb{C}^{n \times n}$ is a unitary matrix ($V^*V = VV^* = I_n$). Its columns $v_i$ are the right singular vectors.
- $\Sigma \in \mathbb{R}^{m \times n}$ is a rectangular diagonal matrix with non-negative real entries $\sigma_i$ on the diagonal. These are the singular values.
In the real case ($A \in \mathbb{R}^{m \times n}$), $U$ and $V$ are orthogonal matrices, and the decomposition is $A = U \Sigma V^T$.
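As a quick numerical illustration of the theorem, the factorization can be computed with NumPy's np.linalg.svd. This is a minimal sketch: the 3×2 matrix below is an arbitrary example, and note that NumPy returns the factor Vh, i.e. $V^*$, rather than $V$ itself.
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0], [0.0, 2.0]])  # arbitrary 3x2 example
U, S, Vh = np.linalg.svd(A)               # full SVD: U is 3x3, Vh is 2x2, S has length 2
Sigma = np.zeros(A.shape)                 # rectangular diagonal matrix (3x2)
np.fill_diagonal(Sigma, S)
print(np.allclose(A, U @ Sigma @ Vh))     # A = U Sigma V^T
print(np.allclose(U.T @ U, np.eye(3)))    # U is orthogonal
print(np.allclose(Vh @ Vh.T, np.eye(2)))  # V is orthogonal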
2. Derivation and Link to Eigendecomposition
The SVD is intimately connected to the spectral properties of the Hermitian matrices $A^*A$ and $AA^*$.
- Right Singular Vectors ($V$): Consider the $n \times n$ matrix $A^*A$. Since $A^*A$ is positive semi-definite and Hermitian, its eigenvalues are real and non-negative. Let these be $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \geq 0$. The right singular vectors $v_i$ are the orthonormal eigenvectors of $A^*A$.
- Singular Values ($\Sigma$): The singular values are the square roots of these eigenvalues: $\sigma_i = \sqrt{\lambda_i}$.
- Left Singular Vectors ($U$): These are the orthonormal eigenvectors of $AA^*$. For $\sigma_i > 0$, they are uniquely determined by $u_i = \frac{1}{\sigma_i} A v_i$.
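The following sketch checks these relations numerically. It assumes a real matrix, so that $A^* = A^T$; the 5×3 random matrix is an arbitrary example.
import numpy as np

A = np.random.randn(5, 3)                             # real case, so A^* = A^T
U, S, Vh = np.linalg.svd(A, full_matrices=False)

eigvals = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]  # eigenvalues of A^T A, descending
print(np.allclose(S, np.sqrt(eigvals)))               # sigma_i = sqrt(lambda_i)
print(np.allclose(U, (A @ Vh.T) / S))                 # u_i = (1/sigma_i) * A v_i, column by column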
3. Geometric Intuition
Geometrically, the SVD asserts that any linear transformation can be decomposed into a rotation in the domain, a scaling along principal axes, and a rotation in the codomain.
Imagine a unit sphere in $\mathbb{R}^n$. Under the transformation $x \mapsto Ax$, this sphere is mapped to a hyper-ellipsoid in $\mathbb{R}^m$.
- The singular values $\sigma_i$ are the lengths of the semi-axes of the ellipsoid.
- The left singular vectors $u_i$ define the directions of these semi-axes in the codomain.
- The right singular vectors $v_i$ define the orthonormal basis in the domain that maps to these semi-axes.
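A minimal two-dimensional sketch of this picture: sample points on the unit circle, push them through an arbitrary matrix $A$ (chosen here only for illustration), and compare the longest and shortest image vectors with the singular values.
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])               # arbitrary 2x2 transformation
theta = np.linspace(0.0, 2.0 * np.pi, 2000)
circle = np.vstack([np.cos(theta), np.sin(theta)])   # points on the unit circle, shape (2, 2000)
ellipse = A @ circle                                 # image of the circle: an ellipse
lengths = np.linalg.norm(ellipse, axis=0)

sigma = np.linalg.svd(A, compute_uv=False)
print(lengths.max(), sigma[0])   # longest semi-axis  ~= sigma_1
print(lengths.min(), sigma[1])   # shortest semi-axis ~= sigma_2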
4. Properties of Singular Values
- Non-negativity: $\sigma_i \geq 0$ for all $i$.
- Ordering: Conventionally, $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_p \geq 0$, where $p = \min(m, n)$.
- Rank: The number of non-zero singular values is exactly the rank $r$ of the matrix $A$.
- Norms:
  - Spectral Norm: $\|A\|_2 = \sigma_1$.
  - Frobenius Norm: $\|A\|_F = \sqrt{\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_r^2}$.
- Condition Number: For an invertible square matrix, $\kappa_2(A) = \sigma_1 / \sigma_n$.
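These identities are easy to confirm numerically. The sketch below uses an arbitrary random square matrix and NumPy's norm, cond, and matrix_rank routines.
import numpy as np

A = np.random.randn(6, 6)
sigma = np.linalg.svd(A, compute_uv=False)    # singular values, descending

print(np.isclose(np.linalg.norm(A, 2), sigma[0]))                       # spectral norm = sigma_1
print(np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(sigma**2))))  # Frobenius norm
print(np.isclose(np.linalg.cond(A, 2), sigma[0] / sigma[-1]))           # condition number
print(np.linalg.matrix_rank(A) == np.sum(sigma > 1e-12))                # rank = number of nonzero sigmas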
5. The Moore-Penrose Pseudoinverse
For any matrix $A \in \mathbb{C}^{m \times n}$, we define the pseudoinverse $A^+ \in \mathbb{C}^{n \times m}$ via the SVD:
$$A^+ = V \Sigma^+ U^*$$
where $\Sigma^+$ is obtained by taking the reciprocal of each non-zero element on the diagonal of $\Sigma$, leaving the zeros in place, and transposing the result.
Applications in Linear Systems: For an overdetermined system $Ax = b$, the least-squares solution that minimizes $\|Ax - b\|_2$ is given by $x = A^+ b$. If multiple solutions exist (underdetermined case), $A^+ b$ provides the solution with the minimum Euclidean norm.
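A minimal sketch of this use case: the pseudoinverse is built explicitly from the thin SVD (the example matrix has full column rank, so all $\sigma_i > 0$), then compared against np.linalg.pinv and NumPy's least-squares solver.
import numpy as np

m, n = 20, 5
A = np.random.randn(m, n)                        # tall matrix: overdetermined system
b = np.random.randn(m)

U, S, Vh = np.linalg.svd(A, full_matrices=False)
A_plus = Vh.T @ np.diag(1.0 / S) @ U.T           # A^+ = V Sigma^+ U^T (real case)
print(np.allclose(A_plus, np.linalg.pinv(A)))

x_pinv = A_plus @ b                              # least-squares solution x = A^+ b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)  # reference solution
print(np.allclose(x_pinv, x_lstsq))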
6. Low-Rank Approximation
The Eckart–Young–Mirsky Theorem provides the theoretical foundation for data compression. It states that the best rank-$k$ approximation of $A$ (where $k < r$) in the Frobenius norm is given by:
$$A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^*$$
The approximation error is $\|A - A_k\|_F = \sqrt{\sum_{i=k+1}^{r} \sigma_i^2}$. This property is utilized in Principal Component Analysis (PCA) to reduce dimensionality while preserving maximum variance.
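A short numerical check of the error formula (the truncation itself is implemented in full in Section 8; the matrix size and rank $k$ below are arbitrary):
import numpy as np

A = np.random.randn(40, 30)
U, S, Vh = np.linalg.svd(A, full_matrices=False)
k = 5
Ak = U[:, :k] @ np.diag(S[:k]) @ Vh[:k, :]       # best rank-k approximation

# ||A - A_k||_F equals the root-sum-square of the discarded singular values
print(np.isclose(np.linalg.norm(A - Ak, 'fro'), np.sqrt(np.sum(S[k:]**2))))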
7. Polar Decomposition
The SVD enables the Polar Decomposition, analogous to the polar form of complex numbers ($z = re^{i\theta}$). Any square matrix $A$ can be written as:
$$A = QP$$
where $Q$ is unitary (representing rotation/reflection) and $P$ is a positive semi-definite Hermitian matrix (representing scaling/stretching).
- Using the SVD: $A = U\Sigma V^* = (UV^*)(V\Sigma V^*)$.
- Here $Q = UV^*$ and $P = V\Sigma V^*$.
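A minimal sketch of this construction in the real case, using an arbitrary random matrix (SciPy's scipy.linalg.polar provides the same factorization directly):
import numpy as np

A = np.random.randn(4, 4)
U, S, Vh = np.linalg.svd(A)

Q = U @ Vh                    # orthogonal factor Q = U V^T
P = Vh.T @ np.diag(S) @ Vh    # symmetric positive semi-definite factor P = V Sigma V^T

print(np.allclose(A, Q @ P))                    # A = Q P
print(np.allclose(Q.T @ Q, np.eye(4)))          # Q is orthogonal
print(np.all(np.linalg.eigvalsh(P) >= -1e-12))  # P is positive semi-definite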
8. Python Implementation: Rank-k Approximation
The following script performs a rank-$k$ approximation on a randomly generated matrix and quantifies the relative error.
import numpy as np
def rank_k_approximation(A, k):
    # Perform the (thin) SVD: A = U @ diag(S) @ Vh
    U, S, Vh = np.linalg.svd(A, full_matrices=False)
    # Construct Sigma_k from the k largest singular values
    Sk = np.diag(S[:k])
    # Reconstruct the rank-k matrix
    Ak = U[:, :k] @ Sk @ Vh[:k, :]
    # Calculate the relative Frobenius-norm error
    full_norm = np.linalg.norm(A, 'fro')
    error_norm = np.linalg.norm(A - Ak, 'fro')
    return Ak, error_norm / full_norm
# Example usage
m, n = 100, 50
A = np.random.randn(m, n)
k = 10
Ak, rel_error = rank_k_approximation(A, k)
print(f"Original Rank: {np.linalg.matrix_rank(A)}")
print(f"Target Rank: {k}")
print(f"Relative Frobenius Error: {rel_error:.4f}")
9. Applications in Physics and Engineering
- Image Compression: By treating an image as an $m \times n$ matrix and keeping only the top $k$ singular values, we store $k(m + n + 1)$ values instead of $mn$.
- PCA (Principal Component Analysis): In statistics, PCA can be computed by applying the SVD to the centered data matrix (equivalently, by eigendecomposition of the covariance matrix) to find the directions of maximum variance.
- Signal Processing: SVD is used to separate signal from noise, as noise typically corresponds to small singular values.
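As an illustration of the last point, the following sketch builds a low-rank "signal" matrix, adds noise, and recovers an approximation by discarding the small singular values. The chosen rank, matrix size, and noise level are arbitrary assumptions for the example.
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 100, 80, 5
signal = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank-r "signal"
noisy = signal + 0.05 * rng.standard_normal((m, n))                 # signal plus additive noise

U, S, Vh = np.linalg.svd(noisy, full_matrices=False)
denoised = U[:, :r] @ np.diag(S[:r]) @ Vh[:r, :]   # keep only the r dominant components

print(np.linalg.norm(noisy - signal, 'fro'))       # error before truncation
print(np.linalg.norm(denoised - signal, 'fro'))    # smaller error after truncation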