Singular value decomposition (SVD) theorem

Theorem: Singular Value Decomposition (SVD)

An arbitrary matrix $A in mathbf{R}^{m times n}$ admits a decomposition of the form

$A = sum_{i=1}^r sigma_i u_i v_i^T = U tilde{ {S}} V^T, ;; tilde{ {S}} := left( begin{array}{cc} {S} & 0 0 & 0 end{array} right) ,$

where $U in mathbf{R}^{m times m}$ , $V in mathbf{R}^{n times n}$ are both orthogonal matrices, and the matrix {S} is diagonal:

$S = mbox{bf diag}(sigma_1 , ldots, sigma_r),$

where the positive numbers sigma_1 ge ldots ge sigma_r >0 are unique, and are called the singular values of . The number r le min(m,n) is equal to the rank of , and the triplet (U,tilde{ {S}},V) is called a singular value decomposition (SVD) of . The first columns of : u_i , i=1,ldots,r (resp. : v_i , ) are called left (resp. right) singular vectors of , and satisfy

A v_i = sigma_i u_i, ;; u_i^T A = sigma_i v_i, ;; i=1,ldots,r.

Proof: The matrix n times n A^TA is real and symmetric. According to the spectral theorem, it admits an eigenvalue decomposition in the form A^TA = V Lambda V^T , with a matrix whose columns form an orthonormal basis (that is, V^TV = VV^T = I_n ), and $Lambda = mbox{bf diag}(lambda_1,ldots,lambda_r,0,ldots,0)$ . Here, is the rank of A^TA (if r = n then there are no trailing zeros in Lambda ). Since A^TA is positive semi-definite, the lambda_j 's are non-negative, and we can define the non-zero quantities $sigma_j := sqrt{lambda_j}$ , j=1,ldots,r .

Note that when j >r , Av_j = 0 , since then |Av_j|_2^2 = v_j^TA^TAv_j = lambda_j v_j^Tv_j = 0 .

Let us construct an m times m orthogonal matrix as follows. We set

$u_i = frac{1}{sigma_i} Av_i, ;; i=1,ldots,r.$

These -vectors are unit-norm, and mutually orthogonal, since v_j 's are eigenvectors of A^TA . Using (say) the Gram-Schmidt orthogonalization procedure, we can complete (if necessary, that is in the case r <m ) this set of vectors by $u_{r+1},ldots,u_m$ in order to form an orthogonal matrix $U:=(u_1,ldots,u_m) in mathbf{R}^m$ .

Let us check that U,V satisfy the conditions of the theorem, by showing that $U^TAV^T=tilde{S}:=mbox{bf diag}(sigma_1,ldots,sigma_r,0,ldots,0)$ . We have

$(U^TAV)_{ij} = u_i^TAv_j = left{begin{array}{ll}sigma_j u_i^Tu_j &mbox{if } j le r 0 & mbox{otherwise,} end{array}right.$

where the second line stems from the fact that Av_j = 0 when j >r . Thus, $U^TAV = tilde{S}$ , as claimed.