Positive Semi-Definite Matrices

Symmetric Matrices > Definitions | Spectral theorem | PSD matrices | PCA | Applications

Definitions
Special cases and examples
Square root and Cholesky decomposition
Ellipsoids

Definitions

For a given symmetric matrix $A in mathbf{R}^{n times n}$ , the associated quadratic form is the function $q : mathbf{R}^n rightarrow mathbf{R}$ with values

A symmetric matrix is said to be positive semi-definite (PSD, notation: ) if and only if the associated quadratic form is non-negative everywhere:

$q(x) ge 0 mbox{ for every } x in mathbf{R}^n.$

It is said to be positive definite (PD, notation: ) if the quadratic form is non-negative, and definite, that is, if and only if .

It turns out that a matrix is PSD if and only if the eigenvalues of are non-negative. Thus, we can check if a form is PSD by computing the eigenvalue decomposition of the underlying symmetric matrix.

Theorem: eigenvalues of PSD matrices

A quadratic form q(x) = x^TAx , with $A in mathbf{S}^n$ is non-negative (resp. positive-definite) if and only if every eigenvalue of the symmetric matrix is non-negative (resp. positive).

Proof.

By definition, the PSD and PD properties are properties of the eigenvalues of the matrix only, not of the eigenvectors. Also, if the n times n matrix is PSD, then for every matrix with columns, the matrix B^T A B also is.

Special cases and examples

Symmetric dyads

Special cases of PSD matrices include symmetric dyads. Indeed, if A = uu^T for some vector $u in mathbf{R}^n$ , then for every :

More generally if $B in mathbf{R}^{m times n}$ , then A = B^TB is PSD, since

Diagonal matrices

A diagonal matrix is PSD (resp. PD) if and only if all of its (diagonal) elements are non-negative (resp. positive).

Examples of PSD matrices

Square root and Cholesky decomposition

For PD matrices, we can generalize the notion of ordinary square root of a non-negative number. Indeed, if is PSD, there exist a unique PSD matrix, denoted $A^{1/2}$ , such that $A = (A^{1/2})^2$ . We can express this matrix square root in terms of the SED of A = ULambda U^T , as $A^{1/2} = U Lambda^{1/2} U^T$ , where $Lambda^{1/2}$ is obtained from Lambda by taking the square root of its diagonal elements. If is PD, then so is its square root.

Any PSD matrix can be written as a product A = LL^T for an appropriate matrix . The decomposition is not unique, and $L = A^{1/2}$ is only a possible choice (the only PSD one). Another choice, in terms of the SED of A = U^T Lambda U , is $L = U^T Lambda^{1/2}$ . If is positive-definite, then we can choose to be lower triangular, and invertible. The decomposition is then known as the Cholesky decomposition of .

Ellipsoids

There is a strong correspondence between ellipsoids and PSD matrices.

Definition

We define an ellipsoid to be affine transformation of the unit ball for the Euclidean norm:

$mathbf{E} = left{ hat{x} + L z ~:~ |z|_2 le 1 right} ,$

where $L in mathbf{R}^{n times n}$ is an arbitrary non-singular matrix. We can express the ellipsoid as

$mathbf{E} = left{ x ~:~ |L^{-1}(x-hat{x})|_2 le 1 right} = left{ x ~:~ (x-hat{x})^T A^{-1} (x-hat{x}) le 1 right} ,$

where $A := L^{-T}L^{-1}$ is PD.

Geometric interpretation via SED

We can interpret the eigenvectors and associated eigenvalues of in terms of geometrical properties of the ellipsoid, as follows. Consider the SED of : A = U Lambda U^T , with U^TU = I and Lambda diagonal, with diagonal elements positive. The SED of its inverse is $A^{-1} = L L^T = U Lambda^{-1} U^T$ . Let $tilde{x} = U^T(x-hat{x})$ . We can express the condition x in mathbf{E} as

$tilde{x}^T Lambda tilde{x} = sum_{i=1}^n lambda_i tilde{x}_i^2 le 1.$

Now set $bar{x}_i = sqrt{lambda_i} tilde{x}_i$ , i=1,ldots,n . The above writes $bar{x}^Tbar{x} le 1$ : in bar{x} -space, the ellipsoid is simply an unit ball. In -space, the ellipsoid corresponds to scaling each bar{x} -axis by the square roots of the eigenvalues. The ellipsoid has principal axes parallel to the coordinate axes in tilde{x} -space. We then apply a rotation and a translation, to get the ellipsoid in the original -space. The rotation is determined by the eigenvectors of $A^{-1}$ , which are contained in the orthogonal matrix . Thus, the geometry of the ellipsoid can be read from the SED of the PD matrix $A^{-1} = LL^T$ : the eigenvectors give the principal directions, and the semi-axis lengths are the square root of the eigenvalues.

The graph on the left shows the ellipsoid ${ hat{x} + L z ::: |z|_2 le 1}$ , with

hat{x} = left( begin{array}{c}21 end{array} right), ;; L = left( begin{array}{cc}1 & 20 & 3 end{array} right).

The matrix LL^T admits the SED

$LL^T = U^T Lambda U, ;; U = left( begin{array}{cc}-0.9871 &0.1602 0.1602 &0.9871end{array} right), ;; Lambda = left( begin{array}{cc} 0.6754 & 0 0 &13.3246end{array}right) .$

We check that the columns of determine the principal directions, and $sqrt{lambda_1} = 3.6503$ , $sqrt{lambda_2} = 0.8219$ are the semi-axis lengths.

The above shows in particular that an equivalent representation of an ellipsoid is

$left{ x ~:~ (x-hat{x})^T B(x-hat{x}) le 1 right}$

where $B:=A^{-1}$ is PD.

It is possible to define degenerate ellipsoids, which correspond to cases when the matrix in the above, or its inverse , is degenerate. For example, cylinders or slabs (intersection of two parallel half-spaces) are degenerate ellipsoids.