Positive Semi-Definite Matrices

  • Definitions

  • Special cases and examples

  • Square root and Cholesky decomposition

  • Ellipsoids

Definitions

For a given symmetric matrix A in mathbf{R}^{n times n}, the associated quadratic form is the function q : mathbf{R}^n rightarrow mathbf{R} with values

 q(x) = x^TAx.
  • A symmetric matrix A is said to be positive semi-definite (PSD, notation: A succeq 0) if and only if the associated quadratic form q is non-negative everywhere:

 q(x) ge 0 mbox{ for every } x in mathbf{R}^n.
  • It is said to be positive definite (PD, notation: A succ 0) if the quadratic form is non-negative, and definite, that is, q(x) = 0 if and only if x=0.

It turns out that a matrix is PSD if and only if the eigenvalues of A are non-negative. Thus, we can check if a form is PSD by computing the eigenvalue decomposition of the underlying symmetric matrix.

Theorem: eigenvalues of PSD matrices

A quadratic form q(x) = x^TAx, with A in mathbf{S}^n is non-negative (resp. positive-definite) if and only if every eigenvalue of the symmetric matrix A is non-negative (resp. positive).

Proof.

By definition, the PSD and PD properties are properties of the eigenvalues of the matrix only, not of the eigenvectors. Also, if the n times n matrix A is PSD, then for every matrix B with n columns, the matrix B^T A B also is.

Special cases and examples

Symmetric dyads

Special cases of PSD matrices include symmetric dyads. Indeed, if A = uu^T for some vector u in mathbf{R}^n, then for every x:

 q_A (x)  = x^T uu^T x = (u^Tx)^2 ge 0.

More generally if B in mathbf{R}^{m times n}, then A = B^TB is PSD, since

 q_A(x) = x^TB^TBx = |Bx|_2^2 ge 0.

Diagonal matrices

A diagonal matrix is PSD (resp. PD) if and only if all of its (diagonal) elements are non-negative (resp. positive).

Examples of PSD matrices

Square root and Cholesky decomposition

For PD matrices, we can generalize the notion of ordinary square root of a non-negative number. Indeed, if A is PSD, there exist a unique PSD matrix, denoted A^{1/2}, such that A = (A^{1/2})^2. We can express this matrix square root in terms of the SED of A = ULambda U^T, asA^{1/2} = U Lambda^{1/2} U^T, where Lambda^{1/2} is obtained from Lambda by taking the square root of its diagonal elements. If A is PD, then so is its square root.

Any PSD matrix can be written as a product A = LL^T for an appropriate matrix L. The decomposition is not unique, and L = A^{1/2} is only a possible choice (the only PSD one). Another choice, in terms of the SED of A = U^T Lambda U, is L = U^T Lambda^{1/2}. If A is positive-definite, then we can choose L to be lower triangular, and invertible. The decomposition is then known as the Cholesky decomposition of A.

Ellipsoids

There is a strong correspondence between ellipsoids and PSD matrices.

Definition

We define an ellipsoid to be affine transformation of the unit ball for the Euclidean norm:

 mathbf{E} = left{ hat{x} + L z ~:~ |z|_2 le 1 right} ,

where L in mathbf{R}^{n times n} is an arbitrary non-singular matrix. We can express the ellipsoid as

 mathbf{E} = left{ x ~:~ |L^{-1}(x-hat{x})|_2 le 1 right}  =  left{ x ~:~ (x-hat{x})^T A^{-1} (x-hat{x}) le 1 right} ,

where A := L^{-T}L^{-1} is PD.

Geometric interpretation via SED

We can interpret the eigenvectors and associated eigenvalues of A in terms of geometrical properties of the ellipsoid, as follows. Consider the SED of A: A = U Lambda U^T, with U^TU = I and Lambda diagonal, with diagonal elements positive. The SED of its inverse is A^{-1} = L L^T = U Lambda^{-1} U^T. Let tilde{x} = U^T(x-hat{x}). We can express the condition x in mathbf{E} as

 tilde{x}^T Lambda tilde{x} = sum_{i=1}^n lambda_i tilde{x}_i^2 le 1.

Now set bar{x}_i = sqrt{lambda_i} tilde{x}_i, i=1,ldots,n. The above writes bar{x}^Tbar{x} le 1: in bar{x}-space, the ellipsoid is simply an unit ball. In tilde{x}-space, the ellipsoid corresponds to scaling each bar{x}-axis by the square roots of the eigenvalues. The ellipsoid has principal axes parallel to the coordinate axes in tilde{x}-space. We then apply a rotation and a translation, to get the ellipsoid in the original x-space. The rotation is determined by the eigenvectors of A^{-1}, which are contained in the orthogonal matrix U. Thus, the geometry of the ellipsoid can be read from the SED of the PD matrix A^{-1} = LL^T: the eigenvectors give the principal directions, and the semi-axis lengths are the square root of the eigenvalues.

alt text 

The graph on the left shows the ellipsoid { hat{x} + L z ::: |z|_2 le 1}, with

 hat{x} = left( begin{array}{c}21 end{array} right), ;; L = left( begin{array}{cc}1 & 20 & 3 end{array} right).

The matrix LL^T admits the SED

 LL^T = U^T Lambda U, ;; U = left( begin{array}{cc}-0.9871 &0.1602 0.1602 &0.9871end{array} right), ;; Lambda = left( begin{array}{cc} 0.6754 & 0 0 &13.3246end{array}right) .

We check that the columns of U determine the principal directions, and sqrt{lambda_1} = 3.6503, sqrt{lambda_2} = 0.8219 are the semi-axis lengths.

The above shows in particular that an equivalent representation of an ellipsoid is

 left{ x ~:~ (x-hat{x})^T B(x-hat{x}) le 1 right}

where B:=A^{-1} is PD.

It is possible to define degenerate ellipsoids, which correspond to cases when the matrix B in the above, or its inverse A, is degenerate. For example, cylinders or slabs (intersection of two parallel half-spaces) are degenerate ellipsoids.