Semidefinite Optimization

  • Definitions and standard forms

  • Special cases

  • Examples

Definitions and standard forms

Definition

We say that a problem is a semidefinite optimization problem (SDP) if it is a conic optimization problem of the form
\[
\min_x \; c^T x \;:\; Ax = b, \;\; x \in \mathcal{K},
\]
where the cone $\mathcal{K}$ is a product of semidefinite cones of given sizes. The decision variable $x$ contains the elements of matrices, and the constraint $x \in \mathcal{K}$ imposes positive semidefiniteness on those matrices.

Standard form

Often, it is useful to think of $x$ as a matrix variable. The following standard form formalizes this.

First note that, without loss of generality, we may assume that $\mathcal{K} = \mathcal{S}^n_+$. To illustrate why, take the example $\mathcal{K} = \mathcal{S}^{n_1}_+ \times \mathcal{S}^{n_2}_+$. The condition $(X_1, X_2) \in \mathcal{S}^{n_1}_+ \times \mathcal{S}^{n_2}_+$ is the same as $X := \mathbf{diag}(X_1, X_2) \in \mathcal{S}^{n_1 + n_2}_+$. Thus, by imposing that certain elements of the matrix $X$ be zero (those outside the diagonal blocks of size $n_i \times n_i$), we can always reduce our problem to the case where the cone $\mathcal{K}$ is a single semidefinite cone, with the matrix variable constrained to be block diagonal (with block sizes $n_1$, $n_2$).

A standard form for the SDP model is thus
\[
\min_{X \in \mathcal{S}^n} \; \mathbf{Tr}(CX) \;:\; \mathbf{Tr}(A_i X) = b_i, \;\; i=1,\ldots,m, \;\; X \succeq 0,
\]
where $C, A_1, \ldots, A_m \in \mathcal{S}^n$, and $b \in \mathbf{R}^m$. (In the above, we have used the fact that a single affine equality constraint on a symmetric matrix variable $X$ can be represented as a scalar product condition of the form $\langle A, X \rangle = b$, for an appropriate symmetric matrix $A$ and scalar $b$.)
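To make this concrete, here is a minimal sketch of the standard form in CVXPY (a modeling package not used in these notes; the data $C$, $A_i$, $b$ below are randomly generated for illustration, with $C$ positive semidefinite so the objective is bounded below, and $b$ built from a feasible point):

```python
import cvxpy as cp
import numpy as np

# Illustrative data for the standard-form SDP (not from the notes).
n, m = 4, 3
rng = np.random.default_rng(0)
C = rng.standard_normal((n, n)); C = C @ C.T        # PSD cost matrix
A = []
for _ in range(m):
    M = rng.standard_normal((n, n))
    A.append((M + M.T) / 2)                         # symmetric constraint matrices
X0 = rng.standard_normal((n, n)); X0 = X0 @ X0.T    # a PSD point ...
b = np.array([np.trace(Ai @ X0) for Ai in A])       # ... makes the constraints feasible

# min Tr(C X)  s.t.  Tr(A_i X) = b_i,  X PSD.
X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0] + [cp.trace(A[i] @ X) == b[i] for i in range(m)]
prob = cp.Problem(cp.Minimize(cp.trace(C @ X)), constraints)
prob.solve()
print(prob.status, prob.value)
```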

Inequality form

An alternate standard form, called the inequality standard form, is
\[
\min_x \; c^T x \;:\; F_0 + \sum_{i=1}^n x_i F_i \succeq 0,
\]
where the matrices $F_0, \ldots, F_n \in \mathcal{S}^m$. The constraint in the above problem is called a linear matrix inequality (LMI). The above form is easily converted to the previous standard form (I suggest you try this as an exercise).
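The inequality form maps just as directly to code. A sketch with made-up data (the shift of $F_0$ is only there to make $x = 0$ strictly feasible):

```python
import cvxpy as cp
import numpy as np

# Inequality-form SDP: min c^T x  s.t.  F0 + x_1 F1 + ... + x_n Fn PSD (an LMI).
n, mdim = 3, 4
rng = np.random.default_rng(1)
F = []
for _ in range(n + 1):
    M = rng.standard_normal((mdim, mdim))
    F.append((M + M.T) / 2)                      # symmetric F_0, ..., F_n
F[0] += 2 * mdim * np.eye(mdim)                  # make x = 0 strictly feasible
c = rng.standard_normal(n)

x = cp.Variable(n)
lmi = F[0] + sum(x[i] * F[i + 1] for i in range(n)) >> 0
prob = cp.Problem(cp.Minimize(c @ x), [lmi])
prob.solve()
print(prob.status, prob.value)                   # may be unbounded for some draws
```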

Special cases

As discussed above, we can use this formalism to handle multiple LMIs, via block-diagonal matrices. To illustrate this, we show that SDPs contain LPs as a special case. Indeed, the LP
\[
\min_x \; c^T x \;:\; Ax \le b
\]
is equivalent to the SDP
\[
\min_x \; c^T x \;:\; F(x) := \mathbf{diag}(b_1 - a_1^T x, \ldots, b_m - a_m^T x) \succeq 0,
\]
where $a_i^T$ is the $i$-th row of $A$. Thus, LPs are SDPs with diagonal matrices in their LMIs.
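This reduction is easy to check numerically; a small sketch (random data chosen so that the LP is feasible and bounded):

```python
import cvxpy as cp
import numpy as np

# An LP min c^T x s.t. Ax <= b, and the same problem as an SDP with a diagonal LMI.
rng = np.random.default_rng(2)
m, n = 5, 2
A = rng.standard_normal((m, n))
b = rng.random(m) + 1.0          # b > 0, so x = 0 is strictly feasible
c = -A.T @ np.ones(m)            # ensures c^T x >= -sum(b) on the feasible set

x = cp.Variable(n)
lp  = cp.Problem(cp.Minimize(c @ x), [A @ x <= b])
sdp = cp.Problem(cp.Minimize(c @ x), [cp.diag(b - A @ x) >> 0])
print(lp.solve(), sdp.solve())   # the two values agree up to solver tolerance
```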

Similarly, SDPs contain SOCPs, by virtue of the following result, already mentioned in XXX: for every $x$, $t$,
\[
\|x\|_2 \le t \;\Longleftrightarrow\;
\left(\begin{array}{cccc}
t & x_1 & \ldots & x_n \\
x_1 & t & & 0 \\
\vdots & & \ddots & \\
x_n & 0 & & t
\end{array}\right) \succeq 0.
\]
Thus, a second-order cone constraint can be written as an LMI with an ``arrow'' matrix.
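To see why the equivalence holds (a step not detailed here), one can use the Schur complement rule: for $t > 0$,
\[
\left(\begin{array}{cc} t & x^T \\ x & t I_n \end{array}\right) \succeq 0
\;\Longleftrightarrow\;
t I_n - \frac{1}{t}\, xx^T \succeq 0
\;\Longleftrightarrow\;
t^2 \ge \lambda_{\rm max}(xx^T) = \|x\|_2^2,
\]
while for $t = 0$ the matrix is PSD only if $x = 0$, and $t < 0$ is excluded since the diagonal entries of a PSD matrix are non-negative.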

Examples

SDP relaxation of boolean problems

In lecture 5, we saw LP relaxations for boolean LPs, which are LPs with the added non-convex requirement that the decision variable be a boolean vector. This approach does not easily extend to the case when the problem involves quadratics.

To illustrate this, consider the max-cut problem. We are given a graph with vertices labeled $1,\ldots,n$, and non-negative weights $w_{ij} = w_{ji}$ for every pair of vertices $(i,j)$. We seek a cut (a subset $S$ of the vertex set) such that the total weight of all the edges that leave $S$ is maximized. Encoding a cut by $x \in \{-1,1\}^n$, with $x_i = 1$ if and only if $i \in S$, this can be expressed as the quadratic boolean problem
\[
\max_x \; \frac{1}{2} \sum_{i<j} w_{ij} (1 - x_i x_j) \;:\; x \in \{-1,1\}^n.
\]
The problem is NP-hard. In 1995, Goemans and Williamson introduced an SDP relaxation (upper bound) of the problem, and showed that its value is within 15% of the optimal value of the combinatorial problem above, independent of problem size.

To explain the relaxation, let us embed the above problem into the more general problem of maximizing a given quadratic form over a boolean set:
\[
p^\ast := \max_{x \in \{-1,1\}^n} \; x^T W x,
\]
where $W \in \mathcal{S}^n$ is a given symmetric matrix. (As an exercise, try to cast the max-cut problem into the above formalism.)

We can express the problem as
\[
p^\ast = \max_X \; \mathbf{Tr}(WX) \;:\; X \succeq 0, \;\; X_{ii} = 1, \;\; i=1,\ldots,n, \;\; \mathbf{rank}(X) = 1.
\]
Indeed, $X$ is feasible for the above if and only if $X = xx^T$ for some $x \in \{-1,1\}^n$, in which case $\mathbf{Tr}(WX) = x^T W x$.

Relaxing (meaning: ignoring) the rank constraint leads to the upper bound $p^\ast \le d^\ast$, where
\[
d^\ast := \max_X \; \mathbf{Tr}(WX) \;:\; X \succeq 0, \;\; X_{ii} = 1, \;\; i=1,\ldots,n.
\]
The above is an SDP (in standard form). Nesterov has shown that, for positive semidefinite matrices $W$, the above relaxation is within a factor $\pi/2$ of the true value, that is, $p^\ast \le d^\ast \le (\pi/2)\, p^\ast$.
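For illustration, here is a minimal CVXPY sketch of the relaxation $d^\ast$, followed by the random-hyperplane rounding of Goemans and Williamson, which recovers a feasible boolean point from the matrix solution (the data $W$ is made up):

```python
import cvxpy as cp
import numpy as np

# SDP relaxation of  max { x^T W x : x in {-1,1}^n }  for an illustrative symmetric W.
n = 6
rng = np.random.default_rng(3)
W = rng.standard_normal((n, n)); W = (W + W.T) / 2

X = cp.Variable((n, n), symmetric=True)
prob = cp.Problem(cp.Maximize(cp.trace(W @ X)), [X >> 0, cp.diag(X) == 1])
d_star = prob.solve()

# Random-hyperplane rounding: factor X = V V^T, then set x_i = sign(v_i . g).
V = np.linalg.cholesky(X.value + 1e-8 * np.eye(n))  # jitter for numerical PSD
x = np.sign(V @ rng.standard_normal(n))             # feasible point in {-1,1}^n
print(d_star, x @ W @ x)                            # upper bound vs. rounded value
```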

Non-convex quadratic optimization

The approach used above can be applied to general non-convex quadratic optimization, which has the form
\[
\min_x \; q_0(x) \;:\; q_i(x) \le 0, \;\; i=1,\ldots,m,
\]
where $x \in \mathbf{R}^n$ is the decision variable, and the $q_i$'s are quadratic functions of the form
\[
q_i(x) := x^T Q_i x + 2 q_i^T x + p_i, \;\; i=0,1,\ldots,m,
\]
with $Q_i \in \mathcal{S}^n$, $q_i \in \mathbf{R}^n$ and $p_i \in \mathbf{R}$ given. The above problem is not convex in general (we have not imposed positive semidefiniteness on the $Q_i$'s).

We note that $q_i(x) = L_i(xx^T, x)$, with $L_i : \mathcal{S}^n \times \mathbf{R}^n \rightarrow \mathbf{R}$ the affine functions
\[
L_i(X, x) := \mathbf{Tr}(X Q_i) + 2 q_i^T x + p_i, \;\; i=0,1,\ldots,m.
\]
We can express the problem as
\[
\min_{X,x} \; L_0(X,x) \;:\; L_i(X,x) \le 0, \;\; i=1,\ldots,m, \;\;
M(X,x) := \left(\begin{array}{cc} X & x \\ x^T & 1 \end{array}\right) \succeq 0, \;\;
\mathbf{rank}\, M(X,x) = 1,
\]
since the last two conditions hold if and only if $X = xx^T$. Dropping the rank constraint leads to an SDP relaxation (lower bound):
\[
\min_{X,x} \; L_0(X,x) \;:\; L_i(X,x) \le 0, \;\; i=1,\ldots,m, \;\; M(X,x) \succeq 0.
\]
(By a Schur complement argument, the remaining constraint $M(X,x) \succeq 0$ is equivalent to $X \succeq xx^T$.)
The above relaxation can be arbitrarily bad, but in some cases it is exact. For example, in the case of a single inequality constraint ($m=1$), the SDP relaxation provides the exact value of the original non-convex problem.
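Here is a sketch of this relaxation in CVXPY for the case $m=1$, on a made-up trust-region instance ($Q_0$ indefinite, constraint $\|x\|_2^2 \le 1$); the bordered-matrix LMI keeps the coupling between $X$ and $x$:

```python
import cvxpy as cp
import numpy as np

# Shor relaxation of  min x^T Q0 x + 2 q0^T x + p0  s.t.  x^T Q1 x + 2 q1^T x + p1 <= 0.
# Made-up instance: Q0 indefinite (non-convex objective), constraint ||x||_2^2 <= 1.
n = 3
Q0, q0, p0 = np.diag([1.0, -1.0, 0.5]), np.zeros(n), 0.0
Q1, q1, p1 = np.eye(n), np.zeros(n), -1.0

X = cp.Variable((n, n), symmetric=True)
x = cp.Variable(n)
L = lambda Q, q, p: cp.trace(Q @ X) + 2 * q @ x + p
# Bordered-matrix LMI M(X, x) >= 0, i.e. X >= x x^T by a Schur complement.
M = cp.bmat([[X, cp.reshape(x, (n, 1))],
             [cp.reshape(x, (1, n)), np.ones((1, 1))]])
prob = cp.Problem(cp.Minimize(L(Q0, q0, p0)), [L(Q1, q1, p1) <= 0, M >> 0])
prob.solve()
print(prob.value)   # lower bound; exact here since m = 1 (value -1 = min eigenvalue of Q0)
```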


Stability analysis of uncertain dynamical systems

Consider a time-varying dynamical system of the form
\[
x(t+1) = A(t)\, x(t), \;\; t = 0, 1, 2, \ldots,
\]
with $x(t) \in \mathbf{R}^n$ the state vector, and $A(t) \in \mathbf{R}^{n \times n}$. We assume that all that is known about $A(t)$ is that $A(t) \in \mathcal{A} = \{A_1, \ldots, A_L\}$, with the matrices $A_i \in \mathbf{R}^{n \times n}$, $i=1,\ldots,L$, given. We say that the system is asymptotically stable if, for every initial condition $x(0) \in \mathbf{R}^n$, the resulting trajectory goes to zero: $x(t) \rightarrow 0$ as $t \rightarrow +\infty$. Note that ascertaining asymptotic stability of the above ``switched'' system is hard in general, so we look for a tractable sufficient condition for asymptotic stability.

Let us denote by $\|\cdot\|$ the largest singular value norm. Clearly, if for every $t \ge 0$ we have $\|A(t)\| \le \sigma$ for some $\sigma < 1$, then $\|x(t+1)\|_2 \le \sigma \|x(t)\|_2$, which shows that the state vector goes to zero at least as fast as $\sigma^t$. The norm condition ``$\|A(t)\| < 1$ for every $t$'' is conservative, since it implies that the norm of the state decreases monotonically.

To refine the norm-based sufficient condition for asymptotic stability, we can use scaling. For $S \in \mathbf{R}^{n \times n}$ a given invertible matrix, we define $\bar{x}(t) := S x(t)$. In the new state space defined by $\bar{x}$, the state equations become $\bar{x}(t+1) = \bar{A}(t)\, \bar{x}(t)$, with $\bar{A}(t) := S A(t) S^{-1}$. The asymptotic stability of the system is independent of the choice of $S$, since $\bar{x}(t)$ converges to zero if and only if $x(t)$ does. However, the norm-based sufficient condition for asymptotic stability is not independent of the choice of $S$. Indeed, if we impose that condition on $\bar{A}(t)$, we obtain $\|S A(t) S^{-1}\| < 1$. In turn, this condition can be written as\footnote{Here, we use the fact that for a matrix $A$, the condition $\|A\| < 1$ is equivalent to $A^T A \prec I$ (try to show this).}
\[
A(t)^T P A(t) \prec P,
\]
where $P = S^T S \succ 0$. (The original norm-based sufficient condition is recovered with $P = I$.)
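Spelling this step out: with $P = S^T S$,
\[
\|S A S^{-1}\| < 1
\;\Longleftrightarrow\;
(S A S^{-1})^T (S A S^{-1}) \prec I
\;\Longleftrightarrow\;
A^T S^T S A \prec S^T S
\;\Longleftrightarrow\;
A^T P A \prec P,
\]
where the middle equivalence follows by multiplying on the left by $S^T$ and on the right by $S$ (a congruence transformation, which preserves the positive definite ordering).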

We conclude that the existence of $P \in \mathcal{S}^n$ such that
\[
P \succ 0, \;\; A_i^T P A_i \prec P, \;\; i=1,\ldots,L,
\]
ensures the stability of the system, regardless of the choice of the trajectory of the matrix $A(t)$ in $\mathcal{A}$. The above conditions are LMIs in the matrix variable $P$.
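Numerically, searching for such a $P$ is a semidefinite feasibility problem. A minimal CVXPY sketch (the $A_i$ are made up and normalized to spectral radius $0.9$; a small margin stands in for the strict inequalities, and $P \succeq I$ fixes the scale, which is possible since the condition is homogeneous in $P$):

```python
import cvxpy as cp
import numpy as np

# Search for a common quadratic Lyapunov certificate P for x(t+1) = A(t) x(t).
rng = np.random.default_rng(4)
n, L, eps = 3, 2, 1e-4
A_list = []
for _ in range(L):
    M = rng.standard_normal((n, n))
    A_list.append(0.9 * M / np.max(np.abs(np.linalg.eigvals(M))))  # spectral radius 0.9

P = cp.Variable((n, n), symmetric=True)
cons = [P >> np.eye(n)]                                   # P > 0, with the scale fixed
cons += [Ai.T @ P @ Ai << P - eps * np.eye(n) for Ai in A_list]
prob = cp.Problem(cp.Minimize(0), cons)
prob.solve()
print(prob.status)   # 'optimal': certificate found; 'infeasible' is inconclusive,
                     # since the LMI condition is only sufficient for stability
```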

Note that since, for fixed $P \succ 0$, the set $\{A : A^T P A \prec P\}$ is convex, requiring that it contain the finite set $\mathcal{A}$ is the same as requiring that it contain its convex hull. So the above condition also ensures stability of the system when $A(t)$ is allowed to be any time-varying convex combination of the matrices $A_i$, $i=1,\ldots,L$.

Also, note that when $L=1$, the above condition turns out to be exact, and equivalent to the condition that all the eigenvalues of the matrix $A_1$ lie in the interior of the unit circle of the complex plane. This is the basic result of the so-called Lyapunov asymptotic stability theorem for linear systems.