Semidefinite Optimization

  • Definitions and standard forms

  • Special cases

  • Examples

Definitions and standard forms

Definition

We say that a problem is a semidefinite optimization problem (SDP) if it is a conic optimization problem of the form
\[
\min_x \; c^T x \;:\; Ax = b, \;\; x \in \mathcal{K},
\]
where the cone $\mathcal{K}$ is a product of semidefinite cones of given sizes. The decision variable $x$ contains the elements of matrices, and the constraint $x \in \mathcal{K}$ imposes positive semidefiniteness on those matrices.

Standard form

Often, it is useful to think of $x$ as a matrix variable. The following standard form formalizes this.

First note that, without loss of generality, we may assume that $\mathcal{K} = \mathcal{S}^n_+$. To illustrate why, take the example $\mathcal{K} = \mathcal{S}^{n_1}_+ \times \mathcal{S}^{n_2}_+$. The condition $(X_1, X_2) \in \mathcal{S}^{n_1}_+ \times \mathcal{S}^{n_2}_+$ is the same as $X := \mathbf{diag}(X_1, X_2) \in \mathcal{S}^{n_1 + n_2}_+$. Thus, by imposing that certain elements of the matrix $X$ be zero (those outside the diagonal blocks of size $n_i \times n_i$), we can always reduce our problem to the case where the cone $\mathcal{K}$ is a single semidefinite cone, with the matrix variable constrained to be block diagonal (with block sizes $n_1$, $n_2$).

A standard form for the SDP model is thus
\[
\min_{X \in \mathcal{S}^n} \; \mathbf{Tr}(CX) \;:\; \mathbf{Tr}(A_i X) = b_i, \;\; i=1,\ldots,m, \;\; X \succeq 0,
\]
where $C, A_1, \ldots, A_m \in \mathcal{S}^n$, and $b \in \mathbf{R}^m$. (In the above, we have used the fact that a single affine equality constraint on a symmetric matrix variable $X$ can be represented as a scalar product condition of the form $\langle A, X \rangle = b$, for an appropriate symmetric matrix $A$ and scalar $b$.)
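To make this concrete, here is a minimal sketch of the standard form in CVXPY (a modeling package not used in these notes; the data $C$, $A_i$, $b$ below are randomly generated for illustration, with $C$ positive semidefinite so the objective is bounded below, and $b$ built from a feasible point):

```python
import cvxpy as cp
import numpy as np

# Illustrative data for the standard-form SDP (not from the notes).
n, m = 4, 3
rng = np.random.default_rng(0)
C = rng.standard_normal((n, n)); C = C @ C.T        # PSD cost matrix
A = []
for _ in range(m):
    M = rng.standard_normal((n, n))
    A.append((M + M.T) / 2)                         # symmetric constraint matrices
X0 = rng.standard_normal((n, n)); X0 = X0 @ X0.T    # a PSD point ...
b = np.array([np.trace(Ai @ X0) for Ai in A])       # ... makes the constraints feasible

# min Tr(C X)  s.t.  Tr(A_i X) = b_i,  X PSD.
X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0] + [cp.trace(A[i] @ X) == b[i] for i in range(m)]
prob = cp.Problem(cp.Minimize(cp.trace(C @ X)), constraints)
prob.solve()
print(prob.status, prob.value)
```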

Inequality form

An alternate standard form, called the inequality standard form, is
\[
\min_x \; c^T x \;:\; F_0 + \sum_{i=1}^n x_i F_i \succeq 0,
\]
where the matrices $F_0, \ldots, F_n \in \mathcal{S}^m$. The constraint in the above problem is called a linear matrix inequality (LMI). The above form is easily converted to the previous standard form (I suggest you try this as an exercise).
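The inequality form maps just as directly to code. A sketch with made-up data (the shift of $F_0$ is only there to make $x = 0$ strictly feasible):

```python
import cvxpy as cp
import numpy as np

# Inequality-form SDP: min c^T x  s.t.  F0 + x_1 F1 + ... + x_n Fn PSD (an LMI).
n, mdim = 3, 4
rng = np.random.default_rng(1)
F = []
for _ in range(n + 1):
    M = rng.standard_normal((mdim, mdim))
    F.append((M + M.T) / 2)                      # symmetric F_0, ..., F_n
F[0] += 2 * mdim * np.eye(mdim)                  # make x = 0 strictly feasible
c = rng.standard_normal(n)

x = cp.Variable(n)
lmi = F[0] + sum(x[i] * F[i + 1] for i in range(n)) >> 0
prob = cp.Problem(cp.Minimize(c @ x), [lmi])
prob.solve()
print(prob.status, prob.value)                   # may be unbounded for some draws
```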

Special cases

As discussed above, we can use this formalism to handle multiple LMIs, via block-diagonal matrices. To illustrate this, we show that SDPs contain LPs as a special case. Indeed, the LP
\[
\min_x \; c^T x \;:\; Ax \le b
\]
is equivalent to the SDP
\[
\min_x \; c^T x \;:\; F(x) := \mathbf{diag}(b_1 - a_1^T x, \ldots, b_m - a_m^T x) \succeq 0,
\]
where $a_i^T$ is the $i$-th row of $A$. Thus, LPs are SDPs with diagonal matrices in their LMIs.
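This reduction is easy to check numerically; a small sketch (random data chosen so that the LP is feasible and bounded):

```python
import cvxpy as cp
import numpy as np

# An LP min c^T x s.t. Ax <= b, and the same problem as an SDP with a diagonal LMI.
rng = np.random.default_rng(2)
m, n = 5, 2
A = rng.standard_normal((m, n))
b = rng.random(m) + 1.0          # b > 0, so x = 0 is strictly feasible
c = -A.T @ np.ones(m)            # ensures c^T x >= -sum(b) on the feasible set

x = cp.Variable(n)
lp  = cp.Problem(cp.Minimize(c @ x), [A @ x <= b])
sdp = cp.Problem(cp.Minimize(c @ x), [cp.diag(b - A @ x) >> 0])
print(lp.solve(), sdp.solve())   # the two values agree up to solver tolerance
```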

Similarly, SDPs contain SOCPs, by virtue of the following result, already mentioned in XXX: for every $x$, $t$,
\[
\|x\|_2 \le t \;\Longleftrightarrow\;
\left(\begin{array}{cccc}
t & x_1 & \ldots & x_n \\
x_1 & t & & 0 \\
\vdots & & \ddots & \\
x_n & 0 & & t
\end{array}\right) \succeq 0.
\]
Thus, a second-order cone constraint can be written as an LMI with an ``arrow'' matrix.
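To see why the equivalence holds (a step not detailed here), one can use the Schur complement rule: for $t > 0$,
\[
\left(\begin{array}{cc} t & x^T \\ x & t I_n \end{array}\right) \succeq 0
\;\Longleftrightarrow\;
t I_n - \frac{1}{t}\, xx^T \succeq 0
\;\Longleftrightarrow\;
t^2 \ge \lambda_{\rm max}(xx^T) = \|x\|_2^2,
\]
while for $t = 0$ the matrix is PSD only if $x = 0$, and $t < 0$ is excluded since the diagonal entries of a PSD matrix are non-negative.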

Examples

SDP relaxation of boolean problems

In lecture 5, we saw LP relaxations for boolean LPs, which are LPs with the added non-convex requirement that the decision variable be a boolean vector. This approach does not easily extend to the case when the problem involves quadratics.

To illustrate this, consider the max-cut problem. We are given a graph with vertices labeled $1,\ldots,n$, and non-negative weights $w_{ij} = w_{ji}$ for every pair of vertices $(i,j)$. We seek a cut (a subset $S$ of the vertex set) such that the total weight of all the edges that leave $S$ is maximized. Encoding a cut by $x \in \{-1,1\}^n$, with $x_i = 1$ if and only if $i \in S$, this can be expressed as the quadratic boolean problem
\[
\max_x \; \frac{1}{2} \sum_{i<j} w_{ij} (1 - x_i x_j) \;:\; x \in \{-1,1\}^n.
\]
The problem is NP-hard. In 1995, Goemans and Williamson introduced an SDP relaxation (upper bound) of the problem, and showed that its value is within 15% of the optimal value of the combinatorial problem above, independent of problem size.

To explain the relaxation, let us embed the above problem into the more general problem of maximizing a given quadratic form over a boolean set:
\[
p^\ast := \max_{x \in \{-1,1\}^n} \; x^T W x,
\]
where $W \in \mathcal{S}^n$ is a given symmetric matrix. (As an exercise, try to cast the max-cut problem into the above formalism.)

We can express the problem as
\[
p^\ast = \max_X \; \mathbf{Tr}(WX) \;:\; X \succeq 0, \;\; X_{ii} = 1, \;\; i=1,\ldots,n, \;\; \mathbf{rank}(X) = 1.
\]
Indeed, $X$ is feasible for the above if and only if $X = xx^T$ for some $x \in \{-1,1\}^n$, in which case $\mathbf{Tr}(WX) = x^T W x$.

Relaxing (meaning: ignoring) the rank constraint leads to the upper bound $p^\ast \le d^\ast$, where
\[
d^\ast := \max_X \; \mathbf{Tr}(WX) \;:\; X \succeq 0, \;\; X_{ii} = 1, \;\; i=1,\ldots,n.
\]
The above is an SDP (in standard form). Nesterov has shown that, for positive semidefinite matrices $W$, the above relaxation is within a factor $\pi/2$ of the true value, that is, $p^\ast \le d^\ast \le (\pi/2)\, p^\ast$.
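For illustration, here is a minimal CVXPY sketch of the relaxation $d^\ast$, followed by the random-hyperplane rounding of Goemans and Williamson, which recovers a feasible boolean point from the matrix solution (the data $W$ is made up):

```python
import cvxpy as cp
import numpy as np

# SDP relaxation of  max { x^T W x : x in {-1,1}^n }  for an illustrative symmetric W.
n = 6
rng = np.random.default_rng(3)
W = rng.standard_normal((n, n)); W = (W + W.T) / 2

X = cp.Variable((n, n), symmetric=True)
prob = cp.Problem(cp.Maximize(cp.trace(W @ X)), [X >> 0, cp.diag(X) == 1])
d_star = prob.solve()

# Random-hyperplane rounding: factor X = V V^T, then set x_i = sign(v_i . g).
V = np.linalg.cholesky(X.value + 1e-8 * np.eye(n))  # jitter for numerical PSD
x = np.sign(V @ rng.standard_normal(n))             # feasible point in {-1,1}^n
print(d_star, x @ W @ x)                            # upper bound vs. rounded value
```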

Non-convex quadratic optimization

The approach used above can be applied to general non-convex quadratic optimization, which has the form
\[
\min_x \; q_0(x) \;:\; q_i(x) \le 0, \;\; i=1,\ldots,m,
\]
where $x \in \mathbf{R}^n$ is the decision variable, and the $q_i$'s are quadratic functions of the form
\[
q_i(x) := x^T Q_i x + 2 q_i^T x + p_i, \;\; i=0,1,\ldots,m,
\]
with $Q_i \in \mathcal{S}^n$, $q_i \in \mathbf{R}^n$ and $p_i \in \mathbf{R}$ given. The above problem is not convex in general (we have not imposed positive semidefiniteness on the $Q_i$'s).

We note that $q_i(x) = L_i(xx^T, x)$, with $L_i : \mathcal{S}^n \times \mathbf{R}^n \rightarrow \mathbf{R}$ the affine functions
\[
L_i(X, x) := \mathbf{Tr}(X Q_i) + 2 q_i^T x + p_i, \;\; i=0,1,\ldots,m.
\]
We can express the problem as
\[
\min_{X,x} \; L_0(X,x) \;:\; L_i(X,x) \le 0, \;\; i=1,\ldots,m, \;\;
M(X,x) := \left(\begin{array}{cc} X & x \\ x^T & 1 \end{array}\right) \succeq 0, \;\;
\mathbf{rank}\, M(X,x) = 1,
\]
since the last two conditions hold if and only if $X = xx^T$. Dropping the rank constraint leads to an SDP relaxation (lower bound):
\[
\min_{X,x} \; L_0(X,x) \;:\; L_i(X,x) \le 0, \;\; i=1,\ldots,m, \;\; M(X,x) \succeq 0.
\]
(By a Schur complement argument, the remaining constraint $M(X,x) \succeq 0$ is equivalent to $X \succeq xx^T$.)
The above relaxation can be arbitrarily bad, but in some cases it is exact. For example, in the case of a single inequality constraint ($m=1$), the SDP relaxation provides the exact value of the original non-convex problem.
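Here is a sketch of this relaxation in CVXPY for the case $m=1$, on a made-up trust-region instance ($Q_0$ indefinite, constraint $\|x\|_2^2 \le 1$); the bordered-matrix LMI keeps the coupling between $X$ and $x$:

```python
import cvxpy as cp
import numpy as np

# Shor relaxation of  min x^T Q0 x + 2 q0^T x + p0  s.t.  x^T Q1 x + 2 q1^T x + p1 <= 0.
# Made-up instance: Q0 indefinite (non-convex objective), constraint ||x||_2^2 <= 1.
n = 3
Q0, q0, p0 = np.diag([1.0, -1.0, 0.5]), np.zeros(n), 0.0
Q1, q1, p1 = np.eye(n), np.zeros(n), -1.0

X = cp.Variable((n, n), symmetric=True)
x = cp.Variable(n)
L = lambda Q, q, p: cp.trace(Q @ X) + 2 * q @ x + p
# Bordered-matrix LMI M(X, x) >= 0, i.e. X >= x x^T by a Schur complement.
M = cp.bmat([[X, cp.reshape(x, (n, 1))],
             [cp.reshape(x, (1, n)), np.ones((1, 1))]])
prob = cp.Problem(cp.Minimize(L(Q0, q0, p0)), [L(Q1, q1, p1) <= 0, M >> 0])
prob.solve()
print(prob.value)   # lower bound; exact here since m = 1 (value -1 = min eigenvalue of Q0)
```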


Stability analysis of uncertain dynamical systems

Consider a time-varying dynamical system of the form
\[
x(t+1) = A(t)\, x(t), \;\; t = 0, 1, 2, \ldots,
\]
with $x(t) \in \mathbf{R}^n$ the state vector, and $A(t) \in \mathbf{R}^{n \times n}$. We assume that all that is known about $A(t)$ is that $A(t) \in \mathcal{A} = \{A_1, \ldots, A_L\}$, with the matrices $A_i \in \mathbf{R}^{n \times n}$, $i=1,\ldots,L$, given. We say that the system is asymptotically stable if, for every initial condition $x(0) \in \mathbf{R}^n$, the resulting trajectory goes to zero: $x(t) \rightarrow 0$ as $t \rightarrow +\infty$. Note that ascertaining asymptotic stability of the above ``switched'' system is hard in general, so we look for a tractable sufficient condition for asymptotic stability.

Let us denote by $\|\cdot\|$ the largest singular value norm. Clearly, if for every $t \ge 0$ we have $\|A(t)\| \le \sigma$ for some $\sigma < 1$, then $\|x(t+1)\|_2 \le \sigma \|x(t)\|_2$, which shows that the state vector goes to zero at least as fast as $\sigma^t$. The norm condition ``$\|A(t)\| < 1$ for every $t$'' is conservative, since it implies that the norm of the state decreases monotonically.

To refine the norm-based sufficient condition for asymptotic stability, we can use scaling. For $S \in \mathbf{R}^{n \times n}$ a given invertible matrix, we define $\bar{x}(t) := S x(t)$. In the new state space defined by $\bar{x}$, the state equations become $\bar{x}(t+1) = \bar{A}(t)\, \bar{x}(t)$, with $\bar{A}(t) := S A(t) S^{-1}$. The asymptotic stability of the system is independent of the choice of $S$, since $\bar{x}(t)$ converges to zero if and only if $x(t)$ does. However, the norm-based sufficient condition for asymptotic stability is not independent of the choice of $S$. Indeed, if we impose that condition on $\bar{A}(t)$, we obtain $\|S A(t) S^{-1}\| < 1$. In turn, this condition can be written as\footnote{Here, we use the fact that for a matrix $A$, the condition $\|A\| < 1$ is equivalent to $A^T A \prec I$ (try to show this).}
\[
A(t)^T P A(t) \prec P,
\]
where $P = S^T S \succ 0$. (The original norm-based sufficient condition is recovered with $P = I$.)
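Spelling this step out: with $P = S^T S$,
\[
\|S A S^{-1}\| < 1
\;\Longleftrightarrow\;
(S A S^{-1})^T (S A S^{-1}) \prec I
\;\Longleftrightarrow\;
A^T S^T S A \prec S^T S
\;\Longleftrightarrow\;
A^T P A \prec P,
\]
where the middle equivalence follows by multiplying on the left by $S^T$ and on the right by $S$ (a congruence transformation, which preserves the positive definite ordering).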

We conclude that the existence of $P \in \mathcal{S}^n$ such that
\[
P \succ 0, \;\; A_i^T P A_i \prec P, \;\; i=1,\ldots,L,
\]
ensures the stability of the system, regardless of the choice of the trajectory of the matrix $A(t)$ in $\mathcal{A}$. The above conditions are LMIs in the matrix variable $P$.
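Numerically, searching for such a $P$ is a semidefinite feasibility problem. A minimal CVXPY sketch (the $A_i$ are made up and normalized to spectral radius $0.9$; a small margin stands in for the strict inequalities, and $P \succeq I$ fixes the scale, which is possible since the condition is homogeneous in $P$):

```python
import cvxpy as cp
import numpy as np

# Search for a common quadratic Lyapunov certificate P for x(t+1) = A(t) x(t).
rng = np.random.default_rng(4)
n, L, eps = 3, 2, 1e-4
A_list = []
for _ in range(L):
    M = rng.standard_normal((n, n))
    A_list.append(0.9 * M / np.max(np.abs(np.linalg.eigvals(M))))  # spectral radius 0.9

P = cp.Variable((n, n), symmetric=True)
cons = [P >> np.eye(n)]                                   # P > 0, with the scale fixed
cons += [Ai.T @ P @ Ai << P - eps * np.eye(n) for Ai in A_list]
prob = cp.Problem(cp.Minimize(0), cons)
prob.solve()
print(prob.status)   # 'optimal': certificate found; 'infeasible' is inconclusive,
                     # since the LMI condition is only sufficient for stability
```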

Note that since, for fixed $P \succ 0$, the set $\{A : A^T P A \prec P\}$ is convex, requiring that it contain the finite set $\mathcal{A}$ is the same as requiring that it contain its convex hull. So the above condition also ensures stability of the system when $A(t)$ is allowed to be any time-varying convex combination of the matrices $A_i$, $i=1,\ldots,L$.

Also, note that when $L=1$, the above condition turns out to be exact, and equivalent to the condition that all the eigenvalues of the matrix $A_1$ lie in the interior of the unit circle of the complex plane. This is the basic result of the so-called Lyapunov asymptotic stability theorem for linear systems.