Cardinality minimization: the l_1-norm trick

  • Cardinality minimization problem

  • The l_1-norm heuristic

  • Why does it work?

  • Application: piece-wise constant fitting

Cardinality Minimization

Many problems in engineering and scientific computing can be cast as

 \min_x \: \mathbf{Card}(x) ~:~ x \in \mathbf{P},

where \mathbf{P} is a polyhedron (or, more generally, a convex set), and \mathbf{Card}(x) denotes the cardinality (number of non-zero elements) of the vector x. Such problems seek a ‘‘sparse’’ solution, one with many zeroes in it.

A related problem is a penalized version of the above, where we seek to trade off an objective function against cardinality:

 \min_x \: f(x) + \lambda \, \mathbf{Card}(x) ~:~ x \in \mathbf{P},

where f is some (usually convex) ‘‘cost’’ function, and \lambda>0 is a penalty parameter.

Cardinality minimization is a hard problem in general. It appears in many areas, such as classification.
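To see where the difficulty comes from, consider the naive approach: enumerate candidate supports of increasing size and test each one for feasibility. The sketch below is illustrative only; it assumes a small toy A and b in the Matlab workspace and the Optimization Toolbox function linprog, and the number of supports it visits grows as 2^n, which is exactly why heuristics are needed.

Matlab sketch
n = size(A,2);
best_x = [];
if all(b >= 0)                          % x = 0 is feasible: cardinality 0
    best_x = zeros(n,1);
else
    for card = 1:n
        supports = nchoosek(1:n, card); % all supports of size card
        for i = 1:size(supports,1)
            S = supports(i,:);
            % feasibility LP restricted to the support S
            [xS,~,flag] = linprog(zeros(card,1), A(:,S), b);
            if flag == 1
                best_x = zeros(n,1); best_x(S) = xS;
                break
            end
        end
        if ~isempty(best_x), break; end
    end
end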

The l_1-norm heuristic

The l_1-norm heuristic consists in replacing the (non-convex) cardinality function \mathbf{Card}(x) with a polyhedral (hence, convex) one, involving the l_1-norm. This heuristic leads us to replace the problem at the top with

 \min_x \: \|x\|_1 ~:~ x \in \mathbf{P},

which is an LP (provided \mathbf{P} is a polyhedron).
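
To see that it is indeed an LP, one can rewrite it with an auxiliary vector u bounding the entries of x in absolute value; at the optimum, u_i = |x_i| for every i:

 \min_{x,u} \: \sum_{i=1}^n u_i ~:~ x \in \mathbf{P}, \;\; -u \le x \le u.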

If \mathbf{P} is described via affine inequalities, as \mathbf{P} = \{ x : Ax \le b \}, with A a matrix and b a vector existing in the Matlab workspace, then the following CVX snippet solves the above problem.

CVX implementation
cvx_begin
   variable x(n,1)
   minimize( norm(x,1) )       % l_1-norm objective, the convex surrogate for cardinality
   subject to
       A*x <= b;               % polyhedral constraint x in P
cvx_end
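
As a usage sketch (the data below is arbitrary, chosen only so that the polyhedron is guaranteed to be non-empty, and the 1e-6 threshold for counting non-zeroes is an arbitrary numerical tolerance):

Matlab usage sketch
m = 30; n = 100;                        % arbitrary dimensions
A = randn(m, n);
x0 = zeros(n, 1); x0(randperm(n, 5)) = randn(5, 1);   % a sparse point
b = A*x0;                               % ensures x0 (hence the polyhedron) is feasible
cvx_begin
   variable x(n,1)
   minimize( norm(x,1) )
   subject to
       A*x <= b;
cvx_end
card_x = sum(abs(x) > 1e-6);            % count (numerically) non-zero entries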

Why does it work?

The l_1-norm heuristic often works well, especially if the variable x is bounded.
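
One way to see the role of boundedness: if the constraints imply \|x\|_\infty \le R for some R>0, then every non-zero entry of x contributes at most R to the l_1-norm, so the scaled l_1-norm is a convex lower bound on the cardinality:

 \mathbf{Card}(x) \ge \frac{\|x\|_1}{R} \quad \mbox{whenever } \|x\|_\infty \le R.

In fact, \|x\|_1/R can be shown to be the convex envelope (the largest convex under-estimator) of \mathbf{Card}(x) on the box \{x : \|x\|_\infty \le R\}, which is one justification for the heuristic.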

Often the variable is not bounded a priori, but its size is penalized, say by a squared l_2-norm. Consider an optimization problem of the form

  \min_x \: c^Tx + x^Tx ~:~ Ax \le b, \;\; \mathbf{Card}(x) \le k,

where k \in \{1,\ldots,n\} is a given integer that bounds the number of non-zeroes in the variable. Applying the Cauchy-Schwarz inequality to |x| and the vector y with 1's where x is non-zero, and 0's elsewhere, we obtain (with |x| the vector with elements |x_i|, i=1,\ldots,n):

 \forall \: x \mbox{ with } \mathbf{Card}(x) \le k ~:~ \|x\|_1 = y^T|x| \le \|y\|_2 \cdot \|x\|_2 \le \sqrt{k}\, \|x\|_2.
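
Squaring both sides, every x that satisfies the cardinality constraint \mathbf{Card}(x) \le k obeys

 \frac{1}{k}\, \|x\|_1^2 \le \|x\|_2^2 = x^Tx.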

Hence, instead of solving the difficult problem above, we can replace x^Tx by the smaller quantity \frac{1}{k}\|x\|_1^2 and drop the cardinality constraint; this yields a convex problem whose optimal value is a lower bound:

  \min_x \: c^Tx + \frac{1}{k} \|x\|_1^2 ~:~ Ax \le b.

The above problem is convex. In fact, it can be written as a QP by adding new variables:

 \min_{x,u,t} \: c^Tx + \frac{1}{k}\, t^2 ~:~ t \ge \sum_{i=1}^n u_i, \;\; Ax \le b, \;\; -u \le x \le u.
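
To check the equivalence, note that the constraints force u_i \ge |x_i| for every i, and the objective is increasing in t \ge 0; hence at the optimum the new variables satisfy

 u_i = |x_i|, \;\; i=1,\ldots,n, \qquad t = \sum_{i=1}^n u_i = \|x\|_1,

so the objective reduces to c^Tx + \frac{1}{k}\|x\|_1^2, as in the previous formulation.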

As is often the case, CVX allows a direct implementation of the problem, but care must be taken to use the square_pos function to model the squared l_1-norm, as discussed in the do's and don'ts of CVX.

CVX implementation
cvx_begin
    variable x(n)
    minimize( c'*x + (1/k)*square_pos(norm(x,1)) )   % square_pos models the squared l_1-norm
    subject to
        A*x <= b;
cvx_end
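
As a usage sketch (the dimensions, the random data, and the 1e-6 threshold below are arbitrary choices), one can solve the relaxation and read off both a lower bound and, when the solution happens to satisfy the cardinality constraint, an upper bound on the original problem:

Matlab usage sketch
m = 30; n = 100; k = 5;                 % arbitrary sizes and target cardinality
A = randn(m, n); b = randn(m, 1); c = randn(n, 1);
cvx_begin
    variable x(n)
    minimize( c'*x + (1/k)*square_pos(norm(x,1)) )
    subject to
        A*x <= b;
cvx_end
lower_bound = cvx_optval;               % lower bound on the cardinality-constrained problem
card_x = sum(abs(x) > 1e-6);            % numerical cardinality of the heuristic solution
if card_x <= k
    upper_bound = c'*x + x'*x;          % x is then feasible, so this is an upper bound
end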

Applications