Cardinality minimization: the l_1-norm trick

  • Cardinality minimization problem

  • The l_1-norm heuristic

  • Why does it work?

  • Application: piece-wise constant fitting

Cardinality Minimization

Many problems in engineering and scientific computing can be cast as

 \min_x \: \mathbf{Card}(x) ~:~ x \in \mathbf{P},

where \mathbf{P} is a polyhedron (or, more generally, a convex set), and \mathbf{Card}(x) denotes the cardinality (number of non-zero elements) of the vector x. Such problems seek a ‘‘sparse’’ solution, one with many zeroes in it.

A related problem is a penalized version of the above, where we seek to trade off an objective function against cardinality:

 \min_x \: f(x) + \lambda \, \mathbf{Card}(x) ~:~ x \in \mathbf{P},

where f is some (usually convex) ‘‘cost’’ function, and \lambda>0 is a penalty parameter.

Cardinality minimization is a hard problem in general. It appears in many areas, such as classification.
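To see where the difficulty comes from, consider the naive approach: enumerate candidate supports of increasing size and test each one for feasibility. The sketch below is illustrative only; it assumes a small toy A and b in the Matlab workspace and the Optimization Toolbox function linprog, and the number of supports it visits grows as 2^n, which is exactly why heuristics are needed.

Matlab sketch
n = size(A,2);
best_x = [];
if all(b >= 0)                          % x = 0 is feasible: cardinality 0
    best_x = zeros(n,1);
else
    for card = 1:n
        supports = nchoosek(1:n, card); % all supports of size card
        for i = 1:size(supports,1)
            S = supports(i,:);
            % feasibility LP restricted to the support S
            [xS,~,flag] = linprog(zeros(card,1), A(:,S), b);
            if flag == 1
                best_x = zeros(n,1); best_x(S) = xS;
                break
            end
        end
        if ~isempty(best_x), break; end
    end
end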

The l_1-norm heuristic

The l_1-norm heuristic consists in replacing the (non-convex) cardinality function \mathbf{Card}(x) with a polyhedral (hence, convex) one, involving the l_1-norm. This heuristic leads us to replace the problem at the top with

 \min_x \: \|x\|_1 ~:~ x \in \mathbf{P},

which is an LP (provided \mathbf{P} is a polyhedron).
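
To see that it is indeed an LP, one can rewrite it with an auxiliary vector u bounding the entries of x in absolute value; at the optimum, u_i = |x_i| for every i:

 \min_{x,u} \: \sum_{i=1}^n u_i ~:~ x \in \mathbf{P}, \;\; -u \le x \le u.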

If \mathbf{P} is described via affine inequalities, as \mathbf{P} = \{ x : Ax \le b \}, with A a matrix and b a vector existing in the Matlab workspace, then the following CVX snippet solves the above problem.

CVX implementation
cvx_begin
   variable x(n,1)
   minimize( norm(x,1) )       % l_1-norm objective, the convex surrogate for cardinality
   subject to
       A*x <= b;               % polyhedral constraint x in P
cvx_end
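
As a usage sketch (the data below is arbitrary, chosen only so that the polyhedron is guaranteed to be non-empty, and the 1e-6 threshold for counting non-zeroes is an arbitrary numerical tolerance):

Matlab usage sketch
m = 30; n = 100;                        % arbitrary dimensions
A = randn(m, n);
x0 = zeros(n, 1); x0(randperm(n, 5)) = randn(5, 1);   % a sparse point
b = A*x0;                               % ensures x0 (hence the polyhedron) is feasible
cvx_begin
   variable x(n,1)
   minimize( norm(x,1) )
   subject to
       A*x <= b;
cvx_end
card_x = sum(abs(x) > 1e-6);            % count (numerically) non-zero entries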

Why does it work?

The l_1-norm heuristic often works well, especially if the variable x is bounded.
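
One way to see the role of boundedness: if the constraints imply \|x\|_\infty \le R for some R>0, then every non-zero entry of x contributes at most R to the l_1-norm, so the scaled l_1-norm is a convex lower bound on the cardinality:

 \mathbf{Card}(x) \ge \frac{\|x\|_1}{R} \quad \mbox{whenever } \|x\|_\infty \le R.

In fact, \|x\|_1/R can be shown to be the convex envelope (the largest convex under-estimator) of \mathbf{Card}(x) on the box \{x : \|x\|_\infty \le R\}, which is one justification for the heuristic.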

Often the variable is not bounded a priori, but its size is penalized, say by a squared l_2-norm. Consider an optimization problem of the form

  \min_x \: c^Tx + x^Tx ~:~ Ax \le b, \;\; \mathbf{Card}(x) \le k,

where k \in \{1,\ldots,n\} is a given integer that bounds the number of non-zeroes in the variable. Applying the Cauchy-Schwarz inequality to |x| and the vector y with 1's where x is non-zero, and 0's elsewhere, we obtain (with |x| the vector with elements |x_i|, i=1,\ldots,n):

 \forall \: x \mbox{ with } \mathbf{Card}(x) \le k ~:~ \|x\|_1 = y^T|x| \le \|y\|_2 \cdot \|x\|_2 \le \sqrt{k}\, \|x\|_2.
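
Squaring both sides, every x that satisfies the cardinality constraint \mathbf{Card}(x) \le k obeys

 \frac{1}{k}\, \|x\|_1^2 \le \|x\|_2^2 = x^Tx.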

Hence, instead of solving the difficult problem above, we can replace x^Tx by the smaller quantity \frac{1}{k}\|x\|_1^2 and drop the cardinality constraint; this yields a convex problem whose optimal value is a lower bound:

  \min_x \: c^Tx + \frac{1}{k} \|x\|_1^2 ~:~ Ax \le b.

The above problem is convex. In fact, it can be written as a QP by adding new variables:

 \min_{x,u,t} \: c^Tx + \frac{1}{k}\, t^2 ~:~ t \ge \sum_{i=1}^n u_i, \;\; Ax \le b, \;\; -u \le x \le u.
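
To check the equivalence, note that the constraints force u_i \ge |x_i| for every i, and the objective is increasing in t \ge 0; hence at the optimum the new variables satisfy

 u_i = |x_i|, \;\; i=1,\ldots,n, \qquad t = \sum_{i=1}^n u_i = \|x\|_1,

so the objective reduces to c^Tx + \frac{1}{k}\|x\|_1^2, as in the previous formulation.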

As is often the case, CVX allows a direct implementation of the problem, but care must be taken to use the square_pos function to model the squared l_1-norm, as discussed in the do's and don'ts of CVX.

CVX implementation
cvx_begin
    variable x(n)
    minimize( c'*x + (1/k)*square_pos(norm(x,1)) )   % square_pos models the squared l_1-norm
    subject to
        A*x <= b;
cvx_end
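
As a usage sketch (the dimensions, the random data, and the 1e-6 threshold below are arbitrary choices), one can solve the relaxation and read off both a lower bound and, when the solution happens to satisfy the cardinality constraint, an upper bound on the original problem:

Matlab usage sketch
m = 30; n = 100; k = 5;                 % arbitrary sizes and target cardinality
A = randn(m, n); b = randn(m, 1); c = randn(n, 1);
cvx_begin
    variable x(n)
    minimize( c'*x + (1/k)*square_pos(norm(x,1)) )
    subject to
        A*x <= b;
cvx_end
lower_bound = cvx_optval;               % lower bound on the cardinality-constrained problem
card_x = sum(abs(x) > 1e-6);            % numerical cardinality of the heuristic solution
if card_x <= k
    upper_bound = c'*x + x'*x;          % x is then feasible, so this is an upper bound
end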

Applications