Solving linear equations via the QR decomposition

  • Basic idea

  • The QR decomposition of a matrix

  • Solution via full QR decomposition

  • Set of solutions

Basic idea: reduction to triangular systems of equations

Consider the problem of solving a system of linear equations Ax = y, where A in mathbf{R}^{m times n} and y in mathbf{R}^m are given.

The basic idea in the solution algorithm starts with the observation that in the special case when A is upper triangular, that is, A_{ij} = 0 if i < j, then the system can be easily solved by a process known as backward substitution. In backward substitution we simply start solving the system by eliminating the last variable first, then proceed to solve backwards. The process is illustrated in this example, and described in generality here.

The QR decomposition of a matrix

The QR decomposition allows to express any m times n matrix A as the product A = QR where Q is m times m and orthogonal (that is, Q^TQ = I_m) and R is upper triangular. For more details on this, see here.

Once the QR factorization of A is obtained, we can solve the system by first pre-multiplying with Q both sides of the equation:

 QR x = y Longleftrightarrow Rx = Q^Ty.

This is due to the fact that Q^TQ = I_m. The new system Rx = Q^Ty is triangular and can be solved by backwards substitution. For example, if A is full column rank, then R is invertible, so that the solution is unique, and given by x = R^{-1}Q^Ty.

Let us detail the process now.

Using the full QR decomposition

We start with the full QR decomposition of A with column permutations:

                          AP = QR = left( begin{array}{cc} Q_1 & Q_2 end{array} right) left( begin{array}{cc} R_1 & R_2 0 & 0 end{array} right)

where

  • Q = [Q_1,Q_2] is m times m and orthogonal (Q^TQ = I_m);

  • Q_1 is {m times r}, with orthonormal columns (Q^TQ=I_r);

  • Q_2 is {mtimes (m-r)}, with orthonormal columns (Q^TQ=I_{m-r});

  • r is the rank of A;

  • R_1 is r times r upper triangular, and invertible;

  • R_2 is a r times (n -r) matrix;

  • P is a m times m permutation matrix (thus, P^T = P^{-1}).

  • The zero submatrices in the bottom (block) row of R have m-r rows.

Using A = QRP^T, we can write Rz = Q^Ty, where z := P^Tx. Let's look at the equation in z in expanded form:

 left( begin{array}{cc} R_1 & R_20 & 0 end{array} right)          left( begin{array}{c} z_1z_2 end{array} right) = left( begin{array}{c} Q_1^TyQ_2^Ty end{array} right) .

We see that unless Q_2^Ty = 0, there is no solution. Let us assume that Q_2^Ty = 0. We have then

 R_1 z_1 + R_2 z_2 = Q_1^Ty,

which is a set of r linear equations in n variables.

A particular solution is obtained upon setting z_2 = 0, which leads to a triangular system in z_1, with an invertible triangular matrix R_1. Hence z_1 = R_1^{-1}Q_1^Ty, which corresponds to a particular solution x_0 to Ax=y:

  x_0 := P  left( begin{array}{c} R_1^{-1}Q_1^Ty0 end{array} right).

Set of solutions

We can also generate all the solutions, by noting that z_2 is a free variable. We have

  x = Pleft( begin{array}{c}R_1^{-1}Q_1^T(y-R_2z_2)0 end{array} right) = x_0 + Lz_2,

where

 L : = -Pleft( begin{array}{c}R_1^{-1}Q_1^TR_20 end{array} right).

The set of solutions is the affine set x_0+mbox{range}(L).