Orthogonalization: the Gram-Schmidt procedure

Orthogonalization
Projection on a line
Gram-Schmidt procedure

A basis $(u_i)_{i=1}^n$ is said to be orthogonal if u_i^Tu_j = 0 if i ne j . If in addition, |u_i|_2 = 1 , we say that the basis is orthonormal.

Example: An orthonormal basis in ${mathbf{R}^3}$ . The collection of vectors ${u_1,u_2}$ , with

$u_1 = frac{1}{sqrt{2}} left(begin{array}{c} 1 1 end{array}right), ;; u_2 = frac{1}{sqrt{2}} left(begin{array}{c} 1 -1 end{array}right),$

forms an orthonormal basis of $mathbf{R}^2$ .

What is orthogonalization?

Orthogonalization refers to a procedure that finds an orthonormal basis of the span of given vectors.

Given vectors $a_1,ldots,a_k in mathbf{R}^n$ , an orthogonalization procedure computes vectors $q_1,ldots,q_n in mathbf{R}^n$ such that

$S:=mbox{span} left{ a_1, ldots, a_k right} = mbox{span} left{ q_1, ldots, q_r right} ,$

where is the dimension of , and

q_i^Tq_j = 0 ;; (ine j), ;; q_i^Tq_i = 1 , ;; 1le i,j le r.

That is, the vectors (q_1,ldots,q_r) form an orthonormal basis for the span of the vectors a_1,ldots,a_k .

Basic step: projection on a line

A basic step in the procedure consists in projecting a vector on a line passing through zero. Consider the line

L(q) := left{ tq ~:~ t in mathbf{R} right} ,

where $q in mathbf{R}^n$ is given, and normalized ( |q|_2 = 1 ).

The projection of a given point $a in mathbf{R}^n$ on the line is a vector located on the line, that is closest to (in Euclidean norm). This corresponds to a simple optimization problem:

The vector $a_{rm proj} := t^ast q$ , where t^ast is the optimal value, is referred to as the projection of on the line L(q) . As seen here, the solution of this simple problem has a closed-form expression:

$a_{rm proj} = (q^Ta) q.$

Note that the vector can now be written as a sum of its projection and another vector that is orthogonal to the projection:

$a = (a-a_{rm proj}) + a_{rm proj} = (a-(q^Ta)q) + (q^Ta)q,$

where $a-a_{rm proj} = a-(q^Ta)q$ and $a_{rm proj} = (q^Ta)q$ are orthogonal. The vector $a-a_{rm proj}$ can be interpreted as the result of removing the component of along .

Gram-Schmidt procedure

The Gram-Schmidt procedure is a particular orthogonalization algorithm. The basic idea is to first orthogonalize each vector w.r.t. previous ones; then normalize result to have norm one.

Case when the vectors are independent

Let us assume that the vectors a_1,ldots,a_n are linearly independent. The GS algorithm is as follows.

Gram-Schmidt procedure:

set $tilde{q}_1 = a_1$ .
normalize: set $q_1 = tilde{q}_1 / |tilde{q}_1|_2$ .
remove component of in : set $tilde{q}_2 = a_2 - (a_2^Tq_1)q_1$ .
normalize: set $q_2 = tilde{q}_2/|tilde{q}_2|_2$ .
remove components of in : set $tilde{q}_3 = a_3 - (a_3^Tq_1)q_1 - (a_3^Tq_2)q_2$ .
normalize: set $q_3 = tilde{q}_3/|tilde{q}_3|_2$ .
etc.

The image of the left shows the GS procedure applied to the case of two vectors in two dimensions. We first set the first vector to be a normalized version of the first vector a_1 . Then we remove the component of a_2 along the direction q_1 . The difference is the (un-normalized) direction $tilde{q}_2$ , which becomes q_2 after normalization. At the end of the process, the vectors q_1,q_2 have both unit length and are orthogonal to each other.

The GS process is well-defined, since at each step $tilde{q}_i ne 0$ (otherwise this would contradict the linear independence of the a_i 's).

General case: the vectors may be dependent

It is possible to modify the algorithm to allow it to handle the case when the a_i 's are not linearly independent. If at step , we find $tilde{q}_i = 0$ , then we directly jump at the next step.

Modified Gram-Schmidt procedure:

set .
for :
1. set $tilde{q} = a_i - sum_{j=1}^r (q_j^Ta_i)q_j$ .
2. if , ; $q_r = tilde{q}/|tilde{q}|_2$ .

On exit, the integer is the dimension of the span of the vectors a_1,ldots,a_k .