Scalar Product, Norms and Angles

  • Scalar product

  • Norms

  • Three popular norms

  • Cauchy-Schwarz inequality

  • Angles between vectors

Scalar product


The scalar product (also called the inner product or dot product) of two vectors $x, y \in \mathbf{R}^n$ is the scalar denoted $x^Ty$, defined as

 $$x^Ty = \sum_{i=1}^n x_i y_i.$$

The motivation for this notation will come later, when we define the matrix-matrix product. The scalar product is also sometimes denoted $\langle x, y \rangle$, a notation that originates in physics.

In Matlab, we use a notation consistent with the later definition of the matrix-matrix product.

Matlab syntax
>> x = [1; 2; 3]; y = [4; 5; 6]; % two column vectors in R^3
>> scal_prod = x'*y; % x' is the transpose of x, so x'*y is the scalar product
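
As a quick check on the definition, the scalar product of the two vectors used in the Matlab snippet above can be worked out by hand, term by term:

```latex
x^Ty = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = 4 + 10 + 18 = 32.
```

The variable scal_prod should therefore evaluate to 32.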



We say that two vectors $x, y \in \mathbf{R}^n$ are orthogonal if $x^Ty = 0$.

Example: Two orthogonal vectors in $\mathbf{R}^3$.
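
A minimal sketch of an orthogonality check in Matlab. The two vectors below are chosen here purely for illustration and are not taken from the example referred to above.

```matlab
% Illustrative vectors (an assumption, not from the original example)
x = [1; 0; -1];
y = [1; 2; 1];

% Scalar product: 1*1 + 0*2 + (-1)*1 = 0
scal_prod = x'*y;

% With integer entries the product is exactly zero; for general
% floating-point data, compare against a small tolerance instead.
is_orthogonal = (scal_prod == 0);
```

For data with rounding error, a test such as abs(x'*y) <= tol*norm(x,2)*norm(y,2), with a small tolerance tol of your choosing, is usually a safer check than exact equality.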



Norms

Measuring the size of a scalar value is unambiguous: we just take the magnitude (absolute value) of the number. In higher dimensions, however, there are many possible ways to define the size, or length, of a vector. These choices are encapsulated in the notion of a norm.

Norms are real-valued functions that satisfy a basic set of rules that a sensible notion of size should obey. You can consult the formal definition of a norm here. The norm of a vector $v$ is usually denoted $\|v\|$.
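
For reference, the rules usually required of a norm can be summarized as follows. This is the standard textbook definition, stated here as an assumption about what the formal definition referred to above contains. For all $x, y \in \mathbf{R}^n$ and all scalars $\alpha \in \mathbf{R}$:

```latex
\begin{align*}
&\|x\| \ge 0, \quad \text{with } \|x\| = 0 \text{ if and only if } x = 0 && \text{(definiteness)} \\
&\|\alpha x\| = |\alpha| \, \|x\| && \text{(homogeneity)} \\
&\|x + y\| \le \|x\| + \|y\| && \text{(triangle inequality)}
\end{align*}
```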

Three popular norms

In this course, we focus on the following three popular norms for a vector $x \in \mathbf{R}^n$:


The Euclidean norm:

 $$\|x\|_2 := \sqrt{\sum_{i=1}^n x_i^2} = \sqrt{x^Tx},$$

corresponds to the usual notion of distance in two or three dimensions. The set of points with equal $l_2$-norm is a circle (in 2D), a sphere (in 3D), or a hyper-sphere in higher dimensions.


The $l_1$-norm:

 $$\|x\|_1 := \sum_{i=1}^n |x_i|,$$

corresponds to the distance travelled on a rectangular grid to go from one point to another.


The $l_\infty$-norm:

 $$\|x\|_\infty := \max_{1 \le i \le n} |x_i|,$$

is useful in measuring peak values.

Matlab syntax
>> x = [1; 2; -3];
>> r2 = norm(x,2); % l2-norm
>> r1 = norm(x,1); % l1 norm
>> rinf = norm(x,inf); % l-infty norm


  • A given vector will in general have different "lengths" under different norms. For example, the vector $x = [1,-2,3]^T$ yields $\|x\|_2 \approx 3.7417$, $\|x\|_1 = 6$, and $\|x\|_\infty = 3$.

  • Sample standard deviation.
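
The values quoted in the first bullet above can be reproduced directly from the three definitions. The following is a small sketch that also compares each result with Matlab's built-in norm function; the variable names are chosen here for illustration only.

```matlab
x = [1; -2; 3];

% Norms computed directly from their definitions
n2   = sqrt(sum(x.^2));   % Euclidean norm, equals sqrt(x'*x) = sqrt(14)
n1   = sum(abs(x));       % l1-norm: 1 + 2 + 3 = 6
ninf = max(abs(x));       % l-infinity norm: largest magnitude = 3

% Differences from the built-in norm function; each should be zero
% up to rounding error
d2   = abs(n2 - norm(x,2));
d1   = abs(n1 - norm(x,1));
dinf = abs(ninf - norm(x,inf));
```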

Cauchy-Schwarz inequality

The Cauchy-Schwarz inequality allows us to bound the scalar product of two vectors in terms of their Euclidean norms.

Theorem: Cauchy-Schwarz inequality

For any two vectors $x, y \in \mathbf{R}^n$, we have

 $$x^Ty \le \|x\|_2 \cdot \|y\|_2.$$

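The above inequality is an equality if and only if $x$ and $y$ are collinear and point in the same direction, that is, one of them is a nonnegative multiple of the other. In particular: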

 $$\max_{x \,:\, \|x\|_2 \le 1} \; x^Ty = \|y\|_2,$$

with optimal $x$ given by $x^\ast = y/\|y\|_2$ if $y$ is non-zero.

For a proof, see here. The Cauchy-Schwarz inequality can be generalized to other norms, using the concept of dual norm.
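
The theorem can be checked numerically on a small example. The sketch below uses two vectors chosen here for illustration (they do not come from the original text), and verifies both the inequality and the fact that the normalized vector y/||y||_2 attains the maximum.

```matlab
% Illustrative vectors (an assumption, not from the original text)
x = [1; -2; 3];
y = [4; 0; -1];

lhs = x'*y;                   % scalar product: 4 + 0 - 3 = 1
rhs = norm(x,2)*norm(y,2);    % product of Euclidean norms: sqrt(14)*sqrt(17)
cs_holds = (lhs <= rhs);      % the inequality of the theorem

% The unit vector x_star = y/||y||_2 attains the maximum of x'*y over
% the unit ball: x_star'*y equals ||y||_2 up to rounding error
x_star = y / norm(y,2);
gap = abs(x_star'*y - norm(y,2));
```

A numerical check of this kind is not a proof, but it is a convenient sanity test when implementing the quantities involved.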

Angles between vectors

When neither of the vectors $x, y$ is zero, we can define the angle between them as the $\theta \in [0, \pi]$ such that

 $$\cos\theta = \frac{x^Ty}{\|x\|_2 \|y\|_2}.$$

Applying the Cauchy-Schwarz inequality above to $(x,y)$ and $(x,-y)$, we see that the number above is indeed in $[-1,1]$.

The notion above generalizes the usual notion of angle between two directions in two dimensions, and is useful in measuring the similarity (or closeness) between two vectors. When the two vectors are orthogonal, that is, $x^Ty = 0$, we do obtain that their angle is $\theta = 90^\circ$.
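
The angle formula above translates into a few lines of Matlab. This is a minimal sketch with two illustrative nonzero vectors chosen here (not taken from the original text); the clamping step is a defensive addition against rounding error rather than part of the definition.

```matlab
% Illustrative nonzero vectors (an assumption, not from the original text)
x = [1; 0; 0];
y = [1; 1; 0];

% Cosine of the angle: 1 / (1 * sqrt(2))
cos_theta = (x'*y) / (norm(x,2)*norm(y,2));

% Rounding can push the ratio slightly outside [-1,1]; clamp before acos
cos_theta = max(-1, min(1, cos_theta));

theta_rad = acos(cos_theta);      % angle in radians, in [0, pi]
theta_deg = theta_rad * 180/pi;   % about 45 degrees for these vectors
```

Replacing y with a vector orthogonal to x, such as [0; 1; 0], gives cos_theta = 0 and an angle of 90 degrees, in agreement with the remark above.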