• Definitions and interpretation

  • First-order approximation of non-linear maps

Definition and Interpretation

Definition

A map $f: \mathbf{R}^n \rightarrow \mathbf{R}^m$ is linear (resp. affine) if and only if every one of its components is. The formal definition we saw earlier for functions applies verbatim to maps.

To an $m \times n$ matrix $A$, we can associate a linear map $f : \mathbf{R}^n \rightarrow \mathbf{R}^m$, with values $f(x) = Ax$. Conversely, to any linear map, we can uniquely associate a matrix $A$ which satisfies $f(x) = Ax$ for every $x$.

Indeed, if the components of $f$, $f_i$, $i=1,\ldots,m$, are linear, then they can be expressed as $f_i(x) = a_i^Tx$ for some $a_i \in \mathbf{R}^n$. The matrix $A$ is the matrix that has $a_i^T$ as its $i$-th row:

$$ f(x) = \left(\begin{array}{c} f_1(x) \\ \vdots \\ f_m(x) \end{array}\right) = \left(\begin{array}{c} a_1^Tx \\ \vdots \\ a_m^Tx \end{array}\right) = Ax, \;\; \mbox{ with } A := \left(\begin{array}{c} a_1^T \\ \vdots \\ a_m^T \end{array}\right) \in \mathbf{R}^{m \times n}. $$

Hence, there is a one-to-one correspondence between matrices and linear maps. This extends what we saw for vectors, which are in one-to-one correspondence with linear functions.
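The correspondence can be checked numerically. The sketch below is only an illustration: the helper names `matvec` and `matrix_of_linear_map` and the example map `f` are assumptions introduced here, not part of the text. It recovers $A$ from a linear map by evaluating the map on the standard basis vectors, since column $j$ of $A$ equals $f(e_j)$.

```python
def matvec(A, x):
    """Return the matrix-vector product Ax, with A stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def matrix_of_linear_map(f, n):
    """Recover the matrix A of a map f: R^n -> R^m that is assumed linear.

    Column j of A is f(e_j), where e_j is the j-th standard basis vector.
    """
    columns = []
    for j in range(n):
        e_j = [1.0 if k == j else 0.0 for k in range(n)]
        columns.append(f(e_j))
    m = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


# Hypothetical linear map from R^3 to R^2, chosen only for illustration.
def f(x):
    return [2.0 * x[0] - x[2], x[0] + 3.0 * x[1]]


A = matrix_of_linear_map(f, 3)
x = [1.0, -2.0, 4.0]

print(A)             # rows are a_1^T and a_2^T
print(matvec(A, x))  # agrees with f(x)
print(f(x))
```

The rows of the recovered matrix are the vectors $a_i^T$ from the argument above, so `matvec(A, x)` and `f(x)` agree for every input.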

This is summarized as follows.

Representation of affine maps via the matrix-vector product. A function $f: \mathbf{R}^n \rightarrow \mathbf{R}^m$ is affine if and only if it can be expressed via a matrix-vector product:

$$ f(x) = Ax+b, $$

for some unique pair $(A,b)$, with $A \in \mathbf{R}^{m \times n}$ and $b \in \mathbf{R}^m$. The function is linear if and only if $b = 0$. $\diamondsuit$
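As a minimal sketch of why the pair $(A,b)$ is unique, the code below recovers it from evaluations of an affine map: $b = f(0)$, and column $j$ of $A$ is $f(e_j) - f(0)$. The function names `affine_parameters` and `apply_affine` and the example map `g` are hypothetical, introduced only for this illustration.

```python
def affine_parameters(f, n):
    """Recover (A, b) with f(x) = Ax + b, assuming f: R^n -> R^m is affine."""
    b = f([0.0] * n)                      # b = f(0)
    m = len(b)
    A = [[0.0] * n for _ in range(m)]
    for j in range(n):
        e_j = [1.0 if k == j else 0.0 for k in range(n)]
        f_ej = f(e_j)
        for i in range(m):
            A[i][j] = f_ej[i] - b[i]      # column j of A is f(e_j) - f(0)
    return A, b


def apply_affine(A, b, x):
    """Evaluate Ax + b, with A stored as a list of rows."""
    return [sum(a * x_j for a, x_j in zip(row, x)) + b_i
            for row, b_i in zip(A, b)]


# Hypothetical affine map from R^2 to R^3, chosen only for illustration.
def g(x):
    return [x[0] + 2.0 * x[1] + 1.0, x[0] - x[1], 3.0 * x[0] - 4.0]


A, b = affine_parameters(g, 2)
is_linear = all(b_i == 0.0 for b_i in b)  # linear if and only if b = 0

x = [2.0, 5.0]
print(A, b, is_linear)
print(apply_affine(A, b, x), g(x))        # the two evaluations agree
```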

The result above shows that a matrix can be seen as a (linear) map from the "input" space $\mathbf{R}^n$ to the "output" space $\mathbf{R}^m$. Both points of view (matrices as simple collections of vectors, or as linear maps) are useful.

Interpretations

Consider an affine map $x \rightarrow y = Ax+b$. The element $A_{ij}$ gives the coefficient of influence of $x_j$ on $y_i$. In this sense, if $|A_{13}| \gg |A_{14}|$, we can say that $x_3$ has much more influence on $y_1$ than $x_4$ does. Similarly, $A_{24} = 0$ means that $y_2$ does not depend on $x_4$ at all. The constant term $b = f(0)$ is often referred to as the "bias" vector.
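This reading of the entries can be illustrated with a small experiment. The matrix, bias vector, and helper names below are hypothetical values chosen for this sketch. Increasing $x_j$ by one unit changes $y_i$ by exactly $A_{ij}$, so a zero entry produces no change at all.

```python
def affine_map(A, b, x):
    """Evaluate y = Ax + b, with A stored as a list of rows."""
    return [sum(a * x_j for a, x_j in zip(row, x)) + b_i
            for row, b_i in zip(A, b)]


def output_change(A, b, x, j, delta=1.0):
    """Change in y when x_j (0-based index j) is increased by delta."""
    x_new = list(x)
    x_new[j] += delta
    y_old = affine_map(A, b, x)
    y_new = affine_map(A, b, x_new)
    return [u - v for u, v in zip(y_new, y_old)]


# Hypothetical 2 x 4 example with A_13 = 50 much larger than A_14 = 0.5,
# and A_24 = 0.
A = [[1.0, 0.0, 50.0, 0.5],
     [2.0, -1.0, 3.0, 0.0]]
b = [0.0, 1.0]
x = [1.0, 1.0, 1.0, 1.0]

dy_x3 = output_change(A, b, x, 2)  # unit change in x_3: column 3 of A
dy_x4 = output_change(A, b, x, 3)  # unit change in x_4: column 4 of A

print(dy_x3)  # first entry is large: x_3 strongly influences y_1
print(dy_x4)  # second entry is zero: y_2 does not depend on x_4
```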

First-order approximation of non-linear maps

Since maps are just collections of functions, we can approximate a map with a linear (or affine) map, just as we did with functions.

If $f : \mathbf{R}^n \rightarrow \mathbf{R}^m$ is differentiable, then we can approximate the (vector) values of $f$ near a given point $x_0 \in \mathbf{R}^n$ by an affine map $\tilde{f}$:

$$ f(x) \approx \tilde{f}(x) := f(x_0) + A (x-x_0), $$

where $A_{ij} = \frac{\partial f_i}{\partial x_j}(x_0)$ is the partial derivative of the $i$-th component of $f$ with respect to $x_j$, evaluated at $x_0$. ($A$ is referred to as the Jacobian matrix of $f$ at $x_0$.)
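The following sketch illustrates the approximation numerically. It is not the method of the text: the Jacobian is estimated here by central finite differences (an assumption made to keep the example self-contained), and the names `numerical_jacobian`, `affine_approximation`, and the example map `f` are hypothetical.

```python
import math


def numerical_jacobian(f, x0, h=1e-6):
    """Estimate A_ij = (d f_i / d x_j)(x0) by central finite differences."""
    n = len(x0)
    m = len(f(x0))
    A = [[0.0] * n for _ in range(m)]
    for j in range(n):
        x_plus = list(x0)
        x_minus = list(x0)
        x_plus[j] += h
        x_minus[j] -= h
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(m):
            A[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * h)
    return A


def affine_approximation(f, x0, A):
    """Return the affine map f_tilde(x) = f(x0) + A (x - x0)."""
    f0 = f(x0)

    def f_tilde(x):
        dx = [x_j - x0_j for x_j, x0_j in zip(x, x0)]
        return [f0_i + sum(a * d for a, d in zip(row, dx))
                for f0_i, row in zip(f0, A)]

    return f_tilde


# Hypothetical non-linear map from R^2 to R^2, chosen only for illustration.
# Its exact Jacobian is [[x_2, x_1], [cos(x_1), 2 x_2]].
def f(x):
    return [x[0] * x[1], math.sin(x[0]) + x[1] ** 2]


x0 = [1.0, 2.0]
A = numerical_jacobian(f, x0)
f_tilde = affine_approximation(f, x0, A)

x = [1.01, 1.98]  # a point close to x0
error = max(abs(u - v) for u, v in zip(f(x), f_tilde(x)))
print(A)
print(error)      # small, because x is close to x0
```

The approximation error shrinks as $x$ approaches $x_0$. Far from $x_0$, the affine map $\tilde{f}$ can differ substantially from $f$.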

Examples: