Linear regression

A popular example of a least-squares problem arises in the context of fitting a line through points. This is illustrated below in two dimensions.

[Figure: two-dimensional data points with a fitted line; the errors in the y component are shown as blue dotted lines, and the prediction at a new price is shown in red.]

In this example, we seek to analyze how customers react to an increase in the price of a given item. We are given two-dimensional data points $(z_i, y_i)$, $i = 1, \ldots, m$. The $z_i$'s contain the prices of the item, and the $y_i$'s the average number of customers who buy the item at that price.

The generic equation of a non-vertical line is $y = x_1 z + x_2$, where $x = (x_1, x_2)$ contains the decision variables. The quality of the fit of a given line is measured by the sum of the squared errors in the $y$ component (the blue dotted lines in the figure). Thus, the best least-squares fit is obtained by solving the least-squares problem

$$
\min_x \;\; \sum_{i=1}^m \left( x_1 z_i + x_2 - y_i \right)^2 .
$$
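As a concrete illustration, here is a minimal sketch of this least-squares fit in NumPy. The data values below are made up for the example, and the variable names z, y, x1, x2 mirror the notation above.

```python
import numpy as np

# Hypothetical data: prices z_i and average number of buying customers y_i.
z = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
y = np.array([95.0, 90.0, 82.0, 76.0, 70.0, 63.0])

# Design matrix with rows [z_i, 1], so that A @ [x1, x2] gives x1*z_i + x2.
A = np.column_stack([z, np.ones_like(z)])

# Solve min_x ||A x - y||^2 in the least-squares sense.
x, residuals, rank, _ = np.linalg.lstsq(A, y, rcond=None)
x1, x2 = x
print(f"fitted line: y = {x1:.3f} * z + {x2:.3f}")
```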

Once the line is found, it can be used to predict the value of the average number of customers buying the item ($y$) for a new price ($z$). The prediction is shown in red.
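Continuing the sketch above (and reusing its fitted coefficients x1, x2), prediction is simply an evaluation of the fitted line at a new, hypothetical price:

```python
# Predict the average number of customers at a new price z_new.
z_new = 2.75
y_pred = x1 * z_new + x2
print(f"predicted average number of customers at price {z_new}: {y_pred:.1f}")
```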

The linear regression approach can be extended to multiple dimensions, that is, to problems where the variable $z$ in the above problem contains more than one component. It can also be extended to the problem of fitting non-linear curves.
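A rough sketch of both extensions, again with hypothetical data: several explanatory variables are handled by stacking them as columns of the design matrix, and one common way to fit a non-linear curve is to use powers of $z$ as columns (a polynomial fit). The second explanatory variable below (advertising spend) is invented purely for illustration.

```python
import numpy as np

z = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
y = np.array([95.0, 90.0, 82.0, 76.0, 70.0, 63.0])

# Multiple dimensions: each row holds several features plus a constant term,
# e.g. price and a second (hypothetical) variable such as advertising spend.
ad = np.array([3.0, 2.5, 2.0, 2.5, 1.5, 1.0])
A_multi = np.column_stack([z, ad, np.ones_like(z)])
x_multi, *_ = np.linalg.lstsq(A_multi, y, rcond=None)

# Non-linear curve: fit a quadratic in z by using the columns [z^2, z, 1].
A_quad = np.vander(z, N=3)  # columns are z^2, z, 1
x_quad, *_ = np.linalg.lstsq(A_quad, y, rcond=None)

print("multi-feature coefficients:", x_multi)
print("quadratic coefficients:", x_quad)
```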