{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# EECS 16A Discussion 6D" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problem 3: Polynomial Fitting" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this discussion, we are trying to fit data (observations) of the form $\\{(x_i,y_i),i=1,2,...,n\\}$ to a polynomial that we know looks like this:\n", "\n", "$$y = f(x) = a_0 + a_1x + a_2x^2 + a_3x^3 + a_4x^4$$\n", "\n", "In other words, we want to find $a_0$, $a_1$, $a_2$, $a_3$, and $a_4$ that best fit the data.\n", "\n", "More generally, we might want to fit the data to a polynomial of different degree (for instance, if we do not know that the polynomial looks like as above), so we could try to solve for some $a_0,a_1,\\ldots,a_d$ that define a $d$-degree polynomial.\n", "\n", "Note that the observations are not perfect -- they are *noisy*, which means that $y_i \\neq f(x_i)$ in general! That is what makes this problem interesting.\n", "\n", "This first block of code contains functions that will help us set up some useful objects -- the polynomial curve for a set of parameters $a_0$, $a_1$, $a_2$, $a_3$, and $a_4$ and a \"data\" matrix that we will use to compute the least squares estimate later." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Initialization" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import numpy.matlib\n", "import matplotlib.pyplot as plt\n", "from mpl_toolkits.mplot3d import Axes3D\n", "from numpy import pi, cos, exp\n", "\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Construct Data Matrix\n", "Use this block to make the data matrix. The input data is given in the array x_data, and the output data is given in the array y_data.\n", "\n", "To construct the data matrix, which we call DataMat, you only need to specify what degree polynomial you will use to fit the data.\n", "\n", "If x_data has the form x_{data}$=\\begin{bmatrix}x_1\\\\x_2\\\\ \\vdots \\\\x_n \\end{bmatrix}$, then DataMat has the form DataMat $=\\begin{bmatrix}1 &x_1&x_1^2& \\cdots &x_1^d\\\\1 &x_2&x_2^2& \\cdots &x_2^d\\\\ \\vdots & \\vdots & \\vdots & \\ddots & \\vdots \\\\1 &x_n&x_n^2& \\cdots &x_n^d \\end{bmatrix}$, where $d$ is the degree of the polynomial. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Function that defines a data matrix for some input data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\"\"\"You'll understand why it's constructed like this after\n", "doing the worksheet!\"\"\"\n", "def data_matrix(input_data,degree): \n", " # degree is the degree of the polynomial you plan to fit the data with \n", " Data=np.zeros((len(input_data),degree+1))\n", " \n", " for k in range(0,degree+1):\n", " Data[:,k]=(list(map(lambda x:x**k ,input_data)))\n", " \n", " return Data\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Function that helps with plotting " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\"\"\"Function that defines the polynomial curve for a set of parameters and a range. The set of parameters defines the degree of the\n", "polynomial.\"\"\"\n", "def poly_curve(params,x_input):\n", " # params contains the coefficients that multiply the polynomial terms, in degree of lowest degree to highest degree\n", " degree=len(params)-1\n", " x_range=[np.min(x_input), np.max(x_input)]\n", " x=np.linspace(x_range[0],x_range[1],1000)\n", " y=x*0\n", " \n", " for k in range(0,degree+1):\n", " coeff=params[k]\n", " y=y+list(map(lambda z:coeff*z**k,x)) \n", " return x,y\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Printing the data matrix" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x_data = [0.0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5]\n", "y_data = [24.0, 6.61, 0, -0.95, 0.07, 0.73, -0.12, -0.83, -0.04, 6.42]\n", "\n", "## Use the function data_matrix to compute D\n", "D = ?\n", "print(D)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Least Squares\n", "Recall the system of equations we are trying to solve here:\n", "\n", "$$\\texttt{D}\\vec{a} = \\vec{y},$$ where $\\vec{a} = \\begin{bmatrix}a_0\\\\a_1\\\\:\\\\a_d\\end{bmatrix}$ and $\\vec{y} = \\begin{bmatrix}y_1\\\\y_2\\\\ : \\\\y_n \\end{bmatrix}$.\n", "\n", "In the next block, you will implement code to compute $\\vec{a}$, the set of optimal polynomial coefficients (optimal in a least squares sense) to fit the data. Put the coefficients in an array called params." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Finding the least squares solution" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "params = ?\n", "print(params)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plot Curve Fit\n", "Use the next block to compare your fitted polynomial to the data. All you need to do is enter the polynomial coefficients (in degree of lowest degree to highest degree) into an array called params. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting the fitted polynomial and the original datapoints" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig=plt.figure()\n", "ax=fig.add_subplot(111,xlim=[-1,5],ylim=[-5,25])\n", "x,y=poly_curve(params,x_data) ###### Plotting the fitted data\n", "\n", "ax.plot(x_data,y_data,'.b',markersize=15) ###### Plotting the original data\n", "ax.plot(x,y,'r') \n", "ax.legend(['Data points','line fit'])\n", "plt.title('Polynomial of Degree %d' %(len(params)-1),fontsize=16)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [Practice] Another Polynomial Fitting Problem" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Input and output data\n", "\n", "x_data is the input (set of $x_i$'s), and y_data is the output (set of $y_i$'s). Here we plot our test points. Does it look like $y$ should be a quartic polynomial in $x$?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(10)\n", "\n", "# Parameters corresponding to the polynomial we want: $y = 1600 + 40x -102x^2 -x^3 + x^4$\n", "smarap=np.array([1600,40,-102,-1,1])/100.0\n", "x_data=np.linspace(-11,11,200)\n", "y_data=np.dot(data_matrix(x_data,4),smarap)+(np.random.rand(len(x_data))-0.5)*10\n", "\n", "y_data_training = y_data[0::2]\n", "x_data_training = x_data[0::2]\n", "y_data_test = y_data[1::2]\n", "x_data_test = x_data[1::2]\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting the original datapoints" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# x_data and y_data have already been defined\n", "\n", "fig=plt.figure()\n", "ax=fig.add_subplot(111,xlim=[-11,11],ylim=[-21,21])\n", "x,y=poly_curve(smarap,x_data)\n", "ax.plot(x_data,y_data,'.b',markersize=15)\n", "ax.legend(['Data points'])\n", "print(len(x_data))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Generating the matrix D" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "degree=4\n", "D = data_matrix(x_data,degree)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Finding the least squares solution" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "params = np.linalg.inv(np.transpose(D) @ D) @ np.transpose(D) @ y_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plot Curve Fit\n", "Use the next block to plot the fitted polynomial and find the magnitute of error. \n", "\n", "Once you're done, re-run all the blocks of code for this problem for $\\textbf{different degrees}$! What do you observe?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "error = np.linalg.norm(y_data-np.dot(D,params))\n", "print(error)\n", "\n", "fig=plt.figure()\n", "ax=fig.add_subplot(111,xlim=[-11,11],ylim=[-21,21])\n", "x,y=poly_curve(params,x_data)\n", "\n", "ax.plot(x_data,y_data,'.b',markersize=15)\n", "ax.plot(x,y,'r') \n", "ax.legend(['Data points','line fit'])\n", "plt.title('Polynomial of Degree %d, ||error||=%f' %(len(params)-1,error),fontsize=16)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.1" } }, "nbformat": 4, "nbformat_minor": 1 }