{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "#Eigenfaces" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load the principal components and a few sample faces" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import matplotlib.cm as cm\n", "import scipy.io as sio\n", "\n", "mat_contents = sio.loadmat('data/sample_faces.mat') # This file contains 10 example faces from the training set\n", "faces = mat_contents['faces']\n", "mat_contents = sio.loadmat('data/eigenface_components.mat') # This file contains the first 250 principal components\n", "Upca = mat_contents['Upca']\n", "mat_contents = sio.loadmat('data/eigenface_var.mat') # This file contains the singular values of the components\n", "var = mat_contents['var']\n", "mat_contents = sio.loadmat('data/eigenface_points.mat') # This file contains a set of 400 low-dimensional faces\n", "points = mat_contents['test']" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "Let's take a look at some of the sample faces" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# The faces are stored in a 62500x10 matrix where a column of the matrix contains the pixel values\n", "# of the image with each column of pixels placed one after the other\n", "for i in range(1,4): # This will plot the first 3 faces in the set \"faces\"\n", " fig = plt.figure()\n", " face = faces[:,i]\n", " face = np.reshape(face, (250,250), order='F') # We need to reshape the vector into a 250x250 image\n", " plt.imshow(face, cmap=cm.Greys_r)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's see what happens when project these faces onto the first 50 principal components" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Your Code Here:\n", "d = 100 #STUDENT, adjust this value to control the number of principal components used\n", "# you can set d to be any value between 1 and 100\n", "## # End Code #\n", "\n", "for i in range(1,4): # This will plot the first 3 faces in the set \"faces\"\n", " fig = plt.figure()\n", " face = faces[:,i]\n", " face = np.dot(Upca[:,0:d].T, face) # Project down into the d dimensional space\n", " face = np.dot(Upca[:,0:d], face) # Reconstruct the original image\n", " face = np.reshape(face, (250,250), order='F')\n", " plt.imshow(face, cmap=cm.Greys_r)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's take a look at what information is contained in the first 5 principal components to get an idea of what each component is adding to the image" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Your Code Here:\n", "for i in range(1,6): #STUDENT, adjust this range to see what information is contained in other principal components\n", " fig = plt.figure()\n", " face = Upca[:,i]\n", " face = np.reshape(face, (250,250), order='F')\n", " plt.imshow(face, cmap=cm.Greys_r)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first two principal components capture the largest amount of variance in the data set. Therefore, we can attempt to visualize the data set by plotting the first two dimensions of the transformed faces." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Set up the plot so that it's the same size regardless of the data we plot\n", "fig = plt.figure(figsize=(12,8))\n", "plt.axes(xlim=(-200,50), ylim=(-100,50))\n", "\n", "# Your Code Here:\n", "#STUDENT, adjust these values to plot different dimensions of the low-dimensional faces\n", "x = 0 # This is the index of the principal component to plot on the x axis\n", "y = 1 # This is the index of the principal component to plot on the y axis\n", "## # End Code #\n", "# Plot the set of 400 faces picked randomly from the data set\n", "plt.plot(points[x,:],points[y,:],'.b',markersize=10)\n", "lowFaces = np.dot(Upca.T, faces)\n", "\n", "# Plot the 3 sample faces from above as red points to see where they lie compared to the data\n", "r=range(1,4)\n", "plt.plot(lowFaces[x,r],lowFaces[y,r],'.r',markersize=16)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Because we can compress an image of a face to a lower dimensional space, we can try to generate a \"random\" face by picking a point in the low dimensional space and then reconstructing the resulting image by multiplying by the transformation matrix U. Let's see what kind of results we can get." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# We want to generate a random vector, but we don't want to pick a random point that lies far away\n", "# from the training data. Therefore, we use the singular values and the mean of the training data to\n", "#ensure that the random vector is close to the training data\n", "\n", "d = 100 #STUDENT, adjust this value to control the number of principal components used\n", "# you can set d to be any value between 1 and 100\n", "\n", "cov = np.diag(np.asarray(var.T[0])) # This matrix uses the singular values to specify\n", " # the amount of variance in each dimension\n", "cov = cov[0:d,0:d]\n", "mean = np.mean(points, axis=1) # We use the set of 400 examples to approximate the mean for each principal component\n", "\n", "for i in range(1,4): # Let's generate a few examples\n", " fig = plt.figure()\n", " face = np.asmatrix(np.random.multivariate_normal(mean[0:d],cov)).T\n", " face = np.dot(Upca[:,0:d], face) # Transform the d-dimensional vector to the 250x250 image space\n", " face = np.reshape(face, (250,250), order='F')\n", " plt.imshow(face, cmap=cm.Greys_r)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Image Processing by Clustering" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [], "source": [ "# set up the environment\n", "%pylab inline\n", "from PIL import Image\n", "from scipy.cluster.vq import vq, kmeans, whiten" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##Color Quantization" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The goal of color quantization is to reduce the number of colors used in an image, while trying to make the new image visually similar to the original image. This is an essentail technique for lossy image compression. In this homework problem, you will learn how to use the k-means algorithm to perform color quantization." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def applyCentroids(input_features, centroids):\n", " \"\"\" This function replaces each pixel in input_features with the nearest centroid\n", " and returns as out_features.\n", " input_features is a list of sub-lists. Each sub-list contains features of one data point.\n", " \"\"\" \n", " pixel_index,_ = vq(input_features, centroids) # return the nearest centroid's index in centroids. \n", " out_features = [uint8(centroids[idx]) for idx in pixel_index]\n", " return out_features" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "###(a) Let's start with a small grayscale example." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A grayscale image is an image, where each pixel is expressed by a single value, the intensity information. That is, there is only one feature for each pixel (data point)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "example_gray = [[111, 178, 113, 0], # a list of sublists; each sub-list stands for one row.\n", " [ 63, 115, 27, 89],\n", " [175, 234, 135, 122],\n", " [123, 134, 235, 169]]\n", "example_input = [uint8(row) for row in example_gray] \n", "imshow(example_input, cmap = cm.Greys_r, interpolation = 'none')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the above image, the intensity of each pixel is stored by an unsigned integer, within the range [0, 255].\n", "Now please perform the k-mean algorithm for the 16 pixels on the 4 by 4 image with k = 3. The centroids of the three clusters are the representive colors for pixels. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "## Start Student solution\n", "# Your Code Here: please fill in the values of quantilization_example after performing k-means with k = 3\n", "quantization_example = [] # check the format of this list; it should be a list of sub-lists.\n", "## # End Code #\n", "## End Student solution\n", "example_output = [uint8(row) for row in quantization_example]\n", "imshow(example_output, cmap = cm.Greys_r, interpolation = 'none')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "###(b) Let's do color quantization for a grayscale image." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [], "source": [ "# show the original graysacle image\n", "gray_im = Image.open('figs/gray_photo.jpg', 'r').convert(\"L\") # read in the image in grayscale\n", "gray_list_in = np.asarray(gray_im) # this is a list of sub-lists. Each sub-list represents one row of pixels.\n", "imshow(gray_list_in,cmap = cm.Greys_r) # use grayscale color map to show this image." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "(h, w) = gray_list_in.shape # each pixel only has one feature.\n", "print (\"Image size: height = \", h, \" , width = \", w) # show the height and width for this image.\n", "features_gray = [[double(pixel)] for row in gray_list_in for pixel in row ] # arrange the data for k-means\n", "#Remember gray_list_in is a list of sub-lists. Each sub-list represents one row of pixels.\n", "#Because each pixel only have one feature, each element in one sub-list is the feature for that pixel." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are two main tasks in color quantization: (1) decide the representation colors, and (2) determine which representation color each pixel should be assigned. Both the two tasks can be done by the k-means algorithm:\n", "It groups data points into different clusters (with a given k as the number of clusters), and the centroid of each cluster will be the representation color for those pixels inside the cluster." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "## Start Student solution\n", "# Your Code Here: please fill in the value of k-gray!\n", "k_gray = 0 # replace 0 with some natural number.\n", "## # End Code #\n", "## End Student solution\n", "\n", "centroids_gray, distortion_gray = kmeans(features_gray, k_gray) # apply k-means\n", "print (\"grayscale distortion:\", distortion_gray)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When you modify the value of k_gray, how does distortion_gray change? " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "out_gray = applyCentroids(features_gray, centroids_gray)\n", "# replace pixels with the centroids of the cluster.\n", "\n", "gray_array_out = np.reshape(np.array(out_gray), (h, w))# arrange pixels back to a nested list for display\n", "imshow(gray_array_out, cmap = cm.Greys_r)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Please play with different values of k_gray and rerun the k-means algorithm. How does the image change?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (c) Now Let's try to work on a color image!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Color digital images are composed of pixels, which are made of combinations of primary colors. Here we use the RGB (Red, Green, Blue) color model to produce colors on images. There are three features for each pixel (data): R, G, and B for intensities of each primary color. Those values are stored by unsigned integers, within the range [0, 255]." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [], "source": [ "color_im = Image.open('figs/color_berkeley.jpg', 'r') # load the image in RGB color model\n", "color_array_in = np.asarray(color_im)\n", "# color_array_in is a list of sub-lists; each sub-list contains sub-sub-lists. \n", "# Each sub-list represents one row of the image.\n", "# Each sub-sub-list contains three elements as R, G, B intensities for each pixel. \n", "imshow(color_array_in)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "h, w, num= color_array_in.shape # each pixel has three features(R, G, B)!\n", "print (\"Image size: height = \", h, \" , width = \", w, \", feature numbers =\", num)\n", "features_color = [double(pixelRGB) for row in color_array_in for pixelRGB in row ]\n", "# flatten the 2D image into a list of sub-lists; each sub-list represents a data point." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we want to do color quantization for this color image. There are three features for each data point." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "## Start Student solution\n", "# Your Code Here: please fill in the value of k_color!\n", "k_color = 0 # replace 0 with a natural number.\n", "## # End Code #\n", "## End Student solution\n", "centroids_RGB, distortion_color = kmeans(features_color, k_color) # run k-means\n", "print (\"RGB model distortion:\", distortion_color)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When you modify the value of k_color, how does distortion_color change?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "out_colors = applyCentroids(features_color, centroids_RGB) \n", "# replace pixels with the centroids of the clusters.\n", "outImage = np.reshape(np.array(out_colors), (h, w, 3 )) # arrange the pixels for display\n", "imshow(outImage)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "Try to modify the value of k_color, observe the final image. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Image Segmentation\n", "Image segmenatation can be done with a similar flow: Cluster pixels by their features, and then label pixels with the indicating color for its segment." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def segmentByCentroids(input_features, centroids, k):\n", " \"\"\"This function decides the cluster(segment) each pixel belongs to,\n", " and then uses the indicating color for that segment to label it.\n", " \"\"\"\n", " # \n", " segment_color = [[255, 0, 0], # Red\n", " [ 0, 255, 0], # Green\n", " [ 0, 0, 255], # Blue\n", " [255, 255, 0], # Yellow\n", " [ 0, 0, 0], # Black\n", " [255, 255, 255], # White\n", " [155, 0, 255], # Purple\n", " [255, 128, 0]]; # Orange\n", " if k > 8:\n", " print (\"ERROR! k should not be greater than 8.\\n\")\n", " return input_features\n", " pixel_index,_ = vq(input_features,centroids) # return the nearest centroid's id for each pixel \n", " results = [uint8(segment_color[idx]) for idx in pixel_index] # assign the label each pixel\n", " return results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (d) Image segmentation by intensity." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# show the original image\n", "color_im = Image.open('figs/fish.jpg', 'r')\n", "color_array_in = np.asarray(color_im) \n", "imshow(color_array_in)\n", "h, w, num = color_array_in.shape # each pixel has three features(R, G, B)!\n", "print (\"Image height =\", h, \", image width = \", w, \"color intensity number = \", num)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Take a look of the original image, how many segments (objects) are there?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# here we apply specific weights to R, G, and B to get the intensity.\n", "features_intensity = [ [RGB[0]*0.2126+RGB[1]*0.7152+RGB[2]*0.0722] for row in color_array_in for RGB in row] # \n", "\n", "## Start Student solution\n", "# Your Code Here: please fill in the value of k_segment_intensity!\n", "k_segment_intensity = 0 # replace 0 with a natural number.\n", "## # End Code #\n", "## End Student solution\n", "centroids_segment, distortion_segment = kmeans(features_intensity, k_segment_intensity)\n", "print (\"distortion for intensity as features: \", distortion_segment)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we use values of [R,G,B] to get the intensity of each pixel.\n", "See wiki of grayscale for more details: https://en.wikipedia.org/wiki/Grayscale" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "results_intensity = segmentByCentroids(features_intensity, centroids_segment, k_segment_intensity) \n", "\n", "outImage = np.reshape(np.array(results_intensity), (h, w, 3 ))\n", "\n", "imshow(outImage)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Does it work as you expected? Modify the value of k_segment_intensity and rerun the flow, do you find something?\n", "If you like this problem, please take the courses about image processing and computer vision." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.4.4" }, "name": "_merged" }, "nbformat": 4, "nbformat_minor": 0 }