CS 294-26: Intro to Computer Vision and Computational Photography, Fall 2022

Project 3: Face Morphing

Katherine Song (cs-194-26-acj)



Overview

In this project, we apply what we learned in class about manual keypoint selection, Delaunay triangulation, and affine transforms to warp faces into the shapes of other faces (or of population means), to morph one face into another (in both shape and color), and to create caricatures by extrapolating from a population mean.

Part 1: Defining Correspondences

First, I cropped and rescaled my passport photo to match the dimensions of Martin Schoeller's George portrait:

Me
George

Next, I wrote a function for picking correspondences. The function lets the user specify how many points to pick and then uses ginput to collect alignment points on the two images shown side by side: one point on image 1, then the corresponding point on image 2, then the next point on image 1, and so on. I selected 65 points on my face and George's, and then added the 4 corner points of the image to each alignment point list so that the entire image would be covered by triangles. I then computed the mean of the two point sets (halfway between each pair of corresponding points) and used it to generate the Delaunay triangulation. The resulting keypoints are in red, and the triangulation mesh is drawn in blue on each image below:

Keypoints in red with Delaunay triangulation mesh in blue
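Below is a minimal sketch of the correspondence picking and triangulation just described. The function names (pick_correspondences, add_corners) and the image variables (im_me, im_george) are my own placeholders; the alternating-click convention follows the text.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Delaunay

def pick_correspondences(im1, im2, n_points):
    """Click n_points pairs, alternating: image 1, then the matching point on image 2."""
    fig, (ax1, ax2) = plt.subplots(1, 2)
    ax1.imshow(im1)
    ax2.imshow(im2)
    clicks = np.array(plt.ginput(2 * n_points, timeout=0))  # blocks until all clicks are made
    plt.close(fig)
    return clicks[0::2], clicks[1::2]                        # points on image 1, points on image 2

def add_corners(pts, h, w):
    """Append the four image corners so the triangulation covers the whole image."""
    corners = np.array([[0, 0], [w - 1, 0], [0, h - 1], [w - 1, h - 1]])
    return np.vstack([pts, corners])

pts1, pts2 = pick_correspondences(im_me, im_george, 65)
pts1 = add_corners(pts1, *im_me.shape[:2])
pts2 = add_corners(pts2, *im_george.shape[:2])
mean_pts = (pts1 + pts2) / 2.0
tri = Delaunay(mean_pts)   # triangulation is defined on the average shape
```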

Part 2: Computing the "Mid-way Face"

To compute the mid-way face, I first computed a list of "average" points by taking the average of each pair of corresponding keypoints on my face and George's. Since I had used the average points to compute the Delaunay triangulation, this list was simply the point set of the triangulation structure. The meat of the task was to figure out which pixels in my photo and in George's each pixel in the mid-way face corresponds to, so that the appropriate pixel colors could be averaged. To do this, we loop over the triangles (not the pixels!) in the average shape. For each triangle in the average shape, I found the corresponding triangle in my image using the triangulation's simplices. I then used the triangles' vertices to compute the affine matrix A that maps points in my image's triangle to points in the corresponding mid-way triangle. Since A*p = p' (where p holds the homogeneous coordinates of my image's triangle vertices and p' holds those of the mid-way triangle), A is simply p'*inv(p). I then used polygon to find all the pixels contained in the mid-way face's triangle and calculated the corresponding pixels in my image by applying the inverse of A to those pixels (p = inv(A)*p'). I repeated the process for George's photo, averaged the colors pulled from my photo and from George's, and placed them in the mid-way face. This process was repeated for every triangle.
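A minimal sketch of this per-triangle inverse warp is below. The function names are my own, the keypoint arrays and triangulation are those from Part 1, and source coordinates are sampled with nearest-neighbor rounding (boundary clipping is only added later, in Part 4).

```python
import numpy as np
from skimage.draw import polygon

def affine_matrix(src_tri, dst_tri):
    """3x3 matrix A such that A @ [x, y, 1]^T maps the src triangle vertices to the dst vertices."""
    p = np.vstack([src_tri.T, np.ones(3)])        # columns are homogeneous src vertices
    p_prime = np.vstack([dst_tri.T, np.ones(3)])  # columns are homogeneous dst vertices
    return p_prime @ np.linalg.inv(p)             # A = p' * inv(p)

def warp_to_shape(im, src_pts, dst_pts, tri):
    """Inverse-warp im so that src_pts move to dst_pts, one triangle at a time."""
    warped = np.zeros_like(im)
    for simplex in tri.simplices:
        A = affine_matrix(src_pts[simplex], dst_pts[simplex])
        dst_tri = dst_pts[simplex]
        rr, cc = polygon(dst_tri[:, 1], dst_tri[:, 0])    # pixels inside the destination triangle
        p_prime = np.vstack([cc, rr, np.ones(len(rr))])   # homogeneous (x, y, 1) columns
        p = np.linalg.inv(A) @ p_prime                    # p = inv(A) * p'
        sx = np.round(p[0]).astype(int)                   # nearest-neighbor sampling;
        sy = np.round(p[1]).astype(int)                   # clipping to the frame is added in Part 4
        warped[rr, cc] = im[sy, sx]
    return warped

# Mid-way face: warp both photos to the average shape, then average the colors.
midway = 0.5 * warp_to_shape(im_me, pts1, mean_pts, tri) \
       + 0.5 * warp_to_shape(im_george, pts2, mean_pts, tri)
```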

I wrote a separate function for this part to help me understand what was happening step by step, but the mid-way face can also be obtained using the morph function I wrote for Part 3 (described subsequently) with warp_frac and dissolve_frac both set to 0.5. The results are below:

Me, George, and the "Mid-way Face"
I was pleased with how "real" (though odd) the mid-way face looked, though there is still a significant amount of ghosting around the face itself; I have a lot of hair where George has none, so the hair couldn't really be captured with keypoints.

Part 3: The Morph Sequence

To morph between 2 images, I wrote a function morph(im1, im2, im1_pts, im2_pts, tri, warp_frac, dissolve_frac) that computes an intermediate morph given warp_frac and dissolve_frac (both 0 at im1 and 1 at im2). I first computed the intermediate shape from warp_frac as (1-warp_frac)*im1_pts + warp_frac*im2_pts. I wrote a helper warp function in which, as in Part 2, I looped over all the triangles in that intermediate shape and computed where, in im1 and im2, the pixels in each triangle "came from." Using the affine transformation matrix A, I created a warped version of im1 and a warped version of im2. Instead of a straight average, the color placed in each pixel was determined by dissolve_frac as (1-dissolve_frac)*[warped im1] + dissolve_frac*[warped im2]. To make a video sequence, I did this for 46 frames as specified in the assignment and used imageio to create an animated gif at 30 frames per second.
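A sketch of the morph routine and gif assembly is below, reusing warp_to_shape from Part 2. The frame count follows the text; the file name is a placeholder, the code assumes float images in [0, 1], and the exact gif arguments vary across imageio versions.

```python
import numpy as np
import imageio

def morph(im1, im2, im1_pts, im2_pts, tri, warp_frac, dissolve_frac):
    """Intermediate morph: warp_frac controls the shape, dissolve_frac controls the color blend."""
    mid_pts = (1 - warp_frac) * im1_pts + warp_frac * im2_pts
    warped1 = warp_to_shape(im1, im1_pts, mid_pts, tri)
    warped2 = warp_to_shape(im2, im2_pts, mid_pts, tri)
    return (1 - dissolve_frac) * warped1 + dissolve_frac * warped2

# 46 frames from t = 0 (me) to t = 1 (George), saved at 30 fps.
frames = []
for t in np.linspace(0, 1, 46):
    frame = morph(im_me, im_george, pts1, pts2, tri, t, t)
    frames.append((np.clip(frame, 0, 1) * 255).astype(np.uint8))  # assumes floats in [0, 1]
imageio.mimsave('morph.gif', frames, fps=30)  # newer imageio versions may want duration= instead
```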

Morph from me to George Clooney

Part 4: The "Mean Face" of a Population

I used the FEI face database that contains grayscale images of 200 individuals. Conveniently, the authors also provided a set with aligned and cropped images and with 46 alignment points each. I chose to use the smiling face (version b) of each individual; I knew that I would be morphing my own picture later, and since my own picture was one of me smiling, I felt that the results would be less weird if I also used the smiling FEI faces to compute the mean face.

After having done the previous parts, calculating the average shape of the population was relatively straightforward: I took the mean of each alignment point across all 200 faces and used those mean points to generate a Delaunay triangulation. I could then use the warp and morph functions I wrote earlier to morph any image into the average shape.
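A sketch of this step, assuming fei_ims holds the 200 images and fei_pts is a (200, N, 2) array of keypoints (the 46 alignment points plus the corners), with warp_to_shape reused from Part 2:

```python
import numpy as np
from scipy.spatial import Delaunay

mean_shape = fei_pts.mean(axis=0)     # average of each keypoint across the population
tri_mean = Delaunay(mean_shape)       # triangulation defined on the mean shape

# Warp any individual (here, the first one) into the average geometry.
warped0 = warp_to_shape(fei_ims[0], fei_pts[0], mean_shape, tri_mean)
```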

Through this process, I went back and modified my warping code to clip any computed triangle vertices and pixel coordinates (either in the intermediate shape or in the original shapes after applying the inverse of A to pixels in the intermediate shape) to the image boundaries -- I realized that a few images were different enough from the mean that the calculated pixel locations fell outside the image boundaries, which would throw indexing errors. This clipping only affected the very edges of a few images in the set of 200.
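In terms of the Part 2 sketch, the modification amounts to clamping both the rasterized destination pixels and the inverse-mapped source coordinates to the image bounds, something like:

```python
# Clip the destination triangle's pixels to the frame...
rr, cc = polygon(dst_tri[:, 1], dst_tri[:, 0], shape=im.shape[:2])
# ...and clamp the inverse-warped source coordinates to valid indices.
sx = np.clip(np.round(p[0]).astype(int), 0, im.shape[1] - 1)
sy = np.clip(np.round(p[1]).astype(int), 0, im.shape[0] - 1)
```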

Below are a few images morphed into the average shape. A lot of the faces in the database seemed to have pretty similar geometry to the average, so the change was fairly subtle.

Individual 1
Individual 67
Individual 128

I next cropped my passport photo and converted it to grayscale to match the dimensions and style of the FEI faces. I manually marked alignment points using the FEI database's scheme. As a guide, I plotted the alignment points on the first FEI database photo with numbered annotations:

FEI database keypoints labeled in order

Finally, I warped my image into the average geometry as well. The somewhat horrifying results are below:

Me warping into the average geometry

By averaging all of the warped images in the dataset, I then calculated the average face:

Average smiling face from FEI database
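The mean face itself is just the pixel-wise average of all the warped images; continuing the sketch from above (same assumed names):

```python
warped_all = [warp_to_shape(im, pts, mean_shape, tri_mean)
              for im, pts in zip(fei_ims, fei_pts)]
mean_face = np.mean(warped_all, axis=0)   # pixel-wise average over the warped population
```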

To warp the average face into my geometry, I used my face's alignment points as the target geometry and the image of the average face as the input image to the warp function. The results:

The average face warping into my geometry

Part 5: Caricatures: Extrapolating from the Mean

From the exercises above, it was pretty clear that my facial features are rather different from those of the "average" face in the FEI database. For example, my eyebrows slant upwards toward the center significantly more, and my eyes slant downwards more. To create a caricature, we can accentuate these features by extrapolating from the population mean. To do this, I computed the delta between my face and the mean face (i.e. mine minus the mean) and added a portion of this delta back to my face. Adding 0.7*delta gave the best results: clearly caricatured without being unnaturally warped. The results are below.
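As a sketch of one way to implement this, assuming the delta is taken over the keypoint coordinates (i.e. a shape extrapolation) and reusing warp_to_shape, the mean shape, and its triangulation from above; my_pts and my_im_gray are my photo's keypoints (FEI scheme plus corners) and the grayscale crop:

```python
alpha = 0.7                                   # extrapolation amount used above
delta = my_pts - mean_shape                   # how my shape differs from the mean shape
caricature_pts = my_pts + alpha * delta       # push my shape further away from the mean
caricature = warp_to_shape(my_im_gray, my_pts, caricature_pts, tri_mean)
```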

Me, the average face, and a caricature of me

Bells and Whistles

I created a morphing music video on a theme -- specifically, a reverse-aging morph of myself using images from when I was 27, 14, 7, and less than 1. To do this, I marked all photos using the convention in the FEI images and computed morphs between each pair of age-adjacent images (i.e. 27 and 14, 14 and 7, and 7 and 0). I then stacked all the resulting intermediate frames together and used the imageio library to create a single gif. The [compressed] gif is below, and the YouTube version with music is here. For this part, I used more intermediate frames per pair to mitigate ghosting, especially since some of the images were significantly different from each other. Music for the video is Happy Panda from TunePocket.com (CC license).
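A sketch of the stitching step, reusing the morph helper from Part 3; the image/keypoint variable names, the per-pair frame count (60), and the file name are my own placeholders:

```python
import numpy as np
import imageio
from scipy.spatial import Delaunay

pairs = [(im_27, pts_27, im_14, pts_14),
         (im_14, pts_14, im_7,  pts_7),
         (im_7,  pts_7,  im_0,  pts_0)]

frames = []
for im_a, pts_a, im_b, pts_b in pairs:
    tri_pair = Delaunay((pts_a + pts_b) / 2.0)           # triangulate on the pair's average shape
    for t in np.linspace(0, 1, 60):                      # placeholder frame count per pair
        frame = morph(im_a, im_b, pts_a, pts_b, tri_pair, t, t)
        frames.append((np.clip(frame, 0, 1) * 255).astype(np.uint8))

imageio.mimsave('reverse_aging.gif', frames, fps=30)     # duration= may be needed on newer imageio
```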

Reverse aging