Project 3 - Face Morphing

Overview

In this project, we explore how to morph two images together. The main challenge is that when two images differ in shape, a plain cross dissolve can produce strange results that we would not feel comfortable classifying as a real human face. This can happen even if the two images are of the same person! Any misalignment between the images makes the cross dissolve look messy.

We get around this by annotating our images with a set of keypoint correspondences. If we abstract a human face as a set of keypoints in 2D space, we can ask whether it makes sense to treat these keypoints as vectors. Given two corresponding keypoints from different images, we can interpolate between them to get another keypoint that lies between the two. Applying the same interpolation to all corresponding keypoints yields a set of keypoints that can be interpreted as the keypoints of some midway human face! Under this view, the space of face keypoints behaves like a convex set. If we add the notion of extrapolation (as we did in class), we can extend these convex combinations to affine combinations and interpret the space of human face keypoints as an affine space. This is very helpful, as it gives us a notion of a "midway" face shape between two people. We will see how this technique helps us morph faces in the section on triangulation.
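The keypoint interpolation described above can be sketched in a few lines. This is only an illustration, not the project's actual code: the function name `interpolate_keypoints` and the sample coordinates are made up for the example, and keypoints are assumed to be stored as (x, y) tuples in corresponding order.

```python
def interpolate_keypoints(points_a, points_b, t):
    """Blend two corresponding keypoint lists, with weight t on points_b.

    t in [0, 1] gives a convex combination (a shape between the two);
    t outside that range extrapolates (an affine combination).
    """
    if len(points_a) != len(points_b):
        raise ValueError("keypoint lists must have the same length")
    return [
        ((1 - t) * ax + t * bx, (1 - t) * ay + t * by)
        for (ax, ay), (bx, by) in zip(points_a, points_b)
    ]


# Hypothetical keypoints for two faces (say, two eye corners and a chin).
face_a = [(100.0, 120.0), (180.0, 120.0), (140.0, 220.0)]
face_b = [(110.0, 130.0), (200.0, 126.0), (150.0, 240.0)]

# t = 0.5 gives the "midway" shape between the two faces.
midway = interpolate_keypoints(face_a, face_b, 0.5)
print(midway)
```

With t = 0 the function returns the first shape unchanged, and with t = 1 it returns the second.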

Keypoints and Triangulation

The first step was to write code that can mark keypoints on two images. These points are then used for triangulation, which will help us morph an image from one shape to another. We will discuss this in more detail in the following section on affine transformations.

Affine Transformations

Now, let us discuss the bread and butter of this project: affine transformations. Our goal was to code an affine transformation that maps any given triangle to any other target triangle. Once we have this transformation, we can also apply it to the pixels inside the triangle, coloring in the new triangle with the same material. Thus, by triangulating our images and remembering which triangles correspond to each other, we can transform between two shapes by performing the transformation on each triangle. Applying this technique to the triangulated images above allows us to morph both images to a halfway shape, so that they now have the same shape. Once our two images have the same shape, we can cross dissolve them to end up with another human face. Here is the result of morphing both images to the halfway shape, and then averaging the pixel values between the two images:
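One way to realize the triangle-to-triangle map is through barycentric coordinates: a point keeps the same weights relative to the three vertices before and after the affine transformation. The sketch below assumes that formulation; it is not necessarily how the project computed the transform, and the names `barycentric` and `warp_point` are hypothetical.

```python
def barycentric(p, tri):
    """Return weights (w0, w1, w2) such that p = w0*v0 + w1*v1 + w2*v2."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    # Twice the signed area of the triangle; zero means it is degenerate.
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if det == 0:
        raise ValueError("degenerate triangle")
    px, py = p
    w1 = ((px - x0) * (y2 - y0) - (x2 - x0) * (py - y0)) / det
    w2 = ((x1 - x0) * (py - y0) - (px - x0) * (y1 - y0)) / det
    return 1 - w1 - w2, w1, w2


def warp_point(p, src_tri, dst_tri):
    """Apply the affine map that sends src_tri onto dst_tri to the point p."""
    w0, w1, w2 = barycentric(p, src_tri)
    (x0, y0), (x1, y1), (x2, y2) = dst_tri
    return (w0 * x0 + w1 * x1 + w2 * x2, w0 * y0 + w1 * y1 + w2 * y2)


# Made-up triangles for illustration.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [(10.0, 10.0), (30.0, 10.0), (10.0, 50.0)]

print(warp_point((0.25, 0.25), src, dst))  # a point inside the triangle
print(warp_point((1.0, 0.0), src, dst))    # a vertex maps to its counterpart
```

A point lies inside the source triangle exactly when all three weights are non-negative, which gives a simple test for which pixels belong to a triangle. A common variant swaps the roles of the two triangles and maps each pixel of the target triangle back into the source image to sample its color, which avoids leaving holes in the output.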

Morph Sequence

To produce the morph sequence, we begin with one image and smoothly transition to the other by slowly changing the weights in our affine combination. Specifically, we generate n frames and increase the weight of the combination by about 1/n on each frame (exactly 1/(n-1) if both endpoints are to be included), so that the weight is 0 on the first frame and 1 on the last frame. The larger we make n, the smoother the transition becomes. Here is the result of using n=60 to morph between the two faces above:
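The frame weights can be sketched as follows. This assumes both endpoints are included among the n frames, so the step is 1/(n-1), which is close to 1/n for large n. The helper names `morph_weights` and `cross_dissolve` are hypothetical, and the cross dissolve is shown on a single RGB pixel rather than a full image.

```python
def morph_weights(n):
    """Return n weights evenly spaced from 0.0 to 1.0 inclusive."""
    if n < 2:
        raise ValueError("need at least two frames")
    return [i / (n - 1) for i in range(n)]


def cross_dissolve(pixel_a, pixel_b, t):
    """Blend two pixels channel by channel, with weight t on pixel_b."""
    return tuple((1 - t) * a + t * b for a, b in zip(pixel_a, pixel_b))


weights = morph_weights(60)
print(weights[0], weights[-1])  # first and last frame weights

# For each frame, the same weight t would be used twice: once to
# interpolate the keypoints (the shape) and once to blend the colors.
black, white = (0, 0, 0), (255, 255, 255)
print(cross_dissolve(black, white, 0.5))
```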

Mean Face

We can also apply these techniques to get the mean face of a population. To do this, I used the following dataset: images. These images came pre-annotated, so I simply had to compute the average shape from the keypoints and then morph every image to that common shape. I could then average the morphed faces to get the average face of the dataset. Here are some of the faces from the dataset morphed into the average shape:
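Computing the average shape amounts to averaging each keypoint across all annotated faces. A minimal sketch, assuming every face has the same number of keypoints listed in the same order (the function name `mean_shape` and the toy data are made up for the example):

```python
def mean_shape(shapes):
    """Average corresponding (x, y) keypoints across a list of shapes."""
    if not shapes:
        raise ValueError("need at least one shape")
    n_points = len(shapes[0])
    if any(len(shape) != n_points for shape in shapes):
        raise ValueError("all shapes must have the same number of keypoints")
    count = len(shapes)
    return [
        (
            sum(shape[i][0] for shape in shapes) / count,
            sum(shape[i][1] for shape in shapes) / count,
        )
        for i in range(n_points)
    ]


# Three toy "faces" with two keypoints each.
shapes = [
    [(0.0, 0.0), (10.0, 0.0)],
    [(2.0, 4.0), (14.0, 2.0)],
    [(4.0, 2.0), (12.0, 4.0)],
]
average = mean_shape(shapes)
print(average)
```

Each face would then be warped to this average shape before the pixel values are averaged.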

And here is the average face:

I also went ahead and morphed my own face into the shape of the average image from the above dataset, and vice versa. I marked my own keypoints on the average face to do this, so things got a bit messy on the borders. Here are the results:

Caricatures

Up until this point, we have only used convex combinations of face shapes, working under the assumption that face shapes live in a convex set. However, if we extend this idea to an affine space, we introduce the concept of extrapolation. This lets us look at face shapes beyond those that we are given by allowing weights outside the range [0, 1] (while still summing to 1), which turns the convex combination into an affine combination. Here is the result of using a negative coefficient for the average face from the dataset, and a coefficient greater than 1 for my face shape:
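The extrapolation can be written as mean + alpha * (shape - mean), which is the same as (1 - alpha) * mean + alpha * shape. With alpha greater than 1, the coefficient on the mean is negative and the coefficient on the individual shape exceeds 1. The sketch below is illustrative only: `caricature`, the value alpha = 1.5, and the coordinates are assumptions, not the values used for the result shown here.

```python
def caricature(shape, mean, alpha):
    """Exaggerate how shape differs from mean.

    alpha = 1 returns shape unchanged, alpha = 0 returns the mean,
    and alpha > 1 pushes shape further away from the mean.
    """
    if len(shape) != len(mean):
        raise ValueError("shape and mean must have the same length")
    return [
        (mx + alpha * (sx - mx), my + alpha * (sy - my))
        for (sx, sy), (mx, my) in zip(shape, mean)
    ]


# Hypothetical keypoints for the dataset mean and one individual face.
mean_face = [(100.0, 100.0), (200.0, 100.0)]
my_face = [(90.0, 110.0), (220.0, 100.0)]

# alpha = 1.5 puts weight -0.5 on the mean and 1.5 on the individual.
exaggerated = caricature(my_face, mean_face, 1.5)
print(exaggerated)
```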

Bells and Whistles

Finally, as a "Bells and Whistles" for my project, I made a music video showing how my face has evolved since 2004. This took quite a while to make, as I had to find all the pictures I wanted to use, make them the same size, label keypoints, and generate a very long morph sequence. However, it was fun to revisit some of these old photos. You can view the video at this link: timelapse