Face Morphing, and Other Face Effects!
In this project, I explore several ways to transform faces: morphing one face into an arbitrary other face, using face morphing to build a clear representation of the average face in a subpopulation, building caricatures by exaggerating "unique" facial characteristics, and creating a "childhood photo" effect that makes faces look younger.
Jazz Singh // October 2020
1. Face Morphing
My goal for this section is to build a video where one face is seamlessly transformed into another, both in terms of structure and appearance.
An overview of the methodology (more detail is in the following sub-sections):
- align and define corresponding keypoints on the source and target images;
- triangulate both images;
- morph the shape by warping the source image to the target image (with inverse affine transforms per triangle, and bilinear interpolation);
- morph the appearance by taking a weighted average of the source and target images (cross-dissolve);
- and finally, stitch the frames together into a video.
Though this method is applicable to arbitrary faces, as an example I used an image of my own face as the source and an image of Hasan Minhaj's face as the target. Here are the original images, for reference:
1.1. Image Alignment and Defining Correspondences
I aligned the image centers, rescaled the images based on the distance between the eyes, rotated the images so they're both at the same angle, and aligned image sizes, all with the aim of ensuring that I can define good correspondences between the images.
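The alignment described above can be sketched as a single similarity transform (scale, rotation, translation) derived from the two eye centers. This is a minimal illustration and not the code used in the project: the function name `eye_alignment_matrix` and the choice of matching the eye midpoints are assumptions made for the example.

```python
import numpy as np

def eye_alignment_matrix(eyes_src, eyes_ref):
    """Return a 3x3 similarity transform that maps eyes_src onto eyes_ref.

    Both inputs are (2, 2) arrays of (x, y) eye centers: [left_eye, right_eye].
    (Hypothetical helper, written for illustration only.)
    """
    eyes_src = np.asarray(eyes_src, dtype=float)
    eyes_ref = np.asarray(eyes_ref, dtype=float)

    # Vector from the left eye to the right eye in each image.
    v_src = eyes_src[1] - eyes_src[0]
    v_ref = eyes_ref[1] - eyes_ref[0]

    # Scale so the inter-eye distances match; rotate so the eye lines are parallel.
    scale = np.linalg.norm(v_ref) / np.linalg.norm(v_src)
    angle = np.arctan2(v_ref[1], v_ref[0]) - np.arctan2(v_src[1], v_src[0])
    c, s = scale * np.cos(angle), scale * np.sin(angle)
    rot = np.array([[c, -s], [s, c]])

    # Translate so the midpoints between the eyes coincide.
    shift = eyes_ref.mean(axis=0) - rot @ eyes_src.mean(axis=0)

    M = np.eye(3)
    M[:2, :2] = rot
    M[:2, 2] = shift
    return M
```

The resulting matrix can be applied to keypoints in homogeneous coordinates, or handed to an image-warping routine to resample the image itself.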
I wrote a small script to label each face with corresponding keypoints at strategic locations, using matplotlib's ginput function.
Below are the aligned, labeled images:
1.2. Triangulation
To be able to assume that affine transformations exist between corresponding sub-structures in the two images, I needed to define corresponding triangular meshes. For this, I used the Delaunay triangulation (see Wikipedia and scipy.spatial.Delaunay), which is the dual of the Voronoi diagram of a point set. I chose it because it tends to build triangles that aren't too skinny.
I computed a triangulation on the mean shape between the source image's keypoints and target image's keypoints. Note that before computing this triangulation, I added 4 "dummy" points -- one at each of the corners of the images -- so that the background too morphs somewhat even though it can't be morphed as well as the fully annotated faces.
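As a rough sketch of this step, the snippet below appends the four image corners to both keypoint sets, averages them, and triangulates the mean shape with `scipy.spatial.Delaunay`. The function name `midway_triangulation` and the (x, y) point convention are assumptions for illustration; the original script may be organized differently.

```python
import numpy as np
from scipy.spatial import Delaunay

def midway_triangulation(src_pts, tgt_pts, height, width):
    """Triangulate the mean of two keypoint sets, with image corners added.

    src_pts, tgt_pts: (K, 2) arrays of corresponding (x, y) keypoints.
    Returns the augmented source points, augmented target points, and an
    (n_triangles, 3) array of vertex indices valid for both point sets.
    """
    # "Dummy" corner points so the background is covered by triangles too.
    corners = np.array(
        [[0, 0], [width - 1, 0], [0, height - 1], [width - 1, height - 1]],
        dtype=float,
    )
    src_all = np.vstack([np.asarray(src_pts, dtype=float), corners])
    tgt_all = np.vstack([np.asarray(tgt_pts, dtype=float), corners])

    # Triangulate the mean shape once; the same index triples are reused on
    # the source and target points, so the meshes stay in correspondence.
    mean_pts = 0.5 * (src_all + tgt_all)
    triangles = Delaunay(mean_pts).simplices
    return src_all, tgt_all, triangles
```

Triangulating the mean shape, rather than either face alone, keeps the mesh reasonable for both images at once.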
Below is this mid-way triangulation applied to the source image's keypoints, and the same triangulation applied to the target keypoints:
1.3. Computing the "Mid-Way Face"
The code for this section is the foundation for the next one, which computes the remaining frames between the source, mid-way, and target faces.
First, I implemented a function that returns a source image warped to an aligned target image, given the triangulation and the images' keypoints. For each triangle in the triangulation, I did the following...
- compute the affine transformation matrix T from the target triangle to the source triangle;
- retrieve the locations of all pixels within the target triangle (see skimage.draw.polygon);
- apply T to these pixels in the target triangle, to get the estimated corresponding locations in the source triangle;
- compute the color at each estimated source location from the nearby source pixels (bilinear interpolation), since these locations generally fall between pixel centers;
- set the target triangle to the interpolated source colors.
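The interpolation step in the list above can be sketched with plain NumPy. This is a minimal illustration under the assumption that images are indexed as `image[row, col]`; the helper name `bilinear_sample` is hypothetical and coordinates outside the image are simply clamped to the border.

```python
import numpy as np

def bilinear_sample(image, rows, cols):
    """Sample an (H, W) or (H, W, C) image at fractional (rows, cols) locations."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape[:2]

    # Clamp the requested locations to the valid image area.
    rows = np.clip(np.asarray(rows, dtype=float), 0, h - 1)
    cols = np.clip(np.asarray(cols, dtype=float), 0, w - 1)

    # Integer corners of the pixel cell around each location.
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, h - 1)
    c1 = np.minimum(c0 + 1, w - 1)

    # Fractional offsets inside the cell.
    fr = rows - r0
    fc = cols - c0
    if image.ndim == 3:
        # Add a trailing axis so the weights broadcast over color channels.
        fr = fr[..., None]
        fc = fc[..., None]

    # Blend horizontally along the top and bottom edges, then vertically.
    top = (1 - fc) * image[r0, c0] + fc * image[r0, c1]
    bottom = (1 - fc) * image[r1, c0] + fc * image[r1, c1]
    return (1 - fr) * top + fr * bottom
```

In a per-triangle warp, `rows` and `cols` would be the source locations obtained by applying T to the target triangle's pixel coordinates.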
A little more detail on computing T: since I had 3 pairs of corresponding points that define the corners of each triangle, I had enough information to solve a system of linear equations for the 6 parameters that govern an affine transformation. I computed the transformation from the target triangle to the source triangle (an inverse warp), rather than from the source to the target, so that every pixel in the target triangle receives a value and no holes are left in the output.
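One way to set up that linear system is shown below, as a sketch rather than the project's actual code. It assumes points are given as (x, y) pairs and uses homogeneous coordinates; the name `affine_from_triangles` is hypothetical.

```python
import numpy as np

def affine_from_triangles(tri_from, tri_to):
    """Return the 3x3 affine matrix T with T @ [x, y, 1] mapping tri_from to tri_to.

    tri_from, tri_to: (3, 2) arrays holding the three corners of each triangle.
    """
    tri_from = np.asarray(tri_from, dtype=float)
    tri_to = np.asarray(tri_to, dtype=float)

    # Each row of A is a corner in homogeneous form [x, y, 1].
    A = np.hstack([tri_from, np.ones((3, 1))])

    # Solve A @ X = tri_to for the (3, 2) matrix X, i.e. the 6 affine parameters.
    X = np.linalg.solve(A, tri_to)

    # Pack the parameters into the top two rows of a homogeneous 3x3 matrix.
    T = np.eye(3)
    T[:2, :] = X.T
    return T
```

For the inverse warp described above, `tri_from` would be the target triangle and `tri_to` the source triangle. The solve fails if the three corners are collinear, which a valid triangulation should not produce.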
To compute the mid-way face, I first warped the shape of both the source and target images to the average of their keypoints; then, I averaged these warped images. This resulted in an image with mid-way shape and mid-way appearance.
Below are the aligned images one more time, each of these images warped to the mid-way shape, and finally the mid-way face result:
1.4. The Morph Sequence
To create each of the 45 frames, I gradually adjusted the weights used for warping and cross-dissolve. For a given frame, I warp the source and target to (1 - alpha) * source_points + alpha * target_points, and I cross-dissolve the appearance by (1 - beta) * source_image + beta * target_image. For a smooth morph, alpha = beta.
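The per-frame weighting can be sketched as follows. This assumes alpha runs linearly from 0 to 1 over the frames, which the write-up does not state explicitly, and the function names are hypothetical. The warping itself is left out; `src_warped` and `tgt_warped` stand for the two images already warped to the frame's intermediate shape.

```python
import numpy as np

def morph_schedule(src_pts, tgt_pts, num_frames=45):
    """Yield (alpha, intermediate_points) for each frame of the morph."""
    src_pts = np.asarray(src_pts, dtype=float)
    tgt_pts = np.asarray(tgt_pts, dtype=float)
    # Assumption: alpha is spaced linearly, including both endpoints.
    for alpha in np.linspace(0.0, 1.0, num_frames):
        yield alpha, (1 - alpha) * src_pts + alpha * tgt_pts

def cross_dissolve(src_warped, tgt_warped, beta):
    """Blend two already-warped images with weight beta on the target."""
    src_warped = np.asarray(src_warped, dtype=float)
    tgt_warped = np.asarray(tgt_warped, dtype=float)
    return (1 - beta) * src_warped + beta * tgt_warped
```

Using the same value for alpha and beta in each frame makes shape and appearance change at the same pace.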
I combined the resulting list of images into a video (with frame rate 30fps) using OpenCV's VideoWriter. I also cropped out some of the video as post-processing to lessen distractions from warping the background (non-facial) parts of the image.
Below is a gif with the face morphing result!
2. "Mean Face" of a Population
The goal of this section is to construct a really clear representation of the average face in a population, without the blurring effects that stem from taking a naive average across many different faces.
I used the IMM Face Database. Some sample smiling faces from this dataset:
I wanted to compute the average smiling face from this dataset. I used the following methodology to build a clear, representative average:
- pick an image, center it, and use this as the reference image;
- align all images and their keypoint annotations to the reference image (see 1.1);
- take the average of all the keypoints;
- compute a triangulation from this average;
- use the triangulation, annotations, and images to warp each image to the average (see 1.2-1.3);
- and finally, take a simple average over all these warped faces to average out the appearance.
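The two averaging steps in the list above reduce to means over a stacked array. The sketch below assumes every face has the same number of keypoints in the same order and that all warped images share one size; the function names are hypothetical, and the alignment and warping steps are not shown.

```python
import numpy as np

def mean_shape(keypoint_sets):
    """Average N keypoint sets, each shaped (K, 2), into a single (K, 2) shape."""
    return np.mean(np.asarray(keypoint_sets, dtype=float), axis=0)

def mean_appearance(warped_images):
    """Average N images that have already been warped to the mean shape."""
    return np.mean(np.asarray(warped_images, dtype=float), axis=0)
```

Because every face is warped to the same shape before averaging, features such as eyes and mouths line up, which is what avoids the blur of a naive average.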
Here is the result of this process -- the average smiling face in the IMM Face Database!
Below are a few interesting before-and-after intermediate results of warping faces (both smiling and neutral) to the average smiling face shape, without appearance averaging. The unsmiling faces end up smiling, and the apparent gaze direction sometimes shifts towards the camera. Some results exaggerate smiles to the point of being comical, but I've chosen to include the coolest examples here.
I also warped my face to the average smiling face, and the average face to my face's shape, to see what would happen:
3. Caricatures
The goal of this section is to build an image that exaggerates "unique" facial characteristics.
To do so, I extrapolated from the subpopulation mean calculated in Section 2. More specifically, instead of warping my image to the average of my face's shape and the subpopulation mean shape, I warped my face to my_points + k * (my_points - mean_points), that is, my face's shape plus the difference vector pointing from the subpopulation mean shape to my face's shape, weighted by a hyperparameter k. In other words, I accentuated the shape differences between my face and the subpopulation mean. (Note that I did not apply a similar process to appearance, because it seemed to result in a "brownface" effect.)
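The shape extrapolation can be written in one line. The sketch below is illustrative only: the function name `caricature_shape` and the default `strength` value are assumptions, and the value of the hyperparameter used for the result below is not stated in this write-up.

```python
import numpy as np

def caricature_shape(face_pts, mean_pts, strength=0.5):
    """Push a face's keypoints away from the mean shape.

    face_pts, mean_pts: (K, 2) arrays of corresponding keypoints.
    strength: weight on the (face - mean) difference vector; 0 leaves the
    face unchanged and larger values exaggerate it more.
    """
    face_pts = np.asarray(face_pts, dtype=float)
    mean_pts = np.asarray(mean_pts, dtype=float)
    return face_pts + strength * (face_pts - mean_pts)
```

A negative `strength` moves the face towards the mean instead, with -1 reproducing the mean shape exactly.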
For detail on the warping steps, see 1.2-1.3.
Below is the result of caricaturizing my face:
4. Childhood Photo Effect
The goal of this section is to alter the structure and appearance of a face to make it seem more childlike.
To accomplish this, I picked an image of an average baby and an image of a friend, labeled and aligned both, and morphed my friend's image to the baby image (both warping and cross-dissolve, see sections 1.2-1.3). I played around with warping and cross-dissolve hyperparameters until arriving at the result.
Below is the result, along with some intermediate stages: