A morph is a simultaneous warp of the image shape and a cross-dissolve of the image colors. The cross-dissolve is the easy part; controlling and doing the warp is the hard part. The warp is controlled by defining a correspondence between the two pictures. The correspondence should map eyes to eyes, mouth to mouth, chin to chin, ears to ears, etc., to get the smoothest transformations possible.
In this assignment I will produce a "morph" animation of my face into someone else's face, compute the mean of a population of faces, and extrapolate from that population mean to create a caricature of myself.
First, we will need to define pairs of corresponding points on the two images by hand (the more points, the better the morph, generally). In order for the morph to work we will need a consistent labeling of the two faces. We will label faces A and B in a consistent manner using the same ordering of keypoints in the two faces. Once we have the points on the two faces, we'll need to provide a triangulation of these points that will be used for morphing. I chose to use the Delaunay triangulation on the average point set of the two images since it does not produce overly skinny triangles.
For these two images, I manually selected 98 points in each image (plus 4 corners added automatically, for 102 points in total), spread across different facial features and accessories (glasses), to set us up for a good morph. Once I had the points, I used the scipy library's Delaunay function to compute a triangulation of the mean point set of the two images, which is shown above.
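The triangulation step described above can be sketched as follows. This is a minimal illustration rather than the code used for the results: the helper names `add_corners` and `triangulate_mean_shape` are hypothetical, and keypoints are assumed to be stored as `(N, 2)` arrays of `(x, y)` coordinates in the same order for both faces.

```python
import numpy as np
from scipy.spatial import Delaunay


def add_corners(pts, width, height):
    """Append the four image corners to an (N, 2) array of (x, y) keypoints.

    Hypothetical helper: the corner order is an arbitrary choice, but it must
    be the same for both faces so the labeling stays consistent.
    """
    corners = np.array(
        [[0, 0], [width - 1, 0], [0, height - 1], [width - 1, height - 1]],
        dtype=float,
    )
    return np.vstack([np.asarray(pts, dtype=float), corners])


def triangulate_mean_shape(pts_a, pts_b):
    """Triangulate the average of two corresponding keypoint sets.

    Returns the mean points and an (M, 3) array of vertex indices. The same
    index triples are then used to address triangles in face A, face B, and
    any intermediate shape.
    """
    pts_a = np.asarray(pts_a, dtype=float)
    pts_b = np.asarray(pts_b, dtype=float)
    if pts_a.shape != pts_b.shape:
        raise ValueError("both faces need the same number of keypoints")
    mean_pts = (pts_a + pts_b) / 2.0
    return mean_pts, Delaunay(mean_pts).simplices
```

Because the triangulation is stored as indices into the point list, it only needs to be computed once and can be reused for every frame of the morph.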
David Huang (Mid-Way Face)
Zixun Huang (Face A)
David Bowie (Face B)
Before I started face morphing, I tested warping between two simple triangles (left) to confirm that the functions I had implemented (computing the affine transform and applying the affine warp) work properly.
Here are the results:
mask, warped image (warp_frac = 1), mask, warped image (warp_frac = 0.5), warped image (warp_frac = 0.5, dissolve_frac = 0.5)
Before we compute the whole morph sequence, let's compute the mid-way face of these two images. This involves the following:
1. Computing the average shape (i.e. the average of each keypoint location in the two faces)
2. Warping both faces into that shape
3. Averaging the colors together.
The main task in warping the faces into the average shape is implementing an affine warp for each triangle in the triangulation, from the original images into this new shape. We iterated through all the triangle pairs and computed an affine transformation matrix between each pair. We then used this matrix, along with some simple interpolation techniques, to move the pixels between triangles, implemented as an inverse warp of all the pixels.
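A minimal sketch of the per-triangle inverse warp is given below. The function names (`compute_affine`, `points_in_triangle`, `warp_image`) are hypothetical, and two details are assumptions on my part: pixels are tested against each triangle with barycentric coordinates, and source pixels are sampled with nearest-neighbour lookup in place of whichever interpolation was actually used.

```python
import numpy as np


def compute_affine(src_tri, dst_tri):
    """3x3 matrix A such that A @ [x, y, 1] maps src_tri vertices to dst_tri."""
    src = np.vstack([np.asarray(src_tri, dtype=float).T, np.ones(3)])
    dst = np.vstack([np.asarray(dst_tri, dtype=float).T, np.ones(3)])
    # Columns are homogeneous vertices, so A @ src = dst. This raises
    # LinAlgError for a degenerate (zero-area) source triangle.
    return dst @ np.linalg.inv(src)


def points_in_triangle(tri, height, width):
    """Integer (x, y) pixels inside tri, as a 2 x K array, clipped to the image."""
    tri = np.asarray(tri, dtype=float)
    x0 = max(int(np.floor(tri[:, 0].min())), 0)
    x1 = min(int(np.ceil(tri[:, 0].max())), width - 1)
    y0 = max(int(np.floor(tri[:, 1].min())), 0)
    y1 = min(int(np.ceil(tri[:, 1].max())), height - 1)
    xs, ys = np.meshgrid(np.arange(x0, x1 + 1), np.arange(y0, y1 + 1))
    if xs.size == 0:
        return np.zeros((2, 0), dtype=int)
    pix = np.vstack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    # Barycentric coordinates: a pixel is inside when all three are >= 0.
    bary = np.linalg.solve(np.vstack([tri.T, np.ones(3)]), pix)
    inside = np.all(bary >= -1e-9, axis=0)
    return pix[:2, inside].astype(int)


def warp_image(im, src_pts, dst_pts, simplices):
    """Inverse-warp im so that keypoints src_pts land on dst_pts."""
    src_pts = np.asarray(src_pts, dtype=float)
    dst_pts = np.asarray(dst_pts, dtype=float)
    h, w = im.shape[:2]
    out = np.zeros_like(im)
    for simplex in simplices:
        dst_tri = dst_pts[simplex]
        dst_xy = points_in_triangle(dst_tri, h, w)
        if dst_xy.shape[1] == 0:
            continue
        # Map destination pixels back into the source triangle.
        back = compute_affine(dst_tri, src_pts[simplex])
        src_xy = back @ np.vstack([dst_xy, np.ones(dst_xy.shape[1])])
        sx = np.clip(np.rint(src_xy[0]).astype(int), 0, w - 1)
        sy = np.clip(np.rint(src_xy[1]).astype(int), 0, h - 1)
        out[dst_xy[1], dst_xy[0]] = im[sy, sx]
    return out
```

Looping over destination pixels and looking up their source location, rather than pushing source pixels forward, is what keeps the output free of holes.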
Using this technique, we can also change the apparent age and gender of a face.
In the first row, I compute the mid-way face of myself and an older person. In the second row, I compute the mid-way face of myself and the average face of a woman.
For these two experiments, I tried each of the three calculations.
1. warp_frac = 0.5; dissolve_frac = 0.5 (standard mid-way face)
2. warp_frac = 0; dissolve_frac = 0.6 (warp the target face into my face, and then dissolve)
3. warp_frac = 0.6; dissolve_frac = 0 (only warp the shape of my face into another geometry)
Then I implemented a function to generate a morph sequence:
morphed_im = morph(im1, im2, im1_pts, im2_pts, tri, warp_frac, dissolve_frac);
that produces a warp between im1 and im2. The parameters warp_frac and dissolve_frac control shape warping and cross-dissolve, respectively. In particular, images im1 and im2 are first warped into an intermediate shape configuration controlled by warp_frac, and then cross-dissolved according to dissolve_frac.
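The two-stage structure described above (warp both images to an intermediate shape, then cross-dissolve) can be sketched as below. This is an illustration under assumptions: the extra `warp_fn` argument is hypothetical and stands for any triangle-based warp with the signature `warp_fn(im, src_pts, dst_pts, tri)`, and the convention that a fraction of 0 means "entirely im1" is inferred from the experiments listed earlier. `morph_sequence` and its `n_frames` default are also made up for the example.

```python
import numpy as np


def morph(im1, im2, im1_pts, im2_pts, tri, warp_frac, dissolve_frac, warp_fn):
    """Blend im1 into im2: shape by warp_frac, color by dissolve_frac."""
    im1_pts = np.asarray(im1_pts, dtype=float)
    im2_pts = np.asarray(im2_pts, dtype=float)
    # Intermediate shape: warp_frac = 0 keeps im1's geometry, 1 gives im2's.
    mid_pts = (1.0 - warp_frac) * im1_pts + warp_frac * im2_pts
    warped1 = warp_fn(im1, im1_pts, mid_pts, tri).astype(float)
    warped2 = warp_fn(im2, im2_pts, mid_pts, tri).astype(float)
    # Cross-dissolve: dissolve_frac = 0 keeps im1's colors, 1 gives im2's.
    return (1.0 - dissolve_frac) * warped1 + dissolve_frac * warped2


def morph_sequence(im1, im2, im1_pts, im2_pts, tri, warp_fn, n_frames=45):
    """Frames in which both fractions move linearly from 0 to 1 together."""
    return [
        morph(im1, im2, im1_pts, im2_pts, tri, t, t, warp_fn)
        for t in np.linspace(0.0, 1.0, n_frames)
    ]
```

Keeping the two fractions as separate parameters is what allows the shape-only and color-only experiments: fixing one at 0 isolates the effect of the other.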
In my first attempt at morphing, I found some problems:
1. Even though Delaunay triangulation avoids overly skinny triangles, if the points are densely placed and their relative positions change too much between the two faces, the result is still bad geometry.
2. The choice of images A and B is very important. If the images do not match at key visual focal points, for example if the eyes or mouth are open in one image and closed in the other, the dissolve looks weird.
3. Differences in hair and body cause some artifacts. I hoped to make these less noticeable by adjusting the delay time of each frame.
To address these problems, I did the following:
1. I annotated the pictures so that the points sit in similar relative positions. For example, in the picture of David Bowie several points are placed away from him, because points that are too close together produce bad geometry or even negative triangles.
2. I chose another picture of David Bowie with his eyes open and mouth closed, like mine.
3. The artifacts are still there because our bodies and hair are distinctly different, so I adjusted the delay of each frame so that the strange moments pass quickly.
Here, I pick photos of myself at different ages and make a movie that shows how my face changed over time.
Here, I team up with other students in the class, and we organize ourselves into one big chain of students.
I analyzed a free dataset of Danish faces meant to be used in a statistical model of shape for population facial features analysis.
We have 40 people in this dataset and we can start by finding the average face of this population. Calculating the average face starts by finding the average face shape of the population, converting all the faces to the average shape, and then blending all the transformed faces together with equal weights.
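The three steps above (average the shapes, warp every face to that shape, blend with equal weights) can be sketched as follows. As before, this is only an illustration: `average_shape` and `average_face` are hypothetical names, `warp_fn` stands for any triangle-based warp with the signature `warp_fn(im, src_pts, dst_pts, simplices)`, and all images are assumed to share the same size and keypoint ordering.

```python
import numpy as np


def average_shape(all_pts):
    """Mean keypoint locations for a (P, N, 2) stack of P annotated faces."""
    return np.mean(np.asarray(all_pts, dtype=float), axis=0)


def average_face(images, all_pts, simplices, warp_fn):
    """Warp every face into the mean shape, then blend with equal weights.

    Returns the averaged image (float) and the mean shape it was built on.
    """
    mean_pts = average_shape(all_pts)
    warped = [
        warp_fn(im, np.asarray(pts, dtype=float), mean_pts, simplices).astype(float)
        for im, pts in zip(images, all_pts)
    ]
    # An unweighted mean over the stack is the equal-weight blend.
    return np.mean(warped, axis=0), mean_pts
```

The triangulation passed in as `simplices` would typically be computed once on the mean shape so that every face is split into the same triangles.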
Here is the original dataset: annotated faces.
40 faces Warped into Avg. Dane's geometry 👇
Average Dane's Face 👈
My face Warped into Avg. Dane's
Avg. Dane's face Warped into mine
Mid-Way Face of mine and Dane's
Now we make caricatures of my face by extrapolating from the population mean that we calculated in the last section. We do this by finding the difference between the average face features & my face features and then adding it back to my face after scaling it.
Alignment is very important. Without it, we only see the face grow and shrink instead of seeing individual features exaggerated, such as the face becoming more square and sharp or the corners of the eyes lifting.
caricature keypoints = average keypoints + alpha * (my keypoints - average keypoints)
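The extrapolation formula above translates almost directly into code. A minimal sketch, assuming both keypoint sets are already aligned `(N, 2)` arrays in the same order; the function name `caricature_points` is hypothetical.

```python
import numpy as np


def caricature_points(my_pts, mean_pts, alpha):
    """Extrapolate my keypoints away from (or toward) the population mean.

    alpha = 0 gives the mean shape, alpha = 1 gives my own shape,
    alpha > 1 exaggerates my differences from the mean, and
    alpha < 0 pushes the shape to the opposite side of the mean.
    """
    my_pts = np.asarray(my_pts, dtype=float)
    mean_pts = np.asarray(mean_pts, dtype=float)
    return mean_pts + alpha * (my_pts - mean_pts)
```

The caricature image is then obtained by warping my face from my own keypoints to the extrapolated ones.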
Here are the results:
alpha: -0.5👉-0.2👉0.2👉0.7👉1.0👉1.2👉1.5👉1.7