CS194-26 Project 5: Facial Keypoint Detection with Neural Networks

Matthew Tang

Part 1: Nose Detection

The first part involved predicting the nose keypoint. First, I visualized some of the ground truth keypoints.

I built a small network with 3 convolutional layers and 2 fully connected layers and trained it for 10 epochs. This is the graph of training and validation loss.
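For concreteness, here is a minimal PyTorch sketch of a network with that shape (3 conv layers followed by 2 fully connected layers, regressing one (x, y) point). The class name `NoseNet`, the 60x80 grayscale input, the channel counts, the kernel sizes, and the hidden width are all assumptions chosen for illustration. They are not the actual values used in this project.

```python
import torch
import torch.nn as nn


class NoseNet(nn.Module):
    """Hypothetical 3-conv + 2-FC regressor for a single (x, y) keypoint."""

    def __init__(self):
        super().__init__()
        # Each stage: 3x3 conv (padding keeps spatial size), ReLU, 2x2 max pool.
        self.features = nn.Sequential(
            nn.Conv2d(1, 12, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(12, 24, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(24, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Assumed 60x80 input: pooling gives 30x40 -> 15x20 -> 7x10 feature maps.
        self.regressor = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 7 * 10, 128), nn.ReLU(),
            nn.Linear(128, 2),  # predicted (x, y) of the nose
        )

    def forward(self, x):
        return self.regressor(self.features(x))


model = NoseNet()
batch = torch.zeros(4, 1, 60, 80)  # 4 dummy grayscale images
print(model(batch).shape)  # torch.Size([4, 2])
```

Changing the assumed input size means recomputing the `32 * 7 * 10` flatten dimension.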

I experimented with different training hyperparameters to see which gave better results. The following are loss graphs for different learning rates. With a learning rate of 1e-5, the loss takes longer to decrease. With a learning rate of 1e-1, the loss is very high initially (likely because the updates overshoot the minimum), but it does eventually converge.
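The effect of the learning rate can be illustrated on a toy problem. The sketch below runs plain gradient descent on a 1-D quadratic. It is not the experiment from this project: the function, the curvature value of 15, and the step count are assumptions picked so that 1e-1 visibly overshoots while still converging, and 1e-5 barely moves.

```python
def gradient_descent(lr, curvature=15.0, x0=1.0, steps=20):
    """Minimize f(x) = 0.5 * curvature * x**2; return every visited x."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        grad = curvature * x  # derivative of 0.5 * curvature * x**2
        x = x - lr * grad
        xs.append(x)
    return xs


def loss(x, curvature=15.0):
    return 0.5 * curvature * x * x


for lr in (1e-5, 1e-3, 1e-1):
    xs = gradient_descent(lr)
    # A sign flip between consecutive iterates means the step jumped past the minimum.
    overshoots = any(a * b < 0 for a, b in zip(xs, xs[1:]))
    print(f"lr={lr:g}: final loss {loss(xs[-1]):.3e}, overshoots={overshoots}")
```

In this toy setting the smallest rate leaves the loss almost unchanged after 20 steps, while the largest rate jumps across the minimum on every step yet still shrinks the loss.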

Here are some of the results on the test set.

2 Images that worked well:

2 Images that didn't work well:

These likely failed because the faces were off-center. Most of the data has the face centered, which motivates Part 2, where data augmentation is used to remedy this issue.
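As an illustration of shift-based augmentation, here is a minimal NumPy sketch that translates an image and moves its keypoints by the same offset. It assumes keypoints are stored as (x, y) pixel coordinates and that the exposed border is filled with zeros. The function name and the shift range are hypothetical, and this is not necessarily how the augmentation in this project was implemented.

```python
import numpy as np


def shift_image_and_keypoints(image, keypoints, dx, dy):
    """Translate a 2D image by (dx, dy) pixels with zero padding.

    keypoints is an (N, 2) array of (x, y) pixel coordinates (an assumption),
    so the same offset is simply added to every point.
    """
    h, w = image.shape
    shifted = np.zeros_like(image)
    moved = keypoints + np.array([dx, dy])
    if abs(dx) >= w or abs(dy) >= h:
        return shifted, moved  # shifted entirely out of frame
    # Overlapping windows in the source and destination images.
    src_x0, src_x1 = max(0, -dx), min(w, w - dx)
    src_y0, src_y1 = max(0, -dy), min(h, h - dy)
    dst_x0, dst_x1 = max(0, dx), min(w, w + dx)
    dst_y0, dst_y1 = max(0, dy), min(h, h + dy)
    shifted[dst_y0:dst_y1, dst_x0:dst_x1] = image[src_y0:src_y1, src_x0:src_x1]
    return shifted, moved


rng = np.random.default_rng(0)
image = rng.random((60, 80))          # dummy grayscale image
keypoints = np.array([[40.0, 30.0]])  # dummy nose position
dx, dy = (int(v) for v in rng.integers(-10, 11, size=2))  # assumed +/-10 px range
aug_image, aug_keypoints = shift_image_and_keypoints(image, keypoints, dx, dy)
```

If keypoints are stored as fractions of the image size instead of pixels, the offset would need to be divided by the width and height before being added.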

Part 2: Full Facial Keypoints Detection

Here are the ground-truth facial keypoints overlaid on the images.

This was the architecture of my model. I used the Adam optimizer with learning rate 1e-3. Batch size was 16.
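A training setup with these settings might look like the PyTorch sketch below. Only the Adam optimizer, the 1e-3 learning rate, and the batch size of 16 come from this writeup. The MSE loss, the random stand-in data, the tiny linear model, the `NUM_KEYPOINTS` value, and the `run_epoch` helper are assumptions for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

NUM_KEYPOINTS = 58  # placeholder value, not stated in the writeup

# Random stand-in data: 64 tiny grayscale images with flattened (x, y) targets.
images = torch.rand(64, 1, 8, 8)
targets = torch.rand(64, NUM_KEYPOINTS * 2)
loader = DataLoader(TensorDataset(images, targets), batch_size=16, shuffle=True)

# Stand-in model; the real architecture is the one described in the text.
model = nn.Sequential(nn.Flatten(), nn.Linear(8 * 8, NUM_KEYPOINTS * 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()  # assumed loss function


def run_epoch(model, loader, optimizer=None):
    """Run one pass over loader; train if an optimizer is given, else evaluate.

    Returns the mean per-batch loss.
    """
    training = optimizer is not None
    model.train(training)
    total = 0.0
    with torch.set_grad_enabled(training):
        for x, y in loader:
            batch_loss = criterion(model(x), y)
            if training:
                optimizer.zero_grad()
                batch_loss.backward()
                optimizer.step()
            total += batch_loss.item()
    return total / len(loader)


history = [run_epoch(model, loader, optimizer) for _ in range(5)]
print(history)
```

Calling `run_epoch` on a validation loader without an optimizer gives the validation curve for a loss-over-epochs plot.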

Here is a plot of the loss over epochs.

And here are the learned filters from the first conv layer.

Now, on to the results. Here are 2 images that worked well:

2 Images that didn't work well:

The left one likely failed because, after the shift, the face sits very close to the bottom of the image. This is a significant spatial shift, since most of the faces are near the center of the image. The right one likely failed because side views of faces are harder to detect. Most of the faces are frontal, so the network was not able to detect side-view facial features well.

Part 3: Train With Larger Dataset

Kaggle username: mountang

Mean Absolute Error: 13.74786
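For reference, mean absolute error averages the absolute difference between predicted and true coordinates. The sketch below assumes the average runs over every x and y value, in the same units as the inputs. The exact averaging used by the Kaggle leaderboard is not described here, so treat this as an illustration of the metric and not of the competition's scoring code.

```python
import numpy as np


def mean_absolute_error(pred, target):
    """Mean of |pred - target| over all coordinates."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean(np.abs(pred - target)))


# Two keypoints, each (x, y); absolute errors are 2, 1, 0, and 4.
pred = [[10, 20], [30, 40]]
target = [[12, 19], [30, 44]]
print(mean_absolute_error(pred, target))  # 1.75
```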

This was the architecture of my model. I used the Adam optimizer with learning rate 1e-3. Batch size was 16.

Loss graphs

Results on test set

These seemed to work well, likely because they are fairly standard frontal faces.

These did not work quite as well. On the left, the man's eyes seem to be very close to his eyebrows (closer than on other faces). On the right, the child's face is smaller, so the network predicts a face outline that is bigger than it really is, likely because most of the images are of adults.

My images

For Hugh Jackman, the network gets the bottom of the face and the nose, but his eyes and eyebrows are actually much closer together than predicted. The same is true for Ryan Reynolds. I expected the network to have a hard time with Simu Liu since his face is tilted, but it works surprisingly well, with the exception of the keypoints on the left side of his face outline.