CS194-26 Project 5

Facial Keypoint Detection with Neural Networks

Sarina Sabouri - November 12th, 2021

In this project, I used neural networks to automatically detect facial keypoints.


Part 1: Nose Tip Detection

In this part of the project, I designed a neural network to detect only the nose tip keypoint of each face.

Dataloader

First, I created a custom dataloader to load the images and the nose keypoints. This dataloader converts the image in each sample to grayscale, converts the pixel values to normalized floats, and resizes the image to a smaller size. Here are a few images returned from my dataloader:
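The three preprocessing steps above can be sketched as follows. This is a minimal NumPy illustration, not the exact code used in the project: the function name `preprocess`, the 60x80 target size, the [-0.5, 0.5] pixel range, and the nearest-neighbor resize are all assumptions made for the example.

```python
import numpy as np

def preprocess(image_rgb, out_h=60, out_w=80):
    """Grayscale, normalize, and shrink an H x W x 3 uint8 image."""
    # Weighted sum of the color channels (standard luminance weights).
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = image_rgb.astype(np.float32) @ weights
    # Map pixel values from [0, 255] to [-0.5, 0.5] (assumed range).
    gray = gray / 255.0 - 0.5
    # Nearest-neighbor resize: pick evenly spaced source rows and columns.
    h, w = gray.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return gray[rows][:, cols]

# Synthetic stand-in for a real photo (assumption: 480 x 640 RGB input).
image = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)
small = preprocess(image)
print(small.shape, small.dtype)
```

A real dataloader would more likely use a library resize with interpolation; nearest-neighbor is used here only to keep the sketch dependency-free.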

Dataloader Images:

Training and Validation Loss

Below is a plot of my loss while training my nose detection model:

Results

Here are the results of my network. The red points are my predicted points and the blue points are the ground truth points. In many cases my network detected the nose correctly, but for many of the images where the face was tilted or turned to the side, it was not successful. This failure could be attributed to the fact that most faces in the dataset look straight at the camera. If the dataset contained more faces turned in different directions, I believe the performance of my network would improve.

Good results:

Failed results:

Model & Hyperparameter Tuning

Model

I trained the following model for 25 epochs with a batch size of 3 and a learning rate of 0.001:

Hyperparameter Tuning

I was interested in how my model's performance changes as the hyperparameters vary. First, I lowered my learning rate to 0.0001: I noticed slightly worse performance in detecting noses, and the training loss curve had a much steeper drop in the first couple of epochs, as shown below. Then, I removed a convolutional layer: this also resulted in weaker performance compared to my original model, but it produced a training loss curve with a shape similar to the original model's. The following are the train/validation MSE loss plots for my original model and the two variants:

My model
Learning rate = 0.0001
Removed one Convolutional layer
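The MSE loss plotted above is the mean squared difference between the predicted and ground-truth coordinates. A minimal sketch of that computation, assuming keypoints are stored as (x, y) pairs normalized by the image size; `mse_loss` is a hypothetical helper written for illustration, not the project's code. In PyTorch, `torch.nn.MSELoss` with its default settings computes the same quantity.

```python
import numpy as np

def mse_loss(pred, target):
    """Mean of squared coordinate differences over all keypoints."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean((pred - target) ** 2))

# One nose keypoint, predicted slightly off in both x and y (made-up values).
pred = [[0.52, 0.48]]
target = [[0.50, 0.50]]
loss = mse_loss(pred, target)
print(loss)
```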


Part 2: Full Facial Keypoints Detection

In this part of the project, I aimed to detect the full set of facial keypoints.

Dataloader

To prevent my model from overfitting, I applied data augmentations such as a random rotation and a random color jitter. Here are some of the outputs of my dataloader:
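When an image is rotated for augmentation, its keypoints have to be rotated by the same angle so that the labels stay aligned with the face. A minimal sketch of that label transform, assuming pixel-coordinate keypoints, rotation about the image center, and a hypothetical range of plus or minus 15 degrees; the function names are illustrative and not taken from the project's code.

```python
import math
import random

def rotate_keypoints(points, angle_deg, width, height):
    """Rotate (x, y) pixel coordinates about the image center.

    With the y axis pointing down, a positive angle appears clockwise on
    screen. The image rotation must use the same convention.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = width / 2.0, height / 2.0
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + cos_t * dx - sin_t * dy,
                        cy + sin_t * dx + cos_t * dy))
    return rotated

def random_rotation_angle(max_deg=15.0, rng=random):
    """Draw an angle uniformly from [-max_deg, max_deg] (assumed range)."""
    return rng.uniform(-max_deg, max_deg)

# Two made-up keypoints in a 160 x 120 image.
points = [(100.0, 60.0), (80.0, 90.0)]
angle = random_rotation_angle()
augmented = rotate_keypoints(points, angle, width=160, height=120)
print(angle, augmented)
```

Color jitter changes only pixel values, so the keypoints need no adjustment for it.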

Dataloader Images:

Training and Validation Loss

Below is a plot of my loss while training my full facial keypoints model:

Results

Here are the results of my network. The red points are my predicted points and the blue points are the ground truth points. As with the nose points, my network was successful in detecting the keypoints when the face was facing forward, but had difficulty correctly detecting keypoints for faces turned to the side. This failure could be attributed to the fact that most faces in the dataset look straight at the camera. If the dataset contained more faces turned in different directions, I believe the performance of my network would improve.

Good results:

Failed results:

Model

I trained the following model for 25 epochs with a batch size of 5 and a learning rate of 0.003:

Learned Filters

The following are the learned filters from the first convolutional layer of my network:


Part 3: Train with a Larger Dataset

In this part of the project, I trained my model with a much larger dataset.

Model

I used the predefined ResNet18 model as my network and trained it with a batch size of 64 and a learning rate of 0.001. The following are the specifics of the network:

Training and Validation Loss

Below is a plot of my loss while training my ResNet18 model:

Results from Testing Set

Here are the results of my network on the testing set.

Results from my own images

The following are the results on my own images. As seen below, the network fails on images where the face is slightly off-center, but succeeds on faces that are centered in the frame.

Reflection/What I Learned

Overall, I had a lot of fun working on this project, as I learned how to use PyTorch for the first time and had the opportunity to apply my neural network skills in a very practical setting.