CS 194-26 Project 5

name: Andrew Shieh


Convolutional neural networks (CNNs) are a core building block of modern computer vision. In this project, I design my own CNNs for facial keypoint detection.

Nose Tip Detection

First, I built a network that detects the nose tip of each subject. Using a face dataset from Danish computer vision researchers, I started by building a dataloader for the face images and their nose keypoints. Take a look at a small sample of them here:
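As a rough sketch of what such a dataloader might look like (not the exact code used here), the block below assumes PyTorch, grayscale images already loaded into memory as a NumPy array, and nose keypoints stored as (x, y) fractions of the image width and height. The class name `NoseDataset`, the 60x80 image size, and the random stand-in data are all hypothetical.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class NoseDataset(Dataset):
    """Pairs each face image with its nose-tip keypoint (hypothetical layout)."""

    def __init__(self, images, nose_points):
        # images: (N, H, W) array; nose_points: (N, 2) array of (x, y) fractions
        self.images = images
        self.nose_points = nose_points

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Add a channel dimension so each image has shape (1, H, W)
        image = torch.from_numpy(self.images[idx]).float().unsqueeze(0)
        point = torch.from_numpy(self.nose_points[idx]).float()
        return image, point


# Random stand-in data: 8 images of size 60x80 with pixel values in [-0.5, 0.5)
rng = np.random.default_rng(0)
images = rng.random((8, 60, 80)) - 0.5
nose_points = rng.random((8, 2))

dataset = NoseDataset(images, nose_points)
loader = DataLoader(dataset, batch_size=4, shuffle=True)
```

Iterating over `loader` then yields batches of images with shape (4, 1, 60, 80) and keypoints with shape (4, 2).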

Then, I ran them through my network architecture. It's three convolutional layers followed by two fully-connected layers:
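To make the shape of such a network concrete, here is a minimal PyTorch sketch with three convolutional layers followed by two fully-connected layers. Only the layer counts come from this writeup. The channel counts, kernel sizes, max pooling, ReLU activations, 60x80 grayscale input, and the name `NoseNet` are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NoseNet(nn.Module):
    """Three conv layers + two fully-connected layers (illustrative sizes)."""

    def __init__(self):
        super().__init__()
        # Assumed input: (batch, 1, 60, 80) grayscale images
        self.conv1 = nn.Conv2d(1, 12, kernel_size=7)   # -> 54x74, pooled to 27x37
        self.conv2 = nn.Conv2d(12, 16, kernel_size=5)  # -> 23x33, pooled to 11x16
        self.conv3 = nn.Conv2d(16, 32, kernel_size=3)  # -> 9x14,  pooled to 4x7
        self.fc1 = nn.Linear(32 * 4 * 7, 128)
        self.fc2 = nn.Linear(128, 2)                   # (x, y) of the nose tip

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = F.max_pool2d(F.relu(self.conv3(x)), 2)
        x = torch.flatten(x, 1)                        # keep the batch dimension
        x = F.relu(self.fc1(x))
        return self.fc2(x)                             # no activation on the output
```

The final layer has no activation because the network regresses coordinates rather than class scores.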

Here's the MSE loss on both the training and validation sets. I used the first 32 subjects as the training set and the remaining subjects for validation.
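The split and the loss can be sketched in a few lines of plain Python. This assumes each sample carries a 1-based subject id, and the 40 subjects with 6 images each in the usage lines below are hypothetical counts for illustration. Splitting by subject rather than by image keeps every validation face unseen during training.

```python
def split_by_subject(samples, n_train_subjects=32):
    """Split (subject_id, item) pairs so the first subjects go to training."""
    train = [s for s in samples if s[0] <= n_train_subjects]
    val = [s for s in samples if s[0] > n_train_subjects]
    return train, val


def mse(pred, target):
    """Mean squared error over two flat lists of coordinates."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Hypothetical dataset: 40 subjects, 6 images per subject
samples = [(subject, view) for subject in range(1, 41) for view in range(6)]
train, val = split_by_subject(samples)
```

With these hypothetical counts, `train` holds 192 samples and `val` holds 48.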

I also tried changing the model's hyperparameters: I set every convolutional kernel to a consistent size of (3, 3) and changed the learning rate to 1e-4. However, this actually gave worse results. Take a look at the MSE loss graph:

To give a better idea of the model's performance, here are two examples where the nose tip was detected correctly and two where it was not. The failures are likely due to differences in saturation and in facial features (e.g. facial hair), which future models would need to handle. Take a look:

Full Facial Keypoints Detection

For full keypoint detection, I had to build a new dataset and dataloader covering all the keypoints, and add data transformations that change the saturation, rotate the image, and shift it.
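The key detail with geometric transformations is that the keypoints have to move with the image. The NumPy sketch below shows this for the shift case only. It assumes keypoints are stored as (x, y) fractions of the image width and height, and the function names and the 10-pixel maximum shift are hypothetical. A rotation would need the same treatment (rotating the keypoints about the image center), while a saturation change leaves the keypoints untouched.

```python
import numpy as np


def shift(image, keypoints, dx, dy):
    """Shift an (H, W) image by (dx, dy) pixels and move keypoints to match.

    keypoints: (K, 2) array of (x, y) fractions of width and height.
    Pixels shifted in from outside the image are filled with zeros.
    """
    h, w = image.shape
    shifted = np.zeros_like(image)
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    moved = keypoints + np.array([dx / w, dy / h])
    return shifted, moved


def random_shift(image, keypoints, max_shift=10, rng=None):
    """Apply a uniformly random shift of up to max_shift pixels per axis."""
    rng = np.random.default_rng() if rng is None else rng
    dx, dy = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    return shift(image, keypoints, dx, dy)
```

Keypoints that land outside the [0, 1] range after a shift have moved off the image, so a real pipeline would need to decide whether to keep or resample such cases.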