Project 5: Facial Keypoint Detection with Neural Networks

COMPSCI 294-26: Computational Photography & Computer Vision (Fall 2021)
Rushil Kapadia

Part 1: Nose Tip Detection

Sampled image from dataloader visualized with ground-truth keypoints

Train and validation MSE loss during training process

Playing with the hyperparameters

The following plots show the train/validation errors after changing the kernel size (3 to 5), the channel count (18 to 40), and the learning rate (to 0.01), respectively.
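To make the kernel-size and channel-count changes concrete, here is a minimal PyTorch sketch of a small nose-tip regressor whose convolutions are parameterized by both values. The class name `NoseNet`, the number of layers, the pooling choices, the hidden width of 64, and the 60x80 input size are all assumptions for illustration; the report does not state the actual architecture.

```python
import torch
import torch.nn as nn


class NoseNet(nn.Module):
    """Hypothetical small CNN that regresses one (x, y) nose-tip coordinate."""

    def __init__(self, kernel_size: int = 3, channels: int = 18):
        super().__init__()
        pad = kernel_size // 2  # keeps the spatial size unchanged for odd kernels
        self.features = nn.Sequential(
            nn.Conv2d(1, channels, kernel_size, padding=pad),  # grayscale input
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(channels, channels, kernel_size, padding=pad),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(channels, channels, kernel_size, padding=pad),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),  # fixed-size output for any input size
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(channels * 4 * 4, 64),  # hidden width is an assumption
            nn.ReLU(),
            nn.Linear(64, 2),  # one keypoint -> (x, y)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))


# The two settings mentioned in the text: kernel 3 vs 5, channels 18 vs 40.
baseline = NoseNet(kernel_size=3, channels=18)
variant = NoseNet(kernel_size=5, channels=40)

dummy = torch.zeros(2, 1, 60, 80)  # assumed input size, batch of 2
out_baseline = baseline(dummy)
out_variant = variant(dummy)
```

Both variants produce a tensor of shape (batch, 2), so the same training code can be reused while only the constructor arguments change.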

2 facial images where the network detects the nose correctly

2 facial images where the network detects the nose incorrectly

Reasons for failed detection: The network may have had trouble locating the keypoint because the faces are rotated (i.e., not facing the camera directly), and possibly because of the lack of hair in the first image.

Part 2: Full Facial Keypoints Detection

Sampled image from the dataloader visualized with ground-truth keypoints

Model Architecture

Hyperparameters: The model was trained for 10 epochs with a learning rate of 1e-2 and a batch size of 4.
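As a rough illustration of how these hyperparameters fit together, the sketch below runs a 10-epoch loop with MSE loss, a learning rate of 1e-2, and a batch size of 4. The optimizer (Adam), the function name `fit`, the toy linear model, the keypoint count `K`, and the random stand-in data are all assumptions; the report does not specify them.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def fit(model, train_loader, val_loader, epochs=10, lr=1e-2):
    """Train with MSE loss; return the mean per-batch loss for each epoch."""
    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # optimizer is an assumption
    train_hist, val_hist = [], []
    for _ in range(epochs):
        model.train()
        total = 0.0
        for images, keypoints in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), keypoints)
            loss.backward()
            optimizer.step()
            total += loss.item()
        train_hist.append(total / len(train_loader))

        model.eval()
        total = 0.0
        with torch.no_grad():  # no gradients needed for validation
            for images, keypoints in val_loader:
                total += criterion(model(images), keypoints).item()
        val_hist.append(total / len(val_loader))
    return train_hist, val_hist


# Random stand-in data: 16 grayscale images with K hypothetical keypoints each.
torch.manual_seed(0)
K = 5
images = torch.rand(16, 1, 24, 32)
keypoints = torch.rand(16, 2 * K)
train_loader = DataLoader(TensorDataset(images[:12], keypoints[:12]), batch_size=4, shuffle=True)
val_loader = DataLoader(TensorDataset(images[12:], keypoints[12:]), batch_size=4)

toy_model = nn.Sequential(nn.Flatten(), nn.Linear(24 * 32, 2 * K))
train_hist, val_hist = fit(toy_model, train_loader, val_loader, epochs=10, lr=1e-2)
```

The two returned lists hold one value per epoch and can be plotted directly to obtain train/validation loss curves.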

Train and validation MSE loss across iterations

2 facial images where the network detects the facial keypoints correctly

2 facial images where the network detects the facial keypoints incorrectly

Reasons for failed detection: The model seems to struggle with turned faces. Because turned faces are a minority of the dataset, the model may have learned to "ignore" them.

Filter Visualization

Learned kernels for first convolutional layer:

Learned kernels for final convolutional layer:
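One possible way to produce such kernel images is to read the weights of a convolutional layer, rescale each filter to [0, 1], and tile the filters into a single grid. The helper name `kernel_grid`, the per-filter min-max scaling, and the choice to show only one input-channel slice per filter are assumptions made for this sketch, not necessarily how the figures here were generated.

```python
import math

import torch
import torch.nn as nn


def kernel_grid(conv: nn.Conv2d, in_channel: int = 0, pad: int = 1) -> torch.Tensor:
    """Tile one input-channel slice of every filter into a single 2-D image."""
    w = conv.weight.detach()[:, in_channel]  # shape: (out_channels, kH, kW)
    lo = w.amin(dim=(1, 2), keepdim=True)
    hi = w.amax(dim=(1, 2), keepdim=True)
    w = (w - lo) / (hi - lo).clamp_min(1e-8)  # per-filter min-max scaling to [0, 1]

    n, kh, kw = w.shape
    cols = math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)
    grid = torch.zeros(rows * (kh + pad) + pad, cols * (kw + pad) + pad)
    for i in range(n):
        r, c = divmod(i, cols)
        top = pad + r * (kh + pad)
        left = pad + c * (kw + pad)
        grid[top:top + kh, left:left + kw] = w[i]
    return grid


# Stand-in layer; in practice this would be a layer taken from the trained model.
first_conv = nn.Conv2d(1, 12, kernel_size=5)
grid = kernel_grid(first_conv)
```

The resulting 2-D tensor can be passed to an image viewer such as matplotlib's `imshow` with a grayscale colormap. For later layers with many input channels, `in_channel` selects which slice of each filter is shown.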

Part 3: Train with Larger Dataset

Mean Absolute Error

From Kaggle: Team Name: Rushil Kapadiaconv2
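For reference, mean absolute error averages the absolute difference between predicted and ground-truth coordinates over all keypoints and images. The sketch below is a minimal version with made-up numbers; the function name and the sample values are illustrative only and are unrelated to the actual Kaggle score.

```python
import torch


def mean_absolute_error(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Average of |pred - target| over every coordinate."""
    return (pred - target).abs().mean()


# Made-up coordinates for two keypoints (x, y each).
pred = torch.tensor([[10.0, 20.0], [30.0, 40.0]])
target = torch.tensor([[12.0, 19.0], [30.0, 44.0]])
mae = mean_absolute_error(pred, target).item()  # (2 + 1 + 0 + 4) / 4 = 1.75
```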

Model Architecture

I used ResNet18, the PyTorch model suggested by the project. I changed the first convolutional layer to take 1 input channel instead of 3, because the images are loaded as grayscale. I also changed the final layer to produce 136 outputs, because the model predicts 68 facial keypoints with 2 coordinates each (2*68 = 136).

Hyperparameters: I trained for 5 epochs with a learning rate of 1e-2 and batch size of 1.

Train and validation MSE loss across iterations

Test-set images visualized with predicted keypoints (good and bad)

Running the model on images in my collection...