Project 5 - Facial Keypoint Detection with Neural Networks

Eric Tang

Part 1: Nose Tip Detection

Dataloader Example

All Keypoints

Dataloader Nose Keypoint

Train and Validation Loss

Loss Curves for Nose Detection

Hyperparameter Tuning

Filter Size

Tuning Filter Size

Tuning Learning Rate

Across the filter sizes and learning rates I tried, the hyperparameters did not have a significant effect on the results.

Examples

Success

Correct 1

Correct 2

Failure

Wrong 1

Wrong 2

The network is generally more accurate on straight-on faces than on turned ones, and these examples follow that pattern.

Part 2

Dataloader Samples

Dataloader Samples

Architecture Details

For the architecture, I used the following base layers:

Net(
  (conv1): Conv2d(1, 16, kernel_size=(7, 7), stride=(1, 1))
  (conv2): Conv2d(16, 32, kernel_size=(5, 5), stride=(1, 1))
  (conv3): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1))
  (conv4): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1))
  (conv5): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=1120, out_features=256, bias=True)
  (fc2): Linear(in_features=256, out_features=128, bias=True)
  (fc3): Linear(in_features=128, out_features=116, bias=True)
)

Between the convolution layers, I used a ReLU layer followed by a 2x2 max pooling layer. After the last convolution layer, I flattened the output and passed it into a three-layer feed-forward network, with a ReLU between the fully connected layers (but not after the output).
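As a sanity check on the layer sizes, the flattened feature count can be worked out by hand. The sketch below is only arithmetic, not the training code. It assumes a 240x320 grayscale input and a 2x2 max pool after every convolution, including the last one. Neither assumption is stated above; they are simply one combination that reproduces the in_features=1120 of fc1. The helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of one spatial dimension after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2):
    """Output length after max pooling with stride == kernel and no padding."""
    return size // kernel


def flattened_features(height, width, kernels=(7, 5, 3, 3, 3), out_channels=32):
    """Feature count after the conv stack, assuming conv -> ReLU -> pool each time."""
    for k in kernels:
        height, width = conv_out(height, k), conv_out(width, k)
        height, width = pool_out(height), pool_out(width)
    return out_channels * height * width


# Assumed input resolution (not given in the writeup): 240 rows by 320 columns.
print(flattened_features(240, 320))  # 1120 = 32 channels * 5 * 7
```

Under these assumptions the spatial size shrinks from 240x320 to 5x7 over the five stages, and 32 * 5 * 7 matches the first fully connected layer.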

I trained with batch size 32, learning rate 0.001, for 30 epochs using the Adam optimizer and MSE loss.
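A minimal sketch of a training loop with these settings is below. It is not the code used for the project: the function name `train`, the shuffling of the training data, and the per-epoch loss bookkeeping are assumptions, and validation is left out for brevity.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader


def train(model, train_set, epochs=30, batch_size=32, lr=1e-3):
    """Train `model` with Adam and MSE loss; return the mean loss per epoch."""
    # Shuffling each epoch is an assumption, not stated in the writeup.
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    history = []
    for _ in range(epochs):
        model.train()
        running = 0.0
        for images, keypoints in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), keypoints)
            loss.backward()
            optimizer.step()
            # Weight by batch size so the epoch mean is per-sample.
            running += loss.item() * images.size(0)
        history.append(running / len(train_set))
    return history
```

Here `train_set` is expected to yield (image, keypoints) pairs, with the keypoints flattened to a 116-dimensional vector to match the output of fc3.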

Train and Validation Loss

Train + Val loss

Examples

Success

Correct 1

Correct 2

Failure

Wrong 1

Wrong 2

In these cases, it's possible that the bald man's head is missing relevant features (i.e. hair). In the picture of the woman, her hands appear to be raised, which might add out-of-distribution features to the image.

Filter Visualization

Learned Filters

Part 3

Architecture Details

I used a ResNet-50, and trained on a compute cluster with 4 GPUs using nn.DataParallel. I used a batch size of 128, and a learning rate of 0.001, and trained for 15 epochs.

For data augmentation on the training set, I rotated each image by an angle between -15 and 15 degrees before passing it to the network, and rotated its keypoints in the same way.
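The keypoint half of this augmentation can be sketched as follows. This is an illustration under stated assumptions rather than the project code: keypoints are taken to be (x, y) pixel coordinates, the rotation is about a given center, the angle is drawn uniformly, and a positive angle is counter-clockwise as seen on screen (with the y axis pointing down). The image would need to be rotated by the same angle under the same convention. The names `rotate_keypoints` and `sample_angle` are hypothetical.

```python
import math
import random


def rotate_keypoints(keypoints, angle_deg, center):
    """Rotate (x, y) pixel keypoints about `center` by `angle_deg`.

    Positive angles are counter-clockwise on screen, where y points down.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    rotated = []
    for x, y in keypoints:
        dx, dy = x - cx, y - cy
        # The sign of the sine terms accounts for the downward y axis.
        rotated.append((cx + dx * cos_t + dy * sin_t,
                        cy - dx * sin_t + dy * cos_t))
    return rotated


def sample_angle(max_deg=15.0):
    """Draw a rotation angle uniformly from [-max_deg, max_deg] degrees."""
    return random.uniform(-max_deg, max_deg)


# One angle per training sample, applied to both the image and its keypoints.
angle = sample_angle()
points = rotate_keypoints([(120.0, 80.0), (140.0, 95.0)], angle, center=(112.0, 112.0))
```

A quick check of the convention: a point directly to the right of the center moves straight up (smaller y) under a 90 degree rotation.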

Train + Validation Loss

Train + Val loss

Test Set Examples

Own Photos

Luka

Aaron Rodgers