CS 194-26 Project 5

Part 1: Nose Tip Detection

Sampled images from my dataloader

Predictions from my CNN: The first and fourth predictions displayed below are close to the ground-truth nose points, whereas the second and third are not. My model tended to predict more accurately on head-on faces (the first and fourth examples) than on faces photographed at an angle.

Hyperparameter tuning: I tried increasing the learning rate to 0.01, which gave less accurate results. Then, I changed the filter size from my original 3x3 to 5x5, which also gave less accurate results. The plots below show, in order, the training (blue) and validation (yellow) loss over iterations for my original model, for the higher learning rate, and for the larger filter size.

Part 2: Full Facial Keypoints Detection

Sampled images from my dataloader

Predictions from my CNN: My CNN has 5 convolution layers, each with a 3x3 filter. The channel counts go 1->12, 12->15, and 15->32, and the other 2 layers keep the same number of input and output channels. Each convolution layer is followed by a ReLU layer and a maxpool layer, and the network ends with 3 fully connected layers. The first two examples below are close to the ground truth, whereas the last two are off. My model tends to be less accurate on images altered to have lower exposure or rotations/angled perspectives.
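As a rough sketch of how the layer sizes described above work out, the snippet below traces the feature-map shape through 5 conv + maxpool stages to find how many values would feed the first fully connected layer. Several things here are assumptions for illustration and are not stated in the writeup: the ordering of the channel sizes (1->12, 12->12, 12->15, 15->15, 15->32), a 180x240 grayscale input, stride 1 with no padding, and a 2x2 maxpool. The helper names are hypothetical.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Output length of one spatial dimension after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2):
    """Output length after a non-overlapping max pool (floor division)."""
    return size // window


def trace_shapes(height, width, channels, kernel=3, padding=0, pool=2):
    """Return (channels, height, width) after each conv -> ReLU -> maxpool stage.

    ReLU is elementwise, so it does not change the shape.
    """
    shapes = []
    for out_ch in channels[1:]:
        height = pool_out(conv_out(height, kernel, padding=padding), pool)
        width = pool_out(conv_out(width, kernel, padding=padding), pool)
        shapes.append((out_ch, height, width))
    return shapes


# ASSUMPTION: this ordering of the channel sizes is one layout consistent
# with the description (5 layers, 2 of them with equal in/out channels).
CHANNELS = [1, 12, 12, 15, 15, 32]

# ASSUMPTION: 180x240 input; the writeup does not give the image size.
shapes = trace_shapes(180, 240, CHANNELS)
for i, (c, h, w) in enumerate(shapes, start=1):
    print(f"after conv block {i}: {c} x {h} x {w}")

# Number of values after flattening, i.e. the input size of the first
# fully connected layer under these assumptions.
c, h, w = shapes[-1]
flat_features = c * h * w
print("flattened features:", flat_features)
```

Under these assumptions the last feature map is quite small, so each additional unpadded conv + maxpool stage leaves less spatial detail for the fully connected layers. A different input size or padding choice would change the numbers.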

Training (blue) and validation (yellow) loss over iterations