Part 3. Train With Larger Dataset
For this part, I used inputs from the iBUG Faces in the Wild dataset, converting the cropped
faces to grayscale, normalizing pixel values to the range [-0.5, 0.5], and resizing them to
the recommended 224x224. The same data augmentation as in the previous part was applied:
random rotations within a fixed range, plus brightness and saturation adjustments to
diversify the inputs.
These images were loaded into a single custom PyTorch Dataset for training. Select images
can be seen below, with their associated labels (full facial keypoints) shown in light blue.
Randomly selected faces & associated facial landmarks:
This augmented dataset was loaded into a PyTorch DataLoader with a batch size of 1. No
train-test split was applied, since a separate test set was already provided. A pretrained
ResNet18 model was fine-tuned for this task, with its default configuration adjusted as
shown below both to accommodate the input data and to improve performance.
Training ran for 25 epochs using the Adam optimizer with a learning rate of 1e-4. The mean
squared error loss recorded across these training epochs is shown below.
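The training setup just described (Adam at 1e-4, MSE on the flattened keypoint coordinates, loss tracked per epoch) can be sketched as follows; the function name and per-epoch averaging are assumptions about the bookkeeping, not the author's exact loop.

```python
import torch
import torch.nn as nn


def train(model, loader, epochs=25, lr=1e-4, device="cpu"):
    """Minimal sketch: Adam + MSE loss, returning the average loss per epoch."""
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    history = []
    for _ in range(epochs):
        model.train()
        total = 0.0
        for imgs, pts in loader:
            imgs, pts = imgs.to(device), pts.to(device)
            optimizer.zero_grad()
            pred = model(imgs)
            # Flatten (batch, num_points, 2) targets to match the head's output.
            loss = criterion(pred, pts.reshape(pts.shape[0], -1))
            loss.backward()
            optimizer.step()
            total += loss.item()
        history.append(total / len(loader))
    return history
```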