In this part, I wrote much of the code that I reused for the rest of the project. Specifically, I wrote a dataset class that reads the ASF annotation files to collect the keypoints (with an option to keep only the nose point or all of the facial points), a RegressionCNN module, and a general training loop that can be applied to different models and dataloaders.
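A minimal sketch of what such an ASF-reading dataset might look like. This is an assumption, not the actual project code: it presumes the common ASF layout in which comment lines start with `#`, a count line precedes the data, and each point row has seven whitespace-separated fields whose third and fourth entries are the relative x and y coordinates. `NOSE_INDEX` is a placeholder whose true value depends on the annotation scheme.

```python
NOSE_INDEX = 52  # hypothetical index of the nose keypoint


def parse_asf_lines(lines):
    """Collect (x, y) relative coordinates from ASF-formatted lines.

    Assumed layout: point rows have exactly 7 whitespace-separated
    fields, with fields[2] and fields[3] holding relative x and y.
    """
    points = []
    for line in lines:
        fields = line.split()
        if len(fields) == 7 and not line.lstrip().startswith("#"):
            points.append((float(fields[2]), float(fields[3])))
    return points


class FacePointsDataset:
    """Minimal dataset sketch with a nose-only option."""

    def __init__(self, asf_paths, nose_only=False):
        self.asf_paths = asf_paths
        self.nose_only = nose_only

    def __len__(self):
        return len(self.asf_paths)

    def __getitem__(self, idx):
        with open(self.asf_paths[idx]) as f:
            pts = parse_asf_lines(f)
        return [pts[NOSE_INDEX]] if self.nose_only else pts
```

In the real project the dataset would also load and return the corresponding image; this sketch only shows the annotation-parsing side.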
As a sanity check, I verified that the dataloader loaded and stored the keypoints correctly.
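Beyond visually overlaying the points on sample images, one cheap automated check is to assert that every relative coordinate lies in $[0, 1]$. The helper below is hypothetical, written in that spirit rather than taken from the project:

```python
def points_in_unit_box(points):
    """Sanity check: relative keypoints should all lie in [0, 1]."""
    return all(0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 for x, y in points)
```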
I then trained two models with learning rates of $1e-3$ and $5e-4$, respectively. I also varied the hidden fully connected layer size: the first model uses 256 units, while the second uses 64. Their training graphs are shown below.
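A sketch of a small regression CNN with the hidden fully connected width exposed as a hyperparameter, so the 256-unit and 64-unit variants come from the same class. The convolutional widths and depths here are placeholders, not the project's actual architecture:

```python
import torch
import torch.nn as nn


class RegressionCNN(nn.Module):
    """Small keypoint-regression CNN sketch.

    hidden_size is the varied hyperparameter (256 vs 64 in the two
    runs); conv channel counts are illustrative placeholders.
    """

    def __init__(self, hidden_size=256, n_outputs=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 12, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(12, 24, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(24, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            # LazyLinear infers its input size on the first forward pass
            nn.LazyLinear(hidden_size), nn.ReLU(),
            nn.Linear(hidden_size, n_outputs),
        )

    def forward(self, x):
        return self.head(self.features(x))
```

Instantiating `RegressionCNN(hidden_size=256)` and `RegressionCNN(hidden_size=64)` then yields the two compared models.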
In this part, I defined my own transforms to augment the data, and reused many of the methods from Part 1 to train the model. For instance, I used the same dataset class defined previously, as well as the same training loop method.
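Because keypoint targets must move together with the pixels, augmentation transforms for this task typically operate on the image and the points jointly. The transform below is a hypothetical sketch of that pattern (a random pixel shift applied consistently to both), not the project's actual transforms:

```python
import random
import numpy as np


class RandomShift:
    """Keypoint-aware augmentation sketch.

    Shifts the image by a random integer number of pixels and moves the
    relative (x, y) keypoints by the matching fraction of width/height.
    """

    def __init__(self, max_px=10):
        self.max_px = max_px

    def __call__(self, image, points):
        h, w = image.shape[:2]
        px = random.randint(-self.max_px, self.max_px)
        py = random.randint(-self.max_px, self.max_px)
        # roll the pixels; a full implementation might pad/crop instead
        shifted_img = np.roll(image, (py, px), axis=(0, 1))
        shifted_pts = [(x + px / w, y + py / h) for x, y in points]
        return shifted_img, shifted_pts
```

The same pattern extends to rotations and color jitter, with only the coordinate update changing per transform.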
The model I chose has the following architecture. I chose it because it is fast to train, having relatively few weights for a 6-layer regression CNN.
My MSE is 9.07269. For my model, I used a pretrained ResNet18 with a modified initial convolution layer and final fully connected layer.
I trained the ResNet18 on an augmented dataset of 33330 points for 30 epochs with a learning rate of $2e-4$. The training graph and validation predictions are shown below.
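A generic training loop consistent with that run might look as follows. The epoch count and learning rate mirror the values above; the optimizer choice (Adam) and MSE loss are assumptions, and the function name is hypothetical:

```python
import torch


def train(model, loader, epochs=30, lr=2e-4, device="cpu"):
    """Generic training-loop sketch: MSE loss, Adam optimizer.

    Returns the list of per-epoch mean training losses, suitable for
    plotting a training graph.
    """
    model.to(device).train()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    history = []
    for _ in range(epochs):
        total, n = 0.0, 0
        for images, targets in loader:
            images, targets = images.to(device), targets.to(device)
            opt.zero_grad()
            loss = loss_fn(model(images), targets)
            loss.backward()
            opt.step()
            total += loss.item() * images.size(0)
            n += images.size(0)
        history.append(total / n)
    return history
```

Because the loop only assumes `(images, targets)` batches, the same function serves the earlier RegressionCNN runs and the ResNet18 run with different dataloaders.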