Project 4: Classification and Segmentation
Introduction
In part 1, we train a simple classifier for the Fashion-MNIST dataset. In part 2, we train a neural net to perform semantic segmentation of images in a mini Facade dataset. For this project, I ran code on Google Colab using a GPU.
Working with neural nets in PyTorch
Creating and training the neural network for each respective part followed the same essential steps:
- Retrieve or create dataset(s), then initialize a loader object for it
- Split the data into training and validation sets
- Define the network architecture
- Train the network on the training set, evaluate it on the validation set, and tune hyperparameters if necessary. Repeat.
- Test the network with the test set. Report and analyze results.
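The steps above can be sketched as a minimal PyTorch loop. Everything here is a placeholder (a tiny synthetic dataset and a linear "net"), not the actual data or architecture from either part; it just shows the load → split → train → validate skeleton:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, random_split

# Synthetic stand-in for a real dataset (the real data comes from torchvision).
data = TensorDataset(torch.randn(100, 4), torch.randint(0, 3, (100,)))

# Split into training and validation sets, then wrap each in a loader.
train_set, val_set = random_split(data, [80, 20])
train_loader = DataLoader(train_set, batch_size=16, shuffle=True)
val_loader = DataLoader(val_set, batch_size=16)

# Tiny placeholder network; the real nets are conv nets.
net = nn.Linear(4, 3)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.9)

for epoch in range(2):
    net.train()                              # train on the training set
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(net(x), y)
        loss.backward()
        optimizer.step()
    net.eval()                               # evaluate on the validation set
    with torch.no_grad():
        val_loss = sum(criterion(net(x), y).item() for x, y in val_loader)
```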
Part 1: Classification
I read through this Colab tutorial, which covers most of the same steps, but with a different dataset.
The PyTorch articles “Neural Networks” and “Training a Classifier”
(linked in the spec) were also very useful in figuring out how the code works!
Hyperparameters
- Optimizer: optim.SGD with momentum=0.9
- Learning rate: 0.01
- Training: 35000 samples
- Validation: 7000 samples
- Epochs: 12
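The 35000/7000 split can be done with torch.utils.data.random_split; the tensors below are zero-filled stand-ins for the actual Fashion-MNIST data (which torchvision provides):

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder tensors standing in for the 42,000 Fashion-MNIST samples used here;
# the real dataset comes from torchvision.datasets.FashionMNIST.
full = TensorDataset(torch.zeros(42000, 1), torch.zeros(42000, dtype=torch.long))
train_set, val_set = random_split(full, [35000, 7000])
```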
Net
Net architecture was as specified in the spec.
Two changes noticeably improved the classifier: 1) increasing the conv filter size from a naive 3x3, and 2) increasing the maxpool filter size. I suspect this helps because each value in the next layer then summarizes a larger region of the previous layer.
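As an illustration of that filter-size change (the channel counts and layer count here are my assumptions, not the exact spec architecture), a small conv net with 5x5 filters instead of 3x3:

```python
import torch
import torch.nn as nn

# Illustrative sketch: 5x5 conv filters and 2x2 pooling; channel counts
# (32, 32, 120) are assumptions, not the spec's exact numbers.
net = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 32, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 4 * 4, 120), nn.ReLU(),
    nn.Linear(120, 10),             # 10 Fashion-MNIST classes
)
out = net(torch.randn(1, 1, 28, 28))  # Fashion-MNIST images are 28x28
```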
Results
Loss:
My training loss converged to 0.15, and validation loss to ~0.27.
Plot the train and validation accuracy during the training process.
Compute a per class accuracy of your classifier on the validation and test dataset.
Computing accuracy...
Accuracy of the network on the 60000 train images: 92.20 %
Accuracy of the network on the 7000 validation images: 90.50 %
Accuracy of the network on the 10000 test images: 90.23 %
Class | Accuracy (%)
0 | 82.50
1 | 98.50
2 | 88.00
3 | 90.20
4 | 79.90
5 | 98.30
6 | 73.30
7 | 95.60
8 | 98.70
9 | 97.30
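Per-class accuracy is just overall accuracy restricted to each true label; the labels below are toy values, while the real numbers come from running the trained net over the validation/test loader:

```python
import torch

# Toy predicted and true labels; in the real code these come from the
# trained net's argmax outputs over the validation/test loader.
preds = torch.tensor([0, 1, 1, 2, 2, 2])
labels = torch.tensor([0, 1, 0, 2, 2, 1])

num_classes = 3
per_class = {}
for c in range(num_classes):
    mask = labels == c                      # samples whose true label is c
    per_class[c] = (preds[mask] == labels[mask]).float().mean().item() * 100
```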
Which classes are the hardest to get?
Class 6 (73.30%) performed worst by a clear margin. Classes 0 (82.50%) and 4 (79.90%) were also noticeably weaker than the rest.
Show 2 images from each class which the network classifies correctly, and 2 more images where it classifies incorrectly.
Visualize the learned filters.
Part 2: Segmentation
Hyperparameters
- Epoch: 25
- Training dataset: 600 images, batch size 32
- Evaluation dataset: 306 images, batch size 16
- Loss: CrossEntropyLoss() (as starter)
- Optimizer: Adam (as starter)
Net
- 5 conv layers with ReLU applied after
- Maxpool after ReLU (layers 2, 3, 4)
- Maximum number of channels outputted by Conv2d: 96
- Min filter size: 3
- Max filter size: 7
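The description above can be sketched as follows. The specific channel progression, padding, and the final upsample are my assumptions: with three 2x pools, some upsampling is needed so the output matches the label map's resolution for CrossEntropyLoss.

```python
import torch
import torch.nn as nn

N_CLASS = 5  # facade classes: others, facade, pillar, window, balcony

# Sketch of the description: 5 conv layers (filter sizes 3-7, up to 96
# channels) with ReLU, maxpool after layers 2-4. The bilinear upsample
# restoring full resolution is an assumption, not stated in the writeup.
net = nn.Sequential(
    nn.Conv2d(3, 32, 7, padding=3), nn.ReLU(),
    nn.Conv2d(32, 64, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(64, 96, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(96, 96, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(96, N_CLASS, 3, padding=1),   # per-pixel class logits
    nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False),
)
out = net(torch.randn(1, 3, 256, 256))      # (batch, classes, H, W)
```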
Results
My network converged around epoch 20, with losses of about 0.79 for both training and validation.
Display a plot showing both training and validation loss across iterations.
Report the average precision on the test set.
Testing on test set: 0.8173239387963948
AP = 0.6452316617590995
AP = 0.771376737037513
AP = 0.12939426037550147
AP = 0.8111214941012976
AP = 0.4557280429990795
Average AP: 0.5625704392544982
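The evaluation reports one average precision per class; a minimal NumPy version of the AP computation is below. The scores are toy values — in the real evaluation they are the net's per-pixel class scores over the whole test set:

```python
import numpy as np

def average_precision(y_true, y_score):
    """Mean of precision measured at each positive, ranked by score."""
    order = np.argsort(-y_score)            # sort by descending score
    y = y_true[order]
    cum_tp = np.cumsum(y)                   # true positives at each rank
    precision = cum_tp / np.arange(1, len(y) + 1)
    return precision[y == 1].mean()

# Toy per-pixel ground truth and scores for one class.
y_true = np.array([1, 0, 1, 1, 0, 0])
y_score = np.array([0.9, 0.2, 0.8, 0.4, 0.6, 0.1])
ap = average_precision(y_true, y_score)
```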
Try running the trained model on the photo of a building from your collection. Which parts of the images does it get right? Which ones does it fail on?
Input | Output (image pairs shown here)
My model does pretty well with ‘windows’ (orange) and ‘others’ (black), but fails on ‘pillars’ (green) here, and misses most of the ‘balcony’ (red).