Project 4: Classification and Segmentation
Aditya Yadav
Overview
In the first part of this project, we train a convolutional neural network (CNN) to classify images from the Fashion MNIST dataset. In the
second part, we train a CNN to perform semantic segmentation on the Mini Facade dataset.
Part 1: Image Classification
Here are three sample images from the dataset:
From left to right, the classes are bag, pullover, and coat.
For my CNN implementation I used the following:
- 2 convolutional layers, each with 64 output channels
- Each convolutional layer is followed by a ReLU and a 2x2 max pool
- The convolutional blocks are followed by 2 fully connected layers
- The first fully connected layer has 80 units and the second has 10, one per class
- I used Adam as my optimizer, with a learning rate of 0.001
- I used cross-entropy loss as my loss function
- My batch size during training was 32, and I trained for 15 epochs
- My training set had 50,000 images and the validation set had 10,000 images
- A separate test set of 10,000 images was used only after training
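The layer list above can be sanity-checked by tracing the tensor shape through the network. The sketch below is plain Python and assumes 28x28 single-channel Fashion MNIST inputs. The 3x3 kernel size and padding of 1 are assumptions, since the convolution kernel size is not stated above, and `conv_out` and `trace_shapes` are hypothetical helper names.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(in_size=28, in_channels=1, conv_channels=(64, 64),
                 kernel=3, padding=1, dense_units=(80, 10)):
    """Return (shapes, parameter_count) for the conv -> ReLU -> pool blocks
    followed by the dense layers. kernel=3 and padding=1 are assumptions."""
    shapes = [(in_channels, in_size, in_size)]
    params = 0
    channels, size = in_channels, in_size
    for out_channels in conv_channels:
        # Convolution: weights plus one bias per output channel.
        params += kernel * kernel * channels * out_channels + out_channels
        size = conv_out(size, kernel, padding)
        # A 2x2 max pool with stride 2 halves the spatial size.
        size = conv_out(size, kernel=2, stride=2)
        channels = out_channels
        shapes.append((channels, size, size))
    features = channels * size * size  # flattened input to the first dense layer
    for units in dense_units:
        params += features * units + units
        features = units
        shapes.append((units,))
    return shapes, params


shapes, params = trace_shapes()
for shape in shapes:
    print(shape)
print("trainable parameters:", params)
```

Under these assumptions the flattened feature vector entering the first dense layer has 64 * 7 * 7 = 3136 values. A different kernel size or padding would change that number.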
Training and validation accuracy during training:
Per-class accuracy of the classifier on the validation and test datasets:
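As you can see, the shirt class was the hardest to classify, likely because shirts look similar to t-shirts, pullovers, and coats at this resolution, but even on that class the model did well.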
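For reference, per-class accuracy can be computed from the true labels and the predicted labels as in the minimal sketch below. The function name `per_class_accuracy` and the toy labels are made up for illustration and are not the data behind the numbers above. The class order follows the standard Fashion MNIST label indices.

```python
from collections import Counter

# Fashion MNIST label order (list index = integer label).
CLASS_NAMES = ["t-shirt", "trouser", "pullover", "dress", "coat",
               "sandal", "shirt", "sneaker", "bag", "ankle boot"]


def per_class_accuracy(labels, predictions, num_classes=10):
    """Fraction of examples of each true class that were predicted correctly."""
    total = Counter(labels)
    correct = Counter(y for y, p in zip(labels, predictions) if y == p)
    # Classes with no examples get None rather than a misleading 0.0.
    return [correct[c] / total[c] if total[c] else None
            for c in range(num_classes)]


# Toy labels and predictions, made up for illustration only.
toy_labels = [0, 0, 6, 6, 6, 6, 8, 8]
toy_preds = [0, 0, 6, 0, 2, 6, 8, 8]
accuracy = per_class_accuracy(toy_labels, toy_preds)
for name, acc in zip(CLASS_NAMES, accuracy):
    if acc is not None:
        print(f"{name}: {acc:.2f}")
```

In the toy data, two of the four shirts are predicted as other classes, so the shirt accuracy drops while the other classes stay at 1.0.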
Examples:
For each class, the left 2 images are correctly classified and the right 2 are misclassified.
t-shirt:
trouser:
pullover:
dress:
coat:
sandal:
shirt:
sneaker:
bag:
ankle boot:
Learned filters (from the 1st conv layer):
Part 2: Semantic Segmentation
Model architecture:
- My model consists solely of convolutional and ReLU layers
- It has 6 convolutional layers, each followed by a ReLU except the last
- Their output channels are 40, 160, 480, 160, 40, and 5, in that order (the final 5 are the per-class scores)
- Each layer uses a 5x5 kernel with a padding of 2, so the spatial size never changes
- My training set had 724 images and the validation set had 182
- The test set had 114 images
- During training I used a batch size of 4
- I used cross-entropy loss as my loss function
- I used Adam with a learning rate of 1e-3 and an L2 weight decay of 1e-5
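The size-preserving claim and the overall scale of this architecture can be checked with a little arithmetic. The sketch below is plain Python. The 3 input channels (RGB) and the 256x256 input size are assumptions not stated above, and `describe_stack` is a hypothetical helper name.

```python
def conv_out(size, kernel, padding, stride=1):
    """Spatial output size of one convolution layer."""
    return (size + 2 * padding - kernel) // stride + 1


def describe_stack(channels, in_channels=3, kernel=5, padding=2, in_size=256):
    """Return (output_size, receptive_field, parameter_count) for a stack of
    stride-1 convolutions. in_channels and in_size are assumptions."""
    size, receptive_field, params = in_size, 1, 0
    prev = in_channels
    for out in channels:
        size = conv_out(size, kernel, padding)
        # Each stride-1 layer widens the receptive field by (kernel - 1).
        receptive_field += kernel - 1
        params += kernel * kernel * prev * out + out  # weights + biases
        prev = out
    return size, receptive_field, params


out_size, rf, n_params = describe_stack([40, 160, 480, 160, 40, 5])
print("output size:", out_size)
print("receptive field:", rf)
print("parameters:", n_params)
```

With a 5x5 kernel and a padding of 2, the output size equals the input size at every layer. Because there is no pooling or striding, each output pixel only sees a 25x25 window of the input after 6 layers.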
Training and validation loss across iterations:
AP values:
The mean AP across the classes comes out to around 0.55
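As a reminder of what these numbers measure, here is a minimal sketch of one common (non-interpolated) definition of average precision, computed per class from binary ground-truth labels and predicted scores and then averaged over classes. The evaluation script behind the values above may use a different variant. The function name, class names, and toy data are hypothetical.

```python
def average_precision(labels, scores):
    """Non-interpolated AP: mean of the precision at each true positive,
    visiting predictions from highest to lowest score. Assumes no tied scores."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, precisions = 0, []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Toy per-pixel labels and scores for two hypothetical classes
# (made up for illustration, not the report's data).
toy = {
    "class_a": ([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]),
    "class_b": ([0, 1, 1, 0], [0.2, 0.9, 0.6, 0.4]),
}
ap = {name: average_precision(y, s) for name, (y, s) in toy.items()}
mean_ap = sum(ap.values()) / len(ap)
print(ap)
print("mean AP:", mean_ap)
```

For segmentation, each pixel counts as one example, so the labels for a class are 1 wherever the ground truth is that class and the scores are the model's output for that class at each pixel.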
My example:
The model seems to get the windows correct but misidentifies the pillars
The prediction also has some red sprinkled through it, and I am not sure why
Overall, though, it looks pretty decent