Project 4 - Aditya Yadav

Project 4: Classification and Segmentation

Aditya Yadav

Overview

For the first part of this project we train a CNN to perform image classification on the Fashion MNIST dataset. For the second part we train a CNN to perform semantic segmentation on the Mini Facade dataset.

Part 1: Image Classification

These are a couple of sample images from the dataset:

The classes from left to right are BAG, PULLOVER, and COAT.

For my CNN implementation I used the following:

2 conv layers, each with 64 channels
Each conv layer is followed by a ReLU and a 2x2 max pool
The conv layers are followed by 2 fully connected layers
The first fully connected layer has 80 nodes, the second has 10
I used Adam as my optimizer, with a learning rate of 0.001
I used cross-entropy loss as my loss function
My batch size during training was 32, and I trained for 15 epochs
My training set had 50,000 images and my validation set had 10,000 images
A separate test set of 10,000 images was used after training
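
The setup listed above can be sketched in PyTorch roughly as follows. This is a minimal sketch, not my exact code: the framework, the 3x3 kernel size with padding 1, the ReLU between the two fully connected layers, and the name FashionCNN are all assumptions made for illustration, and the batch is random stand-in data rather than Fashion MNIST.

```python
import torch
import torch.nn as nn


class FashionCNN(nn.Module):
    """Two conv blocks followed by two fully connected layers."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # ASSUMPTION: 3x3 kernels with padding 1. The write-up only states
        # 64 channels per conv layer, ReLU, and a 2x2 max pool.
        self.features = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),  # 28x28 -> 14x14
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),  # 14x14 -> 7x7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 80),
            nn.ReLU(),  # ASSUMPTION: nonlinearity between the dense layers
            nn.Linear(80, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


model = FashionCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

# One optimization step on a random stand-in batch of 32 grayscale images.
images = torch.randn(32, 1, 28, 28)
labels = torch.randint(0, 10, (32,))
optimizer.zero_grad()
logits = model(images)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
```

In a full run this step would be repeated over every batch of the training set for 15 epochs, with accuracy on the validation set measured after each epoch.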

Train and validation accuracy during the training process:

Per class accuracy of classifier on the validation and test dataset:

As you can see, shirt was the hardest class to classify, but even there the model did well.
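
For reference, per-class accuracy can be computed along these lines. This is a sketch under assumptions: the helper name per_class_accuracy is hypothetical, and it expects predicted and true class indices as flat arrays.

```python
import numpy as np


def per_class_accuracy(predictions, labels, num_classes=10):
    """Fraction of samples of each true class that were predicted correctly."""
    predictions = np.asarray(predictions)
    labels = np.asarray(labels)
    accuracies = np.full(num_classes, np.nan)  # NaN for classes with no samples
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            accuracies[c] = (predictions[mask] == c).mean()
    return accuracies


# Tiny stand-in example with 3 classes.
example = per_class_accuracy([0, 0, 1, 2], [0, 1, 1, 2], num_classes=3)
```

Here class 1 has two samples and only one is predicted correctly, so its accuracy is 0.5, while classes 0 and 2 each score 1.0.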

Examples:

For each class, the 2 images on the left are correctly classified and the 2 on the right are misclassified.

t-shirt:

trouser:

pullover:

dress:

coat:

sandal:

shirt:

sneaker:

bag:

ankle boot:

Learned filters (from the 1st conv layer):

Part 2: Semantic Segmentation

Model architecture:

My model consisted solely of convolutional and ReLU layers
I had 6 convolutional layers, each followed by a ReLU, except the last
They had 40, 160, 480, 160, 40, 5 channels, in that order
I used 5x5 kernels for each of them, with a padding of 2 so the spatial size does not change
My training set consisted of 724 images, and my validation set had 182
The test set had 114 images
During training I used a batch size of 4
I used cross-entropy loss as my loss function
I used Adam with a learning rate of 1e-3 and an L2 weight_decay value of 1e-5
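
The architecture described above can be sketched in PyTorch as follows. This is a rough illustration rather than my exact code: the framework, the helper name build_facade_net, the 3-channel RGB input, and the 16x16 random stand-in batch are assumptions.

```python
import torch
import torch.nn as nn


def build_facade_net(in_channels=3, channels=(40, 160, 480, 160, 40, 5)):
    """Stack of 5x5 convs (padding 2), with a ReLU after all but the last."""
    layers = []
    prev = in_channels  # ASSUMPTION: RGB input images
    for i, out in enumerate(channels):
        # A 5x5 kernel with padding 2 keeps height and width unchanged.
        layers.append(nn.Conv2d(prev, out, kernel_size=5, padding=2))
        if i < len(channels) - 1:
            layers.append(nn.ReLU())
        prev = out
    return nn.Sequential(*layers)


net = build_facade_net()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3, weight_decay=1e-5)
criterion = nn.CrossEntropyLoss()

# Stand-in batch of 4 small images with per-pixel class labels in [0, 5).
images = torch.randn(4, 3, 16, 16)
targets = torch.randint(0, 5, (4, 16, 16))
logits = net(images)               # shape: (4, 5, 16, 16)
loss = criterion(logits, targets)  # cross entropy averaged over all pixels
```

Because no layer changes the spatial size, the output has one score per class for every input pixel, which is what lets cross-entropy loss be applied directly against the per-pixel label map.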

Training and validation loss across iterations:

AP values:

The average AP here comes out to around 0.55
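
For context, one common way to compute a per-class AP is to rank all pixels by the predicted score for that class and average the precision at each true-positive pixel. The sketch below follows that definition; it is not necessarily the exact evaluation code behind the numbers above, and the function names are hypothetical.

```python
import numpy as np


def average_precision(scores, is_positive):
    """Mean of the precision values at each positive, ranked by score."""
    scores = np.asarray(scores, dtype=float)
    hits = np.asarray(is_positive, dtype=bool)
    if not hits.any():
        return float("nan")  # AP is undefined without any positives
    order = np.argsort(-scores, kind="stable")  # highest score first
    hits = hits[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float(precision_at_k[hits].mean())


def per_class_ap(probs, labels):
    """probs: (num_classes, num_pixels) scores; labels: (num_pixels,) ints."""
    probs = np.asarray(probs)
    labels = np.asarray(labels)
    return [average_precision(probs[c], labels == c)
            for c in range(probs.shape[0])]


# Positives ranked 1st and 3rd: precisions 1/1 and 2/3, so AP = 5/6.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

The reported average is then the mean of the per-class values.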

My example:

It seems to get the windows correct but misidentifies the pillars.

It also has some red sprinkled throughout, and I am not sure why.

Overall, though, the result looks pretty decent.