Classification and Segmentation

Part 1: Image Classification

In this part we use FashionMNIST as the input dataset to train a classification model. Here are some samples from the dataset.

The architecture of my model is conv->relu->maxpool->conv->relu->maxpool->fc->relu->fc.
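As a rough sketch, the layer sequence above could be written in PyTorch as follows. The class name `FashionNet`, the channel counts, the kernel sizes, and the hidden width are illustrative assumptions; the report only fixes the order of the layers.

```python
import torch
import torch.nn as nn


class FashionNet(nn.Module):
    """conv->relu->maxpool->conv->relu->maxpool->fc->relu->fc for 1x28x28 inputs."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Assumed values: 32 channels, 5x5 kernels, 120 hidden units.
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5),   # 28x28 -> 24x24
            nn.ReLU(),
            nn.MaxPool2d(2),                   # 24x24 -> 12x12
            nn.Conv2d(32, 32, kernel_size=5),  # 12x12 -> 8x8
            nn.ReLU(),
            nn.MaxPool2d(2),                   # 8x8 -> 4x4
        )
        self.classifier = nn.Sequential(
            nn.Linear(32 * 4 * 4, 120),
            nn.ReLU(),
            nn.Linear(120, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)  # keep the batch dimension
        return self.classifier(x)


model = FashionNet()
logits = model(torch.zeros(8, 1, 28, 28))  # a dummy batch of 8 grayscale images
```

The final layer outputs raw logits, one per FashionMNIST class, so it pairs naturally with a cross-entropy loss.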

Here are the accuracy curves of my model.

Here is the per-class accuracy of my classifier. We can see that shirts are the hardest class to classify correctly.

class correct wrong
T-shirt
Trouser
Pullover
Dress
Coat
Sandal
Shirt
Sneaker
Bag
Ankle boot
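A table like the one above can be filled in by counting, for each true class, how many test samples were predicted correctly and how many were not. The sketch below uses plain Python; the function name `per_class_counts` and the toy labels are hypothetical and are not results from this report.

```python
from collections import Counter

CLASSES = ["T-shirt", "Trouser", "Pullover", "Dress", "Coat",
           "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]


def per_class_counts(labels, predictions, num_classes=len(CLASSES)):
    """Return a list of (correct, wrong) pairs, indexed by true class."""
    correct = Counter()
    wrong = Counter()
    for y, p in zip(labels, predictions):
        if y == p:
            correct[y] += 1
        else:
            wrong[y] += 1
    return [(correct[c], wrong[c]) for c in range(num_classes)]


# Toy data for illustration only.
toy_labels = [0, 0, 6, 6, 6, 1]
toy_preds = [0, 6, 6, 0, 2, 1]
counts = per_class_counts(toy_labels, toy_preds)
for name, (c, w) in zip(CLASSES, counts):
    if c + w:
        print(f"{name:10s} correct={c} wrong={w} acc={c / (c + w):.2f}")
```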

Here is a visualization of the filters from the first convolutional layer.
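One way to produce such a visualization is to normalize each filter to the range [0, 1] and tile the filters into a single image. The sketch below uses NumPy only; the helper `filters_to_grid`, the grid layout, and the random stand-in weights are assumptions for illustration, not the code used for the figure.

```python
import numpy as np


def filters_to_grid(weights, cols=8, pad=1):
    """Tile filters of shape (N, H, W) into one 2-D image.

    Each filter is min-max normalized to [0, 1] independently.
    """
    n, h, w = weights.shape
    rows = -(-n // cols)  # ceiling division
    grid = np.zeros((rows * (h + pad) + pad, cols * (w + pad) + pad))
    for i, f in enumerate(weights):
        lo, hi = f.min(), f.max()
        f = (f - lo) / (hi - lo) if hi > lo else np.zeros_like(f)
        r, c = divmod(i, cols)
        top = pad + r * (h + pad)
        left = pad + c * (w + pad)
        grid[top:top + h, left:left + w] = f
    return grid


# Random weights stand in for trained first-layer filters (32 filters of 5x5 assumed).
rng = np.random.default_rng(0)
fake_filters = rng.normal(size=(32, 5, 5))
grid = filters_to_grid(fake_filters)
```

The resulting array can be shown as a grayscale image with any plotting library, with the trained first-layer weights passed in place of `fake_filters`.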

Part 2: Semantic Segmentation

The architecture of my model is:

Training hyperparameters: lr=1e-3, weight_decay=1e-5, batch_size=1, epochs=30
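For reference, these hyperparameters could be wired into a PyTorch training step roughly as follows. The report does not name the optimizer, the loss, or the number of classes, so Adam, cross-entropy, five output classes, and the single-layer stand-in network are all assumptions.

```python
import torch
import torch.nn as nn

# Hyperparameters as listed in the report.
LR, WEIGHT_DECAY, BATCH_SIZE, EPOCHS = 1e-3, 1e-5, 1, 30

# Hypothetical stand-in for the segmentation network (5 classes assumed).
net = nn.Conv2d(3, 5, kernel_size=3, padding=1)

# Adam and cross-entropy are assumptions; the report does not specify them.
optimizer = torch.optim.Adam(net.parameters(), lr=LR, weight_decay=WEIGHT_DECAY)
criterion = nn.CrossEntropyLoss()

# One illustrative step on random data shaped like a single 256x256 RGB image.
images = torch.randn(BATCH_SIZE, 3, 256, 256)
targets = torch.randint(0, 5, (BATCH_SIZE, 256, 256))
optimizer.zero_grad()
loss = criterion(net(images), targets)  # per-pixel classification loss
loss.backward()
optimizer.step()
```

In a full run, this step would be repeated over every image in the training set for each of the 30 epochs.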

And the accuracy curve is:

The average AP is:
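As a reminder of what this metric measures, the sketch below computes AP per class from per-pixel scores and then averages over classes. It follows one common definition (mean of the precision at the rank of each positive pixel) and ignores tied scores; the function names are hypothetical, and the evaluation code used for this report may differ.

```python
import numpy as np


def average_precision(scores, is_positive):
    """AP for one class: mean precision at the rank of each positive sample."""
    scores = np.asarray(scores, dtype=float).ravel()
    is_positive = np.asarray(is_positive, dtype=bool).ravel()
    if is_positive.sum() == 0:
        return float("nan")  # AP is undefined when the class never occurs
    order = np.argsort(-scores, kind="stable")  # highest score first
    hits = is_positive[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return float(precision_at_k[hits].mean())


def mean_ap(class_scores, labels):
    """class_scores: (C, N) per-pixel scores; labels: (N,) integer class ids."""
    class_scores = np.asarray(class_scores, dtype=float)
    labels = np.asarray(labels)
    aps = [average_precision(class_scores[c], labels == c)
           for c in range(class_scores.shape[0])]
    return float(np.nanmean(aps)), aps


# Toy example: positives ranked 1st and 3rd out of four pixels.
toy_ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

In the toy example the precision is 1/1 at the first positive and 2/3 at the second, so the AP is their mean, 5/6.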

Here are some results on my own collection of images (the inputs have been resized to 256×256):

original output

We can see that facade (blue) and window (orange) are the easiest classes to segment, while pillar (green) and balcony (red) are the hardest.