Project 4: Classification and Segmentation

Zhi Chen   cs194-26-afa

Part 1: Image Classification

In this part, we built a CNN to classify the Fashion-MNIST dataset. Below is a picture of samples from this dataset. There are ten classes in total: 'T-Shirt', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot'.

The network uses only two convolutional layers. To keep training time short, I reduced the size of the layers. The resulting CNN architecture is shown below.
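To make the effect of a two-layer design concrete, here is a minimal sketch that traces how the spatial size of a 28x28 Fashion-MNIST image shrinks through two convolution and pooling stages. The kernel sizes, strides, and the presence of pooling are assumptions for illustration only; the report does not state the actual layer settings.

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# Hypothetical layer settings (name, kernel, stride, padding).
# These are NOT taken from the report; they only illustrate the arithmetic.
ASSUMED_LAYERS = [
    ("conv1", 5, 1, 0),
    ("pool1", 2, 2, 0),
    ("conv2", 5, 1, 0),
    ("pool2", 2, 2, 0),
]


def trace_shapes(input_size=28, layers=ASSUMED_LAYERS):
    """Return the spatial size after each layer, starting from input_size."""
    sizes = []
    size = input_size
    for name, kernel, stride, padding in layers:
        size = conv2d_out(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes


if __name__ == "__main__":
    for name, size in trace_shapes():
        print(f"{name}: {size}x{size}")
```

Under these assumed settings, the final feature map is small enough that the fully connected classifier after it stays cheap, which is one way a reduced layer size can shorten training.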

The results are good: within 3 minutes of training, the network reaches almost 90% test accuracy. Below are the training and test accuracy over the course of training, followed by the test accuracy for each class after training.
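Per-class accuracy can be computed by counting, for each true class, how many samples were predicted correctly. The following is a minimal sketch of that bookkeeping; the function name and the plain-list inputs are assumptions, not the code used for the project.

```python
def per_class_accuracy(labels, predictions, num_classes=10):
    """Return a list with the accuracy of each class (None if the class is absent)."""
    correct = [0] * num_classes
    total = [0] * num_classes
    for true_label, predicted in zip(labels, predictions):
        total[true_label] += 1
        if true_label == predicted:
            correct[true_label] += 1
    return [c / t if t else None for c, t in zip(correct, total)]


if __name__ == "__main__":
    # Toy example with 3 classes; class 2 never appears in the labels.
    print(per_class_accuracy([0, 0, 1, 1], [0, 1, 1, 1], num_classes=3))
```

Reporting accuracy per class rather than a single overall number makes it easier to see which classes the network confuses.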

Here are some correctly and incorrectly classified sample images for each class.

class 0:

class 1:

class 2:

class 3:

class 4:

class 5:

class 6:

class 7:

class 8:

class 9:

correct

incorrect

Here are the filters learned for each CNN layer.

First Layer

Second Layer

Part 2: Semantic Segmentation

In this part, we built a more complex CNN for semantic segmentation on the Mini Facade dataset, which labels the different parts of a building picture with one of 5 classes: "balcony", "window", "pillar", "facade", "others". Below is a pair of samples.

Here is the architecture of my CNN. I use 4 ordinary convolutional layers, each followed by a ReLU to provide the non-linearity. For each convolutional layer, I use a ConvTranspose layer to resize the network. I train with cross-entropy loss and the Adam optimizer with a learning rate of 1e-3.
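For segmentation, cross-entropy loss is typically evaluated at every pixel and then averaged. The sketch below shows that computation in plain Python for a tiny label map; the function names and the nested-list layout are assumptions for illustration, and a real training loop would rely on the framework's built-in loss instead.

```python
import math


def pixel_cross_entropy(logits, target):
    """Cross-entropy for one pixel: -log(softmax(logits)[target])."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


def mean_cross_entropy(logit_map, label_map):
    """Average the per-pixel loss over a 2D grid.

    logit_map[r][c] is a list of class scores; label_map[r][c] is a class index.
    """
    losses = [
        pixel_cross_entropy(logits, target)
        for logit_row, label_row in zip(logit_map, label_map)
        for logits, target in zip(logit_row, label_row)
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # A 1x2 "image" with 5 classes; uniform scores give a loss of log(5) per pixel.
    uniform = [0.0] * 5
    print(mean_cross_entropy([[uniform, uniform]], [[1, 3]]))
```

A network that outputs uniform scores over the 5 classes gets a loss of log(5), which is a useful baseline when judging whether training is making progress.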

Here are the results. I trained the network for 40 epochs. The accuracy is high for some classes but low for others.

I applied my network to an image of South Hall in Berkeley and obtained the result below. In the picture, the windows, the facades, and other objects are easy to recognize, but the stairs, which should be recognized as "others", show some mistakes.

A sample result from the network

Average Precision

We find that the classes "balcony" and "others" are especially prone to mistakes.
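For reference, the average precision of a single class can be sketched as follows: rank all pixels by the predicted score for that class, then average the precision measured at each true positive. This is the common non-interpolated definition and is an assumption here, since the report does not say which variant was used; the function name is hypothetical as well.

```python
def average_precision(scores, is_positive):
    """Non-interpolated average precision for one class.

    scores: predicted score for the class, one per sample.
    is_positive: truthy where the sample truly belongs to the class.
    """
    ranked = sorted(zip(scores, is_positive), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, positive) in enumerate(ranked, start=1):
        if positive:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0


if __name__ == "__main__":
    # Positives ranked 1st and 3rd: AP = (1/1 + 2/3) / 2
    print(average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))
```

A class whose positives are pushed down the ranking by confident false positives gets a low score under this measure, which matches the kind of mistakes described for the weaker classes.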