Classification and Segmentation

A CS 194-26 project 4 by Xiangtian Li cs194-26-agr

Part1: Image Classification

Implementation and Training Details

We use a network that has two convolution layer, the first conv layer has 12 channels and the second conv layer has 24 channels. Each of them has kernel size of 3 and is follow the ReLU nonlinearity and a max-pooling layer of size 2 and stride 2.

At the very end we have two fully-connected layers of size 84 and 10 respectively. We use the cross-entropy loss and the network is trained with Adam with a learning rate of 0.001. 1/6 of the training set is used as validation set and the rest is used for training.

Show some Sample Images

Training Curve

We plot the training accuracy and validation accuracy during training. As training progresses, the network seems to overfit to the training set since the accuracy on the validation set no longer increases.

Accuracy of Different Classes

As we can see in the figure, our model performs well in 5 classes, the accuracies of which are more than 90%. However, its performance on shirt is no so good, which is 76.2%.

Some Successful and Failure Samples

Visualize Filters

Here I visualize two conv layers.

Besides, I also visualize two feature maps of the two layers

Part2: Semantic Segmentation

Implementation and Training Details

For the network structure, we use UNet to complete segmentation. The structure is below.

Besides, I use Adam and SGD as two optimizers and compare their performance with the learning rate=0.0001. And the weight decay is 0.00001 and momentum is 0.9. The training epoch is 50.

Training Curve

Adam

SGD

Result

Adam

SGD

As we can see, the average AP of Adam model is 0.70 and that of SGD model is 0.62. Thus, the Adam model performs better than SGD model.

My Data Sample

Below is some buildings from San Francisco. We can see that it got fight on walls and windows but performs bad on pillars.

Image	Segmentation