CS 194-26 Project 4

Overview

This project involved classification of images in the Fashion-MNIST dataset and semantic segmentation of images in the Mini Facade dataset using deep convolutional neural networks with PyTorch and Google Colab.

Part 1: Image Classification

In this part images of clothing from 10 classes are categorized. 60,000 training and 10,000 test images are used from the FashionMNIST dataset.

Sampling from the FashionMNIST Dataset

classes = { 0:'T-shirt/top', 1:'Trouser', 2:'Pullover', 3:'Dress', 4:'Coat', 5:'Sandal', 6:'Shirt', 7:'Sneaker', 8:'Bag', 9:'Ankle boot' }

The architecture of the neural network is 2 convolutional layers, 32 channels each, each followed by a ReLU nonlinearity followed by a maxpool. This is followed by 2 fully connected networks with ReLU after the first fc layer.

Training the CNN

The CNN was trained over five epochs with Adam optimizer using learning rate 1e-3 and weight decay 1e-5. Cross entropy loss was used as the prediction loss.

Results

The CNN was performed the worst on class 6, the shirt class, with accuracy 74%.

Examples of Correctly Labeled Images

Examples of Incorrectly Labeled Images

Visualizing Learned Filters

Part 2: Semantic Segmentation

In this part semantic segmentation is performed on images of architecture from around the world from the Mini Facade dataset. Every pixel of each image is labeled with one of five classes: balcony, window, pillar, facade and others.

CNN Architecture

Conv2d(3, 5, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))

output_shape: [[1, 5, 256, 256]]

ReLU() output_shape: [[1, 5, 256, 256]]

Conv2d(5, 5, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))

output_shape: [[1, 5, 256, 256]]

ReLU() output_shape: [[1, 5, 256, 256]]

MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)

output_shape": [[1, 5, 128, 128], [1, 5, 128, 128]]

Conv2d(5, 5, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))

output_shape": [[1, 5, 128, 128]]

ReLU() output_shape": [[1, 5, 128, 128]]

Conv2d(5, 5, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))

output_shape": [[1, 5, 128, 128]]

ReLU() output_shape": [[1, 5, 128, 128]]

MaxUnpool2d(kernel_size=(2, 2), stride=(2, 2), padding=(0, 0))

output_shape": [[1, 5, 128, 128]]

Conv2d(5, 5, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))

output_shape": [[1, 5, 128, 128]]

ReLU()

output_shape": [[1, 5, 128, 128]]

Conv2d(5, 5, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) output_shape": [[1, 5, 128, 128]]

ReLU()

output_shape": [[1, 5, 128, 128]]

Conv2d(5, 5, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) output_shape": [[1, 5, 256, 256]]

ReLU()

output_shape": [[1, 5, 256, 256]]

Conv2d(5, 5, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) output_shape": [[1, 5, 256, 256]]

The first and second parameters of Conv2d layers are number of features in and number of features out.

Hyperparameters

The model used cross entropy loss, Adam optimizer with learning rate 0.0009841 and weight decay 0.000005803. To find the learning rate and weight decay, I used Weights and Biases to sweep over possible hyperparameters. The hyperparameter combination resulted in the lowest validation loss of 0.9900442345456762 and highest average precision 0.4318011398682467.

All Runs: https://app.wandb.ai/qwerty321/proj4
Best Run: https://app.wandb.ai/qwerty321/proj4/runs/2la3f43f/overview?workspace=user-qwerty321

Training


lr = 0.0009841, weight decay = 5.803e-06

Start training
-----------------Epoch = 1-----------------
[epoch 1] loss: 1.074 elapsed time 19.803
1.2475201530115945
-----------------Epoch = 2-----------------
[epoch 2] loss: 1.099 elapsed time 19.730
1.184222888488036
-----------------Epoch = 3-----------------
[epoch 3] loss: 1.104 elapsed time 19.824
1.1611246067089038
-----------------Epoch = 4-----------------
[epoch 4] loss: 1.085 elapsed time 19.724
1.1474093188951304
-----------------Epoch = 5-----------------
[epoch 5] loss: 1.039 elapsed time 19.729
1.1315813195574416
-----------------Epoch = 6-----------------
[epoch 6] loss: 0.995 elapsed time 19.753
1.1094945272901555
-----------------Epoch = 7-----------------
[epoch 7] loss: 0.929 elapsed time 19.822
1.0924140545693073
-----------------Epoch = 8-----------------
[epoch 8] loss: 0.913 elapsed time 19.731
1.0763650487412464
-----------------Epoch = 9-----------------
[epoch 9] loss: 0.896 elapsed time 19.782
1.061426763678645
-----------------Epoch = 10-----------------
[epoch 10] loss: 0.871 elapsed time 19.751
1.0424758926197724
-----------------Epoch = 11-----------------
[epoch 11] loss: 0.814 elapsed time 19.776
1.0196489950457772
-----------------Epoch = 12-----------------
[epoch 12] loss: 0.808 elapsed time 19.840
1.021347157575272
-----------------Epoch = 13-----------------
[epoch 13] loss: 0.791 elapsed time 19.782
1.0112490899615236
-----------------Epoch = 14-----------------
[epoch 14] loss: 0.778 elapsed time 19.806
1.0023106788540934
-----------------Epoch = 15-----------------
[epoch 15] loss: 0.804 elapsed time 19.724
1.0026988318333259
-----------------Epoch = 16-----------------
[epoch 16] loss: 0.788 elapsed time 19.740
0.9974784284502596
-----------------Epoch = 17-----------------
[epoch 17] loss: 0.756 elapsed time 19.792
0.9924349172429724
-----------------Epoch = 18-----------------
[epoch 18] loss: 0.773 elapsed time 19.760
0.9855036938583458
-----------------Epoch = 19-----------------
[epoch 19] loss: 0.785 elapsed time 19.821
0.9867260564159561
-----------------Epoch = 20-----------------
[epoch 20] loss: 0.763 elapsed time 19.797
0.9813663579605438
-----------------Epoch = 21-----------------
[epoch 21] loss: 0.767 elapsed time 19.840
0.9819232621690729
-----------------Epoch = 22-----------------
[epoch 22] loss: 0.763 elapsed time 19.755
0.9804798403939048
-----------------Epoch = 23-----------------
[epoch 23] loss: 0.764 elapsed time 19.723
0.9735997528820247
-----------------Epoch = 24-----------------
[epoch 24] loss: 0.756 elapsed time 19.804
0.9686579458661132
-----------------Epoch = 25-----------------
[epoch 25] loss: 0.756 elapsed time 19.747
0.9665454098811517
-----------------Epoch = 26-----------------
[epoch 26] loss: 0.757 elapsed time 19.813
0.9643744079621284
-----------------Epoch = 27-----------------
[epoch 27] loss: 0.753 elapsed time 19.764
0.9617108489785876
-----------------Epoch = 28-----------------
[epoch 28] loss: 0.754 elapsed time 19.786
0.9610359177484618
-----------------Epoch = 29-----------------
[epoch 29] loss: 0.747 elapsed time 19.750
0.9577845488930796
-----------------Epoch = 30-----------------
[epoch 30] loss: 0.749 elapsed time 19.783
0.957324892937482
-----------------Epoch = 31-----------------
[epoch 31] loss: 0.757 elapsed time 19.783
0.9622196610812302
-----------------Epoch = 32-----------------
[epoch 32] loss: 0.746 elapsed time 19.735
0.9562080554909759
-----------------Epoch = 33-----------------
[epoch 33] loss: 0.750 elapsed time 19.747
0.9616343205446726
-----------------Epoch = 34-----------------
[epoch 34] loss: 0.758 elapsed time 19.708
0.9653705122706654
-----------------Epoch = 35-----------------
[epoch 35] loss: 0.760 elapsed time 19.849
0.9668513670727447
-----------------Epoch = 36-----------------
[epoch 36] loss: 0.745 elapsed time 19.862
0.9569179949524639
-----------------Epoch = 37-----------------
[epoch 37] loss: 0.742 elapsed time 19.775
0.9566913279858265
-----------------Epoch = 38-----------------
[epoch 38] loss: 0.736 elapsed time 19.724
0.9529203235448062
-----------------Epoch = 39-----------------
[epoch 39] loss: 0.751 elapsed time 19.768
0.9600898954239521
-----------------Epoch = 40-----------------
[epoch 40] loss: 0.752 elapsed time 19.778
0.9593775049670712
-----------------Epoch = 41-----------------
[epoch 41] loss: 0.753 elapsed time 19.762
0.9571986391649141
-----------------Epoch = 42-----------------
[epoch 42] loss: 0.751 elapsed time 19.794
0.9560067801030128
-----------------Epoch = 43-----------------
[epoch 43] loss: 0.732 elapsed time 19.721
0.9498448755059924
-----------------Epoch = 44-----------------
[epoch 44] loss: 0.738 elapsed time 19.815
0.949612496645896
-----------------Epoch = 45-----------------
[epoch 45] loss: 0.740 elapsed time 19.723
0.9521561438565725

Finished Training, Testing on test set
0.9656797733746076

Generating Unlabeled Result

AP = 0.6303548205778786
AP = 0.7219391347969041
AP = 0.048428428434867055
AP = 0.7036956574739648
AP = 0.10522684861314587
Average AP on test set: 0.442

Results

The algorithm model correctly labels the facade and windows but fails to label the remaining classes (others, pillar, and balcony).

Original

Output

Ground Truth

Original

Output

Ground Truth

References

https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html https://colab.research.google.com/drive/1B5KQvPySqYEa6XicRHdOwgv8fN1BrCgQ#scrollTo=4zsDWUk5cavn https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html https://towardsdatascience.com/visualizing-convolution-neural-networks-using-pytorch-3dfa8443e74e https://github.com/Airconaaron/blog_post_visualizing_pytorch_cnn/blob/master/Visualizing%20Learned%20Filters%20in%20PyTorch.ipynb https://machinelearningmastery.com/how-to-visualize-filters-and-feature-maps-in-convolutional-neural-networks/