CS 194-26: Image Manipulation, Computer Vision and Computational Photography

Project 4: Classification and Segmentation

Bryant Le: cs194-26-acu

Part 1: Image Classification

I used the Fashion-MNIST dataset to train a model that classifies clothing images into 10 classes. The architecture was Convolution Layer -> Max Pooling Layer -> ReLU -> Convolution Layer -> Max Pooling Layer -> ReLU -> Fully Connected -> Fully Connected -> Output (10). I used cross-entropy as the loss function and Adam with a learning rate of 0.01 as the optimizer.
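To make the layer stack above more concrete, here is a minimal sketch of how the spatial size of a 28x28 Fashion-MNIST image shrinks through the two Convolution -> Max Pooling -> ReLU blocks. The kernel size of 3, the lack of padding, the 2x2 pooling window, and the 32 output channels are assumptions for illustration only; the write-up does not state the values I actually used.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution (floor division, as in most frameworks)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2):
    """Spatial size after non-overlapping max pooling."""
    return size // window


def trace_sizes(input_size=28, kernel=3, window=2, blocks=2):
    """Return the spatial size at the input and after each conv and pool stage.

    ReLU is elementwise, so it does not change the size and is not listed.
    kernel=3 and window=2 are assumed values, not taken from the write-up.
    """
    sizes = [input_size]
    size = input_size
    for _ in range(blocks):
        size = conv_out(size, kernel)
        sizes.append(size)
        size = pool_out(size, window)
        sizes.append(size)
    return sizes


sizes = trace_sizes()
print(sizes)  # [28, 26, 13, 11, 5]

# Number of inputs to the first fully connected layer, assuming
# (hypothetically) 32 channels come out of the second convolution.
assumed_channels = 32
flat_features = sizes[-1] ** 2 * assumed_channels
print(flat_features)  # 800
```

Under these assumptions the first fully connected layer would take 800 inputs; a different kernel size or padding changes that number, which is why it helps to trace the sizes before wiring up the fully connected layers.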
Here is the architecture of the model, overall accuracy, and some of the learned filters.
Here are the accuracies for each class.
Here is the accuracy and loss during training.
Here are some correct and incorrect classifications from each class.
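For reference, per-class accuracy like the numbers reported above can be computed by counting, for each true class, how many of its samples were predicted correctly. This is a small sketch in plain Python rather than the code I ran; the function name `per_class_accuracy` and the toy labels are made up for illustration.

```python
from collections import Counter


def per_class_accuracy(labels, predictions, num_classes=10):
    """Return a list where entry c is the fraction of class-c samples predicted as c.

    Classes with no samples get None instead of a number.
    """
    correct = Counter()
    total = Counter()
    for y, p in zip(labels, predictions):
        total[y] += 1
        if y == p:
            correct[y] += 1
    return [correct[c] / total[c] if total[c] else None
            for c in range(num_classes)]


# Toy example with 3 classes (not real results).
toy_labels = [0, 0, 1, 1, 1, 2]
toy_predictions = [0, 1, 1, 1, 0, 2]
toy_accuracy = per_class_accuracy(toy_labels, toy_predictions, num_classes=3)
print(toy_accuracy)
```

In the toy example, class 0 gets 1 of 2 right, class 1 gets 2 of 3 right, and class 2 gets its single sample right.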

Part 2: Semantic Segmentation

I experimented a lot with the architecture of my model and settled on 6 layers. Every layer except the last consists of a convolution with a kernel size of 3, followed by normalization and ReLU. The second-to-last layer also uses max pooling. The channel progression was 3 -> 16 -> 32 -> 64 -> 32 -> 16 -> 5. I had an average AP of 0.46!
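As a rough size check on the channel progression above, the number of convolution weights can be worked out directly. This sketch assumes every one of the 6 layers, including the last, is a convolution with a 3x3 kernel and a bias term; it ignores any parameters in the normalization layers, so the real model would have somewhat more.

```python
def conv_params(c_in, c_out, kernel=3):
    """Weights plus biases of one 2D convolution with a square kernel."""
    return kernel * kernel * c_in * c_out + c_out


# Channel progression from the write-up: 3 -> 16 -> 32 -> 64 -> 32 -> 16 -> 5.
channels = [3, 16, 32, 64, 32, 16, 5]

# One entry per layer, pairing each channel count with the next one.
per_layer = [conv_params(a, b) for a, b in zip(channels, channels[1:])]
total_params = sum(per_layer)

print(per_layer)     # [448, 4640, 18496, 18464, 4624, 725]
print(total_params)  # 47397
```

Most of the parameters sit in the two middle layers (32 -> 64 and 64 -> 32), so those are the layers where changing the channel width affects model size the most.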

Hyperparameters

For the loss function, I used cross-entropy loss. For the optimizer, I used Adam with a learning rate of 0.001 and a weight decay of 0.00001. I trained for 15 epochs with a batch size of 10.
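To show what the learning rate and weight decay above do, here is a minimal sketch of a single Adam update on one scalar parameter. It assumes the weight decay is applied as an L2 term added to the gradient, and it uses beta1 = 0.9, beta2 = 0.999 and eps = 1e-8, which are common defaults and not values stated in this write-up. The function `adam_step` is a hypothetical helper, not part of any library.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.001, weight_decay=0.00001,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar parameter.

    m and v are the running first and second moment estimates,
    and t is the 1-based step count used for bias correction.
    """
    # Assumed L2-style weight decay: pull the parameter toward zero.
    grad = grad + weight_decay * param
    # Update the exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias-correct the averages, which start at zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# First step from param = 1.0 with gradient 0.5 (toy numbers).
new_param, m, v = adam_step(1.0, 0.5, m=0.0, v=0.0, t=1)
print(new_param)  # close to 0.999: the first step moves by about lr
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly the learning rate regardless of the gradient's magnitude.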

Demonstration

Here is an example of my model's segmentation of the following image, compared with the expected result.