Project 4

Jeffrey Huo

Part 1: Image Classification

Below are a few sample images from the Fashion-MNIST dataset, shown with their labels.

Here is the structure of the convolutional neural network. I trained the network for six epochs with a learning rate of 0.0015.

These are the training and validation accuracies over the course of training, by epoch.

These are the per-class results, along with samples of correct and incorrect classifications for each class.
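As a minimal sketch of how per-class results like these could be computed, the snippet below counts correct predictions for each class. The function name `per_class_accuracy` and the toy label lists are illustrative assumptions, not the code or data used in this project.

```python
from collections import defaultdict


def per_class_accuracy(true_labels, predicted_labels):
    """Return {class: fraction of that class's samples predicted correctly}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return {cls: correct[cls] / total[cls] for cls in total}


# Toy example with three hypothetical classes (0, 1, 2); not real results.
true_labels = [0, 0, 1, 1, 2, 2]
predicted_labels = [0, 1, 1, 1, 2, 0]
accuracy_by_class = per_class_accuracy(true_labels, predicted_labels)
```

Grouping by the true label also makes it easy to collect the indices of correctly and incorrectly classified samples for each class.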

These are the learned filters from the first layer of the CNN.

Part 2: Semantic Segmentation

Here is the structure of the convolutional neural network used for this part. I trained the network for 25 epochs with a learning rate of 1e-3 and a weight decay of 1e-5.
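To illustrate what the weight decay term does, here is a minimal sketch of one parameter update. It assumes plain SGD with L2-style weight decay, which may not be the optimizer actually used here; the function name `sgd_step` and the toy weights are hypothetical.

```python
LEARNING_RATE = 1e-3   # learning rate reported above
WEIGHT_DECAY = 1e-5    # weight decay reported above


def sgd_step(weights, grads, lr=LEARNING_RATE, weight_decay=WEIGHT_DECAY):
    """One plain-SGD update (an assumed optimizer) with L2-style weight decay.

    The decay term adds weight_decay * w to each gradient, so every weight
    is pulled slightly toward zero on each step.
    """
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


# With zero gradients, only the decay term changes the weights.
weights = [1.0, -2.0]
grads = [0.0, 0.0]
updated = sgd_step(weights, grads)
```

With these values each step shrinks a weight by a factor of about 1 - 1e-8, so the decay acts as a mild regularizer rather than a dominant term.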

Below is the average precision (AP) achieved by this network. Comparing the numbers across classes, we can see that the CNN identifies pillars poorly but does reasonably well on the other classes.
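For reference, here is a minimal sketch of one common way to compute average precision for a single class: rank the predictions by score and average the precision at the rank of each positive. This non-interpolated definition, the function name `average_precision`, and the toy scores are assumptions for illustration; the evaluation code used for the numbers in this report may differ.

```python
def average_precision(scores, is_positive):
    """Non-interpolated AP: mean of the precision at each positive's rank.

    Ties in score are broken by input order (Python's sort is stable).
    """
    ranked = sorted(zip(scores, is_positive), key=lambda pair: pair[0], reverse=True)
    num_positives = sum(1 for _, positive in ranked if positive)
    if num_positives == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, (_, positive) in enumerate(ranked, start=1):
        if positive:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / num_positives


# Toy scores for one hypothetical class; True marks samples of that class.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
is_positive = [True, False, True, False, True]
ap = average_precision(scores, is_positive)
```

For segmentation, the same calculation would be applied per class, treating each pixel's predicted score for that class as one entry in `scores`.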

Below are the training and validation losses over the course of training, by epoch.

Below I tested the network on an additional image. The CNN identifies most of the windows but fails on other parts of the building, such as the pillars and the "other" category. This is most likely because the image differs greatly in appearance from those in the training set.