Programming Project #4 (proj4)
CS194-26: Image Manipulation, Computer Vision and Computational Photography

Due Date: 11:59pm on Monday, Mar 30, 2020 [START EARLY]

 

Classification and Segmentation

 


In this project, we will classify images from the Fashion-MNIST dataset and perform semantic segmentation of images from the Mini Facade dataset using deep nets! You may use PyTorch, TensorFlow, or any other deep learning framework you like. The implementation details focus primarily on PyTorch, but you can look up the equivalent functions in TensorFlow.

You can run your code on Google Colab or on your own GPU. If you choose Colab, after creating a new notebook go to Runtime --> Change runtime type and set the hardware accelerator to GPU. Note that a Colab session has an idle timeout of 90 minutes and an absolute timeout of 12 hours, so please download your results and trained models frequently. Make sure to START EARLY if you are not familiar with Colab; we will not extend the due date for problems caused by that. Once you are done, export your code from Colab as a .py file and submit it to bCourses.
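As a quick sanity check that the GPU runtime is actually active, something like the following could be run in the first notebook cell. This is a minimal sketch assuming PyTorch; `pick_device` is a hypothetical helper name, not part of any starter code, and it falls back to the CPU when no GPU (or no PyTorch install) is found.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # may be missing outside of Colab
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
# Typical use afterwards: model.to(device) and batch.to(device)
```

If this prints `cpu` on Colab, the hardware accelerator setting has most likely not been changed to GPU yet.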

Part 1: Image Classification

We will use the Fashion-MNIST dataset, available as torchvision.datasets.FashionMNIST, for training our model. Fashion-MNIST has 10 classes, with 60000 train + validation images and 10000 test images.
[Figure: Fashion-MNIST]
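One way to carve a validation set out of the 60000 train + validation images is sketched below. The 10000-image validation size, the seed, the `./data` root, and the helper names `split_indices` and `load_fashion_mnist` are all assumptions made for illustration; the assignment does not prescribe a particular split. The index helper is plain Python, and the loader wraps the result in `torch.utils.data.Subset`.

```python
import random


def split_indices(n_total, n_val, seed=0):
    """Shuffle range(n_total) and return (train_indices, val_indices)."""
    if not 0 <= n_val <= n_total:
        raise ValueError("n_val must be between 0 and n_total")
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # seeded, so the split is reproducible
    return indices[n_val:], indices[:n_val]


def load_fashion_mnist(root="./data", n_val=10000, seed=0):
    """Return (train, val, test) datasets; downloads the data on first call."""
    # Imported here so split_indices stays usable without PyTorch installed.
    from torch.utils.data import Subset
    from torchvision import datasets, transforms

    to_tensor = transforms.ToTensor()  # PIL image -> float tensor in [0, 1]
    full = datasets.FashionMNIST(root, train=True, download=True, transform=to_tensor)
    test = datasets.FashionMNIST(root, train=False, download=True, transform=to_tensor)
    train_idx, val_idx = split_indices(len(full), n_val, seed)
    return Subset(full, train_idx), Subset(full, val_idx), test


# Index-only demo (no download): assumed 50000 / 10000 split.
train_idx, val_idx = split_indices(60000, 10000)
print(len(train_idx), len(val_idx))  # 50000 10000
```

The three datasets returned by `load_fashion_mnist()` can then be wrapped in `torch.utils.data.DataLoader` objects for batching. Keeping the test set out of the split means it is only touched for the final evaluation.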

Part 2: Semantic Segmentation

[Figure: example input, output, and color map]

Semantic segmentation refers to labeling each pixel in an image with its correct object class. We will use the Mini Facade dataset. The starter code and data for this part are available here. The Mini Facade dataset consists of images from different cities around the world in diverse architectural styles (in .jpg format), shown as the image on the left. It also contains semantic segmentation labels (in .png format) for 5 classes: balcony, window, pillar, facade, and others. Your task is to train a network to convert the image on the left into the labels on the right. The color-to-label map is shown in the table below.
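If a color-coded label image ever needs to be turned into per-pixel class indices (the form a per-pixel classification loss typically expects), the lookup can be sketched as follows. The class names come from the text above, but their index order and the RGB values in `PLACEHOLDER_PALETTE` are made-up placeholders, not the assignment's actual color map, and `rgb_to_class_ids` is a hypothetical helper; the starter code may already handle this step differently.

```python
# Class names are from the assignment text; the index order and the RGB
# values are PLACEHOLDERS. Substitute the assignment's real color table.
CLASS_NAMES = ["balcony", "window", "pillar", "facade", "others"]
PLACEHOLDER_PALETTE = [
    (0, 0, 0),
    (10, 10, 10),
    (20, 20, 20),
    (30, 30, 30),
    (40, 40, 40),
]


def rgb_to_class_ids(pixels, palette):
    """Map a 2D grid of (r, g, b) tuples to a 2D grid of class indices.

    Raises KeyError if a pixel color does not appear in the palette.
    """
    lookup = {tuple(color): index for index, color in enumerate(palette)}
    return [[lookup[tuple(pixel)] for pixel in row] for row in pixels]


# A 2x2 "label image" built from the placeholder colors.
tiny_label_image = [
    [(0, 0, 0), (30, 30, 30)],
    [(40, 40, 40), (10, 10, 10)],
]
class_ids = rgb_to_class_ids(tiny_label_image, PLACEHOLDER_PALETTE)
print(class_ids)  # [[0, 3], [4, 1]]
```

For full-size images a vectorized array version would be much faster; the nested-list form is used here only to keep the sketch dependency-free.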

Resources

[1] Introductory PyTorch Tutorial
[2] Tutorial on writing a custom Dataloader
[3] Example of a neural network class
[4] Google Colab Tutorial (using Colab should be very similar to using an IPython notebook)

Acknowledgements

This assignment has been adapted from David Fouhey's Computer Vision course at the University of Michigan.