from IPython.display import HTML
HTML("""
<style>
div.cell { /* Tunes the space between cells */
margin-top:1em;
margin-bottom:1em;
}
div.text_cell_render h1 { /* Main titles bigger, centered */
font-size: 2.2em;
line-height:0.9em;
}
div.text_cell_render h2 { /* Parts names nearer from text */
margin-bottom: -0.4em;
}
div.text_cell_render { /* Customize text cells */
font-family: 'Georgia';
font-size:1.2em;
line-height:1.4em;
padding-left:3em;
padding-right:3em;
}
.output_png {
display: table-cell;
text-align: center;
vertical-align: middle;
}
</style>
<script>
code_show=true;
function code_toggle() {
if (code_show){
$('div.input').hide();
} else {
$('div.input').show();
}
code_show = !code_show
}
$( document ).ready(code_toggle);
</script>
The raw code for this IPython notebook is by default hidden for easier reading.
To toggle on/off the raw code, click <a href="javascript:code_toggle()">here</a>.
""")
import numpy as np
import matplotlib.pyplot as plt
import skimage.io as io
import skimage as sk
%matplotlib inline
In this part, we trained a convolutional neural network to classify the images in the FashionMNIST dataset.
We use the recommended architecture as a starting point: conv layers with 32 channels each and 3×3 filters, where each conv layer is followed by a ReLU and a 2×2 max pool. This is followed by two fully connected layers, with a ReLU applied after the first. We used the Adam optimizer with a learning rate of 1e-3 and a batch size of 100.
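A minimal PyTorch sketch of this architecture. The text does not specify the number of conv layers or the width of the first fully connected layer, so this sketch assumes two conv layers, `padding=1`, and a hidden width of 128:

```python
import torch
import torch.nn as nn

class FashionCNN(nn.Module):
    def __init__(self, n_classes=10, hidden=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),  # 28x28 -> 28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                             # -> 14x14
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # -> 7x7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 7 * 7, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = FashionCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
out = model(torch.randn(100, 1, 28, 28))  # one batch of size 100
```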
From the learning curve plotted below, we can see that training converges after the first 20 epochs, after which the model slowly starts to overfit.
lc = io.imread('learning_curve.png')
f=plt.figure(figsize=(15,15))
ax=f.add_subplot(1,1,1)
ax.imshow(lc)
plt.axis('off')
plt.show()
We show correctly classified images for each category:
lc = io.imread('imgs1.png')
f=plt.figure(figsize=(15,15))
ax=f.add_subplot(1,1,1)
ax.imshow(lc)
plt.axis('off')
plt.show()
We show incorrectly classified images for each category:
lc = io.imread('imgs2.png')
f=plt.figure(figsize=(15,15))
ax=f.add_subplot(1,1,1)
ax.imshow(lc)
plt.axis('off')
plt.show()
We show the per-class accuracy of our classifier on the validation and test datasets, from which we can see that the Shirt class is the hardest to get right.
Class | Val acc. | Test Acc. |
---|---|---|
TShirt | 85% | 83% |
Trouser | 98% | 99% |
Pullover | 85% | 86% |
Dress | 92% | 91% |
Coat | 86% | 85% |
Sandal | 97% | 98% |
Shirt | 75% | 72% |
Sneaker | 96% | 96% |
Bag | 98% | 98% |
Ankle Boot | 97% | 97% |
— | — | — |
TOTAL | 91% | 90% |
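The per-class numbers above come from comparing predictions against ground-truth labels within each class; a minimal NumPy sketch (the helper name and the toy labels are illustrative, not the actual evaluation data):

```python
import numpy as np

def per_class_accuracy(y_true, y_pred, n_classes):
    # Fraction of samples of each true class that were predicted correctly.
    accs = np.full(n_classes, np.nan)
    for c in range(n_classes):
        mask = (y_true == c)
        if mask.any():
            accs[c] = np.mean(y_pred[mask] == c)
    return accs

y_true = np.array([0, 0, 1, 1, 1, 2])
y_pred = np.array([0, 1, 1, 1, 0, 2])
accs = per_class_accuracy(y_true, y_pred, 3)  # class 1 gets 2 of 3 right
```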
lc = io.imread('filter.png')
f=plt.figure(figsize=(15,15))
ax=f.add_subplot(1,1,1)
ax.imshow(lc)
plt.axis('off')
plt.show()
We use the Mini Facade dataset to train a ConvNet for semantic segmentation, i.e., labeling each pixel in the image with its correct object class. The Mini Facade dataset consists of images of buildings from different cities around the world, in diverse architectural styles, together with semantic segmentation labels (in .png format) for 5 classes: balcony, window, pillar, facade, and others.
The network consists of 7 conv layers, with 64, 128, 256, 512, 512, 1024, and 5 channels respectively, each followed by a ReLU. The filters are all 3×3 with zero padding. We add two 2×2 max pooling layers, after the 5th and 6th conv layers. At the end, we upsample the output to match the size of the original image. We use the Adam optimizer with a 1e-3 learning rate and 1e-5 weight decay. The batch size is 10.
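A PyTorch sketch of this network, under a few assumptions the text leaves open: the input is RGB (3 channels), the final 5-channel logits layer is left without a ReLU so it can feed a cross-entropy loss (a deliberate deviation from "each followed by a ReLU"), and the final upsampling is bilinear:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FacadeNet(nn.Module):
    def __init__(self, n_classes=5):
        super().__init__()
        widths = [3, 64, 128, 256, 512, 512, 1024]  # RGB in, then the report's widths
        layers = []
        for i in range(6):
            layers += [nn.Conv2d(widths[i], widths[i + 1], 3, padding=1), nn.ReLU()]
            if i in (4, 5):  # 2x2 max pool after the 5th and 6th conv layers
                layers.append(nn.MaxPool2d(2))
        self.body = nn.Sequential(*layers)
        self.head = nn.Conv2d(1024, n_classes, 3, padding=1)  # 7th conv: class logits

    def forward(self, x):
        h = self.head(self.body(x))
        # The two pools downsample by 4x; upsample logits back to the input size.
        return F.interpolate(h, size=x.shape[-2:], mode='bilinear',
                             align_corners=False)

net = FacadeNet()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3, weight_decay=1e-5)
out = net(torch.randn(1, 3, 32, 32))
```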
We use a 1e-3 learning rate at the start, then manually reduce it to 1e-4 after the first 65 epochs, since training starts to diverge. That is why the validation loss suddenly drops at the 65th epoch.
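The manual learning-rate drop can be done by overwriting the optimizer's parameter groups; a small sketch (the dummy parameter stands in for the real network's parameters):

```python
import torch

# A dummy parameter so the optimizer can be constructed; in the real
# training loop this would be net.parameters().
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.Adam(params, lr=1e-3, weight_decay=1e-5)

def set_lr(optimizer, lr):
    # Overwrite the learning rate on every parameter group.
    for group in optimizer.param_groups:
        group['lr'] = lr

# ...after epoch 65, once the loss starts diverging:
set_lr(optimizer, 1e-4)
```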
lc = io.imread('lc.png')
f=plt.figure(figsize=(15,15))
ax=f.add_subplot(1,1,1)
ax.imshow(lc)
plt.axis('off')
plt.show()
The AP can reach 0.47 after training, but we can see that the Pillar and Balcony classes are hard to recognize.
Class | AP |
---|---|
Others | 0.65 |
Facade | 0.63 |
Pillar | 0.09 |
Window | 0.59 |
Balcony | 0.42 |
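For reference, a minimal sketch of how per-class AP can be computed from binary ground-truth labels (e.g. per pixel) and predicted confidences. This is the generic AP formula, not necessarily the exact evaluation script used here:

```python
import numpy as np

def average_precision(scores, labels):
    # AP for one class: mean of the precision values at each true positive,
    # with predictions ranked by confidence.
    order = np.argsort(-scores)
    labels = np.asarray(labels)[order]
    tp = np.cumsum(labels)
    precision = tp / np.arange(1, len(labels) + 1)
    n_pos = labels.sum()
    return float(np.sum(precision * labels) / max(n_pos, 1))

ap = average_precision(np.array([0.9, 0.8, 0.7, 0.6]),
                       np.array([1, 0, 1, 0]))  # hits at ranks 1 and 3
```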
We show an input image (left) and its segmentation (right), where we can see that the network gets the windows and some pillars right, but misses the balconies.
im1 = io.imread('input.png')
im2 = io.imread('output.png')
f=plt.figure(figsize=(15,15))
ax1=f.add_subplot(1,2,1)
ax1.imshow(im1)
plt.axis('off')
ax2=f.add_subplot(1,2,2)
ax2.imshow(im2)
plt.axis('off')
plt.show()