Loss: CrossEntropyLoss
Optimizer: SGD(lr=0.01, weight_decay=1e-4, momentum=0.9)
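
For reference, a minimal PyTorch sketch of this loss and optimizer setup. The `model` here is a hypothetical placeholder (a linear classifier over flattened 28x28 inputs with ten outputs), not the network used for the results, and the random batch only stands in for real data.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Hypothetical stand-in model: the real architecture is not shown here.
# Input size 28x28 and 10 output classes are assumptions for illustration.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

# Loss and optimizer with the hyperparameters listed above.
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(
    model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4
)

# One training step on a random batch (placeholder data).
images = torch.randn(8, 1, 28, 28)
labels = torch.randint(0, 10, (8,))

optimizer.zero_grad()
loss = criterion(model(images), labels)  # expects raw logits, not softmax
loss.backward()
optimizer.step()
```

Note that `CrossEntropyLoss` applies log-softmax internally, so the model should output raw logits.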
| T-shirt/top | Trouser | Pullover | Dress | Coat | Sandal | Shirt | Sneaker | Bag | Ankle boot | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 83.3 | 98.5 | 89.1 | 92.4 | 82.1 | 98.1 | 73.4 | 96.0 | 97.0 | 95.9 | 90.58 |
Each block in the image consists of a convolution, BatchNorm, and ReLU. The "Down" block also has a max-pooling layer that halves the image size. The "Up" block first upsamples the output of the previous block and then concatenates it with the output of another block that has the same spatial size.
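
The block structure described above could be sketched in PyTorch roughly as follows. This is an illustration under assumptions, not the exact model: the class names (`ConvBlock`, `Down`, `Up`), the 3x3 kernels, the channel counts, the nearest-neighbor upsampling, and the order of convolution and pooling inside "Down" are all choices made for the example.

```python
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """Convolution -> BatchNorm -> ReLU (3x3 kernel is an assumption)."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)


class Down(nn.Module):
    """ConvBlock followed by 2x2 max pooling, which halves height and width."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = ConvBlock(in_ch, out_ch)
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        return self.pool(self.conv(x))


class Up(nn.Module):
    """Upsample by 2, concatenate with a same-size feature map, then ConvBlock."""

    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        # Upsampling mode is an assumption; the text does not specify it.
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.conv = ConvBlock(in_ch + skip_ch, out_ch)

    def forward(self, x, skip):
        x = self.up(x)
        x = torch.cat([x, skip], dim=1)  # concatenate along channels
        return self.conv(x)


# Shape check with placeholder sizes.
x = torch.randn(2, 3, 64, 64)
down = Down(3, 16)
up = Up(16, 3, 8)
y = down(x)   # shape (2, 16, 32, 32)
z = up(y, x)  # shape (2, 8, 64, 64)
```

Concatenating along the channel dimension requires the two feature maps to share height and width, which is why the "Up" block upsamples first.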
| others | facade | pillar | window | balcony | mean |
| --- | --- | --- | --- | --- | --- |
| 61.08 | 71.68 | 21.74 | 83.99 | 52.14 | 58.12 |
The model performs well on windows (83.99) but segments pillars poorly (21.74). I think the reason is that the data contains too few pillar examples for the model to learn the features of pillars. The model also has difficulty segmenting 'facade' and 'others'. The reason may be that the 'others' class has no stable features.