CS194-26 Project 1

Vincent Zhu 3032108340

Overview:

In the early 20th century, Sergei Mikhailovich Prokudin-Gorskii took thousands of color photographs of the Russian Empire, recording each scene as three glass-plate negatives, one per RGB color channel. These photographs have recently been digitized and published online. The purpose of this project is to take the three color channel images and align them to produce a full-color photograph.

My Approach: I assumed that a simple [x, y] translation was sufficient to align the channels. I exhaustively searched a displacement window of [-15, 15] pixels along each axis and chose the displacement with the best (lowest) score under the sum of squared differences (SSD) metric. For images too large to search exhaustively at full resolution, I used an image pyramid: starting at the coarsest scale and working down to the finest, I updated the best displacement estimate at every level.
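The exhaustive search described above can be sketched as follows. This is a minimal illustration, not the code in main.py: the function names `ssd` and `align_exhaustive` are hypothetical, shifts are expressed as (row, column), and `np.roll` is assumed for shifting, which wraps pixels around the image edges.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means more similar."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def align_exhaustive(channel, reference, window=15):
    """Return the (row, col) shift of `channel` that minimizes SSD
    against `reference`, trying every shift in [-window, window]."""
    best_shift, best_score = (0, 0), float("inf")
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # Assumption: circular shift; edge pixels wrap around.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With a window of 15 this evaluates 31 × 31 = 961 candidate shifts, which is cheap on the small JPEGs but slow on the full-size TIFFs, hence the pyramid.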

No problem forced me to change my implementation design; the only problems I encountered were in implementing that design, which was probably a good thing. For example, one very persistent bug that took me a long time to find was on line 71 of main.py. At one point, I wasn't summing the two displacement values and was returning just one of them, so my estimate was not updating properly.
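To illustrate why the sum matters, here is a hedged sketch of a coarse-to-fine search. It is not the contents of main.py: the names `search`, `downsample`, and `align_pyramid`, the 2x2 block-average downsampling, the `min_size` cutoff, and the ±2 pixel refinement window are all assumptions made for the example.

```python
import numpy as np

def search(channel, reference, window):
    """Exhaustive SSD search over shifts in [-window, window]."""
    best_shift, best_score = (0, 0), float("inf")
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = float(np.sum((shifted - reference) ** 2))
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (odd edges cropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, reference, min_size=64, window=15, refine=2):
    """Return the (row, col) shift aligning `channel` to `reference`."""
    channel = channel.astype(np.float64)
    reference = reference.astype(np.float64)
    if min(channel.shape) <= min_size:
        # Coarsest level: full exhaustive search.
        return search(channel, reference, window)
    coarse = align_pyramid(downsample(channel), downsample(reference),
                           min_size, window, refine)
    # One coarse pixel spans two pixels at this level.
    base = (2 * coarse[0], 2 * coarse[1])
    shifted = np.roll(channel, base, axis=(0, 1))
    dy, dx = search(shifted, reference, refine)
    # Sum the scaled coarse estimate and the local refinement.
    # Returning only one of them drops the other level's contribution.
    return (base[0] + dy, base[1] + dx)
```

Returning only `(dy, dx)` on the last line would discard everything found at the coarser levels, which is one way an estimate could fail to update in the manner described above.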

aligned_cathedral.jpg

aligned_emir.tif

aligned_harvesters.tif

aligned_icon.tif

aligned_melons.tif

aligned_monastery.jpg

aligned_onion_church.tif

aligned_lady.tif

aligned_tobolsk.jpg

aligned_self_portrait.tif

Calculated Offsets:

Cathedral: G[5, 2] R[12, 3]

Emir: G[48, 24] R[-55, 16]

Harvesters: G[59, 18] R[124, 15]

Icon: G[41, 18] R[90, 23]

Melons: G[84, 10] R[180, 14]

Monastery: G[-3, 2] R[3, 2]

Onion Church: G[50, 26] R[108, 37]

Lady: G[58, 5] R[118, 10]

Tobolsk: G[3, 3] R[7, 3]

Self Portrait: G[77, 29] R[175, 37]

Three Generations: G[54, 16] R[106, 14]

Train: G[42, 7] R[85, 32]

Workshop: G[160, 133] R[288, 60]

aligned_guardhouse.tif

aligned_por_porog_waterfall.tif

aligned_lake_sterzh.tif

aligned_casting.tif

Calculated Offsets:

Guardhouse: G[10, 18] R[27, 28]

Por Porog Waterfall: G[27, 0] R[106, 0]

Lake Sterzh: G[64, 24] R[77, 30]

Casting: G[61, 23] R[142, 21]

Of all the images I tried, Emir and Village aligned very poorly, and Lake Sterzh aligned passably but noticeably worse than the others. Village and Lake Sterzh both have long expanses of similar pixel values, which made it hard to find the correct displacement, while Emir has a lot of noise in its data. Implementing the bells and whistles, especially cropping the borders and trying slight rotations, would probably have helped significantly with these images.
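As a rough idea of what border cropping could look like, the sketch below scores only the interior of each channel so that the plate borders do not contribute to the SSD. This was not implemented in the project; the names `crop_interior` and `ssd_interior` and the 10% crop fraction are assumptions for illustration.

```python
import numpy as np

def crop_interior(img, frac=0.1):
    """Drop `frac` of the height and width from each side."""
    h, w = img.shape[:2]
    dh, dw = int(h * frac), int(w * frac)
    return img[dh:h - dh, dw:w - dw]

def ssd_interior(a, b, frac=0.1):
    """SSD computed on the interior region only, ignoring the borders."""
    a_in = crop_interior(a, frac).astype(np.float64)
    b_in = crop_interior(b, frac).astype(np.float64)
    return float(np.sum((a_in - b_in) ** 2))
```

Swapping a metric like this into the displacement search would leave the search itself unchanged; only the region being compared differs.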