CS194-26 Project 1

Vincent Zhu 3032108340

Overview:

In the early 20th century, Sergei Mikhailovich Prokudin-Gorskii took thousands of color photographs of the Russian Empire, recorded as RGB glass-plate negatives. These photographs have recently been digitized and published online. The purpose of this project is to take the three color channel images from each plate and align them to produce a full-color photograph.

My Approach: I assumed a simple [x,y] translation was sufficient to align the images properly. I exhaustively searched over a [-15, 15] pixel displacement window and chose the displacement that yielded the best score under the sum of squared differences (SSD) similarity metric. For photos too large for an exhaustive search at full resolution, I used an image pyramid: starting from the coarsest scale and moving to progressively finer ones, I updated the best displacement estimate at every level.
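The single-scale search described above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the function names `ssd` and `align_exhaustive`, the use of `np.roll` (which wraps pixels around the edges), and the (row, col) ordering of the returned shift are all my assumptions.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means more similar."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def align_exhaustive(channel, reference, window=15):
    """Try every shift in [-window, window] on both axes and return the
    (row, col) shift of `channel` with the lowest SSD against `reference`."""
    best_shift, best_score = (0, 0), float("inf")
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # np.roll shifts circularly, so edge pixels wrap around (an assumption
            # made here for simplicity).
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With the default window this evaluates 31 x 31 = 961 candidate shifts, which is cheap on the small .jpg images but slow on the full-resolution .tif scans.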

No problem forced me to change my implementation design; the only problems I encountered were in implementing that design, which was probably a good thing. For example, one very persistent bug that took me a long time to find was in line 71 of main.py. At one point, I wasn't summing the two displacement values; I was just returning one of them. This meant my estimate was not updating properly.
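To illustrate why the sum matters, here is a hedged sketch of a coarse-to-fine search. It is not the contents of main.py: the names `search_shift`, `downsample`, and `align_pyramid`, the naive 2x decimation without blurring, the `min_size` cutoff, and the small refinement window are all assumptions made for the example. The key line is the final return, which adds the scaled-up coarse estimate to the refinement found at the current level.

```python
import numpy as np

def search_shift(channel, reference, window):
    """Exhaustive SSD search; returns the best (row, col) shift."""
    ref = reference.astype(np.float64)
    best, best_score = (0, 0), float("inf")
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            moved = np.roll(channel, (dy, dx), axis=(0, 1)).astype(np.float64)
            score = float(np.sum((moved - ref) ** 2))
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best

def downsample(img):
    """Halve the resolution by keeping every second pixel (no blurring)."""
    return img[::2, ::2]

def align_pyramid(channel, reference, min_size=64, window=2):
    # Base case: the image is small enough for a full [-15, 15] search.
    if min(channel.shape) <= min_size:
        return search_shift(channel, reference, window=15)
    # Estimate the shift at half resolution, then scale it up by 2.
    coarse = align_pyramid(downsample(channel), downsample(reference),
                           min_size, window)
    estimate = (2 * coarse[0], 2 * coarse[1])
    # Apply the estimate, then look for a small correction at this level.
    moved = np.roll(channel, estimate, axis=(0, 1))
    refine = search_shift(moved, reference, window)
    # Both displacements must be summed; returning only one of them
    # loses either the coarse estimate or the correction.
    return (estimate[0] + refine[0], estimate[1] + refine[1])
```

Because each level only refines the estimate by a few pixels, the total displacement can exceed the 15-pixel window of the single-scale search.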

aligned_cathedral.jpg
aligned_emir.tif
aligned_harvesters.tif
aligned_icon.tif
aligned_melons.tif
aligned_monastery.jpg
aligned_onion_church.tif
aligned_lady.tif
aligned_tobolsk.jpg
aligned_self_portrait.tif
aligned_three_generations.tif
aligned_train.tif
aligned_village.tif
aligned_workshop.tif

Calculated Offsets:

Cathedral: G[5, 2] R[12, 3]

Emir: G[48, 24] R[-55, 16]

Harvesters: G[59, 18] R[124, 15]

Icon: G[41, 18] R[90, 23]

Melons: G[84, 10] R[180, 14]

Monastery: G[-3, 2] R[3, 2]

Onion Church: G[50, 26] R[108, 37]

Lady: G[58, 5] R[118, 10]

Tobolsk: G[3, 3] R[7, 3]

Self Portrait: G[77, 29] R[175, 37]

Three Generations: G[54, 16] R[106, 14]

Train: G[42, 7] R[85, 32]

Workshop: G[160, 133] R[288, 60]

aligned_guardhouse.tif
aligned_por_porog_waterfall.tif
aligned_lake_sterzh.tif
aligned_casting.tif

Calculated Offsets:

Guardhouse: G[10, 18] R[27, 28]

Por Porog Waterfall: G[27, 0] R[106, 0]

Lake Sterzh: G[64, 24] R[77, 30]

Casting: G[61, 23] R[142, 21]

Of all the images I tried, Emir and Village aligned very poorly, and Lake Sterzh aligned passably but noticeably worse than the others. Village and Lake Sterzh both have long expanses of similar pixel values, which made it hard to find the correct displacement, while Emir had a lot of noise in its data. Implementing the bells and whistles would probably have helped significantly with these images, especially cropping the borders and trying slight rotations.
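Border cropping was not implemented in this project, but one way it could work is to score only the interior of each channel so the plate borders do not contribute to the SSD. The sketch below is purely illustrative: the helper names `crop_interior` and `ssd_interior` and the 10% crop fraction are assumptions, not tuned values.

```python
import numpy as np

def crop_interior(img, fraction=0.1):
    """Drop `fraction` of the height and width from each side of the image."""
    h, w = img.shape[:2]
    dy, dx = int(h * fraction), int(w * fraction)
    return img[dy:h - dy, dx:w - dx]

def ssd_interior(a, b, fraction=0.1):
    """SSD computed on the cropped interiors only, ignoring the borders."""
    a_in = crop_interior(a, fraction).astype(np.float64)
    b_in = crop_interior(b, fraction).astype(np.float64)
    return float(np.sum((a_in - b_in) ** 2))
```

A metric like this could replace the plain SSD inside the displacement search without changing the rest of the search.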