Project 1: Images of the Russian Empire

Richard Liu (3033944112)

For this project, I designed an image alignment algorithm to colorize Sergei Prokudin-Gorskii's photos of 20th-century Russia.

Before running my algorithm, I preprocessed the images by cropping out the black and white borders, so that only the interior pixels would be aligned. Then, I split the image into its red, green, and blue filtered versions.
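
As a rough illustration of this preprocessing, here is a minimal sketch, not the code actually used for the results on this page. It assumes the scan is a single grayscale array with the three filtered exposures stacked vertically in blue, green, red order, and it stands in for border removal with a fixed-fraction crop. The names `crop_border` and `split_channels` and the `frac` parameter are hypothetical.

```python
import numpy as np

def crop_border(img, frac=0.1):
    """Trim a fixed fraction from every edge (a simple stand-in for border removal)."""
    h, w = img.shape[:2]
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def split_channels(plate):
    """Split a vertically stacked plate into equal (blue, green, red) thirds.

    Assumes the exposures are ordered blue, green, red from top to bottom.
    """
    h = plate.shape[0] // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

# Example: crop first, then split, following the order described above.
plate = np.arange(300 * 100, dtype=float).reshape(300, 100)
b, g, r = split_channels(crop_border(plate, frac=0.1))
```

A fixed crop fraction is the simplest option; detecting the borders from row and column intensities would adapt better to individual scans.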

My function aligns the green and red images to the blue image by finding the x and y offsets that minimize the L2 (sum of squared differences) cost between each pair of images. Because the images are large, I used a recursive image pyramid, as described in the project spec, to reduce the runtime. I shrink the image repeatedly by a scaling factor of 1/5 until it either reaches 10 levels "up" the pyramid or is less than 50 pixels wide. At that coarsest level, I apply the L2 minimization described above to find the optimal displacement, then pass that value, rescaled to the next level's resolution, down a level of the pyramid. There I only need to search within the uncertainty range of the previous optimal displacement (specifically, ± 5 pixels). After repeating this recursively down to the original "bottom" level of the pyramid, I find the best displacement while testing only a small, fixed number of candidates per level, so the number of candidates grows roughly logarithmically with the image size rather than with the full range of possible displacements.
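
The coarse-to-fine search described above can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the submitted code: shifts are applied with `np.roll` (so pixels wrap around the edges), downscaling is done by block averaging with an integer factor, and displacements are returned as (row, column) pairs. All function names (`l2_cost`, `search`, `downscale`, `pyramid_align`) are hypothetical.

```python
import numpy as np

def l2_cost(a, b):
    """Sum of squared differences between two equally sized images."""
    return np.sum((a - b) ** 2)

def search(img, ref, center, radius):
    """Try every shift within +/- radius of center; return the best (dy, dx)."""
    best, best_cost = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            cost = l2_cost(np.roll(img, (dy, dx), axis=(0, 1)), ref)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

def downscale(img, factor):
    """Shrink by an integer factor using block averaging (an assumed method)."""
    h = (img.shape[0] // factor) * factor
    w = (img.shape[1] // factor) * factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def pyramid_align(img, ref, factor=5, min_width=50, max_levels=10, radius=5):
    """Estimate the (dy, dx) shift that aligns img to ref, coarse to fine."""
    if max_levels == 0 or img.shape[1] < min_width:
        # Coarsest level: search around zero displacement.
        return search(img, ref, (0, 0), radius)
    coarse = pyramid_align(downscale(img, factor), downscale(ref, factor),
                           factor, min_width, max_levels - 1, radius)
    # Rescale the coarse estimate to this level, then refine within +/- radius.
    center = (coarse[0] * factor, coarse[1] * factor)
    return search(img, ref, center, radius)

# Example: recover a known circular shift applied to a random test image.
rng = np.random.default_rng(0)
ref = rng.random((200, 200))
moved = np.roll(ref, (-12, 7), axis=(0, 1))
shift = pyramid_align(moved, ref)
```

With a radius of 5, each level evaluates 11 × 11 = 121 candidate shifts, regardless of how large the image is at that level.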

Here are the images my program generated from the example folder:

Unfortunately, my algorithm failed to align the emir.jpg image. For emir.jpg, the brightness was inconsistent across the three color channels, so the L2 cost between pixels was not a very accurate measure of alignment. To solve this, I may have to use a different cost function or compare features other than raw pixel brightness, such as curves or gradients.
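
One possible direction, sketched here only as an idea and not something that was implemented or tested on emir.jpg, is to compare gradient magnitudes instead of raw intensities. The names `edge_map` and `gradient_cost` are hypothetical, and the example only demonstrates that such a cost ignores a uniform brightness offset; real channel differences are more complicated than that.

```python
import numpy as np

def edge_map(img):
    """Gradient magnitude computed with finite differences."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gy, gx)

def gradient_cost(a, b):
    """Sum of squared differences between edge maps instead of raw pixels."""
    return np.sum((edge_map(a) - edge_map(b)) ** 2)

# Example: a uniformly brighter copy has a large raw L2 cost
# but an essentially zero gradient-based cost.
rng = np.random.default_rng(1)
a = rng.random((50, 50))
brighter = a + 0.3
raw_cost = np.sum((a - brighter) ** 2)
edge_cost = gradient_cost(a, brighter)
```

A cost like this could be swapped in for the L2 cost in the displacement search without changing the rest of the alignment procedure.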

Here are the offsets (x, y) I calculated for each photo:

self_portrait.jpg - Green: (-54, 29) Red: (3028, 36)
monastery.jpg - Green: (-11, 2) Red: (-13, 2)
melons.tiff - Green: (-50, 10) Red: (3021, 13)
castle.tiff - Green: (-79, 3) Red: (3006, 3)
onion_church.tiff - Green: (-67, 26) Red: (2966, 36)
lady.tiff - Green: (-46, 8) Red: (3012, 10)
workshop.tiff - Green: (-5, 2) Red: (-8, 3)
emir.tiff - Green: (-75, 24) Red: (2969, 2240)
train.tiff - Green: (3010, 5) Red: (2920, 24)
icon.tiff - Green: (3033, 17) Red: (2956, 23)
harvesters.tiff - Green: (-64, 16) Red: (2971, 13)
cathedral.jpg - Green: (-5, 2) Red: (-8, 3)
tobolsk.jpg - Green: (-5, 3) Red: (-10, 3)
three_generations.tiff - Green: (-68, 14) Red: (2957, 11)

Here are some additional images from the Prokudin-Gorskii collection that I ran my program on:

I had a lot of fun implementing this! :)