For this project, I designed an image alignment algorithm to colorize Sergei Prokudin-Gorskii's photos of 20th-century Russia.
Before running my algorithm, I preprocessed the images by cropping out the black and white borders, so that the alignment would only consider the interior pixels. Then, I split the image into its red-, green-, and blue-filtered versions.
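As a rough illustration of this preprocessing, here is a minimal sketch. It assumes the scan is a single grayscale NumPy array with the three filtered exposures stacked vertically (blue on top, then green, then red), and it trims a fixed fraction from each side. The function names, the stacking order, and the 10% crop fraction are all assumptions for illustration and not necessarily what my program does; the sketch also crops after splitting, purely for simplicity.

```python
import numpy as np

def split_channels(plate):
    """Split a vertically stacked plate into (blue, green, red) strips."""
    # Assumption: three equal-height strips, ordered blue, green, red.
    height = plate.shape[0] // 3
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

def crop_border(channel, fraction=0.1):
    """Trim a fixed fraction from every side (hypothetical border removal)."""
    h, w = channel.shape
    dy, dx = int(h * fraction), int(w * fraction)
    return channel[dy:h - dy, dx:w - dx]
```

Each strip returned by `split_channels` can then be passed through `crop_border` before alignment, so the border pixels do not contribute to the cost.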
My function aligned the green and red images to the blue image by finding the x and y offsets that minimized the L2 cost (sum of squared differences) between each pair of images. Because of the large image dimensions, I used a recursive image pyramid approach, as described in the project spec, to reduce runtime. I shrank the image repeatedly by a scaling factor of 1/5 until it had either gone 10 levels "up" the pyramid or was less than 50 pixels wide. At that coarsest level, I applied the L2 minimization to find the optimal displacement, then passed that value down one level of the pyramid, where I only needed to search within the uncertainty range of the previous optimal displacement (specifically, ± 5 pixels). After recursing this way back down to the original "bottom" level of the pyramid, I could find the best displacement in closer to logarithmic than linear time (relative to the number of pixels).
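The recursion can be sketched roughly as follows. This is a simplified illustration rather than my exact code: the names are hypothetical, downscaling is done by plain block averaging, shifts use `np.roll` (which wraps pixels around the edges), the coarsest level reuses the same ± 5 pixel window, and the coarse offset is multiplied by 5 when it is passed down a level. All of those details are assumptions.

```python
import numpy as np

SCALE = 5        # shrink by a factor of 1/5 per level
MAX_LEVELS = 10  # stop after 10 levels up the pyramid...
MIN_WIDTH = 50   # ...or once the image is under 50 pixels wide
RADIUS = 5       # search +/- 5 pixels around the propagated offset

def l2_cost(a, b):
    """Sum of squared differences between two equally sized images."""
    return np.sum((a - b) ** 2)

def downscale(img, factor=SCALE):
    """Shrink by block averaging (assumption: any resize method would do)."""
    h = img.shape[0] // factor * factor
    w = img.shape[1] // factor * factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def search(moving, fixed, center, radius):
    """Try every (dx, dy) within `radius` of `center`; keep the cheapest."""
    best, best_cost = center, np.inf
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            # np.roll wraps around the edges, a simplification for brevity.
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            cost = l2_cost(shifted, fixed)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best

def align(moving, fixed, level=0):
    """Return the (dx, dy) shift of `moving` that best matches `fixed`."""
    if level >= MAX_LEVELS or fixed.shape[1] < MIN_WIDTH:
        # Coarsest level: search around zero displacement.
        return search(moving, fixed, (0, 0), RADIUS)
    coarse = align(downscale(moving), downscale(fixed), level + 1)
    # One coarse pixel covers SCALE fine pixels, so rescale the estimate.
    center = (coarse[0] * SCALE, coarse[1] * SCALE)
    return search(moving, fixed, center, RADIUS)
```

With this sketch, `align(green, blue)` and `align(red, blue)` would each return the (x, y) offset to apply to that channel. Every level evaluates the same 11 × 11 grid of candidate offsets, which is where the savings over one large exhaustive search come from.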
Here are the images my program generated from the example folder:
Unfortunately, my algorithm failed to align the emir.jpg image. For emir.jpg, the brightness was inconsistent across the three color channels, so the L2 cost between pixels was not a very accurate measure of alignment. To solve this, I may have to use a different cost function or look for features other than just pixel brightness (curves, gradients).
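As a rough sketch of what such an alternative might look like (this is not something my program does), the snippet below pairs a gradient-magnitude feature with a normalized cross-correlation (NCC) cost. Both function names are hypothetical. The idea is that gradients ignore a constant brightness offset, and NCC ignores both an offset and a uniform scaling, so either may be less sensitive to channels that differ in overall brightness.

```python
import numpy as np

def gradient_magnitude(img):
    """Edge strength per pixel; unchanged by adding a constant brightness."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)

def ncc_cost(a, b):
    """Negative normalized cross-correlation, so lower means better aligned."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0  # flat image: correlation is undefined, treat as neutral
    return -float(np.sum(a * b) / denom)
```

Either function could stand in for the L2 cost inside the offset search, for example by comparing `gradient_magnitude` of each channel instead of the raw pixels. Whether that would actually fix emir.jpg is untested here.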
Here are the offsets (x, y) I calculated for each photo:
Here are some additional images from the Prokudin-Gorskii collection that I ran my program on:
I had a lot of fun implementing this! :)