In this project, we synthesized color images from the Prokudin-Gorskii collection by aligning and combining their red, blue, and green color channels.

Approach

For the base align function, the implementation tries every combination of row and column offsets within a search window. For each combination it rolls the image by that offset and scores the result against the reference image using the sum of squared differences (SSD), keeping the offset with the lowest score. The default window I used for both rows and columns was [-20, 20]. Before using it, I also cropped a 20-pixel border from every side of the jpg images to make them look nicer.
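
The search described above can be sketched roughly as follows. This is a minimal illustration rather than my exact code: the function name `align`, its parameters, and the use of `np.roll` with float arrays are assumptions made for the example.

```python
import numpy as np

def align(image, reference, window=20):
    """Try every (row, col) offset in [-window, window] and return the
    offset whose rolled image has the lowest SSD against the reference.
    Hypothetical helper for illustration only."""
    best_score = np.inf
    best_shift = (0, 0)
    for dr in range(-window, window + 1):
        for dc in range(-window, window + 1):
            # Circularly shift the image by the candidate offset.
            shifted = np.roll(image, (dr, dc), axis=(0, 1))
            # Sum of squared differences: lower means a better match.
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score = score
                best_shift = (dr, dc)
    return best_shift
```

With a window of [-20, 20], this evaluates 41 x 41 = 1681 candidate offsets per channel, which is manageable for small jpg images but slow for large tif scans.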

For pyramid alignment, the implementation recursively scales both the image and the reference down by half a given number of times (I used 8), then calls my base align function to find the best displacement for the heavily downscaled image. It then recurses back up, using the displacement found at the smaller scale as a starting point so that fewer offsets need to be searched at each larger scale. This greatly cuts down the number of combinations that need to be tried, but downscaling sacrifices image quality, so it doesn't work as well on jpg images as the base align function does. Additionally, I cropped a 100-pixel border from every side of the tif images to make them look nicer.
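
A coarse-to-fine search of this kind might look like the sketch below. The names `ssd_align`, `halve`, and `pyramid_align`, the 2x2 block averaging used for downscaling, and the small `refine` window are all assumptions for illustration, not necessarily what my code does.

```python
import numpy as np

def ssd_align(image, reference, window, center=(0, 0)):
    """Search offsets within `window` of `center` and return the best (row, col) by SSD."""
    best_score, best_shift = np.inf, center
    for dr in range(center[0] - window, center[0] + window + 1):
        for dc in range(center[1] - window, center[1] + window + 1):
            shifted = np.roll(image, (dr, dc), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dr, dc)
    return best_shift

def halve(image):
    """Downscale by 2 by averaging 2x2 blocks (an assumed, simple resampling)."""
    h = image.shape[0] // 2 * 2
    w = image.shape[1] // 2 * 2
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_align(image, reference, levels, window=20, refine=2):
    """Align at the coarsest scale, then double and refine the offset on the way back up."""
    if levels == 0:
        # Coarsest level: full search with the base window.
        return ssd_align(image, reference, window)
    coarse = pyramid_align(halve(image), halve(reference), levels - 1, window, refine)
    # One pixel at the coarser level corresponds to two pixels at this level.
    center = (2 * coarse[0], 2 * coarse[1])
    return ssd_align(image, reference, refine, center)
```

Because each level above the coarsest only searches a (2 * refine + 1) squared neighborhood around the doubled estimate, the total work is far smaller than one exhaustive search over the full-resolution image.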

Bells and Whistles

By improving the alignment using Canny edge detection, I got the emir.tif image to look much nicer. I do this by running the images through the canny function provided by skimage and converting the resulting edge maps into 0s and 1s. I then use these edge maps to calculate the ideal displacements. This works much better because black/white edge maps are much easier to align than the original images, whose pixel values are decimals and whose edges are not nearly as distinct. The resulting displacements are then used to roll the original images before stacking them together.
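
The edge-based step can be sketched as below, using `skimage.feature.canny`. The helper names `ssd_align` and `edge_align`, the `sigma` value, and the search window are assumptions chosen for the example rather than the settings I actually used.

```python
import numpy as np
from skimage.feature import canny

def ssd_align(image, reference, window=20):
    """Exhaustive SSD search over offsets in [-window, window]."""
    best_score, best_shift = np.inf, (0, 0)
    for dr in range(-window, window + 1):
        for dc in range(-window, window + 1):
            shifted = np.roll(image, (dr, dc), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dr, dc)
    return best_shift

def edge_align(image, reference, window=20, sigma=1.0):
    """Compute the displacement from binary Canny edge maps instead of raw pixels."""
    # canny returns a boolean array; convert to 0s and 1s so subtraction works.
    image_edges = canny(image, sigma=sigma).astype(float)
    reference_edges = canny(reference, sigma=sigma).astype(float)
    return ssd_align(image_edges, reference_edges, window)
```

The returned displacement is then applied to the original channel, not the edge map, for example with `np.roll(image, shift, axis=(0, 1))`, before the channels are stacked.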