Project Motivation

Sergey Mikhaylovich Prokudin-Gorsky (Russian: Сергей Михайлович Прокудин-Горский; August 30 [O.S. August 18] 1863 – September 27, 1944) was a Russian chemist and photographer. He is best known for his pioneering work in colour photography and his effort to document early 20th-century Russia.

Using a railroad-car darkroom provided by Tsar Nicholas II, Prokudin-Gorsky traveled the Russian Empire from around 1909 to 1915 using his three-image colour photography to record its many aspects. While some of his negatives were lost, the majority ended up in the U.S. Library of Congress after his death. Starting in 2000, the negatives were digitised and the colour triples for each subject digitally combined to produce hundreds of high-quality colour images of century-ago Russia.

Overview

In this project, the task is to take Prokudin-Gorsky's three-image color photographs as input and produce the color images he intended. The input consists of 3 black-and-white images, each recording the intensity of the red, green, or blue exposure Prokudin-Gorsky took. Because he did not take the 3 exposures at the same instant, they are slightly misaligned. The main task is to align these 3 images to produce a clean color image. The two techniques used in this project are the Naive Approach and the Coarse-to-fine Pyramid Speedup Approach.

Naive Approach

As the name suggests, this approach takes the most intuitive direction: we exhaustively compare all candidate displacements between the images and pick the one that aligns them best. We only need to check displacements within a small window, because the images are only slightly misaligned. Thus, for each displacement in the window [-15, 15] along each axis, we compare two images using the Sum of Squared Differences (SSD) and keep the displacement with the minimum difference. I use the green image as the main reference image, and use it to find the alignment of the blue and red images.
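The exhaustive search described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names `ssd` and `align_naive` are hypothetical, and shifting with `np.roll` (which wraps pixels around the image edges) is an assumption; a real pipeline would probably crop the borders before scoring.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def align_naive(channel, reference, radius=15):
    """Return the (dy, dx) shift of `channel` that best matches `reference`.

    Every shift in [-radius, radius] on each axis is tried, and the one
    with the smallest SSD wins. Assumption: np.roll wraps pixels around
    the border instead of discarding them.
    """
    best_shift, best_score = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift
```

Calling `align_naive(blue, green)` and `align_naive(red, green)` would then give the two displacements to apply before stacking the channels, assuming green is the reference as stated above.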

Below are three outputs of the Naive Approach, each with its displacements noted.

Result

G: [5, 2]; R: [-7, -2]

G: [-3, 2]; R: [-6, -1]

G: [3, 3]; R: [-4, -1]

Coarse-to-fine Pyramid Speedup Approach

One of the biggest drawbacks of the Naive Approach is that we can't use it on high-resolution images. Even with a small window of [-15, 15], we need to check 31 * 31 ≈ 1000 possible displacements. For an image of 3200 * 3200 pixels, each displacement requires examining 3200 * 3200 ≈ 10^7 pixels, which adds up to roughly 10^10, about 10 billion, pixel comparisons in total! Thus we introduce the Coarse-to-fine Pyramid Speedup Approach.
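The estimate above can be checked with a few lines of arithmetic. The helper name `naive_cost` is hypothetical and only counts pixel comparisons for one channel against one reference; it ignores any per-pixel constant factors.

```python
def naive_cost(height, width, radius):
    """Pixel comparisons for an exhaustive search over [-radius, radius] on both axes."""
    shifts = (2 * radius + 1) ** 2   # 31 * 31 = 961 candidate displacements
    return shifts * height * width   # each candidate touches every pixel once

print(naive_cost(3200, 3200, 15))  # 9840640000, i.e. just under 10^10
```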

Instead of looking at all the pixels at once, we start by shrinking the image. We find the best displacement for this shrunken image and then move on to a slightly bigger image that still has far fewer pixels than the original. There we repeat the search with a smaller window, around the displacement found at the previous level (scaled up to the new resolution). We keep repeating this until we reach the base level, which is the original image. By doing this, we either use a much smaller window over a large number of pixels, or a normal window over a very small number of pixels. This low-resolution to high-resolution progression looks like a pyramid, which is where the name comes from. As we go down the pyramid (closer to the original image), the displacement becomes finer and closer to the true displacement. With this approach, we can process images with very high resolutions.
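One possible shape for this coarse-to-fine search is sketched below. Everything specific here is an assumption made for illustration, not something the write-up states: the factor-of-2 downscaling by 2x2 averaging, the `min_size`, `coarse_radius` and `fine_radius` defaults, the wrap-around shifting via `np.roll`, and the function names themselves.

```python
import numpy as np

def downscale2(img):
    """Halve each dimension by averaging 2x2 blocks (an odd last row/column is trimmed)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def search(channel, reference, center, radius):
    """Exhaustive SSD search over shifts within `radius` of `center` on each axis."""
    channel = channel.astype(np.float64)
    reference = reference.astype(np.float64)
    best_shift, best_score = center, float("inf")
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))  # wraps at the edges
            score = float(np.sum((shifted - reference) ** 2))
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift

def align_pyramid(channel, reference, min_size=32, coarse_radius=15, fine_radius=2):
    """Return the (dy, dx) shift of `channel` onto `reference`, coarse to fine."""
    # Top of the pyramid: the image is small, so a full window is cheap.
    if min(channel.shape) <= min_size:
        return search(channel, reference, (0, 0), coarse_radius)
    # Solve the half-resolution problem first.
    coarse = align_pyramid(downscale2(channel), downscale2(reference),
                           min_size, coarse_radius, fine_radius)
    # One coarse pixel is two pixels here, so double the estimate and refine
    # it with a small window.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(channel, reference, center, fine_radius)
```

Under these assumptions the full-resolution level only evaluates (2 * 2 + 1)^2 = 25 shifts instead of 961, while the 31 * 31 window is applied only to the smallest image.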

Below are outputs of the Coarse-to-fine Pyramid Speedup Approach, each with its displacements noted.