Colorizing the Prokudin-Gorskii Photo Collection

CS194-26, Project 1

Inan Husain



Overview

James Clerk Maxwell first suggested one of the earliest methods for taking color photographs: take three black-and-white photographs, one each through a red, green, and blue filter, and the results could be combined into a color image through what was called a chromoscope. Sergei Mikhailovich Prokudin-Gorskii was convinced that this technique was the future of photography. He therefore took photographs all over Russia, of people, nature, and civilization; the entire breadth of Russia is represented in his work. In this project, I aimed to colorize pictures from his collection.

Implementation

My algorithm first splits the image into thirds, one for each of the red, green, and blue channels. I then rescale the channel images down by factors of 2 until they are less than 400 pixels wide or long. At that coarsest scale, I use the blue channel as a reference for the green and red channels, sliding each over a window of -15 to 15 pixels left-to-right and up-and-down. I score each displacement using normalized cross-correlation (NCC), which measures how good the alignment is, and keep the best-scoring displacement as an initial estimate of the alignment for the full-resolution image. I then scale back up by factors of 2 toward the original resolution, doubling the estimate and repeating the NCC sliding-window search at each level, but this time checking only a window of 1 pixel around the estimated displacement: the coarser level has already ruled out larger displacements, so I only need to account for the detail gained by scaling back up. I repeat this until I have the alignment for the original full-resolution channel images. Finally, I stack the aligned channels and take the output as my result.
Note that my naive implementation did not scale the image down at all; it simply tested the [-15,15] window at full resolution without any prior estimate, which was very slow and ineffective on large images. The image pyramid algorithm described above runs in around 15 seconds for most of these images. When computing the NCC, I use only the inner 2/3 of the image so that the displaced borders do not affect the score.
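
The coarse-to-fine search described above can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the code used to produce the results. The function names (ncc, crop_inner, search, align), the plain [::2, ::2] decimation used for downscaling, the circular shift via np.roll, the stopping rule (larger side at most max_size), and the (row, column) sign convention are all assumptions made for the sketch.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized 2-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def crop_inner(img, keep=2 / 3):
    """Keep the central `keep` fraction of each dimension."""
    h, w = img.shape
    dh = int(h * (1 - keep) / 2)
    dw = int(w * (1 - keep) / 2)
    return img[dh:h - dh, dw:w - dw]

def search(ref, img, center, radius):
    """Best (dy, dx) within `radius` of `center`, scored by NCC."""
    best_score, best = -np.inf, center
    ref_c = crop_inner(ref)
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            # Assumption: circular shift; wrapped borders are cropped away.
            shifted = np.roll(img, (dy, dx), axis=(0, 1))
            score = ncc(ref_c, crop_inner(shifted))
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best

def align(ref, img, max_size=400, radius=15):
    """Coarse-to-fine shift (dy, dx) to apply to `img` so it matches `ref`."""
    if max(ref.shape) <= max_size:
        # Coarsest level: full [-radius, radius] window search.
        return search(ref, img, (0, 0), radius)
    # Assumption: naive decimation stands in for proper rescaling.
    dy, dx = align(ref[::2, ::2], img[::2, ::2], max_size, radius)
    # Double the coarse estimate, then refine within 1 pixel.
    return search(ref, img, (2 * dy, 2 * dx), 1)
```

Under these assumptions, align(blue, green) and align(blue, red) would each return the shift to pass to np.roll before stacking the three channels. A real implementation would likely low-pass filter before downscaling, which the decimation here skips.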

Results on example images (with calculated offsets listed below)

G: [5,2], R: [12,3]
G: [-3,2], R: [3,2]
G: [3,1], R: [8,0]
G: [7,0], R: [14,-1]
G: [48,12], R: [112,12]
G: [48,24], R: [224,-207]
G: [48,16], R: [112,16]
G: [64,16], R: [128,16]
G: [80,30], R: [176,32]
G: [48,0], R: [80,17]
G: [64,15], R: [144,24]
G: [33,16], R: [97,30]
G: [48,16], R: [96,16]

Explanation for failure

As the results show, most of the images ended up looking really good; the one failure was the photograph of the Emir. This is mostly because the channels have different luminance values: the brightness of the same region varies from channel to channel, making NCC on raw pixel values a poor evaluator of alignment in this case. This could be fixed with some pre-processing of the image, or by using a better feature, such as the edges of the image, instead.
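
As a rough illustration of the edge-based idea, the sketch below computes a gradient-magnitude image that could be fed to the alignment score in place of raw pixel values. I did not implement this for the project; the function name edge_magnitude and the use of np.gradient (central differences) are assumptions, and a proper edge detector may behave differently. The example uses a hypothetical channel with inverted contrast to show that the edge map is unchanged even when the raw brightness values disagree completely.

```python
import numpy as np

def edge_magnitude(img):
    """Gradient magnitude of a 2-D array (central differences)."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)

# Hypothetical example: a channel and its contrast-inverted counterpart.
rng = np.random.default_rng(0)
blue = rng.random((32, 32))
red = 1.0 - blue  # bright where blue is dark, and vice versa

# Raw values are anti-correlated, but the edge maps agree.
same_edges = np.allclose(edge_magnitude(blue), edge_magnitude(red))
```

The gradient only changes sign under contrast inversion, so its magnitude stays the same; this is why edges can be a more stable feature than raw intensities when channels differ in brightness.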

Examples of my choosing