Colorizing the Prokudin-Gorskii Photo Collection

CS 194: Project 1
Images of the Russian Empire: Colorizing the Prokudin-Gorskii Photo Collection

Background:

A long, long time ago, Sergei Mikhailovich Prokudin-Gorskii, a man ahead of his time, traveled throughout Russia and photographed everything he saw: people, buildings, landscapes, literally everything. You may be asking, but why? Prokudin-Gorskii was convinced that color photography was the future, and he wanted to ensure that Russian children would be able to see their nation in color. To accomplish this, he took three photos of everything he saw: one with a red filter, one with a green filter and one with a blue filter. These glass negatives, when properly aligned, would yield a color image!

Project Overview:

Our objective for this project was to do exactly what Prokudin-Gorskii envisioned: align the three color channels to turn a set of grayscale exposures into a color image. Before any image processing, a setup step read in the desired jpg/tif file and divided it into three parts, one for each channel: B, G and R. With that done, the channels still needed to be aligned. To do so, we used two different methods:
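The setup step can be sketched roughly as follows. This is a minimal illustration, not my exact code: it assumes the scan has already been loaded as a 2D NumPy array and that the three plates are stacked top to bottom in B, G, R order. The function name split_channels is hypothetical, and a small synthetic array stands in for a real scan.

```python
import numpy as np

def split_channels(plate):
    """Split a vertically stacked plate scan into three equal-height parts.

    Assumes the parts are ordered B, G, R from top to bottom.
    """
    height = plate.shape[0] // 3   # any leftover rows at the bottom are dropped
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return b, g, r

# Synthetic stand-in for a scanned plate: 9 rows, 4 columns.
plate = np.arange(36, dtype=np.float64).reshape(9, 4)
b, g, r = split_channels(plate)
```

Each returned part is a view into the original array, so no pixel data is copied at this stage.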

Exhaustive Search

This is the naive method for aligning images: given the three color channels, you choose one as the base channel and align each of the other channels to it by searching across a displacement window (such as [-15, 15]) in each axis. A scoring metric (such as Normalized Cross-Correlation (NCC) or Sum of Squared Differences (SSD)) assigns each displacement a score, and the displacement with the best score (highest NCC or lowest SSD) is kept.
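For reference, the two scoring metrics could be written along these lines. This is a sketch under assumptions: the NCC here subtracts each image's mean before normalizing, which is one common variant and not necessarily the one used in this project, and the function names are illustrative.

```python
import numpy as np

def ssd(a, b):
    """Sum of Squared Differences: lower means a better match."""
    return float(np.sum((a - b) ** 2))

def ncc(a, b):
    """Normalized Cross-Correlation: higher means a better match.

    Assumption: each image is mean-centered before normalizing.
    """
    a0 = a - a.mean()
    b0 = b - b.mean()
    denom = np.linalg.norm(a0) * np.linalg.norm(b0)
    if denom == 0:
        return 0.0   # a constant image carries no structure to correlate
    return float(np.sum(a0 * b0) / denom)

# Small synthetic example.
rng = np.random.default_rng(0)
img = rng.random((8, 8))
same_ssd = ssd(img, img)
same_ncc = ncc(img, img)
flipped_ncc = ncc(img, -img)
```

Note the opposite conventions: a search using SSD keeps the minimum score, while a search using NCC keeps the maximum.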

My Algorithm:

I initially chose Blue as my base channel and attempted to align Green and Red to it. However, after testing alignments on other base channels, I found that choosing Green as the base channel worked best, so that's the base channel I used for the rest of the project. To align one channel (c0) to another (c1), I iterated over a displacement window of [-15, 15] in both the x and y axes and used np.roll() on c0 to test each displacement vector, choosing the one with the highest NCC. This approach worked well on the .jpg files because these images are not very large, so the naive implementation runs fairly quickly.
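The search described above might look roughly like this. It is a simplified sketch rather than my exact implementation: the (x, y) ordering of the returned vector, the center parameter, and the mean-centered NCC are all assumptions made for the illustration.

```python
import numpy as np

def ncc(a, b):
    """Mean-centered normalized cross-correlation (higher is better)."""
    a0 = a - a.mean()
    b0 = b - b.mean()
    denom = np.linalg.norm(a0) * np.linalg.norm(b0)
    return float(np.sum(a0 * b0) / denom) if denom else 0.0

def align(c0, c1, window=15, center=(0, 0)):
    """Find the (dx, dy) shift of c0 that best matches c1.

    Every shift within +/- window of `center` is tried on both axes.
    """
    cx, cy = center
    best_score, best_shift = -np.inf, (cx, cy)
    for dy in range(cy - window, cy + window + 1):
        for dx in range(cx - window, cx + window + 1):
            # np.roll wraps pixels around the edges; rows shift by dy, columns by dx.
            shifted = np.roll(c0, (dy, dx), axis=(0, 1))
            score = ncc(shifted, c1)
            if score > best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift

# Synthetic check: displace a random image, then recover the shift that undoes it.
rng = np.random.default_rng(0)
c1 = rng.random((40, 40))
c0 = np.roll(c1, (-3, 2), axis=(0, 1))   # moved up 3 rows, right 2 columns
shift = align(c0, c1, window=5)
```

With a window of 15 this scores 31 x 31 = 961 candidate shifts per channel, which is cheap on the small JPGs but becomes slow on full-size scans.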

Results:

Cathedral

Displacement Vectors: Red = (1, 7) & Blue = (-2, -5)

Monastery

Displacement Vectors: Red = (1, 6) & Blue = (-2, 3)

Tobolsk

Displacement Vectors: Red = (1, 4) & Blue = (-3, -3)

Image Pyramid Search

This is a more advanced algorithm, used on higher-resolution inputs (like the .tif files). Here we leverage an image pyramid to calculate the displacement vectors quickly. An image pyramid represents the image at multiple scales (usually scaled by a factor of 2), and processing is done sequentially, starting from the coarsest scale (smallest image) and going down the pyramid, updating the estimate of the displacement vector as you go.

My Algorithm:

I used a recursive structure to run my pyramid search: each recursive call rescaled the image by 1/2 and doubled the displacement window. The recursion bottomed out when the image was either smaller than 400 x 400 pixels (roughly the size of the JPG input images) or had been shrunk to 1/32 of the original size. At that point, align() was called on the red and blue channels (just as before, to find the best offset), with the expanded displacement window passed in as an argument. The returned value contains the displacement vectors needed to align these channels to green. These vectors are passed back up the stack and multiplied by 2 at each level to account for the rescaled image size (with the displacement window divided by 2). At each level the offsets serve as the starting point for an exhaustive search, and the returned vectors update the offsets; the process repeats on the next level up until we reach the original image.
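A compact sketch of this recursion is shown below. It is one reading of the description above, not my actual code: the starting window of 1, the 2x2 block-averaging downsample (standing in for whatever rescale routine is used), the center parameter on align(), and the (dx, dy) ordering are all assumptions.

```python
import numpy as np

def ncc(a, b):
    """Mean-centered normalized cross-correlation (higher is better)."""
    a0 = a - a.mean()
    b0 = b - b.mean()
    denom = np.linalg.norm(a0) * np.linalg.norm(b0)
    return float(np.sum(a0 * b0) / denom) if denom else 0.0

def align(c0, c1, window, center=(0, 0)):
    """Exhaustive search for the best (dx, dy) within +/- window of center."""
    cx, cy = center
    best_score, best_shift = -np.inf, (cx, cy)
    for dy in range(cy - window, cy + window + 1):
        for dx in range(cx - window, cx + window + 1):
            score = ncc(np.roll(c0, (dy, dx), axis=(0, 1)), c1)
            if score > best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift

def downsample(img):
    """Halve both dimensions by averaging 2x2 blocks (assumed rescale method)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_align(c0, c1, window=1, depth=0, min_size=400, max_depth=5):
    """Coarse-to-fine alignment of c0 to c1, returning (dx, dy)."""
    # Base case: image is small enough, or already at 1/2**max_depth scale.
    if depth == max_depth or max(c1.shape) < min_size:
        return align(c0, c1, window)
    # Recurse on half-size images with a doubled window.
    dx, dy = pyramid_align(downsample(c0), downsample(c1),
                           window * 2, depth + 1, min_size, max_depth)
    # Scale the coarse estimate up and refine it with this level's window.
    return align(c0, c1, window, center=(2 * dx, 2 * dy))

# Synthetic check on a small image (min_size lowered so the recursion runs).
rng = np.random.default_rng(0)
c1 = rng.random((64, 64))
c0 = np.roll(c1, (4, -8), axis=(0, 1))   # moved down 4 rows, left 8 columns
shift = pyramid_align(c0, c1, window=1, min_size=32)
```

The payoff is that the wide search happens only on the smallest image; every larger level only checks a few shifts around the doubled estimate.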

Results:

Church

Displacement Vectors: Red = (-7, 31) & Blue = (-1, -25)

Emir

Displacement Vectors: Red = (15, 55) & Blue = (-25, -49)

Harvesters

Displacement Vectors: Red = (-5, 59) & Blue = (-17, -61)

Icon

Displacement Vectors: Red = (3, 47) & Blue = (-17, -41)

Lady

Displacement Vectors: Red = (3, 59) & Blue = (-9, -55)

Melons

Displacement Vectors: Red = (1, 95) & Blue = (-5, -85)

Onion Church

Displacement Vectors: Red = (7, 55) & Blue = (-25, -53)

Self Portrait

Displacement Vectors: Red = (3, 95) & Blue = (1, -51)

Three Generations

Displacement Vectors: Red = (-3, 57) & Blue = (-13, -55)

Train

Displacement Vectors: Red = (25, 43) & Blue = (-7, -45)

Workshop

Displacement Vectors: Red = (-13, 51) & Blue = (-1, 53)

Custom Images from the Collection:

Stove

Displacement Vectors: Red = (15, 47) & Blue = (-33, -41)

Dmitrievskii Cathedral

Displacement Vectors: Red = (3, 59) & Blue = (-25, -59)

Stork

Displacement Vectors: Red = (7, 59) & Blue = (-11, -45)

Challenges

Initially, I didn't do any cropping on the input images. This worked decently well on the JPG inputs: the resulting images weren't amazingly clear, but they looked solid. However, when I ran my Image Pyramid Search algorithm, I realized that my outputs were not clear, likely because the black borders were throwing off the alignments. Thus, I added a preprocessing step that cropped out the outer 5% of the height and width of the input image, so that the alignment algorithms ran only on the middle 95% of the images. This worked extremely well on most of the images, transforming them from blurry renditions to clear ones! However, there were some images where cropping the input either didn't improve the output or actually worsened it, so for these images I skipped the crop and ran the alignment on the originals. These were 'melons.tif', 'onion_church.tif' and 'self_portrait.tif'. I'm not 100% sure why the output was worse for 'melons.tif' and 'self_portrait.tif'. If I had to guess, the edges of these images play a significant part in the alignment, so removing them hurt the quality of the alignment.
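The cropping step can be illustrated with a short sketch. It assumes one particular reading of the description: 2.5% is trimmed from each side so that the middle 95% of each dimension remains. The function name crop_border and the trim parameter are hypothetical.

```python
import numpy as np

def crop_border(img, trim=0.025):
    """Trim a fraction `trim` of the height and width from each side.

    With trim=0.025, the middle 95% of each dimension is kept (assumed reading).
    """
    dy = int(img.shape[0] * trim)
    dx = int(img.shape[1] * trim)
    return img[dy:img.shape[0] - dy, dx:img.shape[1] - dx]

# Synthetic stand-in for one color channel.
channel = np.zeros((200, 400))
cropped = crop_border(channel)
```

In a setup like this the crop would only be used for scoring: the displacement found on the cropped channels is then applied to the full, uncropped channels.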