Project 1: Images of the Russian Empire

Patrick Lutz (cs194-26-acv)

Overview

The goal of this project was to colorize images taken by Sergei Mikhailovich Prokudin-Gorskii in Czarist Russia in the early twentieth century. Although he had only black-and-white film, he captured enough data to reconstruct color images by taking three separate exposures of each scene: one each through a red, a green, and a blue filter. This makes colorizing his photos much easier than colorizing typical old black-and-white photos. The only difficulty is that there are small offsets between the three exposures, so producing a color image essentially reduces to aligning the red, green, and blue channels before stacking them.

Approach

My approach to aligning the images won't be too surprising, since it's the one recommended in the project instructions. The algorithm tries to find translations of the red and green channels that best align them with the blue channel. Here's how that works. First, search over all small translations of the red channel. For each translation, compare the translated pixel values in the red channel to the corresponding pixels in the blue channel, compute some kind of dissimilarity score, and pick the translation with the lowest score. Then repeat for the green channel. Finally, stack the blue channel and the translated red and green channels to create the color image.

For the dissimilarity score, I used the L2 distance (what the instructions referred to as the sum of squared differences). I also tried normalized cross-correlation, but the results were essentially the same with either metric.
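To make this concrete, here is a minimal sketch of the single-scale search, assuming the channels are same-shaped NumPy float arrays. The function name and the ±15-pixel search window are illustrative choices, not details taken from the write-up.

```python
import numpy as np

def align_exhaustive(channel, reference, window=15):
    """Return the (dy, dx) shift of `channel` that minimizes the
    sum of squared differences (L2 distance) against `reference`."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # np.roll wraps pixels around the border; cropping the
            # margins before scoring would be more careful, but this
            # keeps the sketch short.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Once the shifts for the red and green channels are found, the final image is just the three shifted channels stacked, e.g. with np.dstack.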

Doing an exhaustive search over possible translations became prohibitively time-consuming for the larger images: there were more translations to try, and each comparison was more expensive on a larger image. To solve this, my algorithm first repeatedly scaled the image down by a factor of two until it was smaller than 100x100. It then searched for the best translation at that coarsest scale and recursively refined it on the less scaled-down versions. Each level had twice the resolution of the previous one, so the algorithm doubled the translation found at the previous level and searched within ±1 pixel of that estimate.
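Here is a sketch of that coarse-to-fine recursion, reusing align_exhaustive from the sketch above. Using skimage.transform.rescale for the downscaling is an assumption; any halving filter would do.

```python
import numpy as np
from skimage.transform import rescale

def align_pyramid(channel, reference):
    """Coarse-to-fine search: estimate the shift at half resolution,
    double it, then refine within +/-1 pixel at the current level."""
    if max(channel.shape) < 100:
        # Base case: small enough for an exhaustive search
        # (align_exhaustive is defined in the previous sketch).
        return align_exhaustive(channel, reference)
    coarse_dy, coarse_dx = align_pyramid(rescale(channel, 0.5),
                                         rescale(reference, 0.5))
    dy, dx = 2 * coarse_dy, 2 * coarse_dx
    best_shift, best_score = (dy, dx), np.inf
    for ddy in (-1, 0, 1):
        for ddx in (-1, 0, 1):
            candidate = (dy + ddy, dx + ddx)
            shifted = np.roll(channel, candidate, axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best_shift = score, candidate
    return best_shift
```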

This improvement dropped the running time on each of the large images from more than 10 minutes to about 10 seconds.

Bells and Whistles

Using the L2 distance worked very well for most of the photos, but for a few that had different brightness levels in the three channels, it produced monstrosities like these:

[Images: Emir and Flowers, aligned using the plain L2 distance]

To try to fix this, I first used scikit-image's edge detection to extract the edges of each photo and aligned those edge images instead, hoping this would make the alignment robust to brightness differences (a code sketch of this variant appears at the end of this section). In some cases this worked quite well. For instance, the two photos above now looked like this:

[Images: Emir and Flowers, aligned using edge detection]

However, a few photos now looked slightly worse than before. For instance, here's one photo before and after I added edge detection:

[Images: Harvesters, before and after edge detection]

I also tried aligning on a linear combination of the original pixel values and the extracted edges, but this didn't always work either:

[Image: Emir, aligned using a linear combination of pixels and edges]
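To make the edge-based variants concrete, here is a sketch of how they might look. The write-up doesn't name the exact scikit-image edge detector, so the Sobel filter below is an assumption (Canny would be a similar choice); align_pyramid is the coarse-to-fine search sketched in the Approach section, and alpha implements the pixel/edge linear combination described above.

```python
import numpy as np
from skimage.filters import sobel

def align_on_edges(channel, reference, alpha=0.0):
    """Align on edge maps instead of raw intensities. With alpha > 0,
    the score uses a blend of pixel values and edge strengths."""
    def blend(image):
        # alpha = 0 gives pure edge alignment; alpha = 1 recovers the
        # original pixel-based alignment.
        return alpha * image + (1 - alpha) * sobel(image)
    return align_pyramid(blend(channel), blend(reference))
```

The returned shift would then be applied to the original (unblended) channel before stacking the color image.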

Results

Each image is listed with the translations my algorithm found for the green and red channels.

Cathedral: green (5, 2); red (12, 3)
Emir: green (49, 23); red (107, 40)
Harvesters: green (60, 18); red (123, 9)
Icon: green (42, 16); red (90, 22)
Lady: green (56, 10); red (120, 13)
Melons: green (79, 9); red (177, 14)
Monastery: green (-3, 2); red (3, 2)
Onion Church: green (52, 24); red (107, 35)
Self Portrait: green (77, 29); red (175, 37)
Three Generations: green (57, 17); red (115, 11)
Tobolsk: green (3, 3); red (6, 3)
Train: green (44, 2); red (84, 34)
Village: green (65, 11); red (137, 21)
Workshop: green (52, -2); red (105, -12)

More images

Churchyard arch
Two men in a boat
Dredger
Wild flowers