Overview

From 1907 to 1915, Russian photographer Sergei Mikhailovich Prokudin-Gorskii obtained permission from the Tsar to travel throughout the empire, taking photographs of the things he saw. However, these photographs were not ordinary for the time: they encoded color information rather than a single intensity channel. This was done by taking three exposures of the same scene using red, green, and blue filters.

When properly aligned, these three frames compose a color picture if the intensities of each frame are interpreted as the intensity of the corresponding color channel. Through the Library of Congress, digitized versions of the three color exposures are available to manipulate in .jpg or .tif form. The following is a description and evaluation of the algorithm used to align the channels.
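As a rough illustration of how three aligned frames become one color picture, here is a minimal NumPy sketch. The function name compose_rgb, the (row, column) ordering of the shifts, and the use of a circular np.roll are assumptions made for the example and are not taken from the project code. Green is treated as the fixed channel, matching the choice described in this write-up.

```python
import numpy as np

def compose_rgb(r, g, b, r_shift=(0, 0), b_shift=(0, 0)):
    """Stack three grayscale frames into an H x W x 3 color image.

    r_shift and b_shift are hypothetical (row, col) displacements applied
    to the red and blue frames; the green frame is left where it is.
    """
    r_aligned = np.roll(r, r_shift, axis=(0, 1))  # circular shift, wraps at edges
    b_aligned = np.roll(b, b_shift, axis=(0, 1))
    return np.dstack([r_aligned, g, b_aligned])
```

Because np.roll wraps pixels around the border, a real pipeline would probably crop the edges after shifting.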

Small Image Algorithm

For small images, a simple exhaustive search over x and y axis displacements sufficed. One color channel (in this case green) was chosen as the stationary channel that the other channels had to align to. A metric was used to 'score' each shift, and the shift with the best score was chosen as the final position of the frame. The particular metric chosen was Normalized Cross-Correlation, defined as:

NCC = <(Image1/||Image1||), (Image2/||Image2||)>

where Image1 and Image2 are vectorized versions of the images, and <a,b> represents the dot product of a and b.
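The scoring and search described above could be sketched roughly as follows. The function names, the default search radius of 15 pixels, and the use of np.roll for shifting are assumptions made for illustration, not details from the project code. The NCC follows the formula as written above, with no mean subtraction.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation: dot product of the unit-normalized images."""
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def align_exhaustive(moving, fixed, radius=15):
    """Return the (dy, dx) shift of `moving` that maximizes NCC against `fixed`.

    `radius` is an assumed half-width of the search window in pixels.
    """
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = ncc(shifted, fixed)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The cost grows with the square of the radius, since every candidate shift requires a full pass over the image.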

Large Image Algorithm

For larger images, the search space that works for small images is likely too small to cover the true channel offsets. To make the algorithm more robust, an image pyramid must be utilized. The objective of the image pyramid is to scale the image down by a factor, run the small image algorithm, and repeat the process at progressively larger scales of the image. This improves the alignment because it allows the search space to be much larger: a shift found on the scaled-down image corresponds to a shift 1/scale times larger on the original, so the effective range of the search increases by a factor of 1/scale.

Originally the search space used was the same for every pyramid level. This resulted in processing times of around 4-5 min, since running the full search at every level is bound to take longer than applying the small image algorithm once to the large image, which already took too much time. To remedy this, the search space for a given level was cut down by a factor of two compared to the previous level. Once layer 3 was reached, if at all, the search space used was a fourth of the size of the original search space. This resulted in an order-of-magnitude reduction in computation time while retaining result quality. Result quality wasn't reduced because after the first layer in the pyramid, the image was already somewhat aligned. Subsequent layers only needed to make small adjustments to improve the alignment.

In my code, align_recc() was the recursive function that implemented the large image algorithm. The small image algorithm can just be run by not specifying the number of levels in the function call (it defaults to 0, meaning no scaling).
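The following is a minimal coarse-to-fine sketch of the idea, not a reproduction of align_recc(). Everything in it is an assumption made for illustration: the names pyramid_align, search, and downsample2, the 2x2 block averaging, the default radius, and the iterative rather than recursive structure. The search radius is halved at each finer level and floored at a quarter of the starting radius, which is one reading of the schedule described above.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized images."""
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def search(moving, fixed, radius):
    """Exhaustive search for the (dy, dx) shift with the highest NCC."""
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            score = ncc(np.roll(moving, (dy, dx), axis=(0, 1)), fixed)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downsample2(img):
    """Halve each dimension by averaging 2x2 blocks (odd edges are cropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_align(moving, fixed, levels=0, radius=16):
    """Coarse-to-fine alignment; levels=0 reduces to a single exhaustive search."""
    pyr_m, pyr_f = [moving], [fixed]          # index 0 is full resolution
    for _ in range(levels):
        pyr_m.append(downsample2(pyr_m[-1]))
        pyr_f.append(downsample2(pyr_f[-1]))
    dy = dx = 0
    for step, level in enumerate(range(levels, -1, -1)):
        if step > 0:
            dy, dx = 2 * dy, 2 * dx           # carry the estimate to the finer level
        # Halve the window per level, but never go below a quarter (or 1 pixel).
        r = max(radius // (2 ** step), radius // 4, 1)
        pre_aligned = np.roll(pyr_m[level], (dy, dx), axis=(0, 1))
        ry, rx = search(pre_aligned, pyr_f[level], r)
        dy, dx = dy + ry, dx + rx
    return (dy, dx)
```

With levels=2 and radius=4, for example, the coarsest level alone can already account for offsets of up to 16 pixels in the full-resolution image, while the two finer levels only search windows of 2 and 1 pixels.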

Example Images

Below are the results of the large image algorithm for some of the images from the LOC collection, along with the alignment displacements:

* Original high-resolution .tif scans have been output here as smaller, lower-resolution .jpg

cathedral.jpg

R: [7 1]
B: [-5 -2]

emir.tif

R: [57 17]
B: [-49 24]

harvesters.tif

R: [65 -3]
B: [-60 -17]

icon.tif

R: [48 5]
B: [-41 -17]

lady.tif

R: [62 3]
B: [-55 -9]

monastery.jpg

R: [6 1]
B: [3 -2]

nativity.jpg

R: [4 -1]
B: [-3 -1]

self_portrait.tif

R: [98 8]
B: [-79 -29]

settlers.jpg

R: [8 -1]
B: [-7 0]

three_generations.tif

R: [-53 -14]
B: [52 -3]

train.tif

R: [43 27]
B: [-42 -6]

turkmen.tif

R: [60 7]
B: [-56 -21]

village.tif

R: [73 10]
B: [-65 -12]

Issues

The one image that I had issues with was self_portrait.tif. Whereas the other images typically needed four pyramid levels to align, this image took five. The likely explanation is that its color channel offsets are larger than those of the other images. An additional level was thus needed so that the coarsest, smallest-scale search could cover the larger offset.

Additional Images

Below are some selected images from the Prokudin-Gorskii collection:
* Original black and white scans were all high-resolution .tif files, but results shown here are smaller, lower-resolution .jpg

canal.jpg

R: [42 -3]
B: [-40 -2]

entrance.jpg

R: [80 16]
B: [-58 -22]

plaque.jpg

R: [47 -7]
B: [15 -8]

production_shop.jpg

R: [60 13]
B: [-48 -20]

Bells and Whistles: Automatic Color Balance

Color balancing is a technique that assumes the mean pixel color should be grey and 'corrects' all the pixels by scaling their color channels by a factor (the grey-world assumption). A simple algorithm to achieve this is to take the mean pixel value of every color channel, pick the channel with the largest mean as the base channel, and then scale each non-base channel by a factor. The factor for a given channel C is simply the mean of the base channel divided by the mean of C.

The first two images are examples where color balancing worked well, while the last one is one where it worsened the appearance. Color balancing on these photos lifts a sort of 'haze' over the image and gives a bright and clear appearance. The algorithm doesn't work well all the time, however: in photos which already have a lot of brightness or whitish hues, the balancing tends to make them even brighter, resulting in an unrealistic appearance. For example, on Lady, the woman's shirt comes out far too bright and looks unnatural.
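A minimal sketch of this scaling, assuming an H x W x 3 float image with values in [0, 1]. The function name and the final clipping step are assumptions made for the example, not details from the project code.

```python
import numpy as np

def balance_to_brightest_channel(rgb):
    """Scale each channel so its mean matches the largest channel mean."""
    rgb = rgb.astype(np.float64)
    means = rgb.reshape(-1, 3).mean(axis=0)   # one mean per color channel
    base = means.max()                        # mean of the base channel
    # factor for channel C = base mean / mean of C; all-zero channels are left alone
    gains = np.divide(base, means, out=np.ones_like(means), where=means > 0)
    # Clipping to [0, 1] is an assumption about the image range.
    return np.clip(rgb * gains, 0.0, 1.0)
```

Since every gain is at least 1, the result can only get brighter, and already bright pixels may saturate at the clip limit.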

Monastery

CS 194-26 Project 1

Colorizing the Prokudin-Gorskii Photo Collection
Arnav Vaid
