Images of the Russian Empire

Alex Stennet, `cs 194-26-agn`

alt text

In 1907, well ahead of his time, Sergei Mikhailovich Prokudin-Gorskii traveled across the Russian Empire taking color photographs. His process was to take three black-and-white photographs of each scene, each through a different color filter: blue, green, and red (the leftmost part of the image above). He was then able to project these images through three colored lenses using a device that looked like this:

alt text

The goal is to present a method that aligns the three scans into a single color image quickly and efficiently.

Basic Alignment

A naive solution to the alignment problem is to try a range of candidate alignments and use a metric such as the Sum of Squared Differences (SSD) to determine which one is best. This works fairly well for smaller, lower-quality images with little variety in color. For example:

alt text

However, because SSD is costly to compute, there is a limit on how many alignments can feasibly be attempted on larger images. This can lead to results such as:

alt text

This happens because no good alignment could be found within the small search area around the center. This attempted alignment took 95 seconds, while the three above took about 1.5 seconds. The same number of alignments was attempted in each case, so the time difference lies strictly in how long SSD takes to compute.
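The exhaustive search described in this section can be sketched roughly as follows. This is a minimal illustration rather than the code used for the results above: the function names `ssd` and `align_exhaustive`, the `radius` parameter, and the use of `np.roll` to shift a channel are all assumptions made for the example.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    return float(np.sum((a - b) ** 2))

def align_exhaustive(channel, reference, radius=15):
    """Try every (dy, dx) shift in a square window and keep the lowest SSD.

    `radius` is a hypothetical parameter: the window covers shifts from
    -radius to +radius in both directions.
    """
    best_shift, best_score = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll shifts circularly, so pixels pushed off one side
            # reappear on the opposite side.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift
```

The cost is (2 * radius + 1)^2 SSD evaluations, each proportional to the number of pixels, which is why the same window becomes so much slower on a large scan.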

Image Pyramid

A technique that improves performance while increasing the number of alignments attempted is the image pyramid. Conceptually, the image pyramid first aligns a very low-quality version of the image, then progressively increases the quality, making smaller and smaller alignment corrections along the way. This allows a much larger number of alignments to be evaluated more quickly and covers a larger area of the image. Here is the same image aligned using the image pyramid:

alt text

This took only 34 seconds and attempted significantly more alignments; unfortunately, for this image, the resulting alignment is in fact worse than the previous one. This means that either the image pyramid broke the alignment or the alignment metric being used is not effective for images like this (we'll hope that it isn't broken and think of a new metric).
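A coarse-to-fine search of the kind described in this section might look like the sketch below. It is only an illustration under several assumptions: the image is halved by averaging 2x2 blocks, the names `downsample`, `search`, and `align_pyramid` are hypothetical, and the parameters `min_size`, `coarse_radius`, and `refine_radius` are made up for the example.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (odd edges are cropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def search(channel, reference, center, radius):
    """Exhaustive SSD search over shifts within `radius` of `center`."""
    best_shift, best_score = center, float("inf")
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            diff = np.roll(channel, (dy, dx), axis=(0, 1)) - reference
            score = float(np.sum(diff ** 2))
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift

def align_pyramid(channel, reference, min_size=32,
                  coarse_radius=4, refine_radius=1):
    """Align at the coarsest level first, then refine at each finer level."""
    if min(channel.shape) <= min_size:
        # Coarsest level: a wide search is cheap because the image is tiny.
        return search(channel, reference, (0, 0), coarse_radius)
    coarse = align_pyramid(downsample(channel), downsample(reference),
                           min_size, coarse_radius, refine_radius)
    # A shift of one pixel at the coarser level is two pixels at this level.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(channel, reference, center, refine_radius)
```

Each halving multiplies the reachable displacement by two while the per-level search stays small, so the total work is dominated by a handful of SSD evaluations at full resolution.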

Edge Filtering

One way to possibly improve the alignment is edge filtering: converting an image into one that is represented only by its edges. The motivation is that images with a wide range of brightness across colors won't align properly because they appear so different in each color channel; instead, we assume that areas of high contrast (i.e., edges) will still be consistent between colors.

An example of taking an image and converting it via an edge filter (I used the Sobel filter to do this):

alt text
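For reference, a Sobel edge filter can be written with plain NumPy as in the sketch below. This is an assumed implementation for illustration (the function name `sobel_magnitude` and the edge-replicating padding are choices made here); a library routine would work equally well.

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude of a 2-D image using the 3x3 Sobel kernels."""
    # Replicate border pixels so the output has the same shape as the input.
    p = np.pad(img.astype(float), 1, mode="edge")
    # Horizontal gradient: right column minus left column, weighted 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    # Vertical gradient: bottom row minus top row, weighted 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)
```

Running the alignment metric on `sobel_magnitude(channel)` instead of the raw channel compares where the edges are rather than how bright each channel is.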

After using this filter in addition to the image pyramid we obtain the following alignment (which takes approximately 37 seconds):

alt text

At this point we have a good alignment; however, additional modifications can be made to improve the resulting image.

Cropping

As a byproduct of the way the images are aligned, there is a minor "roll over" effect. For example, in the above image there is a blue line at the bottom, which is actually the top part of the image "rolled over" from the top to the bottom. This can be cleaned up by simply cropping by the amount each channel was moved to align it:

alt text
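One way to compute that crop is sketched below, assuming each channel was moved by a circular shift of (dy, dx) pixels such as `np.roll` produces. The function name `crop_rollover` and the `shifts` list are hypothetical.

```python
import numpy as np

def crop_rollover(image, shifts):
    """Remove the rows and columns that wrapped around during alignment.

    `shifts` is a list of (dy, dx) displacements, one per shifted channel.
    A positive dy wraps rows onto the top edge, a negative dy onto the
    bottom edge; the same holds for dx with the left and right edges.
    """
    dys = [dy for dy, _ in shifts]
    dxs = [dx for _, dx in shifts]
    top, bottom = max(0, max(dys)), max(0, -min(dys))
    left, right = max(0, max(dxs)), max(0, -min(dxs))
    h, w = image.shape[:2]
    return image[top:h - bottom, left:w - right]
```

For example, with shifts of (2, -3) and (-1, 1), the crop removes 2 rows from the top, 1 from the bottom, 1 column from the left, and 3 from the right.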

The next step is to remove the extra white space and the black bars on the left and right sides of the image. This was done by iterating over each row and column of the image and removing it if its mean color was above or below a threshold. This resulted in:

alt text
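A threshold-based border crop along these lines could be sketched as follows. The thresholds `low` and `high` are placeholder values, not the ones used here, and this version assumes pixel values in [0, 1] and only trims from the outside in (keeping everything between the first and last acceptable row and column) so that interior rows are never deleted.

```python
import numpy as np

def crop_borders(image, low=0.1, high=0.9):
    """Trim near-black and near-white rows and columns from the outer edges."""
    gray = image.mean(axis=2) if image.ndim == 3 else image
    row_means = gray.mean(axis=1)
    col_means = gray.mean(axis=0)
    # Rows and columns whose mean brightness lies strictly between thresholds.
    rows = np.flatnonzero((row_means > low) & (row_means < high))
    cols = np.flatnonzero((col_means > low) & (col_means < high))
    if rows.size == 0 or cols.size == 0:
        return image  # nothing acceptable found; leave the image unchanged
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```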

Final Touches

After all edge-based artifacts have been removed, a minor tweak to the coloring of the image can be made. To better white balance the image, the mean color of the image is computed and a transformation is applied to make this mean color grey (0.5, 0.5, 0.5). This resulted in the following image:

alt text
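The exact transformation is not spelled out above; one plausible choice, shown here purely as an assumption, is to scale each channel so that its mean becomes 0.5. The function name `white_balance_gray` is hypothetical, and pixel values are assumed to lie in [0, 1].

```python
import numpy as np

def white_balance_gray(image, target=0.5):
    """Scale each color channel so its mean moves to `target` (grey)."""
    means = image.mean(axis=(0, 1))
    # Guard against division by zero for an all-black channel.
    gains = target / np.maximum(means, 1e-8)
    # Clipping keeps values displayable; it can pull the mean slightly
    # below the target when bright pixels saturate.
    return np.clip(image * gains, 0.0, 1.0)
```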