Images of the Russian Empire

Sean Dooher - Spring 2020

Background

Starting in 1907, Sergei Mikhailovich Prokudin-Gorskii travelled around Russia photographing each scene three times through differently colored filters, with the idea that the exposures could later be combined to display color photographs even though each one was recorded in black and white. Our goal in this project is to take each set of three exposures and combine it into a single color image using computer image processing techniques.

Naive Alignment

The basic concept for the alignment algorithm is to test a number of possible alignments and choose the best one. To avoid inspecting each candidate alignment manually, we must develop an automated method of determining what a "good" alignment is.

Scoring

One method of doing this is the sum of squared differences (SSD) between two images. This technique works by subtracting one image from the other (both represented as matrices), squaring each entry of the result, and summing all the values. If the two images are similar, the subtraction cancels out most entries, making the resulting sum smaller, so a lower score means a better alignment. The squaring makes large differences count disproportionately more than small ones and also removes the sign of each difference, so positive and negative differences cannot cancel each other out. I also tried normalized cross-correlation (NCC) to align the images, and it worked well on most images, but the final approach I describe below, SSD on the gradient of the images, performed better, with no misalignments on any of the photos I tested.
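The two scores mentioned above can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's actual code: the function names `ssd` and `ncc` are hypothetical, and the cast to floating point is an assumption added so that 8-bit or 16-bit pixel values do not wrap around during subtraction.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means more similar."""
    # Cast first: subtracting unsigned integer images would wrap around.
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))  # element-wise square, then sum

def ncc(a, b):
    """Normalized cross-correlation; higher means more similar."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # A constant image has zero norm; treat it as uncorrelated.
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0
```

Note that the two scores point in opposite directions: a search would minimize `ssd` but maximize `ncc`.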

Base Channel Selection

To use this scoring system, we must first choose one of the color channels to be our "base" channel and generate offsets for the other two channels relative to it. I chose the green channel to be the base channel, as green is the color humans are most perceptually sensitive to. By aligning to green, if the images are slightly misaligned due to errors in the algorithm or faults in the original photographs themselves, the misalignment should be less noticeable. Additionally, I cropped all images by a certain amount before finding the alignment.
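Combining a score with a base channel gives the naive search: try every shift in a small window and keep the one with the lowest score. The sketch below is illustrative only. The name `align_naive`, the default `radius` of 15 pixels, the (row, column) ordering of the offset, and the use of `np.roll` (which wraps pixels around the image edge) are all assumptions, not details taken from the project.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means more similar."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def align_naive(channel, base, radius=15):
    """Return the (row, col) shift of `channel` that best matches `base`."""
    best_score, best_offset = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around the edges (an assumption here).
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, base)
            if score < best_score:
                best_score, best_offset = score, (dy, dx)
    return best_offset
```

With green as the base, the blue and red offsets would each come from one call, for example `align_naive(blue, green)` and `align_naive(red, green)`.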

Dealing with Borders

As the photos have solid borders, the differences that arise when a border lines up with non-border pixels may end up dominating the score instead of the actual differences within the part of the photo we care about. Cropping should not change the alignment as long as both images are cropped identically, since we are calculating a relative offset. As the code also runs faster with a larger crop because the resulting image is smaller, I played around with this value and found that a 20% crop off each side (so a 40% reduction in each dimension) gave a decent speedup with no degradation of results.
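A crop of this kind is a single array slice. The helper below is a hypothetical sketch (the name `crop_fraction` and its signature are assumptions); it removes the same fraction from every side and would be applied to both channels before scoring.

```python
import numpy as np

def crop_fraction(image, fraction=0.2):
    """Trim `fraction` of the height and width from each side."""
    h, w = image.shape[:2]
    dy, dx = int(h * fraction), int(w * fraction)
    # With fraction=0.2, 60% of each dimension remains.
    return image[dy:h - dy, dx:w - dx]
```

Because the crop is only used for scoring, the offset it produces can be applied to the uncropped channels afterwards.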

Example Images -- Small

This approach works well for relatively small images, as seen in the three examples below.

monastery.jpg

Blue Offsets: (3, -2) Red Offsets: (6, 1)

monastery

tobolsk.jpg

Blue Offsets: (-3, -2) Red Offsets: (4, 0)

tobolsk

cathedral.jpg

Blue Offsets: (-5, -2) Red Offsets: (7, 1)

cathedral

Handling Larger Photos

Unfortunately, the approach above does not scale well to larger photos. Each score requires subtracting and squaring a full image, which takes time proportional to the number of pixels, and a higher resolution also means we have to search a larger range of offsets, so the runtime quickly becomes unreasonable for large images. To get around this we use an image pyramid: we recursively compute an offset on a down-scaled image, then scale that offset up and use it as the center of a small search on the next larger image, up to full size. This allows us to generate better offsets in a fraction of the time, as we no longer have to potentially go through thousands of offsets on the full-size image. This was a fairly straightforward extension of the naive code and allowed for the alignment of the following photos in well under a minute each.
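The recursion can be sketched as follows. This is one possible arrangement under stated assumptions, not the project's implementation: the names `search`, `downscale`, and `align_pyramid`, the factor-of-two downscaling by 2x2 block averaging, the `min_size` stopping rule, and both search radii are all hypothetical choices.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means more similar."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def search(channel, base, center, radius):
    """Exhaustively score shifts within `radius` of `center`."""
    best_score, best_offset = np.inf, center
    cy, cx = center
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            score = ssd(np.roll(channel, (dy, dx), axis=(0, 1)), base)
            if score < best_score:
                best_score, best_offset = score, (dy, dx)
    return best_offset

def downscale(image):
    """Halve each dimension by averaging 2x2 blocks."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    img = image[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid(channel, base, min_size=64, coarse_radius=8, fine_radius=2):
    """Coarse-to-fine alignment; returns a (row, col) shift for `channel`."""
    if min(base.shape) <= min_size:
        # Smallest level: a wide search is cheap here.
        return search(channel, base, (0, 0), coarse_radius)
    cy, cx = align_pyramid(downscale(channel), downscale(base),
                           min_size, coarse_radius, fine_radius)
    # One coarse pixel spans two pixels at this level, so double the
    # estimate and refine it with a small local search.
    return search(channel, base, (2 * cy, 2 * cx), fine_radius)
```

In this arrangement the wide search runs only on the smallest image, and each larger level scores just a handful of shifts around the scaled-up estimate.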

Example Images -- Large

workshop.tif

Blue Offsets: (-53, 1) Red Offsets: (52, -11)

workshop

onion-church.tif

Blue Offsets: (-51, -27) Red Offsets: (58, 10)

onion_church

village.tif

Blue Offsets: (-65, -11) Red Offsets: (73, 3)

village

self-portrait.tif

Blue Offsets: (-79, -30) Red Offsets: (98, 8)

self_portrait

icon.tif

Blue Offsets: (-40, -17) Red Offsets: (49, 5)

train.tif

Blue Offsets: (-41, -8) Red Offsets: (44, 26)

train

melons.tif

Blue Offsets: (-81, -10) Red Offsets: (96, 3)

melons

harvesters.tif

Blue Offsets: (-60, -16) Red Offsets: (64, -2)

harvesters

emir.tif

Blue Offsets: (-49, -23) Red Offsets: (58, 17)

emir

three_generations.tif

Blue Offsets: (-53, -13) Red Offsets: (59, -5)

three_generations

lady.tif

Blue Offsets: (-56, -9) Red Offsets: (63, 4)

lady

Gradient Scoring

With the method described above, almost every image aligned correctly. Unfortunately, SSD on the raw pixel values was insufficient for the village.tif image. To fix this, instead of scoring the images directly, I first took the gradient of each image, which is essentially the derivative, or amount of change, at each pixel. This makes the scoring function sensitive to sharp changes (such as the edges of objects or landscape features) rather than to raw brightness. With gradient scoring, every image I tested aligned correctly, with no major alignment issues.
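A gradient-based score might look like the sketch below. The write-up does not say which gradient operator was used, so the choice of `np.gradient` (central differences) and of comparing gradient magnitudes is an assumption, as are the names `gradient_magnitude` and `gradient_ssd`.

```python
import numpy as np

def gradient_magnitude(image):
    """Per-pixel gradient magnitude from central differences."""
    gy, gx = np.gradient(image.astype(np.float64))
    return np.hypot(gx, gy)

def gradient_ssd(a, b):
    """SSD computed on gradient magnitudes instead of raw pixel values."""
    diff = gradient_magnitude(a) - gradient_magnitude(b)
    return float(np.sum(diff ** 2))
```

One property of this score is that a uniform brightness difference between two channels, which would inflate a raw SSD, contributes nothing, since a constant has zero gradient. In a real search it would be cheaper to compute each channel's gradient once up front and then score the shifted gradient images with ordinary SSD.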