Project 1: Images of the Russian Empire

Leland Yu (CS 194-26-abz)

Overview

Sergey Prokudin-Gorsky, a Russian photographer active in the early 1900s, is best known for his impact on the early days of color photography. At the time, no equipment was adequate for color photography; however, he had the idea of photographing each scene three times, each time through a different colored filter. The color image could later be recreated by projecting each exposure through the matching filter with a standard light projector. He was tasked with documenting the Russian Empire; unfortunately, his work was cut short by the Russian Revolution. While some of his photographs were confiscated or lost, the Library of Congress managed to purchase a good portion of the remaining ones, and later cleaned and released them to the public. In this project, we take the black-and-white photographs and recreate Prokudin-Gorsky's vision of viewing Russia in color.

Approach

After separating the stacked photographs into the individual color channels, we need to align the channels properly on top of each other. For small images, a brute-force search across a range of 30 pixels in each direction is adequate. There are two common ways to score a candidate alignment: the smallest sum of squared differences (SSD) or the largest normalized cross-correlation (NCC). While implementing these functions, I discovered that the border affects the alignment scores more than it should, considering it is not part of the picture. After ignoring 10% of the image on each side, I found that both metrics achieve identical results, with SSD running slightly faster.
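As a rough illustration of the brute-force search described above, here is a minimal Python sketch. It assumes the channels are same-sized 2-D NumPy arrays, reads "30 pixels in each direction" as shifts from -30 to +30, and shifts with np.roll (which wraps pixels around the edge). The function names are hypothetical and this is not the project's actual code; the NCC shown subtracts the mean first, which is one common variant.

```python
import numpy as np

def crop_interior(img, frac=0.10):
    """Drop `frac` of the height and width from each side before scoring."""
    h, w = img.shape
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def ssd(a, b):
    """Sum of squared differences: lower is better."""
    return np.sum((a - b) ** 2)

def ncc(a, b):
    """Zero-mean normalized cross-correlation: higher is better."""
    a = a - a.mean()
    b = b - b.mean()
    return np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))

def align_brute_force(channel, reference, radius=30, metric="ssd"):
    """Return the (dx, dy) shift of `channel` that best matches `reference`."""
    channel = np.asarray(channel, dtype=np.float64)
    ref = crop_interior(np.asarray(reference, dtype=np.float64))
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, shift=(dy, dx), axis=(0, 1))
            cand = crop_interior(shifted)
            # Negate NCC so that "smaller is better" holds for both metrics.
            score = ssd(cand, ref) if metric == "ssd" else -ncc(cand, ref)
            if score < best_score:
                best_shift, best_score = (dx, dy), score
    return best_shift
```

Scoring only the cropped interior is what keeps the borders from dominating either metric.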

For the larger TIFF images, the search range is too large to iterate over efficiently. For these pictures, I implemented an image pyramid technique to find the best alignment. I downscale the pictures by powers of 2 until the image is small enough to work with easily. After finding the alignment for the smallest image, and with a little bit of math to rescale the offset, I can then move to the next smallest downscale factor and probe a smaller range of alignments. In essence, the image pyramid works like a binary search, narrowing in on the optimal alignment more efficiently.
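The coarse-to-fine idea can be sketched as follows. This is only an illustration under assumptions: the 2x2 block-average downscaling, the parameter values (min_size, coarse_radius, fine_radius), and all function names are hypothetical choices, not the ones used in the project.

```python
import numpy as np

def score_ssd(a, b, frac=0.10):
    """SSD over the interior, ignoring `frac` of the image on each side."""
    h, w = a.shape
    dy, dx = int(h * frac), int(w * frac)
    diff = a[dy:h - dy, dx:w - dx] - b[dy:h - dy, dx:w - dx]
    return np.sum(diff ** 2)

def search(channel, reference, center=(0, 0), radius=2):
    """Exhaustively test (dx, dy) shifts within `radius` of `center`."""
    cx, cy = center
    best, best_score = center, np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(channel, shift=(dy, dx), axis=(0, 1))
            s = score_ssd(shifted, reference)
            if s < best_score:
                best, best_score = (dx, dy), s
    return best

def downscale2(img):
    """Halve each dimension by averaging 2x2 blocks (odd edges are trimmed)."""
    h, w = img.shape
    img = img[:h - h % 2, :w - w % 2]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, reference, min_size=64, coarse_radius=15, fine_radius=2):
    """Return the (dx, dy) shift of `channel` onto `reference`, coarse to fine."""
    channel = np.asarray(channel, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    if min(channel.shape) <= min_size:
        # Smallest level: a wide search is cheap here.
        return search(channel, reference, radius=coarse_radius)
    dx, dy = align_pyramid(downscale2(channel), downscale2(reference),
                           min_size, coarse_radius, fine_radius)
    # One pixel at the coarser level is two pixels here, so double the
    # estimate and refine it within a small window.
    return search(channel, reference, center=(2 * dx, 2 * dy), radius=fine_radius)
```

Only the smallest level pays for a wide search; every larger level tests just (2 * fine_radius + 1)^2 shifts around the doubled estimate.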

Bells and Whistles

Since the images are not aligned perfectly, the borders become an amalgamation of the primary and secondary colors. I implemented automatic cropping to remove these eyesores. Rather than calling complicated border detection formulas, I simply took the mean across the rows and columns and reasoned that the minimum value would be on the border. I then arbitrarily set a threshold to indicate where the border stops and the image begins. After adding a small pixel buffer (0.75% of the dimension) to remove excess artifacts, and some more math to account for the shift introduced by the alignment itself, I simply take the maximum border size across all three colors in each of the four directions and crop that amount. This technique struggles with naturally dark images (e.g. the top border of icon.tif) and pictures whose border fades gradually (e.g. the top border of train.tif). I additionally did some preprocessing to remove the white borders from the top and bottom. This change makes the division into color channels slightly more accurate, salvaging a few more rows of the image. Note that this also alters the alignment values in the y-direction compared to the naive implementation.
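A minimal sketch of this row and column mean heuristic is shown below. It assumes float channels scaled to [0, 1] that have already been shifted into alignment, so it leaves out the extra correction for the alignment offset. The threshold of 0.2, the 15% scan limit, and the function names are hypothetical; only the 0.75% buffer comes from the description above.

```python
import numpy as np

def border_width(profile, threshold, max_frac=0.15):
    """Width of the dark border at the start of a 1-D mean-intensity profile.

    Only the first `max_frac` of the profile is scanned. Returns the index
    just past the last entry darker than `threshold`, or 0 if none is found.
    """
    limit = int(len(profile) * max_frac)
    dark = np.flatnonzero(profile[:limit] < threshold)
    return int(dark[-1]) + 1 if dark.size else 0

def crop_box(channel, threshold=0.2, buffer_frac=0.0075):
    """Return (top, bottom, left, right) crop widths for one channel."""
    h, w = channel.shape
    rows = channel.mean(axis=1)  # one mean per row
    cols = channel.mean(axis=0)  # one mean per column
    pad_y, pad_x = int(h * buffer_frac), int(w * buffer_frac)
    return (border_width(rows, threshold) + pad_y,
            border_width(rows[::-1], threshold) + pad_y,
            border_width(cols, threshold) + pad_x,
            border_width(cols[::-1], threshold) + pad_x)

def auto_crop(channels, threshold=0.2):
    """Crop all channels by the largest border found on each side."""
    boxes = np.array([crop_box(c, threshold) for c in channels])
    top, bottom, left, right = boxes.max(axis=0)
    h, w = channels[0].shape
    return [c[top:h - bottom, left:w - right] for c in channels]
```

Because the threshold is fixed, a dark scene near the edge or a slowly fading border can be mistaken for border or missed, which matches the failure cases noted above.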

Results

Below are the results from the uncropped and the automatically cropped versions for the given starter images and a few additional images. Each image lists two displacement vectors: the [x, y] offsets of the green and the red channels, respectively, for the uncropped version. Again, these alignment offsets differ from those of the cropped version, since I removed the white border in preprocessing.

Columns: Image Name (with green and red [x, y] offsets); Base Alignment; Cropped Alignment

Small JPG Images
Cathedral: [2, 5] [3, 12]
Monastery: [2, -3] [2, 3]
Tobolsk: [3, 3] [3, 6]

Larger TIFF Images
Emir: [24, 49] [57, 103]
Harvesters: [16, 59] [13, 124]
Icon: [17, 41] [23, 90]
Lady: [8, 56] [11, 116]
Melons: [11, 82] [13, 178]
Onion_church: [27, 51] [36, 108]
Self_portrait: [29, 79] [37, 176]
Three_generations: [14, 53] [11, 112]
Train: [6, 43] [32, 87]
Village: [12, 65] [22, 138]
Workshop: [0, 53] [-12, 105]

Other Pictures from the Prokudin-Gorsky Collection
Gong: [19, 63] [26, 135]
Monument: [21, 24] [29, 60]
Vendor: [-4, 51] [-24, 113]

Final Thoughts

Overall, the images look realistic, considering that they were black and white beforehand. I am especially happy with the performance of the cropping, which goes to show that simple measures can work well. However, the images are far from perfect. The emir.tif image is slightly misaligned from the blue channel's perspective, which can be seen from the sleeve on his left hand and the door on the right. This is most likely due to the stark blue of his outfit, which interferes with the SSD (L2) metric that I was using. For this image specifically, aligning on edge-detected versions of the three color channels would likely yield better results. Additionally, many images could be improved with better color balancing and contrast adjustment. The image vendor.tif is a good example of this, since it has a red hue and the black parts of the shadow look blue instead. Nevertheless, the images are more than enough to give us in the modern era a glimpse into life in Russia in Prokudin-Gorsky's time.