Sergei Mikhailovich Prokudin-Gorskii (1863-1944) [Сергей Михайлович Прокудин-Горский] was convinced, as early as 1907, that color photography was the wave of the future. He traveled across the vast Russian Empire and took color photographs of everything he saw, including the only color portrait of Leo Tolstoy, other people, buildings, landscapes, railroads, bridges, etc. His idea was simple: record three exposures of every scene onto a glass plate using a red, a green, and a blue filter. He envisioned special projectors installed in "multimedia" classrooms all across Russia, where children would be able to learn by combining all the slides with red, green, and blue lights. Although his plans never materialized, his RGB glass plate negatives, capturing the last years of the Russian Empire, survived and were purchased in 1948 by the Library of Congress. The LoC has recently digitized the negatives and made them available online.

The goal of this project is to use automatic image processing techniques to produce a color image from the digitized Prokudin-Gorskii glass plate images, as shown on the right.

For this project, I explored several ways to align the images, sped up the alignment using image pyramids, and implemented autocrop, auto white balance (AWB), and auto contrast.

Alignment and Image Pyramids

For the naive solution to align the different colored glass plates, I essentially brute forced different [x, y] offsets between the plates and measured how well they aligned at each one. I used the blue slide as the base image and tried fitting the green and red images on top of it. After recording the ideal offsets for the green and red slides, I used them to shift the corresponding color channels. The two metrics I tried for measuring image alignment were Sum of Squared Differences (SSD) and Normalized Cross Correlation (NCC).
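The brute-force search described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function names, the default search radius of 15, and the use of np.roll (which wraps pixels around the border) are all assumptions made for the example.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means better aligned."""
    return float(np.sum((a - b) ** 2))

def brute_force_align(moving, base, radius=15, score=ssd):
    """Try every (dx, dy) in [-radius, radius]^2 and keep the lowest score.

    `radius` is an assumed parameter; the write-up does not state the
    window used for the naive search.
    """
    best_offset, best_score = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around the border (a simplification).
            shifted = np.roll(moving, shift=(dy, dx), axis=(0, 1))
            s = score(shifted, base)
            if s < best_score:
                best_offset, best_score = (dx, dy), s
    return best_offset

# Toy check: displace a random "plate" by a known amount and recover it.
rng = np.random.default_rng(0)
base = rng.random((64, 64))
moving = np.roll(base, shift=(-3, 5), axis=(0, 1))  # moved up 3, right 5
print(brute_force_align(moving, base, radius=8))  # prints (-5, 3)
```

The returned offset is the shift that moves the second plate back onto the base, which is why it is the negative of the displacement applied in the toy check.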

While SSD measures the differences between the pixel values themselves, NCC treats the images as vectors and measures the difference in terms of the angle separating them. After trying both, I found that SSD worked better.
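To make the difference between the two metrics concrete, here is a small sketch of both. The function names are hypothetical, and the NCC shown is the plain "cosine of the angle between flattened images" form described above; other variants subtract the mean first.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between pixel values; lower is better."""
    a = a.astype(float)
    b = b.astype(float)
    return float(np.sum((a - b) ** 2))

def ncc(a, b):
    """Cosine of the angle between the images viewed as vectors; higher is better."""
    a = a.astype(float).ravel()
    b = b.astype(float).ravel()
    return float(np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b)))

plate = np.arange(1, 13, dtype=float).reshape(3, 4)
brighter = 2 * plate  # same content, uniformly brighter
print(ssd(plate, brighter) > 0)       # SSD penalizes the brightness change
print(round(ncc(plate, brighter), 6)) # NCC does not: the angle is unchanged
```

Because NCC only looks at direction, a uniform brightness scaling between plates leaves it unchanged, while SSD treats it as a mismatch. Note that SSD is minimized and NCC is maximized, so a search loop has to flip the comparison (or negate NCC) when switching metrics.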

For smaller images, this brute-force solution was fast enough. For larger images, trying a large set of offsets at full resolution was simply too slow. To handle these, I implemented an image pyramid that recursively scales the image down by a factor of 2 until it reaches a minimum size (in my case, under 500 pixels). On this smallest image, I brute force an offset range of [-20, 20] pixels and find the best offset. As the recursion unwinds and the images become larger, I scale the offset up by 2 at each level and perform a small localized search of [-2, +2] pixels around the predicted offset to fine-tune the result. This gave a significant speedup: the wide search only runs on the smallest image, each larger level needs just a 5 x 5 search, and the number of levels grows only logarithmically with the image size.
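A rough sketch of this coarse-to-fine recursion is shown below. The helper names, the 2x2 block averaging used for downscaling, and the wrap-around shifting via np.roll are assumptions made for illustration; only the 500-pixel threshold, the [-20, 20] coarse window, and the [-2, +2] refinement window come from the description above.

```python
import numpy as np

def ssd(a, b):
    return float(np.sum((a - b) ** 2))

def search(moving, base, center, radius):
    """Exhaustive search of (dx, dy) offsets within `radius` of `center`."""
    cx, cy = center
    best, best_score = center, float("inf")
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            s = ssd(np.roll(moving, (dy, dx), axis=(0, 1)), base)
            if s < best_score:
                best, best_score = (dx, dy), s
    return best

def downscale(img):
    """Halve each dimension by averaging 2x2 blocks (an assumed resampling choice)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def pyramid_align(moving, base, min_size=500, coarse_radius=20, fine_radius=2):
    """Wide search at the coarsest level, then a small refinement per level."""
    if max(base.shape) < min_size:
        return search(moving, base, (0, 0), coarse_radius)
    dx, dy = pyramid_align(downscale(moving), downscale(base),
                           min_size, coarse_radius, fine_radius)
    # One pixel at the coarser level corresponds to two pixels here.
    return search(moving, base, (2 * dx, 2 * dy), fine_radius)

# Toy check on a small random image (min_size lowered so the pyramid has levels).
rng = np.random.default_rng(1)
base = rng.random((128, 128))
moving = np.roll(base, shift=(8, -12), axis=(0, 1))
print(pyramid_align(moving, base, min_size=40))  # prints (12, -8)
```

In this toy run the full-resolution level evaluates only 25 candidate offsets, while a direct search wide enough to find the same displacement at that resolution would need hundreds.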

Example Images

As you can see, the alignment works in most, but not all, situations. On emir.tif in particular, it does quite poorly. In the next section, I go over the fixes I used and the end results.

Fixing Alignment

To fix alignment issues, I implemented a few fixes that were not mentioned in the specifications.

Normalizing Color Vectors

Here, I treated each color slide as a matrix and normalized it element-wise by subtracting the mean value of the matrix and dividing by its standard deviation. I did this because in some images the color channels do not cover the same range of values and are therefore not directly comparable. Converting them to standard units proved to be helpful.
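The normalization described above amounts to converting each channel to standard units. A minimal sketch, with a hypothetical function name and an added guard for completely flat channels:

```python
import numpy as np

def standardize(channel):
    """Subtract the mean and divide by the standard deviation, element-wise."""
    channel = channel.astype(float)
    std = channel.std()
    if std == 0:
        # Flat channel: avoid dividing by zero (a guard added for this sketch).
        return channel - channel.mean()
    return (channel - channel.mean()) / std

dim = np.array([[10.0, 20.0], [30.0, 40.0]])
bright = 2 * dim + 3  # same content with a different gain and offset
print(np.allclose(standardize(dim), standardize(bright)))
```

After this step, two channels that differ only by a brightness gain and offset become identical, so a pixel-difference metric such as SSD compares structure rather than raw intensity.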

Using Green as Base

The starter code used blue as the base image onto which we were to align the red and green images. However, through experimentation I found that using green as the base tended to work better. For emir.tif specifically, it seems that different parts of the image have different color distributions, so directly matching pixel values between channels does not work. Green, however, seemed to be the channel whose intensity was typically between those of blue and red, so it worked well as a base.

Cropping Image

The final trick I used was to focus the matching on the center of the images. I cropped each image to its central region so that each side was only 65% of its original length. Many of the provided images had artifacts at their edges, so removing them and focusing on the main subject of the image (which tends to be near the center) was really helpful.
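A central crop like the one described above could look like the following sketch. The function name is hypothetical, and the crop is assumed to be centered with the same 65% fraction applied to both height and width.

```python
import numpy as np

def center_crop(img, keep=0.65):
    """Return the central region whose sides are `keep` times the original lengths."""
    h, w = img.shape[:2]
    ch, cw = int(round(h * keep)), int(round(w * keep))
    top, left = (h - ch) // 2, (w - cw) // 2
    return img[top:top + ch, left:left + cw]

plate = np.zeros((100, 200))
print(center_crop(plate).shape)  # prints (65, 130)
```

One possible design is to apply the crop only when scoring candidate offsets, so the final composite still uses the full plates.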

Using Edge Detection

While I did not try using edge detection, I believe it would have been really useful in handling most of the problems faced when colors themselves had different brightnesses. I did end up using edge detection for autocropping, which I cover in the "Bells and Whistles" section.

Comparing Before and After Fixes

Hover to see the before images

All Processed Images

Provided Examples

R Displacement: (1, 7) | B Displacement: (-2, -5) | Runtime: 1.41 s
R Displacement: (-8, 33) | B Displacement: (-4, -25) | Runtime: 6.6 s
R Displacement: (17, 57) | B Displacement: (-24, -49) | Runtime: 6.07 s
R Displacement: (-3, 65) | B Displacement: (-17, -59) | Runtime: 6.31 s
R Displacement: (5, 48) | B Displacement: (-17, -41) | Runtime: 6.41 s
R Displacement: (4, 62) | B Displacement: (-9, -55) | Runtime: 6.89 s
R Displacement: (3, 96) | B Displacement: (-11, -82) | Runtime: 6.98 s
R Displacement: (1, 6) | B Displacement: (-2, 3) | Runtime: 1.22 s
R Displacement: (10, 57) | B Displacement: (-27, -51) | Runtime: 6.39 s
R Displacement: (-16, 107) | B Displacement: (11, -33) | Runtime: 7.47 s
R Displacement: (8, 98) | B Displacement: (-29, -79) | Runtime: 6.43 s
R Displacement: (-3, 59) | B Displacement: (-14, -52) | Runtime: 6.23 s
R Displacement: (1, 4) | B Displacement: (-3, -3) | Runtime: 1.48 s
R Displacement: (27, 43) | B Displacement: (-6, -43) | Runtime: 7.2 s

Extra Samples

R Displacement: (25, 71) | B Displacement: (-27, -14) | Runtime: 8.82 s
R Displacement: (9, 38) | B Displacement: (-29, 16) | Runtime: 7.84 s
R Displacement: (-1, 66) | B Displacement: (2, -14) | Runtime: 7.05 s
R Displacement: (-16, 107) | B Displacement: (11, -33) | Runtime: 7.47 s
R Displacement: (15, 13) | B Displacement: (-12, 21) | Runtime: 7.1 s
R Displacement: (-20, 50) | B Displacement: (18, -26) | Runtime: 8.62 s

Bells and Whistles: Autocrop

Many of the images here have borders of varying sizes in the different color channels, and these borders often do not align properly. In this part, I attempt to crop out the borders automatically. The project website suggests using dissimilarity values at different parts of the image to determine where the mismatched borders are located. I wanted to explore using edge detection instead. My solution involved manually implementing Sobel filters to find vertical and horizontal edges in each of the color channels. Then, I found which vertical and horizontal sections of the image had the largest edge/gradient values. These rows and columns indicate my new borders.
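The edge-based border search described above might be sketched as follows for a single channel. This is only an illustration under stated assumptions: the function names are hypothetical, and restricting the search to the outer 10% of the image on each side (the `margin` parameter) is an assumption, since the write-up does not say how the candidate rows and columns were limited.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)  # vertical edges
SOBEL_Y = SOBEL_X.T                                                    # horizontal edges

def filter3x3(img, kernel):
    """'Valid' 3x3 cross-correlation built from shifted slices."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out

def autocrop_bounds(channel, margin=0.1):
    """Return (top, bottom, left, right) at the strongest edge near each side."""
    channel = channel.astype(float)
    col_strength = np.abs(filter3x3(channel, SOBEL_X)).sum(axis=0)
    row_strength = np.abs(filter3x3(channel, SOBEL_Y)).sum(axis=1)
    mh = max(1, int(len(row_strength) * margin))
    mw = max(1, int(len(col_strength) * margin))
    # The +1 maps an index in the 'valid' output back to an input pixel index.
    top = int(np.argmax(row_strength[:mh])) + 1
    bottom = len(row_strength) - mh + int(np.argmax(row_strength[-mh:])) + 1
    left = int(np.argmax(col_strength[:mw])) + 1
    right = len(col_strength) - mw + int(np.argmax(col_strength[-mw:])) + 1
    return top, bottom, left, right

# Toy plate: a dark 5-pixel border around a bright interior.
plate = np.zeros((100, 100))
plate[5:95, 5:95] = 1.0
top, bottom, left, right = autocrop_bounds(plate)
cropped = plate[top:bottom, left:right]
```

For a color image, one option is to compute the bounds for each channel and keep the innermost value on every side, so that the border of every channel is removed.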

Sobel Filter Kernels
Onion Church Composition
Horizontal Edges
Vertical Edges

Comparing Before and After Autocrop

Hover to see the before images

Bells and Whistles: Auto White Balance

To implement AWB, I took the brightest pixel in the image for each color channel and determined the multiplier needed to scale it up to 255. I then applied that multiplier across all pixels and all color channels. For most of the images, this did not change the result much. Hover to see the original images.
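One way to read the description above is as per-channel "white patch" scaling, where each channel gets its own multiplier. The sketch below follows that reading, which is an assumption; the function name is hypothetical, and the image is assumed to be an H x W x 3 array with values in 0-255.

```python
import numpy as np

def white_patch_balance(img):
    """Scale each channel so that its brightest pixel becomes 255."""
    img = img.astype(float)
    peaks = img.reshape(-1, 3).max(axis=0)    # brightest value per channel
    gains = 255.0 / np.maximum(peaks, 1.0)    # guard against all-black channels
    return np.clip(np.rint(img * gains), 0, 255).astype(np.uint8)

sample = np.zeros((2, 2, 3), dtype=np.uint8)
sample[0, 0] = (200, 100, 250)
sample[1, 1] = (100, 50, 125)
balanced = white_patch_balance(sample)
print(balanced.reshape(-1, 3).max(axis=0))  # each channel now peaks at 255
```

If a channel already contains a value at or near 255, its multiplier is close to 1, which would be consistent with the small changes reported for most images.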

Bells and Whistles: Auto Contrast

To add auto contrast, I manually implemented the histogram equalization technique discussed in class. I first created a histogram of the intensity values. For each intensity value, I then took the cumulative sum of the histogram counts up to that intensity. Finally, I treated each bin as the input value of a point-wise function and its cumulative sum as the remapped output value.
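The steps above can be sketched for a single 8-bit channel as follows. The function name is hypothetical, and scaling the cumulative sum to the 0-255 range before using it as a lookup table is an assumed detail.

```python
import numpy as np

def equalize(channel, levels=256):
    """Histogram-equalize a 2D uint8 channel via a cumulative-sum lookup table."""
    hist = np.bincount(channel.ravel(), minlength=levels)  # count per intensity
    cdf = np.cumsum(hist) / channel.size                   # fraction at or below each bin
    lut = np.rint(cdf * (levels - 1)).astype(np.uint8)     # bin -> remapped value
    return lut[channel]                                    # point-wise remap

# Low-contrast toy image using only intensities 100..119.
flat = (np.arange(400) % 20 + 100).astype(np.uint8).reshape(20, 20)
stretched = equalize(flat)
print(flat.min(), flat.max(), "->", stretched.min(), stretched.max())
```

Because the cumulative sum never decreases, the remapping preserves the ordering of intensities while spreading them over the full output range.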

Example of Histogram Equalization

For my first implementation, I created a separate histogram for each of the red, green, and blue color channels. While it did not maintain the original colors well, it still produced some interesting results. Hover to see the original images:

For my second implementation, I created a single histogram across all of the red, green, and blue color channels. It seems to do more of what we want while maintaining the original colors. Hover to see the original images:
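The shared-histogram variant could be sketched like this. As before, the function name and the 0-255 scaling are assumptions; the point of the example is that a single lookup table is applied to all three channels.

```python
import numpy as np

def equalize_shared(img, levels=256):
    """Equalize an H x W x 3 uint8 image with one histogram over all channels."""
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / img.size
    lut = np.rint(cdf * (levels - 1)).astype(np.uint8)
    return lut[img]  # the same remapping for R, G, and B

rng = np.random.default_rng(2)
photo = rng.integers(50, 150, size=(16, 16, 3), dtype=np.uint8)
result = equalize_shared(photo)

# Wherever red was at least as bright as green before, it still is afterwards.
mask = photo[..., 0] >= photo[..., 1]
print(bool(np.all(result[..., 0][mask] >= result[..., 1][mask])))
```

Since every channel passes through the same non-decreasing mapping, the relative ordering of R, G, and B at each pixel is kept, which is one plausible reason this version preserves colors better than equalizing each channel separately.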