Colorizing the Prokudin-Gorskii Photo Collection

Overview

"Sergei Mikhailovich Prokudin-Gorskii (1863-1944) [Сергей Михайлович Прокудин-Горский, to his Russian friends] was a man well ahead of his time. Convinced, as early as 1907, that color photography was the wave of the future, he won Tzar's special permission to travel across the vast Russian Empire and take color photographs of everything he saw including the only color portrait of Leo Tolstoy. And he really photographed everything: people, buildings, landscapes, railroads, bridges... thousands of color pictures! His idea was simple: record three exposures of every scene onto a glass plate using a red, a green, and a blue filter. Never mind that there was no way to print color photographs until much later -- he envisioned special projectors to be installed in "multimedia" classrooms all across Russia where the children would be able to learn about their vast country. Alas, his plans never materialized: he left Russia in 1918, right after the revolution, never to return again. Luckily, his RGB glass plate negatives, capturing the last years of the Russian Empire, survived and were purchased in 1948 by the Library of Congress. The LoC has recently digitized the negatives and made them available on-line."

The goal is to use image processing techniques to combine these black-and-white negatives into color photographs.

Techniques

Alignment using Sum of Squared Differences Error Metric

Imagine a 2D grid in which we can shift one black-and-white image in the x or y direction to align it with another. At each of the 15x15 candidate positions, we compute the Sum of Squared Differences (SSD) between the two images, which quantifies how much they differ pixel by pixel. We then pick the position with the smallest SSD.
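
As a minimal sketch of the search described above (not the project's actual code): the names `ssd` and `align_ssd`, the `radius` parameter, and the (dx, dy) return convention are assumptions for illustration. A radius of 7 gives the 15x15 grid of candidates. For brevity the sketch shifts with `np.roll`, which wraps pixels around the edge instead of filling them with a dummy value.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def align_ssd(channel, reference, radius=7):
    """Try every shift in [-radius, radius]^2 and return the best (dx, dy)."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Assumption: wrap-around shift; rows move by dy, columns by dx.
            shifted = np.roll(channel, shift=(dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:  # smaller SSD means a better match
                best_shift, best_score = (dx, dy), score
    return best_shift
```

Calling `align_ssd(green, blue)` would return the shift to apply to the green plate so that it lines up with the blue one, assuming both are 2D arrays of the same shape.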

Alignment using Normalized Cross-Correlation

Repeat the above procedure, but instead of the SSD error metric, treat each black-and-white image as a vector, normalize it, and take the dot product of the two vectors to measure how correlated the images are. Here we pick the position with the largest value rather than the smallest.
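
The metric described above can be sketched as follows. The function names, the `radius` parameter, and the (dx, dy) convention are assumptions for illustration, and normalization here means scaling each flattened image to unit length. Subtracting the mean before normalizing is a common variant that the text does not mention, so it is left out.

```python
import numpy as np

def ncc(a, b):
    """Dot product of two images after flattening and scaling to unit length."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    if na == 0 or nb == 0:
        return 0.0  # an all-zero image has no direction to compare
    return float(np.dot(a / na, b / nb))

def align_ncc(channel, reference, radius=7):
    """Try every shift in [-radius, radius]^2 and return the best (dx, dy)."""
    best_shift, best_score = (0, 0), -np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Assumption: wrap-around shift, as a stand-in for a filled shift.
            shifted = np.roll(channel, shift=(dy, dx), axis=(0, 1))
            score = ncc(shifted, reference)
            if score > best_score:  # larger correlation means a better match
                best_shift, best_score = (dx, dy), score
    return best_shift
```

Because both vectors are scaled to unit length, a uniformly brighter or darker copy of the same image still scores close to 1, which is the main practical difference from SSD.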

Ignore Borders

The two metrics above work much better for auto-aligning images if we remove the borders before computing them. There is a lot of noise around the borders of an image, such as ripples, tears, and writing. In addition, when we shift an image in some direction, we need to fill in the vacated side with a dummy value (I picked the average pixel value for that glass plate negative). This keeps the dot product consistent. Another possibility would be to just throw away these pixel values.
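
A rough sketch of both ideas, under stated assumptions: the helper names, the 10% crop fraction, and the use of the channel's own mean as the default fill are illustrative choices, not values taken from the project.

```python
import numpy as np

def crop_interior(img, frac=0.1):
    """Drop a margin of `frac` of the height/width on every side (assumed 10%)."""
    h, w = img.shape[:2]
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def shift_with_fill(img, dx, dy, fill=None):
    """Shift a 2D image by (dx, dy); vacated pixels get `fill` (default: image mean)."""
    if fill is None:
        fill = img.mean()
    h, w = img.shape
    out = np.full((h, w), fill, dtype=np.float64)
    # Source window that remains visible after the shift.
    y0, y1 = max(0, -dy), min(h, h - dy)
    x0, x1 = max(0, -dx), min(w, w - dx)
    if y1 <= y0 or x1 <= x0:
        return out  # shifted completely out of frame
    out[y0 + dy:y1 + dy, x0 + dx:x1 + dx] = img[y0:y1, x0:x1]
    return out
```

In this setup the metric would be computed on `crop_interior(shift_with_fill(channel, dx, dy))` against `crop_interior(reference)`, so that the noisy plate borders and most of the filled pixels stay out of the comparison.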

Image Pyramid

It is extremely expensive to shift images. As the size/resolution of the images grows, the number of candidate positions we have to consider grows quadratically, and we also have to manipulate bigger matrices. So for large images, we shrink the image by a factor of 2 until it is small enough (under 256px wide), and then perform a search across a 12x12 grid of position shifts. Then we scale the result back up by factors of 2 and perform the search across a 2x2 grid each time. We end up doing far fewer comparisons, at only 4 comparisons for each recursive call. The base case is also fast since the image is so small.
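
The recursion can be sketched roughly as below. Several details are assumptions made for illustration: the function names, SSD as the cost, wrap-around shifts via `np.roll`, downsampling by simply keeping every second pixel (a real implementation would likely blur first), offsets -6..5 for the 12x12 base grid, and offsets {0, 1} for the 2x2 refinement.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def best_shift(channel, reference, center, offsets):
    """Search center + offsets in x and y; return the (dx, dy) with lowest SSD."""
    cx, cy = center
    best, best_score = center, np.inf
    for oy in offsets:
        for ox in offsets:
            dx, dy = cx + ox, cy + oy
            score = ssd(np.roll(channel, shift=(dy, dx), axis=(0, 1)), reference)
            if score < best_score:
                best, best_score = (dx, dy), score
    return best

def pyramid_align(channel, reference, min_width=256):
    """Coarse-to-fine alignment; returns the (dx, dy) to apply to `channel`."""
    if channel.shape[1] < min_width:
        # Base case: 12x12 grid of shifts on the smallest image.
        return best_shift(channel, reference, (0, 0), range(-6, 6))
    # Recurse on half-resolution copies (naive decimation, an assumption).
    dx, dy = pyramid_align(channel[::2, ::2], reference[::2, ::2], min_width)
    # One coarse pixel is two fine pixels: double the estimate, refine on 2x2.
    return best_shift(channel, reference, (2 * dx, 2 * dy), (0, 1))
```

With this structure, each level above the base costs four metric evaluations, so the total work is dominated by the small base-case search rather than by the full-resolution image.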

Sobel Edge Detector

Just using NCC alignment along with some cropping and filters to ignore noise, most images could be aligned very well, and they look pleasing to the eye. Emir of Bukhara was not aligned properly, though, because pixels that should line up do not correlate well in brightness when they come from different color channels. To remedy this, I passed all three color channels through a Sobel edge detector and then fed the results into the NCC alignment algorithm. Now we are aligning edges, which produces a much better image. Unfortunately, this didn't help much for the Three Generations photograph, where the green color channel is still slightly but visibly misaligned. It is possible that this image needs some other method, such as rotation, in order to be properly aligned.
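
For illustration, a Sobel gradient magnitude can be computed with plain NumPy as in the sketch below. The function name and the edge-replicating padding are assumptions, and the project may well have used a library filter instead.

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude from the 3x3 Sobel kernels, same shape as the input."""
    img = img.astype(np.float64)
    p = np.pad(img, 1, mode="edge")  # replicate borders (an assumption)
    # Horizontal gradient: right column minus left column, weighted 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    # Vertical gradient: bottom row minus top row, weighted 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)

# A region that is bright in one channel and dark in another gives the same edges.
step = np.zeros((6, 6))
step[:, 3:] = 1.0
same_edges = np.allclose(sobel_magnitude(step), sobel_magnitude(1.0 - step))
```

Here `same_edges` is true because inverting the brightness only flips the sign of the gradient, not its magnitude. That is why aligning edge maps is less sensitive to the brightness disagreement between color channels than aligning raw pixel values.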

Automatic Cropping

The next step to making these images look even better would be to implement an algorithm that automatically crops the borders off the colorized images. This makes a huge difference to the eye. I tried using edge detectors to find horizontal and vertical edges along the borders, but this would often crop off one of the outer borders and leave some other border uncropped. Ultimately, it was easy to find a border of the image, but hard to find the correct (innermost) border to crop along.

	NCC Alignment Algorithm	Sobel Edge Detector + NCC
	green offset: (2, 5), red offset: (3, 12)	green offset: (2, 5), red offset: (3, 12)
	green offset: (24, 49), red offset: (75, 42)	green offset: (24, 49), red offset: (40, 107)
	green offset: (16, 59), red offset: (13, 123)	green offset: (16, 60), red offset: (14, 124)
	green offset: (17, 41), red offset: (23, 89)	green offset: (17, 42), red offset: (23, 90)
	green offset: (8, 51), red offset: (11, 112)	green offset: (9, 56), red offset: (13, 120)
	green offset: (10, 81), red offset: (13, 178)	green offset: (10, 80), red offset: (13, 177)
	green offset: (2, -3), red offset: (2, 3)	green offset: (2, -3), red offset: (2, 3)
	green offset: (26, 51), red offset: (36, 108)	green offset: (24, 52), red offset: (35, 107)
	green offset: (29, 78), red offset: (37, 176)	green offset: (29, 78), red offset: (37, 176)
	green offset: (14, 53), red offset: (11, 112)	green offset: (12, 54), red offset: (9, 111)
	green offset: (2, 3), red offset: (3, 6)	green offset: (2, 3), red offset: (3, 6)
	green offset: (5, 42), red offset: (32, 87)	green offset: (0, 41), red offset: (29, 85)
	green offset: (11, 64), red offset: (22, 137)	green offset: (10, 64), red offset: (24, 137)
	green offset: (0, 52), red offset: (-12, 105)	green offset: (0, 53), red offset: (-12, 105)