CS 194-26 Project 1:
Colorizing the Prokudin-Gorskii photo collection

Donglei Cai

Background

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) [Сергей Михайлович Прокудин-Горский, to his Russian friends] was a man well ahead of his time. Convinced, as early as 1907, that color photography was the wave of the future, he won the Tsar's special permission to travel across the vast Russian Empire and take color photographs of everything he saw, including the only color portrait of Leo Tolstoy. And he really photographed everything: people, buildings, landscapes, railroads, bridges... thousands of color pictures! His idea was simple: record three exposures of every scene onto a glass plate using a red, a green, and a blue filter. Never mind that there was no way to print color photographs until much later; he envisioned special projectors to be installed in "multimedia" classrooms all across Russia where the children would be able to learn about their vast country. Alas, his plans never materialized: he left Russia in 1918, right after the revolution, never to return. Luckily, his RGB glass plate negatives, capturing the last years of the Russian Empire, survived and were purchased in 1948 by the Library of Congress. The LoC has recently digitized the negatives and made them available online.

Overview

The goal of this project is to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image with as few visual artifacts as possible. To do this, we need to extract the three color channel images, place them on top of each other, and align them so that they form a single RGB color image. A cool explanation of how the Library of Congress created the color images on their site is available here.

Exhaustive Search

The easiest way to align the parts is to exhaustively search over a window of possible displacements (say [-15,15] pixels), score each one using some image matching metric, and take the displacement with the best score. Here I use the Sum of Squared Differences (SSD) distance, which is the squared L2 norm of the difference between the two images.
First I divided each image into three equal parts, which correspond to the three color channels, and then aligned the Red and Green parts to the Blue part.
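The splitting step can be sketched as below. This is a minimal illustration, not the code used for the results: the function name `split_plate` is hypothetical, and the top-to-bottom channel order (blue, green, red) is an assumption about how the scans are laid out.

```python
import numpy as np

def split_plate(plate):
    """Split a stacked glass-plate scan into three equal-height parts.

    Assumes `plate` is a 2-D grayscale array whose three exposures are
    stacked top to bottom in blue, green, red order (an assumption).
    """
    height = plate.shape[0] // 3          # leftover rows at the bottom are dropped
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red
```

The three returned arrays have identical shapes, so they can be compared pixel by pixel during alignment.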
Exhaustive search is fast on the JPEG images because they are relatively small (in terms of the number of pixels). It becomes very expensive when the required displacement is large, because the number of candidate displacements grows quadratically with the search range and every candidate is scored over the whole image. This often happens with high-resolution images.
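A minimal sketch of exhaustive search with SSD scoring follows. It makes several assumptions that are not taken from the project code: the names `ssd` and `align_exhaustive` are hypothetical, shifts are applied with `np.roll` (which wraps pixels around the border rather than discarding them), and the result is returned as (row shift, column shift), which may not match the ordering used in the reported displacements.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def align_exhaustive(channel, reference, radius=15):
    """Try every shift in [-radius, radius]^2 and keep the lowest SSD.

    Returns (dy, dx): the roll to apply to `channel` so that it best
    matches `reference`. The tuple order is this sketch's convention.
    """
    best_score = np.inf
    best_shift = (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps around the edges (a simplification)
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score = score
                best_shift = (dy, dx)
    return best_shift
```

With a radius of 15 this scores 31 * 31 = 961 candidate shifts, each over the full image, which is why the cost grows quickly as the radius or the image size increases.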

Result


Cathedral R = (3,12), G = (2,5)


Monastery R = (2,3), G = (2,-3)


Tobolsk R = (3,6), G = (3,3)

Pyramid Search

As mentioned before, exhaustive search becomes expensive and slow for high-resolution images because the pixel displacement can be very large. The pyramid alignment algorithm is an efficient way to find the displacement. For each image, we create several levels, downscaling the image by a factor of 2^(level) at each level. The base level 0 is the original image. Starting from the highest level, where the image is the smallest and coarsest, we apply exhaustive search with window size 8*8 and then move the image based on the rescaled displacement. The aligned image then becomes the input of the next level in the pyramid. We move down a level each time until we reach the base level. This significantly speeds up the alignment.

In addition, I chose the green part of the images as the base for alignment. I chose green because, for the emir image, the other two bases give poor results compared to the green base. For the other images, there was no significant difference among the bases.
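The coarse-to-fine procedure can be sketched as follows. This is an illustration under stated assumptions rather than the project's implementation: all function names are hypothetical, downscaling is done by averaging 2x2 blocks, shifts use `np.roll` (which wraps around the border), and the `levels` and `radius` defaults are placeholder values, not the settings used for the results below.

```python
import numpy as np

def downscale2(img):
    """Halve both dimensions by averaging 2x2 blocks (an assumed method)."""
    h = img.shape[0] // 2 * 2
    w = img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def search(channel, reference, radius):
    """Exhaustive SSD search over shifts in [-radius, radius]^2."""
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = float(np.sum((shifted - reference) ** 2))
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, reference, levels=4, radius=4):
    """Return (dy, dx) aligning `channel` to `reference`, coarse to fine."""
    channel = channel.astype(np.float64)
    reference = reference.astype(np.float64)
    if levels == 0:
        return search(channel, reference, radius)
    # Estimate the shift on the half-size images first.
    cdy, cdx = align_pyramid(downscale2(channel), downscale2(reference),
                             levels - 1, radius)
    # Rescale the coarse estimate, move the image, then refine locally.
    dy, dx = 2 * cdy, 2 * cdx
    moved = np.roll(channel, (dy, dx), axis=(0, 1))
    fdy, fdx = search(moved, reference, radius)
    return dy + fdy, dx + fdx
```

Because each level only searches a small window around the rescaled estimate from the level above, the total reachable displacement is roughly radius * 2^levels while the number of candidates scored per level stays fixed.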

Result


Church R = (-8,33), B = (-4,-25)


Emir R = (17,57), B = (-24,-49)


Harvesters R = (-3,65), B = (-16,-60)


Icon R = (5,48), B = (-17,-41)


Lady R = (3,63), B = (-8,-52)


Melons R = (4,96), B = (-10,-82)


Onion Church R = (10,57), B = (-26,-52)


Self Portrait R = (8,98), B = (-29,-79)


Three Generations R = (-3,58), B = (-14,-53)


Train R = (27,43), B = (-6,-42)


Workshop R = (-11,52), B = (0,-53)

Bells & Whistles (Extra Credit)

Automatic Cropping

The goal is to crop out the black/white/colored strips on the borders of the images to make the images look cleaner. I calculated the mean pixel value of each column and each row and set thresholds to filter out rows/columns that have extreme values. For rows, I crop out rows with mean pixel value > 0.95 or mean pixel value < 0.18. For columns, I crop out columns with mean pixel value > 0.85 or mean pixel value < 0.25. The overall image quality improves, but imperfections remain in the cropped images: some images still have small color strips at the border, while others are slightly overcropped.
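The thresholding idea can be sketched as below. The thresholds are the ones quoted above; everything else is an assumption made for illustration: the names `trim_bounds` and `auto_crop` are hypothetical, pixel values are taken to lie in [0, 1], and rows/columns are removed only from the outside in, up to the first and last row or column whose mean is within the limits.

```python
import numpy as np

def trim_bounds(means, low, high):
    """Return (start, stop) spanning the first to last acceptable mean."""
    keep = np.flatnonzero((means >= low) & (means <= high))
    if keep.size == 0:
        return 0, means.size              # nothing acceptable: leave as-is
    return int(keep[0]), int(keep[-1]) + 1

def auto_crop(img, row_limits=(0.18, 0.95), col_limits=(0.25, 0.85)):
    """Crop border rows/columns whose mean value is extreme.

    Works on 2-D grayscale or 3-D (height, width, channels) arrays
    with values assumed to be scaled to [0, 1].
    """
    row_axes = tuple(i for i in range(img.ndim) if i != 0)
    col_axes = tuple(i for i in range(img.ndim) if i != 1)
    row_means = img.mean(axis=row_axes)   # one mean per row
    col_means = img.mean(axis=col_axes)   # one mean per column
    top, bottom = trim_bounds(row_means, *row_limits)
    left, right = trim_bounds(col_means, *col_limits)
    return img[top:bottom, left:right]
```

Fixed thresholds like these are sensitive to image content, which is consistent with the leftover strips and slight overcropping noted above: a dark but legitimate edge region can fall below the lower limit, while a mid-gray border strip can pass both limits.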

Result

[Figure: 14 side-by-side image pairs, each labeled Original and Cropped]

Self Selected Images

Original R = (0,5), B = (-1,-3) Cropped
Original R = (2,7), B = (-4,-6) Cropped
Original R = (-2,5), B = (1,-5) Cropped