CS294-26 Project 1 Report

 

Images of The Russian Empire: Colorizing The Prokudin-Gorskii Photo Collection

Jimmy Xu

 

Overview

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) [Сергей Михайлович Прокудин-Горский] was a man well ahead of his time. Convinced, as early as 1907, that color photography was the wave of the future, he won the Tsar's special permission to travel across the vast Russian Empire and take color photographs of everything he saw, including the only color portrait of Leo Tolstoy. And he really photographed everything: people, buildings, landscapes, railroads, bridges... thousands of color pictures! His idea was simple: record three exposures of every scene onto a glass plate using a red, a green, and a blue filter.

Never mind that there was no way to print color photographs until much later -- he envisioned special projectors to be installed in "multimedia" classrooms all across Russia where the children would be able to learn about their vast country. Alas, his plans never materialized: he left Russia in 1918, right after the revolution, never to return again. Luckily, his RGB glass plate negatives, capturing the last years of the Russian Empire, survived and were purchased in 1948 by the Library of Congress.

In this project, I colorize a selection of pictures from the collection and describe my approach below.

Keywords: SSD, NCC, image pyramid search, and auto white balance.

 

Problem Statement

The goal of this project is to convert a glass plate image to a color image. Each original picture contains three exposures of the same scene, one per color channel, stacked vertically.

monastery

The naive approach is to divide a picture into three images of the same height and stack them along the color axis. However, this approach won't work because the three images don't align with each other.

The first challenge is then to find an appropriate displacement for each channel so that the channels align with each other. An additional challenge is to do this efficiently so that we can process very large pictures.
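As a minimal sketch of the naive approach, the plate can be cut into thirds and stacked without any alignment. The function name `naive_colorize` is hypothetical, and the blue, green, red order from top to bottom is an assumption that the text above does not state.

```python
import numpy as np

def naive_colorize(plate):
    """Split a vertically stacked plate into thirds and stack them as RGB.

    Assumes (not stated in the report) that the exposures are ordered
    blue, green, red from top to bottom.
    """
    h = plate.shape[0] // 3        # height of a single channel
    b = plate[:h]                  # top third
    g = plate[h:2 * h]             # middle third
    r = plate[2 * h:3 * h]         # bottom third
    return np.dstack([r, g, b])    # H x W x 3, no alignment applied
```

Because no displacement is applied, the result typically shows colored fringes wherever the three exposures are offset from each other.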

Here's a preview of an aligned image:

 

Channel Alignment

My approach to align the pictures is as follows:

  1. Select a base channel.

  2. If the resolution of the channel is too large, downscale it by a factor of 2. Repeat the downscaling recursively until the recursion limit is reached or the height or width falls below a threshold.

  3. For each of the other two channels, compute the displacement that minimizes the metric between that channel and the base channel:

    1. Roll the channel by every offset in [-10, 10) x [-10, 10) (20 x 20 = 400 trials in total).
    2. For both channels, crop 20% of the width and the height from each border.
    3. Calculate the metric between the cropped channels.
    4. Return the offset that minimizes the metric.
  4. The offset found at the lower resolution is scaled back up by a factor of 2 and used as the starting point of the rolling at the next higher resolution. Repeat step 3.
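The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used for the report: the helper names, the `min_size` threshold, and the 2 x 2 block-average downscaling are all hypothetical choices, and SSD is used as the metric.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences (lower is better)."""
    return np.sum((a - b) ** 2)

def center_crop(img, frac=0.2):
    """Drop `frac` of the height and width from each border."""
    h, w = img.shape
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def downscale(img):
    """Halve the resolution by averaging 2 x 2 blocks (assumed method)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, base, start=(0, 0), radius=10):
    """Try a 20 x 20 window of rolls around `start`; return the best one."""
    base_c = center_crop(base)
    best, best_score = start, np.inf
    for dy in range(start[0] - radius, start[0] + radius):
        for dx in range(start[1] - radius, start[1] + radius):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(center_crop(shifted), base_c)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best

def align(channel, base, min_size=256, radius=10):
    """Coarse-to-fine search for the (dy, dx) roll that aligns channel to base."""
    if min(base.shape) > min_size:
        # Solve the half-resolution problem first, then double its offset.
        coarse = align(downscale(channel), downscale(base), min_size, radius)
        start = (2 * coarse[0], 2 * coarse[1])
    else:
        start = (0, 0)
    return search(channel, base, start, radius)
```

The returned offset can be applied with `np.roll(channel, offset, axis=(0, 1))` before stacking the channels. This sketch keeps the same window size at every level, as in the steps above; a smaller window at the finer levels would be a further speed-up that is not described in the report.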

Metric

The metric measures the similarity or dissimilarity between two pictures. I tried the sum of squared differences (SSD) and normalized cross-correlation (NCC). After applying both metrics to several pictures, I found that SSD works better.
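For reference, the two metrics can be written as in the sketch below. The function names are hypothetical, and the NCC shown here subtracts each image's mean before normalizing, which is one common definition and an assumption about the variant that was tried.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: 0 for identical images, lower is better."""
    return float(np.sum((a - b) ** 2))

def ncc(a, b):
    """Normalized cross-correlation in [-1, 1]: higher is better."""
    a = a - a.mean()   # remove the mean so brightness offsets do not matter
    b = b - b.mean()
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

SSD is a dissimilarity that is minimized, while NCC is a similarity that is maximized, so a search loop that minimizes its score would use `-ncc(a, b)`.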

Base Channel Selection

The choice of base channel can have a significant impact on the final image. For example, the picture below uses blue as the base channel, which doesn't yield a good result. Simply changing the base channel to green or red makes the result much better. Through experimentation, I decided to use green as the base channel.

Center Crop

Instead of comparing the entire channel, I use the cropped center to calculate the metric, because the borders of the channels are usually damaged or missing. This also makes the computation faster. The picture below shows the result of aligning without the center crop. Note the border of the riverbank, which is more misaligned.

Image Pyramid Search

If the input image is very large, it is prohibitively expensive to exhaustively search for the best displacement pixel by pixel. I implemented a pyramid search (see Channel Alignment above for implementation details) that searches for the displacement from the coarsest scale to the finest scale, which greatly speeds up the process.

 

Colorized Images

For grading purposes, here's the list of aligned images:

 

Post Processing

Border Cropping

Since the border of each aligned image is very regular, I decided to center crop the picture, removing a fixed percentage from each side.
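A minimal sketch of such a crop is shown below. The function name and the default of 5% per side are hypothetical, since the exact percentage is not given above.

```python
import numpy as np

def crop_border(img, frac=0.05):
    """Remove `frac` of the height and width from each side of the image.

    `frac` is a hypothetical default; the report does not state the value.
    Works for both H x W and H x W x 3 arrays.
    """
    h, w = img.shape[:2]
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]
```

For example, `crop_border(img, 0.1)` on a 100 x 200 color image returns an 80 x 160 image.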

 

Auto White Balance

I used the method mentioned in the Python tutorial:

  1. Find the brightest pixel of the grey-scale picture
  2. Assume the brightest pixel is white, and calculate the scaling factor that maps it to white.
  3. Apply the scaling factor to the entire picture

This usually leads to a more accurate, though sometimes less appealing, representation of color.
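The three steps can be sketched as follows, assuming a float RGB image with values in [0, 1]. The function name is hypothetical, and taking the channel mean as the grey-scale image is an assumption; the tutorial's conversion may use different weights.

```python
import numpy as np

def white_balance(img):
    """Scale each channel so that the brightest pixel becomes white.

    img: H x W x 3 float array with values in [0, 1].
    """
    gray = img.mean(axis=2)                          # assumed grey-scale conversion
    y, x = np.unravel_index(np.argmax(gray), gray.shape)
    brightest = img[y, x]                            # RGB value of the brightest pixel
    scale = 1.0 / np.maximum(brightest, 1e-8)        # per-channel gain, avoids divide by zero
    return np.clip(img * scale, 0.0, 1.0)            # apply to the whole picture
```

Because the gain is computed per channel, any color cast present in the brightest pixel is removed from the whole picture, which is why the result can look more neutral than the unbalanced image.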

 

Final Results

 

Additional Results

I downloaded some additional glass plate images of Prokudin-Gorskii from the Library of Congress website. I then colorized and post-processed them.

 

Summary of Offset

For grading purposes, here's the list of offsets for each of the images:

Picture                  Blue to Green   Red to Green
cathedral.jpg            (-5,-2)         (7,1)
church.tif               (-25,-4)        (33,-8)
emir.tif                 (-49,-24)       (57,17)
harvesters.tif           (-59,-17)       (65,-3)
icon.tif                 (-41,-17)       (48,5)
lady.tif                 (-55,-9)        (62,4)
melons.tif               (-82,-11)       (96,3)
monastery.jpg            (3,-2)          (6,1)
onion_church.tif         (-51,-27)       (57,10)
self_portrait.tif        (-78,-29)       (98,8)
three_generations.tif    (-52,-14)       (59,-3)
tobolsk.jpg              (-3,-3)         (4,1)
train.tif                (-43,-6)        (43,26)
workshop.tif             (-53,1)         (52,-11)

 

Summary of Bells and Whistles

For grading purposes, here's the list of extra features I implemented for this project:

  1. Center crop for better alignment
  2. Auto white balance
  3. Crop out borders
  4. Test and select a better base channel
  5. Align additional pictures