CS 194-26 Project 1: Colorizing the Prokudin-Gorskii Photo Collection

Monica Tang


Overview:

Prokudin-Gorskii captured thousands of "photos" of the Russian Empire by recording each scene onto glass plates three times: once through a red filter, once through a green filter, and once through a blue filter. His RGB glass plate negatives have been digitized and are now available online through the Library of Congress, so we can see life in the early 1900s in color! This project aims to automatically produce a color image from these RGB plates via image-processing techniques.

Method:

The main idea is to split a digitized glass plate image into 3 color channel images (R, G, and B) and then align them with one another. To do so, we pick one channel as the reference and, for each of the other 2 channels, determine the (x, y) displacement vector that best aligns it to the reference. Any of the three channels can serve as the reference; I chose to align the images to the Green channel (for reasons detailed below).
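The split-and-stack step can be sketched as follows. This is a minimal illustration rather than my exact code: it assumes the scan stacks the plates as B, G, R from top to bottom, that an offset (x, y) means a shift of x columns and y rows, and that shifting wraps around the border (np.roll). The function names are hypothetical.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate scan into (b, g, r) channels.

    Assumes the plates are stacked B, G, R from top to bottom.
    """
    height = plate.shape[0] // 3  # leftover rows at the bottom are dropped
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return b, g, r

def shift(channel, dx, dy):
    """Circularly shift a channel by dx columns and dy rows."""
    return np.roll(channel, (dy, dx), axis=(0, 1))

def compose(b, g, r, b_offset, r_offset):
    """Build an RGB image with G held fixed as the reference channel."""
    return np.dstack([shift(r, *r_offset), g, shift(b, *b_offset)])
```

With offsets found by an alignment search, compose(b, g, r, b_offset, r_offset) would give the final color image.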

Color Channels:

Initially, I aligned the Red and Green channels to the Blue channel. However, this did not produce a satisfactory result for emir.tif, whereas aligning to the Green channel produced a cleaner one. For the other images, there were no obvious differences between aligning to the Blue channel and aligning to the Green channel, so I decided to use Green for all of them.



Naive Approach (Exhaustive Search):

The straightforward way to align the images is to search over a window of (x, y) offsets and use a scoring metric to pick the best one. One such metric is the Sum of Squared Differences (SSD), which is the squared L2 norm of the difference between the two images; this is the metric I used, and a lower score means a better match. Another is Normalized Cross Correlation (NCC), where a higher score means a better match, but I found that it did not produce results as well-aligned as SSD.
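A minimal sketch of the two metrics and the exhaustive search is given here. The function names, the (x, y) = (columns, rows) convention, and the wrap-around shifting via np.roll are assumptions made for illustration; the NCC shown subtracts the mean first, which is one common variant.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means a better match."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return np.sum(diff ** 2)

def ncc(a, b):
    """Normalized cross correlation (zero-mean variant); higher is better."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    return np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))

def exhaustive_align(channel, reference, window=15):
    """Try every (dx, dy) in [-window, window]^2 and keep the lowest SSD."""
    best_offset, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_offset, best_score = (dx, dy), score
    return best_offset
```

To score with NCC instead, one would keep the offset with the highest ncc value rather than the lowest ssd value.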

Some adjustments can be made to the images beforehand to produce better results, such as cropping the borders. I used the middle 60% of each image to avoid the "dirty" borders influencing the scores of each (x, y) offset pair.
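The cropping step might look like the sketch below. The helper name crop_center and the keep parameter are hypothetical, and I am assuming the 60% applies to both height and width. The crop is only used when scoring offsets; the offset found is then applied to the full, uncropped channel.

```python
import numpy as np

def crop_center(image, keep=0.6):
    """Return the central `keep` fraction of the image along each axis."""
    rows, cols = image.shape[:2]
    dr = int(round(rows * (1 - keep) / 2))  # rows trimmed from top and bottom
    dc = int(round(cols * (1 - keep) / 2))  # columns trimmed from left and right
    return image[dr:rows - dr, dc:cols - dc]

# Example: a 100 x 50 channel keeps its middle 60 x 30 region.
demo = crop_center(np.zeros((100, 50)))
```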

Below are the results for the low-resolution .jpeg images using exhaustive search on a window of [-15, 15]:

cathedral.jpeg
b offset: (-2, -5)
r offset: (1, 7)
monastery.jpeg
b offset: (-2, 3)
r offset: (1, 6)
tobolsk.jpeg
b offset: (-3, -3)
r offset: (1, 4)

Pyramid Search:

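For large files, such as the .tif files, the naive approach is far too expensive: the offsets are much larger, so the search window must grow, and every candidate offset requires scoring a much larger image. Therefore, a more efficient algorithm must be used. Pyramid search uses a recursive, coarse-to-fine approach to align the color channel images.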

Essentially, we specify a scaling factor and the number of recursive steps (or levels) we want. At each recursive step, we scale the image down by the scaling factor. At the base case (the top-most level of the pyramid, or when the image is small enough), we perform exhaustive search with a small window on the lowest-resolution version of the image. Before returning to the next lower level of the pyramid, the (x, y) displacement vector obtained must be multiplied by the scaling factor so that it is expressed in that level's pixels. It is then used as the starting point for that level's alignment. As we travel down the pyramid, we refine the (x, y) displacement vector by aligning higher and higher resolution versions of the image until we reach the bottom-most level, which holds the original-resolution image.

For my base case, the window was [-10, 10] and at the other levels, [-5, 5]. I used 3 levels and a scaling factor of 4.
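The recursion described above can be sketched roughly as follows, using the same defaults (3 levels, scaling factor 4, base window [-10, 10], other levels [-5, 5]). This is an illustration, not my exact code: the function names and the min_size cutoff are hypothetical, block averaging stands in for whatever rescaling routine is used, and shifts wrap around via np.roll.

```python
import numpy as np

def downscale(image, factor):
    """Shrink by an integer factor using block averaging (a stand-in rescale)."""
    rows = image.shape[0] // factor * factor
    cols = image.shape[1] // factor * factor
    blocks = image[:rows, :cols].reshape(rows // factor, factor,
                                         cols // factor, factor)
    return blocks.mean(axis=(1, 3))

def search(channel, reference, center, window):
    """Exhaustive SSD search over offsets within `window` of `center`."""
    cx, cy = center
    best, best_score = center, np.inf
    for dy in range(cy - window, cy + window + 1):
        for dx in range(cx - window, cx + window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best, best_score = (dx, dy), score
    return best

def pyramid_align(channel, reference, levels=3, factor=4,
                  base_window=10, window=5, min_size=32):
    """Coarse-to-fine alignment; returns the (dx, dy) offset for `channel`."""
    # Base case: top of the pyramid, or the image is already small.
    if levels <= 1 or min(channel.shape) <= min_size:
        return search(channel, reference, (0, 0), base_window)
    # Recurse on the downscaled images to get a coarse estimate.
    coarse = pyramid_align(downscale(channel, factor),
                           downscale(reference, factor),
                           levels - 1, factor, base_window, window, min_size)
    # Scale the coarse offset up to this level's pixels, then refine around it.
    start = (coarse[0] * factor, coarse[1] * factor)
    return search(channel, reference, start, window)
```

In practice the inputs would be the cropped float channels, for example pyramid_align(crop_of_red, crop_of_green).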

Below are the results for the high-resolution .tif images:

church.tif
b offset: (-4, -24)
r offset: (-8, 32)
emir.tif
b offset: (-24, -48)
r offset: (16, 56)
harvesters.tif
b offset: (-16, -60)
r offset: (-4, 64)
icon.tif
b offset: (-16, -40)
r offset: (4, 48)
lady.tif
b offset: (-8, -48)
r offset: (4, 60)
melons.tif
b offset: (-8, -80)
r offset: (4, 96)
onion_church.tif
b offset: (-28, -52)
r offset: (12, 56)
self_portrait.tif
b offset: (-28, -80)
r offset: (8, 96)
three_generations.tif
b offset: (-16, -48)
r offset: (-4, 60)
train.tif
b offset: (-4, -44)
r offset: (28, 44)
workshop.tif
b offset: (0, -52)
r offset: (-12, 52)
sittinglady.tif
b offset: (-20, -36)
r offset: (16, 40)
dog.tif
b offset: (0, -12)
r offset: (0, 68)
fishingboat.tif
b offset: (-16, -16)
r offset: (16, 64)
girls.tif
b offset: (-8, 16)
r offset: (8, 24)
family.tif
b offset: (-28, -24)
r offset: (12, 116)
river.tif
b offset: (-24, -16)
r offset: (20, 68)

Auto-Contrast:

Using the skimage.exposure module, we can also automatically improve the contrast of the final images via contrast stretching. This is done with the rescale_intensity function: intensities between the 2nd and 98th percentiles are stretched to span the full output range, and values outside those percentiles are clipped.
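A minimal sketch of this contrast stretch is shown here. The wrapper name auto_contrast is hypothetical, and I am assuming a float image with the percentiles computed over the whole image rather than per channel; the explicit out_range of (0, 1) is also a choice made for this example.

```python
import numpy as np
from skimage import exposure

def auto_contrast(image, low=2, high=98):
    """Stretch the [low, high] percentile range of `image` to [0, 1]."""
    p_low, p_high = np.percentile(image, (low, high))
    # Values outside [p_low, p_high] are clipped before rescaling.
    return exposure.rescale_intensity(image,
                                      in_range=(p_low, p_high),
                                      out_range=(0.0, 1.0))
```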

Below are several examples:

church.tif
lady.tif
onion_church.tif
three_generations.tif
girls.tif