CS194-26 Project 1 - Kristin Ho

Images of the Russian Empire: Colorizing the Prokudin-Gorskii photo collection

Overview

Background

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) captured RGB glass plate negatives of the last years of the Russian Empire. A man ahead of his time, he realized that although color photography did not exist yet, capturing images in such a way could provide a means for them to be viewed in color later. Alas, he left Russia in 1918, after the revolution, never to return; his plans for displaying color images with special projectors in "multimedia" classrooms throughout Russia never materialized. Luckily, his glass plate negatives survived, and with some image manipulation, we are able to complete his vision of colorizing the photographs.

My Approach Part 1: Naive Algorithm

I began with a simple, straightforward algorithm on the .jpg images. It performed an exhaustive search over every alignment offset of one image relative to another, with both x and y in the interval [-15, 15], and picked the offset that worked best. That is 31 possible values for x and 31 for y, or 961 combinations in total. The "best" offset was the one with the smallest SSD (sum of squared differences) between the two image matrices, which served as an estimate of its "distance" from the ideal alignment. After computing the best offsets for the r and g channels with respect to the b channel (which was not offset), I stacked the three images together. As one may imagine, computing all 961 possibilities at full scale would prove far too inefficient for the higher resolution .tif images.
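The exhaustive search described above can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the code used for this project: the function names (`ssd`, `best_offset`, `colorize`) are hypothetical, offsets are written as (row, column) pairs, and the use of `np.roll` (a circular shift) is an assumption about how a channel is displaced.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def best_offset(channel, base, radius=15):
    """Try every (dy, dx) in [-radius, radius] x [-radius, radius].

    Returns the shift of `channel` that gives the lowest SSD against `base`.
    With radius=15 this evaluates 31 * 31 = 961 candidate offsets.
    """
    best, best_score = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Assumption: a circular shift stands in for the displacement.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, base)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best

def colorize(b, g, r, radius=15):
    """Align g and r to the (unshifted) b channel and stack into an RGB image."""
    g_off = best_offset(g, b, radius)
    r_off = best_offset(r, b, radius)
    g_aligned = np.roll(g, g_off, axis=(0, 1))
    r_aligned = np.roll(r, r_off, axis=(0, 1))
    return np.dstack([r_aligned, g_aligned, b]), r_off, g_off
```

Calling `colorize(b, g, r)` on three equally sized 2-D arrays would return the stacked image together with the r and g offsets, in the same spirit as the R[...] G[...] pairs listed with the results.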

My Approach Part 2: Image Pyramid Algorithm

My image pyramid implementation speeds up the search by first finding the best alignment at a lower resolution, then using that result to decide which interval to search at the next higher resolution, and repeating the process until it reaches the highest resolution. Because each interval is based on a reasonable estimate from a lower resolution, it does not need to be as large, so far fewer possibilities are tried and the algorithm completes more quickly.

I used a total of five levels, with Level 0 being the original size and Level 4 being 1/16 of the original size (each level is half the resolution of the previous one).

I began with an interval of size five, searching two below and two above the provided offset estimate. However, the results with this small interval were not accurate enough, and a large search interval was not efficient enough, so I settled on a middle ground: searching from (2 + current_level) below the estimated offset to (2 + current_level) above it. For example, if the current level is 3 and the estimated offset is 0, I search [-5, 5], since 2 + 3 = 5. If the current level is 0 (highest resolution) and the estimated offset is 4, I search [2, 6], since 2 + 0 = 2. This way, the search interval shrinks as the algorithm approaches the highest resolution, which maintains efficiency while still searching more widely at the coarse levels, where each comparison is cheap.

I saw definite improvements with this dynamic search interval; for example, "village" and "turkmen" appear much better aligned with this method. Below, I label those images "absolute interval" for the two-above, two-below approach and "dynamic interval" for the (2 + level)-above-and-below approach. The rest of the images are, by default, results of the dynamic interval approach.
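A rough sketch of the coarse-to-fine search with the (2 + level) interval might look like the following. It is an illustration under stated assumptions rather than the project's actual code: the helper names are hypothetical, downscaling is done by averaging 2x2 blocks (the write-up does not say how the images were resized), shifts are circular via `np.roll`, and the estimate is doubled when moving to the next finer level because each level has twice the resolution of the one below it.

```python
import numpy as np

def downscale_by_2(img):
    """Halve the resolution by averaging 2x2 blocks (assumed resize method)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, base, center, radius):
    """Lowest-SSD (dy, dx) within `radius` of the `center` estimate."""
    cy, cx = center
    best, best_score = center, float("inf")
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = float(np.sum((shifted - base) ** 2))
            if score < best_score:
                best, best_score = (dy, dx), score
    return best

def pyramid_offset(channel, base, levels=5):
    """Coarse-to-fine alignment; level 0 is full size, level levels-1 is coarsest."""
    chans = [channel.astype(np.float64)]
    bases = [base.astype(np.float64)]
    for _ in range(levels - 1):
        chans.append(downscale_by_2(chans[-1]))
        bases.append(downscale_by_2(bases[-1]))

    estimate = (0, 0)
    for level in range(levels - 1, -1, -1):
        # Dynamic interval: search (2 + level) below and above the estimate.
        estimate = search(chans[level], bases[level], estimate, 2 + level)
        if level > 0:
            # One pixel at this level spans two pixels at the next finer level.
            estimate = (estimate[0] * 2, estimate[1] * 2)
    return estimate
```

With five levels, the radii are 6, 5, 4, 3, and 2 from coarsest to finest, so each level tries at most 13 x 13 = 169 offsets instead of scanning one large window at full resolution.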

My Approach Overall: Details

Since the borders of the images were unrelated to the content of the images themselves, and since the images aligned best in the center (some of them may have been slightly rotated or warped), I calculated the ideal alignment based on the inner two-thirds of each picture. To do so, I trimmed 1/6 off the top, bottom, left, and right of each image matrix before computing the SSD between it and the base image. However, I retained the full image when stacking.
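The interior-only comparison can be illustrated with a short sketch. The names `crop_interior` and `interior_ssd` are hypothetical, and integer division is used for the 1/6 margin as a simplifying assumption.

```python
import numpy as np

def crop_interior(img, denom=6):
    """Trim 1/denom of the height and width from each side.

    With denom=6 this keeps the inner two-thirds in each dimension.
    """
    h, w = img.shape[:2]
    dy, dx = h // denom, w // denom
    return img[dy:h - dy, dx:w - dx]

def interior_ssd(shifted, base, denom=6):
    """SSD computed only on the interior, so borders cannot affect the score."""
    a = crop_interior(shifted, denom).astype(np.float64)
    b = crop_interior(base, denom).astype(np.float64)
    return float(np.sum((a - b) ** 2))
```

Only the score is computed on the cropped arrays; the chosen offset would still be applied to the full, untrimmed channel before stacking.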

Sample of Glass Plate Negatives

Cathedral

Church of Resurrection

Monastery

Nativity

Settlers

Colorized Images

Cathedral: R[12, 3] G[5, 2]

Cathedral

Monastery: R[3, 2] G[-3, 2]

Monastery

Nativity: R[7, 0] G[3, 1]

Nativity

Settlers: R[15, -1] G[7, 0]

Settlers

Emir: R[0, 43] G[49, 24]

The y-axis alignment for this image did not succeed. The probable reason is that comparing single channels with SSD assumes that matching areas have similar intensity values in r, g, and b. If some parts of the image are strongly blue or green while others are strongly red, that assumption breaks down and throws off the SSD-based choice of best offset.

Emir

Harvesters: R[124, 14] G[59, 17]

Harvesters

Icon: R[89, 23] G[41, 17]

Icon

Lady: R[114, 11] G[53, 9]

Lady

Self Portrait: R[160, 33] G[78, 29]

SelfPortrait

Three Generations: R[110, 11] G[52, 14]

ThreeGenerations

Train: R[87, 32] G[42, 6]

Train

Turkmen (with the absolute interval): R[62, 19] G[55, 20]

Turkmen

Turkmen (with the dynamic interval): R[116, 28] G[55, 20]

Turkmen

Village (with the absolute interval): R[62, -58] G[0, 43]

Village

Village (with the dynamic interval): R[116, 28] G[64, 12]

Village

Additional Colorized Images (chosen by me)

Church of Resurrection: R[11, 4] G[5, 3]

Church of Resurrection

Miass Station: R[0, 6] G[1, 3]

Miass Station

Solovetskii Monastery: R[12, 0] G[3, 0]

Solovetskii Monastery

Austrian Prisoners: R[124, 18] G[35, 8]

Austrian Prisoners