Images of the Russian Empire: Colorizing the Prokudin-Gorskii photo collection

COMPSCI 194-26: Computational Photography & Computer Vision

Professors Alyosha Efros & Angjoo Kanazawa

Steven Christopher

Background

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) was a photographer who traveled across the Russian Empire taking color photographs of what he saw. His idea was simple: record three exposures of every scene onto a glass plate, one each through a red, a green, and a blue filter.

The goal of this project is to take these digitized Prokudin-Gorskii glass plate images and, using image processing techniques, produce a color image with as few visual artifacts as possible. To accomplish this, we extract the three color channel images and align them on top of each other such that they form a single colorized image.
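As a rough illustration of the extraction step, the sketch below splits a scanned plate into three equal-height channels and stacks them into a color image. The function names (split_plate, stack_rgb) are hypothetical, and the blue, green, red top-to-bottom ordering is an assumption about the scans rather than something stated in this write-up.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate scan into three equal-height channels."""
    height = plate.shape[0] // 3  # leftover rows at the bottom are dropped
    # Assumed top-to-bottom order on the plate: blue, green, red.
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

def stack_rgb(red, green, blue):
    """Stack three single-channel images depth-wise into an (H, W, 3) array."""
    return np.dstack([red, green, blue])
```

Without alignment, stack_rgb(red, green, blue) generally shows color fringes, which is what the alignment described under Approach is meant to remove.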

Approach

Exhaustive search is used initially as a rudimentary method of aligning the color channels on top of each other. In this project, red is aligned to blue and green is aligned to blue. The results of these two alignments are then stacked depth-wise (on top of each other) with the blue channel to create the colorized image.

The metric used to score an alignment is the normalized cross-correlation (NCC), where a value closer to 1 indicates higher similarity. The exhaustive search alignment, align(channel_1, channel_2), works as follows:

  1. Keep track of the highest NCC seen so far, max_ncc_so_far.
  2. For every combination of offsets within a 20x20 window (max displacement of 20 pixels each direction), translate channel_1.

    a. Calculate the NCC of the translated channel with channel_2 and update max_ncc_so_far if it is higher.

  3. Return the translated image and the displacement in the x and y directions that produced the highest NCC over the whole 20x20px window.
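The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the zero-mean form of NCC, the use of np.roll (which wraps pixels around the border), the max_shift parameter, and the (dy, dx) row/column ordering of the returned shift are all assumptions.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align(channel_1, channel_2, max_shift=20):
    """Try every shift of channel_1 within +/- max_shift and keep the best NCC."""
    max_ncc_so_far, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # np.roll translates with wrap-around at the image borders.
            shifted = np.roll(channel_1, (dy, dx), axis=(0, 1))
            score = ncc(shifted, channel_2)
            if score > max_ncc_so_far:
                max_ncc_so_far, best_shift = score, (dy, dx)
    aligned = np.roll(channel_1, best_shift, axis=(0, 1))
    return aligned, best_shift
```

The cost grows with the number of candidate shifts times the number of pixels, which is why this approach becomes slow on the large scans.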

While this worked for smaller images like the cathedral photo, exhaustive search does not scale well to larger, more complex images such as that of the Emir of Bukhara. Therefore, a more efficient approach based on image pyramids is used.

Image Pyramid

Image pyramids speed up the search for the best alignment of the color channels by repeatedly downscaling the images (by a factor of 0.5x in this case) and aligning the smaller versions first. In particular, we do most of the work (a 20x20px displacement window in this case) on the smallest resized image and then restrict the window to 5x5px on each subsequent, larger image. In this way, we avoid exhaustively searching a large image pixel by pixel and instead refine the results given by the smaller, resized images.

Specifically, the image pyramid includes:

  1. Base case: if the image is smaller than 100x100px, simply run exhaustive search.

  2. Run the image pyramid algorithm on the image resized by a factor of 0.5x, returning the best alignment steps in the x and y directions.

  3. Run exhaustive search with a 5x5 window on the pre-resized image, starting from the translation given by the alignment found in step 2. This returns the best local alignment.

  4. Adjust the alignment from step 2 by the alignment found in step 3.

  5. Return the final translated image.
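A minimal sketch of this recursion is shown below, under several assumptions: downsampling is done by averaging 2x2 blocks rather than with a library resize, translation uses np.roll (which wraps around the border), and the helper names and parameters (downsample, search, coarse_radius, fine_radius, min_size) are hypothetical. A fine_radius of 2 corresponds to a 5x5 set of offsets. Note that a shift found at half resolution is measured in half-resolution pixels, so it has to be doubled before it is refined at the next level.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def downsample(image):
    """Halve each dimension by averaging 2x2 blocks (odd trailing row/column dropped)."""
    h, w = image.shape[0] // 2, image.shape[1] // 2
    return image[:2 * h, :2 * w].reshape(h, 2, w, 2).mean(axis=(1, 3))

def search(moving, fixed, center, radius):
    """Exhaustively test shifts within `radius` of `center`; return the best (dy, dx)."""
    best_score, best_shift = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = ncc(np.roll(moving, (dy, dx), axis=(0, 1)), fixed)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def pyramid_align(moving, fixed, coarse_radius=20, fine_radius=2, min_size=100):
    """Return the (dy, dx) shift that best aligns `moving` to `fixed`."""
    # Base case: the image is small enough for a full exhaustive search.
    if min(moving.shape) < min_size:
        return search(moving, fixed, (0, 0), coarse_radius)
    # Recurse on the half-resolution images.
    coarse = pyramid_align(downsample(moving), downsample(fixed),
                           coarse_radius, fine_radius, min_size)
    # Scale the coarse shift up to this level, then refine it in a small window.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(moving, fixed, center, fine_radius)
```

The final translated image would then be np.roll(moving, shift, axis=(0, 1)) for the returned shift.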

With the image pyramid, alignment took less than 30 seconds for the majority of the images. However, some images, such as the Emir of Bukhara, were remarkably difficult to colorize.

One quick way to reduce noise before attempting to colorize was to trim 12.5% off each side of the photo. All photos had stark borders (white and black) that may interfere with our attempts to align the color channels. These borders were also uneven and blurry, making it difficult to achieve a high NCC if they were included in the search.
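The trimming step might look like the sketch below. The function name crop_borders and its fraction parameter are hypothetical; only the 12.5% figure comes from the description above.

```python
import numpy as np

def crop_borders(image, fraction=0.125):
    """Remove `fraction` of the height and width from each side of the image."""
    h, w = image.shape[:2]
    dy, dx = int(h * fraction), int(w * fraction)
    return image[dy:h - dy, dx:w - dx]

# Example: an 80x160 channel keeps its central 60x120 region.
interior = crop_borders(np.zeros((80, 160)))
```

One possible design choice is to compute the NCC only on the cropped interiors while applying the resulting shift to the full channels, so that no image content is discarded from the final result.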

Challenges

Despite the improvements in colorization accuracy from image pyramids and border trimming, the picture of the Emir of Bukhara still proved difficult to deal with. It was especially difficult to colorize due to the vibrancy of his blue silky robes. The material of the robes gives rise to sudden changes in luminance, which makes it difficult for the image pyramid search to find an optimal alignment. Furthermore, the clash in color intensity between the red of the flower design and the blue of the robes also makes it difficult to find a good alignment between the red and blue channels.

Our bells-and-whistles addition to tackle this problem was edge detection. By aligning on edges, we remove the reliance on color and on stark differences in intensity between channels. We run our alignment algorithm (with NCC) on the output of Canny edge detection instead of on the original image, and then use the best displacement found to translate the original image.
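The idea can be illustrated with the sketch below. To stay dependency-free it swaps in a plain gradient-magnitude edge map instead of Canny, so it is a simplified stand-in and not the method used for the results here; with scikit-image installed, skimage.feature.canny could take the place of edge_map. The helper names, the max_shift parameter, and the wrap-around translation with np.roll are assumptions.

```python
import numpy as np

def edge_map(image):
    """Gradient magnitude; a simple stand-in for a Canny edge detector."""
    gy, gx = np.gradient(image.astype(float))
    return np.hypot(gx, gy)

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def best_shift(moving, fixed, max_shift):
    """Exhaustive search for the (dy, dx) shift with the highest NCC."""
    best_score, best = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = ncc(np.roll(moving, (dy, dx), axis=(0, 1)), fixed)
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best

def align_on_edges(moving, fixed, max_shift=20):
    """Find the shift on edge maps, then apply it to the original channel."""
    shift = best_shift(edge_map(moving), edge_map(fixed), max_shift)
    return np.roll(moving, shift, axis=(0, 1)), shift
```

Because the edge map depends on where intensity changes rather than on how bright a region is, a region that is bright in one channel and dark in another still produces matching edges.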

Results

Displacements found for the green (G) and red (R) channels when aligned to blue:

cathedral
G: [5 2]
R: [12 3]

train
G: [48 8]
R: [96 32]

church
G: [25 4]
R: [64 0]

3_gen
G: [53 16]
R: [112 16]

melons
G: [84 16]
R: [180 16]

monastery
G: [0 2]
R: [4 4]

onion
G: [50 32]
R: [112 36]

icon
G: [48 17]
R: [96 24]

tobolsk
G: [4 4]
R: [8 5]

self_portrait
G: [80 32]
R: [177 40]

harvesters
G: [64 18]
R: [128 16]

lady
G: [50 16]
R: [112 16]

workshop
G: [56 0]
R: [105 0]

emir
G: [49 32]
R: [40 48]



Isfandiyar
G: [40 8]
R: [96 0]

Milanie
G: [64 16]
R: [128 24]

Milanie2
G: [56 0]
R: [128 0]

Lugano
G: [40 0]
R: [92 0]

Kostroma
G: [48 20]
R: [112 36]

Kapri
G: [28 0]
R: [74 0]



Bells and Whistles

emir, exhaustive search aligned (before):
G: [49 32]
R: [40 48]

emir, aligned with Canny edge detection (after):
G: [49 24]
R: [112 42]