Project 1

Colorizing the Prokudin-Gorskii photo collection

Overview

Prokudin-Gorskii (1863-1944) traveled around the Russian Empire photographing people, buildings, and landscapes with the vision of producing color photographs. On a single glass plate, he recorded three exposures of each scene: one through a red filter, one through a green filter, and one through a blue filter.

Objective

The goal of this project is to use Prokudin-Gorskii’s glass plate images, digitized by the Library of Congress, to produce color RGB representations of the scenes photographed by automatically aligning the three color channels and applying image processing techniques to minimize visual artifacts.

Naïve Approach

The program takes a glass plate image as input and divides it into the three color channels b, g, and r, which then need to be aligned. My first approach aligned the green and red channels to the blue channel by shifting them vertically and horizontally in the range [-15, 15] pixels. For every possible shift combination, I computed the normalized cross-correlation (NCC) between the shifted channel (green or red) and the reference channel (blue), and selected the shift that produced the largest NCC (i.e. the one that made the shifted channel most similar to the reference). This approach works relatively well for small images under 600 x 600px. For larger images, however, the true displacement can far exceed 15 pixels, so the search range would have to grow with the image size and the exhaustive search becomes too costly.
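The exhaustive search described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function names (split_plate, ncc, align_exhaustive) are hypothetical, the plate is assumed to hold the three exposures stacked top to bottom in b, g, r order, and shifts are applied with np.roll, which wraps pixels around the border (an assumption; the original may handle borders differently). Shifts are returned as (dy, dx), i.e. rows first.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate into three equal-height channels.

    Assumes the exposures are ordered b, g, r from top to bottom.
    """
    h = plate.shape[0] // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align_exhaustive(channel, reference, radius=15):
    """Try every (dy, dx) in [-radius, radius]^2 and keep the best NCC."""
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps around the edges; a simplification for this sketch.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(shifted, reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With radius=15 this evaluates 31 x 31 = 961 candidate shifts per channel, each costing one pass over the image, which is why the approach does not scale to the full-resolution plates.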

Image Pyramid Approach

For larger images, I implemented a faster search procedure that repeatedly scales the images down by a factor of 2 until they are smaller than 600 x 600px. Starting from the smallest image and working up to the original size, I compute the best displacement using NCC and then carry it over to the next larger version as the starting point for the search at that level. Thus, alignments on the coarser images provide an approximation for the search at the next level, so the range of shifts tested can stay small, in the range [-15, 15]. To optimize this algorithm, I experimented with the following modifications:

  • Because the borders of the images hurt my results, for small images under 600 x 600px, I crop the image by a specified amount (20px) before computing my metric on the displacements.
  • To meet the time requirements, if the image (or any of its smaller scales in the pyramid) exceeds 600 x 600px, I only compute the metric on the innermost 600 x 600px.
  • Aligning the images to the blue channel seemed to work in most cases, except for emir.tif. I modified my algorithm so that, at the smallest image size, it compares the NCC of the best displacement with blue as the reference channel against the NCC of the best displacement with green as the reference channel. It then used whichever reference (blue or green) produced the higher NCC for the larger images. My results showed that at the smallest level, aligning to the green channel always produces a result as good as or better than aligning to the blue channel, so my final algorithm simply aligns to the green channel instead of choosing between the two.
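The coarse-to-fine search and the central-window metric described above can be sketched as follows. This is an illustrative sketch under several assumptions, not the project's actual code: the function names are hypothetical, the downscaling is plain 2x subsampling with no smoothing, the displacement found at a coarser level is doubled before it seeds the next level, shifts wrap around via np.roll, and the 20px border crop for small images is omitted. Shifts are returned as (dy, dx).

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def center_crop(img, size=600):
    """Return the innermost size x size window (or the whole image if smaller)."""
    h, w = img.shape
    ch, cw = min(h, size), min(w, size)
    top, left = (h - ch) // 2, (w - cw) // 2
    return img[top:top + ch, left:left + cw]

def search(channel, reference, center=(0, 0), radius=15, window=600):
    """Search shifts within radius of center, scoring only the central window."""
    ref = center_crop(reference, window)
    best_score, best_shift = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(center_crop(shifted, window), ref)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, reference, radius=15, min_size=600):
    """Align recursively: solve at half resolution first, then refine."""
    if max(channel.shape) <= min_size:
        return search(channel, reference, (0, 0), radius, min_size)
    # Coarser level: subsample by 2 (assumption: no blur before subsampling).
    coarse = align_pyramid(channel[::2, ::2], reference[::2, ::2],
                           radius, min_size)
    # A shift of k pixels at half resolution is about 2k pixels at this level.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(channel, reference, center, radius, min_size)
```

Aligning to green, as the final algorithm does, would then amount to calling align_pyramid(b, g) and align_pyramid(r, g) and applying the returned shifts to the blue and red channels before stacking the three into an RGB image.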

Example Images

B: (-2, -5), R: (1, 7)

B: (-24, -48), R: (17, -48)

B: (-18, -60), R: (-2, 64)

B: (-18, -42), R: (6, 48)

B: (-8, -54), R: (4, 60)

B: (-2, 3), R: (1, 6)

B: (-1, -3), R: (-1, 4)

B: (-30, -78), R: (8, 98)

B: (0, -7), R: (-1, 8)

B: (-14, -50), R: (-4, 60)

B: (-6, -42), R: (26, 44)

B: (-22, -56), R: (6, 60)

B: (-12, -64), R: (10, 72)

Other Images from the Prokudin-Gorskii collection

B: (-28, -62), R: (18, 74)

B: (30, -56), R: (-26, 74)

B: (-38, -70), R: (24, 76)