Project 1 - Colorizing the Prokudin-Gorskii photo collection

By William Loo, Fall 2020


Background

Around 1905, Prokudin-Gorskii envisioned and formulated a plan to use the emerging advances in color photography to document the Russian Empire systematically. His ultimate goal for this ambitious project was to educate the schoolchildren of Russia with his "optical color projections" of the vast and diverse history, culture, and modernization of the empire (Wikipedia).

The Prokudin-Gorskii photo collection is the result of this work. The computer vision algorithm developed in this project aims to reconstruct the color photographs by methodically combining the three color channels.

Process

red_emir, green_emir, blue_emir

The software must find the alignment that maximizes correlation between the red and blue channels, and then between the green and blue channels. Finally, the three channels are combined into one image. The algorithm was then improved through the following steps:

  1. Loss Metric Selection
  2. Image Pyramid
  3. Border Removal
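
Before any of these improvements, the basic procedure described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the search radius, the (dx, dy) ordering of the returned shift, and the use of SSD as the score are all assumptions made for the example.

```python
import numpy as np

def align_exhaustive(channel, reference, radius=15):
    """Return the (dx, dy) shift of `channel` that best matches `reference`.

    Every shift in [-radius, radius] on both axes is scored with the sum of
    squared differences (lower is better). np.roll wraps pixels around the
    image edges, which is a simplification chosen for brevity.
    """
    best_shift, best_score = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = float(np.sum((shifted - reference) ** 2))
            if score < best_score:
                best_shift, best_score = (dx, dy), score
    return best_shift

def colorize(red, green, blue, radius=15):
    """Align red and green to blue, then stack into an H x W x 3 RGB image."""
    rx, ry = align_exhaustive(red, blue, radius)
    gx, gy = align_exhaustive(green, blue, radius)
    red_aligned = np.roll(red, (ry, rx), axis=(0, 1))
    green_aligned = np.roll(green, (gy, gx), axis=(0, 1))
    return np.dstack([red_aligned, green_aligned, blue])
```

The cost of this search grows with the square of the radius, which is what makes it impractical on the large scans and motivates the image pyramid.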

Loss Metric Selection

A few candidate metrics were evaluated, including sum of squared differences (SSD), structural similarity (SSIM), and zero-normalized cross-correlation (ZNCC).

As a baseline metric, SSD was helpful for getting initial results. SSIM produced the highest-quality images, but zero-normalized cross-correlation proved to be the best tradeoff between quality and speed.
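
To make two of these metrics concrete, here is a small sketch of SSD and zero-normalized cross-correlation using their standard definitions. The function names are illustrative, and the zero-variance guard is an assumption added so that the example never divides by zero.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: 0 for identical images, lower is better."""
    return float(np.sum((a - b) ** 2))

def zncc(a, b):
    """Zero-normalized cross-correlation in [-1, 1], higher is better.

    Each image has its mean subtracted and is scaled by its norm, so the
    score does not change under a positive gain or a constant offset in
    brightness.
    """
    a0 = a - a.mean()
    b0 = b - b.mean()
    denom = np.sqrt(np.sum(a0 ** 2) * np.sum(b0 ** 2))
    if denom == 0:
        return 0.0  # assumption: treat a flat image as uncorrelated
    return float(np.sum(a0 * b0) / denom)
```

Note the opposite directions: a search minimizes SSD but maximizes ZNCC, so a loop written for one can use the other by negating the score. SSIM is left out of the sketch because it is considerably longer to write out; a library routine such as scikit-image's `structural_similarity` is one possible stand-in.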


Image Pyramid

The algorithm first performs an extensive search over the smallest scaled image. Each subsequent search, on the next larger scale, only covers the small zone that lies between two pixels of the previous scale once the earlier estimate is upscaled. This significantly reduces the search space for larger images and quickly yields higher-quality results.
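
The coarse-to-fine idea can be sketched as a short recursion. The code below is an illustration under stated assumptions rather than the project's implementation: it halves the image by averaging 2x2 blocks, stops at a hypothetical `min_size` of 32 pixels, searches 8 pixels in each direction at the coarsest level and 1 pixel at every finer level, and scores candidates with SSD.

```python
import numpy as np

def downscale2(img):
    """Halve the resolution by averaging 2x2 blocks (odd edges are cropped)."""
    h = img.shape[0] // 2 * 2
    w = img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, reference, center, radius):
    """Best (dx, dy) shift within `radius` of `center`, scored by SSD."""
    cx, cy = center
    best_shift, best_score = center, float("inf")
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = float(np.sum((shifted - reference) ** 2))
            if score < best_score:
                best_shift, best_score = (dx, dy), score
    return best_shift

def align_pyramid(channel, reference, min_size=32,
                  coarse_radius=8, fine_radius=1):
    """Estimate the shift at the coarsest scale, then refine it level by level."""
    if min(channel.shape) <= min_size:
        # Smallest image: the wide search is affordable here.
        return search(channel, reference, (0, 0), coarse_radius)
    cx, cy = align_pyramid(downscale2(channel), downscale2(reference),
                           min_size, coarse_radius, fine_radius)
    # One coarse pixel covers two fine pixels, so double the estimate and
    # only search the small neighborhood around it.
    return search(channel, reference, (2 * cx, 2 * cy), fine_radius)
```

With these settings, each level above the coarsest evaluates only 9 candidate shifts, regardless of how large the final displacement is.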

Border Removal

During matching, the border contributes little to the final image, yet it is a constant source of distraction. For the matching process, the algorithm therefore ignores 20% from each side of the image and performs the matching on the middle 60%.
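
A crop like this takes only a few lines. The helper name and the rounding toward zero are assumptions for the sketch; the 20% fraction comes from the description above.

```python
import numpy as np

def crop_interior(img, fraction=0.2):
    """Drop `fraction` of the height and width from each side of `img`.

    With fraction=0.2 the result is the middle 60% in each dimension.
    The returned array is a view, so no pixel data is copied.
    """
    h, w = img.shape[:2]
    dh, dw = int(h * fraction), int(w * fraction)
    return img[dh:h - dh, dw:w - dw]
```

Only the matching uses the cropped views. The shift found on them is then applied to the full, uncropped channel, so the final image keeps its original extent.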

arches
Red: [24, 96] Green [8, 32]

castle
Red: [0, 96] Green [0, 32]

cathedral
Red: [2, 10] Green [2, 4]

emir
Red: [-200, 0] Green [24, 48] (see below for fixed version)

harvesters
Red: [16, 120] Green [16, 56]

icon
Red: [24, 88] Green [16, 40]

lady
Red: [8, 112] Green [8, 48]

lake
Red: [0, 6] Green [0, 2]

melons
Red: [16, 176] Green [8, 80]

monastery
Red: [2, 2] Green [2, -4]

onion_church
Red: [36, 108] Green [24, 48]

rocks
Red: [-2, -2] Green [0, -4]

self_portrait
Red: [40, 176] Green [32, 80]

three_generations
Red: [16, 112] Green [16, 48]

tobolsk
Red: [4, 6] Green [2, 2]

train
Red: [32, 88] Green [8, 40]

workshop
Red: [0, 56] Green [-8, 104]

Further Alignment Improvements

The emir image has brightness differences in the green channel, which made it tricky to align properly with the blue channel. The alignment can be fixed either by replacing the loss metric with SSIM, or by aligning the other two color layers to the green layer instead of the blue layer, so that the green layer's inconsistencies are shared across all of the layers being aligned:

SSIM Result:

ssim_emir
Red: [41, 106] Green [24, 48]

Color Layer Alignment Result on the Green Layer instead of the Blue Layer:

shuffled_emir
Red: [16, 56] Blue [-24, -48]
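
Switching the reference layer is a small change if the reference is a parameter. The sketch below is hypothetical: the function names, the dictionary keyed by "r", "g", "b", the (dx, dy) shift ordering, and the SSD score are all assumptions for illustration.

```python
import numpy as np

def best_shift(channel, reference, radius=15):
    """Exhaustive SSD search for the (dx, dy) shift within `radius`."""
    best, best_score = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = float(np.sum((shifted - reference) ** 2))
            if score < best_score:
                best, best_score = (dx, dy), score
    return best

def colorize_on_reference(channels, reference="b", radius=15):
    """Align every layer to `channels[reference]` and stack them as RGB.

    `channels` maps "r", "g", "b" to 2-D arrays of equal shape. Returns the
    stacked image and the shift found for each non-reference layer.
    """
    ref = channels[reference]
    aligned, shifts = {reference: ref}, {}
    for name, layer in channels.items():
        if name == reference:
            continue
        dx, dy = best_shift(layer, ref, radius)
        shifts[name] = (dx, dy)
        aligned[name] = np.roll(layer, (dy, dx), axis=(0, 1))
    return np.dstack([aligned["r"], aligned["g"], aligned["b"]]), shifts
```

Calling it with `reference="g"` reports shifts for the red and blue layers, matching the form of the result above.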


Prepared for CS194-26, taught by Prof. Alexei (Alyosha) Efros, last updated 9/09/2020.