CS194-26: Image Manipulation and Computational Photography

Project 1 – Images of the Russian Empire

Colorizing the Prokudin-Gorskii Photo Collection

Dimitrios Vlachogiannis (St. ID: 3034311700)

Project Overview


In the early 20th century, Sergei Mikhailovich Prokudin-Gorskii traveled across the Russian Empire, recording three exposures of every scene onto a glass plate using a red, a green, and a blue filter. Well ahead of his time, he envisioned special projectors installed in "multimedia" classrooms all across Russia, where children would be able to learn about their vast country. His RGB glass plate negatives, capturing the last years of the Russian Empire, survived and were purchased in 1948 by the Library of Congress.

This project aims to automatically colorize the original glass-plate images using image processing techniques. More precisely, the input image is split into its three exposures, which are then aligned and stacked to produce an RGB color image.

Methodology

As a first step for all images, the three color channels are separated by dividing the image vertically into three parts of equal height. The next step aligns the second and third parts (G and R) to the first (B). Instead of using each part in full for the alignment, 15% was cropped from every side. That way only the center pixels are compared, which makes the procedure both faster and more reliable, since the noise around the borders is excluded.
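The splitting and cropping steps can be sketched as follows. This is a minimal illustration using NumPy, not the project's actual code: the function names `split_plate` and `crop_center` are hypothetical, and the sketch assumes the plate is a 2-D grayscale array with the B, G, and R exposures stacked from top to bottom.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate into its B, G, R parts (top to bottom)."""
    h = plate.shape[0] // 3  # assumption: leftover rows at the bottom are dropped
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def crop_center(channel, frac=0.15):
    """Remove a fraction `frac` of the height and width from every side."""
    h, w = channel.shape
    dy, dx = int(h * frac), int(w * frac)
    return channel[dy:h - dy, dx:w - dx]

# Toy stand-in for a scanned plate: 30 rows x 10 columns.
plate = np.arange(30 * 10, dtype=np.float64).reshape(30, 10)
b, g, r = split_plate(plate)
inner = crop_center(b)  # only these center pixels would be compared
```

The cropped view is used only for scoring candidate alignments; the full channels are what get stacked into the final image.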

Low Resolution Image Inputs: Single-Scale Implementation

A naive exhaustive search was sufficient here, since at this image size and resolution it returned the desired results in under a second. The evaluated displacements ranged from -15 to 14 pixels in both the vertical and horizontal directions (30 values per axis), and the similarity metric was the Sum of Squared Differences (SSD). Of the 900 candidate displacements, the one with the minimum SSD was selected for aligning the examined pair (G to B or R to B).
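A sketch of such an exhaustive search is given below. It is one possible implementation under stated assumptions, not necessarily the one used for the results: `align_exhaustive` and `crop_center` are hypothetical names, shifts are applied with `np.roll` (so pixels wrap around the edges), and the loop uses `range(-15, 15)` so that 30 x 30 = 900 displacements are scored.

```python
import numpy as np

def crop_center(img, frac=0.15):
    """Keep only the center of the image, dropping `frac` from every side."""
    h, w = img.shape
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def ssd(a, b):
    """Sum of Squared Differences between two equally sized arrays."""
    return float(np.sum((a - b) ** 2))

def align_exhaustive(channel, reference, radius=15, frac=0.15):
    """Return the (dy, dx) shift of `channel` that minimizes SSD to `reference`."""
    ref = crop_center(reference, frac)
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius):        # assumption: half-open window
        for dx in range(-radius, radius):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(crop_center(shifted, frac), ref)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

# Synthetic check: displace a random image by a known amount and recover it.
rng = np.random.default_rng(0)
reference = rng.random((40, 40))
channel = np.roll(reference, (-3, 2), axis=(0, 1))
shift = align_exhaustive(channel, reference)  # shift that undoes the displacement
```

Applying `np.roll(channel, shift, axis=(0, 1))` then yields the aligned channel, ready to be stacked with the reference.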

High Resolution Image Inputs: Image Pyramid

Because exhaustive search becomes prohibitively expensive when the pixel displacement is large, an image pyramid was implemented to speed up the alignment. An image pyramid represents the image at multiple scales (in our case, each level is scaled by a factor of 2). Processing is done sequentially, starting from the coarsest scale (the smallest image) and moving down the pyramid, with the displacement estimate updated at each level. At each level, when the corresponding images of the two pyramids are compared, all sides are again cropped by 15% so that the comparison focuses on the center pixels.
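The coarse-to-fine procedure can be sketched recursively, as below. This is an illustrative version under several assumptions that the text does not specify: the names (`align_pyramid`, `search`, `downscale`), the 2x2 block averaging used for downscaling, the stopping size of 64 pixels, the ±15 search at the coarsest level, and the ±2 refinement at every finer level are all hypothetical choices.

```python
import numpy as np

def crop_center(img, frac=0.15):
    """Keep only the center of the image, dropping `frac` from every side."""
    h, w = img.shape
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def search(channel, reference, center, radius, frac=0.15):
    """Best (dy, dx) by SSD within +/- radius (inclusive) of `center`."""
    ref = crop_center(reference, frac)
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((crop_center(shifted, frac) - ref) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downscale(img):
    """Halve the resolution by averaging 2x2 blocks (odd edges are trimmed)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, reference, min_size=64,
                  coarse_radius=15, refine_radius=2):
    """Estimate the shift at the coarsest level, then refine it level by level."""
    if min(reference.shape) <= min_size:
        return search(channel, reference, (0, 0), coarse_radius)
    coarse = align_pyramid(downscale(channel), downscale(reference),
                           min_size, coarse_radius, refine_radius)
    # A shift of k pixels at the coarser level is about 2k pixels at this level.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(channel, reference, center, refine_radius)

# Synthetic check with a displacement too large for a single +/-15 search.
rng = np.random.default_rng(0)
reference = rng.random((256, 256))
channel = np.roll(reference, (-20, 12), axis=(0, 1))
shift = align_pyramid(channel, reference)
```

With these settings, each finer level scores only 25 candidates instead of the several thousand that a single full-resolution search over the same range would need, which is where the speed-up comes from.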

 

Results

Low Resolution Image Inputs: Single-Scale Implementation

cathedral.jpg: G(5,2), R(12,3)
monastery.jpg: G(-3,2), R(3,2)
tobolsk.jpg: G(3,3), R(6,3)

 

 

High Resolution Image Inputs: Image Pyramid

emir.tif: B(-49,-24), R(57,17) (to produce this output, everything was matched to G rather than B)
harvesters.tif: G(60,17), R(124,14)
icon.tif: G(40,17), R(89,23)

 

 

lady.tif: G(52,8), R(112,12)
melons.tif: G(82,10), R(178,14)
onion_church.tif: G(51,27), R(108,36)

 

 

self_portrait.tif: G(78,29), R(176,37)
three_generations.tif: G(52,14), R(112,11)
train.tif: G(42,6), R(87,32)

 

 

village.tif: G(64,12), R(138,22)
workshop.tif: G(53,0), R(105,-12)