CS194-26 Project 1: Colorizing the Prokudin-Gorskii Photo Collection

Brian Zhu

brian_zhu@berkeley.edu

Project Overview

Over 100 years ago, Sergei Mikhailovich Prokudin-Gorskii pioneered the development of color photography through his three-camera system, which created negatives in three color channels: red, green, and blue. Prokudin-Gorskii journeyed across Russia, documenting its various aspects and collecting over 10,000 such exposures over the span of 10 years. His works have been preserved in the Library of Congress, and with modern technology we can now combine Prokudin-Gorskii's negatives into color composites. The objective of this project is to create our own homemade compositing algorithm.

Problem Statement

The input image is a stack of each of the three color negatives in the order of blue, green, red (top to bottom):

The goal is to split the input by channel (i.e. split into thirds) and stack the channels together to form an RGB image. For example, a good output would look like the following:

However, the result of a naive stack doesn't look very good:

Clearly some extra alignment needs to be done.
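As a rough illustration of the split-and-stack step described above, here is a minimal NumPy sketch. The function names `split_channels` and `naive_stack` are hypothetical (they are not taken from the project code), and the sketch assumes the plate is a 2D array whose height divides roughly evenly into thirds.

```python
import numpy as np

def split_channels(plate):
    """Split a vertically stacked plate (blue, green, red from top to bottom)
    into three equal-height channels. Leftover rows at the bottom are dropped."""
    height = plate.shape[0] // 3
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

def naive_stack(plate):
    """Stack the channels in R, G, B order without any alignment."""
    blue, green, red = split_channels(plate)
    return np.dstack([red, green, blue])
```

Calling `naive_stack` on a plate of shape `(3 * H, W)` gives an array of shape `(H, W, 3)`; any misalignment between the three exposures shows up as color fringing in that result.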

Methodology and Results

Starting Simple: Aligning with Single-Scale Search

First, establishing some quick terminology:

The main idea: we start with the source and target with their top left corners aligned.

The target stays fixed, and we "slide" the source around a small area (e.g. it can only move from its original position by up to 15px in any direction).

We then take the area of overlap between the source and target and calculate a difference score. The best offset is one which creates an area of overlap with the best difference score.
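The sliding procedure above could be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: `overlap`, `mean_squared_difference`, and `single_scale_search` are hypothetical names, the source and target are assumed to be 2D arrays of the same shape, and a lower score is assumed to mean a better match.

```python
import numpy as np

def overlap(source, target, dx, dy):
    """Return the overlapping parts of source and target when the source is
    shifted right by dx and down by dy (negative values shift left/up)."""
    h, w = target.shape
    src = source[max(-dy, 0):min(h, h - dy), max(-dx, 0):min(w, w - dx)]
    tgt = target[max(dy, 0):min(h, h + dy), max(dx, 0):min(w, w + dx)]
    return src, tgt

def mean_squared_difference(a, b):
    """Placeholder score (assumption): lower means more similar."""
    return np.mean((a - b) ** 2)

def single_scale_search(source, target, radius=15, score=mean_squared_difference):
    """Try every offset within +/- radius pixels and keep the lowest score."""
    best_offset, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            src, tgt = overlap(source, target, dx, dy)
            current = score(src, tgt)
            if current < best_score:
                best_offset, best_score = (dx, dy), current
    return best_offset
```

The returned pair `(dx, dy)` follows the sign convention used in the result tables: positive values move the source right or down.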

Difference Scores

The base difference metrics are the following:

However, it is important to note that these metrics are affected by the number of pixels being compared (e.g. 1 pixel with a difference of 1.0 compared to 100 pixels with a difference of 0.01). Since the area of overlap is not always the same, it may be safer to take the average instead:
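Since the formulas themselves are not reproduced in this text, the following is only a sketch of how SSD and NCC, plus a per-pixel average, are commonly written. The exact definitions used for the results below (in particular how NCC is normalized and how NCC-Avg is formed) are assumptions, and the function names are hypothetical.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: lower means more similar."""
    return float(np.sum((a - b) ** 2))

def ssd_avg(a, b):
    """SSD divided by the number of pixels compared, so that overlaps of
    different sizes can be compared fairly."""
    return ssd(a, b) / a.size

def ncc(a, b):
    """Normalized cross-correlation (assumed form): dot product of the two
    images after scaling each to unit norm. Higher means more similar."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0
    return float(np.sum(a * b) / denom)
```

With these definitions, two overlaps that differ by 0.1 at every pixel get different SSD values depending on their size (about 1.0 for a 10 x 10 overlap, about 4.0 for a 20 x 20 overlap), while `ssd_avg` is about 0.01 for both. Because NCC is a similarity rather than a difference, it would need to be negated (or maximized) when used in a search that looks for the lowest score.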

Single-Scale Search Results

Images used (from left to right): cathedral.jpg, monastery.jpg, tobolsk.jpg

Parameters:

SSD:

NCC:

SSD-Avg:

NCC-Avg:

We see that both SSD and SSD-Avg are able to align the color channels correctly, while NCC misaligns cathedral and monastery and NCC-Avg only misaligns cathedral. The final alignments are the following (negative offsets correspond to shifting left (x-axis) or upward (y-axis), positive offsets correspond to shifting right (x-axis) or downward (y-axis)):

Alignments on JPG Images

File name     | Offset (red channel) | Offset (green channel)
cathedral.jpg | x: 3px, y: 12px      | x: 2px, y: 5px
monastery.jpg | x: 2px, y: 3px       | x: 2px, y: -3px
tobolsk.jpg   | x: 3px, y: 6px       | x: 3px, y: 3px

From this point on, the SSD-Avg metric is used for difference scoring.

A Recursive Upgrade: Pyramid Search

It is important to note that the above images were only about 350 x 350 pixels in size. What happens if the image is much larger, say 3500 x 3500 pixels? If a 16 x 16 search suffices for the smaller images, the search space for the larger ones may need to grow to about 160 x 160, which is 100 times as many offsets to check!

To reduce the total amount of searching, we can try to take a coarse-to-fine approach (e.g. search a 100 x 100 area with a stride of 10 pixels, find the best offset so far, then search in a 10 x 10 area around the best offset with a stride of 1 pixel).

The idea is the following:

The repeated scaling of the image can be done recursively, and even with a small search area per layer, we can end up covering a large search space. For example, if we scale the image down by 2x four times and use a 2 x 2 search area at each layer, we are still effectively searching a 16 x 16 area.

Using this method, we can align the larger tif images, which are about 3500 x 3500 pixels in size, without expending too much time.
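The recursive coarse-to-fine idea could look roughly like the sketch below. Several details are assumptions rather than the project's actual choices: downscaling is done by averaging 2 x 2 blocks (the original may well have used a library resize), the score is the mean squared difference over the overlap, and the names `downscale_2x`, `ssd_avg_at`, and `pyramid_search` as well as the default `levels` and `radius` values are hypothetical.

```python
import numpy as np

def downscale_2x(image):
    """Halve each dimension by averaging 2 x 2 blocks (an odd trailing
    row or column is dropped)."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def ssd_avg_at(source, target, dx, dy):
    """Mean squared difference over the overlap when the source is shifted
    right by dx and down by dy."""
    h, w = target.shape
    src = source[max(-dy, 0):min(h, h - dy), max(-dx, 0):min(w, w - dx)]
    tgt = target[max(dy, 0):min(h, h + dy), max(dx, 0):min(w, w + dx)]
    return np.mean((src - tgt) ** 2)

def pyramid_search(source, target, levels=4, radius=2):
    """Estimate the offset on a half-size copy first, double it, then refine
    it with a small search at the current resolution."""
    center = (0, 0)
    if levels > 0:
        coarse = pyramid_search(downscale_2x(source), downscale_2x(target),
                                levels - 1, radius)
        center = (2 * coarse[0], 2 * coarse[1])
    best, best_score = center, np.inf
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            current = ssd_avg_at(source, target, dx, dy)
            if current < best_score:
                best, best_score = (dx, dy), current
    return best
```

Each level only evaluates `(2 * radius + 1) ** 2` offsets, but because every coarse estimate is doubled on the way back up, the reachable offset at full resolution grows roughly by a factor of two per level.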

Pyramid Search Results

Parameters:

At the same time, I also tried varying the target channel between red, green, and blue.

Using blue as target channel:

church.tif

icon.tif

onion_church.tif

train.tif

emir.tif

lady.tif

self_portrait.tif

workshop.tif

harvesters.tif

melons.tif

three_generations.tif

Using green as target channel:

church.tif

icon.tif

onion_church.tif

train.tif

emir.tif

lady.tif

self_portrait.tif

workshop.tif

harvesters.tif

melons.tif

three_generations.tif

Using red as target channel:

church.tif

icon.tif

onion_church.tif

train.tif

emir.tif

lady.tif

self_portrait.tif

workshop.tif

harvesters.tif

melons.tif

three_generations.tif

In all three cases, the pyramid search works well in general. However, looking at emir.tif in particular, we see that using green as the target channel yields the most robust results:

Aligning to blue channel

Aligning to green channel

Aligning to red channel

The final alignments are the following (negative offsets correspond to shifting left (x-axis) or upward (y-axis), positive offsets correspond to shifting right (x-axis) or downward (y-axis)). The offsets reported are for alignment against the green channel:

Alignments on TIF images

Name                  | Offset (red channel) | Offset (blue channel)
church.tif            | x: -8px, y: 33px     | x: -4px, y: -25px
emir.tif              | x: 17px, y: 57px     | x: -24px, y: -49px
harvesters.tif        | x: -3px, y: 65px     | x: -17px, y: -59px
icon.tif              | x: 5px, y: 48px      | x: -17px, y: -41px
lady.tif              | x: 3px, y: 62px      | x: -9px, y: -55px
melons.tif            | x: 3px, y: 96px      | x: -10px, y: -81px
onion_church.tif      | x: 10px, y: 57px     | x: -27px, y: -51px
self_portrait.tif     | x: 8px, y: 98px      | x: -29px, y: -79px
three_generations.tif | x: -3px, y: 59px     | x: -14px, y: -53px
train.tif             | x: 27px, y: 43px     | x: -6px, y: -43px
workshop.tif          | x: -11px, y: 52px    | x: 0px, y: -53px
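To show how offsets like the ones in this table might be applied, here is a small sketch that shifts the red and blue channels onto a fixed green channel and stacks the result. It assumes `np.roll` is an acceptable way to shift (pixels wrap around the border, which the original implementation may or may not do), and the names `shift_channel` and `compose_rgb` are hypothetical.

```python
import numpy as np

def shift_channel(channel, dx, dy):
    """Shift right by dx and down by dy pixels (negative values shift left or
    up). np.roll wraps pixels around the border instead of discarding them."""
    return np.roll(channel, shift=(dy, dx), axis=(0, 1))

def compose_rgb(red, green, blue, red_offset, blue_offset):
    """Align red and blue to the fixed green channel and stack as R, G, B.
    Offsets are (dx, dy) pairs using the same sign convention as the table."""
    red_aligned = shift_channel(red, *red_offset)
    blue_aligned = shift_channel(blue, *blue_offset)
    return np.dstack([red_aligned, green, blue_aligned])
```

For emir.tif, for example, the call would be `compose_rgb(red, green, blue, (17, 57), (-24, -49))`. Cropping the border afterwards would hide the wrapped-around pixels.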

Custom Images 🙂

With pyramid search finding pretty good offsets at decent speeds, it's time to try running the same search on other images from Prokudin-Gorskii's collection.

Parameters used:

adobe_house.tif

creek.tif

flowers.tif

sculpture.tif

bridge.tif

floodgates.tif

marsh.tif

Conclusions

It was really fun to build an algorithm from scratch that allows us to take a more colorful (pun intended) view of the past! Comparing against the same photos in the Library of Congress, there is clearly still a lot of progress to be made in image processing beyond alignment. Automatic cropping, color balancing, and ways to clean up artifacts in the image are further steps to look into!