Images of the Russian Empire: Colorizing the Prokudin-Gorskii photo collection

Approach

Alignment Metrics

Naturally, we need some kind of metric to measure how well two color channels align with each other. The two metrics I explored in this project are:

Sum of Squared Differences (SSD)

Normalized Cross-Correlation (NCC)

I wrote additional helper functions to perform chores such as cropping the images (to remove uninteresting borders) and applying the Sobel operator (to see if edge detection is useful).
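The two metrics can be sketched roughly as below. This is a minimal illustration rather than the project's actual code: the function names `ssd` and `ncc` are hypothetical, and the NCC shown is the zero-mean variant, which is an assumption about which form was used.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: lower means better aligned."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def ncc(a, b):
    """Zero-mean normalized cross-correlation: higher means better aligned."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:  # a constant image has no structure to correlate
        return 0.0
    return float(np.dot(a, b) / denom)
```

Note that SSD is minimized while NCC is maximized, so a search loop has to flip the comparison (or negate one score) when switching between them.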

Exhaustive Search

At first, I searched exhaustively over a window of possible displacements, using a window size of 15. For each candidate displacement, I computed the alignment score (using one of the metrics above) and kept track of the best alignment vector encountered so far. This naive implementation worked well for small JPEG images (under 200 KB) but was computationally infeasible for the large TIFF images (around 70 MB).
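A sketch of such a search is shown below, under a few assumptions: the name `align_exhaustive` is hypothetical, a window size of 15 is taken to mean shifts in [-15, 15] along each axis, SSD is the score, and `np.roll` (which wraps pixels around the edges) stands in for whatever shifting the original code used.

```python
import numpy as np

def align_exhaustive(channel, reference, window=15):
    """Return the (dy, dx) shift of `channel` that best matches `reference`."""
    ch = channel.astype(np.float64)
    ref = reference.astype(np.float64)
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(ch, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - ref) ** 2)  # SSD: lower is better
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The cost is proportional to (2 * window + 1)^2 times the number of pixels, which is why this approach does not scale to the large TIFF scans.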

Constructing an Image Pyramid

A more efficient way to align the color channels is through the use of an image pyramid. An image pyramid is a hierarchical way of representing an image at different scales, where the top levels are smaller images that have been blurred and subsampled from the original.

"In a Gaussian pyramid, subsequent images are weighted down using a Gaussian average (Gaussian blur) and scaled down. Each pixel containing a local average corresponds to a neighborhood pixel on a lower level of the pyramid." --Wikipedia

As required by the project, I implemented an image pyramid without using existing high-level implementations. Specifically, I used recursion to reach the top level of the pyramid, where I would perform an exhaustive search on the image (which is a much smaller version of the original) and pass the alignment estimate one level down the pyramid.

This led to large runtime improvements. With a window size of 5, the naive approach takes more than 2 minutes (and often much longer), whereas the image pyramid finishes within 15 seconds.
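The recursion described above might look like the following sketch. Everything here is illustrative: `downsample`, `search`, and `align_pyramid` are hypothetical names, the 2x2 block average is a crude stand-in for a proper Gaussian blur and subsample, and the `min_size` cutoff and the re-search around the doubled coarse estimate at each finer level are assumptions.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (crude blur + subsample)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def search(channel, reference, center, window):
    """Exhaustive SSD search over shifts within `window` of `center`."""
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - window, center[0] + window + 1):
        for dx in range(center[1] - window, center[1] + window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, reference, window=5, min_size=64):
    """Coarse-to-fine alignment: solve at the top level, refine on the way down."""
    channel = channel.astype(np.float64)
    reference = reference.astype(np.float64)
    if min(channel.shape) <= min_size:
        # Top of the pyramid: the image is small enough to search directly.
        return search(channel, reference, (0, 0), window)
    coarse = align_pyramid(downsample(channel), downsample(reference),
                           window, min_size)
    # One coarse pixel corresponds to two pixels at this level.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(channel, reference, center, window)
```

Because each level only searches a small window around the estimate passed down from the level above, a shift much larger than `window` can still be recovered at full resolution.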

(Auto-)Cropping is All You Need!

Aligning some color channels was initially difficult, even when I set the window size to a high value such as 15 pixels. Through experimentation, I soon realized that the major culprit was the borders surrounding the images. Since the images were scanned, many borders contained blemishes and handwriting, which effectively introduced noise into the alignment scores. My intuition was therefore that removing the borders more accurately would improve color channel alignment.

To this end, I implemented a default cropping function, as well as an automated version that searches for the largest connected region using Otsu's method. I used the default cropping function to roughly remove the borders and then employed auto-cropping to fine-tune the result.
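As an illustration of the rough first step only, a default crop could simply trim a fixed fraction from every side. The name `crop_border` and the 10% default are hypothetical, not values taken from the project.

```python
import numpy as np

def crop_border(img, fraction=0.1):
    """Trim `fraction` of the height and width from each side of the image."""
    h, w = img.shape[:2]
    dy, dx = int(h * fraction), int(w * fraction)
    return img[dy:h - dy, dx:w - dx]
```

Applying the same crop to all three channels keeps them the same size, so the alignment score is computed only on interior pixels.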

Handling Edge Cases

Although the pipeline now ran efficiently and achieved good results on most images, it struggled with three inputs: Emir, Melons, and Sculpture. I investigated these images more closely and found that (r, b) was aligned differently from (g, b). To solve this, I made one simple fix: aligning the other color channels to the green channel instead. This way we compute the alignment vectors for (b, g) and (r, g).

This turned out to be successful. Not only did it solve the edge cases, but it also worked well on all other images. My guess at the reason is that if we start off with an incorrect distribution as the reference, it is very hard for the alignment to succeed.
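A sketch of the green-referenced pipeline follows. The names `best_shift` and `colorize` are hypothetical, and the inner search is a plain SSD search with wrap-around shifting, used here only to keep the example self-contained.

```python
import numpy as np

def best_shift(channel, reference, window=5):
    """Exhaustive SSD search for the (dy, dx) shift of `channel`."""
    best_score, best = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best

def colorize(b, g, r, window=5):
    """Align blue and red to green, then stack the channels as an RGB image."""
    b_shift = best_shift(b, g, window)  # alignment vector for (b, g)
    r_shift = best_shift(r, g, window)  # alignment vector for (r, g)
    b_aligned = np.roll(b, b_shift, axis=(0, 1))
    r_aligned = np.roll(r, r_shift, axis=(0, 1))
    return np.dstack([r_aligned, g, b_aligned]), b_shift, r_shift
```

The only change from a blue-referenced pipeline is which channel is passed as `reference`; green itself is left unshifted.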

Results

Example Images

Note: Offsets are listed as (y_offset, x_offset).

Cathedral.jpg

Offsets for (g, b): [-5, -2]

Offsets for (r, b): [7, 1]

Church.tif

Offsets for (g, b): [-25, -4]

Offsets for (r, b): [33, -8]

Emir.tif

Offsets for (g, b): [-49, -24]

Offsets for (r, b): [57, 17]

Harvesters.tif

Offsets for (g, b): [-60, -16]

Offsets for (r, b): [65, -3]

Icon.tif

Offsets for (g, b): [-40, -17]

Offsets for (r, b): [48, 5]

Lady.tif

Offsets for (g, b): [-53, -8]

Offsets for (r, b): [63, 3]

Melons.tif

Offsets for (g, b): [-82, -9]

Offsets for (r, b): [96, 3]

Monastery.jpg

Offsets for (g, b): [3, -2]

Offsets for (r, b): [6, 1]

Onion_Church.tif

Offsets for (g, b): [-51, -26]

Offsets for (r, b): [57, 10]

Sculpture.tif

Offsets for (g, b): [-33, 11]

Offsets for (r, b): [107, -16]

Self_Portrait.tif

Offsets for (g, b): [-79, -29]

Offsets for (r, b): [98, 8]

Three_Generations.tif

Offsets for (g, b): [-54, -11]

Offsets for (r, b): [58, -1]

Tobolsk.jpg

Offsets for (g, b): [-3, -3]

Offsets for (r, b): [4, 1]

Train.tif

Offsets for (g, b): [-43, -5]

Offsets for (r, b): [43, 27]

Chosen Examples from Prokudin-Gorskii Collection

School.tif

School in the village of Pidma named after His Imperial Majesty, Sovereign, Heir Apparent, Crown Prince, Grand Duke Aleksei Nikolaevich. [Russian Empire]

Offsets for (g, b): [-26, -9]

Offsets for (r, b): [36, 0]

Monument.tif

City of Lodeinoe Pole. Monument to Emperor Peter the Great. [Russian Empire]

Offsets for (g, b): [-22, -21]

Offsets for (r, b): [36, 8]

Study.tif

Ostrechiny. Study. [Russian Empire]

Offsets for (g, b): [-13, 5]

Offsets for (r, b): [120, -6]

Group.tif

Group of children. [Russian Empire]

Offsets for (g, b): [-66, -35]

Offsets for (r, b): [77, 17]

Sawmill.tif

View of the sawmill. Kovzha. [Russian Empire]

Offsets for (g, b): [-15, -22]

Offsets for (r, b): [42, 15]

Machine.tif

Stone-excavating machine of the multi-scoop type "Svirskaia pervaia." [Russian Empire]

Offsets for (g, b): [-33, -2]

Offsets for (r, b): [34, -21]

Bells & Whistles

Automatic Cropping

I implemented an automatic cropping function that takes in the three color channels and returns cropped versions of them. The cropping is based on the thresholding of the red channel using Otsu's method to segment the image into significant regions, and then using the bounding boxes of the two largest regions to determine the cropping boundaries.
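The thresholding step can be sketched in plain NumPy as below. This covers only Otsu's threshold and a bounding box around all foreground pixels; it omits the connected-region analysis described above (finding the two largest regions), and the names `otsu_threshold` and `bounding_box` are hypothetical.

```python
import numpy as np

def otsu_threshold(img, bins=256):
    """Return the threshold that maximizes the between-class variance."""
    hist, edges = np.histogram(img.ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    hist = hist.astype(np.float64)
    w0 = np.cumsum(hist)            # pixel count at or below each bin
    w1 = w0[-1] - w0                # pixel count above each bin
    s0 = np.cumsum(hist * centers)  # intensity sum at or below each bin
    m0 = s0 / np.maximum(w0, 1e-12)
    m1 = (s0[-1] - s0) / np.maximum(w1, 1e-12)
    between = w0 * w1 * (m0 - m1) ** 2
    return centers[np.argmax(between)]

def bounding_box(mask):
    """Return (top, bottom, left, right) enclosing all True pixels."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:  # nothing above threshold: keep the full image
        return 0, mask.shape[0], 0, mask.shape[1]
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1
```

For example, with `top, bottom, left, right = bounding_box(red > otsu_threshold(red))`, each channel could be cropped as `channel[top:bottom, left:right]`. Whether the region of interest lies above or below the threshold depends on the scan, so the comparison direction here is an assumption.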

This mechanism is combined with the default cropping function to achieve the best results. In the following example, notice how automatic cropping leads to a better alignment vector for (r, b).

Before

Offsets for (g, b): [-5, -2]

Offsets for (r, b): [0, 1]

After

Offsets for (g, b): [-5, -2]

Offsets for (r, b): [7, 1]

Before

After

Automatic Contrasting

I also implemented automatic contrasting, which maps the darkest pixel to zero and the brightest pixel (on its brightest color channel) to the maximum intensity. This serves as a gentle way to rescale image intensities and improve image quality. Overall, the images appear more natural and pleasant.
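A minimal sketch of this rescaling is given below, assuming floating-point images where the maximum intensity is 1.0; the name `auto_contrast` is hypothetical. The minimum and maximum are taken over all channels at once, so the relative balance between channels is preserved.

```python
import numpy as np

def auto_contrast(img):
    """Linearly rescale so the darkest value is 0 and the brightest is 1."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()  # global extremes across all channels
    if hi == lo:                   # flat image: nothing to stretch
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)
```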

Before and after (two example pairs)

Ron Wang

CS 180, Fall 2023, UC Berkeley