Images of the Russian Empire

Fully automated colorization of the Prokudin-Gorskii collection of digitized glass plates from the late Russian Empire.

A CS 194-26 project by Kevin Lin, cs194-26-aak

Emir in Blue
Emir in Green
Emir in Red
Emir

The Library of Congress has recently made available high-resolution scans of the Prokudin-Gorskii collection, a color photographic survey of the late Russian Empire made ca. 1905 to 1915. Each raw glass negative contains three filtered color components of the same image in blue-green-red order, but the components are not aligned and the scans are not optimized for digital display.

We present a fully automated colorization technique that takes glass negative scans and applies modern image processing and filtering techniques to find the displacement vectors needed to faithfully reproduce full-color images of the Russian Empire.
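As a concrete picture of the blue-green-red layout described above, here is a minimal sketch of splitting one scan into its three components. It assumes the components are stacked top to bottom and that each occupies exactly one third of the scan height; `split_plate` is a hypothetical helper name, not a function from our program.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked B-G-R scan into three equal-height channels."""
    height = plate.shape[0] // 3  # leftover rows at the bottom are dropped
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

# Synthetic stand-in for a scan: 9 rows, 4 columns.
plate = np.arange(36, dtype=float).reshape(9, 4)
blue, green, red = split_plate(plate)
```

Each returned array is a view into the scan, so no pixel data is copied at this stage.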

Getting Started

The colorization program is written in Python 3 and requires a recent version of numpy and scikit-image.

main.py [--crop] [--contrast] filenames

Alignment

The goal of plate alignment is to find the correct (x, y) displacement vectors to overlay the green and red channels over the blue channel. We compute the alignments for each color separately.

Emir in Blue
Emir under a blue-light filter.
Emir in Green
Emir under a green-light filter.
Emir in Red
Emir under a red-light filter.

Alignment begins by pre-processing each channel to remove its edges from consideration. A 10% margin is trimmed so that plate borders and scanning artifacts do not influence the alignment score.
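The margin removal can be sketched in a few lines. This version assumes that 10% is trimmed from each of the four sides, which is one reading of the description above; `trim_margins` and its `fraction` parameter are illustrative names.

```python
import numpy as np

def trim_margins(channel, fraction=0.10):
    """Return the interior of a channel with `fraction` cut from every side."""
    rows, cols = channel.shape[:2]
    dr = int(round(rows * fraction))
    dc = int(round(cols * fraction))
    return channel[dr:rows - dr, dc:cols - dc]

channel = np.zeros((100, 200))
inner = trim_margins(channel)  # interior used only for scoring alignments
```

The trimmed copy would only be used for scoring; the displacement it produces is then applied to the untrimmed channel.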

Then, we search within a specified window for the best displacement vector. Each candidate is scored with normalized cross-correlation: the green (or red) channel and the blue channel are each treated as a vector and normalized, and the score is the inner product of the two. The higher the cross-correlation, the better the alignment.
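A minimal sketch of this search follows. It assumes candidates are generated with circular shifts (`np.roll`) and that the result is reported as a (row shift, column shift) pair; both are choices made for this illustration, and `ncc` and `best_shift` are hypothetical names.

```python
import numpy as np

def ncc(a, b):
    """Inner product of two images after scaling each to unit length."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def best_shift(channel, reference, radius=15):
    """Try every shift in [-radius, radius]^2 and keep the highest score."""
    best_score, best = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(shifted, reference)
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best

# Toy check: displace a random image, then recover the displacement.
rng = np.random.default_rng(0)
reference = rng.random((40, 40))
channel = np.roll(reference, (-3, 5), axis=(0, 1))
shift = best_shift(channel, reference, radius=6)
```

In this toy case the recovered shift is the inverse of the displacement that was applied, because rolling the channel by it reproduces the reference exactly.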

Once the alignment is found, we return a new image that represents the final product.
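Producing the final image from the two displacement vectors can be sketched as below. This assumes shifts are (row, column) pairs applied with `np.roll`, which wraps pixels around the border instead of padding; `compose` is an illustrative name.

```python
import numpy as np

def compose(blue, green, red, green_shift, red_shift):
    """Shift green and red onto blue and stack the result in R, G, B order."""
    green = np.roll(green, green_shift, axis=(0, 1))
    red = np.roll(red, red_shift, axis=(0, 1))
    return np.dstack([red, green, blue])

# One bright pixel, deliberately misplaced in the green and red channels.
blue = np.zeros((4, 4))
blue[1, 1] = 1.0
green = np.roll(blue, (-1, 0), axis=(0, 1))
red = np.roll(blue, (0, 2), axis=(0, 1))
rgb = compose(blue, green, red, green_shift=(1, 0), red_shift=(0, -2))
```

After the shifts are applied, the bright pixel lands at the same position in all three channels and appears white in the stacked image.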

Village
Lady
Teacher

Standard alignment using only normalized cross-correlation on pixel intensities works well on most of the input images. Some images, such as emir, still exhibit alignment issues because the light captured through each filter depends on the color of the original subject.

The difference in brightness between emir's blue and red filters creates a false positive for the normalized cross-correlation.

Emir in Blue
emir under a blue-light filter.
Emir in Red
emir under a red-light filter.
Emir intensities
emir processed using NCC on raw pixel intensities.

Image Pyramid

On small images, the simple alignment algorithm is fully capable of searching across a large range of displacement vectors very quickly. However, with larger images, even a small search range can be prohibitively expensive, so we implemented an image pyramid to reduce the search range and improve runtime.

Nativity
2.8s
Nativity displaced over a (-15, +15) box. Because this is a small image, the search box covers about 10% of the width and height of the image.
Monastery
3.3s
Monastery displaced over a (-15, +15) box. Like Nativity, this is a small image, so the search box covers about 10% of the width and height of the image.
Emir
4m17.0s
emir displaced over a (-15, +15) box. Because this is a raw scan almost 4000x10000, the search box only covers a fraction of a percent of the width and height.

Image pyramids refer to a very general strategy for how we can rescale images and work with smaller copies of the same image to speed up execution. Let's put it into perspective. In a 400x400 image thumbnail, it takes only a few seconds to find the best alignment when scanning across the range of possible (x, y) displacements in (-15, +15).

But computing the same (-15, +15) square of possible displacements over the original 4000x4000 pixel scans takes over 4 minutes on a fast machine! And this is without yielding a good result because a displacement of 15 pixels represents a shift of less than half a percent, not enough to correct the image alignment.
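To make the cost concrete, the numbers above can be worked through directly. This back-of-the-envelope sketch assumes the (-15, +15) range is inclusive on both axes and that scoring one candidate takes time proportional to the number of pixels; neither detail is stated about the original program.

```python
radius = 15
candidates = (2 * radius + 1) ** 2          # 31 * 31 shifts to score

small_pixels = 400 * 400                    # thumbnail
large_pixels = 4000 * 4000                  # full-size scan
pixel_ratio = large_pixels // small_pixels  # extra work per candidate

shift_fraction = radius / 4000              # largest shift relative to the width

print(candidates, pixel_ratio, f"{shift_fraction:.3%}")
```

Under those assumptions the full-size search does about 100 times the work of the thumbnail search, which is roughly in line with the measured jump from a few seconds to over 4 minutes, while the largest shift it can find is still under half a percent of the image width.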

Our image pyramid algorithm first recurses down to a thumbnail-sized copy of the image and solves the alignment problem there. It then returns the displacement vector to the caller, which uses it as the starting point when searching for the best alignment at the next larger size. This allows each larger image to pick up where the smaller image stopped and take advantage of the progress already made.
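The recursion can be sketched as follows. This is an illustration under several assumptions that are not taken from our program: each level halves the image by averaging 2x2 blocks, the coarsest level searches a radius of 15, and every finer level only refines within 2 pixels of the doubled estimate. All function and parameter names are hypothetical.

```python
import numpy as np

def score(a, b):
    """Inner product of two images after scaling each to unit length."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def search(channel, reference, center, radius):
    """Score every shift within `radius` of `center`; return the best (dy, dx)."""
    best_score, best = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            s = score(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if s > best_score:
                best_score, best = s, (dy, dx)
    return best

def halve(image):
    """Downscale by 2 by averaging 2x2 blocks (stand-in for a library resize)."""
    rows, cols = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    blocks = image[:rows, :cols].reshape(rows // 2, 2, cols // 2, 2)
    return blocks.mean(axis=(1, 3))

def pyramid_align(channel, reference, min_size=64, coarse_radius=15, fine_radius=2):
    """Align at the coarsest level first, then refine the doubled estimate."""
    if min(channel.shape) <= min_size:
        return search(channel, reference, (0, 0), coarse_radius)
    dy, dx = pyramid_align(halve(channel), halve(reference),
                           min_size, coarse_radius, fine_radius)
    return search(channel, reference, (2 * dy, 2 * dx), fine_radius)

# Toy check. The displacement is a multiple of 4 so that it is represented
# exactly at both coarser levels of this three-level pyramid.
rng = np.random.default_rng(0)
reference = rng.random((256, 256))
channel = np.roll(reference, (-12, 20), axis=(0, 1))
shift = pyramid_align(channel, reference)
```

Here the recovered displacement of 20 columns lies outside a direct (-15, +15) search at full size, yet the full-resolution level only scores 25 candidates because the coarse levels have already done most of the work.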

Emir
43.7s
emir with the optimized image pyramid algorithm.

Edge Detection

We found that, on the emir of Bukhara, the large differences in image intensity caused issues for the normalized cross-correlation.

To remedy this, we added a step to the alignment pre-processing stage. In addition to trimming the image margins, we apply a Sobel filter to identify the gradients and edges in the image. The Sobel filter returns a grayscale image containing the most relevant information for alignment: the edges.
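A small sketch shows why edges are more robust than raw intensities. It uses `skimage.filters.sobel` on a synthetic step image and on a much dimmer copy of it; the images and the `1e-9` threshold are made up for illustration.

```python
import numpy as np
from skimage import filters

# A vertical step edge, and a darker, lower-contrast copy of the same scene.
image = np.zeros((20, 20))
image[:, 10:] = 1.0
dim = 0.2 * image + 0.1

edges = filters.sobel(image)    # gradient magnitude, same shape as the input
edges_dim = filters.sobel(dim)

# Both edge maps respond at the same pixels, despite the brightness change.
same_support = np.array_equal(edges > 1e-9, edges_dim > 1e-9)
```

The two inputs differ everywhere in brightness, but their edge maps are non-zero in exactly the same places, so a correlation computed on edge maps depends on where the structure is and not on how bright each filter rendered it.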

Emir Sobel
emir after applying the Sobel filter.
Emir Sobel (enhanced)
emir after applying the Sobel filter (enhanced).
Emir
emir with the optimized image pyramid algorithm.

We chose the Sobel filter over the Canny edge detector because it offered better performance without sacrificing the quality of the results.

Contrast Adjustments

To enhance the visual appeal of the images, we added a slight contrast adjustment which boosted the contrast of the images by clipping black and white to the 2nd and 98th percentiles. This provided a slight but noticeable increase in contrast without introducing artifacts or blowing out the image.
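The percentile clipping can be sketched with NumPy alone. This version assumes a floating-point image and computes the percentiles over the whole image at once instead of per channel; `stretch_contrast` is an illustrative name.

```python
import numpy as np

def stretch_contrast(image, low=2, high=98):
    """Map the low/high percentiles to 0 and 1, clipping everything outside."""
    lo, hi = np.percentile(image, (low, high))
    if hi <= lo:  # flat image: nothing to stretch
        return image.copy()
    return np.clip((image - lo) / (hi - lo), 0.0, 1.0)

# A washed-out gradient that only spans 0.2 to 0.8.
image = np.linspace(0.2, 0.8, 1000).reshape(20, 50)
out = stretch_contrast(image)
```

scikit-image offers the same operation as `exposure.rescale_intensity` with an `in_range` built from the two percentiles.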

Cropping

The most creative component of our implementation is the automatic cropping. Inspired by the edge detection strategy, we researched other methods for interpreting edges in an image and came across the Hough transform.

The Hough transform in its simplest form is a method to detect straight lines [1].

We used the probabilistic Hough transform to identify the cropping boundaries. A detected line counts toward the crop boundary if it is roughly vertical or roughly horizontal, as determined by its start and end coordinates, and lies within the outer 10% margin of the image.
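The selection rule can be sketched as below. Segments are taken in the `((x0, y0), (x1, y1))` form returned by `skimage.transform.probabilistic_hough_line`. The Canny pre-step, the detector parameters, the `slack` tolerance for "roughly" vertical or horizontal, and both function names are assumptions made for this illustration.

```python
import numpy as np
from skimage.feature import canny
from skimage.transform import probabilistic_hough_line

def detect_segments(gray):
    """Find line segments on a Canny edge map (parameters are illustrative)."""
    edges = canny(gray, sigma=2.0)
    return probabilistic_hough_line(edges, threshold=10,
                                    line_length=gray.shape[0] // 4, line_gap=3)

def crop_bounds(segments, shape, margin=0.10, slack=3):
    """Return (top, bottom, left, right) from near-border axis-aligned segments."""
    rows, cols = shape
    top, bottom, left, right = 0, rows, 0, cols
    for (x0, y0), (x1, y1) in segments:
        if abs(x1 - x0) <= slack:            # roughly vertical
            if max(x0, x1) <= margin * cols:
                left = max(left, max(x0, x1))
            elif min(x0, x1) >= (1 - margin) * cols:
                right = min(right, min(x0, x1))
        elif abs(y1 - y0) <= slack:          # roughly horizontal
            if max(y0, y1) <= margin * rows:
                top = max(top, max(y0, y1))
            elif min(y0, y1) >= (1 - margin) * rows:
                bottom = min(bottom, min(y0, y1))
    return top, bottom, left, right

# Hand-written segments for a 100x200 image: three borders, two scene lines.
segments = [((5, 10), (6, 90)), ((195, 5), (195, 80)), ((20, 4), (180, 4)),
            ((100, 0), (101, 99)), ((10, 50), (190, 50))]
bounds = crop_bounds(segments, (100, 200))
```

On a real channel the two pieces would be combined as `crop_bounds(detect_segments(gray), gray.shape)`. The two segments through the middle of the image are ignored because they fall outside the 10% margin.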

This yielded great results on most images, but a few were over-cropped because the line detector picked up features of the scene and treated them as borders.

Emir
emir before cropping.
Emir
emir after cropping.
Teacher
teacher before cropping.
Teacher
teacher after cropping.

Large images were scaled down prior to applying the transform to improve performance.

Professional Restoration

In 2000, the Library of Congress commissioned WalterStudio to professionally retouch a representative set of 123 plates from the Prokudin-Gorskii collection. We’ve selectively rendered and reproduced a few of the photos to offer a comparison between our automated alignment program and the professionally retouched photos.

Emir
emir, professionally retouched.
Emir
emir, as reproduced by our algorithm.
Harvesters
harvesters, professionally retouched.
Harvesters
harvesters, as reproduced by our algorithm.
Icon
icon, professionally retouched.
Icon
icon, as reproduced by our algorithm.
River
river, professionally retouched.
River
river, as reproduced by our algorithm.
Self Portrait
self_portrait, professionally retouched.
Self Portrait
self_portrait, as reproduced by our algorithm.
Teacher
teacher, professionally retouched.
Teacher
teacher, as reproduced by our algorithm.
Three Generations
three_generations, professionally retouched.
Three Generations
three_generations, as reproduced by our algorithm.
Train
train, professionally retouched.
Train
train, as reproduced by our algorithm.
Turkmen
turkmen, professionally retouched.
Turkmen
turkmen, as reproduced by our algorithm.
Village
village, professionally retouched.
Village
village, as reproduced by our algorithm.

Provided Examples

Cathedral
cathedral
G: (5, 2) R: (12, 3)
Emir
emir
G: (49, 24) R: (107, 40)
Harvesters
harvesters
G: (60, 17) R: (124, 14)
Icon
icon
G: (42, 17) R: (90, 23)
Lady
lady
G: (56, 9) R: (120, 13)
Monastery
monastery
G: (-3, 2) R: (3, 2)
Nativity
nativity
G: (3, 1) R: (8, 0)
Self Portrait
self_portrait
G: (78, 29) R: (176, 37)
Settlers
settlers
G: (7, 0) R: (14, -1)
Three Generations
three_generations
G: (54, 12) R: (111, 9)
Train
train
G: (41, 2) R: (85, 29)
Turkmen
turkmen
G: (57, 22) R: (117, 28)
Village
village
G: (64, 10) R: (137, 21)

Additional Examples

Altar
altar
G: (20, 16) R: (60, 17)
Pulpit
pulpit
G: (62, 1) R: (135, -5)
River
river
G: (39, -1) R: (152, -7)
Students
students
G: (76, -4) R: (161, -21)
Sunset
sunset
G: (52, -14) R: (119, -35)
Teacher
teacher
G: (71, 39) R: (147, 62)