Images of the Russian Empire: Colorizing the Prokudin-Gorskii photo collection

by Sravya Basvapatri

The goal of this project is to colorize images taken by Prokudin-Gorskii over 100 years ago, before it was possible to print color photographs.

Contents

  1. Background
  2. Color Channel Alignment
    1. Choosing a Base Color
    2. Single-Scale Implementation
    3. Multi-Scale Pyramid Implementation
    4. Reflection: Challenges and Improvements
    5. Alignment Algorithm Output Summary
  3. Bells & Whistles

i. Background

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) was a photographer well ahead of his time. He traveled across the Russian Empire taking color photographs in an era before color printing was possible. He recorded three exposures through red, green, and blue filters, imagining that special projectors could one day make his images viewable by children across Russia. His RGB glass plate negatives were purchased in 1948 by the Library of Congress, which has since digitized them and made them available online.

My goal was to colorize these black and white images taken by Prokudin-Gorskii over 100 years ago, so that we can enjoy them digitally.

ii. Color Channel Alignment

To capture color information before the technology of his time allowed it, Prokudin-Gorskii recorded three separate exposures, one through each filter: red, green, and blue. We can see this in the three replicas of each shot above. Unfortunately, these exposures are slightly misaligned, so we need to find the alignment between the color channels of each image using similarity metrics.

a) Choosing a Base Color

During my initial testing, I found that aligning red and blue to a green base worked more consistently than the suggested approach of aligning to a blue base. My hypothesis for why this might be the case links to the Bayer filter, which uses twice as many green elements as red or blue to mimic the physiology of the human eye. Perhaps this sensitivity to green light makes the green-aligned pictures appear better. If the display likewise uses two green pixels for every blue and red one, the green channel could also carry more weight than the other two.
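
As a concrete sketch (the function and variable names here are illustrative, not my exact code), aligning to a green base just means applying the R and B offsets computed relative to G and stacking the shifted channels:

```python
import numpy as np

def compose_green_base(r, g, b, r_offset, b_offset):
    """Shift the red and blue channels by their offsets relative to green,
    then stack the three channels into a single RGB image."""
    r_aligned = np.roll(r, r_offset, axis=(0, 1))
    b_aligned = np.roll(b, b_offset, axis=(0, 1))
    return np.dstack([r_aligned, g, b_aligned])
```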

Below is alignment to a blue base (left) vs. a green base (right) for two of our larger .tif images. These searches used Normalized Cross Correlation as a metric and scaled down by 50% at each level using the pyramid approach described below. The improvement is noticeable here, especially in our ability to align red with the base in Emir. For smaller images, it made less of an impact.

Harvester:
Blue base (left) - Green Displacement: (61, 18) , Red Displacement: (125, 14)
Green base (right) - Blue Displacement: (-61, -18) , Red Displacement: (64, -4)

 

Emir:
Blue base (left) - Green Displacement: (49, 25) , Red Displacement: (-164, 37)
Green base (right) - Blue Displacement: (-49, -25) , Red Displacement: (57, 16)

From there, I sought to find a general alignment algorithm for all the images. 

b) Single-Scale Implementation

For the smaller .jpg images, I found it easiest to use an exhaustive search within a (-12, 12) range of offsets. These offsets were from the center of each image, and applied using np.roll. To avoid issues with lost data at the edges, I used the center 50% of the image to calculate my similarity metric. 
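
A minimal sketch of this search, assuming each channel is a 2D float array (names here are illustrative, not my exact code), using SSD as the metric:

```python
import numpy as np

def center_crop(img, frac=0.5):
    """Keep the middle `frac` of the image, avoiding edge artifacts from np.roll."""
    h, w = img.shape
    dh, dw = int(h * (1 - frac) / 2), int(w * (1 - frac) / 2)
    return img[dh:h - dh, dw:w - dw]

def align_single_scale(channel, base, window=12):
    """Exhaustively search offsets in [-window, window]^2, scoring each with SSD."""
    best, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((center_crop(shifted) - center_crop(base)) ** 2)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best
```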

I tried two different similarity metrics: Sum of Squared Differences (SSD) and Normalized Cross Correlation (NCC). For these small jpg images, I didn't notice much difference, so I used SSD, which was noted to be slightly faster.
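
The two metrics can be sketched as follows (lower SSD is better; higher NCC is better):

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally-sized patches."""
    return np.sum((a - b) ** 2)

def ncc(a, b):
    """Normalized cross-correlation: cosine similarity of the mean-centered patches."""
    az, bz = a - a.mean(), b - b.mean()
    return np.sum(az * bz) / (np.linalg.norm(az) * np.linalg.norm(bz))
```

NCC's normalization makes it insensitive to per-channel brightness and contrast differences, which may explain why it matters more on the harder images than on these small ones.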

Cathedral.jpg (far left) - Blue Displacement: (-5, -2) , Red Displacement: (7, 1)

Monastery.jpg (center) - Blue Displacement: (3, -2) , Red Displacement: (6, 1)

Tobolsk.jpg (far right) - Blue Displacement: (-3, -3) , Red Displacement: (4, 0)

Next, I tried my single-scale implementation on a few more images of my choosing, downloaded from the Prokudin-Gorskii collection.

Transfiguration.jpg (far left) - Blue Displacement: (-8, -3) , Red Displacement: (10, 2)

Exit.jpg (center) - Blue Displacement: (-8, -2) , Red Displacement: (8, 1)

Buildings.jpg (far right) - Blue Displacement: (-3, -1) , Red Displacement: (4, 1)

For the one titled "Transfiguration.jpg," the detail in the trees comes out a bit misaligned. However, I noticed that the recreation on the Library of Congress page appears the same way. I believe there must have been wind blowing during this shot, making the three channels impossible to align over the leaves using our rigid algorithms.

c) Multi-Scale Pyramid Implementation

For larger images (.tif format), the single-scale alignment proved infeasible. These images had larger misalignments because they spanned more pixels and were higher resolution. An exhaustive search would take much too long, and my smaller window of (-12, 12) is not nearly large enough for the pixel shifts needed for these images.

Instead, I employed a pyramid approach for the remaining pictures. This approach scales the image down first, runs the alignment algorithm on the smaller image, and then uses that offset (scaled back up) as a starting point for an exhaustive search at the finer resolution. My first attempt used SSD with a (-12, 12) search range for the .tif images, and a scaling factor of 50% at each level until the image was below 400 pixels in width. I chose 400 pixels because the initial .jpg images all had dimensions under 400 and were well-aligned with the single-scale approach.
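
The recursion can be sketched as follows (illustrative names, not my exact code; I use a simple 2x2 block average for the 50% rescale and omit the center crop for brevity):

```python
import numpy as np

def downscale(img):
    """Rescale by 50% by averaging 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, base, window=12, min_width=400):
    """Recursively estimate the offset on a half-size image, scale it back up,
    and refine with an exhaustive SSD search of +/-window around that estimate."""
    if base.shape[1] > min_width:
        coarse = align_pyramid(downscale(channel), downscale(base), window, min_width)
        start = (2 * coarse[0], 2 * coarse[1])  # scale the coarse offset back up
    else:
        start = (0, 0)
    best, best_score = start, np.inf
    for dy in range(start[0] - window, start[0] + window + 1):
        for dx in range(start[1] - window, start[1] + window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - base) ** 2)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best
```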

Here are the .tif images aligned with the multi-scale pyramid approach on a green base, with a 50% scaling factor and an alignment window of (-12, 12) at each pyramid level. As before, I used the middle 50% of the image and an SSD similarity metric; the computed offsets did not change between SSD and NCC, but SSD ran faster.

 

I also tried it on one additional image (in the bottom right) of peonies from the LOC.gov webpage.

 

Church - Blue Displacement: (-25, -4) , Red Displacement: (33, -8);  46.61s

Emir - Blue Displacement: (-49, -25) , Red Displacement: (57, 16); 45.50s

Harvesters - Blue Displacement: (-61, -18) , Red Displacement: (64, -4); 46.78s

Icon - Blue Displacement: (-44, -18) , Red Displacement: (48, 5); 45.73s

Lady - Blue Displacement: (-56, -9) , Red Displacement: (62, 4); 43.98s

Melons - Blue Displacement: (-86, -11) , Red Displacement: (96, 2); 49.20s

Onion Church - Blue Displacement: (-53, -27) , Red Displacement: (56, 9); 49.45s

Self-Portrait - Blue Displacement: (-81, -29) , Red Displacement: (98, 7); 50.41s

Three-Generations - Blue Displacement: (-54, -16) , Red Displacement: (59, -5); 49.89s

Train - Blue Displacement: (-48, -7) , Red Displacement: (42, 26); 49.20s

Workshop - Blue Displacement: (-55, 0) , Red Displacement: (53, -12); 48.55s

Peonies - Blue Displacement: (-53, -4) , Red Displacement: (53, -10); 43.45s

d) Reflection: Challenges and Improvements

One thing I struggled with was getting the multi-scale pyramid implementation correct. This algorithm was tougher to test than the single-scale version because of its longer runtime. One bug that persisted for a while was improperly rescaling the offset when recursively moving up a level in the pyramid. Once this was fixed, however, the alignment worked fairly well, accommodating a much larger offset range than the single-scale approach.

One way this algorithm could be improved is by allowing for error in the offsets calculated at the smaller pyramid levels. When images are rescaled to be much smaller, some of the offset estimates are likely to be slightly off once scaled back up; searching the neighboring offsets at the higher levels of the pyramid would mitigate this. Since the visually noticeable alignment error in my images was minimal, I didn't implement it.

 

e) Alignment Algorithm Output Summary

 

Image                  Blue Displacement   Red Displacement   Time Elapsed for Alignment
cathedral.jpg          (-5, -2)            (7, 1)             0.35s
church.tif             (-25, -4)           (33, -8)           46.61s
emir.tif               (-49, -25)          (57, 16)           45.50s
harvesters.tif         (-61, -18)          (64, -4)           46.78s
icon.tif               (-44, -18)          (48, 5)            45.73s
lady.tif               (-56, -9)           (62, 4)            43.98s
melons.tif             (-86, -11)          (96, 2)            49.20s
monastery.jpg          (3, -2)             (6, 1)             0.32s
onion_church.tif       (-53, -27)          (56, 9)            49.45s
self_portrait.tif      (-81, -29)          (98, 7)            50.41s
three_generations.tif  (-54, -16)          (59, -5)           49.89s
tobolsk.jpg            (-3, -3)            (4, 0)             0.41s
train.tif              (-48, -7)           (42, 26)           49.20s
workshop.tif           (-55, 0)            (53, -12)          48.55s

Extra Images (From Library of Congress webpage: https://www.loc.gov/pictures/collection/prok/)

Image                  Blue Displacement   Red Displacement   Time Elapsed for Alignment
transfiguration.jpg    (-8, -3)            (10, 2)            0.35s
buildings.jpg          (-3, -1)            (4, 1)             0.34s
exit.jpg               (-8, -2)            (8, 1)             0.32s
peonies.tif            (-53, -4)           (53, -10)          43.45s

iii. Bells and Whistles

 

After completing the main part of the project, I was curious whether I could crop the images or equalize them to make the colors pop more. I started writing an auto-crop mechanism using Sobel edge detection filters. It worked best for the smaller images and less well for the larger ones, likely because the frame edges were easier to detect at the smaller scale. Some of the generated images are shown below:
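
For reference, the Sobel filtering itself can be sketched in a few lines of pure NumPy (the kernels are the standard ones; the helper names are illustrative). Rows or columns with consistently strong gradient responses near the image boundary are what a crop heuristic would look for:

```python
import numpy as np

# standard Sobel kernels for x- and y-gradients
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
KY = KX.T

def filter2d(img, kernel):
    """'Valid'-mode 2D filtering via a sliding-window view."""
    kh, kw = kernel.shape
    windows = np.lib.stride_tricks.sliding_window_view(img, (kh, kw))
    return np.einsum('ijkl,kl->ij', windows, kernel)

def sobel_magnitude(img):
    """Gradient magnitude; rows/columns of consistently high values
    suggest a frame border worth cropping."""
    return np.hypot(filter2d(img, KX), filter2d(img, KY))
```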

 

From here, I wanted to continue developing the cropping algorithm, but I got stuck on detecting appropriate cropping indices. Instead, I implemented a naive cropping algorithm so I could try improving the images in other ways, and I also attempted a histogram equalization algorithm. My final results are below. Please note that these images were compressed significantly to fit under a 25MB upload limit.
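
The histogram equalization step can be sketched as follows (assuming a single channel with values in [0, 1]; the function name is illustrative): map each pixel through the image's own normalized cumulative histogram, which spreads the intensities more evenly across the range and makes the colors pop.

```python
import numpy as np

def equalize_histogram(channel, n_bins=256):
    """Remap intensities through the normalized CDF so the output
    histogram is approximately flat over [0, 1]."""
    hist, bin_edges = np.histogram(channel, bins=n_bins, range=(0.0, 1.0))
    cdf = hist.cumsum().astype(float)
    cdf /= cdf[-1]  # normalize the CDF to end at 1
    # each pixel's new value is (approximately) its percentile in the image
    return np.interp(channel.ravel(), bin_edges[:-1], cdf).reshape(channel.shape)
```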