Images of the Russian Empire -- colorizing the Prokudin-Gorskii photo collection

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) lived in a time before color photography. Convinced that it would exist in the future, he photographed the Russian Empire in greyscale, taking three exposures of each scene on a glass plate through red, green, and blue filters. In this project, we used image processing to turn the digitized Prokudin-Gorskii glass plate images into color images. Our program builds on the provided Python starter code, which loads each provided image (a single scan containing three greyscale exposures, one per color channel) into a NumPy array and performs a simple 1/3 split to separate the three exposures into their own matrices.

Naive Offset Alignment

First, the greyscale images for the RGB channels must be properly aligned on top of each other to produce a colorized image.

We first cropped 10% off of each image. This was done to remove border artifacts in the original images that would interfere with alignment matching.

We then set the blue filter image as the base template and searched a [-15,15] pixel window of displacements for each of the other two images to find the offset that scored best under our image matching metric. To offset the red and green channel images, we looped through the range of x and y offsets and applied np.roll to the two image matrices. We ended up using normalized cross-correlation between the blue channel image matrix and each of the other two as the alignment metric, as it produced visually better offsets than Sum of Squared Differences.
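A minimal sketch of this search, not the project's actual code. The function names `ncc` and `find_offset`, the (row, column) ordering of the offsets, and the mean subtraction inside the metric are all assumptions made for illustration. The inputs are assumed to be the already-cropped channel images.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized 2-D arrays.

    Assumption: both images are mean-centered before normalizing.
    """
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def find_offset(channel, base, window=15, center=(0, 0)):
    """Try every (dy, dx) within `window` pixels of `center`.

    Returns the shift that, applied to `channel` with np.roll,
    gives the highest NCC score against `base`.
    """
    best_score, best_shift = -np.inf, center
    for dy in range(center[0] - window, center[0] + window + 1):
        for dx in range(center[1] - window, center[1] + window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(shifted, base)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With the blue channel as `base`, this would be called once for red and once for green, for example `find_offset(green_cropped, blue_cropped)`, where both names are placeholders for the cropped channel arrays.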

Once the proper alignments were found, the original, non-cropped red and green channel images were shifted using np.roll with their respective offsets and overlaid, along with the original blue channel image, into a matrix with depth 3 to produce the final color image.
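The final composition step could look roughly like the following. The helper name `compose_rgb` and the (row, column) shift convention are assumptions for illustration, and `np.dstack` is one possible way to build the depth-3 matrix.

```python
import numpy as np

def compose_rgb(r, g, b, r_shift, g_shift):
    """Shift the full-size red and green channels and stack them with blue.

    r_shift and g_shift are (dy, dx) offsets found by the alignment search.
    Returns an array of shape (height, width, 3) in R, G, B order.
    """
    r_aligned = np.roll(r, r_shift, axis=(0, 1))
    g_aligned = np.roll(g, g_shift, axis=(0, 1))
    return np.dstack([r_aligned, g_aligned, b])
```

Note that np.roll wraps pixels around to the opposite side of the image, so the shifted channels carry a thin band of wrapped content along two of their edges.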



cathedral.jpg	monastery.jpg	settlers.jpg

Although this method works for small images, the provided .tif examples were too large for the program to finish processing them in a reasonable amount of time. A [-15,15] window means 31^2 = 961 candidate offsets per channel, and computing a roll and a cross-correlation for each one in a plain for loop on images several thousand pixels wide is very slow. The alignment was also poor, as the search window was small relative to the image size.

Image Pyramid

In order to speed up the naive alignment process, we implemented an image pyramid and performed the offset search on it. An image pyramid is a collection of copies of a single image at different resolutions. The offset search can start at the coarsest scale and gradually work down to the finest scale, which reduces the search space needed at each level.

In the actual implementation, we first determined how many levels of image resolution we wanted, halving the width and height at each level. We set the minimum width to 100 pixels and took the ceiling of the base-2 logarithm of the ratio of the image width to the minimum width to get the number of levels in the pyramid.
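As a sketch of that level count, under the assumption that the logarithm is base 2 (since each level halves the width) and that "levels" means the number of halvings. The function name `pyramid_levels` is hypothetical.

```python
import math

def pyramid_levels(width, min_width=100):
    """Number of times the image is halved, per the ceiling-of-log rule."""
    if width <= min_width:
        return 0  # already small enough, no downscaling needed
    return math.ceil(math.log2(width / min_width))

print(pyramid_levels(3200))  # 3200 / 100 = 32 = 2**5, so 5 halvings
```

Because of the ceiling, a width that is not an exact power-of-two multiple of the minimum (3700, say) gets one extra halving, so the coarsest level ends up narrower than 100 pixels.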

We then used sk.transform.rescale to resize the image to the current level resolution starting with the smallest. We ran the original naive offset function from before on the scaled images with a smaller [-5,5] pixel search window. The returned offsets were then doubled to maintain scale with the next resolution level and used as the center offset of the new search window.

By searching at multiple resolution levels, we could narrow down the best offset quickly at the lower resolutions. As a result, we did not need a large search window at the highest resolution, which is what had killed performance.
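The coarse-to-fine loop can be sketched as follows. This is an illustration under assumptions, not the project's code: `downscale2` is a NumPy-only stand-in (2x2 block averaging) for the sk.transform.rescale call, `search` is a compact copy of a windowed NCC search so that the block runs on its own, and all function names are hypothetical.

```python
import numpy as np

def downscale2(img):
    """Halve each dimension by averaging 2x2 blocks (stand-in for rescale)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, base, center, window):
    """Score every shift within `window` of `center` with NCC; return the best."""
    b = base - base.mean()
    best_score, best = -np.inf, center
    for dy in range(center[0] - window, center[0] + window + 1):
        for dx in range(center[1] - window, center[1] + window + 1):
            s = np.roll(channel, (dy, dx), axis=(0, 1))
            s = s - s.mean()
            score = (s * b).sum() / (np.linalg.norm(s) * np.linalg.norm(b) + 1e-12)
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best

def pyramid_offset(channel, base, levels, window=5):
    """Align `channel` to `base`, refining the offset from coarse to fine."""
    # Build the pyramid with the full-resolution pair first.
    pyramid = [(channel, base)]
    for _ in range(levels):
        c, b = pyramid[-1]
        pyramid.append((downscale2(c), downscale2(b)))

    offset = (0, 0)
    for i, (c, b) in enumerate(reversed(pyramid)):  # coarsest level first
        if i > 0:
            # One pixel at the previous level is two pixels at this one.
            offset = (2 * offset[0], 2 * offset[1])
        offset = search(c, b, offset, window)
    return offset
```

With a window of 5 at every level, each level costs 11^2 = 121 candidate shifts, yet the total reach at full resolution grows roughly by a factor of two per level.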



harvesters.tif	icon.tif	train.tif

At this point, most of the high resolution .tif files could be properly aligned. However, emir.tif proved to be a special case as the different channel images had significantly different brightnesses.


	3d emir.tif

Better Features: Edge Detection

To fix alignment on emir.tif, we decided to perform image matching on edge-filtered channel images. Our goal was to ignore brightness on the greyscale images. Since alignment is defined solely by how well the edges of images match, using edge filtering as the feature seemed natural.

Before searching offsets, we first convolved each of the input images with the Sobel filter using scipy.signal.convolve2d and took the absolute value of the result to create a simple edge version of the image. We could now perform alignment using just the edges.
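A minimal sketch of such an edge filter with scipy.signal.convolve2d. Whether the project used one Sobel kernel or both is not stated, so summing the absolute horizontal and vertical responses is an assumption here, as are the `mode="same"` and `boundary="symm"` settings and the name `edge_map`.

```python
import numpy as np
from scipy.signal import convolve2d

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def edge_map(img):
    """Return |Gx| + |Gy|, an edge-strength image the same size as `img`."""
    gx = convolve2d(img, SOBEL_X, mode="same", boundary="symm")
    gy = convolve2d(img, SOBEL_Y, mode="same", boundary="symm")
    return np.abs(gx) + np.abs(gy)
```

Because the Sobel kernels sum to zero, adding a constant brightness to a channel leaves its edge map unchanged, which is the property that helps when the channels differ in overall brightness.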



red channel edges	green channel edges	blue channel edges

Applying this filter to the search images fixed alignment for emir.tif and had no effect on other images that already worked.


	fixed emir.tif

Contrast Enhancement

We used contrast stretching to perform a small bit of contrast enhancement. While searching for histogram equalization functions, we found skimage.exposure.rescale_intensity to be a nice fit here. Setting the bounds to be the 5th and 95th percentile made some minor differences to the resulting images.
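The percentile stretch can be illustrated with plain NumPy. This sketch only approximates what skimage.exposure.rescale_intensity does when the two percentiles are passed as its `in_range`; the function name `stretch_contrast` and the [0, 1] output range are assumptions.

```python
import numpy as np

def stretch_contrast(img, low_pct=5, high_pct=95):
    """Map the [low_pct, high_pct] percentile range of `img` onto [0, 1].

    Values outside that range are clipped to 0 or 1.
    """
    lo, hi = np.percentile(img, (low_pct, high_pct))
    if hi <= lo:
        return img.astype(float)  # flat image, nothing to stretch
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)
```

Clipping at the 5th and 95th percentiles sacrifices detail in the darkest and brightest 5% of pixels in exchange for a wider spread everywhere else.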


original self_portrait.tif	contrast-enhanced self_portrait.tif

Border Cropping

This ended up being a slightly complex mess, arrived at through a guess and check process. My initial thought was that I could compress the final RGB image to a single greyscale image and then run an edge filter on it. The borders would produce long, clear vertical and horizontal edges with greater intensity values than edges in the actual image, which would not be as continuous and clear. Therefore, I could just average the pixel values along each row/column and threshold to find which rows/columns corresponded to border edges, then crop up to those values. The border edges would have the highest averages. This approach did not work very well, as a lot of the border edges were not clearly defined.

What I ended up doing was the inverse, namely detecting the rows and columns with an extremely low average intensity. I found that the borders in the edge images showed up most consistently as low-intensity space rather than as high-intensity edges. Therefore I thresholded mainly on the rows/columns with the lowest average intensity, and only included very high intensity rows/columns as insurance.

In terms of actual implementation, I first used skimage.exposure.equalize_adapthist to increase the contrast of the original image as much as possible, so that the actual image content would produce as much noise as possible in the edge image. The values of the edge-filtered images were then normalized to [0, 1]. Afterwards, I averaged the values along each row/column and found the indices of those with an average intensity less than 0.03 (low-intensity border space) or greater than 0.4 (strong border edges). I cropped at the detected rows and columns closest to the center on each side of the image.
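The thresholding and cropping step might be sketched like this. It takes an edge image that has already been computed (the adaptive equalization and edge filtering are left out), and the function names plus the `max_frac` parameter, which limits how far in from each side a detected row or column may lie, are hypothetical additions. The default of 0.5 lets each side search up to the center.

```python
import numpy as np

def crop_bounds_1d(means, low=0.03, high=0.4, max_frac=0.5):
    """Return (start, stop) crop indices from per-row or per-column means."""
    n = len(means)
    flagged = np.flatnonzero((means < low) | (means > high))
    limit = int(n * max_frac)
    head = flagged[flagged < limit]        # candidates on the top/left side
    tail = flagged[flagged >= n - limit]   # candidates on the bottom/right side
    # Crop at the flagged index closest to the center on each side.
    start = head.max() + 1 if head.size else 0
    stop = tail.min() if tail.size else n
    return int(start), int(stop)

def auto_crop(rgb, edges, **kwargs):
    """Crop `rgb` using row/column averages of the edge image `edges`."""
    edges = edges - edges.min()
    peak = edges.max()
    if peak > 0:
        edges = edges / peak               # normalize to [0, 1]
    top, bottom = crop_bounds_1d(edges.mean(axis=1), **kwargs)
    left, right = crop_bounds_1d(edges.mean(axis=0), **kwargs)
    return rgb[top:bottom, left:right]
```

Taking the flagged index closest to the center means that a single low-intensity row inside the picture (a blank sky, say) can pull the crop far inward. Lowering `max_frac` is one possible guard against that.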

This ended up working decently well for some images.



turkmen.tif	train.tif	harvesters.tif

It was not perfect, as it sometimes left small remnants of border color that did not form proper borders. Occasionally this method would not pick up a boundary, especially if the border edge was extremely noisy. There were also cases in which too much was cropped.


original icon.tif	overcropped icon.tif

Experimental: Color Level Adjustment

I thought that certain images were too red so I looked to perform histogram equalization on the red channel with skimage.exposure.equalize_hist. I personally prefer the less red images after equalization. This only worked for certain overly red images, and made other images overly blue/green.
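For illustration, red-channel equalization can be written with plain NumPy as below. This is a sketch of the general histogram-equalization technique (mapping each value through the channel's cumulative distribution), not a copy of skimage.exposure.equalize_hist. The function names, the 256-bin default, and the R, G, B channel order are assumptions.

```python
import numpy as np

def equalize_channel(channel, nbins=256):
    """Histogram-equalize one 2-D channel; output values lie in [0, 1]."""
    hist, bin_edges = np.histogram(channel.ravel(), bins=nbins)
    cdf = hist.cumsum() / hist.sum()
    centers = (bin_edges[:-1] + bin_edges[1:]) / 2
    # Map each pixel through the cumulative distribution function.
    return np.interp(channel.ravel(), centers, cdf).reshape(channel.shape)

def equalize_red(rgb):
    """Return a float copy of `rgb` with only the red channel equalized."""
    out = rgb.astype(float).copy()
    out[..., 0] = equalize_channel(out[..., 0])
    return out
```

Since only one channel is remapped, the color balance of the whole image shifts, which fits the observation that images that were not overly red to begin with came out too blue/green.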

default emir.jpg

Red linearized emir.jpg

Offset Results


cathedral.jpg r: (12, 3) g: (5, 2)	emir.tif r: (107, 40) g: (49, 24)	harvesters.tif r: (124, 14) g: (60, 17)

icon.tif r: (90, 23) g: (42, 17)	lady.tif r: (120, 13) g: (56, 9)	monastery.jpg r: (3, 2) g: (-3, 2)

nativity.jpg r: (8, 0) g: (3, 1)	self_portrait.tif r: (176, 37) g: (78, 29)	settlers.jpg r: (14, -1) g: (7, 0)

three_generations.tif r: (111, 9) g: (54, 12)	train.tif r: (85, 29) g: (41, 2)	turkmen.tif r: (117, 28) g: (57, 22)

village.tif r: (137, 21) g: (64, 10)