Project 1: Images of the Russian Empire: Colorizing the Prokudin-Gorskii Photo Collection

Justin Chen

Overview

The primary goal of this project is to assemble colorized images by aligning RGB glass plate negatives, specifically those from the Prokudin-Gorskii collection of the Russian Empire, taken in the early days of color photography. Since each of the three negatives is technically a separate image in its own right, there are offsets between them, which we correct systematically via image processing. At its core, the fundamental objective of this project is to find four parameters: the (x, y) pixel shifts required for two of the three color negatives to align with the third.

Exhaustive Search

To start, I implemented a naive brute-force search for the four offset values: for each of the red and green negatives, I try every displacement in [-15, 15] along both axes, compute the Sum of Squared Differences (SSD) against the blue negative, and keep the lowest-scoring (best) displacement. Before any searching, I added a preprocessing step that crops 20% off the borders, since I found through experimentation that the plate borders and artifacts were interfering with the SSD metric. For the jpg images this works fine, as the files are not too large, but the approach becomes inefficient and infeasible for the larger tif images, where the displacement window that must be searched grows far too large. A rough sketch of the search is included below, followed by the results and the found displacement values for the jpg images.
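
The sketch below is a minimal illustration of this brute-force search, assuming the plate has already been split into three equal-size float channels b, g, r; helper names like crop_borders, ssd, and exhaustive_align are my own, and details such as the exact per-side crop fraction are illustrative rather than the exact code.

    import numpy as np

    def crop_borders(img, frac=0.2):
        # Drop a fraction of each border so plate edges/artifacts don't skew the metric
        h, w = img.shape
        dh, dw = int(h * frac), int(w * frac)
        return img[dh:h - dh, dw:w - dw]

    def ssd(a, b):
        # Sum of Squared Differences between two equally sized float channels
        return np.sum((a - b) ** 2)

    def exhaustive_align(channel, base, window=15):
        # Try every (dy, dx) shift in [-window, window] and keep the lowest-SSD one
        best_shift, best_score = (0, 0), np.inf
        base_cropped = crop_borders(base)
        for dy in range(-window, window + 1):
            for dx in range(-window, window + 1):
                shifted = np.roll(channel, (dy, dx), axis=(0, 1))
                score = ssd(crop_borders(shifted), base_cropped)
                if score < best_score:
                    best_score, best_shift = score, (dy, dx)
        return best_shift

    # e.g. aligning green and red to blue:
    # g_shift = exhaustive_align(g, b)
    # r_shift = exhaustive_align(r, b)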

Cathedral
G: [5, 2]; R: [12, 3]
Monastery
G: [-3, 2]; R: [3, 2]
Tobolsk
G: [3, 3]; R: [6, 3]

Image Pyramid Approach

To accommodate the large file sizes of the tif images, I switched to an image pyramid approach. The core idea is to estimate the displacement on a small, downscaled version of the image by running the exhaustive window search from earlier, then scale both the image and the found offset up by some factor (in this case, a factor of 2), repeating until we are back at the original resolution. At the end, I added a short window search within +/- 2 of the final offset to account for rounding errors introduced by the upscaling; I found that this extra refinement improved the quality of the resulting images by a noticeable margin. I kept the cropping and metric choices from the previous section as well. The algorithm is recursive, with the base case hit once the downscaled image falls below 100 pixels in either dimension. A sketch is included below, followed by the output images and their displacement values.
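
A rough sketch of the recursive pyramid is below, reusing the crop_borders, ssd, and exhaustive_align helpers from the earlier sketch; skimage's rescale is one way to do the downsampling, and the +/- 2 refinement mirrors the rounding correction described above. This is a sketch of the idea, not the exact implementation.

    import numpy as np
    from skimage.transform import rescale

    def pyramid_align(channel, base, window=15):
        # Base case: image is small enough for a plain exhaustive search
        if min(base.shape) < 100:
            return exhaustive_align(channel, base, window)
        # Recurse on half-resolution copies, then scale the found offset back up
        dy, dx = pyramid_align(rescale(channel, 0.5, anti_aliasing=True),
                               rescale(base, 0.5, anti_aliasing=True), window)
        dy, dx = 2 * dy, 2 * dx
        # Short +/- 2 window search around the upscaled estimate to absorb rounding error
        best_shift, best_score = (dy, dx), np.inf
        base_cropped = crop_borders(base)
        for ddy in range(-2, 3):
            for ddx in range(-2, 3):
                shifted = np.roll(channel, (dy + ddy, dx + ddx), axis=(0, 1))
                score = ssd(crop_borders(shifted), base_cropped)
                if score < best_score:
                    best_score, best_shift = score, (dy + ddy, dx + ddx)
        return best_shift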

Church
G: [25, 4]; R: [58, -4]
Harvesters
G: [59, 16]; R: [123, 13]
Icon
G: [41, 17]; R: [89, 23]
Lady
G: [51, 9]; R: [112, 11]
Melons
G: [81, 10]; R: [178, 13]
Onion
G: [51, 26]; R: [108, 36]
Portrait
G: [78, 29]; R: [176, 37]
ThreeGen
G: [53, 14]; R: [112, 11]
Train
G: [42, 5]; R: [87, 32]
Workshop
G: [53, 0]; R: [105, -12]

Bells and Whistles

For the last of the example images, emir.tif, I explored many different approaches, since the baseline SSD + image pyramid search does not work well on this image due to the varying brightness levels across the color channels. I first tried switching up the order of alignment: I tried both aligning blue and green to red, and aligning blue and red to green (which yielded by far the best results of the three orderings), but the results were still not good enough.

I then tried using various edge detectors, such as the Sobel filter and the Canny edge detector, to preprocess the images before scoring them with the metric. At first that also proved fruitless, but after removing the crop from the preprocessing step, the edge-detector methods worked much better. The alignment still wasn't perfect, but the remaining error was minor, with the Sobel, Canny, and Roberts cross methods all producing similar results under both NCC and SSD. From this, it seems possible that the crop was removing too much detail and that there is important edge information near the borders.

In the end, I realized I had a bug that was misaligning the second color channel after its displacement had been found; once that was fixed, the simpler approach of using green as the base channel worked. I still ran it with the edge-detection additions, and the results were very good and consistent. A sketch of the edge-based scoring is included below, followed by the intermediate and final Emir results.
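
As a hedged sketch rather than the exact code, the edge-based scoring looks roughly like the following. It reuses pyramid_align and the split channels b, g, r from the earlier sketches; sobel comes from skimage, and ncc is the alternative metric compared against SSD (it would be maximized, rather than minimized, if swapped into the search).

    import numpy as np
    from skimage.filters import sobel

    def ncc(a, b):
        # Normalized cross-correlation between two channels; higher is better,
        # so maximize it (or minimize its negation) where SSD was minimized
        a = a - a.mean()
        b = b - b.mean()
        return np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

    # Score on edge maps instead of raw intensities so the differing brightness
    # levels of the Emir plates no longer matter; green serves as the base channel.
    b_shift = pyramid_align(sobel(b), sobel(g))   # reported result: [-49, -24]
    r_shift = pyramid_align(sobel(r), sobel(g))   # reported result: [57, 17]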

Baseline Emir (G and R aligned to B)
Emir with B and R aligned to G
Green-aligned Emir with NCC
Non-cropped Emir with NCC and Canny
Final Emir
B: [-49, -24]; R: [57, 17]

Additional Images from the Prokudin-Gorskii Collection

Greenhouse
G: [59, 28]; R: [126, 34]
Cliff
G: [39, -1]; R: [151, -6]
Bush
G: [49, -6]; R: [95, -25]