CS 194-26 Project 1: Colorizing the Prokudin-Gorskii Photo Collection

Sean Farhat

Overview

In 1907, Sergei Mikhailovich Prokudin-Gorskii travelled around the Russian Empire taking photos of everything that caught his fancy: landscapes, people, buildings, etc. But he did something strange: he took 3 pictures of everything. Why? Well, at the time, color photography had not been invented yet, but our man was thinking hundreds of steps ahead of everyone: he knew that Alyosha Efros would eventually teach a class on Computational Photography, and in this class, he would get his students to use modern-day technology to colorize the photos for him! To pull this off, he exposed each scene three times onto a glass plate, once each through a red, a green, and a blue filter. Thus, here we are today, grabbing these glass plate negatives and colorizing them. This code is dedicated to you, Mr. Prokudin-Gorskii.

Method

First, I separated each glass plate scan into the individual RGB channel images by dividing its height by 3. This mostly worked, but the leftover border around each channel had a different thickness. Since this caused problems later on, I accounted for it by cropping a constant 20% off every side of each channel.
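The splitting and cropping can be sketched as follows. This is a minimal NumPy version under my own naming; the array `plate` and the function names are illustrative, and the 20% fraction matches the description above:

```python
import numpy as np

def split_channels(plate):
    """Split a vertically stacked glass-plate scan into its three thirds.

    The plates are stacked blue, green, red from top to bottom.
    """
    h = plate.shape[0] // 3  # each channel occupies one third of the height
    return plate[:h], plate[h:2*h], plate[2*h:3*h]

def crop(channel, frac=0.20):
    """Crop a constant fraction off every side to discard uneven borders."""
    dy = int(channel.shape[0] * frac)
    dx = int(channel.shape[1] * frac)
    return channel[dy:-dy, dx:-dx]
```

Integer division handles scans whose height is not an exact multiple of 3, at the cost of dropping at most two pixel rows at the bottom.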

For my alignment algorithm, I chose the Sum of Squared Differences (SSD) metric between images. Using the blue channel image as the baseline, I exhaustively searched a 15x15 window of displacements and kept the one with the lowest SSD.
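A minimal sketch of the exhaustive SSD search (the function name is mine; `np.roll` is used for shifting, and since it wraps around at the borders, the border crop above matters for the score):

```python
import numpy as np

def align_exhaustive(img, ref, radius=15):
    """Return the (dy, dx) shift of `img` that minimizes SSD against `ref`."""
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(img, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - ref) ** 2)  # sum of squared differences
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```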

This became intractable for the larger images, so I implemented an image pyramid as follows:

  1. Scale down the image by factors of 2 until a resolution below 100x100 is reached
  2. At this coarsest scale, search over the 15x15 window
  3. Recurse back up one level to the next-higher resolution, double the displacement returned by the level below to account for the factor of 2, and search a 5x5 window around that displacement
  4. Repeat step 3 until the original resolution is reached
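The steps above can be sketched as the following self-contained recursion. The 2x downscale here is a simple box average; my actual implementation may differ in its resizing details:

```python
import numpy as np

def ssd_search(img, ref, center, radius):
    """Search a (2*radius+1)^2 window of shifts around `center`, minimizing SSD."""
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = np.sum((np.roll(img, (dy, dx), axis=(0, 1)) - ref) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downscale(im):
    """Halve the resolution by averaging 2x2 blocks."""
    h, w = im.shape[0] // 2 * 2, im.shape[1] // 2 * 2
    im = im[:h, :w]
    return (im[0::2, 0::2] + im[1::2, 0::2] + im[0::2, 1::2] + im[1::2, 1::2]) / 4.0

def pyramid_align(img, ref, min_size=100):
    """Coarse-to-fine alignment: recurse to the coarsest level (step 1),
    search exhaustively there (step 2), then refine the doubled estimate
    at each finer level on the way back up (steps 3-4)."""
    if max(img.shape) < min_size:
        return ssd_search(img, ref, (0, 0), radius=15)
    dy, dx = pyramid_align(downscale(img), downscale(ref), min_size)
    return ssd_search(img, ref, (2 * dy, 2 * dx), radius=2)  # 5x5 refinement
```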

This method worked pretty well. Below you can see its results on the example images:

Name                    G displacement   R displacement   Time (s)
cathedral.jpg           [5 2]            [12 3]           0.345
emir.tif                [49 24]          [0 -332]         13.336
harvesters.tif          [59 17]          [123 14]         12.652
icon.tif                [40 17]          [89 23]          14.761
lady.tif                [49 8]           [109 11]         15.333
melons.tif              [81 10]          [178 13]         15.232
monastery.jpg           [-3 2]           [3 2]            0.341
onion_church.tif        [51 26]          [108 36]         14.770
self_portrait.tif       [78 29]          [176 37]         16.462
three_generations.tif   [50 14]          [110 12]         15.803
tobolsk.jpg             [3 3]            [7 3]            0.302
train.tif               [42 6]           [87 32]          14.927
village.tif             [0 -58]          [137 22]         15.400
workshop.tif            [53 -1]          [105 -12]        15.488

Analysis

As we can see, the emir and village pictures didn't align well. So, I decided to move from raw color values to edges as features. I accomplished this by first blurring each channel with a 5x5 Gaussian filter, then running the blurred image through a Sobel filter to extract its edges. I kept the same cropping scheme. This turned out to fix the problem! Below you can see what the edge images look like and the resulting aligned images using these features:

Name          G displacement   R displacement   Time (s)
emir.tif      [49 24]          [107 39]         10.363
village.tif   [64 12]          [137 22]         18.520
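The edge-feature extraction can be sketched in plain NumPy as below. My actual code may use library filtering routines instead, and the Gaussian's sigma here is an illustrative choice; the result is then fed into the same SSD alignment as before:

```python
import numpy as np

def filter2d(im, k):
    """Minimal 'same'-size 2D filtering via edge padding (small odd kernels)."""
    ph, pw = k.shape[0] // 2, k.shape[1] // 2
    padded = np.pad(im, ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros(im.shape, dtype=float)
    for i in range(k.shape[0]):
        for j in range(k.shape[1]):
            out += k[i, j] * padded[i:i + im.shape[0], j:j + im.shape[1]]
    return out  # cross-correlation; sign-irrelevant for a gradient magnitude

def gaussian_kernel_5x5(sigma=1.0):
    ax = np.arange(-2, 3, dtype=float)
    g = np.exp(-ax**2 / (2 * sigma**2))
    k = np.outer(g, g)
    return k / k.sum()

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)

def edge_features(im):
    """Blur with a 5x5 Gaussian, then take the Sobel gradient magnitude."""
    blurred = filter2d(im, gaussian_kernel_5x5())
    gx = filter2d(blurred, SOBEL_X)
    gy = filter2d(blurred, SOBEL_X.T)
    return np.hypot(gx, gy)
```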

Other Images

Here is the result of running the overall algorithm, with all the improvements, on other images from the dataset.

Name          G displacement   R displacement   Time (s)
obelisk.tif   [24 21]          [59 29]          11.052
flowers.tif   [13 6]           [41 1]           14.424
mural.tif     [48 13]          [112 19]         14.046

Conclusion

Overall, this project was a great introduction to the cool things image processing can do. It was pretty amazing that such a standard, straightforward metric (SSD) did so well. Things got trickier as I tried to fix more problems, though. For example, I had to tune the Gaussian blur and the pyramid's coarsest resolution until I got values that worked for the images I used, which says nothing about how well they generalize to other images. If I had more time, I would probably implement automatic cropping via high-threshold edge detection, color normalization by rescaling to the lightest color, or rotations and scaling in addition to pure translation. All of these are pretty straightforward and not too difficult to implement, so I'm curious how much they would help.

I guess I should start the next project earlier so I'll be able to explore these things...