CS 194-26 Project 1: Colorizing the Prokudin-Gorskii Photo Collection
Sean Farhat
Overview
In 1907, Sergei Mikhailovich Prokudin-Gorskii travelled around the Russian Empire taking photos of everything that struck his fancy: landscapes, people, buildings, etc. But he did something strange: he took 3 pictures of everything. Why? Well, at the time, color photography had not been invented yet, but our man was thinking hundreds of steps ahead of everyone: he knew that Alyosha Efros would eventually teach a class on Computational Photography, and in this class, he would get his students to use modern-day technology to colorize them for him! He captured each scene three times onto a glass plate, once each through a red, a green, and a blue filter. Thus, here we are today, grabbing these glass plate negatives and colorizing them. This code is dedicated to you, Mr. Prokudin-Gorskii.
Method
First, I separated each glass plate scan into the individual RGB channel images by dividing it into thirds vertically. While this generally worked, the downside was that the remaining border around each channel had a different thickness. As this ended up being problematic later on, I accounted for it with a constant crop of 20% of the image on all sides.
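The split-and-crop step can be sketched in NumPy roughly as follows. This is not the original project code; the function name is mine, and I assume the standard Prokudin-Gorskii plate layout of blue, green, red stacked top to bottom:

```python
import numpy as np

def split_and_crop(plate, crop_frac=0.20):
    """Split a vertically stacked glass-plate scan into its B, G, R
    channels (top to bottom) and crop a fixed fraction from every side.
    crop_frac=0.20 matches the 20% crop described above."""
    h = plate.shape[0] // 3  # height of one channel image
    b, g, r = plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

    def crop(im):
        dy = int(im.shape[0] * crop_frac)
        dx = int(im.shape[1] * crop_frac)
        return im[dy:im.shape[0] - dy, dx:im.shape[1] - dx]

    return crop(b), crop(g), crop(r)
```

The fixed-fraction crop is deliberately crude: it throws away real image content to guarantee the uneven plate borders never enter the alignment metric.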
For my alignment algorithm, I chose the Sum of Squared Differences (SSD) metric between the images. Using the blue channel image as the baseline, I searched over a 15x15 window of displacements and kept the one with the lowest SSD.
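A minimal sketch of this exhaustive SSD search, assuming `np.roll` for the displacement (which wraps pixels around the borders, one reason the 20% crop above matters) — the function name is my own, not the project's:

```python
import numpy as np

def align_ssd(channel, base, radius=7):
    """Exhaustively search a (2*radius+1) x (2*radius+1) window of integer
    displacements (15x15 for radius=7) and return the (dy, dx) shift of
    `channel` that minimizes the SSD against the `base` (blue) channel."""
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - base) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Each candidate displacement costs a full pass over the image, so the search is O(window area x pixels) — which is exactly why it breaks down on the large `.tif` scans.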
This became intractable for the larger images, so I implemented an image pyramid as follows:
- Scale down the image by factors of 2 until a resolution below 100x100 was reached
- Here, at the coarsest scale, search over the 15x15 window
- Recurse up one level to a higher resolution. Double the displacement returned by the previous level to account for the factor of 2, and search the 5x5 window around that displacement
- Repeat the previous step until the original resolution is reached
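The steps above can be sketched as a short recursion. This is an illustrative reconstruction, not the original code: I stand in for a proper anti-aliased resize with plain 2x2 block averaging, and the helper names are my assumptions:

```python
import numpy as np

def pyramid_align(channel, base, coarse_size=100):
    """Coarse-to-fine SSD alignment: recurse down by factors of 2 until the
    image is below coarse_size, run a full 15x15 search there, then refine
    with a 5x5 search around the doubled estimate at each finer level."""
    def search(ch, ba, center, radius):
        # exhaustive SSD search over a (2*radius+1)^2 window around `center`
        best_score, best_shift = np.inf, center
        for dy in range(center[0] - radius, center[0] + radius + 1):
            for dx in range(center[1] - radius, center[1] + radius + 1):
                score = np.sum((np.roll(ch, (dy, dx), axis=(0, 1)) - ba) ** 2)
                if score < best_score:
                    best_score, best_shift = score, (dy, dx)
        return best_shift

    def half(im):
        # 2x2 block averaging: a simple stand-in for an anti-aliased resize
        h, w = im.shape[0] // 2 * 2, im.shape[1] // 2 * 2
        im = im[:h, :w]
        return (im[0::2, 0::2] + im[0::2, 1::2]
                + im[1::2, 0::2] + im[1::2, 1::2]) / 4.0

    if max(base.shape) < coarse_size:
        return search(channel, base, (0, 0), radius=7)        # 15x15 window
    dy, dx = pyramid_align(half(channel), half(base), coarse_size)
    return search(channel, base, (2 * dy, 2 * dx), radius=2)  # 5x5 window
```

Because each finer level only searches a 5x5 window around the doubled coarse estimate, the total work grows roughly linearly with pixel count instead of with the full displacement range.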
This method worked pretty well. Below you can see the result of this method on the example images:
| Name | Displacement | Time (s) |
| --- | --- | --- |
| cathedral.jpg | G: [5, 2], R: [12, 3] | 0.345 |
| emir.tif | G: [49, 24], R: [0, -332] | 13.336 |
| harvesters.tif | G: [59, 17], R: [123, 14] | 12.652 |
| icon.tif | G: [40, 17], R: [89, 23] | 14.761 |
| lady.tif | G: [49, 8], R: [109, 11] | 15.333 |
| melons.tif | G: [81, 10], R: [178, 13] | 15.232 |
| monastery.jpg | G: [-3, 2], R: [3, 2] | 0.341 |
| onion_church.tif | G: [51, 26], R: [108, 36] | 14.770 |
| self_portrait.tif | G: [78, 29], R: [176, 37] | 16.462 |
| three_generations.tif | G: [50, 14], R: [110, 12] | 15.803 |
| tobolsk.jpg | G: [3, 3], R: [7, 3] | 0.302 |
| train.tif | G: [42, 6], R: [87, 32] | 14.927 |
| village.tif | G: [0, -58], R: [137, 22] | 15.400 |
| workshop.tif | G: [53, -1], R: [105, -12] | 15.488 |

(Aligned result images omitted.)
Conclusion
Overall, this project was a great introduction to the cool things image processing can do. For such a standard, straightforward metric (SSD) to do so well was pretty amazing. Things became trickier as I tried to fix more problems, though. For example, I had to tune the Gaussian blur and the image pyramid's coarsest resolution until I got good values for the images I used, but this says nothing about how well those settings generalize to other images. If I had more time, I would probably implement automatic cropping via high-threshold edge detection, color normalization by rescaling to the lightest color, or rotations in addition to pure translational shifts. All of these are pretty straightforward and not too difficult to implement, so I'm curious how much they would help.
I guess I should start the next project earlier so I'll be able to explore these things...