CS194-26 Project 1: Colorizing the Prokudin-Gorskii Photo Collection

Background

Sergey Prokudin-Gorskii was an innovative Russian photographer who experimented with color photography well ahead of his time. In the early 1900s, he sought to take photos all over Russia in the hopes of making a documentary of the Russian Empire to teach schoolchildren about the culture and diversity of the empire.

He captured hundreds of scenes, each yielding three photographs: one taken through a red filter, one through a green filter, and one through a blue filter. Though the resulting three photographs are black and white, if they are aligned and projected through filters of the same colors, they reproduce a color version of the scene.

The above is an example of three photos taken of a cathedral, each representing a different color channel. After aligning and stacking the channels (and a few other tricks), we can get the following image:

Naive algorithm

We can brute force an alignment of the red and green channels onto the blue channel by exhaustively searching over a specified window and calculating the sum of squared differences for every possible alignment in that window. We can then take the alignment with the smallest sum of squared differences loss.

I used a search window of [-15, 15] pixels for both the x and y displacements.
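The exhaustive search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the (dy, dx) ordering, and the use of np.roll (which wraps pixels around the image border) are all assumptions made for the example.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    return float(np.sum((a - b) ** 2))

def align_naive(channel, reference, window=15):
    """Return the (dy, dx) shift of `channel` with the smallest SSD to `reference`.

    Every shift in [-window, window] is tried for both axes.
    Assumption: shifts are applied with np.roll, so pixels wrap around.
    """
    best_shift, best_loss = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            loss = ssd(shifted, reference)
            if loss < best_loss:
                best_shift, best_loss = (dy, dx), loss
    return best_shift
```

With window=15 this evaluates 31 x 31 = 961 candidate shifts, each requiring a full pass over the image, which is why the cost grows quickly with image size.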

This approach works really well on these small JPG files, but is slow on large images due to the repeated calculation of the sum of squared differences loss.

Pyramid search

In order to speed the algorithm up, we can use a coarse-to-fine image pyramid. Here we use the assumption that the alignment for two images is close to the alignment on the two downscaled images (after scaling the resulting displacement vector by 1/(sqrt(scale)) where scale is the scale factor we used when downscaling the image's resolution).

Different levels of the image pyramid, with quality getting worse down to the 13th level.

I used a 10-level image pyramid, downscaling the image resolution by a factor of 0.75 at each level. I chose 10 levels because it was a good balance between speed and quality. As you get to lower and lower resolutions, the alignments you find drift further from the best alignment for the original image.
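A coarse-to-fine search of this kind might look like the sketch below. It is deliberately simplified and is not the implementation used here: to stay dependency-free it halves the image at each level by averaging 2x2 blocks instead of downscaling by 0.75, so the coarse displacement is multiplied by 2 when moving up a level. The function names, the refinement radius, and the wrap-around shifting via np.roll are assumptions for illustration.

```python
import numpy as np

def best_shift(channel, reference, center=(0, 0), radius=2):
    """Exhaustive SSD search over shifts within `radius` of `center`."""
    best, best_loss = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            loss = np.sum((shifted - reference) ** 2)
            if loss < best_loss:
                best, best_loss = (dy, dx), loss
    return best

def halve(image):
    """Downscale by 2 along each axis by averaging 2x2 blocks."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, reference, levels=3, base_window=15, refine=2):
    """Align at the coarsest level first, then refine at each finer level."""
    if levels == 0:
        # Base case: full exhaustive search on the smallest images.
        return best_shift(channel, reference, (0, 0), base_window)
    coarse = align_pyramid(halve(channel), halve(reference),
                           levels - 1, base_window, refine)
    # Scale the coarse estimate up to this level's resolution (factor 2 here).
    guess = (2 * coarse[0], 2 * coarse[1])
    return best_shift(channel, reference, guess, refine)
```

The wide search happens only on the smallest images. Every finer level checks just a (2 * refine + 1)^2 neighborhood around the scaled-up estimate, which is where the speedup comes from.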

The results from using the image pyramid are pretty remarkable, producing good results for almost every image.

Bells and whistles

Histogram equalization

As learned in class, images whose histograms are closer to uniform have higher contrast. We can use skimage's built-in histogram equalization function for this. For some images, the results are very good!
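As a rough sketch, equalization with scikit-image could be applied as below. The writeup does not say whether the color channels were equalized separately or together, so treating each channel independently is an assumption, as is the helper name equalize_rgb.

```python
import numpy as np
from skimage import exposure

def equalize_rgb(image):
    """Histogram-equalize each color channel of an (H, W, C) image.

    Assumption: channels are equalized independently of one another.
    exposure.equalize_hist returns floating-point values in [0, 1].
    """
    channels = [exposure.equalize_hist(image[..., c])
                for c in range(image.shape[-1])]
    return np.stack(channels, axis=-1)
```

Equalizing channels independently can shift the color balance, which may be part of why the effect looks good on some images and less so on others.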

The top row does not have histogram equalization, but the bottom row does, and the colors in the bottom row are more vibrant. I especially love the middle image of a church.

Automatic cropping

A naive approach would crop a fixed amount off of every side of the final image. Instead, we want to crop only the parts of the image that are either black or a solid bar of color. The approach that I used was to crop all rows and columns whose average value is less than 0.05 or greater than 0.95.
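The thresholding rule above could be sketched as follows. The writeup does not specify whether the average is taken per color channel or over all channels together. This example assumes a per-channel check on a float image in [0, 1], so that a bar that is saturated in only one channel is also removed. The function name autocrop is hypothetical.

```python
import numpy as np

def autocrop(image, low=0.05, high=0.95):
    """Drop rows and columns whose per-channel mean is outside [low, high].

    `image` is assumed to be a float array of shape (H, W, 3) in [0, 1].
    Assumption: a row or column is removed if ANY channel's mean is out of range.
    """
    row_means = image.mean(axis=1)  # shape (H, 3): mean of each row, per channel
    col_means = image.mean(axis=0)  # shape (W, 3): mean of each column, per channel
    keep_rows = np.all((row_means >= low) & (row_means <= high), axis=1)
    keep_cols = np.all((col_means >= low) & (col_means <= high), axis=1)
    return image[keep_rows][:, keep_cols]
```

Note that this removes every row or column that fails the test, not only those at the border. A very dark or very bright band inside the picture would be removed too.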

Examples on other photos in the collection.

Issues encountered

I had several issues while aligning. My worst image visually was probably the self-portrait image, and I assume it's because it is a very large image with very fine features, so small alignment mistakes are very visible. Using a larger window for my base case would likely produce a more visually accurate result.

I tried Canny edge detection, but it performed poorly on several images, such as lady.tif.

The problem spec asks us to align to the blue channel, but that gave me poor results on emir.tif. Aligning to green gave me better results on every image, including emir.tif.

Displacements and time taken

Image title            B Displacement  R Displacement  Time taken
workshop.tif           (-53, 0)        (51, -11)       34.881289105
emir.tif               (-48, -24)      (54, 17)        33.859198490000004
monastery.jpg          (3, -2)         (6, 1)          0.43461654500001146
church.tif             (-25, -4)       (33, -8)        36.504083897
three_generations.tif  (-52, -13)      (56, -2)        47.809113333
melons.tif             (-76, -6)       (82, 2)         35.90759688699998
onion_church.tif       (-50, -27)      (54, 11)        34.932735506
train.tif              (-42, -6)       (43, 27)        34.723513911
tobolsk.jpg            (-3, -3)        (4, 1)          0.4902734819999637
icon.tif               (-41, -17)      (47, 5)         47.12127929000002
cathedral.jpg          (-5, -2)        (7, 1)          0.7231956809999929
self_portrait.tif      (-72, -26)      (83, 6)         39.33717431400004
harvesters.tif         (-57, -16)      (60, -3)        40.981361505999985
lady.tif               (-52, -8)       (57, 3)         47.164769616