Jason Zou - Project 1

Overview

In the early 1900s, Sergei Mikhailovich Prokudin-Gorskii travelled across the Russian Empire photographing people, buildings, landscapes, and more. Although there was no way to print color photographs at the time, Prokudin-Gorskii captured each scene as three exposures taken through red, green, and blue filters, envisioning that there would one day be a way to combine these exposures into a full-color picture. The goal of this project is to finish what he started.

Approach and Analysis

Each scene is provided as one large image file containing the three filtered exposures stacked from top to bottom in blue, green, red order. After isolating each exposure and converting it to float32, the blue and red exposures are aligned to the green exposure using a multi-scale processing scheme called an image pyramid (which falls back to single-scale processing if the image width is less than 350px). At each scale of the pyramid, the algorithm searches over a user-specified range of x and y displacements and takes as the "best" displacement the one that minimizes the sum of squared differences (SSD, i.e. the squared L2 distance) between the two images. The blue and red exposures are then shifted by their displacement vectors (x, y) and stacked with the green exposure to produce the final image.
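The pipeline described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names, the 2x2 block-average downsampling, and the use of np.roll (which wraps pixels around the border) are all assumptions made for brevity. A practical version would likely score only an interior crop so that wrapped pixels and plate borders do not influence the SSD.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate into (blue, green, red) float32 channels."""
    h = plate.shape[0] // 3
    plate = plate.astype(np.float32)
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def ssd(a, b):
    """Sum of squared differences (squared L2 distance) between two images."""
    return float(np.sum((a - b) ** 2))

def search(moving, ref, window, center=(0, 0)):
    """Exhaustively test every (dx, dy) within `window` of `center`."""
    best, best_score = center, np.inf
    for dy in range(center[1] - window, center[1] + window + 1):
        for dx in range(center[0] - window, center[0] + window + 1):
            # np.roll wraps around the border (an assumption of this sketch).
            score = ssd(np.roll(moving, (dy, dx), axis=(0, 1)), ref)
            if score < best_score:
                best, best_score = (dx, dy), score
    return best

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4

def pyramid_align(moving, ref, window=15, min_width=350):
    """Return the (dx, dy) shift that best aligns `moving` to `ref`."""
    if ref.shape[1] < min_width:
        # Small image: single-scale search.
        return search(moving, ref, window)
    # Estimate the shift at half resolution, then refine around twice that estimate.
    coarse = pyramid_align(downsample(moving), downsample(ref), window, min_width)
    return search(moving, ref, window, center=(2 * coarse[0], 2 * coarse[1]))
```

With this sketch, the blue and red shifts would be obtained as pyramid_align(blue, green) and pyramid_align(red, green), and each channel would be shifted by its result before stacking.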

Initially, the optimal alignment was found using an exhaustive search across all possible displacement vectors (x, y) in the range [-15, 15], with the blue filter as the reference. This process took up to 2 minutes for the large .tif files and generally performed well, barring a few instances such as Emir, where the alignment was visibly off and produced slight red/blue fringes around the outline of the person.

Cathedral
Emir


In an attempt to find a better algorithm that could also align Emir successfully, I found that a greedy approach that aligns w.r.t. the green filter worked just as well and was much more efficient. The greedy approach is as follows: at each scale, the optimal y-displacement in the range [-15, 15] is first calculated w.r.t. the green filter, and then the optimal x-displacement in the same range is calculated from the optimally y-aligned image. Intuitively, since the exposures were captured through three lenses/filters stacked on top of each other, using the green filter as the 'center' and optimizing along the y-axis first can produce a reasonable result even with a relatively small search window of displacements.
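A single-scale version of this greedy search might look like the sketch below. The names are hypothetical and np.roll (with wraparound) stands in for whatever shifting the actual code uses. With a window of 15, it scores 2 x 31 = 62 candidate shifts per scale instead of the 31 x 31 = 961 that an exhaustive search would score, which is where the runtime savings come from.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two images."""
    return float(np.sum((a - b) ** 2))

def greedy_search(moving, ref, window=15):
    """Pick the best y-shift first, then the best x-shift given that y-shift."""
    shifts = range(-window, window + 1)
    # Step 1: optimize the vertical displacement alone.
    best_dy = min(shifts, key=lambda dy: ssd(np.roll(moving, dy, axis=0), ref))
    y_aligned = np.roll(moving, best_dy, axis=0)
    # Step 2: optimize the horizontal displacement on the y-aligned image.
    best_dx = min(shifts, key=lambda dx: ssd(np.roll(y_aligned, dx, axis=1), ref))
    return best_dx, best_dy
```

Because the y-shift is chosen while the x-axis is still misaligned, this search is not guaranteed to reach the same displacement as the exhaustive search.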

Cathedral
Emir


For most images, switching from exhaustive to greedy search did not noticeably affect the end result. For the cathedral image, however, the greedy search produced a poorer alignment, likely because differing brightness across the R and G channels creates non-intuitive local optima. On the other hand, the greedy method colorized Emir better than the exhaustive search did, which was the original goal. So it seems that in some cases the greedy method lands on a poorer local optimum than the exhaustive method, while in other cases the greedy result looks better than the global SSD optimum, since differing brightness across channels means the minimum-SSD displacement is not always the best-looking one. Overall, both methods produced a quality alignment on most images, and I chose to stick with the greedy method simply on the basis of runtime.

Final Greedy Results

The following images were all colorized with a displacement search window of [-15, 15] at each scale. Shifts are (x, y) displacements relative to the green exposure.

Cathedral

Blue Shift: (0, 0) ; Red Shift: (1, 4)

Monastery

Blue Shift: (-2, 3) ; Red Shift: (1, 6)

Tobolsk

Blue Shift: (-3, -3) ; Red Shift: (1, 4)

Village

Blue Shift: (-12, -64) ; Red Shift: (10, 73)

Onion Church

Blue Shift: (-26, -51) ; Red Shift: (10, 57)

Three Generations

Blue Shift: (-14, -53) ; Red Shift: (-3, 58)

Icon

Blue Shift: (-17, -40) ; Red Shift: (5, 48)

Emir

Blue Shift: (-24, -49) ; Red Shift: (17, 57)

Harvesters

Blue Shift: (-16, -59) ; Red Shift: (-3, 65)

Lady

Blue Shift: (-9, -51) ; Red Shift: (3, 61)

Self Portrait

Blue Shift: (-29, -78) ; Red Shift: (8, 98)

Workshop

Blue Shift: (0, -53) ; Red Shift: (-11, 52)

Melons

Blue Shift: (-10, -81) ; Red Shift: (4, 96)

Train

Blue Shift: (-5, -42) ; Red Shift: (27, 43)

Other Greedy Results

Man on Boat

Blue Shift: (22, -52) ; Red Shift: (-33, 55)

Jesus on a Cross in front of a House

Blue Shift: (-7, -21) ; Red Shift: (-6, 28)

Rope Bridge

Blue Shift: (-9, -14) ; Red Shift: (-1, 2)

Extras

Auto-crop

After the initial alignment, most if not all pictures still have 'color artifacts' near the edges due to offsets during alignment, faulty exposures, or degradation of the film. Automatically detecting these artifacts and cropping the picture accordingly produces a nicer result in most cases. Using image gradients, these artifacts are detected as horizontal/vertical lines along which the average absolute gradient lies above a certain threshold, and the image is cropped up to the deepest occurrence of such a line.
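A rough sketch of this gradient-based crop is shown below. The threshold value, the restriction of the search to the outer 10% of the image (so that strong lines in the scene itself are not mistaken for borders), and the function name are all assumptions for illustration. The sketch takes a single 2-D image with values in [0, 1]; for a color image, the bounds could be computed per channel and the tightest ones used.

```python
import numpy as np

def crop_bounds(gray, threshold=0.1, max_frac=0.1):
    """Return (top, bottom, left, right) so that gray[top:bottom, left:right] is the crop."""
    h, w = gray.shape
    # Average absolute gradient along each horizontal and vertical line.
    row_strength = np.abs(np.diff(gray, axis=0)).mean(axis=1)  # length h - 1
    col_strength = np.abs(np.diff(gray, axis=1)).mean(axis=0)  # length w - 1

    def deepest(strength, limit):
        # Index just past the innermost strong line within `limit` pixels of the edge.
        hits = np.flatnonzero(strength[:limit] > threshold)
        return int(hits[-1]) + 1 if hits.size else 0

    max_rows, max_cols = int(h * max_frac), int(w * max_frac)
    top = deepest(row_strength, max_rows)
    bottom = h - deepest(row_strength[::-1], max_rows)
    left = deepest(col_strength, max_cols)
    right = w - deepest(col_strength[::-1], max_cols)
    return top, bottom, left, right
```

Returning the bounds rather than the cropped array makes it easy to apply the same crop to all three aligned channels.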

Auto-contrast

In the age of smartphones, fancy cameras, and social media, it is common for everyday photos to be auto-contrasted or modified by applying some sort of filter. Usually, the channel adjustments result in more "realistic" or "higher quality" photos that more closely resemble what we see. Given the older technology that was used to take these photographs, it would be interesting to see how they look through a more modern "lens". The following adjustments were made according to the methodology described here.
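The exact adjustment used here follows the linked methodology and is not reproduced in this write-up. For readers who want a feel for what an auto-contrast step can look like, below is one common approach, a per-channel percentile stretch. It is offered only as an illustrative assumption and is not necessarily the method used for the results that follow. The function name and percentile defaults are hypothetical.

```python
import numpy as np

def stretch_contrast(img, low_pct=1.0, high_pct=99.0):
    """Linearly rescale each channel so its low/high percentiles map to 0 and 1."""
    img = img.astype(np.float32)
    out = np.empty_like(img)
    for c in range(img.shape[2]):
        lo, hi = np.percentile(img[..., c], [low_pct, high_pct])
        scale = hi - lo if hi > lo else 1.0  # guard against flat channels
        # Values outside the chosen percentiles are clipped to the [0, 1] range.
        out[..., c] = np.clip((img[..., c] - lo) / scale, 0.0, 1.0)
    return out
```

Clipping at the 1st and 99th percentiles rather than the absolute minimum and maximum keeps a few extreme pixels (for example, scratches on the plate) from dominating the stretch.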

Some Greedy Results Before/After both Adjustments