Compsci 194-26 Project 1

Nicholas Figueira

Overview

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) took a number of photographs through three different filters: one red, one green, and one blue. The purpose of this project is to align those filtered photographs and combine them into color photographs.

Approach

Overlaying the three filtered images for each picture directly as the RGB color channels resulted in blurry images because the images were not exactly aligned. To find the displacement between the filtered images, I tried to minimize the L2 norm of the difference between their pixel values. This assumes that the brightness of the images is similar in the same areas, and so tries to match those brightnesses. Initially, I did an exhaustive search over a small window of displacements, using for loops, to find the displacement of the red and green filtered images from the blue filtered image.
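The exhaustive search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the default window of 15 pixels, and the use of np.roll (which wraps pixels around the border instead of cropping) are all assumptions.

```python
import numpy as np

def l2_distance(a, b):
    """L2 norm of the pixel-wise difference between two equally sized images."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return np.sqrt(np.sum(diff ** 2))

def align_exhaustive(channel, reference, window=15):
    """Try every (dy, dx) in [-window, window]^2 and return the shift of
    `channel` that minimizes the L2 distance to `reference`.

    Assumption: np.roll is used for shifting, so pixels wrap around the
    image border; a real implementation may prefer to crop the borders.
    """
    best_shift = (0, 0)
    best_score = np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = l2_distance(shifted, reference)
            if score < best_score:
                best_score = score
                best_shift = (dy, dx)
    return best_shift
```

The cost grows with the square of the window size times the number of pixels, which is consistent with the search being fast on small images and slow on large ones.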

This worked for jpg images, but took far too long for the larger tif images. So I switched to using an image pyramid: recursively scaling the image to half of its size until it reached a threshold pixel height, and then simply searching by displacing the image by one pixel in each direction.
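One way to read the pyramid idea is sketched below. It rests on several assumptions that the write-up does not confirm: the one-pixel search is repeated at every pyramid level around the doubled estimate from the coarser level, halving is done by 2x2 block averaging, and shifting uses np.roll with wrap-around. The names downscale_half, align_pyramid, min_height, and radius are hypothetical.

```python
import numpy as np

def downscale_half(img):
    """Halve an image by averaging 2x2 blocks (an assumed stand-in for
    whatever rescaling routine the project actually used)."""
    h = img.shape[0] // 2 * 2
    w = img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid(channel, reference, min_height=50, radius=1):
    """Coarse-to-fine alignment: estimate the shift on a half-size copy,
    double it, then refine by +/- `radius` pixels at the current scale."""
    if channel.shape[0] <= min_height:
        est_dy, est_dx = 0, 0  # coarsest level: start from no displacement
    else:
        coarse_dy, coarse_dx = align_pyramid(
            downscale_half(channel), downscale_half(reference),
            min_height, radius)
        est_dy, est_dx = 2 * coarse_dy, 2 * coarse_dx

    ref = reference.astype(np.float64)
    best_shift, best_score = (est_dy, est_dx), np.inf
    for dy in range(est_dy - radius, est_dy + radius + 1):
        for dx in range(est_dx - radius, est_dx + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            # The squared L2 norm has the same minimizer as the L2 norm.
            score = np.sum((shifted.astype(np.float64) - ref) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Under these assumptions, each level only evaluates nine candidate shifts, yet a one-pixel correction at the coarsest level corresponds to a large displacement at full resolution, so the total work stays small even for large tif images.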

Once displacements were found, I stacked the images on top of each other with their respective RGB channels.
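The stacking step might look like the sketch below. The function name stack_rgb and its parameters are hypothetical, and np.roll is again an assumed way of applying a displacement (it wraps pixels around the border).

```python
import numpy as np

def stack_rgb(red, green, blue,
              red_shift=(0, 0), green_shift=(0, 0), blue_shift=(0, 0)):
    """Apply a (dy, dx) shift to each channel and stack them into an
    H x W x 3 array in R, G, B order. The reference channel keeps the
    default shift of (0, 0)."""
    channels = [
        np.roll(red, red_shift, axis=(0, 1)),
        np.roll(green, green_shift, axis=(0, 1)),
        np.roll(blue, blue_shift, axis=(0, 1)),
    ]
    return np.dstack(channels)
```

For example, when the blue image is the reference, only red_shift and green_shift would be passed, and blue_shift would stay at its default.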

Problems

As mentioned in the previous section, I rescaled images until they reached a certain threshold for pixel height. This caused a minor problem in that different images seemed to work with slightly different thresholds. For example, church.jpg would work with a value of 100, but not 25 or 50, while self-portrait would align properly with a threshold of 25, and not 50 or 100.

Another problem I had was that emir.tif was not aligning properly, likely because there was a large amount of blue and very little red (specifically in the man's shirt), meaning that the assumption of similar brightnesses was invalid.

Because the brightness of the green filtered image seemed to fall between that of the red and blue filtered images, I switched to aligning the red and blue filtered images to the green filtered image. This worked perfectly for emir.tif and also turned out to fix my problem of needing different thresholds for rescaling: a threshold of 50 now worked for all of the images. I suspect that it fixed the threshold problem because some of the other images also had areas with a lot of blue, such as the sky and water.

Final Approach

After implementing the speedup and fixing my issues, my final algorithm aligned the red and blue filtered images to the green filtered image using an image pyramid that scaled the images down until their height reached 50 pixels.

Gallery

Cathedral: Blue (-2 5), Red (1 7)

Monastery: Blue (-2 3), Red (1 6)

Tobolsk: Blue (-3 -3), Red (1 4)

Harvesters: Blue (-16 -59), Red (-3 65)

Melons: Blue (-11 -82), Red (4 96)

Self Portrait: Blue (-29 -79), Red (8 98)

Icon: Blue (-17 -41), Red (5 48)

Three Generations: Blue (-14 -53), Red (-3 58)

Workshop: Blue (0 -53), Red (-11 52)

Church: Blue (-4 -25), Red (-8 33)

Lady: Blue (-8 -56), Red (3 62)

Onion Church: Blue (-27 -51), Red (10 57)

Emir: Blue (-24 -49), Red (17 57)

Train: Blue (-6 -43), Red (27 43)

Others

River: Blue (1 -39), Red (-5 112)

Forest: Blue (31 127), Red (37 59)

Pond: Blue (2 -51), Red (-14 61)