CS 194-26: Image Manipulation and Computational Photography, Fall 2018

Project 1: Images of the Russian Empire

Riku Miyao, CS194-26-acq



Overview

The goal of this assignment was to take the digitized Prokudin-Gorskii glass plate images, each of which contains three grayscale images of the same scene taken through a different color filter, and align the three images to generate a colorized reconstruction. To do this, we score candidate x-y offsets of one color channel against another channel using a specific similarity metric. We take the offset of the green channel that most closely matches the blue channel, and the offset of the red channel that most closely matches the blue channel, and use those two offsets to align the three channels. To speed up the process, I also implemented an image pyramid, so that the number of search steps needed to find the correct offset grows roughly logarithmically with the image size.
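As a rough illustration of the overall pipeline, the sketch below splits a plate into three channels and stacks them into a color image once the offsets are known. It assumes the three exposures are stacked vertically in blue, green, red order from top to bottom, which this write-up does not state, and it applies offsets with a circular shift. The names `split_plate` and `compose` are hypothetical and are not taken from my submitted code.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate into (blue, green, red) thirds.

    Assumption: the frames appear top to bottom as blue, green, red.
    Any leftover rows at the bottom (height not divisible by 3) are dropped.
    """
    h = plate.shape[0] // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def compose(blue, green, red, g_off, r_off):
    """Shift green and red by their (dy, dx) offsets and stack as RGB.

    np.roll wraps pixels around the border, so the edges of the result
    contain wrapped content that would normally be cropped away.
    """
    g = np.roll(green, g_off, axis=(0, 1))
    r = np.roll(red, r_off, axis=(0, 1))
    return np.dstack([r, g, blue])
```

Here `g_off` and `r_off` are the (y, x) offsets of the green and red channels relative to blue; the sign convention is whatever the alignment routine that produced them uses.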

Approach

Initial Approach

Initially I implemented the naive approach, where I searched image offsets from -15 to 15 pixels and compared them using the normalized cross correlation (ncc) metric. To avoid noise from the borders, I only compared the pixels in the central rectangle that starts 25% of the width from the left edge and 25% of the height from the top edge, and extends right and down by 50% of the image's width and height. To implement this metric without explicit for loops, I used the map function in python3, with itertools.product specifying the range of pixels to compare. This worked well for aligning small images, but using this algorithm on larger images would require searching over a far larger pixel range, which calls for an image pyramid.
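The exhaustive search described above can be sketched as follows. This is a minimal illustration and not my submitted code: the names `center_crop`, `ncc`, and `align_exhaustive` are hypothetical, the ncc here subtracts the mean before normalizing (one common variant), and shifts are applied circularly with `np.roll`. The returned (dy, dx) is the shift to apply to `channel` so that it lines up with `reference`.

```python
import itertools
import numpy as np

def center_crop(img):
    """Keep the central 50% of the image in each dimension."""
    h, w = img.shape
    return img[h // 4:h // 4 + h // 2, w // 4:w // 4 + w // 2]

def ncc(a, b):
    """Normalized cross correlation of two equally sized arrays.

    Both inputs are mean-centered and divided by their norms, so the
    score is unchanged if either input is multiplied by a positive constant.
    """
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align_exhaustive(channel, reference, radius=15):
    """Try every (dy, dx) in [-radius, radius]^2 and keep the best score."""
    ref = center_crop(reference)

    def score(offset):
        shifted = np.roll(channel, offset, axis=(0, 1))
        return ncc(center_crop(shifted), ref)

    offsets = itertools.product(range(-radius, radius + 1), repeat=2)
    return max(offsets, key=score)
```

With the default radius of 15 this scores 31 x 31 = 961 candidate offsets, which is why the same search becomes impractical once the required range grows with the image size.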

To speed up the algorithm on tiff images, I implemented an image pyramid through a small modification of the code. Using recursion, I repeatedly resized the image to half the width and half the height until the image was less than 10 pixels tall. I then aligned the image using the same ncc metric, only testing offsets of (-1, 0, 1)x(-1, 0, 1). I returned the best-scoring offset to the calling recursive function, which used an image twice the width and height of the previous iteration and performed the same alignment, testing offsets of (2*previous alignment)+(-1, 0, 1)x(-1, 0, 1). The factor of 2 accounts for the fact that the image we are aligning on is twice the width and height of the one in the previous iteration. We keep repeating this process until we reach the original resolution, and return the resulting offset. With this, even the larger tiff images can be aligned in about 2 minutes. This is the approach I used for most of the images. However, this algorithm fails to align the Emir image because the brightness of his clothes in the red channel differs from that in the other channels.
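The recursion described above can be sketched as follows. This is an illustration under stated assumptions and not my submitted code: the names `_score`, `halve`, and `align_pyramid` are hypothetical, downsampling is done by averaging 2x2 blocks as a stand-in for a proper image resize, and shifts are circular via `np.roll`.

```python
import itertools
import numpy as np

def _score(channel, reference, offset):
    """ncc between the central 50% crops after rolling `channel` by `offset`."""
    h, w = reference.shape
    rows = slice(h // 4, h // 4 + h // 2)
    cols = slice(w // 4, w // 4 + w // 2)
    a = np.roll(channel, offset, axis=(0, 1))[rows, cols]
    b = reference[rows, cols]
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def halve(img):
    """Downsample by 2 in each dimension by averaging 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, reference, min_height=10):
    """Coarse-to-fine alignment; returns the (dy, dx) to apply to `channel`."""
    if reference.shape[0] < min_height:
        guess = (0, 0)  # coarsest level: search around zero offset
    else:
        coarse = align_pyramid(halve(channel), halve(reference), min_height)
        # One pixel at the coarser level spans two pixels at this level.
        guess = (2 * coarse[0], 2 * coarse[1])
    candidates = [(guess[0] + dy, guess[1] + dx)
                  for dy, dx in itertools.product((-1, 0, 1), repeat=2)]
    return max(candidates, key=lambda off: _score(channel, reference, off))
```

Each level scores only 9 candidates, and the number of levels grows with the logarithm of the image height, which is where the speedup over the exhaustive search comes from. The trade-off is that a wrong choice at a coarse level can only be corrected by one pixel per finer level.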

The issue with the Emir picture is that we cannot directly compare raw pixel values, because the color channels differ too much in brightness. However, we can use the differences between adjacent pixels in our metric instead. To do this, instead of using the raw pixel values, I used the gradient magnitude at each pixel to compute our metric. Our estimate of the magnitude of the gradient is simply

We can ignore the scaling factor of the gradient because everything is normalized by the ncc in the end. This aligns Emir well, but does not work as well for the other images, so I did not use the magnitude of the gradient for them. I have shown both results for Emir below.
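For illustration, one way to compute a gradient-magnitude image is sketched below. The exact estimate I used is not reproduced in this text, so this version, based on NumPy's `np.gradient` finite differences, is an assumption, and `gradient_magnitude` is a hypothetical name. The result would be passed to the ncc metric in place of the raw channel.

```python
import numpy as np

def gradient_magnitude(img):
    """Per-pixel gradient magnitude from finite differences.

    np.gradient returns the derivative along axis 0 (rows, y) first and
    axis 1 (columns, x) second. Any constant scale factor in the estimate
    does not matter once the result is fed into a normalized metric.
    """
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    return np.hypot(gx, gy)
```

A useful property for the Emir case is that the gradient magnitude is unchanged when a channel's brightness is inverted or offset by a constant, so a region that is bright in one channel and dark in another still produces matching edges.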

Results

Image                                Green-Blue Offset (y, x)    Red-Blue Offset (y, x)
Cathedral                            (-5, -2)                    (-12, -3)
Emir (Incorrect Alignment)           (-48, -24)                  (-429, 488)
Emir (Using magnitude of gradient)   (-49, -24)                  (-107, 40)
Harvesters                           (-59, -17)                  (-123, -14)
Icon                                 (-41, -17)                  (-89, -23)
Lady                                 (-54, -8)                   (-115, -11)
Monastery                            (3, -2)                     (-3, -2)
Nativity                             (-3, -1)                    (-8, 0)
Self Portrait                        (-78, -29)                  (-175, -37)
Settlers                             (-7, 0)                     (-15, 1)
Three Generations                    (-52, -14)                  (-111, -12)
Train                                (-42, -6)                   (-86, -32)
Turkmen                              (-55, -21)                  (-116, -28)
Village                              (-64, -12)                  (-137, -22)
Lake                                 (-37, 6)                    (-144, 13)
Towers                               (-56, -13)                  (-107, -24)

Conclusion

In the end, I was successful in aligning most of the images. Had I more time, I would optimize my code to remove some of the redundant computation, specifically in the gradient calculation. Though I did not have enough time to implement any bells and whistles, I am satisfied with my result, and this was a great opportunity to mess around with image manipulation techniques.