CS 194-26: Image Manipulation and Computational Photography, Fall 2017

Project 1: Images of the Russian Empire

Xin Yu Tan, cs194-26-aee

Overview

Sergei Mikhailovich Prokudin-Gorskii was Russian photographer who took separate red-, green-, and blue-filtered images of different locations around the world. These red, blue, and green prints have been stored in the Library of Congress, and we can post process them to a compute a single RGB image from the three separate channels.

The approach I used to align the red, green, and blue channels was to compute the normalized cross coefficient. Given im1 and im2 representing two channels of the same image as vectors in R^n, the normalized cross coefficient is defined as:

im1/norm(im1) • im2/norm(im2), where • is the vector dot product

Images that receive a higher NCC value are better aligned. Therefore, we can separately perturb the red and green channels by shifting to different values in a 30x30 grid and computing which shifted version of the red and green channels have a higher NCC value as compared to the blue channel. The shifted versions of the red and green channels that have the highest NCC are then stacked with blue channel to form an RGB image.

For images that are much larger, it is more efficient to find the best translational shift on each layer of an image pyramid for the red, green, and blue channels. We repeatedly scale images to 1/2 size until the image is no smaller than 64x64 pixels. We then find the best translation shift of this low resolution image and apply an appropriate shift to the original image. (For example, if the image was scaled down by a factor of 8, then we would want to apply 8 times the best translation shift of the lower resolution image back to the full size image.) We then continue with an image no smaller than 128x128, 256x256, etc. until we reach the original sized image. This way we can compute the NCC for a much larger range of translational shifts in a relatively fast manner.

The main challenge I faced was computing the NCC including the outer parts of red, green, and blue channels. In the naive implementation on small images, I had cropped the image by 15 pixels on each side and that worked well to discount the distortions at the edges of the images (which arise because the images were recorded on physical material whose quality decayed over the passage of time). However, I used this same 15 pixel cropping for the larger images; however, that didn't fully remove the edge distortions. What worked well was removing 10% of pixels on each side of the image instead.