CS 194-26 Project 1

Before color photography existed, Prokudin-Gorskii proposed that a color image could be produced by taking three exposures of a scene on a glass plate, each through a red, green, or blue filter. Overlaying the three aligned exposures would produce an RGB color photo. The main goal of this project is to produce colorized images from the digitized glass plate negatives by finding the correct alignment of the three channel images using image processing techniques.

Algorithmic Approach

A naive algorithm for finding the approximate displacement vectors that align the images is to exhaustively search a window of displacements in [-20, 20] pixels along each axis, using the Sum of Squared Differences (SSD) as the metric that scores how well the images match. I used this naive algorithm to colorize the given jpg images. I processed the images without their borders, which I removed manually by eyeballing how many pixels to cut from each side and hardcoding that value (20 pixels from each side). I also used StandardScaler to normalize the image arrays. The algorithm first finds the best displacement vector for aligning the green channel to the blue channel, then does the same for the red channel. The blue channel stays fixed while the green or red channel is shifted within the [-20, 20] pixel range, and the displacement with the best SSD score is kept.
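The exhaustive search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (ssd, crop, align_exhaustive) and the use of np.roll for shifting are assumptions, and the StandardScaler normalization step is left out for brevity.

```python
import numpy as np


def ssd(a, b):
    """Sum of squared differences; a lower score means a better match."""
    return float(np.sum((a - b) ** 2))


def crop(img, margin):
    """Drop `margin` pixels from every side (hypothetical helper)."""
    return img[margin:-margin, margin:-margin] if margin > 0 else img


def align_exhaustive(ref, moving, radius=20, margin=20):
    """Return the (dy, dx) shift of `moving` that best matches `ref`.

    Every shift in [-radius, radius] on both axes is tried. Scores are
    computed on the cropped interior only, so the borders (and the pixels
    that np.roll wraps around) do not influence the result as long as
    margin >= radius.
    """
    ref_c = crop(ref, margin)
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = ssd(ref_c, crop(shifted, margin))
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With blue as the fixed reference, calling align_exhaustive(blue, green) and align_exhaustive(blue, red) would give the two displacement vectors, and applying np.roll with each result yields the channels to stack into the final image.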

For the high-resolution images, the naive approach would take too long, so I used a coarse-to-fine search instead. Here I used Normalized Cross-Correlation (NCC) as the matching metric (not for any particular reason; SSD would have worked similarly). I created four versions ("levels") of each image, the first with significantly lower resolution and each succeeding one with twice the resolution of the previous. On each level I ran an algorithm similar to the naive one to find the best displacement for that level, then refined that displacement in the next iteration using the next higher-resolution level. To keep the search fast, I halved the search window at every iteration, since each higher-resolution level has more pixels to compare and only needs to refine the estimate carried over from the coarser level. I also removed the borders before processing, but this time I estimated them at about 20% of the image size. To further optimize the code, I aligned the other channels to the green channel instead of the blue channel, which allowed a smaller search window since the displacements were smaller.
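A rough sketch of this coarse-to-fine search is shown below. It is an illustration under several assumptions rather than the project's code: all names are hypothetical, downsampling is done by plain 2x2 block averaging, the 20% border is interpreted as a fraction cropped from each side, the starting radius of 8 is arbitrary, and the estimate is doubled when moving to the next level so that it stays in that level's pixel units.

```python
import numpy as np


def ncc(a, b):
    """Normalized cross-correlation; a higher score means a better match."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0


def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (simple stand-in)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def crop_fraction(img, frac):
    """Drop a fraction `frac` of the height and width from each side."""
    dy, dx = int(img.shape[0] * frac), int(img.shape[1] * frac)
    return img[dy:img.shape[0] - dy, dx:img.shape[1] - dx]


def search(ref, moving, center, radius, frac):
    """Exhaustive NCC search in a window of `radius` around `center`."""
    ref_c = crop_fraction(ref, frac)
    best_score, best_shift = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = ncc(ref_c, crop_fraction(shifted, frac))
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift


def align_pyramid(ref, moving, levels=4, radius=8, frac=0.2):
    """Return the full-resolution (dy, dx) shift of `moving` onto `ref`."""
    refs, movs = [ref], [moving]
    for _ in range(levels - 1):  # index 0 is full resolution
        refs.append(downsample(refs[-1]))
        movs.append(downsample(movs[-1]))
    shift = (0, 0)
    for level in range(levels - 1, -1, -1):  # coarsest level first
        shift = search(refs[level], movs[level], shift, radius, frac)
        if level > 0:
            # One pixel at this level is two pixels at the next level.
            shift = (shift[0] * 2, shift[1] * 2)
            radius = max(1, radius // 2)  # halve the window each iteration
    return shift
```

With these defaults, the coarsest level covers roughly 64 full-resolution pixels in each direction while evaluating only 17 x 17 candidates, and the later levels each check a handful of neighbouring shifts. With green as the reference, the calls would be align_pyramid(green, blue) and align_pyramid(green, red).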

Challenges

I was able to significantly improve the alignment of my images by processing them without the borders. I also tried edge detection (the roberts filter from a Python library), but this was only successful for some images, such as monastery.jpg, where the edges were more defined. Other images, such as cathedral.jpg, remained blurry. Dropping edge detection and simply cropping the borders aligned all the photographs well.
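For reference, the edge-detection variant mentioned above amounts to aligning edge maps instead of raw intensities. The sketch below is a hand-rolled Roberts cross operator in NumPy, given only as an assumption about what such a filter computes; it is not the library call used in the project, and a library implementation may scale its output differently.

```python
import numpy as np


def roberts_edges(img):
    """Gradient magnitude from the two 2x2 Roberts cross kernels.

    The output is one pixel smaller than the input along each axis.
    """
    img = np.asarray(img, dtype=float)
    gx = img[:-1, :-1] - img[1:, 1:]   # difference along one diagonal
    gy = img[:-1, 1:] - img[1:, :-1]   # difference along the other diagonal
    return np.sqrt(gx ** 2 + gy ** 2)
```

Because the operator only looks at differences between neighbouring pixels, adding a constant brightness offset to a channel leaves its edge map unchanged, which is the appeal of matching on edges when the three channels differ in overall intensity.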