CS194-26 Project 1 Website

In this project, we were given several images that were taken separately with blue, green, and red color filters. The objective was to extract these three filtered images and overlay them on top of each other so that they form a full color image.

Approach

I first extracted the three color channels by splitting the given image into vertical thirds. To find the displacement, I started with a simple method: since there are 3 images, I had to align both the red and the green filtered images to the blue image. I used np.roll() to shift each image up, down, left, and right within a range of ±15 pixels, then compared each shifted image to the "base" (in this case, the blue image). I used the Sum of Squared Differences (SSD), which is the squared L2 norm of the difference, to calculate the distance between the two, and chose the offset with the smallest distance. This simple "exhaustive search" method worked for the smaller images (.jpg files), but for the larger images I had to take a "pyramid search" approach.
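A minimal sketch of the exhaustive search described above. The function names (split_channels, ssd, exhaustive_align) are hypothetical and not taken from the project code, and the blue, green, red top-to-bottom order of the thirds is an assumption.

```python
import numpy as np

def split_channels(plate):
    """Split a stacked plate into thirds; blue, green, red order is assumed."""
    h = plate.shape[0] // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def exhaustive_align(channel, base, radius=15):
    """Return the (dy, dx) shift of `channel` with the lowest SSD against `base`."""
    best_offset, best_score = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around the border instead of discarding them.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, base)
            if score < best_score:
                best_score, best_offset = score, (dy, dx)
    return best_offset
```

With a radius of 15 this evaluates 31 x 31 = 961 candidate shifts per channel, which is why it only stays practical on the smaller images.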
The pyramid search consisted of recursively scaling the original image down, finding an offset at the coarse scale, then moving back up through the scales to refine the displacement at each finer level. I scaled the image down by a factor of 2 on each level, around 3 times in total. The search window was also smaller, with a range of ±5 pixels.
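The recursion can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the helper names are hypothetical, downscaling is done by averaging 2x2 blocks (the original may have used a library resize), and the ±5 window is assumed to apply at every level.

```python
import numpy as np

def downscale2(img):
    """Halve each dimension by averaging 2x2 blocks (odd edge rows/cols are dropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, base, center, radius):
    """Try every shift within `radius` of `center`; return the one with the lowest SSD."""
    best, best_score = center, float("inf")
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = float(np.sum((shifted.astype(np.float64) - base) ** 2))
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best

def pyramid_align(channel, base, levels=3, radius=5):
    """Estimate the shift at the coarsest level, then double and refine it per level."""
    if levels == 0:
        return search(channel, base, (0, 0), radius)
    coarse = pyramid_align(downscale2(channel), downscale2(base), levels - 1, radius)
    # One pixel at the coarser level corresponds to two pixels at this level.
    guess = (2 * coarse[0], 2 * coarse[1])
    return search(channel, base, guess, radius)
```

Because each level doubles the coarse estimate before refining it, a small window per level can still recover displacements far larger than ±5 pixels at full resolution.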

Problems

As mentioned in the Approach section above, I had originally been aligning the red and green channels to the blue channel. However, after also testing red and green as the base, I changed my implementation to align red and blue to the green channel instead. The two exceptions were the Melons and the Lady, which were aligned to the red channel.
Another change I had to make was to run the SSD calculation only on the "center" part of the image rather than on the entire image. For example, if an image was shifted horizontally by 5px, I would run the calculation on the center (width - 2*5px) portion of the image to eliminate comparisons along the edges that I knew would not be relevant. I also had to divide the resulting distance by the number of pixels compared each time; otherwise the algorithm would naturally always return the largest offsets (which give the smallest center portion and thus the smallest SSD).
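A rough sketch of the center-only, per-pixel score described above. The function name cropped_mean_ssd is hypothetical, and trimming |dy| rows and |dx| columns from each side is my reading of the "(width - 2*5px)" example; the shift is assumed to be small enough that the cropped region is not empty.

```python
import numpy as np

def cropped_mean_ssd(channel, base, dy, dx):
    """Mean squared difference over the center region left after trimming the shift margins."""
    shifted = np.roll(channel, (dy, dx), axis=(0, 1)).astype(np.float64)
    h, w = base.shape
    my, mx = abs(dy), abs(dx)
    # Trim |dy| rows and |dx| columns from every side so wrapped-around pixels are ignored.
    a = shifted[my:h - my, mx:w - mx]
    b = base[my:h - my, mx:w - mx].astype(np.float64)
    # Taking the mean instead of the sum keeps smaller crops from scoring lower by default.
    return float(np.mean((a - b) ** 2))
```

Taking the mean is the same as dividing the SSD by the number of pixels compared, so scores from differently sized center regions stay comparable.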