CS 194-26: Project 1
Akshit Annadi
In this project, I developed an algorithm that takes digitized Prokudin-Gorskii glass plate images, which are separated into RGB image channels, and automatically produces color images from them using image processing techniques.
Low Resolution
I used the following process to convert low resolution glass plate images to a single color image:
  1. Separate the image into its three color channels by splitting the image into thirds vertically
  2. Crop a pre-specified number of pixels from each side of each color channel so that alignment only uses interior pixels. This ended up being 5% of the pixels on each side, a value chosen after experimentation. Cropping was needed because the borders of the images were causing the algorithm to align on the borders rather than on the internal pixels.
  3. As an additional bell and whistle, run a Canny edge detector on each of the three channel images to featurize the images for alignment.
  4. Align the green and red images to the blue image. This is done by translating the green/red image over a window of possible alignments (15 pixels vertically and horizontally) using circular shifts, and choosing the translation that maximizes the NCC (Normalized Cross-Correlation).
  5. Stack the aligned images of all the color channels to produce a color image
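The steps above can be sketched in NumPy roughly as follows. This is a minimal illustration, not the exact project code: the function names, the 15-pixel window default, and the 5% crop fraction are taken from the description above, and the plate is assumed to stack the blue, green, and red exposures top to bottom.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equal-sized images."""
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

def align(channel, reference, window=15):
    """Find the (dy, dx) circular shift of `channel` that best matches
    `reference` under NCC, searching a +/- `window` pixel range."""
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def colorize(plate, window=15, crop=0.05):
    """Split a stacked glass-plate image (B, G, R thirds, top to bottom),
    align G and R to B on interior pixels, and return an H x W x 3 image."""
    h = plate.shape[0] // 3
    b, g, r = plate[:h], plate[h:2*h], plate[2*h:3*h]
    # Crop 5% from each side so borders do not dominate the alignment.
    cy, cx = int(h * crop), int(plate.shape[1] * crop)
    interior = np.s_[cy:h - cy, cx:plate.shape[1] - cx]
    gy, gx = align(g[interior], b[interior], window)
    ry, rx = align(r[interior], b[interior], window)
    # Stack as R, G, B planes to produce the final color image.
    return np.dstack([np.roll(r, (ry, rx), axis=(0, 1)),
                      np.roll(g, (gy, gx), axis=(0, 1)),
                      b])
```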
High Resolution
The process for high resolution images was nearly identical to the process for low resolution images, except for how different alignments were explored in step 4. For these images, the area that needed to be explored to find the optimal translation was very large, making an exhaustive search at full resolution computationally infeasible. To speed up the process, an image pyramid was used: the image is repeatedly rescaled to lower resolutions, the alignment is found first at the coarsest level, and that offset is then doubled and refined with a small search at each successively finer level.
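The pyramid search can be sketched as below. This is a hedged sketch, not the exact project code: the box-filter downscale, the 64-pixel recursion cutoff, and the +/-2 pixel refinement window are illustrative choices, and the single-scale NCC alignment from the low-resolution section is re-stated inline to keep the sketch self-contained.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equal-sized images."""
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

def align(channel, reference, window=15):
    """Exhaustive single-scale search, as in the low-resolution pipeline."""
    best = max((ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference), dy, dx)
               for dy in range(-window, window + 1)
               for dx in range(-window, window + 1))
    return best[1], best[2]

def downscale(img):
    """Halve resolution by averaging 2x2 blocks (a simple box filter)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_align(channel, reference, window=15, min_size=64):
    """Align at a coarser scale first, then double that offset and
    refine it with a small search at the current scale."""
    if min(channel.shape) <= min_size:
        return align(channel, reference, window)
    cy, cx = pyramid_align(downscale(channel), downscale(reference), window, min_size)
    dy, dx = 2 * cy, 2 * cx
    shifted = np.roll(channel, (dy, dx), axis=(0, 1))
    ry, rx = align(shifted, reference, window=2)  # small +/-2 pixel refinement
    return dy + ry, dx + rx
```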
All offsets below are in the format (horizontal offset, vertical offset).

Green Offset: (2, 5)
Red Offset: (3, 12)

Green Offset: (2, -3)
Red Offset: (2, 3)

Green Offset: (3, 3)
Red Offset: (3, 6)

Green Offset: (23, 49)
Red Offset: (40, 107)

Green Offset: (16, 38)
Red Offset: (22, 88)

Green Offset: (18, 60)
Red Offset: (11, 117)

Green Offset: (10, 56)
Red Offset: (13, 120)

Green Offset: (9, 79)
Red Offset: (14, 175)

Green Offset: (24, 52)
Red Offset: (34, 107)

Green Offset: (29, 77)
Red Offset: (37, 175)

Green Offset: (12, 56)
Red Offset: (8, 111)

Green Offset: (2, 48)
Red Offset: (28, 84)

Green Offset: (-2, 50)
Red Offset: (-12, 105)

Green Offset: (4, 36)
Red Offset: (4, 95)
Self Selected
Image 1 (jpg)

Green Offset: (0, 1)
Red Offset: (1, 4)
Self Selected
Image 2 (jpg)

Green Offset: (1, 3)
Red Offset: (1, 6)
Self Selected
Image 3 (jpg)

Green Offset: (1, 3)
Red Offset: (4, 8)
Bells and Whistles
I implemented one bell/whistle: using edges as features rather than the raw RGB values. This was done by running the RGB channel images through a Canny edge detector, using the scikit-image Canny implementation. This was particularly useful because the images were then aligned purely on shapes rather than on how bright or dull the shapes were in each of the color channels. Here is a side-by-side comparison of aligning based on raw RGB values versus edges.
No Edge Detection
With Edge Detection
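To illustrate the idea of edge featurization, here is a Sobel gradient-magnitude edge map in plain NumPy. This is a simplified stand-in for the Canny detector actually used in the project (Canny additionally applies Gaussian smoothing, non-maximum suppression, and hysteresis thresholding); each channel would be passed through a detector like this before the NCC alignment in step 4.

```python
import numpy as np

def sobel_edges(img):
    """Gradient-magnitude edge map via 3x3 Sobel filters -- a simplified
    stand-in for the Canny detector used in the project."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T

    def correlate(im, k):
        # Direct 3x3 correlation with edge-replicated padding.
        out = np.zeros_like(im, dtype=float)
        p = np.pad(im, 1, mode='edge')
        for i in range(3):
            for j in range(3):
                out += k[i, j] * p[i:i + im.shape[0], j:j + im.shape[1]]
        return out

    gx = correlate(img, kx)  # horizontal intensity change
    gy = correlate(img, ky)  # vertical intensity change
    return np.hypot(gx, gy)
```

Because the edge maps discard absolute brightness, two channels that record the same scene at different exposures still produce similar feature images, which is exactly why alignment on edges is more robust than alignment on raw intensities.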
The algorithm worked quite well, aligning all the images without major visual defects. However, for images with a lot of detail, especially human faces as in harvesters.tif, some of the detail in the faces is a little blurry/defected. This is because those photos contain many edges in close proximity, which can throw off the metric and confuse the algorithm.
Also, because I cropped a constant amount from each photo rather than intelligently detecting where the border was, actual content got cropped off in some photos, while in others some of the border still showed. However, the effect of this was minimal.