CS 194-26: Project 1
In this project, I developed an algorithm that takes digitized Prokudin-Gorskii glass plate images, which are separated into RGB image channels, and automatically produces color images from them using image processing techniques.
I used the following process to convert low resolution glass plate images to a single color image:
- Separate the image into its three color channels by splitting it into thirds vertically
- Crop a pre-specified number of pixels from each side of the color channels, so that alignment only considers internal pixels.
This ended up being 5% of the pixels on each side, a value chosen after experimentation. Cropping was implemented because
the borders of the images were confusing the algorithm into aligning on the borders rather than the internal content.
- As an additional bell and whistle, run a Canny edge detector on each of the three channel images to featurize them for alignment.
- Align the green and red images to the blue image. This is done by translating the green/red image over a window
of possible alignments (15 pixels vertically and horizontally) using circular shifting, and choosing the translation that maximizes the NCC (Normalized Cross-Correlation).
- Stack the aligned images of all the color channels to produce a color image
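The single-scale search described above can be sketched roughly as follows. This is a minimal illustration, not the exact submission code: the 5% crop fraction and 15-pixel window come from the writeup, while the function names and the small epsilon guarding against zero variance are my own choices.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two same-sized images."""
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return (a * b).mean()

def crop_border(im, frac=0.05):
    """Drop frac of the pixels from each side so scoring uses interior pixels only."""
    h, w = im.shape
    dh, dw = int(h * frac), int(w * frac)
    return im[dh:h - dh, dw:w - dw]

def align_single_scale(channel, reference, window=15):
    """Try every circular shift in a (2*window+1)^2 grid and keep the best NCC.

    Returns the offset as (horizontal, vertical), matching the writeup's format.
    """
    ref = crop_border(reference)
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # np.roll wraps pixels around; the wrapped band lands in the
            # cropped border, so it does not affect the score.
            shifted = crop_border(np.roll(channel, (dy, dx), axis=(0, 1)))
            score = ncc(shifted, ref)
            if score > best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift
```

Circular shifting via np.roll keeps the image the same size at every candidate offset, which is what makes a fixed interior crop sufficient to ignore the wrap-around artifacts.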
The process for high resolution images was nearly identical to the process for low resolution images, except for how different alignments were explored in step 4.
For these images, the area that had to be searched to find the optimal translation was very large. To speed up the search, an image pyramid was used. This approach rescales the image to lower resolutions
so that exploring a large space becomes computationally feasible. It was implemented as follows:
- The image was repeatedly halved in scale until sufficiently small
- Once the image was either less than 500 pixels tall or had been halved more than 5 times, a search window of (-17, 17) pixels was explored for the best alignment using the same process as for the low resolution images
- Then, working backwards through the recursion, the image at the previous (finer) scale was shifted using the result of the coarser alignment (doubled, since each level halves the resolution) and realigned using a smaller window (decreased by 8 pixels vertically and horizontally) to improve the estimate
- The backtracking continued until we reached the original resolution, at which point the final displacement was returned
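The coarse-to-fine recursion can be sketched as below. This is a simplified, self-contained illustration under stated assumptions: it keeps the 500-pixel base case, the 17-pixel coarse window, and the window shrinking by 8 from the writeup, but omits the 5-halvings cap for brevity, and the helper names are mine.

```python
import numpy as np
from skimage.transform import rescale

def _ncc(a, b):
    """Normalized cross-correlation of two same-sized images."""
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return (a * b).mean()

def _search(channel, reference, window):
    """Brute-force circular-shift search maximizing NCC on the cropped interior."""
    h, w = reference.shape
    dh, dw = int(0.05 * h), int(0.05 * w)
    ref = reference[dh:h - dh, dw:w - dw]
    best, shift = -np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            cand = np.roll(channel, (dy, dx), axis=(0, 1))[dh:h - dh, dw:w - dw]
            s = _ncc(cand, ref)
            if s > best:
                best, shift = s, (dx, dy)
    return shift  # (horizontal, vertical)

def align_pyramid(channel, reference, window=17):
    """Halve until small, align coarsely, then double the shift and refine."""
    if channel.shape[0] <= 500:
        return _search(channel, reference, window)
    # Recurse at half resolution to get a coarse estimate.
    dx, dy = align_pyramid(rescale(channel, 0.5, anti_aliasing=True),
                           rescale(reference, 0.5, anti_aliasing=True),
                           window)
    # Each level is half scale, so the coarse shift doubles at this level.
    dx, dy = 2 * dx, 2 * dy
    shifted = np.roll(channel, (dy, dx), axis=(0, 1))
    # Refine around the coarse estimate with a window shrunk by 8.
    rdx, rdy = _search(shifted, reference, max(window - 8, 2))
    return dx + rdx, dy + rdy
```

The key saving is that the expensive wide search happens only on the smallest image; every finer level only has to correct a small residual error.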
All offsets are in the following format: (horizontal offset, vertical offset)
Green Offset: (2, 5)
Red Offset: (3, 12)
Green Offset: (2, -3)
Red Offset: (2, 3)
Green Offset: (3, 3)
Red Offset: (3, 6)
Green Offset: (23, 49)
Red Offset: (40, 107)
Green Offset: (16, 38)
Red Offset: (22, 88)
Green Offset: (18, 60)
Red Offset: (11, 117)
Green Offset: (10, 56)
Red Offset: (13, 120)
Green Offset: (9, 79)
Red Offset: (14, 175)
Green Offset: (24, 52)
Red Offset: (34, 107)
Green Offset: (29, 77)
Red Offset: (37, 175)
Green Offset: (12, 56)
Red Offset: (8, 111)
Green Offset: (2, 48)
Red Offset: (28, 84)
Green Offset: (-2, 50)
Red Offset: (-12, 105)
Green Offset: (4, 36)
Red Offset: (4, 95)
Green Offset: (0, 1)
Red Offset: (1, 4)
Green Offset: (1, 3)
Red Offset: (1, 6)
Green Offset: (1, 3)
Red Offset: (4, 8)
Bells and Whistles
I implemented one bell/whistle: using edges as features rather than the raw pixel values. This was done by running the channel images through
a Canny edge detector, using the scikit-image (skimage) Canny implementation. This was particularly useful because the images were then aligned purely on the shapes rather than
on how bright or dull the shapes were in each of the color channels. Here is a side by side comparison of aligning based on raw values and on edges.
No Edge Detection
With Edge Detection
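The featurization step amounts to swapping each channel for its binary edge map before scoring alignments. A minimal sketch using scikit-image's canny (the sigma value here is a hypothetical choice; the writeup does not state the smoothing parameter actually used):

```python
import numpy as np
from skimage.feature import canny

def featurize(channel, sigma=2.0):
    """Replace raw intensities with a Canny edge map for alignment.

    canny expects a 2-D grayscale image and returns a boolean mask;
    casting to float lets the same NCC scoring code run unchanged.
    """
    return canny(channel, sigma=sigma).astype(float)
```

Because the edge maps are binary, the NCC score no longer depends on how bright a structure is in each channel, only on where its contours fall.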
The algorithm worked quite well, aligning all the images without major visual defects. However, for images with a lot of fine detail, especially human faces as in
harvesters.tif, some of the detail in the faces is a little blurry or has artifacts. This is because those photos contain many edges in close proximity, which can
throw off the metric and confuse the algorithm.
Also, because I cropped a constant amount from each photo rather than intelligently detecting where the border was, actual content got cropped off in some of the photos,
while in others some of the border still showed. However, the effect of this was minimal.