Project 1

Overview

This project automatically colorizes images from the Prokudin-Gorskii photo collection by aligning the separate color channel images into a single color image. Working in an era of black-and-white cameras, Prokudin-Gorskii had the foresight to take three separate photos of each scene through three different colored filters (red, green, and blue). Luckily, his RGB glass plate negatives, capturing the last years of the Russian Empire, survived and were purchased in 1948 by the Library of Congress. The LoC has recently digitized the negatives and made them available online.

Naive Implementation with SSD for alignment

My first attempt at aligning the images was an exhaustive grid search scored with the sum of squared differences (SSD) image matching metric. The naive alignment algorithm uses the blue channel as the reference channel. I first cropped the blue channel (removing 25% at each border) to discard the borders and leave a smaller section of the image for alignment. Using numpy's built-in np.roll, I then shifted the other channel (red or green) vertically and horizontally, cropped the shifted channel in the same way, and scored each displacement's alignment with SSD. I crop after shifting because np.roll performs a circular shift: shifting the image to the right wraps the pixels that were on the right edge around to the left edge. Cropping after the shift removes these misplaced pixels. Finally, the red and green channels were each shifted by the displacement with the lowest SSD and combined with the blue channel to create a single color image.
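The search described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function names and the 15-pixel search window are assumptions (the window size is not stated above), and the channels are assumed to be 2-D arrays of equal shape.

```python
import numpy as np

def crop_interior(img, frac=0.25):
    """Remove `frac` of the height and width from each border."""
    h, w = img.shape
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def ssd(a, b):
    """Sum of squared differences; lower means a better match."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def align_naive(channel, reference, max_shift=15, frac=0.25):
    """Try every (dy, dx) in a square window and return the best one.

    max_shift=15 is an assumed window size, not a value from the write-up.
    """
    ref_crop = crop_interior(reference, frac)
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # np.roll wraps pixels around the image edges ...
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            # ... so crop after shifting to discard the wrapped pixels.
            score = ssd(crop_interior(shifted, frac), ref_crop)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def colorize(r, g, b):
    """Align red and green to blue, then stack into an RGB image."""
    r_aligned = np.roll(r, align_naive(r, b), axis=(0, 1))
    g_aligned = np.roll(g, align_naive(g, b), axis=(0, 1))
    return np.dstack([r_aligned, g_aligned, b])
```

Cropping only removes every wrapped pixel while the shift is smaller than the cropped margin, so the search window should stay below 25% of the image size.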

However, this naive exhaustive search was impractical for the larger .tif images: it ran very slowly, especially because the larger images require larger displacements and therefore a larger search window. A multi-scale image pyramid optimization was needed.

Pyramid Implementation

An image pyramid allows a quicker and more efficient search for the proper displacement. The pyramid represents the image at multiple scales (progressively lower resolutions, produced with skimage.transform.rescale); at the coarsest level the image is scaled down to under 100 pixels in height. The pyramid implementation still uses the naive search at each step, but performs only a limited search at each level. The displacement vector found at one level is passed on to the next higher-resolution level, which searches only the neighboring displacements; in effect, the estimate of the displacement vector is refined at each higher-resolution level. Compared to the naive implementation, I was also able to use a smaller search window, because the displacement vectors found at the coarser levels have already shifted the channel close to the proper alignment.
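A coarse-to-fine search of this kind might look like the sketch below. The scale factor of 0.5 between levels, the search radius of 4, and the doubling of the coarse displacement when moving up a level are assumptions made for illustration; only the use of skimage.transform.rescale and the stopping height of 100 pixels come from the description above.

```python
import numpy as np
from skimage.transform import rescale

def crop_interior(img, frac=0.25):
    """Remove `frac` of the height and width from each border."""
    h, w = img.shape
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def search(channel, reference, center, radius, frac=0.25):
    """Best (dy, dx) within `radius` of `center`, scored with SSD."""
    channel = channel.astype(np.float64)
    ref_crop = crop_interior(reference.astype(np.float64), frac)
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = float(np.sum((crop_interior(shifted, frac) - ref_crop) ** 2))
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, reference, min_height=100, radius=4):
    """Estimate the shift at half resolution first, then refine it.

    The factor of 0.5 per level and radius=4 are assumed values.
    """
    if reference.shape[0] < min_height:
        # Coarsest level: search around zero displacement.
        return search(channel, reference, (0, 0), radius)
    coarse = align_pyramid(
        rescale(channel, 0.5, anti_aliasing=True),
        rescale(reference, 0.5, anti_aliasing=True),
        min_height, radius)
    # One pixel at half resolution is two pixels at this resolution.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(channel, reference, center, radius)
```

With a radius of 4 at every level, each level costs 81 SSD evaluations regardless of image size, while the reachable displacement roughly doubles with every level added.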

Bells and Whistles: Edge Operators

Next, I tried to use better features than raw RGB intensity similarity, settling on edge operators such as Sobel filters and Canny edge detection. skimage has a built-in Canny edge detector, so I applied it to the channels at each level of the pyramid and then scored the alignment of the edge-detected images with SSD. With the Canny transformation, emir.tif was still slightly misaligned, but much better than without it. emir.tif is hard to align with blue as the reference channel because the Emir is wearing a blue robe in the image. The robe's intensities therefore differ considerably across the three channels, which causes SSD matching on raw intensities to underperform.
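As a rough single-level illustration of edge-based scoring, the sketch below runs skimage.feature.canny on both channels and applies the same crop-and-SSD search to the resulting binary edge maps. The sigma value, the search window, and the function names are assumptions; in the pyramid version this would be applied at each level.

```python
import numpy as np
from skimage.feature import canny

def edge_map(channel, sigma=1.0):
    """Binary Canny edge map as floats (1.0 on edges, 0.0 elsewhere)."""
    return canny(channel, sigma=sigma).astype(np.float64)

def align_edges(channel, reference, max_shift=8, sigma=1.0, frac=0.25):
    """Exhaustive SSD search on edge maps instead of raw intensities.

    sigma=1.0 and max_shift=8 are assumed values for illustration.
    """
    edges_c = edge_map(channel, sigma)
    edges_r = edge_map(reference, sigma)
    h, w = edges_r.shape
    my, mx = int(h * frac), int(w * frac)
    ref_crop = edges_r[my:h - my, mx:w - mx]
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(edges_c, (dy, dx), axis=(0, 1))
            score = float(np.sum((shifted[my:h - my, mx:w - mx] - ref_crop) ** 2))
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Because the edge map records where intensity changes rather than how bright a region is, an object that is bright in one channel and dim in another can still produce matching edges.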

Bells and Whistles: Contrast Adjustment

Finally, I applied some contrast adjustment, namely Contrast Limited Adaptive Histogram Equalization (CLAHE). Histogram equalization enhances an image with low contrast by spreading out the most frequent intensity values so that the resulting image has a "roughly linear cumulative distribution function". CLAHE improves on simple histogram equalization by adjusting contrast with neighboring pixels in mind, so "local details can therefore be enhanced even in regions that are darker or lighter than most of the image". In the example images below, the images captioned "Contrast Adjusted, Adaptive Histogram Equalized" are the CLAHE versions of the preceding images.
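For reference, a contrast adjustment step of this kind can be sketched with skimage.exposure.equalize_adapthist, which implements CLAHE. Whether this exact function and the clip limit shown here were used in the project is an assumption; the wrapper name is hypothetical.

```python
import numpy as np
from skimage import exposure

def contrast_adjust(image, clip_limit=0.01):
    """Apply CLAHE to an image; the result is a float image in [0, 1].

    clip_limit=0.01 is the library default, assumed here; higher values
    give stronger contrast.
    """
    return exposure.equalize_adapthist(image, clip_limit=clip_limit)
```

Applied to the aligned color image, this would produce the "Contrast Adjusted, Adaptive Histogram Equalized" variant of each result.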