CS194-26: Project 1 - Joe Zou

Project Overview

The Prokudin-Gorskii collection is a set of glass-plate photographs. Each image consists of three separate color plates that can be combined into a single color image. In this project, I use an algorithm to align the three RGB channels for images from the Prokudin-Gorskii collection.

My Approach

I started this project by coding a simple single-scale algorithm that uses the raw RGB color channels as features. I tried both sum of squared differences and normalized cross-correlation as metrics for choosing the best offset, and settled on normalized cross-correlation because it produced better results. This simple algorithm worked well on the smaller jpg images, where I could search over a window that is large relative to the image size, but I had to implement a pyramid search for the algorithm to scale up to the larger images. This meant recursively calling my single-scale algorithm on different image sizes. I start with the smallest version of the image, with a size of approximately 16, and a search radius of 4 units in all directions. The image is then recursively scaled up by 2x, using the same 4 unit search radius at each level, until it is back at its original size. With this algorithm, I was able to get pretty good matches on all images but emir.tiff, with a runtime of under a minute. As stated in the project spec, emir.tiff is unusual in that the person's clothing is very colorful and has strong values in only a subset of the RGB channels. This means the raw channel intensities do not match well, which makes this image difficult to align on raw color features with the current algorithm. I was able to properly align emir.tiff by using edge detection features instead.
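
As an illustration, the single-scale search and its pyramid wrapper could look roughly like the sketch below. This is not the exact project code: the function names, the 2x2 block averaging used for downscaling, and the wrap-around shifting with np.roll are all assumptions made to keep the example short and self-contained.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align_single_scale(channel, reference, center=(0, 0), radius=4):
    """Exhaustively try every (dy, dx) within `radius` of `center`, keep the best NCC."""
    best_score, best_shift = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            # np.roll wraps around the border (a simplification for this sketch).
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(shifted, reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downscale_2x(img):
    """Halve the resolution by averaging 2x2 blocks (assumed rescaling method)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, reference, min_size=16, radius=4):
    """Coarse-to-fine search: solve at half resolution first, then refine."""
    if min(channel.shape) <= min_size:
        return align_single_scale(channel, reference, (0, 0), radius)
    coarse = align_pyramid(downscale_2x(channel), downscale_2x(reference),
                           min_size, radius)
    # An offset found at half resolution is worth twice as much at this level.
    center = (2 * coarse[0], 2 * coarse[1])
    return align_single_scale(channel, reference, center, radius)
```

Calling align_pyramid(green, blue) returns the (dy, dx) shift that best moves the green channel onto the blue one. Each level only searches a 9x9 window of candidates, so the cost stays small even for the large tif files.
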
Some additional notes on my implementation: 1) I cropped the image to 2/3 of its original size in order to remove the effect of the borders on the alignment algorithm. 2) When creating the final color image, I layered the three channels at their offset positions and kept only their intersection. This way, I didn't need to zero pad any of the channels, since the final result is simply the region that is covered by all 3 color channels.
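
The two notes above can be sketched as follows. This is a hedged illustration rather than the project code: crop_center and composite_intersection are hypothetical names, and it assumes that "2/3 of the original size" means keeping the central 2/3 in each dimension and that each shift is a (dy, dx) offset relative to a common reference channel.

```python
import numpy as np

def crop_center(img, keep=2/3):
    """Keep the central `keep` fraction of the image in each dimension."""
    h, w = img.shape[:2]
    dh = round(h * (1 - keep) / 2)
    dw = round(w * (1 - keep) / 2)
    return img[dh:h - dh, dw:w - dw]

def composite_intersection(channels, shifts):
    """Stack shifted channels, keeping only the area covered by all of them.

    channels: list of 2D arrays with identical shape, e.g. [r, g, b]
    shifts:   list of (dy, dx) per channel; the reference channel uses (0, 0)
    """
    h, w = channels[0].shape
    # After shifting, a channel covers rows [dy, h + dy) and cols [dx, w + dx).
    top = max(dy for dy, _ in shifts)
    bottom = min(h + dy for dy, _ in shifts)
    left = max(dx for _, dx in shifts)
    right = min(w + dx for _, dx in shifts)
    # Convert the shared region back into each channel's own coordinates.
    cropped = [c[top - dy:bottom - dy, left - dx:right - dx]
               for c, (dy, dx) in zip(channels, shifts)]
    return np.dstack(cropped)
```

The cropped channels would be used only for scoring offsets, while the composite is built from the full channels, so the output loses just the strips that are not shared by all three.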

Bells & Whistles

Better Features with Edges: In order to properly align emir.tiff, I used an edge detection algorithm from the cv2 library. Specifically, I used cv2.Canny(), which is a commonly used edge detector. Before running the color channels through Canny, I applied a Gaussian filter to the images to blur out some of the finer details so that the edge detector would pick up the more prominent edges. I employed one more trick to get the edge features to properly align emir.tiff, which was to apply a Gaussian filter after the edge detection as well. I did this because the shirt the man is wearing has a detailed pattern, and many of its designs were getting picked up by the edge detector. I wasn't able to find a threshold that excluded these designs, and they were adding a lot of noise to the alignment process, which made it difficult to properly align the images. After adding the Gaussian blur to the edge features, I was able to get a better alignment, as the details of the shirt were no longer contributing as much noise.

emir.tif result with raw color value features

edge features extracted from the raw color values

emir.tif result with edge features (much better)

Example Image Results

Additional Image Results

I selected a few additional images from the Prokudin-Gorskii collection to run my algorithm on. Here are the results:

Img1

Img2

Img3