Project 1: Colorizing the Prokudin-Gorskii photo collection

Naive Algorithm

A simple, brute-force approach to this alignment problem is defining a window of possible displacements, searching through this range, and returning the "best" alignment according to some metric. I used two metrics suggested by the project spec: SSD (sum of squared differences) and NCC (normalized cross-correlation coefficient). SSD measures the difference between two images, so we choose the displacement that minimizes this metric. NCC measures the similarity between two images with an output range between -1 and 1, -1 being very different and 1 being identical images, so we choose the displacement that maximizes this metric. With a window of [-15, 15], this naive algorithm produces the following results:

Cathedral SSD:
Green [1 -1], Red [7 -1]

Cathedral NCC:
Green [1 -1], Red [7 -1]

Monastery SSD:
Green [-6 0], Red [9 1]

Nativity SSD:
Green [3 1], Red [7 1]

Nativity NCC:
Green [3 1], Red [7 1]

Settlers SSD:
Green [7 0], Red [14 -1]

Tweaking the Monastery

The results were fair for most images. However, the Monstery image was not very well-aligned. A notable feature of this image is that it is predominantly blue, which could affect the alignment of red/green against blue. I tried instead aligning the blue and red images against the green image. The alignment was equally bad, if not worse. I then tried cropping the outer 10% margin of the image and re-aligning the red/green to the blue image. The result was significantly better.

Monastery NCC aligned with green:
Blue [6 0], Red [6 1]

Monastery NCC with 10% cropped:
Green [-3 2], Red [3 2]

Image Pyramid

The brute-force search algorithm works well for small images, but this search is expensive for larger high-resolution images. We can narrow the search space and apply the same search algorithm by implementing an image pyramid. An image pyramid consists of a pyramid of images that have been scaled down from the original image by some factor. If the image is 2^N by 2^N, the scaling factor is 0.5, and we want to reduce it to 1 x 1, then the pyramid has N levels. By creating such a pyramid, we narrow the search space such that we only apply the search algorithm only on N images. We first apply the algorithm to the smallest image and work our way down from the tip of the pyramid (smallest image) to the base (the largest/original image). The displacement vector found on the smallest image can be propagated to the lower level by first shifting this lower image by 2 * displacement, then applying the search algorithm. We repeat this until we reach the base of the pyramid and retrieve the final displacement vector. The following are results of this algorithm run on images with 10% of margins cropped off.

Results

Emir SSD:
Green [49 24], Red [95 -207]

Emir NCC:
Green [49 24], Red [95 -207]

Icon SSD:
Green [41 17], Red [89 23]

Icon NCC:
Green [41 17], Red [89 23]

Harvesters SSD:
Green [59 17], Red [123 13]

Harvesters NCC:
Green [59 17], Red [123 13]

Lady SSD:
Green [51 9], Red [112 12]

Lady NCC:
Green [51 9], Red [112 12]

Self portrait SSD:
Green [79 29], Red [176 37]

Self portrait NCC:
Green [79 29], Red [176 37]

Three Generations SSD:
Green [53 14], Red [111 11]

Three Generations NCC:
Green [53 14], Red [111 11]

Train SSD:
Green [42 5], Red [87 32]

Train NCC:
Green [42 5], Red [87 32]

Turkmen SSD:
Green [56 21], Red [116 28]

Turkmen NCC:
Green [56 21], Red [116 28]

Village SSD:
Green [64 12], Red [137 22]

Village NCC:
Green [64 12], Red [137 22]

Tweaking Emir
The Emir image was not at all aligned. Again, I tried aligning the blue and red images against the green image. The result improved significantly.

Aligned with blue
Green [49 24], Red [95 -207]

Aligned with green
Blue [-49 -24], Red [57 17]

Bells and Whistles

Automatic Contrasting

We can rescale pixel intensities such that the colors in the image are more contrasted. I tried using 1) contrast stretching and 2) histogram equalization. Contrast stretching rescales the image such that the intensities that fall in the tail ends of the image are omitted, while histogram equalization tries to spread out the most frequent intensity values.

I applied the contrast to each color channel image before alignment. The contrast made minimal changes to the alignment and the final color image.

Train: Before

Train: After contrast stretching

Train: After histogram equalization

Village: Before

Village: After contrast stretching

Village: After histogram equalization

Edges as features

Instead of using pixel intensities as the only feature, we could also use edges. To identify the edges in the image, I used a Sobel filter, which essentially inverts the image's colors such that the edges are displayed with white pixels and all else are dark. Alignment using edges was equally as good as alignment with pixel intensities.

Village: Using pixel intensities

Village: Sobel filter on red channel

Village: Using edges

Lady: Using pixel intensities

Lady: Sobel filter on red channel

Lady: Using edges

CS194-26 Project 1

Project Overview

Naive Algorithm

Image Pyramid

Extra data

Bells and Whistles

Automatic Contrasting

Edges as features