CS194-26: Image Manipulation and Computational Photography

Programming Project #1
Images of the Russian Empire: Colorizing the Prokudin-Gorskii photo collection

Kaiwen Zhou

Overview

The goal of this assignment was to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image with as few visual artifacts as possible. I was to take three color channel images, place them on top of each other, and automatically align them in order to create a single RGB image.

Approach

In my approach, I read in the three-strip image and divided it into three color channel images, one for each primary color (blue, green, and red). The green channel is aligned to the blue channel, and the red channel is then aligned to the resulting aligned green channel. I chose this order because the red channel is typically more intense/dark than the other channels and, as such, is often harder to align directly with the blue channel. This slightly improved the result obtained from emir.tif. I used an image pyramid to align the images: the widest displacement search occurred at the top of the pyramid, and each successively lower level used only a small displacement search range to fine-tune the result. The best alignment was the one with the lowest SSD between the two images. Once the channels were aligned, they were combined into the resulting color image. My algorithm takes roughly 2 minutes to run on all of the provided images.
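As a rough illustration of the first and last steps, the split and the final stacking could look like the sketch below. The function names split_plate and stack_rgb are hypothetical, and the sketch assumes the scan stacks the blue, green, and red exposures from top to bottom in three equal-height strips.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked scan into (blue, green, red) strips.

    Assumes three equal-height strips ordered blue, green, red from the top.
    """
    height = plate.shape[0] // 3           # leftover rows at the bottom are dropped
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

def stack_rgb(red, green, blue):
    """Stack three aligned single-channel images into one H x W x 3 RGB image."""
    return np.dstack([red, green, blue])

# Tiny stand-in for a scanned plate: 7 rows, 2 columns.
plate = np.arange(14).reshape(7, 2)
blue, green, red = split_plate(plate)
color = stack_rgb(red, green, blue)        # shape (2, 2, 3)
```

In the full pipeline, the green and red strips would be shifted by their computed displacements before being stacked.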

Image Pyramid

I constructed an image pyramid for each color channel image to align. I chose to base the number of levels on the size of the image using the formula

num_levels = log10(n)

where n is the length of a row in the image (its width in pixels). My pyramid function returns an array of num_levels images. Each successive level of the pyramid is scaled down by half relative to the level below it. My alignment algorithm starts at the highest level of the pyramid, the one with the lowest resolution, and searches within x and y displacements of [-20, 20] each. Performing the widest search at the highest level is the most cost-efficient, because each candidate displacement is scored on the fewest pixels. At each successive lower level, I take the previously obtained displacement, multiply it by two to account for the pyramid scaling, and search within a range of [-2, 2] around it. This improved run time significantly on the higher-resolution .tif images.
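The coarse-to-fine search described above can be sketched as follows. This is a minimal illustration under several assumptions: the helper names are hypothetical, halving is done by 2x2 block averaging (the resampling method is not stated here), log10(n) is rounded up to get a whole number of levels, and border cropping is left out to keep the sketch short.

```python
import numpy as np

def downscale_half(img):
    """Halve each dimension by averaging 2x2 blocks (assumed resampling method)."""
    h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
    img = img[:h, :w]                       # drop an odd trailing row/column
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def build_pyramid(img, num_levels):
    """Return a list of images: index 0 is full resolution, the last is coarsest."""
    levels = [img.astype(np.float64)]
    for _ in range(num_levels - 1):
        levels.append(downscale_half(levels[-1]))
    return levels

def search(moving, fixed, center, radius):
    """Try every (dx, dy) within radius of center; keep the lowest-SSD shift."""
    cx, cy = center
    best, best_score = center, np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - fixed) ** 2)
            if score < best_score:
                best, best_score = (dx, dy), score
    return best

def pyramid_align(moving, fixed, coarse_radius=20, fine_radius=2):
    """Return the (dx, dy) shift of moving that best matches fixed."""
    # Rounding log10(n) up is an assumption; n is the row length (width).
    num_levels = max(1, int(np.ceil(np.log10(moving.shape[1]))))
    pm = build_pyramid(moving, num_levels)
    pf = build_pyramid(fixed, num_levels)
    # Widest search at the coarsest level.
    dx, dy = search(pm[-1], pf[-1], (0, 0), coarse_radius)
    # Walk down the pyramid: double the estimate, then refine it locally.
    for level in range(num_levels - 2, -1, -1):
        dx, dy = search(pm[level], pf[level], (2 * dx, 2 * dy), fine_radius)
    return dx, dy

# Synthetic check: shift a random image by a known amount and recover it.
rng = np.random.default_rng(0)
fixed = rng.random((64, 64))
moving = np.roll(fixed, (-6, -4), axis=(0, 1))
dx, dy = pyramid_align(moving, fixed)       # (4, 6) undoes the shift
```

Doubling the coarse estimate before each refinement is what keeps the fine search windows small: each level only has to correct the error introduced by the halving above it.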

Border Cropping

I did not implement automatic cropping; instead, I cropped a preset 15% of the image on each side before comparing with SSD. The images had artifacts on the borders that were not part of the intended image and would have added noise to the comparison. In addition, shifting each image by a displacement vector introduced noise of its own, because my code used np.roll(), which wraps pixels shifted off one edge around to the opposite edge. Cropping a preset portion of the image before comparing was thus a practical choice to reduce noise and make my alignments more accurate.
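A minimal sketch of the fixed crop and the wrap-around shift is shown below. The names crop_borders and shift are hypothetical, and the convention that positive dx moves content right and positive dy moves it down is an assumption of this sketch.

```python
import numpy as np

def crop_borders(img, fraction=0.15):
    """Remove a fixed fraction of the image from each of the four sides."""
    h, w = img.shape[:2]
    dh, dw = int(h * fraction), int(w * fraction)
    return img[dh:h - dh, dw:w - dw]

def shift(img, dx, dy):
    """Shift with np.roll: pixels pushed off one edge reappear on the opposite edge."""
    return np.roll(img, (dy, dx), axis=(0, 1))

img = np.arange(100).reshape(10, 10)
inner = crop_borders(img)                  # 10 x 10 becomes 8 x 8
wrapped = shift(img, dx=1, dy=0)           # last column wraps to column 0
```

Because the wrapped pixels land near the borders, scoring only the cropped interior keeps them out of the comparison as long as the shift is smaller than the cropped margin.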

SSD

The formula I used to compare the alignment of images was SSD (Sum of Squared Differences), favoring alignments with lower SSDs. Specifically, my formula was defined as

∑ (image1 - image2)^2
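For reference, the sum runs over every pixel position in the two (cropped) images. A small sketch of the metric follows; the function name ssd is hypothetical, and casting to float is an added precaution, since subtracting unsigned 8-bit arrays directly in NumPy wraps around instead of going negative.

```python
import numpy as np

def ssd(image1, image2):
    """Sum of squared per-pixel differences; lower means a closer match."""
    a = np.asarray(image1, dtype=np.float64)
    b = np.asarray(image2, dtype=np.float64)
    return float(np.sum((a - b) ** 2))

# Worked example: differences are 0, 1, 2, 3, so SSD = 0 + 1 + 4 + 9 = 14.
first = np.array([[1, 2], [3, 4]], dtype=np.uint8)
second = np.ones((2, 2), dtype=np.uint8)
score = ssd(first, second)                 # 14.0
```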

Results

cathedral.jpg
Blue - No Displacement
Green - X: 2, Y: 5
Red - X: 3, Y: 12
monastery.jpg
Blue - No Displacement
Green - X: 2, Y: -3
Red - X: 3, Y: 3
nativity.jpg
Blue - No Displacement
Green - X: 1, Y: 3
Red - X: 0, Y: 8
settlers.jpg
Blue - No Displacement
Green - X: 0, Y: 7
Red - X: -1, Y: 15
emir.tif
Blue - No Displacement
Green - X: 24, Y: 48
Red - X: 41, Y: 105
harvesters.tif
Blue - No Displacement
Green - X: 17, Y: 59
Red - X: 14, Y: 124
icon.tif
Blue - No Displacement
Green - X: 17, Y: 41
Red - X: 22, Y: 90
lady.tif
Blue - No Displacement
Green - X: 8, Y: 55
Red - X: 12, Y: 115
self_portrait.tif
Blue - No Displacement
Green - X: 29, Y: 78
Red - X: 37, Y: 174
three_generations.tif
Blue - No Displacement
Green - X: 14, Y: 52
Red - X: 11, Y: 111
train.tif
Blue - No Displacement
Green - X: 6, Y: 42
Red - X: 32, Y: 85
turkmen.tif
Blue - No Displacement
Green - X: 21, Y: 55
Red - X: 28, Y: 116
village.tif
Blue - No Displacement
Green - X: 12, Y: 64
Red - X: 22, Y: 134

Selected Examples

Black & White Channel Image
building.jpg
Blue - No Displacement
Green - X: 0, Y: 6
Red - X: 0, Y: 17
Black & White Channel Image
railroad.jpg
Blue - No Displacement
Green - X: 0, Y: 3
Red - X: 0, Y: 12
Black & White Channel Image
ship.jpg
Blue - No Displacement
Green - X: 4, Y: 4
Red - X: 8, Y: 13