CS 194-26 Project 1

Alvin Zhang

This project involves automatic alignment of RGB color channels from the Prokudin-Gorskii collection.

Approach

The image for each color channel was obtained by dividing the original plate into thirds. A Gaussian filter was applied to each of the channels before alignment.
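
As a rough illustration of this preprocessing step, the sketch below splits a stacked plate into three equal-height channels and blurs each with a separable Gaussian. It assumes the plate is a 2-D NumPy array with the B, G, and R exposures stacked top to bottom. The function names and the default `sigma=1.0` are placeholders, since the filter width is not stated above.

```python
import numpy as np

def split_plate(plate):
    """Split a stacked plate into (b, g, r), assuming B/G/R from top to bottom."""
    h = plate.shape[0] // 3          # any leftover rows at the bottom are dropped
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at about 3 sigma and normalized to sum to 1."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian blur; sigma is an assumed value, not the author's."""
    k = gaussian_kernel(sigma)
    smooth = lambda v: np.convolve(v, k, mode="same")   # zero-padded at the ends
    out = np.apply_along_axis(smooth, 0, img.astype(float))
    return np.apply_along_axis(smooth, 1, out)

if __name__ == "__main__":
    plate = np.random.default_rng(0).random((300, 120))
    b, g, r = (gaussian_blur(c) for c in split_plate(plate))
    print(b.shape, g.shape, r.shape)
```

Because `np.convolve(..., mode="same")` zero-pads, values near the image edge are darkened slightly; this sketch also expects each channel to be larger than the kernel.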

The number of scales for an image pyramid was then determined such that a translation of one pixel in the smallest level would represent at least a 5% shift in the image.
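
One possible reading of this rule, sketched below: each pyramid level halves the resolution, so one pixel at level n covers 2^n full-resolution pixels, and the level count is the smallest n for which that is at least 5% of the image size. The function name, the choice of which dimension to pass as `size`, and the rounding are assumptions; they are not specified above.

```python
def num_pyramid_levels(size, min_fraction=0.05):
    """Smallest n such that 2**n full-resolution pixels span >= min_fraction of size.

    `size` is an image dimension in pixels (which dimension to use is an assumption).
    """
    levels = 0
    while (2 ** levels) / size < min_fraction:
        levels += 1
    return levels

if __name__ == "__main__":
    for size in (100, 400, 1000):
        n = num_pyramid_levels(size)
        print(size, n, 2 ** n / size)
```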

Working from the smallest scale to the largest, I then tested progressively finer translations and rotations to align each of the R and G channels to the B channel. At each level, I tested shifts of +/-1 px (in that level's pixels) in the y- and x-directions and rotations of +/-0.05*2^zoom_level deg. For the .tif plates, this amounts to +/-128 px at full resolution and +/-6.4 deg at the smallest level.

At each level, the results from the previous level were saved, and applied to the images at the current level before the new, finer translations and rotations were tested.
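
The coarse-to-fine loop can be sketched as follows. This is a simplified, translation-only version: rotation is omitted, the score is NCC on raw intensities, the pyramid is built by 2x2 block averaging, and shifts are applied with `np.roll` (which wraps around the image). All of these are stand-ins chosen to keep the example short, and the function names are hypothetical.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align(moving, fixed, levels):
    """Estimate the (dy, dx) shift that aligns `moving` to `fixed`, coarse to fine."""
    pyramid = [(moving.astype(float), fixed.astype(float))]
    for _ in range(levels):
        m, f = pyramid[-1]
        pyramid.append((downsample(m), downsample(f)))
    dy, dx = 0, 0
    for m, f in reversed(pyramid):           # smallest level first
        dy, dx = 2 * dy, 2 * dx              # carry the previous estimate up one level
        candidates = [(dy + i, dx + j) for i in (-1, 0, 1) for j in (-1, 0, 1)]
        dy, dx = max(candidates, key=lambda s: ncc(np.roll(m, s, axis=(0, 1)), f))
    return dy, dx

if __name__ == "__main__":
    img = np.random.default_rng(0).random((32, 32))
    shifted = np.roll(img, (-4, 4), axis=(0, 1))
    print(align(shifted, img, levels=2))
```

Testing only +/-1 px per level keeps the search to nine candidates per level, while the doubling step lets the coarse levels account for large full-resolution displacements.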

Each candidate translation and rotation was scored by applying normalized cross-correlation (NCC) to the gradient magnitudes of the scaled, shifted, and rotated channels. The gradient magnitude was calculated as G = sqrt(gx^2 + gy^2), where gx and gy were obtained by applying the horizontal and vertical Sobel filters, respectively. The score was evaluated only on the central 0.8-scale crop, which excludes the image borders.
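
A minimal NumPy sketch of this score is given below. The Sobel filters are written out with array slicing (so the output is 2 px smaller in each dimension), and the 0.8-scale crop is interpreted as keeping the central 80% of each dimension; both the interpretation and the function names are assumptions.

```python
import numpy as np

def sobel_magnitude(img):
    """G = sqrt(gx^2 + gy^2) from 3x3 Sobel filters, interior pixels only."""
    p = img.astype(float)
    # gx: right column minus left column, weighted 1-2-1 down the rows
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    # gy: bottom row minus top row, weighted 1-2-1 across the columns
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.sqrt(gx ** 2 + gy ** 2)

def central_crop(img, scale=0.8):
    """Keep the central `scale` fraction of each dimension."""
    h, w = img.shape
    dh = int(round(h * (1 - scale) / 2))
    dw = int(round(w * (1 - scale) / 2))
    return img[dh:h - dh, dw:w - dw]

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def gradient_ncc_score(a, b, scale=0.8):
    """NCC between the cropped gradient magnitudes of two channels."""
    return ncc(central_crop(sobel_magnitude(a), scale),
               central_crop(sobel_magnitude(b), scale))

if __name__ == "__main__":
    a = np.random.default_rng(0).random((40, 40))
    print(gradient_ncc_score(a, 2 * a + 5))   # brightness/contrast change only
```

In this sketch, a constant offset or uniform scaling of the intensities does not change the score, since the offset cancels in the Sobel differences and NCC normalizes the scale.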

Once the best translation and rotation were determined, the resulting image was converted to greyscale. The top border was then found in three steps: finding points in the top-10% crop where the gradient was vertically aligned (theta = arctan2(gy, gx) within pi/8 of +/-pi/2), keeping the top 40% of these points by gradient magnitude G, and outputting the highest y-value such that at least 33% of the points in the corresponding row met both criteria. A corresponding process was repeated for the left, bottom, and right borders.
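
The top-border rule can be sketched as below, under several assumptions: "highest y-value" is read as the largest row index, the 33% threshold is taken relative to the full row width, the image is padded by edge replication before the Sobel filters, and the function and parameter names are hypothetical. It is an illustration of the criteria described above, not the code used for the results.

```python
import numpy as np

def sobel(img):
    """Return (gx, gy) from 3x3 Sobel filters, same shape as img (edge-padded)."""
    p = np.pad(img.astype(float), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return gx, gy

def top_border(gray, band=0.10, angle_tol=np.pi / 8, keep=0.40, row_frac=0.33):
    """Largest row index in the top band that looks like a horizontal border edge."""
    gx, gy = sobel(gray)
    rows_in_band = int(round(gray.shape[0] * band))
    gx, gy = gx[:rows_in_band], gy[:rows_in_band]
    mag = np.hypot(gx, gy)
    theta = np.arctan2(gy, gx)
    # gradient points up or down: theta within angle_tol of +/- pi/2
    vertical = (np.abs(np.abs(theta) - np.pi / 2) <= angle_tol) & (mag > 0)
    if not vertical.any():
        return 0
    # keep only the strongest `keep` fraction of the vertically aligned points
    threshold = np.quantile(mag[vertical], 1 - keep)
    strong = vertical & (mag >= threshold)
    rows = np.nonzero(strong.mean(axis=1) >= row_frac)[0]
    return int(rows.max()) if rows.size else 0

if __name__ == "__main__":
    img = np.ones((100, 50))
    img[:6] = 0.0                      # synthetic dark border covering rows 0..5
    print(top_border(img))
```

The other three borders could reuse the same function on flipped or transposed copies of the image, for example `top_border(gray[::-1])` for the bottom edge.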

Cherry-Picked Ablations

Gradients for scoring alignments
-grad

+grad

Rotations for alignment
-rot

+rot

Cropping borders
-border

+border
Note that the border detection works well for the top, left, and right edges but is too aggressive for the bottom edge. Parameter tuning might fix this.

Results

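Offsets are given in the form [y_disp x_disp] rot, with displacements in pixels and rotations in degrees. G2B and R2B denote aligning the G and R channels to the B channel.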

Cathedral
Cathedral, G2B: [5 2] -0.15, R2B: [12 3] 0.45

Emir
Emir, G2B: [49 24] -0.1, R2B: [106 40] -0.1

Harvesters
Harvesters, G2B: [60 18] -0.25, R2B: [124 14] -0.3

Icon
Icon, G2B: [43 18] -0.15, R2B: [91 23] -0.1

Lady
Lady, G2B: [56 10] -0.05, R2B: [119 14] -0.1

Monastery
Monastery, G2B: [-3 1] 0.85, R2B: [3 3] -0.15

Nativity
Nativity, G2B: [3 1] 0.05, R2B: [8 0] -0.4

Self-Portrait
Self-Portrait, G2B: [78 30] -0.2, R2B: [176 37] -0.25

Settlers
Settlers, G2B: [7 0] -0.45, R2B: [15 -1] -0.45

Three Generations
Three Generations, G2B: [53 14] -0.15, R2B: [111 11] -0.2

Train
Train, G2B: [46 9] -0.35, R2B: [89 34] -0.25

Turkmen
Turkmen, G2B: [57 22] -0.1, R2B: [117 29] -0.2

Village
Village, G2B: [65 12] -0.05, R2B: [137 22] 0.0