CS 194-26 Spring 2020

Project 1: Colorizing the Prokudin-Gorskii photo collection

Alex Shiyu Liu, cs194-26-afw

Overview

Sergei Mikhailovich Prokudin-Gorskii traveled across the vast Russian Empire to take color photographs of everything he saw. He recorded three exposures of every scene onto glass plates, using red, green, and blue filters. Using image processing techniques, a color image can be produced by extracting the three color channel images, placing them on top of each other, and aligning them so that they form a single RGB image.

Approach

Single-Scale Alignment

I processed the data from the glass plates by splitting each .jpg image into thirds, one per color channel. Then, I ran an exhaustive search over pixel displacements in the interval [-15, 15] for both the x and y dimensions. For each candidate displacement, I computed the L2 loss, i.e. the sum of squared element-wise differences, between the displaced red or green image and the blue base image. I computed the loss on only the middle 60% of the images so that the border values, where the images wrapped around from displacement, would not contribute to it.

$$ L2(img_1, img_2) = \sum_i \sum_j \left( img_1(i,j) - img_2(i,j) \right)^2 $$

The (x, y) pixel displacement with the minimum L2 loss was the displacement I used to align each of the red and green channels onto the blue base channel.
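The single-scale search described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function names (`split_plate`, `l2_loss`, `align_single_scale`) are hypothetical, the blue/green/red top-to-bottom plate order is an assumption, and `np.roll` is assumed as the wrap-around displacement.

```python
import numpy as np


def split_plate(plate):
    """Split a vertically stacked plate into thirds.

    Assumes the channels are stacked blue, green, red from top to bottom.
    """
    h = plate.shape[0] // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]


def l2_loss(a, b, keep=0.6):
    """Sum of squared differences over the central `keep` fraction of each axis."""
    h, w = a.shape
    my, mx = int(h * (1 - keep) / 2), int(w * (1 - keep) / 2)
    diff = a[my:h - my, mx:w - mx].astype(np.float64) - b[my:h - my, mx:w - mx]
    return float(np.sum(diff ** 2))


def align_single_scale(channel, base, radius=15):
    """Try every (x, y) shift in [-radius, radius]; return the lowest-loss one."""
    best, best_loss = (0, 0), np.inf
    for x in range(-radius, radius + 1):
        for y in range(-radius, radius + 1):
            # np.roll wraps pixels around the border; the central crop in
            # l2_loss keeps most of the wrapped region out of the score.
            shifted = np.roll(channel, (y, x), axis=(0, 1))
            loss = l2_loss(shifted, base)
            if loss < best_loss:
                best, best_loss = (x, y), loss
    return best
```

Under these assumptions, `align_single_scale(green, blue)` would return an (x, y) pair in the same form as the offsets reported below, and `np.roll(green, (y, x), axis=(0, 1))` would apply it before stacking the channels.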

cathedral

out_fname_1 — Green: [2, 5], Red: [3, 12]

monastery

out_fname_2 — Green: [2, -3], Red: [2, 3]

tobolsk

out_fname_3 — Green: [2, 5], Red: [3, 12]

Image Pyramid Alignment

I used an image pyramid technique to reduce the runtime of computing alignments on the larger .tif files. I iterated over an ordered set of downscaling factors, [64, 32, 16, 8, 4, 2, 1]. Starting from the highest power of two, i.e. the largest reduction, I applied the single-scale alignment approach from above over a smaller range of pixel displacements, [-5, 5], to find the optimal (x, y) pixel displacement at that resolution. Then, I scaled this displacement by a factor of two to correspond with the next higher-resolution image. At every step in the iteration over resolutions, the alignment from the previous step narrows the range of displacements that must be searched, until the optimal displacement is computed for the original image.

Note that the emir image alignment was unsuccessful. This is largely due to the subject's primarily blue clothing, which dominates the center of the image. This distinctive coloring is hard to match against the red channel when aligning using raw pixel values and L2 loss. Additionally, the brightness of each channel varies significantly, which is likely not solvable with this specific implementation of alignment.
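The coarse-to-fine search in this section can be sketched as below. It is an illustration under stated assumptions, not the project's actual implementation: the helper names (`downscale`, `search`, `align_pyramid`) are hypothetical, block averaging is assumed as the downscaling method since the write-up does not say how the images were reduced, and `np.roll` is assumed as the wrap-around displacement.

```python
import numpy as np


def l2_loss(a, b, keep=0.6):
    """Sum of squared differences over the central `keep` fraction of each axis."""
    h, w = a.shape
    my, mx = int(h * (1 - keep) / 2), int(w * (1 - keep) / 2)
    diff = a[my:h - my, mx:w - mx] - b[my:h - my, mx:w - mx]
    return float(np.sum(diff ** 2))


def downscale(img, factor):
    """Shrink by an integer factor via block averaging (an assumed method)."""
    h = (img.shape[0] // factor) * factor
    w = (img.shape[1] // factor) * factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))


def search(channel, base, center, radius):
    """Exhaustive search over shifts within `radius` of `center`; returns (x, y)."""
    cx, cy = center
    best, best_loss = center, np.inf
    for x in range(cx - radius, cx + radius + 1):
        for y in range(cy - radius, cy + radius + 1):
            loss = l2_loss(np.roll(channel, (y, x), axis=(0, 1)), base)
            if loss < best_loss:
                best, best_loss = (x, y), loss
    return best


def align_pyramid(channel, base, factors=(64, 32, 16, 8, 4, 2, 1), radius=5):
    """Refine the (x, y) shift from the coarsest level to full resolution."""
    x, y, prev = 0, 0, None
    for f in factors:  # largest reduction first
        if prev is not None:
            step = prev // f  # equals 2 for consecutive powers of two
            x, y = x * step, y * step  # carry the estimate to the finer level
        x, y = search(downscale(channel, f), downscale(base, f), (x, y), radius)
        prev = f
    return x, y
```

With a radius of 5 and these seven factors, each level can move the estimate by up to 5 pixels at its own resolution, so the full-resolution displacement can reach at most 5 × (64 + 32 + 16 + 8 + 4 + 2 + 1) = 635 pixels in each direction, while every individual search evaluates only 11 × 11 = 121 candidates.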

emir

out_fname_4 — Green: [24, 96], Red: [-210, 476]

harvesters

out_fname_5 — Green: [17, 118], Red: [14, 246]

icon

out_fname_6 — Green: [17, 82], Red: [23, 178]

lady

out_fname_7 — Green: [8, 110], Red: [12, 228]

melons

out_fname_8 — Green: [10, 162], Red: [14, 356]

onion_church

out_fname_9 — Green: [27, 102], Red: [37, 216]

self_portrait

out_fname_10 — Green: [29, 156], Red: [37, 350]

three_generations

out_fname_11 — Green: [14, 104], Red: [12, 222]

train

out_fname_12 — Green: [6, 84], Red: [32, 172]

village

out_fname_13 — Green: [12, 128], Red: [22, 274]

workshop

out_fname_14 — Green: [-1, 106], Red: [-12, 212]

Additional Examples

00504a

out_fname_15 — Green: [10, 94], Red: [-5, 206]

00869a

out_fname_16 — Green: [18, 4], Red: [20, -24]

01602a

out_fname_17 — Green: [2, 136], Red: [-6, 290]

01620a

out_fname_18 — Green: [24, 94], Red: [39, 220]