CS 194-26 Spring 2020

Project 1: Colorizing the Prokudin-Gorskii photo collection

Alex Shiyu Liu, cs194-26-afw

Overview

Sergei Mikhailovich Prokudin-Gorskii traveled across the vast Russian Empire to take color photographs of everything he saw. He recorded three exposures of every scene onto glass plates, using red, green, and blue filters. Using image processing techniques, color images can be produced by extracting the three color channel images, placing them on top of each other, and aligning them so that they form a single RGB color image.
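As a rough illustration of the extraction and stacking steps, here is a minimal NumPy sketch. The function names `split_plate` and `stack_rgb` are hypothetical. The top-to-bottom channel order (blue, green, red) is an assumption about the scanned plates and is not stated in this write-up.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate scan into three equal-height channels."""
    height = plate.shape[0] // 3  # any leftover rows at the bottom are dropped
    # Assumed top-to-bottom order on the plate: blue, green, red.
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

def stack_rgb(red, green, blue):
    """Stack three (already aligned) channels into one H x W x 3 RGB image."""
    return np.dstack([red, green, blue])

# Tiny synthetic "plate": 9 rows, so each channel gets 3 rows.
plate = np.arange(9 * 4, dtype=np.float64).reshape(9, 4)
blue, green, red = split_plate(plate)
rgb = stack_rgb(red, green, blue)
print(rgb.shape)  # (3, 4, 3)
```

In a real pipeline the red and green channels would be shifted into alignment with the blue channel before stacking.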

Approach

Single-Scale Alignment

I processed the data from the glass plates by splitting each .jpg image into thirds, one per color channel. Then I ran an exhaustive search over an interval of pixel displacements, [-15, 15], in both the x and y dimensions. For each candidate displacement, I computed the L2 loss (the sum of squared element-wise differences) between the displaced red or green image and the blue base image. I computed the loss on only the middle 60% of the images so that the values on the borders, where the images wrap around after displacement, would not contribute to the loss.

$$ L2(img_1, img_2) = \sum_i \sum_j (img_{1,i,j} - img_{2,i,j})^2 $$

The (x, y) pixel displacement with the minimum L2 loss was the displacement I used to align each of the red and green channels onto the blue base channel.
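The search described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function names are hypothetical, and using `np.roll` for the wrap-around displacement is an assumption based on the mention of images wrapping at the borders.

```python
import numpy as np

def crop_center(img, keep=0.6):
    """Keep the central `keep` fraction of the image in each dimension."""
    h, w = img.shape
    dh = int(h * (1 - keep) / 2)
    dw = int(w * (1 - keep) / 2)
    return img[dh:h - dh, dw:w - dw]

def l2_loss(a, b):
    """Sum of squared element-wise differences."""
    return np.sum((a - b) ** 2)

def align_single_scale(channel, base, radius=15):
    """Exhaustively search shifts in [-radius, radius]; return the best (x, y)."""
    # Work in floats so that differences of uint8 pixels cannot overflow.
    channel = channel.astype(np.float64)
    base_crop = crop_center(base.astype(np.float64))
    best_shift, best_loss = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around the border (assumed shift method).
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            loss = l2_loss(crop_center(shifted), base_crop)
            if loss < best_loss:
                best_loss, best_shift = loss, (dx, dy)
    return best_shift

# Synthetic check: displace a random image, then recover the inverse shift.
rng = np.random.default_rng(0)
base = rng.random((40, 40))
channel = np.roll(base, (-3, 2), axis=(0, 1))  # moved up 3 rows, right 2 columns
print(align_single_scale(channel, base, radius=5))  # (-2, 3)
```

The returned pair is ordered (x, y), matching the displacements reported below, so the row shift is the second element.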

cathedral
out_fname_1
Green: [2, 5], Red: [3, 12]
monastery
out_fname_2
Green: [2, -3], Red: [2, 3]
tobolsk
out_fname_3
Green: [2, 5], Red: [3, 12]

Image Pyramid Alignment

I used an image pyramid to reduce the runtime of computing alignments on the larger .tif files. I iterated over an ordered set of downscaling factors, [64, 32, 16, 8, 4, 2, 1]. Starting from the highest power of two, i.e. the largest reduction, I applied the single-scale alignment approach from above over a smaller range of pixel displacements, [-5, 5], to find the optimal (x, y) pixel displacement at that resolution. Then I scaled this displacement by a factor of two to correspond with the next higher-resolution image. At every step in the iteration over resolutions, the alignment found in the previous, coarser step narrows the range of displacements that must be searched, until the optimal displacement is computed for the original image.

Note that the emir image alignment was unsuccessful. This is largely due to the subject's primarily blue clothing, which dominates the center of the image. This strong coloring is hard to match against the red channel when aligning on raw pixel values with the L2 loss. Additionally, the brightness of each channel varies significantly, which this specific implementation of alignment likely cannot handle.
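The coarse-to-fine procedure can be sketched as below. This is an illustrative sketch under stated assumptions, not the project's implementation: the function names are hypothetical, and downscaling by block averaging is an assumption, since the write-up does not say how the resolution was reduced.

```python
import numpy as np

def downscale(img, factor):
    """Shrink by an integer factor by averaging factor x factor blocks (assumed method)."""
    if factor == 1:
        return img
    h = img.shape[0] // factor * factor
    w = img.shape[1] // factor * factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def crop_center(img, keep=0.6):
    """Keep the central `keep` fraction of the image in each dimension."""
    h, w = img.shape
    dh = int(h * (1 - keep) / 2)
    dw = int(w * (1 - keep) / 2)
    return img[dh:h - dh, dw:w - dw]

def search(channel, base, center, radius):
    """Exhaustive L2 search in a window around center=(x, y); return best (x, y)."""
    cx, cy = center
    base_crop = crop_center(base)
    best, best_loss = center, np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            loss = np.sum((crop_center(shifted) - base_crop) ** 2)
            if loss < best_loss:
                best_loss, best = loss, (dx, dy)
    return best

def align_pyramid(channel, base, factors=(64, 32, 16, 8, 4, 2, 1), radius=5):
    """Refine the (x, y) shift from the coarsest factor down to full resolution."""
    channel = channel.astype(np.float64)
    base = base.astype(np.float64)
    shift, prev = (0, 0), None
    for factor in factors:
        if prev is not None:
            scale = prev // factor  # 2 when successive factors halve
            shift = (shift[0] * scale, shift[1] * scale)
        shift = search(downscale(channel, factor), downscale(base, factor),
                       shift, radius)
        prev = factor
    return shift

# Synthetic check on a small image with a shortened pyramid.
rng = np.random.default_rng(0)
base = rng.random((64, 64))
channel = np.roll(base, (-8, 12), axis=(0, 1))  # moved up 8 rows, right 12 columns
print(align_pyramid(channel, base, factors=(4, 2, 1)))  # (-12, 8)
```

With a window of [-5, 5] at each of seven levels, the search covers far fewer candidates than a single full-resolution search wide enough to reach the same displacements, which is where the runtime savings come from.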

emir
out_fname_4
Green: [24, 96], Red: [-210, 476]
harvesters
out_fname_5
Green: [17, 118], Red: [14, 246]
icon
out_fname_6
Green: [17, 82], Red: [23, 178]
lady
out_fname_7
Green: [8, 110], Red: [12, 228]
melons
out_fname_8
Green: [10, 162], Red: [14, 356]
onion_church
out_fname_9
Green: [27, 102], Red: [37, 216]
self_portrait
out_fname_10
Green: [29, 156], Red: [37, 350]
three_generations
out_fname_11
Green: [14, 104], Red: [12, 222]
train
out_fname_12
Green: [6, 84], Red: [32, 172]
village
out_fname_13
Green: [12, 128], Red: [22, 274]
workshop
out_fname_14
Green: [-1, 106], Red: [-12, 212]

Additional Examples

00504a
out_fname_15
Green: [10, 94], Red: [-5, 206]
00869a
out_fname_16
Green: [18, 4], Red: [20, -24]
01602a
out_fname_17
Green: [2, 136], Red: [-6, 290]
01620a
out_fname_18
Green: [24, 94], Red: [39, 220]