Sam Zhou
CS194-26: Image Manipulation and Computational Photography

Images of the Russian Empire: Colorizing the Prokudin-Gorskii Photo Collection

Project Overview

This project takes sets of three glass plate images, corresponding to the red, green, and blue color channels, and reconstructs a full-color image by layering them together. The challenge lies in finding the correct alignment of the three channels and doing so in a reasonable amount of time.

Algorithm

To determine the best alignment, we search over a range of horizontal and vertical shifts and score each shift with a metric. The metric we use is the Sum of Squared Differences, SSD(A, B) = sum_elements((A - B) * (A - B)), where A and B are matrices and the product is element-wise. It is computed between the anchored channel and the shifted channel, and a lower value means a better match. In this program, we anchored the blue channel and searched for appropriate shifts for the red and green channels.
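As a minimal sketch of the metric, assuming the channels are held as NumPy arrays of equal shape (the function name `ssd` is chosen here for illustration):

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally shaped arrays."""
    # Convert to float so that integer image types cannot overflow or wrap.
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[1.0, 0.0], [3.0, 1.0]])
print(ssd(a, b))  # (2 - 0)^2 + (4 - 1)^2 = 13.0
```

Identical arrays give an SSD of 0, so the search looks for the shift with the smallest value.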

First, we crop the image, since the borders are not consistent across the color channels and what we want to align is the actual image content, in particular the features found in the center.
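The cropping step might look like the sketch below. The text does not say how much border was removed, so the 10% margin per side (`frac=0.1`) and the helper name `crop_center` are assumptions made for illustration.

```python
import numpy as np

def crop_center(channel, frac=0.1):
    """Drop a fraction `frac` of the height and width from each side."""
    h, w = channel.shape
    dh, dw = int(h * frac), int(w * frac)  # assumed margin, not from the write-up
    return channel[dh:h - dh, dw:w - dw]

plate = np.zeros((100, 80))
print(crop_center(plate).shape)  # (80, 64)
```

Applying the same crop to every channel keeps them the same shape, so they can still be compared element by element.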

Then, we loop over horizontal and vertical shifts in the range [-15, 15]. For each shift, we calculate the SSD of the shifted plate with the blue plate and compare it to the smallest SSD found so far. By the end, we have the set of shifts which produced the smallest SSD. With this, we can shift the red and green plates and then stack the three together to form our final image.
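The exhaustive search described above can be sketched as follows. Two assumptions here are not stated in the write-up: shifts are applied with `np.roll` (which wraps pixels around the edges), and a shift is written as (dy, dx), meaning rows first and columns second. The demo uses a synthetic random image and a smaller radius so that it runs instantly.

```python
import numpy as np

def best_shift(channel, anchor, radius=15):
    """Return the (dy, dx) shift of `channel` that minimizes SSD against `anchor`."""
    best, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps around the edges; this is an assumed shift method.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - anchor) ** 2)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best

rng = np.random.default_rng(0)
blue = rng.random((64, 64))                 # synthetic anchor channel
red = np.roll(blue, (-3, 2), axis=(0, 1))   # same content, displaced
dy, dx = best_shift(red, blue, radius=5)
aligned_red = np.roll(red, (dy, dx), axis=(0, 1))
rgb = np.dstack([aligned_red, blue, blue])  # blue stands in for green in this toy demo
print((dy, dx))  # (3, -2)
```

The cost grows with the square of the search radius times the number of pixels, which is why this direct search only suits small images.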

This process worked relatively well and relatively quickly for the smaller .jpg images, since those images were on the scale of 400x400px. When we attempted this with the larger .tif images, the search range [-15, 15] was too small to find a good alignment, but increasing the range much beyond that made the program too slow.

To fix this, for larger images we iteratively adjust the shift in increasingly finer steps so that we can first make large shifts to get close to our solution, and then fine tune from there. To do this, we first rescale the image to 1/32 the original dimensions and find the best shift at that resolution. Every iteration after that, we rescale the image to twice the size of the previous iteration and begin our search centered around the shift we have accumulated so far. We continue doing this until we are looking for a shift in the range [-15, 15] on the original resolution.
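A coarse-to-fine search along these lines is sketched below. It is not the program's actual code: the block-averaging `downscale` helper, the `np.roll` shifting, and the (dy, dx) ordering are assumptions. With `levels=5` the coarsest level would be 1/32 of the original size, as described above; the demo uses fewer levels and a smaller radius on a small synthetic image.

```python
import numpy as np

def search(channel, anchor, center, radius):
    """Exhaustive SSD search over shifts within `radius` of `center`."""
    best, best_score = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - anchor) ** 2)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best

def downscale(img, factor):
    """Shrink by an integer factor using block averaging (assumed method)."""
    h, w = img.shape
    h, w = h - h % factor, w - w % factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def pyramid_shift(channel, anchor, levels=5, radius=15):
    """Estimate the shift at 1/2**levels scale, then refine at each finer scale."""
    shift = (0, 0)
    for level in range(levels, -1, -1):
        factor = 2 ** level
        c = downscale(channel, factor) if factor > 1 else channel
        a = downscale(anchor, factor) if factor > 1 else anchor
        shift = search(c, a, shift, radius)
        if level > 0:
            # One pixel at this scale is two pixels at the next finer scale.
            shift = (shift[0] * 2, shift[1] * 2)
    return shift

rng = np.random.default_rng(1)
blue = rng.random((128, 128))
red = np.roll(blue, (-12, 8), axis=(0, 1))
print(pyramid_shift(red, blue, levels=2, radius=4))  # (12, -8)
```

In the demo, the true shift of 12 rows lies outside a direct [-4, 4] search at full resolution, yet the pyramid recovers it because it appears as a shift of only 3 at the coarsest level.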

The image pyramid approach of adjusting the shifts with increasing granularity gave us good final alignment and did not take more than a few minutes.

Results

Each reconstructed image is shown with the shifts for the red and green plates determined by the algorithm.

Cathedral:
Green shift = (5, 2)
Red shift = (12, 3)
Cathedral

Monastery:
Green shift = (-3, 2)
Red shift = (3, 2)
Monastery

Nativity:
Green shift = (3, 1)
Red shift = (7, 0)
Nativity

Settlers:
Green shift = (7, 0)
Red shift = (14, -1)
Settlers

Emir:
Green shift = (49, 24)
Red shift = (0, -310)
Emir

Icon:
Green shift = (40, 17)
Red shift = (89, 23)
Icon

Harvesters:
Green shift = (59, 17)
Red shift = (123, 14)
Harvesters

Lady:
Green shift = (49, 8)
Red shift = (109, 11)
Lady

Self Portrait:
Green shift = (78, 29)
Red shift = (176, 37)
Self Portrait

Three Generations:
Green shift = (51, 14)
Red shift = (110, 11)
Three Generations

Train:
Green shift = (42, 6)
Red shift = (87, 32)
Train

Turkmen:
Green shift = (55, 20)
Red shift = (116, 28)
Turkmen

Village:
Green shift = (64, 12)
Red shift = (137, 22)
Village

Problems

The “emir.tif” image had the most noticeable issue of all the reconstructions. This is likely because of the huge discrepancy between the three color channels in this image. The man’s outfit is very blue and quite bright, so on the blue channel the areas covering the outfit have very high values. Because the color is so strongly blue, the green and red channels likely do not show a comparable spike in those areas, since there is not much red or green there. This makes it hard to find a proper alignment using SSD, which only looks at differences between raw values. The other images aligned relatively well, and aside from the noticeable color bars near the edges of each image, they look like proper color images with very small artifacts.
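A toy example, built from made-up values rather than the actual plate data, shows how this can happen. The same square is bright in one channel and dark in the other, and SSD then scores a wrong alignment better than the correct one.

```python
import numpy as np

# Synthetic 16x16 channels with a flat background of 0.2 (illustrative values only).
blue = np.full((16, 16), 0.2)
red = np.full((16, 16), 0.2)
blue[4:8, 4:8] = 0.9   # the square is very bright in blue
red[4:8, 4:8] = 0.1    # the same square is dark in red

def ssd(a, b):
    return float(np.sum((a - b) ** 2))

aligned = ssd(red, blue)                          # correct alignment, no shift
misaligned = ssd(np.roll(red, 4, axis=1), blue)   # square moved 4 columns over
print(misaligned < aligned)  # True
```

When the squares overlap, every pixel in the square contributes a large difference. When they are apart, each square is compared against the background, which differs less, so the incorrect shift wins.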

Additional Results

River

Green shift = (1, 2)
Red shift = (14, 3)
River

Shed

Green shift = (2, -1)
Red shift = (12, -3)
Shed

Glass

Green shift = (2, 2)
Red shift = (6, 2)
Glass

Flowers

Green shift = (2, 2)
Red shift = (5, 2)
Flowers