CS194-26: Project 1 - Colorizing the Prokudin-Gorskii Photo Collection

Images of the Russian Empire

by Leanna Yu (cs194-26-aff)


Project Overview

Sergei Mikhailovich Prokudin-Gorskii had the idea of capturing color in photography by taking three exposures of the same scene through red, green, and blue filters. The goal of this project is to reconstruct a color image by aligning the three exposures with an algorithm efficient enough to handle large files. To break this task down, the input image is first separated into its three exposures, which are then aligned and stacked on top of one another to produce an RGB color image.
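The separate-then-stack step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the scanned plate is a single grayscale array with the three exposures stacked vertically in blue, green, red order from top to bottom, and the names split_plate and stack_rgb are hypothetical.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate into (blue, green, red) thirds.

    Assumption: exposures are ordered blue, green, red from top to bottom.
    """
    height = plate.shape[0] // 3  # leftover rows at the bottom are dropped
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

def stack_rgb(red, green, blue):
    """Stack three (already aligned) channels into an H x W x 3 RGB image."""
    return np.dstack([red, green, blue])

# Tiny synthetic plate: three 3x4 bands plus one spare row at the bottom.
plate = np.vstack([
    np.full((3, 4), 0.1),  # blue exposure
    np.full((3, 4), 0.5),  # green exposure
    np.full((3, 4), 0.9),  # red exposure
    np.zeros((1, 4)),      # leftover row, ignored by split_plate
])
blue, green, red = split_plate(plate)
rgb = stack_rgb(red, green, blue)
print(rgb.shape)  # (3, 4, 3)
```

In the real pipeline the green and red channels would be shifted into alignment with the blue channel before stacking.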

Approach

Processing Small Image Inputs: Single-Scale Implementation

Since the JPG files were of a reasonable size, a simple exhaustive alignment algorithm was enough to stack the red, green, and blue exposures. Each channel was shifted by up to 15 pixels to the left and right, as well as up to 15 pixels up and down, giving 900 candidate shifts. With this single-scale implementation, all 900 shifts were tested, and at each shift the metric I used to score how well the images matched was the Sum of Squared Differences (SSD). The SSD represents the error, or mismatch, between the two images at a given shift position, so the best match is the shift with the smallest SSD.
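The exhaustive search described above might look roughly like this. It is a sketch under stated assumptions rather than the original implementation: shifts are taken from range(-15, 15) in each direction so that the count comes out to 900, shifting is done with np.roll (which wraps pixels around the edges), the images are float arrays, and the function names are hypothetical.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; inputs should be float arrays
    so that the subtraction cannot overflow as it would with uint8."""
    return np.sum((a - b) ** 2)

def align_single_scale(channel, base, radius=15):
    """Try every (dy, dx) shift in [-radius, radius) and return the one
    that minimizes the SSD between the shifted channel and the base."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius):
        for dx in range(-radius, radius):
            # Assumption: np.roll is used for shifting, so pixels wrap around.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, base)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

# Usage: displace a random image by a known amount and recover the shift.
rng = np.random.default_rng(0)
base = rng.random((40, 40))
channel = np.roll(base, (-3, 2), axis=(0, 1))
print(align_single_scale(channel, base))  # (3, -2)
```

The returned shift is the one to apply to the channel so that it lines up with the base, which is why it is the negative of the displacement used to build the test input.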

Cathedral: [5, 2], [12, 3]

Monastery: [-3, 2], [3, 2]

Tobolsk: [3, 3], [6, 3]

Processing Large Image Inputs: Image Pyramid

Because the TIF files are significantly larger than the JPG files, it would be far too expensive to try every possible shift in the same way as described above. For that reason, to find the best alignment for these larger files, I implemented a recursive function that uses the simpler algorithm as its base case. Each level of recursion finds the best displacement for the images scaled down by a factor of 2 relative to the level above it. The function takes the displacement found at the coarser level and multiplies it by two, since the image is now twice as large, and then searches a [-1, 1] window around that estimate for the best displacement at the new scale. Combined, the two give the optimal offset at the current level of recursion.

This algorithm worked fairly well on all the given images except Emir, where it fails to align the red channel with the rest of the image. The reason for this disparity is that, compared to the other images in the test set, Emir's color channels have very different brightness values, so a score based on raw pixel intensities has a hard time matching the red channel to the blue base layer.
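The coarse-to-fine recursion can be sketched as below. This is an illustration under assumptions, not the original code: downsampling is done by averaging 2x2 blocks, the recursion stops once the shorter side is at most min_size pixels, the base case searches a symmetric window of 15 pixels, shifting uses np.roll (which wraps around), and all function and parameter names are hypothetical.

```python
import numpy as np

def best_shift(channel, base, center, radius):
    """Search shifts within `radius` of `center` and return the (dy, dx)
    with the smallest sum of squared differences against the base."""
    cy, cx = center
    best, best_score = center, np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - base) ** 2)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best

def downsample(img):
    """Halve the resolution by averaging each 2x2 block of pixels."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]  # trim to even dimensions
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid(channel, base, min_size=64, coarse_radius=15, refine_radius=1):
    """Estimate the shift on a half-size copy, double it, then refine it
    within a small window at the current resolution."""
    if min(base.shape) <= min_size:
        # Base case: exhaustive search on the smallest image.
        return best_shift(channel, base, (0, 0), coarse_radius)
    dy, dx = align_pyramid(downsample(channel), downsample(base),
                           min_size, coarse_radius, refine_radius)
    return best_shift(channel, base, (2 * dy, 2 * dx), refine_radius)

# Usage: recover a known displacement on a 256x256 synthetic image.
rng = np.random.default_rng(0)
base = rng.random((256, 256))
channel = np.roll(base, (-8, 4), axis=(0, 1))
print(align_pyramid(channel, base))  # (8, -4)
```

With these settings the 256x256 example runs a full search only at 64x64 and tests just nine shifts at each of the two larger levels, which is where the savings on large files come from.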

Harvesters: [59, 16], [124, 13]

Icon: [40, 17], [89, 23]

Lady: [47, 8], [113, 11]

Melons: [82, 9], [178, 12]

Onion Church: [51, 26], [108, 36]

Self Portrait: [78, 28], [176, 36]

Three Generations: [53, 14], [111, 10]

Train: [42, 5], [87, 31]

Village: [65, 12], [137, 22]

Workshop: [52, 0], [104, -12]

Clothing: [25, -18], [115, -38]

Man on Boat: [52, -23], [107, -56]

Sunset: [0, -46], [62, -55]

Emir: [49, 24], [100, -204]