CS 194-26 Project 1

Colorizing the Prokudin-Gorskii Photo Collection
Kyle Hua, CS194-26-AGX

Intro

Sergei Mikhailovich Prokudin-Gorskii photographed the Russian Empire in the early 1900s, producing the Prokudin-Gorskii photo collection. For each scene he took three black-and-white exposures through red, green, and blue filters. Stacking these three exposures as color channels yields a color image. However, we must account for misalignments between the exposures that result from the photographer or the subjects moving ever so slightly. To fix the misalignment, we calculate an offset for each channel, apply these offsets, and then stack the channels on top of each other.
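As a rough illustration of the "apply the offsets, then stack" step, here is a minimal sketch. It assumes the three channels are already separate NumPy arrays of equal shape, that green and red are shifted onto blue, and that shifts are applied with np.roll (which wraps pixels around the border). The function name stack_channels is hypothetical and not taken from the project code.

```python
import numpy as np

def stack_channels(r, g, b, g_offset, r_offset):
    """Shift g and r by their [x, y] offsets and stack into an RGB image.

    Assumption: x shifts along columns and y shifts along rows.
    """
    # np.roll takes (row shift, column shift), so [x, y] is swapped to (y, x).
    g_aligned = np.roll(g, shift=(g_offset[1], g_offset[0]), axis=(0, 1))
    r_aligned = np.roll(r, shift=(r_offset[1], r_offset[0]), axis=(0, 1))
    # Blue is the reference channel and stays where it is.
    return np.dstack([r_aligned, g_aligned, b])
```

For example, with the cathedral.jpg offsets, the call would be stack_channels(r, g, b, g_offset=(2, 5), r_offset=(3, 12)).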

Single Level Implementation

To find the correct offset for the images, I implemented a search that checks all x, y offsets from -20 to 20 between two color channels. To compare the offsets, I used the Sum of Squared Differences (SSD) between the image values at each offset. The offset with the smallest SSD was taken as the best offset. This worked well for the low-resolution JPEG images.
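The exhaustive search described above can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the names ssd and align_single are hypothetical, shifting is done with np.roll (so pixels wrap around the border), and x is taken to be the column shift and y the row shift.

```python
import numpy as np

def ssd(a, b):
    """Sum of Squared Differences between two equally shaped images."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def align_single(channel, reference, window=20):
    """Try every [x, y] offset in [-window, window] and keep the lowest SSD."""
    best_offset = (0, 0)
    best_loss = np.inf
    for dx in range(-window, window + 1):
        for dy in range(-window, window + 1):
            # Shift rows by dy and columns by dx, then score against the reference.
            shifted = np.roll(channel, shift=(dy, dx), axis=(0, 1))
            loss = ssd(shifted, reference)
            if loss < best_loss:
                best_loss = loss
                best_offset = (dx, dy)
    return best_offset
```

With window=20 this evaluates 41 x 41 = 1681 candidate offsets, each requiring a full pass over the image, which is why the cost grows quickly with image size.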

The offset vector is in [x,y] format.

cathedral.jpg
g: [2, 5] | r: [3, 12]
monastery.jpg
g: [2, -3] | r: [2, 3]
tobolsk.jpg
g: [2, 3] | r: [3, 6]

Multi Level Implementation

While the single level implementation works well for small images, it did not work as well on large ones. A -20 to 20 pixel window covers only a minuscule percentage of a large image, and the exhaustive search takes a very long time at full resolution. To speed up the function for larger images, I used an image pyramid: the single level approach is called recursively on n scaled-down versions of the image, and the best offsets found on the smaller images guide the search on the larger ones.

My image pyramid had depth d, with each level scaled by a factor of s. At the top level, the image is rescaled by 1/s^d. I found that a scale of 2 and a depth of 3 worked best. This results in 4 levels: the bottom level is the full-sized image, and each higher level is downscaled by a further factor of 2.

Each rescaled level is a smaller image, so the single level search runs faster on it.

The single level search at each level then starts from the best offset found at the level above, with a window that depends on the level. The highest level has the largest search window (-16 to 16), while the lowest has the smallest (-2 to 2). Shrinking the search window allows faster, more refined searches. In addition, the offsets from prior levels must be multiplied by the scaling factor to stay consistent with the larger image. After the best offset for a level is found, it is added to the total offset from prior levels.
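The coarse-to-fine procedure can be sketched as below. Several details are assumptions made for illustration and are not confirmed by the writeup: the function names are hypothetical, downscaling uses a simple block mean rather than a library rescale, shifts wrap around via np.roll, and the intermediate windows are obtained by halving (16, 8, 4, 2), since only the top and bottom window sizes are stated. It is written as a loop rather than a recursion.

```python
import numpy as np

def ssd_at(channel, reference, dx, dy):
    """SSD between the reference and the channel shifted by [dx, dy]."""
    shifted = np.roll(channel, shift=(dy, dx), axis=(0, 1))
    diff = shifted.astype(np.float64) - reference.astype(np.float64)
    return float(np.sum(diff * diff))

def search(channel, reference, center, window):
    """Exhaustive search in a square window around a starting offset."""
    cx, cy = center
    candidates = [(dx, dy)
                  for dx in range(cx - window, cx + window + 1)
                  for dy in range(cy - window, cy + window + 1)]
    return min(candidates, key=lambda o: ssd_at(channel, reference, o[0], o[1]))

def downscale(img, factor):
    """Shrink an image by averaging factor x factor blocks (assumed method)."""
    if factor == 1:
        return img
    h = img.shape[0] // factor * factor
    w = img.shape[1] // factor * factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def align_pyramid(channel, reference, depth=3, scale=2, top_window=16):
    """Coarse-to-fine alignment over depth + 1 levels; returns [x, y]."""
    offset = (0, 0)
    for level in range(depth, -1, -1):
        factor = scale ** level          # level 0 is the full-sized image
        small_c = downscale(channel, factor)
        small_r = downscale(reference, factor)
        # Assumed schedule: the window shrinks by the scale factor per level.
        window = max(top_window // scale ** (depth - level), 1)
        offset = search(small_c, small_r, offset, window)
        if level > 0:
            # Carry the estimate down: one pixel here is `scale` pixels below.
            offset = (offset[0] * scale, offset[1] * scale)
    return offset
```

Searching around the carried-down offset is equivalent to finding a small correction at each level and adding it to the scaled total from the levels above.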

Minor Improvements

church.tiff
g: [4, 25] | r: [-4, 58] | 12.3 s
emir.tiff
g: [23, 49] | r: [41, 70] | 13 s
emir.tiff
b: [-23, -49] | r: [17, 57] | 11.8 s | Aligned to green
harvesters.tiff
g: [16, 59] | r: [13, 123] | 12.8 s
icon.tiff
g: [17, 40] | r: [22, 89] | 12.4 s
lady.tiff
g: [8, 47] | r: [11, 113] | 11.1 s
melons.tiff
g: [9, 81] | r: [13, 178] | 18.2 s | Expanded window
onion_chuch.tiff
g: [26, 51] | r: [36, 108] | 11.5 s
self_portrait.tiff
g: [27, 77] | r: [35, 174] | 18 s | Expanded window
three_generations.tiff
g: [12, 52] | r: [10, 110] | 11.6 s
workshop.tiff
g: [-1, 52] | r: [-12, 103] | 11.7 s