CS 194-26: Colorizing the Prokudin-Gorskii Photo Collection

Overview

In the early 1900s, Sergei Mikhailovich Prokudin-Gorskii came up with a technique for capturing images in color. His plan was to take three photos of the same scene, each through a different filter, so that each image would only capture light of a specific color: red, green, or blue. Although the technology to combine them did not exist during his time, he hoped that one day his images could be rendered in full color. This project aims to accomplish that goal, taking the digitized versions of his black-and-white glass plate images and transforming them into color photos.

Algorithms and Techniques

Extracting Three Color Channels

The first step was to take the scanned plate, which contains all three exposures stacked vertically, and separate it into three single-channel images that could be overlaid onto each other. I did this by dividing the total vertical height by 3 and slicing the full image along its height. Then, I just laid the three images on top of each other with no alignment at all.

Seems like simply laying the three channels on top of each other isn't enough! The shots weren't taken at exactly the same spot, so the alignment is a little off. The next step is to align the three channels so the image is clear.
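Here is a minimal sketch of the splitting step, not my exact code. The function names are made up for illustration, and the top-to-bottom order of blue, green, red is an assumption about how the plates were scanned.

```python
import numpy as np

def split_channels(plate):
    """Split a vertically stacked plate scan into three equal-height channels."""
    height = plate.shape[0] // 3  # any leftover rows at the bottom are dropped
    # Assumption: the plate is stacked blue, green, red from top to bottom.
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return r, g, b

def stack_rgb(r, g, b):
    """Overlay three single-channel images into one (H, W, 3) color image."""
    return np.dstack([r, g, b])
```

Calling stack_rgb(*split_channels(plate)) gives the naive, unaligned overlay described above.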

Basic Alignment

In order to align the channels to each other, I first defined an error metric to score how well two channels were aligned. The metric I used was the Sum of Squared Differences (SSD), which is defined as sum(sum((image1 - image2)^2)). Basically, we square the difference between each pair of corresponding pixels in the two images and add up the results. Using this metric, I scored every combination of vertical and horizontal shifts within a [-15, 15] window, and returned the shift that resulted in the lowest score.
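The search can be sketched roughly as follows. This is an illustration under assumptions rather than my actual implementation: the names ssd and align are hypothetical, and shifting with np.roll (which wraps pixels around the edges) is just one simple way to displace a channel.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two same-shaped images."""
    # Cast to float first so uint8 inputs do not overflow when subtracted.
    diff = a.astype(np.float64) - b.astype(np.float64)
    return np.sum(diff ** 2)

def align(channel, reference, window=15):
    """Return the (dy, dx) shift of `channel` that minimizes SSD against `reference`."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # Assumption: np.roll wraps pixels around instead of padding.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With the default window this scores 31 x 31 = 961 candidate shifts, which is fine for the small .jpg files.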

While this worked well for most images, this method seemed to fail on monastery.jpg.

It seemed as if the algorithm was trying to align the borders to each other. Since black is one extreme of the brightness range, any misalignment of the black borders produces very large squared differences, so the borders dominated the error score.

Cropped Alignment

To solve the problem of aligning to the borders, I decided to ignore the edges when calculating the error metric. I did this by calculating the SSD on a cropped version of each channel, cropping off 10% from each side of the image. This fixed the misalignment in monastery.jpg. Tada!
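A minimal sketch of the cropped metric, assuming the 10% is measured against each dimension separately. The helper names are hypothetical.

```python
import numpy as np

def crop_interior(image, fraction=0.10):
    """Drop `fraction` of the height/width from each side, keeping the interior."""
    h, w = image.shape[:2]
    dy, dx = int(h * fraction), int(w * fraction)
    return image[dy:h - dy, dx:w - dx]

def cropped_ssd(a, b, fraction=0.10):
    """SSD computed only on the interior, so the plate borders cannot affect the score."""
    diff = (crop_interior(a, fraction).astype(np.float64)
            - crop_interior(b, fraction).astype(np.float64))
    return np.sum(diff ** 2)
```

Only the scoring changes here. The shift that wins is still applied to the full, uncropped channel.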

Pyramid Alignment

While this method of exhaustively searching in a [-15, 15] window worked well for the smaller, compressed .jpg images, it did not hold up on the larger .tif images: the search was extremely slow, and the window was too small to cover the displacements at that resolution. To improve runtime, I implemented the pyramid algorithm, which aligns a scaled-down version of the image first and gradually works up to the full-resolution image. This allowed me to search the full [-15, 15] window only at the lowest resolution, and a smaller [-2, 2] window at each higher resolution. I did this by recursively calling my original align function on rescaled versions of the two input channels, then multiplying the resulting shift by the scaling factor and using it as the starting point for fine-tuning at the higher resolution.
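The recursion can be sketched like this. It is a simplified illustration, not my actual code: the scaling factor of 2, the min_size cutoff, the 2x2 block-average downsample (standing in for a library rescale), and all function names are assumptions.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two same-shaped images."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return np.sum(diff ** 2)

def search(channel, reference, center, radius):
    """Score every shift within `radius` of `center` and return the best (dy, dx)."""
    best_shift, best_score = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = ssd(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downsample(image):
    """Halve the resolution by averaging 2x2 blocks (assumed stand-in for a rescale)."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    img = image[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def pyramid_align(channel, reference, min_size=64, coarse=15, fine=2):
    """Coarse-to-fine alignment: wide search when small, narrow refinement when large."""
    if min(channel.shape) <= min_size:
        # Base case: the image is small enough for the full [-coarse, coarse] search.
        return search(channel, reference, (0, 0), coarse)
    dy, dx = pyramid_align(downsample(channel), downsample(reference),
                           min_size, coarse, fine)
    # A shift found at half resolution is worth twice as many pixels here.
    return search(channel, reference, (2 * dy, 2 * dx), fine)
```

Each level only scores 5 x 5 = 25 shifts around the scaled-up estimate, so the expensive 31 x 31 search only ever runs on the smallest image.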

Edge Detection

Even with the pyramid alignment algorithm, there was one image that still failed to align properly: emir.tif. Because the scene contains strongly saturated colors, the brightness of the same region differs a lot between color channels, even at pixels that should line up with each other. As a result, comparing raw brightness values produced strange alignments.

To resolve this issue, I needed to find a better feature to compare between channels than raw pixel brightness values. I decided to use Sobel edge detection to find the edges in the two channels and then align based on the resulting edge images. I used skimage's built-in Sobel filter, which produced a much better alignment for the problematic image.
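To show why edges make a better feature, here is a small NumPy-only Sobel gradient magnitude. Note that this is a hand-written stand-in for illustration, not skimage's filter: it is unnormalized, its edge padding is an assumption, and the function name is hypothetical.

```python
import numpy as np

def sobel_magnitude(image):
    """Unnormalized Sobel gradient magnitude, same shape as the input."""
    # Assumption: replicate border pixels so the output keeps the input shape.
    img = np.pad(image.astype(np.float64), 1, mode="edge")
    # Horizontal gradient: right column minus left column, rows weighted 1, 2, 1.
    gx = (img[:-2, 2:] + 2 * img[1:-1, 2:] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[1:-1, :-2] - img[2:, :-2])
    # Vertical gradient: bottom row minus top row, columns weighted 1, 2, 1.
    gy = (img[2:, :-2] + 2 * img[2:, 1:-1] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[:-2, 1:-1] - img[:-2, 2:])
    return np.hypot(gx, gy)
```

Two channels that show the same boundary at very different brightness levels still produce edges in the same place, so passing the edge images to the alignment search instead of the raw channels sidesteps the brightness mismatch.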

Project 1: Colorizing the Prokudin-Gorskii Photo Collection