Images of the Russian Empire

Colorizing the Prokudin-Gorskii photo collection

Goal

The goal of this project was to colorize images taken by Prokudin-Gorskii during the early 20th century. This was before the advent of color photography, so Prokudin-Gorskii photographed each scene three times through three different physical filters: red, green, and blue. However, because the three exposures were taken separately, they do not align exactly, and some additional work was required to align them so that the original color scene could be recreated. The original glass plate negatives looked like this, with the top photograph corresponding to the blue channel, the middle to the green channel, and the bottom to the red channel.
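
As a rough illustration of the plate layout described above, the scan can be cut into three equal-height strips and stacked into a color image. This is only a sketch: it assumes the plate is loaded as a 2D NumPy array, and the function names `split_plate` and `stack_rgb` are hypothetical, not taken from the project code.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate scan into (blue, green, red) strips.

    Assumes the three exposures have equal height; leftover rows at the
    bottom of the scan are dropped.
    """
    height = plate.shape[0] // 3
    blue = plate[:height]                # top third
    green = plate[height:2 * height]     # middle third
    red = plate[2 * height:3 * height]   # bottom third
    return blue, green, red

def stack_rgb(red, green, blue):
    """Combine three (already aligned) channels into an H x W x 3 image."""
    return np.dstack([red, green, blue])
```

Without alignment, the stacked result shows colored fringes wherever the three exposures disagree, which is what the rest of this write-up addresses.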

Error Metrics Considered

One of the first design decisions was to choose an error metric to compare how well two images aligned. Here are the metrics I considered:

Sum of Squared Differences: this metric is the squared l2 norm of the difference between the pixel values of the two images. The smaller the value, the better aligned the two images are.
Normalized Cross-Correlation: this metric is calculated as the dot product between the two normalized image vectors. The higher the dot product, the more aligned the two vectors, and consequently the images, are.
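
The two metrics above can be sketched in a few lines of NumPy. This is a minimal version under my own assumptions: images are same-shaped arrays, the function names `ssd` and `ncc` are illustrative, and the zero-norm guard is an addition for safety rather than something described in the write-up.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: lower means better aligned."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def ncc(a, b):
    """Dot product of the two images flattened and scaled to unit length.

    Higher means better aligned; the maximum value is 1.0.
    """
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    if norm == 0:
        return 0.0  # assumption: treat an all-zero image as uncorrelated
    return float(np.dot(a, b) / norm)
```

Note that the two metrics point in opposite directions: a search loop minimizes SSD but maximizes NCC (or, equivalently, minimizes its negative).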

Search Procedure

Once I decided on an error metric, I needed to decide how to actually find the displacement that yields the smallest error under that metric. These are the two methods I implemented:

Naive Search: in this method, I shifted the images within a specified range and checked which displacement yielded the lowest error value. This is a naive approach, and would not be fast enough if the actual displacements exceeded 20-30 pixels, especially on large images.
Image Pyramid: this method was intended for much larger images where naive search would not be feasible. I implemented a recursive method, where the images passed into each recursive call are scaled by 0.5. At the smallest level, naive search is applied between the two coarse images and the displacement is sent back up the call stack. For the middle recursive calls, the larger displacement received from the coarser levels is then used to shift the original image. Now that the image is shifted, naive search is run on a much smaller region to get a more localized displacement. I chose a region of 100 x 100 pixels from the center of the image. This process is repeated until the displacement for the original, full-sized image is recovered. Hyperparameters for this method include: the scale of the downsizing for the recursive calls, the maximum number of pyramid levels and the minimum image size before naive search is performed (the base case), and the size of the window used for the localized displacement search.
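
The two search procedures above can be sketched roughly as follows. This is not the project's actual code but a simplified illustration with several assumptions of mine: SSD is used as the metric, shifts wrap around via `np.roll`, downscaling by 0.5 is done by keeping every second pixel (a proper resize would smooth first), the coarse displacement is doubled when moving up a level, and the names and defaults (`naive_search`, `pyramid_search`, `min_size`, `refine_radius`) are hypothetical.

```python
import numpy as np

def ssd(a, b):
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def center_crop(img, size):
    """Return a window of at most size x size pixels from the image center."""
    h, w = img.shape
    ch, cw = min(size, h), min(size, w)
    top, left = (h - ch) // 2, (w - cw) // 2
    return img[top:top + ch, left:left + cw]

def naive_search(moving, reference, radius=15, window=None):
    """Try every shift in [-radius, radius]^2 and keep the lowest-error one.

    If window is given, the error is computed only on a central crop.
    Returns (dy, dx), the shift to apply to `moving`.
    """
    ref = reference if window is None else center_crop(reference, window)
    best_err, best_shift = float("inf"), (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            if window is not None:
                shifted = center_crop(shifted, window)
            err = ssd(shifted, ref)
            if err < best_err:
                best_err, best_shift = err, (dy, dx)
    return best_shift

def pyramid_search(moving, reference, min_size=64, radius=15,
                   refine_radius=2, window=100):
    """Coarse-to-fine alignment; returns (dy, dx) at full resolution."""
    # Base case: the image is small enough for an exhaustive search.
    if min(moving.shape) <= min_size:
        return naive_search(moving, reference, radius)
    # Recurse on half-resolution copies (plain subsampling, an assumption).
    coarse_dy, coarse_dx = pyramid_search(
        moving[::2, ::2], reference[::2, ::2],
        min_size, radius, refine_radius, window)
    # One pixel at the coarse level corresponds to two pixels here.
    dy, dx = 2 * coarse_dy, 2 * coarse_dx
    shifted = np.roll(moving, (dy, dx), axis=(0, 1))
    # Refine with a small search on a central window only.
    fine_dy, fine_dx = naive_search(shifted, reference, refine_radius, window)
    return dy + fine_dy, dx + fine_dx
```

In this sketch the expensive full-range search only ever runs on the smallest image, and each finer level checks just a handful of shifts, which is what makes the approach practical for large scans.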

Problems Encountered

After implementing the image pyramid search method, most of the given images aligned well enough that I could not detect any blurriness with the naked eye. The exceptions were the following images.

Emir: The algorithm did not work as well on this image because the different color channel images were not all taken with the same brightness. Therefore, the pixel values themselves did not yield enough information to adequately align the photos. The original result of the algorithm is shown below.

To fix this issue, I applied a Sobel filter to the images before passing them through the alignment algorithm. The Sobel filter detects the edges within the image, which are a more reliable basis for alignment than the original pixel values. After doing this, the image alignment looked a lot better:

Melons: This image was not well suited to the current image alignment algorithm because the melons were all clustered together and similarly colored. It was hard for the algorithm to distinguish between the different melons and align them properly because neighboring pixels all had similar values.
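
The Sobel preprocessing described for the Emir image can be sketched as below. This is an assumption-laden illustration rather than the project's code: it computes the gradient magnitude with the standard 3 x 3 Sobel kernels using plain NumPy slicing, pads the border by repeating edge pixels, and uses the hypothetical name `sobel_edges`. A library routine could be used instead.

```python
import numpy as np

def sobel_edges(img):
    """Return the Sobel gradient magnitude of a 2D image.

    The border is handled by repeating the edge pixels (an assumption).
    """
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    # Horizontal gradient: right neighbors minus left neighbors,
    # with the three rows weighted 1, 2, 1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    # Vertical gradient: lower neighbors minus upper neighbors,
    # with the three columns weighted 1, 2, 1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)
```

Because the filter only looks at differences between neighboring pixels, adding a constant brightness offset to a channel leaves its edge map unchanged, which is why aligning edge maps is less sensitive to channels that were exposed at different brightness levels.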