CS194-26 Project 1: Images of the Russian Empire

Michael Park - Fall 2020

Objective

The main objective of this project is to reconstruct a colorized image from the Prokudin-Gorskii glass plate collection. A colorized image can be created by aligning the three color channel images from the glass plate negatives. While this task is fairly straightforward, handling large images (e.g. TIF files) poses several efficiency challenges. For this project, I explored several ways to minimize computation for faster rendering of colorized images.


Naive Method

The main algorithm used for alignment is an exhaustive search over a window of possible displacements. I found a window of [-15px, 15px] to be the best fit for the bounds: smaller bounds could miss the correct alignment, and larger bounds would add redundant computation. To quantify how well one color channel fits another, I calculated the Sum of Squared Differences (SSD). I iterated through the possible alignments with nested for loops and took the displacement that yields the smallest SSD as the best fit.
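The search described above can be sketched as follows. This is a minimal illustration and not necessarily the code used for this project: the function names `ssd` and `align_exhaustive` are hypothetical, offsets are assumed to be (rows, columns), and shifting is done with `np.roll`, which wraps pixels around the image edge.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def align_exhaustive(channel, reference, radius=15):
    """Return the (dy, dx) shift of `channel` that minimizes SSD against `reference`.

    Assumption: shifts wrap around the border (np.roll), so borders should
    be cropped away before scoring.
    """
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With `radius=15` the two loops evaluate 31 × 31 = 961 candidate displacements per channel.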

To extract the three color channel images from the given file, I split the file along its height. Initially, those images contain visible black borders, which prevent the SSD values from accurately reflecting the fit. To address this issue, I implemented a crop function that retrieves the inner 75% of each image. While automatic border detection would better preserve the original image, I chose manual cropping for simplicity.
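A rough sketch of the splitting and cropping steps is shown here. It rests on assumptions not stated in the text: the plate is taken to be three equal-height frames stacked blue, green, red from top to bottom, and "inner 75%" is read as 75% of each dimension. The names `split_channels` and `crop_center` are hypothetical.

```python
import numpy as np

def split_channels(plate):
    """Split a stacked plate into (blue, green, red) frames of equal height.

    Assumption: frames are ordered blue, green, red from top to bottom.
    """
    height = plate.shape[0] // 3
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

def crop_center(image, keep=0.75):
    """Keep the central `keep` fraction of the image along each axis."""
    rows, cols = image.shape[:2]
    margin_r = int(rows * (1 - keep) / 2)
    margin_c = int(cols * (1 - keep) / 2)
    return image[margin_r:rows - margin_r, margin_c:cols - margin_c]
```

Cropping all three channels with the same margins keeps them the same size, so SSD can be computed directly on the cropped arrays.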

I decided to align the green and red images to the blue image before stacking them together. Below are the results for small images:

Image        Green offset   Red offset
Cathedral    G[5, 2]        R[12, 3]
Tobolsk      G[3, 3]        R[6, 3]
Monastery    G[-3, 2]       R[3, 2]
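Offsets like the ones listed here could be applied and the channels stacked roughly as follows. This is only a sketch: `colorize` is a hypothetical helper, each offset is assumed to be a (rows, columns) shift, and `np.roll` is used for shifting, so pixels wrap around the edge.

```python
import numpy as np

def colorize(blue, green, red, green_shift, red_shift):
    """Shift green and red onto blue and stack the result as an RGB image.

    Assumption: each shift is a (rows, columns) pair and wraps at the border.
    """
    green_aligned = np.roll(green, green_shift, axis=(0, 1))
    red_aligned = np.roll(red, red_shift, axis=(0, 1))
    # Channel order along the last axis is red, green, blue.
    return np.dstack([red_aligned, green_aligned, blue])
```

For example, under these assumptions the Cathedral row would correspond to `colorize(blue, green, red, (5, 2), (12, 3))`.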

Image Pyramid

While the naive method works well for small images, it becomes too computationally expensive for large images. The naive method uses nested for loops to search through possible displacements, which becomes impractical over a large window. To address this issue, I had to devise an algorithm that eliminates unnecessary queries.

I implemented an image pyramid, where I rescale the original image repeatedly by a factor of 2 to create a "pyramid" of images. Starting with the smallest image, I performed an exhaustive search over a [-15px, 15px] window, the same algorithm used for the naive search. I then scaled the optimal displacement by 2 and applied it to the next smallest image of the pyramid. I recursively performed the same alignment algorithm on the aligned image until I found the optimal displacement for the original image. For this particular collection of images, I set the minimum resolution of images to be 400px by 400px, which led to 3-layer image pyramids. By doing this, instead of searching through tens of thousands of possible alignments, I could narrow the search down to around a thousand alignments per pyramid level.
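The coarse-to-fine recursion can be sketched as below. This is an illustrative version, not the project's actual code: all function names are hypothetical, downscaling is done by 2×2 block averaging as a NumPy-only stand-in for whatever rescaling routine was used, and shifts wrap around via `np.roll`.

```python
import numpy as np

def best_shift(channel, reference, radius):
    """Exhaustive SSD search over a [-radius, radius] window in both axes."""
    best_score, best = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            diff = np.roll(channel, (dy, dx), axis=(0, 1)) - reference
            score = np.sum(diff * diff)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best

def downscale_by_2(image):
    """Halve each dimension by averaging 2x2 blocks (assumed rescale method)."""
    rows, cols = image.shape
    rows, cols = rows - rows % 2, cols - cols % 2
    blocks = image[:rows, :cols].reshape(rows // 2, 2, cols // 2, 2)
    return blocks.mean(axis=(1, 3))

def align_pyramid(channel, reference, radius=15, min_size=400):
    """Return the (dy, dx) shift of `channel` onto `reference`, coarse to fine."""
    channel = np.asarray(channel, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    # Base case: halving again would drop below the minimum resolution.
    if min(channel.shape) // 2 < min_size:
        return best_shift(channel, reference, radius)
    coarse = align_pyramid(downscale_by_2(channel), downscale_by_2(reference),
                           radius, min_size)
    # A shift found at half resolution is worth twice as much at this level.
    dy, dx = 2 * coarse[0], 2 * coarse[1]
    pre_aligned = np.roll(channel, (dy, dx), axis=(0, 1))
    ry, rx = best_shift(pre_aligned, reference, radius)
    return dy + ry, dx + rx
```

Because each level refines the doubled estimate from the level below, the total reachable displacement grows with the number of levels while every individual search stays the same size.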

As with the small images, I aligned the green and red images to the blue image before stacking them together, and I applied the same cropping algorithm before alignment. Below are the results for large images:

Image               Green offset   Red offset
Self Portrait       G[78, 29]      R[175, 37]
Melons              G[82, 10]      R[178, 13]
Castle              G[34, 2]       R[98, 5]
Onion Church        G[50, 27]      R[108, 37]
Icon                G[41, 18]      R[90, 23]
Train               G[42, 6]       R[86, 32]
Lady                G[52, 8]       R[112, 12]
Workshop            G[53, -1]      R[106, -12]
Three Generations   G[50, 14]      R[110, 12]
Emir                G[48, 24]      R[24, -4]
Harvesters          G[59, 17]      R[124, 15]

Possible Issues

While the algorithm generates most images correctly in under 1 minute, some images like "Emir" are not aligned correctly. This is most likely due to a significant difference in brightness between the three color channel images. When an image contains strongly saturated primary colors, the same region can be bright in one channel and dark in another, so SSD does not accurately reflect whether the images are aligned properly.

One possible solution to this issue would be to use a feature other than raw brightness to quantify the alignment. For this particular image, aligning on the output of Canny edge detection may be a smarter approach. While edge detection is not always superior to SSD on raw pixels (consider more abstract images with less defined shapes), it may serve well for images from the Prokudin-Gorskii collection.
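The idea of scoring on edges instead of raw brightness can be sketched as follows. Note that this sketch swaps Canny for a plain gradient magnitude (via `np.gradient`) to stay NumPy-only; it is an untested suggestion for this collection, not a result from the project. The names `edge_magnitude` and `align_on_edges` are hypothetical.

```python
import numpy as np

def edge_magnitude(image):
    """Gradient magnitude, used here as a simple stand-in for Canny edges."""
    gy, gx = np.gradient(image.astype(np.float64))
    return np.hypot(gy, gx)

def align_on_edges(channel, reference, radius=15):
    """Exhaustive SSD search on edge maps rather than on raw pixel values."""
    edges_c = edge_magnitude(channel)
    edges_r = edge_magnitude(reference)
    best_score, best = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            diff = np.roll(edges_c, (dy, dx), axis=(0, 1)) - edges_r
            score = np.sum(diff * diff)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best
```

Edge maps respond to where intensity changes rather than to its absolute level, so a region that is bright in one channel and dark in another still produces edges in the same place.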


Extra images

I performed the image pyramid algorithm on several other images from the Prokudin-Gorskii collection. Below are the results for such images:

Image        Green offset   Red offset
Beans        G[40, -37]     R[108, -80]
Candle       G[47, 2]       R[106, -6]
Dam          G[37, -4]      R[89, -10]
Religious    G[43, 18]      R[94, 16]
Mannequins   G[54, 0]       R[113, -16]