CS 194-26 Project 1

Overview

In the early 1900s, Sergei Mikhailovich Prokudin-Gorskii (1863-1944) won the Tzar's special permission to travel across the vast Russian Empire and take color photographs of anything and everything he saw. Although color photography was not yet possible at the time, he recorded three exposures of every single scene onto a glass plate using a red, green, and blue filter. He imagined that special projectors would be installed in classrooms all across Russia in which children could learn about their country through his images. The goal of this project is to take Prokudin-Gorskii's glass plate images and, using image processing techniques, produce a color image with as few visual artifacts as possible.

Details

Basic Approach: Exhaustive Search

My goal was to build a program that takes a glass plate image as input and produces a single color image as output. First, I divided the image into three equal parts. I then aligned the second and third parts (G and R) to the first (B).
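A minimal sketch of the splitting step, assuming the scan is loaded as a 2D NumPy array with the three exposures stacked vertically in B, G, R order from the top. The function names `split_plate` and `stack_rgb` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate scan into (B, G, R) channels."""
    # Integer division drops any leftover rows at the bottom of the scan.
    height = plate.shape[0] // 3
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return b, g, r

def stack_rgb(r, g, b):
    """Combine three equally sized channels into an H x W x 3 color image."""
    return np.dstack([r, g, b])
```

After G and R have been shifted into place, `stack_rgb(r, g, b)` produces the color image.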

The simplest way to accomplish this is to search over a window of displacements, score the alignment between the color channels at each one, and take the displacement with the best score. The image matching metric that I used is the Sum of Squared Differences (SSD), which is the squared L2 norm of the difference between two channels, so a lower score means a better match. The downside of this type of exhaustive search is that the number of candidate displacements grows with the square of the window size, so it takes a very long time to run on large images. I therefore looked for a faster way to find the optimal alignment between the R, G, and B channels.
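The exhaustive search can be sketched as follows. This is an illustration under stated assumptions rather than the exact code used for the results: the window size of 15 is a hypothetical default, shifts are applied with `np.roll` (which wraps pixels around the edges), and the score is computed only on the interior so the wrapped pixels are ignored.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means a better match."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def align_exhaustive(channel, reference, window=15):
    """Return the (dy, dx) shift of `channel` that best matches `reference`."""
    h, w = reference.shape
    # Score only the interior so pixels wrapped around by np.roll are excluded.
    inner = (slice(window, h - window), slice(window, w - window))
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted[inner], reference[inner])
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift
```

Each call evaluates (2 * window + 1)^2 candidate shifts, which is why the cost becomes a problem for the large scans.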

Improved Approach: The Image Pyramid

For high-resolution glass plate images, in which the pixel displacement is very large, the naive approach is very costly. To combat this I implemented an image pyramid to cut down runtime. An image pyramid represents an image at multiple scales (I used a factor of 2 between levels) and processes them in order from lowest to highest resolution, updating the estimate of the optimal alignment at each level. While this approach had better runtime, it still failed for some images, such as that of the Emir of Bukhara. In that image, the R, G, and B channels do not have the same brightness values, which is likely a situation in which matching on raw pixel values doesn't work as well.
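A rough sketch of the coarse-to-fine idea, assuming 2x2 block averaging for downsampling and a small refinement radius at each level. The helper names and the parameters `min_size`, `radius`, and `coarse_radius` are hypothetical. For brevity the score here is SSD over the whole wrapped image, which is a simplification.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def search(channel, reference, center, radius):
    """Best (dy, dx) by SSD within `radius` of `center`."""
    best_shift, best_score = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift

def align_pyramid(channel, reference, min_size=32, radius=2, coarse_radius=8):
    """Estimate the shift at the coarsest level, then refine at each finer one."""
    channel = channel.astype(np.float64)
    reference = reference.astype(np.float64)
    if min(channel.shape) <= min_size:
        # Coarsest level: a wider search is cheap because the image is small.
        return search(channel, reference, (0, 0), coarse_radius)
    cy, cx = align_pyramid(downsample(channel), downsample(reference),
                           min_size, radius, coarse_radius)
    # A shift of 1 pixel at the coarser level is 2 pixels at this level.
    return search(channel, reference, (2 * cy, 2 * cx), radius)
```

Only a few candidates are scored at full resolution, while the small coarse image absorbs the large displacement.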

Input

Alignment using the Image Pyramid

Even Further: Canny Edge Detection

I wanted to improve upon my image pyramid solution to better align images such as that of the Emir of Bukhara. I turned to edge detection, with the idea that aligning the channels based on their detected edges rather than their raw color values could eliminate the issue where the different color channels have different brightness levels. This approach did seem to work for Emir, although in the end there were still a couple of images that weren't perfectly aligned. I think this may be because my search window was still too small for images where the R, G, and B channels were badly misaligned. In the future, I would implement a way to determine how many pyramid levels to create based on the size of the input image, to prevent my search window from being too large or too small. Nevertheless, I'm happy that the photo of Emir ended up lining up well.
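To illustrate why edges help, here is a small sketch that swaps in a much simpler detector than Canny: a thresholded gradient magnitude computed with `np.gradient`. It is not the Canny detector used for the results, and the function name `edge_map` and the `fraction` parameter are hypothetical. The point is only that a binary edge map stays the same when a channel is uniformly brighter or darker, whereas raw pixel values do not.

```python
import numpy as np

def edge_map(img, fraction=0.2):
    """Binary edge map: 1.0 where the gradient magnitude is large, else 0.0."""
    gy, gx = np.gradient(img.astype(np.float64))
    magnitude = np.hypot(gy, gx)
    # Thresholding relative to the maximum makes the map insensitive to
    # a uniform change in brightness or contrast.
    return (magnitude > fraction * magnitude.max()).astype(np.float64)

# Two "channels" showing the same vertical edge at different brightness.
bright = np.zeros((8, 8))
bright[:, 4:] = 1.0
dim = 0.5 * bright + 0.25

raw_difference = np.sum((bright - dim) ** 2)                       # nonzero
edge_difference = np.sum((edge_map(bright) - edge_map(dim)) ** 2)  # zero
```

The edge maps can then be passed to the same SSD search in place of the raw channels.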

Input

Bells and Whistles

Automatic Cropping

I created a function to crop the areas of the resulting images where there were unnatural streaks of color. For each side of the image (top, bottom, left, right) I took a subset of the image near that side and searched for rows or columns in which a high percentage of the R, G, or B values were either below 0.1 or above 0.9. This approach seemed to work pretty well, and I'm happy with the results.
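A minimal sketch of this cropping rule, assuming an H x W x 3 float image with values in [0, 1]. The 0.1 and 0.9 thresholds come from the description above; the 80% cutoff for "a high percentage" and the 10% band searched on each side are hypothetical values chosen for illustration, as are the function names.

```python
import numpy as np

def bad_lines(img, axis, low=0.1, high=0.9, frac=0.8):
    """Flag rows (axis=0) or columns (axis=1) dominated by extreme values."""
    extreme = (img < low) | (img > high)
    # Share of extreme pixels along each line, computed per color channel.
    share = extreme.mean(axis=1 - axis)
    # A line is bad if any one channel is mostly extreme.
    return (share >= frac).any(axis=-1)

def crop_bounds(flags, band=0.1):
    """Return (start, end) that cuts off flagged lines near either end."""
    n = len(flags)
    limit = int(n * band)  # only look this far in from each side
    start, end = 0, n
    for i in range(limit):
        if flags[i]:
            start = i + 1
        if flags[n - 1 - i]:
            end = n - 1 - i
    return start, end

def auto_crop(img):
    """Crop away border rows and columns flagged by bad_lines."""
    top, bottom = crop_bounds(bad_lines(img, axis=0))
    left, right = crop_bounds(bad_lines(img, axis=1))
    return img[top:bottom, left:right]
```

Restricting the search to a band near each side keeps dark or bright regions in the middle of the photo from being cropped.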

Cathedral (Aligned)

Cathedral (Aligned and Cropped)

Results

Shifts are written as (y, x) displacements of the G and R channels.

Cathedral, G: (5, 2), R: (12, 3)
Church, G: (25, 3), R: (58, -4)
Emir, G: (49, 24), R: (107, 40)
Harvesters, G: (60, 17), R: (124, 14)
Icon, G: (39, 16), R: (89, 23)
Lady, G: (56, 10), R: (120, 13)
Melons, G: (80, 10), R: (139, -58)
Monastery, G: (-3, 2), R: (3, 2)
Onion Church, G: (52, 24), R: (107, 35)
Self Portrait, G: (77, 29), R: (140, 49)
Three Generations, G: (55, 12), R: (111, 8)
Tobolsk, G: (3, 3), R: (7, 3)
Train, G: (41, 0), R: (85, 29)
Workshop, G: (53, -1), R: (105, -12)