CS194-26: Image Manipulation and Computational Photography Project 1

Overview

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) lived at a time when color photography was still far in the future. However, he wanted to capture the colorfulness of the present so that it could be remembered and enjoyed in the years to come. With special permission from the Tsar, he traveled across the Russian Empire and photographed everything he saw. He envisioned that one day, technology would be advanced enough to combine color-filtered stills into a single color picture. Thus, he took three exposures of each scene onto a glass plate using a red, a green, and a blue filter. His ambitions proved successful: these exposures survived the fall of the Russian Empire, and we are now able to reconstruct the color images. In this project, we take some of these red, green, and blue filtered exposures and combine them to recover the original color image. We first align the filters using a naive exhaustive search, which works on low resolution images such as the JPGs. For the higher resolution TIFs, where that search is too computationally expensive, we implement an image pyramid on top of the original exhaustive search to align the images.

Single Scale Approach

I first implemented a simple exhaustive search that shifts the red and green filters over the blue filter. To judge how well aligned the filters are, I calculated the NCC between a shifted red/green filter and the blue filter for every combination of vertical and horizontal shifts in [-15, 15] pixels. I initially added a 15-pixel grey border on each side of the filters, but found that this hurt my performance, as the NCC of a colored pixel on top of a grey pixel skewed my results. Instead, I cropped a fixed amount of the image off each side and used the rectangular middle part of the picture for alignment. I also tried using SSD as my metric for how "well" a red/green filter is aligned with the blue filter; it produced the same results and ran faster. I also tweaked the parameter for how far to exhaustively search (e.g. [-30, 30]) but found that for the JPGs, [-20, 20] was fine. This algorithm would take far too much time for the TIF images, however, so I only ran it on the JPGs. The results are shown below with the corresponding best displacement of the red and green filters on top of the blue.

Monastery G on B: [-3, 2], R on B: [3, 2]	Cathedral G on B: [5, 2], R on B: [12, 3]
Settlers G on B: [7, 0], R on B: [14, -1]	Nativity G on B: [3, 1], R on B: [7, 0]
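The single scale search described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the exact code used for this project: the helper names (`crop_border`, `ssd`, `ncc`, `align_single_scale`), the use of `np.roll` for shifting, and the default `border` of 20 pixels are all hypothetical choices. Displacements are taken to be (vertical, horizontal).

```python
import numpy as np

def crop_border(img, border):
    """Drop a fixed number of pixels from every side (hypothetical helper)."""
    h, w = img.shape
    return img[border:h - border, border:w - border]

def ssd(a, b):
    """Sum of squared differences: lower means better aligned."""
    return float(np.sum((a - b) ** 2))

def ncc(a, b):
    """Normalized cross-correlation: higher means better aligned.
    Assumes neither input is constant (otherwise the norm is zero)."""
    a = a - a.mean()
    b = b - b.mean()
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def align_single_scale(channel, base, radius=15, border=20, metric="ssd"):
    """Try every (dy, dx) in [-radius, radius]^2 and return the best one.

    Only the cropped middle of each image is scored. Keeping
    border >= radius also keeps the pixels that np.roll wraps
    around the edges out of the score.
    """
    channel = channel.astype(np.float64)
    base = base.astype(np.float64)
    ref = crop_border(base, border)
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            moved = crop_border(np.roll(channel, (dy, dx), axis=(0, 1)), border)
            # Negate NCC so that "smaller is better" holds for both metrics.
            score = ssd(moved, ref) if metric == "ssd" else -ncc(moved, ref)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With a search radius of 15 this scores 31 x 31 = 961 candidate shifts per channel, which is cheap on the JPGs but grows quickly as the radius and image size increase.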

Multi Scale Approach

Because TIF images are much larger than JPGs, we need a fast way to exhaustively search large pixel displacements. In addition, the resolution is higher, so calculating the SSD for many displacements is costly simply because each SSD involves more pixels. Thus, we implement an optimization known as an image pyramid. We scale down the original image by a factor of 2 multiple times to get a series of images, from the large, high resolution original down to small, low resolution copies. We find the best displacement of one filter on top of another starting on the smallest, lowest resolution images and use it to inform where we should search on the higher resolution images. Thus, I create a series of images from high resolution to low resolution for the red, green, and blue filters and use the single scale approach to align the corresponding images. Some new parameters introduced are: what interval we should exhaustively search over, how small the lowest resolution image should be, how much to crop an image by when finding the best alignment, and how we should use the best displacement from a low resolution image to inform where to search at the next level.

Initially, I ran into problems with the last parameter listed above: I wasn't being aggressive enough in how much a best displacement should inform the next exhaustive search interval, and thus the interval would not be large enough to find the best displacement. I began scaling the interval depending on where the best displacement lay in the previous interval, which helped a lot. In addition, I was cropping too much (1/5 of the image size), which led to poor performance. I changed this to 1/8, which also improved my results. Lastly, I decided to scale down until the image was less than 100 pixels by 100 pixels.
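A coarse-to-fine search along these lines might look like the sketch below. It is an illustration under assumptions rather than the project's actual code: the function names, the 2x2 block averaging used for downscaling, and the radii (`coarse_radius=8`, `fine_radius=2`) are hypothetical. It also simplifies one step: it refines within a fixed small window around the doubled estimate instead of scaling the interval based on where the previous best displacement lay. The 1/8 crop and the 100 pixel stopping size follow the text.

```python
import numpy as np

def downscale2(img):
    """Halve each dimension by averaging 2x2 blocks (an assumed resampling choice)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, base, center, radius, frac):
    """Exhaustive SSD search over shifts within `radius` of `center`,
    scoring only the middle of the image (frac cropped from each side)."""
    h, w = base.shape
    my, mx = int(h * frac), int(w * frac)
    ref = base[my:h - my, mx:w - mx]
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            moved = np.roll(channel, (dy, dx), axis=(0, 1))[my:h - my, mx:w - mx]
            score = np.sum((moved - ref) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, base, min_size=100, coarse_radius=8,
                  fine_radius=2, frac=0.125):
    """Return the (dy, dx) shift of `channel` onto `base`, coarse to fine."""
    channel = channel.astype(np.float64)
    base = base.astype(np.float64)
    # Coarsest level: both sides are under min_size, so search widely.
    if max(base.shape) < min_size:
        return search(channel, base, (0, 0), coarse_radius, frac)
    # Otherwise solve the half-resolution problem first...
    dy, dx = align_pyramid(downscale2(channel), downscale2(base),
                           min_size, coarse_radius, fine_radius, frac)
    # ...then double that estimate and refine it at this resolution.
    return search(channel, base, (2 * dy, 2 * dx), fine_radius, frac)
```

Each level only scores a small window of shifts, so a displacement of over 100 pixels at full resolution is found without ever searching a [-100, 100] interval on the full size image.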

The results on the TIF images and their best displacements are shown below.

Harvesters G on B: [54, 17], R on B: [123, 14]	Icon G on B: [41, 17], R on B: [89, 23]
Lady G on B: [52, 9], R on B: [112, 12]	Self Portrait G on B: [79, 29], R on B: [176, 37]
Three Generations G on B: [51, 14], R on B: [110, 11]	Train G on B: [42, 6], R on B: [87, 32]
Turkmen G on B: [55, 20], R on B: [116, 28]	Village G on B: [64, 12], R on B: [137, 22]

I had trouble with the following image, as the red filter would not align with the blue filter using the previous approach. According to our GSI, Tae, the blue channel in this photo has high intensity whereas the red does not. SSD assumes that a high intensity in one filter will correspond to a high intensity in another, and thus misaligns the red filter. Instead, I aligned the red and blue filters against the green filter for a more consistent alignment. This change worked and produced the following result:

Emir B on G: [-49, -24], R on G: [57, 17]
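Switching the reference filter only changes which image stays fixed when the color picture is assembled. As a rough sketch, assuming the displacements are (vertical, horizontal) and are applied with `np.roll` (both assumptions, as is the function name `compose_on_green`), the Emir displacements above would be used like this:

```python
import numpy as np

def compose_on_green(r, g, b, r_shift, b_shift):
    """Shift the red and blue filters onto the fixed green filter and
    stack the three into an H x W x 3 RGB image."""
    r_aligned = np.roll(r, r_shift, axis=(0, 1))
    b_aligned = np.roll(b, b_shift, axis=(0, 1))
    return np.dstack([r_aligned, g, b_aligned])

# Displacements reported for Emir: B on G and R on G.
EMIR_B_ON_G = (-49, -24)
EMIR_R_ON_G = (57, 17)
```

Calling `compose_on_green(r, g, b, EMIR_R_ON_G, EMIR_B_ON_G)` on the three Emir filters would then give the color image, with green left untouched.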

Additional Images

Snow G on B: [5, 0], R on B: [10, -1]	Lilacs G on B: [5, -1], R on B: [9, -3]
Children G on B: [6, 4], R on B: [14, 6]

CS194-26: Image Manipulation and Computational Photography, Fall 2018

Project 1: Images of the Russian Empire

Lavanya Mittal
