CS 194-26: Computational Photography Project 1

Akash Singhal

Project Overview

The goal of this first project was to align the Prokudin-Gorskii tri-color filtered panels so that, when superimposed, they form a single color image. To colorize the Prokudin-Gorskii collection properly, each of the three filtered photos must be extracted and then aligned on top of the others. The real challenge is that the three images do not share exactly the same perspective, so simply stacking them does not work. To generate color photos that are as clear as possible, I superimposed the images at a range of candidate displacements and calculated a metric for each one to evaluate how good a match the superposition is.

Naive Low-Res Algorithm

The first attempt was to align the lower-resolution JPEG images provided. I split the tri-color plate into its three panes and then aligned the red and green panes individually to the blue pane. For each alignment, I used simple for-loop iteration over a 15x15 window of displacements of the top pane (red or green). For each displacement, I calculated the SSD (sum of squared differences) between the shifted pane and the blue pane, and I selected the displacement with the lowest SSD. This method worked reasonably well, but the noise at the borders was severely distorting the SSD metric. I therefore changed the algorithm to use only a small portion at the center of the image when calculating the metric, which keeps the edge noise out of the SSD. After this change, the low-res alignment worked well.
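The search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the blue/green/red top-to-bottom plate order, the use of np.roll for shifting, and the choice of the central 50% as the "small portion of the center" are all assumptions. A radius of 7 gives displacements from -7 to 7, which is one reading of the 15x15 window.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate into three equal panes.
    Assumes the order is blue, green, red from top to bottom."""
    h = plate.shape[0] // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def center_crop(img, frac=0.5):
    """Keep only the central `frac` of the image in each dimension (assumed value)."""
    h, w = img.shape
    dh, dw = int(h * (1 - frac) / 2), int(w * (1 - frac) / 2)
    return img[dh:h - dh, dw:w - dw]

def ssd(a, b):
    """Sum of squared differences, computed in float to avoid uint8 overflow."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def align_naive(pane, ref, radius=7):
    """Try every shift in a (2*radius+1) x (2*radius+1) window and return
    the (dx, dy) with the lowest SSD, scored on the image center only."""
    ref_center = center_crop(ref)
    best, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(pane, (dy, dx), axis=(0, 1))
            score = ssd(center_crop(shifted), ref_center)
            if score < best_score:
                best_score, best = score, (dx, dy)
    return best
```

With panes obtained from split_plate, calling align_naive(red, blue) and align_naive(green, blue) would give the two (x, y) shifts of the kind reported in the results.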

High-Res Algorithm Using Pyramid Optimization

The naive algorithm works well for lower-resolution images, since the number of SSD calculations to be made per displacement is still manageable. The high-resolution .tif images, however, make the naive algorithm almost unusable: each SSD covers far more pixels, and the displacements to be found are much larger than a 15x15 window. To solve this, I implemented a pyramid speedup. Essentially, I made a recursive version of the naive algorithm that first searches on downscaled images and refines the displacement estimate with every increase in resolution until the original resolution is reached:

Rescale the image by 0.5 repeatedly until the desired coarsest level is reached
Search over a 15 x 15 displacement area and find the displacement with the lowest SSD
Scale the estimate by 2 and shift the image at the next level (2x higher resolution) by the scaled estimate
Repeat the search and rescaling until the original-resolution image is reached, then return the final estimate
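The steps above can be sketched as a short recursive function. This is an illustrative sketch under stated assumptions rather than the project's implementation: downscale_half averages 2x2 blocks as a stand-in for whatever rescaling routine was actually used, and the names, the default of 4 levels, and the minimum-size guard are hypothetical.

```python
import numpy as np

def downscale_half(img):
    """Halve the resolution by averaging 2x2 blocks
    (an assumed stand-in for a library rescale by 0.5)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(pane, ref, radius=7):
    """Exhaustive SSD search over a (2*radius+1)^2 window; returns (dx, dy)."""
    best, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            diff = np.roll(pane, (dy, dx), axis=(0, 1)) - ref
            score = np.sum(diff ** 2)
            if score < best_score:
                best_score, best = score, (dx, dy)
    return best

def align_pyramid(pane, ref, levels=4, radius=7):
    """Coarse-to-fine alignment: estimate at half resolution first,
    double that estimate, then refine it with a small search at this level."""
    pane = pane.astype(np.float64)
    ref = ref.astype(np.float64)
    # Base case: coarsest level reached, or image too small to shrink further.
    if levels == 0 or min(pane.shape) < 4 * radius:
        return search(pane, ref, radius)
    cx, cy = align_pyramid(downscale_half(pane), downscale_half(ref),
                           levels - 1, radius)
    cx, cy = 2 * cx, 2 * cy                      # scale the estimate by 2
    shifted = np.roll(pane, (cy, cx), axis=(0, 1))
    rx, ry = search(shifted, ref, radius)        # refine at this resolution
    return cx + rx, cy + ry
```

Because each level only searches a small window around the doubled estimate, the total displacement that can be recovered grows with the number of levels while the cost per level stays fixed.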

This method worked well for some images, but I noticed that images with heavy border noise struggled. I decided to implement a crop method that crops the original color panes by 900px on all 4 sides. The displacement estimation described above is run only on these cropped panes. I then take the final estimate and shift the original, uncropped panes by it to output the final colorized and aligned image.
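A rough sketch of this crop-then-apply step is shown below. It assumes the estimator is passed in as a function, here called estimate_shift, that returns an (x, y) shift; that name, the other function names, and the size check are hypothetical and not taken from the project code. Only the 900px default margin comes from the description above.

```python
import numpy as np

def crop_border(img, margin):
    """Remove `margin` pixels from all four sides."""
    if margin <= 0 or 2 * margin >= min(img.shape[:2]):
        raise ValueError("margin must be positive and leave a non-empty image")
    return img[margin:-margin, margin:-margin]

def align_on_crop(pane, ref, estimate_shift, margin=900):
    """Estimate the shift on border-free crops, then apply it to the full pane.
    `estimate_shift(pane, ref)` is any function returning (dx, dy)."""
    dx, dy = estimate_shift(crop_border(pane, margin), crop_border(ref, margin))
    aligned = np.roll(pane, (dy, dx), axis=(0, 1))
    return aligned, (dx, dy)

def colorize(red, green, blue):
    """Stack three aligned panes into an H x W x 3 RGB image."""
    return np.dstack([red, green, blue])
```

The cropped panes are used only to score candidate displacements; the output image is still built from the full-size panes, so no picture content is lost to the crop.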

Image Results

Naive Algorithm on Low-Res Images

red shift (x, y): [3, 12] green shift (x, y): [2, 5]

red shift (x, y): [2, 3] green shift (x, y): [2, -3]

red shift (x, y): [3, 7] green shift (x, y): [3, 3]

Pyramid Algorithm on High-Res Images: Provided Images

red shift (x, y): [5, 98] green shift (x, y): [2, 34]

red shift (x, y): [34, 52] green shift (x, y): [24, 49]
This photo was the only one that did not align correctly with this algorithm. The issue is that the color panes differ drastically in brightness, so the SSD is large even at the correct displacement and no longer identifies the best match.

red shift (x, y): [14, 123] green shift (x, y): [17, 59]

red shift (x, y): [11, 110] green shift (x, y): [8, 51]

red shift (x, y): [13, 178] green shift (x, y): [10, 82]

red shift (x, y): [37, 108] green shift (x, y): [27, 51]

red shift (x, y): [37, 175] green shift (x, y): [29, 78]

red shift (x, y): [12, 110] green shift (x, y): [14, 50]

red shift (x, y): [32, 85] green shift (x, y): [6, 41]

red shift (x, y): [-12, 105] green shift (x, y): [-1, 53]

Pyramid Algorithm on High-Res Images: Extra Images

red shift (x, y): [7, 84] green shift (x, y): [4, 28]

red shift (x, y): [27, 27] green shift (x, y): [18, 10]

red shift (x, y): [-13, 133] green shift (x, y): [-8, 12]