CS 194: Computational Photography, Spring 2020

Project 1

Abby Cohn

onion_church.tif


Overview

In this project, the goal was to colorize Prokudin-Gorskii's photographs. Before color photography existed, he took three pictures of the same scene through red, green, and blue filters. Our task was to align the three channels to produce one color photograph. I used an image matching metric to score candidate displacements and built an image pyramid to find the best displacements for larger images.

Naive Alignment Implementation

First, I implemented the image alignment by searching over a [-15, 15] window of displacements along each axis. I used the Sum of Squared Differences (SSD) as my image matching metric: SSD(A, B) = sum over all pixels (i, j) of (A[i, j] - B[i, j])^2.

When aligning one channel to another, the smaller the SSD, the more similar the two images are. I kept track of the lowest SSD value and the displacement that produced it, then used the numpy roll function to shift the pixels accordingly. I aligned green to blue first, then red to blue. Below are my results from running this implementation on the smaller images from the photo collection.

tobolsk.jpg Offset G:(3, 2) R:(6, 3)
monastery.jpg Offset G:(-7, 0) R:(8, 1)
cathedral.jpg Offset G:(1, -1) R:(7, -1)
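
The exhaustive search described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function names `ssd` and `align_naive`, the `radius` parameter, and the (row, column) ordering of the returned offset are all assumptions. Note that `np.roll` wraps pixels around the border, so the wrapped pixels also contribute to the score.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; a smaller value means a closer match."""
    diff = a.astype(float) - b.astype(float)  # avoid integer wraparound
    return float(np.sum(diff ** 2))

def align_naive(channel, reference, radius=15):
    """Try every (dy, dx) in [-radius, radius]^2 and keep the lowest SSD.

    Hypothetical helper: returns the shift that, applied to `channel`
    with np.roll, best matches `reference`.
    """
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

# Synthetic check: displace a random "blue" channel and recover the shift.
rng = np.random.default_rng(0)
blue = rng.random((64, 64))
green = np.roll(blue, (-3, 2), axis=(0, 1))
shift = align_naive(green, blue)
aligned_green = np.roll(green, shift, axis=(0, 1))
```

In this synthetic case the recovered shift undoes the displacement exactly, because rolling back by (3, -2) reproduces the reference and gives an SSD of zero. Real scans will not match perfectly, so the minimum is only the best available estimate.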

Image Pyramid Implementation

While the naive implementation worked well on smaller images, searching exhaustively over larger images was very slow. To speed up the search for the best offset, I recursively searched over smaller scales of the same image, refining the displacement estimate from level to level. Starting from the smallest (coarsest) level, I found a displacement using the SSD again. I then carried that displacement to the next level, where the image (resized with the skimage rescale function) is twice as large, and used it to offset the displacement search window and find a refined estimate. Repeating this process until the largest (finest) level was reached proved to be a much more efficient way to align large images.

At first, my image pyramid took about 2.5 minutes per .tif file. To optimize my code further, I shrank the displacement search window for the later (finer) levels of the pyramid. Since the coarser levels had already narrowed down the alignment, only a very small displacement window needed to be tested in the later iterations. My revised implementation took about 30 seconds per image.
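
One way the coarse-to-fine search with a shrinking window could look is sketched below. It is an illustration under several assumptions, not the project's code: the names `search`, `downsample`, and `align_pyramid` are hypothetical, the values of `min_size`, `coarse_radius`, and `fine_radius` are made up, and 2x2 block averaging stands in for the skimage rescale function so that the example only needs NumPy.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    return float(np.sum((a - b) ** 2))

def search(channel, reference, center, radius):
    """Best (dy, dx) within `radius` of `center`, scored by SSD."""
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = ssd(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks.

    Stand-in (assumption) for resizing with skimage's rescale at scale 0.5.
    """
    h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, reference, min_size=32,
                  coarse_radius=15, fine_radius=2):
    """Recursive coarse-to-fine alignment (hypothetical parameters)."""
    if min(channel.shape) <= min_size:
        # Coarsest level: full window around zero displacement.
        return search(channel, reference, (0, 0), coarse_radius)
    dy, dx = align_pyramid(downsample(channel), downsample(reference),
                           min_size, coarse_radius, fine_radius)
    # One pixel at the coarser level spans two pixels here, so the
    # estimate is doubled and refined inside a small window.
    return search(channel, reference, (2 * dy, 2 * dx), fine_radius)

# Synthetic check on a smooth image made of two Gaussian blobs.
yy, xx = np.mgrid[0:128, 0:128]
blue = (np.exp(-((yy - 40) ** 2 + (xx - 70) ** 2) / 200.0)
        + 0.5 * np.exp(-((yy - 90) ** 2 + (xx - 30) ** 2) / 400.0))
green = np.roll(blue, (-12, 8), axis=(0, 1))
shift = align_pyramid(green, blue)
```

With these assumed settings the coarsest level tests 31 x 31 = 961 displacements on a small image, while each finer level tests only 5 x 5 = 25, which is where the savings over a full-window search at every level would come from.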


three_generations.tif Offset G:(52, 5) R:(108, 7)
melons.tif Offset G:(83, 4) R:(176, 7)

lady.tif Offset G:(57, -6) R:(123, -17)
icon.tif Offset G:(42, 16) R:(89, 22)

self_portrait.tif Offset G:(-30, 27) R:(-40, 35)
onion_church.tif Offset G:(52, 22) R:(108, 35)

train.tif Offset G:(111, -7) R:(107, 1)
village.tif Offset G:(79, -7) R:(184, -14)

While most of the larger images lined up nicely, some were still not completely aligned. For example, I was not able to align the emir.tif file because I did not implement a more robust metric, such as one based on edge detection, to account for the variation in brightness. Some other alignment issues could be solved by cropping a predefined margin, as exemplified below with the harvesters.tif file.


uncropped
harvesters.tif Offset G:(118, -3) R:(120, 7)
cropped
harvesters_cropped.tif Offset G:(-56, 16) R:(-108, 13)

emir.tif Offset G:(45, -2) R:(45, -5)
workshop.tif Offset G:(53, -5) R:(69, -16)
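
The two remedies mentioned in this section, cropping a predefined margin and scoring on edges instead of raw brightness, could be sketched as follows. Neither function is taken from the project: `crop_margin`, `edge_map`, `edge_score`, and the default `fraction` are hypothetical, and a finite-difference gradient magnitude is only one simple choice of edge measure (the write-up states that no edge-based metric was implemented).

```python
import numpy as np

def crop_margin(img, fraction=0.1):
    """Drop a fixed fraction of rows and columns from every side."""
    dy = int(img.shape[0] * fraction)
    dx = int(img.shape[1] * fraction)
    return img[dy:img.shape[0] - dy, dx:img.shape[1] - dx]

def edge_map(img):
    """Gradient magnitude from finite differences.

    Adding a constant brightness offset to `img` leaves this unchanged.
    """
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gy, gx)

def edge_score(channel, reference, fraction=0.1):
    """SSD between edge maps, computed on the cropped interior only."""
    a = crop_margin(edge_map(channel), fraction)
    b = crop_margin(edge_map(reference), fraction)
    return float(np.sum((a - b) ** 2))

# A channel that differs from the reference only by a brightness offset
# scores (near) zero on edges, although its raw SSD would be large.
rng = np.random.default_rng(1)
img = rng.random((40, 80))
brighter = img + 0.5
raw_ssd = float(np.sum((img - brighter) ** 2))
edges_ssd = edge_score(img, brighter)
```

Scoring only the cropped interior also keeps the dark plate borders, and any pixels that np.roll wraps around the edges, from dominating the comparison, provided the margin is larger than the displacement being tested.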

Additional Photos From The Collection


chapel.jpg Offset G:(0, 1) R:(2, 2)
pinkhus.tif Offset G:(-27, -24) R:(-50, -56)
eggs.tif Offset G:(19, 0) R:(110, 2)