CS 194-26 Fall 2020, Project 1: Colorize Me!

By: Vincent Lyau

Introduction

In this project, glass plate negatives taken by photographer Sergei Mikhailovich Prokudin-Gorskii are aligned to create colorized images. Prokudin-Gorskii recorded three exposures of each scene, each through a different color filter (red, green, or blue). Combining these exposures lets us see the original scene in color. The main obstacle is that the exposures are slightly offset and misaligned relative to one another, so naively stacking the three color layers on top of each other does not produce a quality image.

Methodology

The project specification contained many very helpful hints and tricks, so I initially followed it to the letter. The first part of the project was a single-scale implementation that worked well for the smaller images (the .jpg files). Here, the primary implementation tasks were coding the heuristics and setting up the exhaustive search. A major decision point was which of the two heuristics provided in the spec to use: sum of squared differences (SSD) or normalized cross-correlation (NCC). I experimented with both and found that, on the smaller images, they could produce dramatically different results, but neither was consistently superior.
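As a minimal sketch of this single-scale search (not the code used for the results in this report), the two heuristics and the exhaustive loop could look like the following. The function names, the use of np.roll for shifting, and the mean subtraction inside NCC are my assumptions for illustration; the spec's exact NCC definition may differ.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means a better match."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    return float(np.sum((a - b) ** 2))

def ncc(a, b):
    """Normalized cross-correlation; higher means a better match.

    Assumption: each image is mean-centered before normalizing.
    """
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def exhaustive_align(moving, reference, radius=15, metric="ssd"):
    """Try every (dy, dx) in [-radius, radius]^2 and return the best shift."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll shifts circularly, so pixels wrap around the border.
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            if metric == "ssd":
                score = ssd(shifted, reference)
            else:
                score = -ncc(shifted, reference)  # negate so lower is better
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The returned (dy, dx) is the shift to apply to the moving channel, for example with np.roll, so that it lines up with the reference channel.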

In fact, I noticed that the aligned images were not very clear at all, so I took the next step of applying a fixed crop before computing the heuristics. This dramatically improved the results and caused SSD and NCC to return exactly the same alignments for all the small images. I chose to crop 15 pixels off every side of the images, matching the [-15, 15] search range that I used for the exhaustive search on the project spec's advice.
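The fixed crop can be sketched as a small helper. This is an illustrative version only: the name crop_border is hypothetical, and the usage below assumes channels are shifted circularly with np.roll. Under that assumption, a crop as wide as the largest shift guarantees that no wrapped-around pixels enter the score.

```python
import numpy as np

def crop_border(img, margin=15):
    """Return img with `margin` pixels removed from every side."""
    h, w = img.shape[:2]
    if h <= 2 * margin or w <= 2 * margin:
        raise ValueError("image is too small for this margin")
    return img[margin:h - margin, margin:w - margin]

# Usage: shift a channel by the largest allowed offset, then crop.
channel = np.arange(40 * 40, dtype=np.float64).reshape(40, 40)
shifted = np.roll(channel, 15, axis=0)      # rows 0..14 now hold wrapped pixels
interior = crop_border(shifted, margin=15)  # the wrapped rows are discarded
```

The size check also hints at why a fixed crop can cause trouble on very small images: once an image is no larger than twice the margin, nothing is left to compare.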

I then implemented the image pyramid approach to tackle the much larger .tif images. The primary technical challenge was understanding what the approach meant at all; after following Piazza comments made by students and TAs alike, I grasped the concept. I decided to use a scale factor of 2 and a depth of 5 for my pyramid, based on the observation that this combination achieved good results on the majority of the images. I tried other scale factors, but saw either no real improvement or a substantial slowdown. Increasing the depth too much would sometimes cause problems because I was still applying the fixed crop. Additionally, at the deepest layer, I chose a default search range of [-30, 30], which is larger than the search range from before. This noticeably improved accuracy at minimal additional time cost.
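A coarse-to-fine search along these lines can be sketched as follows. This is a sketch under stated assumptions rather than my exact implementation: downscaling by 2x2 block averaging, scoring with SSD, treating the deepest layer as the coarsest one, and the refinement radius of 2 at the finer levels are all choices made for illustration, and the function names are hypothetical. Only the scale factor of 2, the depth of 5, and the [-30, 30] range at the deepest layer come from the description above.

```python
import numpy as np

def downscale2(img):
    """Halve each dimension by averaging 2x2 blocks (odd edges are dropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def best_shift(moving, reference, center, radius):
    """Search shifts within `radius` of `center` and return the lowest-SSD one."""
    cy, cx = center
    best, best_score = center, np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best

def pyramid_align(moving, reference, depth=5, coarse_radius=30, refine_radius=2):
    """Estimate the shift at the coarsest level, then refine it level by level."""
    levels = [(moving.astype(np.float64), reference.astype(np.float64))]
    for _ in range(depth - 1):
        m, r = levels[-1]
        levels.append((downscale2(m), downscale2(r)))

    shift = (0, 0)
    for i, (m, r) in enumerate(reversed(levels)):
        radius = coarse_radius if i == 0 else refine_radius
        # A shift found at one level is twice as large at the next finer level.
        center = (2 * shift[0], 2 * shift[1])
        shift = best_shift(m, r, center, radius)
    return shift
```

Because np.roll wraps around, the coarsest image should be comfortably larger than twice the coarse search radius; otherwise different shifts become indistinguishable. The wide search happens only on the smallest image, which is why enlarging it costs little time.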

A very interesting note here is that I was still alternating between SSD and NCC as my heuristic. However, upon reaching emir.tif, I decided to go with SSD. With NCC, the red layer of emir.tif appeared significantly offset from the blue and green layers. As mentioned in the project specification, I believe this is due to the brightness differences between the layers. SSD, however, performs excellently and has no problems with emir.tif. This is of significant interest to me because I am not entirely sure why SSD would handle the brightness differences any better. I suspect that at the smallest, coarsest version of emir.tif, SSD manages to find the correct displacement range because the differences there are coarser, and this choice then ripples down through the pyramid toward the right final displacement. Besides emir.tif, SSD and NCC performed approximately the same, with minor differences.