CS 194 Fall 2020

Project 1: Colorizer

Andrew Aikawa

Overview

The goal of this project was to create a single color image from three photo negatives, each taken through a red, green, or blue filter. To do this, the three negatives have to be aligned. I chose green as the reference channel that the other two colors are aligned to. The reason is that the green absorption spectrum overlaps well with both red and blue, which reduces the likelihood that regions of the green template have a significantly different exposure from the other color negatives. This choice made the algorithm work on all the example images, whereas other choices would fail on the 'emir.tif' image because the red and blue negatives differ so much from each other, a consequence of their different absorption spectra.

Process

Alignment is done pairwise: I find the pixel shift that best aligns either the red or the blue negative to the green one. As the similarity metric I used Mean Squared Error (MSE), the average of the squared differences between each pixel of the shifted image and the corresponding pixel of the green template. Initially, I tested every pixel shift on the Cartesian grid [-15, 15] x [-15, 15] and took the one with the lowest MSE. However, this doesn't scale well, since calculating the MSE for large images is expensive.

To remedy this, I used an image pyramid. This coarsens the search by scaling down the two images being compared by a factor of 2 at each level of the pyramid, which makes each comparison cheaper. Once I have an estimate of the shift at one level, I move up to the next higher-resolution level and scale the estimate up by 2, since adjacent levels differ by a factor of 2. The estimate is refined in this way at successively higher resolutions until the actual resolution of the image is reached. This makes it possible to search over a smaller grid at each level, which I chose to be [-4, 4] x [-4, 4] after implementing the search pyramid. I calculated the number of pyramid levels (arbitrarily) from the following equation:

number of pyramid levels = floor(log(# of pixels + 1) / 4) + 1
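
The coarse-to-fine search described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names are hypothetical, downscaling is done by averaging 2x2 blocks, shifts are applied with np.roll (which wraps pixels around the border), and the logarithm in the level formula is assumed to be the natural log. Shifts are returned in NumPy order as (rows, columns), i.e. (vertical, horizontal).

```python
import math
import numpy as np


def mse(a, b):
    """Mean squared error between two equally sized arrays."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.mean(diff ** 2))


def search(image, template, center=(0, 0), radius=4):
    """Try every shift within `radius` of `center`; keep the lowest-MSE one."""
    best_shift, best_err = center, math.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            # Assumption: wrap-around shifting; border handling is ignored.
            shifted = np.roll(image, (dy, dx), axis=(0, 1))
            err = mse(shifted, template)
            if err < best_err:
                best_shift, best_err = (dy, dx), err
    return best_shift


def num_levels(n_pixels):
    """Level formula from the text, assuming a natural logarithm."""
    return math.floor(math.log(n_pixels + 1) / 4) + 1


def downscale(image):
    """Halve the resolution by averaging 2x2 blocks (one possible choice)."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    img = image[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0


def pyramid_align(image, template, levels=None, radius=4):
    """Estimate the (dy, dx) shift that aligns `image` to `template`."""
    if levels is None:
        levels = num_levels(image.size)
    # Build the pyramid from full resolution down to the coarsest level.
    pyramid = [(image, template)]
    for _ in range(levels - 1):
        img, tpl = pyramid[-1]
        pyramid.append((downscale(img), downscale(tpl)))
    # Start at the coarsest level; double the estimate at each finer level.
    shift = (0, 0)
    for img, tpl in reversed(pyramid):
        shift = (2 * shift[0], 2 * shift[1])
        shift = search(img, tpl, center=shift, radius=radius)
    return shift
```

With a [-4, 4] window at every level, L levels can recover shifts of up to 4 * (2^L - 1) pixels in each direction, so five levels cover shifts of up to 124 pixels while evaluating only 81 candidates per level.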

Aligned Example Images

Below are the colorized results. Each image is captioned with the shifts applied to the blue and red channels, written as blue(xb, yb) : red(xr, yr), where (xb, yb) are the horizontal and vertical pixel shifts for the blue channel and (xr, yr) are those for the red channel.




blue(3, 33) : red(1, -64)
blue(4, 4) : red(-1, -7)
blue(24, 49) : red(-18, -57)
blue(16, 60) : red(2, -65)
blue(18, 40) : red(-6, -49)
blue(8, 55) : red(-1, -62)
blue(12, 82) : red(-3, -97)
blue(4, -3) : red(-2, -6)
blue(27, 50) : red(-10, -59)
blue(30, 79) : red(-6, -99)
blue(14, 52) : red(4, -59)
blue(0, 3) : red(0, -3)
blue(6, 41) : red(-24, -46)
blue(0, 51) : red(10, -53)
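
To make the caption notation concrete, here is a small sketch of how a pair of (x, y) shifts could be applied to build the color image. It is illustrative only: the function name is hypothetical, and the sign convention (positive x moves a channel right, positive y moves it down) and the wrap-around behavior of np.roll are assumptions rather than details stated above.

```python
import numpy as np


def compose_rgb(red, green, blue, red_shift, blue_shift):
    """Shift red and blue by their (x, y) offsets and stack into H x W x 3."""
    def shift(channel, xy):
        x, y = xy
        # NumPy axes are (rows, columns), so the (x, y) pair is swapped here.
        return np.roll(channel, (y, x), axis=(0, 1))

    # Green is the reference channel and is left unshifted.
    return np.dstack([shift(red, red_shift), green, shift(blue, blue_shift)])
```

For example, under these assumptions the caption blue(4, 4) : red(-1, -7) would correspond to compose_rgb(red, green, blue, red_shift=(-1, -7), blue_shift=(4, 4)).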

Extra Colorized Images


blue(-6, 59) : red(16, -71)
blue(-18, 88) : red(32, -96)
blue(17, 17) : red(-13, -73)