Nitzan Orr Project 1 -- CS 194-26 UC Berkeley

For this project I aligned sets of 3 images of the same scene to create a single color image. The 3 images in each set were originally taken through a Red, Green, or Blue filter in front of the camera, so each one records a single color channel. To combine them, for each pixel in the final image I stacked the values of the 3 corresponding original pixels on top of each other. Together they form a tuple (R, G, B), which a color screen displays as a colored pixel. However, the 3 images were not perfectly aligned, so I had to translate (shift) the channels in X and Y to best align them.
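As a rough illustration of the stacking and shifting described above, here is a minimal sketch in Python with NumPy. The function names `shift` and `stack_rgb` are hypothetical, and the use of `np.roll` (a wrap-around shift) is an assumption suggested by the "Roll (X, Y)" labels in the results below, not a confirmed detail of the original code.

```python
import numpy as np

def shift(channel, dx, dy):
    """Translate a 2-D channel by (dx, dy) pixels, wrapping around the edges.

    Assumption: axis 0 is rows (Y) and axis 1 is columns (X).
    """
    return np.roll(channel, shift=(dy, dx), axis=(0, 1))

def stack_rgb(r, g, b):
    """Combine three aligned H x W channels into one H x W x 3 color image."""
    return np.dstack([r, g, b])
```

For example, `stack_rgb(shift(r, rx, ry), shift(g, gx, gy), b)` would build the color image if blue is the reference channel, which seems to be the case here since shifts are only reported for green and red.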

The first method I tried was Sum of Squared Differences (SSD), looking for the shift with the smallest difference between the image channels. However, it wasn't invariant to differences in brightness between the 3 channels, so it was inaccurate for some images. Next, I tried Normalized Cross-Correlation (NCC). That was successful because it normalizes each color channel before comparing it to the others. I would then essentially slide each of the other channels over the "reference" channel to find the shift with the highest correlation. It wasn't perfect yet because the images had borders, so I made sure to look at only the center of each image in order to exclude the noisy, unhelpful borders when finding the best correlation. At that point, I had the best alignment of each of the 2 channels relative to the reference channel.
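The two metrics and the sliding search can be sketched as follows. This is only an illustration under assumptions: the names `ssd`, `ncc`, and `align_exhaustive`, the search `radius`, and the `margin` fraction used to skip the borders are all hypothetical choices, and the NCC shown is the common zero-mean, unit-norm variant.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: lower is better, sensitive to brightness."""
    return np.sum((a - b) ** 2)

def ncc(a, b):
    """Normalized cross-correlation: higher is better, ignores gain and offset."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return np.sum(a * b) / denom if denom > 0 else 0.0

def align_exhaustive(channel, reference, radius=15, margin=0.2):
    """Try every shift in [-radius, radius]^2 and return the best (dx, dy).

    Only the central region (borders of `margin` * size removed) is scored.
    """
    h, w = reference.shape
    top, left = int(h * margin), int(w * margin)
    ref_center = reference[top:h - top, left:w - left]
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            moved = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(moved[top:h - top, left:w - left], ref_center)
            if score > best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift
```

Because `ncc` subtracts the mean and divides by the norm, a channel that is uniformly brighter or darker than the reference still scores close to 1 at the correct shift, whereas `ssd` would penalize it.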

However, simply sliding images proved to be slow, especially once there were hundreds of thousands of pixels in an image. So I then used an image pyramid to make coarse alignment adjustments while the image was scaled down, working my way up to the original size of the image. At each step of the image pyramid my alignment became finer and finer, and the last step needed only a final adjustment of a few pixels in the X and Y directions. Although for some images the total shift was close to 200 pixels, most of the translation could be calculated on a smaller image within the image pyramid, thus saving time.
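The coarse-to-fine idea can be sketched like this. It is a simplified illustration, not the original implementation: the names `search`, `downsample`, and `align_pyramid`, the 2x2 averaging used for downscaling, and the parameters `min_size`, `radius`, and `coarse_radius` are all assumptions, and the border cropping is left out to keep the example short.

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation (higher is better)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return np.sum(a * b) / denom if denom > 0 else 0.0

def search(channel, reference, center, radius):
    """Best (dx, dy) within `radius` pixels of the starting guess `center`."""
    cx, cy = center
    best_score, best = -np.inf, center
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best = score, (dx, dy)
    return best

def downsample(img):
    """Halve the resolution by averaging each 2x2 block of pixels."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid(channel, reference, min_size=32, radius=2, coarse_radius=8):
    """Estimate (dx, dy) coarse-to-fine: wide search when small, refine on the way up."""
    if min(reference.shape) <= min_size:
        return search(channel, reference, (0, 0), coarse_radius)
    dx, dy = align_pyramid(downsample(channel), downsample(reference),
                           min_size, radius, coarse_radius)
    # A shift of 1 pixel at the coarser level is 2 pixels at this level.
    return search(channel, reference, (2 * dx, 2 * dy), radius)
```

With this structure, only the smallest level pays for a wide search window; every larger level checks just a few candidate shifts around the doubled estimate from the level below.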

I ran into trouble at certain image scales, where the center area I was analyzing was too large compared to the current size of the image. To mitigate that issue, I cropped out differently-sized center areas depending on the size of the image at a given scale. As the scale decreased, so did the center portion of the image I was analyzing. As stated above, that was done in order to exclude the borders of the images.
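One way to make the analyzed center area follow the image size is to crop by a fraction of the current dimensions rather than by a fixed pixel count. The sketch below is an assumption about how that could look; the name `center_crop`, the `keep` fraction, and the `min_side` floor are hypothetical and not taken from the original code.

```python
import numpy as np

def center_crop(img, keep=0.5, min_side=8):
    """Return the central region of `img`, sized relative to the image itself.

    `keep` is the fraction of each dimension to retain; `min_side` stops the
    crop from collapsing at very small pyramid levels, and the crop is never
    larger than the image.
    """
    h, w = img.shape[:2]
    ch = min(h, max(min_side, int(h * keep)))
    cw = min(w, max(min_side, int(w * keep)))
    top, left = (h - ch) // 2, (w - cw) // 2
    return img[top:top + ch, left:left + cw]
```

Because the crop is a fraction of the current size, it shrinks together with the image at each pyramid level and always fits inside it.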

cathedral.jpg (341, 390, 3) Green Roll (X, Y) 2.0 5.0 Red Roll (X, Y) 3.0 12.0

emir.tif (3209, 3702, 3) Green Roll (X, Y) 22.0 49.0 Red Roll (X, Y) 41.0 100.0

harvesters.tif (3218, 3683, 3) Green Roll (X, Y) 16.0 60.0 Red Roll (X, Y) 13.0 124.0

icon.tif (3244, 3741, 3) Green Roll (X, Y) 17.0 41.0 Red Roll (X, Y) 23.0 89.0

lady.tif (3212, 3761, 3) Green Roll (X, Y) 3.0 55.0 Red Roll (X, Y) 7.0 103.0

melons.tif (3241, 3770, 3) Green Roll (X, Y) 5.0 82.0 Red Roll (X, Y) 11.0 177.0

monastery.jpg (341, 391, 3) Green Roll (X, Y) 2.0 -3.0 Red Roll (X, Y) 2.0 3.0

onion_church.tif (3215, 3781, 3) Green Roll (X, Y) 24.0 52.0 Red Roll (X, Y) 35.0 109.0

self_portrait.tif (3251, 3810, 3) Green Roll (X, Y) 26.0 78.0 Red Roll (X, Y) 34.0 175.0

three_generations.tif (3209, 3714, 3) Green Roll (X, Y) 9.0 53.0 Red Roll (X, Y) 9.0 110.0

tobolsk.jpg (341, 396, 3) Green Roll (X, Y) 3.0 3.0 Red Roll (X, Y) 3.0 7.0

train.tif (3238, 3741, 3) Green Roll (X, Y) 1.0 42.0 Red Roll (X, Y) 30.0 90.0

village.tif (3270, 3819, 3) Green Roll (X, Y) 10.0 65.0 Red Roll (X, Y) 21.0 137.0

workshop.tif (3209, 3741, 3) Green Roll (X, Y) -1.0 53.0 Red Roll (X, Y) -13.0 100.0

Here are some images from the Library of Congress that I picked out myself. Please enjoy!

gate.jpg (341, 398, 3) Green Roll (X, Y) 1.0 6.0 Red Roll (X, Y) 0.0 14.0

hut.jpg (341, 399, 3) Green Roll (X, Y) -1.0 4.0 Red Roll (X, Y) -1.0 11.0

mosque.jpg (341, 395, 3) Green Roll (X, Y) 3.0 6.0 Red Roll (X, Y) 5.0 14.0