Project 1: Images of the Russian Empire

Yena Kim - cs194-26-aax

Basic Implementation

Given the red, green, and blue channels, we can overlay them to get the full color image. For small images, it is feasible to search exhaustively over a window of displacements and pick the best one under a given scoring metric. I found the best displacement for both the green and red channels relative to the blue channel. For the following images, I used a window of [-15, 15] pixels and normalized cross-correlation to score each displacement.
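
The search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `ncc` and `align_exhaustive`, the (dy, dx) ordering of the shift, and the use of `np.roll` to apply a shift are all assumptions.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align_exhaustive(channel, reference, radius=15):
    """Try every (dy, dx) shift in [-radius, radius] and keep the best one.

    Hypothetical helper: shifts are applied with np.roll, so pixels that
    leave one side of the image wrap around to the other side.
    """
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(shifted, reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With blue as the reference, one would call something like `align_exhaustive(green, blue)` and `align_exhaustive(red, blue)`, then roll each channel by the returned shift before stacking the three channels.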

cathedral unaligned
cathedral r [12, 3] g [5, 2]
nativity unaligned
nativity r [7, 0] g [3, 1]
monastery unaligned
monastery r [3, 2] g [-3, -2]
settlers unaligned
settlers r [14, -1] g [7, 0]

Pyramid Alignment

For larger images, such as the .tif files, the basic implementation above would take too long, so I used an image pyramid to find the best alignment. I repeatedly scaled the image down by a factor of 2 and used the same metric as before to score the displacements. As I scale back up, I use a smaller and smaller window of displacements, so that when the image is really large I don't need to compare as many alignments. Also, at each level on the way up, I first shift the image by twice the displacement vector found on the downscaled image, so each iteration only has to refine the previous estimate. The images below show the .tif files aligned using this implementation.
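
A coarse-to-fine search of this kind might look like the sketch below. It is only an illustration under several assumptions: the names `search`, `downscale`, and `align_pyramid` are hypothetical, downscaling is done by averaging 2x2 blocks, and the refinement window is a single fixed `fine_radius` rather than one that shrinks level by level.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def search(channel, reference, center, radius):
    """Score every shift within `radius` of `center` and return the best."""
    best_score, best_shift = -np.inf, center
    cy, cx = center
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downscale(img):
    """Halve the resolution by averaging 2x2 blocks (odd edges are trimmed)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid(channel, reference, min_size=64,
                  coarse_radius=15, fine_radius=2):
    """Recursive coarse-to-fine alignment; returns a (dy, dx) shift."""
    if min(channel.shape) <= min_size:
        # Coarsest level: full window search around zero shift.
        return search(channel, reference, (0, 0), coarse_radius)
    dy, dx = align_pyramid(downscale(channel), downscale(reference),
                           min_size, coarse_radius, fine_radius)
    # Double the coarse estimate, then refine it in a small window.
    return search(channel, reference, (2 * dy, 2 * dx), fine_radius)
```

Because each level doubles the estimate from the level below, the small refinement window at full resolution only has to correct the rounding error introduced by downscaling, which is what keeps the search cheap on large images.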

For both implementations, I also cropped 1/15th of the width and height from each side before scoring, so that the borders don't affect the scores. After finding the displacements and aligning all three channels, I apply a second crop to cut off the parts that rolled over to the other side.
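
The two crops can be sketched as below. The helper names `crop_border` and `crop_wrapped` are hypothetical, and the post crop assumes shifts were applied with `np.roll` in (dy, dx) order, so a positive dy wraps rows onto the top and a negative dy wraps rows onto the bottom.

```python
import numpy as np

def crop_border(img, divisor=15):
    """Trim 1/divisor of the height and width from every side."""
    h, w = img.shape[:2]
    dy, dx = h // divisor, w // divisor
    return img[dy:h - dy, dx:w - dx]

def crop_wrapped(img, shifts):
    """Cut off rows and columns that wrapped around for any shift in `shifts`.

    `shifts` is a list of (dy, dx) pairs, one per shifted channel.
    """
    h, w = img.shape[:2]
    top = max([0] + [dy for dy, _ in shifts if dy > 0])
    bottom = max([0] + [-dy for dy, _ in shifts if dy < 0])
    left = max([0] + [dx for _, dx in shifts if dx > 0])
    right = max([0] + [-dx for _, dx in shifts if dx < 0])
    return img[top:h - bottom, left:w - right]
```

In this sketch, `crop_border` would be applied to the channels used for scoring only, while `crop_wrapped` would be applied to the final stacked image with the red and green shifts.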

icon unaligned
icon r [90, 23] g [41, 17]
emir unaligned
emir r [107, 40] g [49, 24]
self_portrait unaligned
self_portrait r [176, 37] g [79, 30]
train unaligned
train r [85, 29] g [41, 0]
village unaligned
village r [137, 21] g [64, 10]
harvesters unaligned
harvesters r [124, 11] g [60, 17]
lady unaligned
lady r [120, 13] g [57, 9]
three_generations unaligned
three_generations r [111, 9] g [54, 12]
turkmen unaligned
turkmen r [117, 28] g [57, 22]

Bells & Whistles

Better Features

At first, I scored displacements by comparing the original channel images to each other. However, I noticed that the channels can differ in brightness, which makes it harder to score the displacements accurately, so I decided to compare the gradients of the images instead. I computed the absolute value of the gradients in both the x and y directions, scored each direction separately, and added the two scores together to get the new scoring metric. As you can see below, this really helped improve the emir.tif image.
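
A gradient-based score along these lines could be written as in the sketch below. The name `gradient_score` is hypothetical, and using `np.gradient` (central differences) as the gradient operator is an assumption, since the write-up does not say which operator was used.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def gradient_score(a, b):
    """Sum of the NCC scores of the absolute x and y gradients."""
    # np.gradient returns the derivative along axis 0 (y) first, then axis 1 (x).
    ay, ax = np.gradient(a.astype(float))
    by, bx = np.gradient(b.astype(float))
    return ncc(np.abs(ax), np.abs(bx)) + ncc(np.abs(ay), np.abs(by))
```

Since gradients ignore a constant brightness offset and normalized cross-correlation ignores a positive scale factor, a score like this is unaffected when one channel is a uniformly brighter or dimmer copy of the other.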

emir unaligned
emir r [0, -1073] g [49, 24]

Additional Images

building unaligned
building r [9, 0] g [1, 0]
room unaligned
room r [15, 3] g [8, 2]
tower unaligned
tower r [124, 0] g [56, 0]
owls unaligned
owls r [90, 84] g [39, 51]