Name: Praveen Batra

Overview

In this project, we had images taken with separate color channels (R, G, B) and had to align them to create a color image.

To do this, I used two different approaches. For images 512 pixels wide or narrower, I aligned the channels directly at a single scale with a search range of 15. For images wider than 512 pixels, I used an image pyramid with a search range of 6 at each level, with the scale increasing by a factor of 2 from one level to the next. The number of levels was a function of the logarithm of the image size.
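As a rough illustration of how the number of levels could depend on the logarithm of the image size, here is a minimal sketch. The exact formula is not given above, so the rule below (one extra level per factor of 2 above 512 pixels) and the name `num_pyramid_levels` are assumptions, not the project's actual code.

```python
import math

def num_pyramid_levels(width, base_width=512):
    """Hypothetical rule: one level for small images, plus one level
    for every factor of 2 by which the width exceeds base_width."""
    if width <= base_width:
        return 1  # single-scale alignment, no pyramid needed
    return math.ceil(math.log2(width / base_width)) + 1

print(num_pyramid_levels(400))   # small image: single scale
print(num_pyramid_levels(3700))  # large scan: several levels
```

With a rule like this, the coarsest level of the pyramid is never wider than 512 pixels, which keeps the small per-level search range cheap.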

The actual aligning process was fairly straightforward. The image's three "channels" were extracted, and red and green were each aligned to blue separately. To pick an alignment, I varied the x and y offsets by +/- the search range and chose the offset that minimized the L2 error between the normalized edge maps of the two "channels" (monochrome images).
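The exhaustive search described above can be sketched as follows. This is a minimal version under assumptions: the function names are made up for illustration, and it compares whatever arrays it is given, so in practice the inputs would be the normalized edge maps rather than raw intensities.

```python
import numpy as np

def l2_error(a, b):
    """Sum of squared differences between two equally sized arrays."""
    return float(np.sum((a - b) ** 2))

def best_offset(moving, reference, search_range=15):
    """Try every (dy, dx) in [-search_range, search_range] and return
    the shift of `moving` that gives the lowest L2 error."""
    best, best_err = (0, 0), np.inf
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            err = l2_error(shifted, reference)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

# Usage: recover a known shift on synthetic data.
rng = np.random.default_rng(0)
blue = rng.random((64, 64))
green = np.roll(blue, (-3, 2), axis=(0, 1))  # pretend green is displaced
print(best_offset(green, blue, search_range=5))
```

The returned pair is in (Y, X) order, matching the offset format used in the results.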

The edge map was calculated by subtracting from the image a slightly shifted copy of itself. Before computing the edge map, I normalized each channel by its mean and standard deviation. These two transforms were necessary so that even where the image had different intensities in different channels (e.g. the blue shirt of the emir), the channels would be aligned based on edges/shapes, which are invariant across color channels.
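A minimal sketch of these two transforms is below. The function names are hypothetical; the default shift of (1, 2) in YX order is the one mentioned in the bells and whistles section.

```python
import numpy as np

def normalize(channel):
    """Zero-mean, unit-variance version of a channel."""
    channel = channel.astype(np.float64)
    std = channel.std()
    # Guard against a perfectly flat channel (std of 0).
    return (channel - channel.mean()) / (std if std > 0 else 1.0)

def edge_map(channel, shift=(1, 2)):
    """Primitive edge map: the image minus a shifted copy of itself."""
    return channel - np.roll(channel, shift, axis=(0, 1))

# Usage: a brighter, higher-contrast copy yields the same features.
rng = np.random.default_rng(0)
img = rng.random((32, 32))
a = edge_map(normalize(img))
b = edge_map(normalize(2.0 * img + 5.0))
print(np.allclose(a, b))
```

Normalization removes overall brightness and contrast differences between channels, and the difference image responds to local changes rather than to absolute intensity.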

One other tactic was to crop out the edges (1/6 of the image width/height on each side) after shifting the image. This removed both the troublesome borders and any potential wrap-around artifacts from np.roll.
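The cropping step could look like the sketch below. The helper name `crop_borders` is an assumption; it uses integer division so the crop widths are exact.

```python
import numpy as np

def crop_borders(channel, divisor=6):
    """Drop 1/divisor of the height and width from each side."""
    h, w = channel.shape[:2]
    dy, dx = h // divisor, w // divisor
    return channel[dy:h - dy, dx:w - dx]

# Usage: shift first, then crop, so wrapped-around pixels are discarded.
img = np.arange(60 * 120, dtype=np.float64).reshape(60, 120)
shifted = np.roll(img, (5, -7), axis=(0, 1))
print(crop_borders(shifted).shape)
```

Because np.roll wraps pixels around to the opposite side, cropping after the shift keeps those wrapped pixels out of the error computation as long as the shift is smaller than the cropped margin.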

Then, for the image pyramid, the ideal offset calculated at one level would be scaled up to the next level's resolution and used as the starting point for that level's search.
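The coarse-to-fine procedure can be sketched as a short recursion. This is an illustration under assumptions: the function names are hypothetical, downsampling is done by simply keeping every second pixel (the writeup does not say how images were resized), and the normalization, edge map, and cropping steps are left out to keep the example short.

```python
import numpy as np

def search(moving, reference, center, search_range):
    """Exhaustive L2 search in a window around `center` (dy, dx)."""
    best, best_err = center, np.inf
    for dy in range(center[0] - search_range, center[0] + search_range + 1):
        for dx in range(center[1] - search_range, center[1] + search_range + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            err = float(np.sum((shifted - reference) ** 2))
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def downsample(img):
    """Halve the resolution by decimation (an assumed, simple choice)."""
    return img[::2, ::2]

def pyramid_align(moving, reference, levels, search_range=6):
    """Align at the coarsest level first, then refine level by level."""
    if levels <= 1:
        return search(moving, reference, (0, 0), search_range)
    coarse = pyramid_align(downsample(moving), downsample(reference),
                           levels - 1, search_range)
    # Offsets double when the resolution doubles.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(moving, reference, center, search_range)

# Usage: a shift far outside a single +/-6 window is still recovered.
rng = np.random.default_rng(0)
blue = rng.random((64, 64))
red = np.roll(blue, (-20, 12), axis=(0, 1))
print(pyramid_align(red, blue, levels=3))
```

Each level only searches a small window, yet the total reachable offset grows with the number of levels, which is why a search range of 6 is enough for the large scans.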

Results on example images

All offsets are in the form [Green Y, Green X], [Red Y, Red X].

Workshop: [53, -1] [105, -13]


Emir: [49, 23] [120, 49]


Monastery: [-3, 2] [3, 2]


Church: [25, 4] [58, -4]


Three generations: [53, 13] [113, 10]


Melons: [51, 27] [108, 36]


Onion church: [80, 10] [178, 13]


Train: [40, 8] [85, 33]


Tobolsk: [3, 3] [7, 3]


Icon: [40, 17] [89, 23]


Cathedral: [5, 2] [12, 3]


Self portrait: [81, 30] [176, 37]


Harvesters: [60, 16] [124, 12]


Lady: [56, 9] [119, 13]


Results on other images

Here is a result on a JPG from the Library of Congress: [-3, 1] [-4, 1]


Here is a result on a TIFF from the Library of Congress: [40, 33] [94, 59]


Failed images

The emir image is slightly off despite the normalization and edge detection. I think that's a sign that my features are still not ideal for this image, which is difficult because the blue shirt has very different intensities across channels. However, the alignment is not terrible: the misalignment is mostly visible in fine features like the eyebrows, so the features still produced a fairly decent result.

Bells and whistles

The main bell/whistle I did was using the primitive edge map (subtracting from the image a copy of itself shifted by YX of (1, 2)) and normalizing the image by mean and standard deviation before feeding it into the edge map function. The edge maps of two channels are then compared via the standard L2 norm error. Here's a before and after for the emir.

Without the edge map and normalization:


With normalization only:


With the edge map only:


With both:


And for more bells and whistles:

(source: https://upload.wikimedia.org/wikipedia/commons/b/b7/Parts_of_a_Bell.svg)


(source: https://upload.wikimedia.org/wikipedia/commons/3/37/Bird_shaped_whistle.jpg)