Project 1



Overview:

In this project, we learned how to produce a color image by aligning its three color channel images (B, G, and R) and stacking them on top of each other.

Approach (Small Images):

I first cropped 1/6 off each side of each channel image to remove the borders, which could otherwise skew the comparison between channels. To align G and R to the B channel image, I searched for the best displacement using the sum of squared differences (SSD), where a displacement means rolling (circularly shifting) the image. I tried every displacement in the range [-15, 15] along both the x and y axes, and the optimal displacement was the one that returned the smallest SSD value. After finding the optimal displacement vector (x, y) for G and for R, I rolled the un-cropped G and R images by those vectors and stacked the rolled images on top of B.
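The search described above can be sketched roughly as follows. This is a minimal illustration rather than the exact code used for the project: the function names (crop_border, align_ssd) are hypothetical, the channels are assumed to be 2-D NumPy arrays of equal shape, and the returned shift is assumed to be ordered (rows, columns).

```python
import numpy as np

def crop_border(img, frac=1 / 6):
    """Drop `frac` of the height and width from each side of a 2-D image."""
    h, w = img.shape
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    return np.sum((a.astype(np.float64) - b.astype(np.float64)) ** 2)

def align_ssd(moving, reference, radius=15):
    """Return the (row, col) roll of `moving` that best matches `reference`.

    Both images are cropped first so the borders do not affect the score.
    Every shift in [-radius, radius] on both axes is tried.
    """
    mov = crop_border(moving)
    ref = crop_border(reference)
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            score = ssd(np.roll(mov, shift=(dy, dx), axis=(0, 1)), ref)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

# Assumed usage: apply the shift found on the cropped images
# to the un-cropped channel before stacking.
# g_shift = align_ssd(g, b)
# g_aligned = np.roll(g, shift=g_shift, axis=(0, 1))
```

The shift is found on the cropped images but applied to the un-cropped channel, matching the description above.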

Approach (Big Images):

As with the small images, I cropped the sides of the big images to ignore the borders. Because the exhaustive search above would take too long on the big images, I used an image pyramid. I first scaled the image down to 1/16 of its size and used the naive approach to find the best displacement within the range [-15, 15]. I then scaled the image back up by a factor of 2 and doubled the vector returned by the previous search, because doubling the size of the image also doubles the displacement measured in pixels. I used this doubled vector to roll whichever image (R or G) was being aligned with B, and then searched again for a further offset at the new scale. I repeated these steps until the image was back at its original size, adding the offset found at each iteration to the running total and doubling that sum each time I scaled up. In addition, I decreased the size of the search range at every iteration, starting from [-15, 15] at the lowest resolution. Keeping the same search range at every scale, especially at full resolution, would take too long (it would be the same as just using the small image approach). (See Challenges for the approach used on emir.tif.)
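A coarse-to-fine search of this kind might look like the sketch below. Several details are assumptions and not taken from the write-up: "1/16 of its size" is read as four halvings per side, downscaling is done by averaging 2x2 blocks, the search radius is halved at each level (the actual schedule is not stated), and the inputs are assumed to be already cropped. The names downsample2, search, and align_pyramid are hypothetical.

```python
import numpy as np

def downsample2(img):
    """Halve each dimension by averaging 2x2 blocks (odd edges are trimmed)."""
    h, w = img.shape
    h, w = h - h % 2, w - w % 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(moving, reference, center, radius):
    """Best SSD roll within `radius` of `center`, as a (row, col) tuple."""
    cy, cx = center
    best_score, best_shift = np.inf, center
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            diff = np.roll(moving, shift=(dy, dx), axis=(0, 1)) - reference
            score = np.sum(diff ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(moving, reference, levels=4, radius=15):
    """Estimate the roll at the coarsest scale, then double and refine it."""
    pyramid = [(moving.astype(np.float64), reference.astype(np.float64))]
    for _ in range(levels):
        m, r = pyramid[-1]
        pyramid.append((downsample2(m), downsample2(r)))

    shift = (0, 0)
    for level, (m, r) in enumerate(reversed(pyramid)):
        if level > 0:
            # The image is twice as large, so the running offset doubles.
            shift = (2 * shift[0], 2 * shift[1])
            # Assumed schedule: halve the search radius, never below 1.
            radius = max(radius // 2, 1)
        shift = search(m, r, shift, radius)
    return shift
```

Searching around the doubled offset is equivalent to rolling the image by that offset first and then looking for a small correction, which is why the search range can shrink at each level.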

Challenges:

I had difficulty aligning the R channel image onto the B channel image for emir.tif. The robe's intensity differs so much between the R and B channel images that it threw off the SSD calculation. To solve this problem, I used the G channel image as the point of reference instead of the B channel image. The robe in the G channel image has a medium intensity that lies between those of the B and R channel images, so SSD worked a lot better. I still used the pyramid approach for big images; I just switched the "base" image.
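Switching the base image only changes which channel stays fixed. A minimal sketch, assuming an alignment function with the hypothetical signature align(moving, reference) that returns a (row, col) roll, could be:

```python
import numpy as np

def compose_with_green_base(b, g, r, align):
    """Align B and R to G, then stack the channels in R, G, B order.

    `align` is any callable taking (moving, reference) and returning
    the (row, col) roll to apply to `moving`. It is an assumed interface.
    """
    b_shift = align(b, g)
    r_shift = align(r, g)
    b_aligned = np.roll(b, shift=b_shift, axis=(0, 1))
    r_aligned = np.roll(r, shift=r_shift, axis=(0, 1))
    rgb = np.dstack([r_aligned, g, b_aligned])
    return rgb, b_shift, r_shift
```

With G as the base, the reported displacements are for B and R rather than for G and R.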



Small Images

Cathedral (G: (5, 2), R: (12, 3))

Monastery (G: (-3, 2), R: (3, 2))

Nativity (G: (3, 1), R: (7, 0))

Settlers (G: (7, 0), R: (14, -1))

Big Images

Emir (B: (-48, -24), R: (56, 16), both aligned to G)

Harvesters (G: (60, 16), R: (124, 12))

Icon (G: (40, 16), R: (88, 22))

Lady (G: (52, 8), R: (112, 12))

Self Portrait (G: (78, 28), R: (176, 36))

3 Generations (G: (52, 12), R: (110, 12))

Train (G: (42, 4), R: (86, 32))

Turkmen (G: (56, 20), R: (116, 28))

Village (G: (64, 12), R: (136, 22))

Other Images

The following are images of my choosing from the Prokudin-Gorskii collection.
"Clothing" and "Panagias" are both small images, whereas "Capri" is a larger image.

Clothing (G: (2, -2), R: (11, -4))

Panagias (G: (6, 1), R: (14, 0))

Capri (G: (38, -12), R: (100, -12))