CS194-26 Project 1

Mandi Zhao; SID: 3032880866

Overview

The goal of this project is to take three separate images, one per color channel, and align them so that they stack as well as possible. In matrix terms, given three same-sized images, the problem is to find a shift of the values in one matrix that maximizes its similarity to another. We therefore need both a metric function to measure this similarity and an efficient search algorithm that iterates over the candidate offset values.
I used NCC (normalized cross-correlation) as the metric function and found that it works well. Pyramid search, described in more detail below, works well on larger images.
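As a rough illustration of the metric, here is a minimal NCC sketch in NumPy. The function name `ncc` and the choice to subtract the mean before normalizing are assumptions for illustration; the write-up does not show the exact formula used.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two same-sized arrays (assumed zero-mean form)."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        # At least one input is constant, so the correlation is undefined; report 0.
        return 0.0
    return float(np.sum(a * b) / denom)
```

A score near 1 means the two arrays vary together, a score near 0 means they are unrelated, and the score does not change if one image is uniformly brighter than the other.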
For some of the images below, move the mouse over the image to see the aligned output.

I. Smaller Images

I applied a simple grid search over candidate offsets.
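A grid search of this kind might look like the sketch below. The names `align_grid` and `radius`, the default radius of 15, the `(dy, dx)` ordering of the offset, and the use of `np.roll` (which wraps pixels around the edge) are all assumptions for illustration, not details taken from the write-up.

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation (assumed form of the metric)."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 0.0 if denom == 0 else float(np.sum(a * b) / denom)

def align_grid(channel, reference, radius=15):
    """Try every (dy, dx) in a square window and keep the best-scoring shift."""
    best_score, best_offset = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll shifts circularly; pixels pushed off one edge reappear on the other.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(shifted, reference)
            if score > best_score:
                best_score, best_offset = score, (dy, dx)
    return best_offset
```

The cost grows with the square of the radius, which is why this is only practical for the smaller images.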

Small Images

Offsets used on g and r: (-3, 2) (3, 2)


Offsets used on g and r: (5, 2) (12, 3)

Offsets used on g and r: (3, 3) (6, 3)

II. Larger Images

To implement pyramid alignment, I wrote a recursive function that takes a starting scale factor (which implicitly sets the number of iterations) and a range of offsets to search within. It starts with a coarse image and updates the search range after every iteration. On a large image, a 20 by 20 window typically works well.
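The recursion described above could be sketched as follows. This is an illustration under stated assumptions, not the author's code: it takes a number of `levels` instead of a starting factor, halves the image by averaging 2x2 blocks, searches a radius-10 window (roughly the 20 by 20 window mentioned) at the coarsest level, and refines by a hypothetical `refine` radius of 2 at each finer level.

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation (assumed form of the metric)."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 0.0 if denom == 0 else float(np.sum(a * b) / denom)

def search(channel, reference, center, radius):
    """Grid search for the best (dy, dx) in a window centered on `center`."""
    cy, cx = center
    best_score, best_offset = -np.inf, center
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best_offset = score, (dy, dx)
    return best_offset

def downscale2(img):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid(channel, reference, levels=4, radius=10, refine=2):
    """Align at the coarsest level first, then double and refine the offset."""
    if levels == 0 or min(channel.shape) < 64:
        return search(channel, reference, (0, 0), radius)
    coarse = align_pyramid(downscale2(channel), downscale2(reference),
                           levels - 1, radius, refine)
    # An offset found at half resolution corresponds to twice that offset here.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(channel, reference, center, refine)
```

With four levels, a radius-10 search at the coarsest level already covers shifts of about 160 pixels at full resolution, while each finer level only checks a 5 by 5 neighborhood.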

Offsets used on g and r: village.tif: (64, 12) (138, 22); harvester.tif: (60, 16) (124, 14); onion_church.tif: (52, 26) (108, 36)

Offsets used on g and r: three_generations.tif: (64, 12) (138, 22); icon.tif: (40, 18) (90, 22); emir.tif: (50, 24) (70, 42)

Offsets used on g and r: train.tif: (64, 12) (138, 22); lady.tif: (54, 8) (116, 12); melons.tif: (82, 10) (178, 14)

Offsets used on g and r: workshop.tif: (64, 12) (138, 22); self_portrait.tif: (78, 30) (176, 38)

The images above are the given examples. Below are some extra ones downloaded from the LoC website, optimized with both the basic alignment techniques and some of the bells and whistles described in the next section:

Offsets used on g and r: rail.tif: (28, -8) (112, -16); cross.tif: (56, 26) (124, 32); church.tif: (56, -4) (124, -2)

III. Bells n Whistles

1. Automatic cropping

Defining a border as a color bar on the edge of the aligned output image, I designed a recursive function to detect and crop these borders. The idea is intuitive and simple to implement: assume the outermost array (a row or column of pixels) is part of a border; its similarity to its neighboring array will then be high if the two together form a wider color bar.
Again using NCC to measure the similarity between image arrays, I iterate so that the comparison covers not just the outermost array of the original image but also the outermost array of the cropped image from each previous stage.
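One way to picture this procedure is the sketch below. The helper names `crop_top` and `auto_crop`, the defaults `thres=0.9` and `iters=30`, and the trick of rotating the image so that every side takes a turn as the top row are assumptions for illustration; the write-up does not give these details.

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation (assumed form of the metric)."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 0.0 if denom == 0 else float(np.sum(a * b) / denom)

def crop_top(img, thres=0.9, iters=30):
    """Recursively drop the top row while it closely matches the row below it."""
    if iters == 0 or img.shape[0] < 3:
        return img
    if ncc(img[0], img[1]) < thres:
        # The outermost row no longer looks like part of a uniform bar.
        return img
    return crop_top(img[1:], thres, iters - 1)

def auto_crop(img, thres=0.9, iters=30):
    """Apply crop_top to all four sides by rotating the image a quarter turn each time."""
    for _ in range(4):
        img = crop_top(img, thres, iters)
        img = np.rot90(img)
    return img
```

Two properties of this sketch are worth noting: the innermost row of a bar is kept, because it is compared against image content rather than against more bar, and a perfectly constant gray row has zero variance, so `ncc` returns 0 for it and the row is never cropped.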
However, this method requires some parameter tweaking, such as the number of iterations and the similarity threshold used for detection. It works well on some images but not on others, as the examples below show:

However, since many images naturally have very consistent coloring near the border (such as sky, water, or grass), this cropping method might accidentally crop more than it should. The threshold parameter should therefore be set relatively high to make sure the color bar it detects is "pure". Below is an example of what happens when the threshold is set too high (here thres=0.99):

2. Adjust Contrast

I used gamma_correction from cv2, a pretty convenient function to use.
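The text does not show how `gamma_correction` is called, so the following is only a sketch of gamma correction itself, written with a NumPy lookup table rather than that function. The name `gamma_correct` and the default `gamma=0.8` are hypothetical.

```python
import numpy as np

def gamma_correct(img, gamma=0.8):
    """Map each 8-bit value v to 255 * (v / 255) ** gamma using a 256-entry table."""
    levels = np.arange(256) / 255.0
    table = np.round(255.0 * levels ** gamma).astype(np.uint8)
    # Indexing the table with a uint8 image applies the mapping to every pixel.
    return table[img]
```

With this convention, a gamma below 1 brightens the mid-tones and a gamma above 1 darkens them, while pure black and pure white stay fixed.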
Move the mouse over the original images to see them after contrast adjustment:

3. Adjust Saturation

Saturation is better represented in HSV channels than in RGB, so I designed a simple saturation-boosting transformation based on the HSV representation. Using the cv2 package, I first convert an aligned RGB image to HSV, then add an offset to the saturation channel. Note that it is important to upper-bound the adjusted values so that they do not exceed 255. Converting the HSV image back to RGB then gives a more saturated version of the output image.
Below I experimented with offset values 10, 20, and 40.