Hyun Jae Moon's CS194-26 Project 1

The goal of this project is to extract the three color channel images (R, G, B) and stack them on top of each other with the appropriate alignment, using both an exhaustive search and an image pyramid implementation.

Normalized Cross Correlation (NCC)

Normalized Cross Correlation is widely used for checking image similarity; it measures whether two sets of data contain a similar pattern of values. To implement the NCC formula in Python, we first obtain the mean and standard deviation of the two images we are about to align using np.mean and np.std. Since the factor sqrt(1.0 / (n-1)) appears in both normalization coefficients, we can pull it out of each: multiply the two normalized images elementwise, take np.sum, and then multiply the result by 1.0/(n-1).
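
The computation described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function name ncc and the handling of flat images are assumptions, and ddof=1 is passed to np.std so that the standard deviation matches the 1/(n-1) factor (np.std divides by n by default).

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized grayscale arrays."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    n = a.size
    # ddof=1 gives the sample standard deviation, consistent with 1/(n-1).
    a_std = np.std(a, ddof=1)
    b_std = np.std(b, ddof=1)
    if a_std == 0 or b_std == 0:
        return 0.0  # assumption: a flat image is treated as uncorrelated
    # Subtract the mean and divide by the standard deviation.
    a_norm = (a - np.mean(a)) / a_std
    b_norm = (b - np.mean(b)) / b_std
    # Multiply elementwise, sum, then scale by 1/(n-1).
    return float(np.sum(a_norm * b_norm) * (1.0 / (n - 1)))
```

The score lies in [-1, 1]; an image compared with itself, or with a brightness-scaled copy of itself, scores 1 up to rounding.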

Exhaustive Method

For jpg files, which are a lot smaller in file size, I implemented an exhaustive algorithm to detect pixel offsets and align the three color channels. First, I crop the image by a fixed 20px to cut off the black borders. Then, the exhaustive algorithm checks every offset in the range of [-15, 15] pixels and keeps the one with the largest Normalized Cross Correlation. I first align the green channel to the blue channel and find the pixel offset of the green channel. Then, I do the same for the red channel against the blue channel. Finally, I stack all three channels into one color image. Here are the results for the jpg files after alignment, with their respective offsets.

cathedral.jpg | G: (2, 5), R: (3, 12)

monastery.jpg | G: (2, -3), R: (2, 3)

tobolsk.jpg | G: (3, 3), R: (3, 6)
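
The exhaustive search described in this section can be sketched as below. This is a hedged illustration, not the project's code: the names ncc, inner, and align_exhaustive are hypothetical, shifting with np.roll and cropping only when scoring are assumptions, and the (dx, dy) ordering of the returned offset is assumed to match the tables. The ncc here is a compact form of the same correlation (mean-subtracted product divided by the two norms).

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation in compact form."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def inner(img, border):
    """Drop a fixed border so the black edges do not affect the score."""
    return img if border == 0 else img[border:-border, border:-border]

def align_exhaustive(channel, reference, radius=15, border=20):
    """Try every shift in [-radius, radius]^2 and keep the best NCC score."""
    best_score, best_offset = -np.inf, (0, 0)
    ref_inner = inner(reference, border)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Assumption: shift by rolling rows by dy and columns by dx.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(inner(shifted, border), ref_inner)
            if score > best_score:
                best_score, best_offset = score, (dx, dy)
    return best_offset
```

With b, g, r as the three channel arrays, one would call align_exhaustive(g, b) and align_exhaustive(r, b), apply the returned shifts with np.roll, and combine the result with np.dstack([r, g, b]).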

Pyramid Method

The downside of the exhaustive method is that it takes too much time to process large image files such as tiff files. To speed up the process, we rescale the image to smaller dimensions so that alignment runs much faster. First, I crop the image by a fixed 200px to remove the black borders. Then I repeatedly rescale the image by half, 5 times in total, which results in 1/32 of the original width and height. We then perform the exhaustive method on the smallest image and, at each step back up to the next larger version, search only the range [offset * 2 ± 1]. Repeating this up to full resolution produces the final offset.

church.tif | G: (4, 25), R: (-4, 58)

emir.tif | G: (24, 49), R: (-511, 126)

harvesters.tif | G: (16, 59), R: (13, 124)

icon.tif | G: (17, 40), R: (23, 89)

lady.tif | G: (8, 47), R: (11, 113)

melons.tif | G: (9, 82), R: (12, 178)

onion_church.tif | G: (26, 51), R: (36, 108)

self_portrait.tif | G: (28, 78), R: (36, 176)

three_generations.tif | G: (14, 53), R: (10, 111)

train.tif | G: (5, 42), R: (31, 87)

workshop.tif | G: (0, 52), R: (-12, 104)

kurmy.tif | G: (-18, 25), R: (-38, 115)

lugano.tif | G: (7, 55), R: (32, 120)

zakat.tif | G: (-41, 75), R: (-68, 114)
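
The pyramid method described above can be sketched as a short recursion. This is an illustration under stated assumptions, not the project's code: all function names are hypothetical, halve uses a plain 2x2 block average as a stand-in for whatever rescaling routine was actually used, the border is halved along with the image, and offsets are returned as (dx, dy).

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation in compact form."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def inner(img, border):
    """Drop a fixed border before scoring."""
    return img if border == 0 else img[border:-border, border:-border]

def search(channel, reference, center, radius, border):
    """Exhaustive search in a window of +/- radius around center."""
    cx, cy = center
    best_score, best_offset = -np.inf, center
    ref_inner = inner(reference, border)
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(inner(shifted, border), ref_inner)
            if score > best_score:
                best_score, best_offset = score, (dx, dy)
    return best_offset

def halve(img):
    """Halve each dimension by averaging 2x2 blocks (assumed rescaler)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid(channel, reference, levels=5, radius=15, border=200):
    """Coarse-to-fine alignment: full search at the top, +/-1 refinement below."""
    if levels == 0:
        # Smallest image: run the full exhaustive search.
        return search(channel, reference, (0, 0), radius, border)
    coarse = align_pyramid(halve(channel), halve(reference),
                           levels - 1, radius, border // 2)
    # One coarse pixel is two pixels here, so search around offset * 2 +/- 1.
    center = (coarse[0] * 2, coarse[1] * 2)
    return search(channel, reference, center, 1, border)
```

With levels=5 the full search runs only on the 1/32-scale image, and each of the five larger levels evaluates just 9 candidate offsets, which is where the speedup over a full-resolution exhaustive search comes from.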

Alignment Time Spent on Each Image

cathedral.jpg took 1.53 seconds
monastery.jpg took 1.61 seconds
tobolsk.jpg took 1.55 seconds
church.tif took 5.98 seconds
emir.tif took 5.66 seconds
harvesters.tif took 5.89 seconds
icon.tif took 6.06 seconds
lady.tif took 5.75 seconds
melons.tif took 6.06 seconds
onion_church.tif took 5.95 seconds
self_portrait.tif took 5.9 seconds
three_generations.tif took 6.15 seconds
train.tif took 6.1 seconds
workshop.tif took 6.13 seconds
kurmy.tif took 6.21 seconds
lugano.tif took 6.48 seconds
zakat.tif took 6.11 seconds

On average, the exhaustive method on jpg files took 1.56 seconds.
On average, the pyramid method on tif files took 6.03 seconds.

What is wrong with 'emir.tif'?

(Bells & Whistles: Edge Detection)

You probably noticed that all my other photos look good except 'emir.tif'. Why is that? The color channels of an image don't necessarily have the same brightness values, so comparing raw pixel intensities can settle on the wrong offset. To handle such cases, we use edge detection and produce a better alignment. Here is the image produced before adding edge detection.

As you can see, the red channel of the image is far off to the left. With edge detection, however, this is no longer an issue. Here, I use the Canny edge detector from skimage. It uses a filter based on the derivative of a Gaussian to compute the intensity of the gradients. Since skimage.feature.canny produces boolean values, we multiply them by 255 to convert the result into a grayscale image. Then, we run the pyramid implementation on the edge images to align the channels. The edge detection takes some time (around 5 more seconds), but it works like magic.
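
The edge conversion step can be sketched as below. This is a minimal example assuming scikit-image is installed; the helper name edge_map and the sigma value are assumptions, since the text does not state which parameters were used.

```python
import numpy as np
from skimage.feature import canny

def edge_map(channel, sigma=1.0):
    """Return a 0/255 grayscale edge image for one color channel."""
    # canny expects a 2D image and returns a boolean array of edge pixels.
    edges = canny(np.asarray(channel, dtype=np.float64), sigma=sigma)
    # Multiply the boolean mask by 255 to get a grayscale image.
    return edges.astype(np.uint8) * 255
```

Each channel would be passed through edge_map first, and the resulting edge images, rather than the raw intensities, would then be fed to the alignment routine. The offsets found on the edge images are applied to the original channels.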

Red Edge of 'emir.tif'

Green Edge of 'emir.tif'

Blue Edge of 'emir.tif'

Images of the Russian Empire: Colorizing the Prokudin-Gorskii photo collection