Kehan Wang Proj 1

CS194-26 Project 1

Kehan Wang

Overview

For this project, we are given images from three color channels (R, G, B) that are not aligned to each other. Our goal is to align R and G to B by displacing them.

Approach

My approach, given an image to align to the target, is:

Remove outliers:
Crop out the edges to remove outliers. Only the inner two-thirds of the image is used for alignment.
Normalize brightness:
To counter the differences in brightness across the three channels, we normalize each channel by dividing every pixel by that channel's mean pixel value.
Search for displacement:
Simple for loops that search for the displacement producing the smallest SSD (Sum of Squared Differences).
Image pyramid:
Use an image pyramid to make the displacement search faster:
- If the image is smaller than 500 by 500, we run the displacement search on it directly, searching from -100 to 100 along each axis.
- Otherwise, we downscale the image (by a factor of 2 in my case) and try again.
- After returning from the base case, we multiply the displacement by the downsampling factor (2 in my case) and rerun the displacement search over a smaller range, from -2 to 2 along each axis, around the scaled displacement.
- In essence, we go up to the top of the pyramid, search extensively for the best displacement, and then come back down layer by layer, refining the displacement with a smaller search range at each one. At the bottom of the pyramid, we have our final displacement value.
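The steps above can be sketched roughly as follows. This is a minimal NumPy sketch, not my actual project code: the function names (`crop_inner`, `normalize`, `search`, `pyramid_align`), the use of `np.roll` for shifting, and the downscaling by simple striding are all assumptions made for illustration. It also assumes each channel is a 2-D float array with a nonzero mean.

```python
import numpy as np

def crop_inner(img, keep=2 / 3):
    """Keep the central `keep` fraction along each axis (drops the borders)."""
    h, w = img.shape
    dh, dw = int(h * (1 - keep) / 2), int(w * (1 - keep) / 2)
    return img[dh:h - dh, dw:w - dw]

def normalize(img):
    """Divide by the mean so channels of different brightness are comparable."""
    return img / img.mean()  # assumes the mean is nonzero

def ssd(a, b):
    """Sum of squared differences between two same-shaped arrays."""
    return np.sum((a - b) ** 2)

def search(img, target, center=(0, 0), radius=100):
    """Exhaustive search for the (dy, dx) shift of img with the smallest SSD."""
    t = normalize(crop_inner(target))
    best, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            # np.roll wraps around; the crop keeps the wrapped border out of the score
            shifted = np.roll(img, (dy, dx), axis=(0, 1))
            score = ssd(normalize(crop_inner(shifted)), t)
            if score < best:
                best, best_shift = score, (dy, dx)
    return best_shift

def pyramid_align(img, target, base_size=500, scale=2,
                  base_radius=100, refine_radius=2):
    """Coarse-to-fine alignment: wide search at the top, small refinements below."""
    if max(img.shape) < base_size:
        return search(img, target, radius=base_radius)
    # Downscale by striding (an assumption; a proper resize would also blur first)
    coarse = pyramid_align(img[::scale, ::scale], target[::scale, ::scale],
                           base_size, scale, base_radius, refine_radius)
    center = (coarse[0] * scale, coarse[1] * scale)
    return search(img, target, center=center, radius=refine_radius)
```

With this sketch, `pyramid_align(red, blue)` would return the (row, column) shift to apply to `red` with `np.roll` so that it lines up with `blue`. The refinement at each layer only needs a small range because the coarse estimate, once multiplied by the scale, should already be within a pixel or two of the right answer.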

Challenges / Failures

Using the above approach, the image emir is not aligned correctly. I think the reason is that the emir's cape has inconsistent values across channels: while the beard is always black and the cap is always white, the cape appears white, grey, and black in the three different channels, as shown below. This can cause misalignment because a properly aligned picture can have a higher SSD than a misaligned one. With our approach, the search tries to align the darker background of the cape to the dark, X-shaped patterns in the blue channel, ultimately producing a wrong displacement.

Blue	Green	Red
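To illustrate how a correctly aligned pair can score worse under SSD, here is a toy 1-D example. It is not the emir data: the arrays `blue` and `red` are made up, with `red` being the same pattern as `blue` at the correct position but with inverted brightness.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two same-shaped arrays."""
    return float(np.sum((a - b) ** 2))

# Hypothetical signals: same pattern, correctly aligned at shift 0,
# but bright where the other channel is dark.
blue = np.array([1, 0, 1, 0, 1, 0, 1, 0], dtype=float)
red = 1 - blue

scores = {shift: ssd(np.roll(red, shift), blue) for shift in (-1, 0, 1)}
best = min(scores, key=scores.get)
# scores[0] is 8.0 (the true alignment), while scores[-1] and scores[1] are 0.0,
# so the search picks a shift that is off by one.
```

The search prefers the shifted position because it lines up dark with dark, even though the structure is already aligned at shift 0.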

Results on example images

castle.tif
Green: (34, 3)
Red: (98, 4)

cathedral.jpg
Green: (5, 2)
Red: (12, 3)

emir.tif
Green: (49, 24)
Red: (36, 48)

harvesters.tif
Green: (59, 17)
Red: (124, 14)

icon.tif
Green: (41, 17)
Red: (89, 23)

lady.tif
Green: (55, 9)
Red: (117, 12)

melons.tif
Green: (82, 11)
Red: (179, 14)

monastery.jpg
Green: (-3, 2)
Red: (3, 2)

onion_church.tif
Green: (51, 27)
Red: (108, 36)

self_portrait.tif
Green: (78, 29)
Red: (176, 37)

three_generations.tif
Green: (52, 14)
Red: (111, 11)

tobolsk.jpg
Green: (3, 3)
Red: (6, 3)

train.tif
Green: (42, 6)
Red: (87, 32)

workshop.tif
Green: (53, 0)
Red: (105, -12)

Results on self-selected images

desert.tif
Green: (32, 5)
Red: (80, 8)

rural.tif
Green: (44, 37)
Red: (97, 46)

ancient.tif
Green: (50, -7)
Red: (114, -13)

landscape.tif
Green: (54, 7)
Red: (121, 32)

Bells and whistles

Better features:
By applying OpenCV's Canny edge detector, we convert our input images into edge images. If the edges are clearly defined in all channels, we can run our approach on the edge images to find the desired displacement, and then apply that displacement to the original images.

Before	After