Project 1 - Images of the Russian Empire

Name: Tzu-Chuan Lin

Description

When I first saw this project, three questions immediately came to mind:

·      Which base channel should I use: blue, green, or red?

·      Is matching raw pixels a good idea? I suspect the gradients of the two images would be a better representation.

·      SSD and NCC both seem like good metrics. Which should I use?

In this project, I explore these questions through a series of experiments.

Approach

My algorithm works as follows:

·      Load the image

·      Crop the white border

·      Split the image into three parts

·      Align the other two channels to a chosen base channel by exhaustive search over a given range of offsets.

o   For the multiscale pyramid, I first search [-20, 20] at the smallest scale, then use the previous scale’s offset (times 2) as the base offset and perform a [-2, 2] search at the next scale, and so on up to the full resolution. A simplified sketch is given below.
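
A minimal sketch of this coarse-to-fine search in Python (assuming NumPy and scikit-image; the function names and the 64 px stopping size are illustrative, not my exact code):

```python
import numpy as np
from skimage.transform import rescale

def ssd(a, b):
    """Sum of squared differences between two same-sized channels."""
    return np.sum((a - b) ** 2)

def align_single_scale(moving, base, center=(0, 0), radius=20):
    """Exhaustively search offsets within `center +/- radius` pixels."""
    best_score, best_offset = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = ssd(np.roll(moving, (dy, dx), axis=(0, 1)), base)
            if score < best_score:
                best_score, best_offset = score, (dy, dx)
    return best_offset

def align_pyramid(moving, base, min_size=64):
    """Coarse-to-fine: [-20, 20] at the coarsest level, [-2, 2] above it."""
    if min(base.shape) <= min_size:                 # coarsest level reached
        return align_single_scale(moving, base, radius=20)
    dy, dx = align_pyramid(rescale(moving, 0.5), rescale(base, 0.5), min_size)
    return align_single_scale(moving, base, center=(2 * dy, 2 * dx), radius=2)
```

Recursing until the coarsest level is small keeps the wide [-20, 20] search cheap; every finer level only refines the doubled offset by a few pixels.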

 

Which base channel should I use?

 

Experiments:

·      Single scale + SSD, displacement = [-20, 20]

 

[Table: composed result images for cathedral.jpg, monastery.jpg, and tobolsk.jpg with blue, green, and red as the base channel (single scale + SSD, displacement = [-20, 20]).]

 

·      Single scale + NCC, displacement = [-20, 20]

[Table: the same comparison using single scale + NCC, displacement = [-20, 20]; composed result images for cathedral.jpg, monastery.jpg, and tobolsk.jpg with blue, green, and red as the base channel.]

 

Conclusion:

Overall, I think blue is the worst choice of base: if you zoom in on object boundaries when B is the base, you can see purple artifacts, and tobolsk.jpg comes out slightly blurry.

However, I cannot really tell the difference between using G and R as the base, so I arbitrarily picked G.

SSD vs NCC

 

From the tables above, I don’t really see a difference between SSD and NCC, so I ran some more experiments to compare them with the multiscale algorithm.

 

Experiment (using G as base):

NOTE: I compute NCC in its zero-mean form,

NCC(A, B) = Σ (A − Ā)(B − B̄) / (‖A − Ā‖ · ‖B − B̄‖),

because subtracting the mean of each image automatically balances the illuminance.
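
In code, zero-mean NCC is just a normalized dot product of the mean-subtracted channels (a minimal NumPy sketch; note that, unlike SSD, NCC is a similarity, so the search maximizes it):

```python
import numpy as np

def ncc(a, b):
    """Zero-mean NCC: subtracting the means cancels global brightness shifts."""
    a = a - a.mean()
    b = b - b.mean()
    return np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))
```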

 

·      Multiscale SSD vs. multiscale NCC (displacement = [-20, 20] for the smallest image in the pyramid, [-2, 2] for the rest of the levels)

 

(Offsets are (dx, dy) relative to the G base; each cell also showed the composed result image.)

Image            Multiscale SSD             Multiscale NCC
church.tif       B (-4, 3), R (-8, 5)       B (-4, 3), R (-8, 5)
emir.tif         B (-24, 7), R (17, 1)      B (-24, 7), R (17, 1)
harvesters.tif   B (-16, -7), R (-3, 13)    B (-17, -8), R (-3, 13)
melons.tif       B (-11, -2), R (4, 16)     B (-11, -2), R (4, 16)

 

Conclusion: judging from the images and offsets above, SSD and NCC barely differ, so I simply chose SSD.

Extra credit

 

·      Center masking: compute the pixel difference only over the center of the image (a sketch follows the table below).

o   Conclusion: using a center mask to compute the SSD is far better than not using one.
I think alignment without the mask fails because the algorithm tends to match the black borders (pixel = 0):
if the black borders are misaligned, they generate a large error.

 

(Offsets are (dx, dy); each cell also showed the composed result image.)

Image              Multiscale w/o center mask    Multiscale + center mask
train.tif          B (2, 13), R (4, -9)          B (-6, 12), R (27, -12)
self_portrait.tif  B (2, 26), R (7, 17)          B (-29, 1), R (8, 18)
emir.tif           B (-9, 18), R (9, -35)        B (-24, 7), R (17, 1)
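
A sketch of the masking step (the 0.8 keep fraction is an illustrative assumption, not necessarily the value I used):

```python
def center_crop(img, keep=0.8):
    """Keep the central `keep` fraction of each dimension so the black
    plate borders never enter the SSD."""
    h, w = img.shape
    dh, dw = int(h * (1 - keep) / 2), int(w * (1 - keep) / 2)
    return img[dh:h - dh, dw:w - dw]

# Inside the search loop, score only the masked centers, e.g.:
#   score = ssd(center_crop(shifted), center_crop(base))
```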

 

·      Automatically deciding the pyramid depth:

o   In my code, the number of levels in the pyramid grows automatically with the image size, as sketched below.
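
One way to express this rule (assuming, as in the earlier sketch, that the coarsest level should be roughly 64 px on its short side):

```python
import math

def pyramid_depth(h, w, min_size=64):
    """Number of pyramid levels so the coarsest level is ~min_size pixels."""
    return 1 + max(0, math.floor(math.log2(min(h, w) / min_size)))
```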

·      Automatic corner detection and cropping:

o   I found that the raw scans contain four obvious corners, so I preprocess each image by detecting those corners and then cropping to their bounding rectangle (a sketch follows the example below).

o   Using icon.tif as an example:

[Before / after crop of icon.tif shown here.]
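
A sketch of the idea using Harris corners (assuming scikit-image; the min_distance value is illustrative):

```python
from skimage.feature import corner_harris, corner_peaks

def crop_to_corners(img):
    """Crop to the bounding rectangle of the detected corners, assuming the
    four plate corners are among the strongest Harris responses."""
    peaks = corner_peaks(corner_harris(img), min_distance=20)
    (y0, x0), (y1, x1) = peaks.min(axis=0), peaks.max(axis=0)
    return img[y0:y1 + 1, x0:x1 + 1]
```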

 

·      Automatic white balancing:
I tried: 1. histogram equalization, 2. CLAHE (Contrast-Limited Adaptive Histogram Equalization), and 3. the Gray World algorithm.
I found that 1 and 2 did not produce good results (I suspect because they amplify noise), so I only show the Gray World results here (a sketch follows the table below).

Some images look better and some look worse; I suspect this is because not every image’s mean pixel value is actually gray.

 

[Table: before/after Gray World balancing for cathedral.jpg, harvesters.tif, melons.tif, self_portrait.tif, and emir.tif.]
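
A sketch of the Gray World step (assuming a float RGB image in [0, 1]):

```python
import numpy as np

def gray_world(rgb):
    """Scale each channel so all three channel means equal the global mean,
    i.e. push the image's average color toward gray."""
    means = rgb.reshape(-1, 3).mean(axis=0)          # per-channel means
    return np.clip(rgb * (means.mean() / means), 0.0, 1.0)
```

This is exactly why some results get worse: the correction assumes the scene’s true average color is gray, which does not hold for every photograph.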

 

·      Gradient alignment: I only list the four offset results below, because the two methods produce almost identical results for all images (a sketch follows the table).

 

 

(Offsets are (dx, dy).)

Image            Multiscale + SSD           Multiscale + SSD on gradient images
church.tif       B (-4, 3), R (-8, 5)       B (-4, 3), R (-8, 5)
emir.tif         B (-24, 7), R (17, 1)      B (-24, 7), R (17, 2)
harvesters.tif   B (-16, -7), R (-3, 13)    B (-17, -8), R (-3, 13)
icon.tif         B (-17, 1), R (5, 6)       B (-17, 2), R (5, 6)
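
The gradient variant only changes what the metric sees: the same multiscale SSD search runs over edge maps instead of raw intensities (a sketch assuming scikit-image and the `align_pyramid` sketch above; the channel names are illustrative):

```python
from skimage.filters import sobel

# Sobel gradient magnitudes are insensitive to per-channel brightness
# differences, so misaligned edges dominate the score.
offset = align_pyramid(sobel(moving_channel), sobel(base_channel))
```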

 

Results

(Offsets are (dx, dy) relative to the G base, computed with multiscale + SSD; the result images, without white balance, appeared alongside each row.)

Image                  Multiscale + SSD
cathedral.jpg          B (-2, 0), R (1, 2)
church.tif             B (-4, 3), R (-8, 5)
emir.tif               B (-24, 7), R (17, 1)
harvesters.tif         B (-16, -7), R (-3, 13)
icon.tif               B (-17, 1), R (5, 6)
lady.tif               B (-8, 6), R (3, 0)
melons.tif             B (-11, -2), R (4, 16)
monastery.jpg          B (-2, 9), R (1, 0)
onion_church.tif       B (-27, 10), R (10, -4)
self_portrait.tif      B (-29, 1), R (8, 18)
three_generations.tif  B (-14, 9), R (-3, -4)
tobolsk.jpg            B (-3, 1), R (1, 0)
train.tif              B (-6, 12), R (27, -12)
workshop.tif           B (0, -14), R (-11, 13)