Project 1 - Images of the Russian Empire
Name: Tzu-Chuan Lin
When I first saw this project, these questions immediately came to mind:
· Which base channel should I use? Blue, green or red?
· Is matching on raw pixels a good idea? I suspect gradients would be a better representation of the two images.
· SSD and NCC both seem like good metrics. Which should I use?
In this project, I explore these questions by doing some experiments. Basically, my algorithm looks like this:
· Load the image
· Crop the white border
· Split the image into three parts
· Align the other two channel images to a given base channel by exhaustive search over a given range of offsets.
o For the multiscale pyramid, I first search [-20, 20] on the smallest scale, then use the previous scale's offset (times 2) as the base offset and perform a [-2, 2] search on the second smallest scale, and so on and so forth. (A sketch of this search follows this list.)
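Here is a minimal sketch of the search in Python (numpy-based; the helper names `ssd`, `align_single_scale`, and `align_pyramid` are illustrative, not my exact submitted code):

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two same-sized float channels."""
    return np.sum((a - b) ** 2)

def align_single_scale(channel, base, search, center=(0, 0)):
    """Exhaustively search offsets around `center`; return the best (dx, dy)."""
    best_score, best_offset = np.inf, center
    for dy in range(center[1] - search, center[1] + search + 1):
        for dx in range(center[0] - search, center[0] + search + 1):
            shifted = np.roll(np.roll(channel, dy, axis=0), dx, axis=1)
            score = ssd(shifted, base)
            if score < best_score:
                best_score, best_offset = score, (dx, dy)
    return best_offset

def align_pyramid(channel, base, levels):
    """Coarse-to-fine: [-20, 20] at the coarsest level, [-2, 2] refinements after."""
    if levels == 0:
        return align_single_scale(channel, base, search=20)
    # Subsample by 2 to build the next-coarser level.
    dx, dy = align_pyramid(channel[::2, ::2], base[::2, ::2], levels - 1)
    # Double the coarse offset and refine with a [-2, 2] search.
    return align_single_scale(channel, base, search=2, center=(2 * dx, 2 * dy))
```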
Experiments:
· Single scale + SSD, displacement = [-20, 20]
[Image table: cathedral.jpg, monastery.jpg, and tobolsk.jpg, each shown with Blue, G, and R as the base channel.]
· Single scale + NCC, displacement = [-20, 20]
[Image table: cathedral.jpg, monastery.jpg, and tobolsk.jpg, each shown with Blue, G, and R as the base channel.]
Conclusion:
Overall, I think B is the worst choice of base: if you zoom in on the boundaries of objects when B is the base, you can see purple artifacts, and tobolsk.jpg also comes out slightly blurry with B as the base.
However, I cannot really tell the difference between using G and R, so I arbitrarily picked one and chose G as the base.
From the tables above, I also don't see a real difference between SSD and NCC, so I decided to run some experiments comparing SSD and NCC with the multiscale algorithm.
Experiments (using G as the base):
NOTE: I use this equation to compute NCC:

$$\mathrm{NCC}(A, B) = \frac{\sum_{i,j}\,(A_{ij} - \bar{A})(B_{ij} - \bar{B})}{\sqrt{\sum_{i,j}(A_{ij} - \bar{A})^2}\,\sqrt{\sum_{i,j}(B_{ij} - \bar{B})^2}}$$

I use this form because it automatically balances the illuminance (by subtracting the mean of each image).
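As a sketch, this zero-mean NCC is only a few lines of numpy:

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation; larger means a better match."""
    a = a - a.mean()
    b = b - b.mean()
    return np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))
```

Note that, unlike SSD, NCC is maximized, so the search keeps the largest score instead of the smallest.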
· Multiscale SSD vs Multiscale NCC (displacements = (-20, 20) for the smallest image in the pyramid, (-2, 2) for the rest of the images in the pyramid)
               | Multiscale SSD                                  | Multiscale NCC
church.tif     | B: (dx, dy) = (-4, 3), R: (dx, dy) = (-8, 5)    | B: (dx, dy) = (-4, 3), R: (dx, dy) = (-8, 5)
emir.tif       | B: (dx, dy) = (-24, 7), R: (dx, dy) = (17, 1)   | B: (dx, dy) = (-24, 7), R: (dx, dy) = (17, 1)
harvesters.tif | B: (dx, dy) = (-16, -7), R: (dx, dy) = (-3, 13) | B: (dx, dy) = (-17, -8), R: (dx, dy) = (-3, 13)
melons.tif     | B: (dx, dy) = (-11, -2), R: (dx, dy) = (4, 16)  | B: (dx, dy) = (-11, -2), R: (dx, dy) = (4, 16)
Conclusion: judging from the images and the offsets above, SSD and NCC don't differ much, so I just chose SSD.
· Center masking: only compute the pixel difference in the center of the image.
o Conclusion: using a center mask to compute the SSD works far better than not using one. I think alignment fails without the mask because the algorithm tends to match the black borders (pixel = 0): when the black borders are misaligned, they generate a large error, so the search ends up aligning the borders instead of the scene. A sketch of the masking step follows the table below.
                  | Multiscale w/o center mask                     | Multiscale + center mask
train.tif         | B: (dx, dy) = (2, 13), R: (dx, dy) = (4, -9)   | B: (dx, dy) = (-6, 12), R: (dx, dy) = (27, -12)
self_portrait.tif | B: (dx, dy) = (2, 26), R: (dx, dy) = (7, 17)   | B: (dx, dy) = (-29, 1), R: (dx, dy) = (8, 18)
emir.tif          | B: (dx, dy) = (-9, 18), R: (dx, dy) = (9, -35) | B: (dx, dy) = (-24, 7), R: (dx, dy) = (17, 1)
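Here is a sketch of the center mask, assuming we simply crop a fixed fraction from each channel before scoring (the 10% per side is an illustrative choice):

```python
def center_crop(img, frac=0.1):
    """Keep only the central region so black borders don't dominate the metric."""
    h, w = img.shape
    dh, dw = int(h * frac), int(w * frac)
    return img[dh:h - dh, dw:w - dw]

# Score only the centers, e.g. ssd(center_crop(shifted), center_crop(base)).
```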
· Automatically deciding the pyramid depth:
o In my code, as the image gets bigger, the number of levels in the pyramid automatically increases; a sketch follows.
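One way to implement this (the 400 px stopping threshold is an illustrative assumption, not necessarily the value I used):

```python
def pyramid_depth(h, w, min_size=400):
    """Number of extra pyramid levels so the coarsest image is below min_size."""
    depth = 0
    while min(h, w) > min_size:
        h, w = h // 2, w // 2
        depth += 1
    return depth
```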
· Automatic corner detection and cropping
o I found that the raw images contain 4 obvious corners, so I decided to preprocess each image by first detecting those corners and then cropping to their bounding rectangle (a sketch follows the example below).
o Using icon.tif as an example:
[Images: icon.tif before and after cropping.]
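As an illustration, corner detection plus a bounding-rectangle crop can be done with OpenCV's goodFeaturesToTrack; this is one plausible detector and a sketch, not my exact code:

```python
import cv2
import numpy as np

def crop_to_corners(img):
    """Detect up to 4 strong, well-separated corners and crop to their bounding box."""
    gray = img if img.dtype == np.uint8 else (img * 255).astype(np.uint8)
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=4, qualityLevel=0.1,
                                      minDistance=min(gray.shape) // 2)
    pts = corners.reshape(-1, 2).astype(int)  # (x, y) pairs
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    return img[y0:y1, x0:x1]
```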
· Automatic white balancing:
I tried: 1. histogram equalization, 2. CLAHE (Contrast Limited Adaptive Histogram Equalization), and 3. the Gray World algorithm.
I found that 1. and 2. did not produce good results (I guess because they amplify noise), so I only show the results of 3. here (a sketch of the Gray World step follows the table).
Some results look better but some look worse; I guess this is because not every image's mean pixel value is actually gray.
[Image table: cathedral.jpg, harvesters.tif, melons.tif, self_portrait.tif, and emir.tif, each shown before and after balancing with the Gray World algorithm.]
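The Gray World step itself is short; here is a sketch assuming a float RGB image with values in [0, 1]:

```python
import numpy as np

def gray_world(img):
    """Scale each channel so all channel means match (Gray World assumption)."""
    means = img.reshape(-1, 3).mean(axis=0)  # per-channel means
    balanced = img * (means.mean() / means)  # push each mean to the global mean
    return np.clip(balanced, 0.0, 1.0)
```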
· Gradient alignment: I only give the offsets for 4 images because the two methods produce almost the same result for all images (a sketch of the gradient feature follows this table).
               | Multiscale + SSD                                | Multiscale + SSD + gradient as the feature (over two gradient images)
church.tif     | B: (dx, dy) = (-4, 3), R: (dx, dy) = (-8, 5)    | B: (dx, dy) = (-4, 3), R: (dx, dy) = (-8, 5)
emir.tif       | B: (dx, dy) = (-24, 7), R: (dx, dy) = (17, 1)   | B: (dx, dy) = (-24, 7), R: (dx, dy) = (17, 2)
harvesters.tif | B: (dx, dy) = (-16, -7), R: (dx, dy) = (-3, 13) | B: (dx, dy) = (-17, -8), R: (dx, dy) = (-3, 13)
icon.tif       | B: (dx, dy) = (-17, 1), R: (dx, dy) = (5, 6)    | B: (dx, dy) = (-17, 2), R: (dx, dy) = (5, 6)
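A sketch of the gradient feature, assuming np.gradient magnitudes are matched instead of raw intensities (one reasonable choice; Sobel filters would also work):

```python
import numpy as np

def gradient_magnitude(img):
    """Per-pixel gradient magnitude, used in place of raw intensities for matching."""
    gy, gx = np.gradient(img.astype(float))
    return np.sqrt(gx ** 2 + gy ** 2)

# Align exactly as before, but on the gradient images:
# offset = align_pyramid(gradient_magnitude(channel), gradient_magnitude(base), levels)
```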
Images                | Multiscale + SSD                                 | Result image (no white balance)
cathedral.jpg         | B: (dx, dy) = (-2, 0), R: (dx, dy) = (1, 2)      | (image)
church.tif            | B: (dx, dy) = (-4, 3), R: (dx, dy) = (-8, 5)     | (image)
emir.tif              | B: (dx, dy) = (-24, 7), R: (dx, dy) = (17, 1)    | (image)
harvesters.tif        | B: (dx, dy) = (-16, -7), R: (dx, dy) = (-3, 13)  | (image)
icon.tif              | B: (dx, dy) = (-17, 1), R: (dx, dy) = (5, 6)     | (image)
lady.tif              | B: (dx, dy) = (-8, 6), R: (dx, dy) = (3, 0)      | (image)
melons.tif            | B: (dx, dy) = (-11, -2), R: (dx, dy) = (4, 16)   | (image)
monastery.jpg         | B: (dx, dy) = (-2, 9), R: (dx, dy) = (1, 0)      | (image)
onion_church.tif      | B: (dx, dy) = (-27, 10), R: (dx, dy) = (10, -4)  | (image)
self_portrait.tif     | B: (dx, dy) = (-29, 1), R: (dx, dy) = (8, 18)    | (image)
three_generations.tif | B: (dx, dy) = (-14, 9), R: (dx, dy) = (-3, -4)   | (image)
tobolsk.jpg           | B: (dx, dy) = (-3, 1), R: (dx, dy) = (1, 0)      | (image)
train.tif             | B: (dx, dy) = (-6, 12), R: (dx, dy) = (27, -12)  | (image)
workshop.tif          | B: (dx, dy) = (0, -14), R: (dx, dy) = (-11, 13)  | (image)