In this project, we take RGB color channels and accumulate them into one. What makes this task so difficult is that all these pictures were not taken in the exact same way, there were some level of alignment issues and simply stacking the RGB channels will not produce the optimal results.
In particular, the main contributions of this project are in improvement of vanilla alignment algorithm (using SSD) using cropping, image pyramid method for larger pictures, and (bells and whistles) further improvement resulting from image filters.
Vanilla Implementation: We calculate offsets between the RGB channels (we align G and R channels to B channel) doing an exhaustive grid search over 15-by-15 window. Then, we calculate the horizontal and vertical offsets using sum of squared differences. Doing a vanilla exhaustive search worked well but it gave me issues with some images not lining up, because there were lots of noise in the edges.
Image Pyramid: Image pyramids were used, and the coarsest layer had comparable input sizes to smaller images (400 by 400). At that stage, the same alignment algorithm was used as the vanilla implementation. In general, the algorithm is recursive and cuts the layers by a factor of two. The offsets are effectively calculated bottom-up from the coarsest layers (where the same alignment is used) and these are taken into account as we calculate further offsets in the next layers.
Improvement using Cropping: Because information in the middle of the picture gave me the most information, we cropped the image such that the intensity-based metric will only account for the center information in the image. We cropped approximately 10% of the images from each edge of the image. For example, village picture suffered because In the next section, we show the effects of cropping edges.
(Bells and Whistles) Improvement using Edge Detection: Instead of alignment using RGB similiarity, we instead align using similarity based on edges. This gave drastic improvements on picture alignment. Based on using Canny edge detection, Robert's cross operator, Sobel filter, we found that there were no significant differences between them, and in the pictures below, we use Roberts cross operator.
Colorize without Cropping | Colorize with Cropping |
---|---|
Colorize without Edge Detection | Colorize with Only Cropping | Colorize with Edge Detection |
---|---|---|
Name and Offsets [x, y] | Results |
---|---|
cathedral G: [5, 2] R: [12, 3] |
|
monastery G: [-3, 2] R: [3, 2] |
|
tobolsk G: [3, 3] R: [6, 3] |
|
emir G: [48, 23] R: [58, 25] |
|
harvesters G: [60, 18] R: [124, 17] |
|
icon G: [40, 18] R: [89, 24] |
|
lady G: [54, 6] R: [112, 10] |
|
melons G: [86, 0] R: [179, 12] |
|
onion_church G: [50, 26] R: [108, 36] |
|
self_portrait G: [77, 29] R: [115, 72] |
|
three_generations G: [49, 17] R: [108, 13] |
|
train G: [42, 6] R: [85, 32] |
|
village G: [64, 12] R: [136, 23] |
|
workshop G: [53, 0] R: [105, -12] |
Name and Offsets [x, y] | Results |
---|---|
cathedral G: [5, 2] R: [12, 3] |
|
monastery G: [-3, 2] R: [3, 2] |
|
tobolsk G: [3, 3] R: [6, 3] |
|
emir G: [48, 23] R: [106, 41] |
|
harvesters G: [59, 19] R: [123, 18] |
|
icon G: [40, 18] R: [89, 24] |
|
self_portrait G: [77, 29] R: [175, 37] |
Name and Offsets [x, y] | Results |
---|---|
house G: [59, 8] R: [126, 5] |
|
church G: [-1, 10] R: [25, 11] |
Name and Offsets [x, y] | Results |
---|---|
house G: [59, 8] R: [125, 6] |
|
church G: [-2, 10] R: [25, 11] |