The goal of this assignment was to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image with as few visual artifacts as possible. The task was to take the three color channel images, place them on top of each other, and automatically align them to create a single RGB image.
In my approach, I read in the three-strip image and divided it into its three color channel component images: one for each primary color (blue, green, and red). The green channel is aligned to the blue channel, and the red channel is then aligned to the resulting aligned green channel. I chose this order because the red channel is typically darker and more intense than the other channels, which makes it harder to align directly against the blue channel; aligning red against green improved the result on emir.tif slightly. I used an image pyramid to align the images: the widest displacement search occurred at the top (coarsest level) of the pyramid, and each successively finer level searched only a small displacement range to fine-tune the estimate. The best alignment was the one that minimized the SSD between the two images. Once the channels were aligned, they were stacked into the resulting color image. My algorithm takes roughly 2 minutes to run on all of the provided images.
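The split-and-stack step described above can be sketched as follows. This is a minimal illustration, not my exact code: a synthetic array stands in for an actual scanned plate, which would instead be read from disk (e.g. with skimage.io.imread).

```python
import numpy as np

# The scanned glass plate stacks the blue, green, and red exposures
# vertically, so each channel occupies one third of the plate's height.
# A synthetic plate stands in here for a real scan.
plate = np.linspace(0.0, 1.0, 30 * 4).reshape(30, 4)

h = plate.shape[0] // 3
b, g, r = plate[:h], plate[h:2*h], plate[2*h:3*h]

# After alignment, the channels are stacked in (R, G, B) order
# to form a single color image.
color = np.dstack([r, g, b])
print(color.shape)  # → (10, 4, 3)
```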
I constructed an image pyramid for each color channel image to align. I chose to base the number of levels on the size of the image using the formula
num_levels = log10(n)
where n is equal to the length of a row in the image. My pyramid function returns an array of num_levels images, with each increasing level of the pyramid progressively scaled down by half. My alignment algorithm starts at the highest level of the pyramid, the scale with the lowest resolution, and searches within an x and y displacement range of [-20, 20] each. Performing the widest search at the coarsest level is where it is most cost-efficient. At each successive lower level, I take the previously obtained displacement, multiply it by two to account for the pyramid scaling, and search within a new range of [-2, 2] around it. This improved run-time significantly on the higher-resolution .tif images.
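The coarse-to-fine search above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not my exact code: simple subsampling ([::2, ::2]) stands in for whatever downscaling was actually used, and the ssd helper here crops a preset 15% border before comparing, as described later in this report.

```python
import numpy as np

def ssd(a, b):
    # Compare only the interior, cropping 15% per side to ignore
    # border artifacts and np.roll() wrap-around noise.
    h, w = a.shape
    ch, cw = int(h * 0.15), int(w * 0.15)
    diff = a[ch:h-ch, cw:w-cw] - b[ch:h-ch, cw:w-cw]
    return np.sum(diff ** 2)

def align(moving, fixed):
    """Coarse-to-fine search; returns the (dy, dx) that best rolls
    `moving` onto `fixed`."""
    # num_levels = log10(n), where n is the row length of the image.
    num_levels = max(1, int(np.log10(fixed.shape[1])))
    # Level 0 is full resolution; each higher level halves the size.
    pyr_m, pyr_f = [moving], [fixed]
    for _ in range(num_levels - 1):
        pyr_m.append(pyr_m[-1][::2, ::2])
        pyr_f.append(pyr_f[-1][::2, ::2])
    dy = dx = 0
    for level in range(num_levels - 1, -1, -1):
        dy, dx = 2 * dy, 2 * dx  # double the estimate for the finer scale
        # Wide [-20, 20] search only at the coarsest level; [-2, 2]
        # refinement around the running estimate everywhere else.
        radius = 20 if level == num_levels - 1 else 2
        best = None
        for ddy in range(dy - radius, dy + radius + 1):
            for ddx in range(dx - radius, dx + radius + 1):
                shifted = np.roll(pyr_m[level], (ddy, ddx), axis=(0, 1))
                score = ssd(shifted, pyr_f[level])
                if best is None or score < best:
                    best, by, bx = score, ddy, ddx
        dy, dx = by, bx
    return dy, dx
```

Searching the full [-20, 20] window only at the coarsest level keeps the number of SSD evaluations small while still covering a large effective displacement at full resolution.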
I did not implement automatic cropping, but I did crop a preset 15% of the image on each side before comparing with SSD. The reason for this was that the images had artifacts on their borders that were not part of the intended picture and would have caused inconvenient noise. In addition, shifting each image by the candidate displacement vectors with np.roll() wraps pixels around the edges, introducing further noise. Cropping a preset portion of the image before comparison was thus a practical choice to reduce noise and make my alignments more accurate.
The metric I used to compare the alignment of two images was SSD (Sum of Squared Differences), favoring alignments with lower SSD. Specifically, my formula was defined as

SSD(I1, I2) = ∑_{x,y} (I1(x, y) − I2(x, y))^2
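The formula above, combined with the preset 15% border crop, amounts to a few lines of NumPy. This is a minimal sketch (the function name and border parameter are illustrative, not from my actual code):

```python
import numpy as np

def ssd(a, b, border=0.15):
    """Sum of squared differences over the interior of two equal-size
    images, ignoring a fixed border fraction on each side."""
    h, w = a.shape
    ch, cw = int(h * border), int(w * border)
    diff = a[ch:h-ch, cw:w-cw] - b[ch:h-ch, cw:w-cw]
    return np.sum(diff ** 2)

a = np.zeros((20, 20))
b = np.ones((20, 20))
# Cropping 3 px per side leaves a 14x14 interior; every pixel differs by 1.
print(ssd(a, b))  # → 196.0
```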
Displacement results for each image (the blue channel is the reference and is never displaced; the corresponding result images are omitted here):

Green (X: 2, Y: 5), Red (X: 3, Y: 12)
Green (X: 2, Y: -3), Red (X: 3, Y: 3)
Green (X: 1, Y: 3), Red (X: 0, Y: 8)
Green (X: 0, Y: 7), Red (X: -1, Y: 15)
Green (X: 24, Y: 48), Red (X: 41, Y: 105)
Green (X: 17, Y: 59), Red (X: 14, Y: 124)
Green (X: 17, Y: 41), Red (X: 22, Y: 90)
Green (X: 8, Y: 55), Red (X: 12, Y: 115)
Green (X: 29, Y: 78), Red (X: 37, Y: 174)
Green (X: 14, Y: 52), Red (X: 11, Y: 111)
Green (X: 6, Y: 42), Red (X: 32, Y: 85)
Green (X: 21, Y: 55), Red (X: 28, Y: 116)
Green (X: 12, Y: 64), Red (X: 22, Y: 134)
Green (X: 0, Y: 6), Red (X: 0, Y: 17)
Green (X: 0, Y: 3), Red (X: 0, Y: 12)
Green (X: 4, Y: 4), Red (X: 8, Y: 13)