Rajita Pujare
In this project, we were tasked with taking the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically producing a color image. The approach was to divide each image into three equal parts and align the second and third parts (G and R) to the first (B).
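As a rough illustration of the splitting step, here is a minimal sketch in Python with NumPy. The function name split_plate and the choice to drop leftover rows when the height is not divisible by three are my assumptions for this example, not details taken from the project code.

```python
import numpy as np

def split_plate(plate):
    """Split a stacked plate scan into three equal parts, top to bottom: B, G, R."""
    height = plate.shape[0] // 3  # assumption: leftover rows at the bottom are dropped
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return b, g, r

# Tiny synthetic "plate" with 7 rows, so one leftover row is discarded.
plate = np.arange(7 * 4, dtype=np.float64).reshape(7, 4)
b, g, r = split_plate(plate)
print(b.shape, g.shape, r.shape)
```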
For the smaller JPGs, I used the sum of squared differences (SSD) metric, minimizing it between two images to find the best alignment. I initially chose the blue color channel as my base, as it was given. To align the parts, I exhaustively searched over a window of possible displacements in the range [-15, 15] pixels, both vertically and horizontally. I tried both SSD and NCC (normalized cross-correlation) as metrics for the similarity between two channel images, and SSD ultimately gave better results, so the best displacements were found by minimizing SSD.
SSD = sum(sum((image1 - image2)^2))
NCC = dot_product(image1./||image1||, image2./||image2||)
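The two metrics can be written out directly in Python with NumPy. This is a sketch of the formulas as stated, not the project's actual code; the function names ssd and ncc are mine, and ncc assumes neither image is all zeros (otherwise the norm would be zero).

```python
import numpy as np

def ssd(image1, image2):
    """Sum of squared differences; lower means more similar."""
    diff = image1.astype(np.float64) - image2.astype(np.float64)
    return float(np.sum(diff ** 2))

def ncc(image1, image2):
    """Dot product of the two images after scaling each to unit norm; higher means more similar."""
    a = image1.astype(np.float64).ravel()
    b = image2.astype(np.float64).ravel()
    # Assumption: neither image is identically zero, so both norms are nonzero.
    return float(np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b)))

image = np.array([[1.0, 2.0], [3.0, 4.0]])
print(ssd(image, image + 1.0))  # each of the 4 pixels differs by 1
print(ncc(image, 2.0 * image))  # scaling does not change the normalized image
```

Note the opposite directions: SSD is minimized, while NCC is maximized.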
After the two channels were appropriately shifted, all three were stacked to produce the final image.
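The search-and-stack procedure described above could look roughly like the following Python/NumPy sketch. The names align_exhaustive and colorize are hypothetical, and using np.roll (a circular shift, where pixels wrap around the border) is my assumption about how the shift is applied; the writeup does not say how borders were handled.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    return float(np.sum((a - b) ** 2))

def align_exhaustive(channel, base, radius=15):
    """Return the (dy, dx) shift of `channel` that minimizes SSD against `base`."""
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Assumption: circular shift; pixels wrap around the image border.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, base)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def colorize(b, g, r, radius=15):
    """Align G and R to B, then stack the channels in R, G, B order."""
    g_shift = align_exhaustive(g, b, radius)
    r_shift = align_exhaustive(r, b, radius)
    g_aligned = np.roll(g, g_shift, axis=(0, 1))
    r_aligned = np.roll(r, r_shift, axis=(0, 1))
    return np.dstack([r_aligned, g_aligned, b]), g_shift, r_shift

# Synthetic check: G and R are shifted copies of B, so the search should undo the shifts.
rng = np.random.default_rng(0)
b = rng.random((40, 40))
g = np.roll(b, (-3, 5), axis=(0, 1))
r = np.roll(b, (4, -2), axis=(0, 1))
image, g_shift, r_shift = colorize(b, g, r, radius=6)
print(image.shape, g_shift, r_shift)
```

With a radius of 15 the search evaluates 31 x 31 = 961 candidate shifts per channel, which is cheap on small images.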
For the larger tif images, it was clear that the exhaustive search was not going to suffice.
It only examined displacements up to 15 px, and running exhaustive searches over a larger window
would take too long.
To address this, I used a coarse-to-fine image pyramid to speed up alignment of the large tif images.
The pyramid was built by repeatedly scaling down the image's resolution by a factor of 2
until the image was 1/16 of its original size,
where an exhaustive search was run to find the coarsest displacement. The image was then shifted, and at the next coarsest
resolution (1/8 of the original size) the search was run again
to adjust and update the calculated offset.
This was repeated at each finer level until the top level (full resolution) was reached, where the optimal displacement was returned.
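A compact recursive sketch of this coarse-to-fine scheme in Python/NumPy follows. Several details are my assumptions rather than facts from the writeup: the names search, downscale, and align_pyramid, the 2x2 block averaging used to halve the resolution, the circular shift via np.roll, and the small refinement radius at the finer levels. The one step any version needs is doubling the coarse offset when moving to the next finer level, since one pixel at half resolution corresponds to two pixels at the finer one.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    return float(np.sum((a - b) ** 2))

def search(channel, base, center, radius):
    """Exhaustive SSD search over shifts within `radius` of `center`."""
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = ssd(np.roll(channel, (dy, dx), axis=(0, 1)), base)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downscale(image):
    """Halve the resolution by averaging 2x2 blocks (assumed filter choice)."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, base, levels=4, coarse_radius=15, refine_radius=2):
    """Estimate the shift at 1/2**levels scale, then refine it at each finer level."""
    if levels == 0:
        return search(channel, base, (0, 0), coarse_radius)
    dy, dx = align_pyramid(downscale(channel), downscale(base),
                           levels - 1, coarse_radius, refine_radius)
    # One pixel at the coarser level is two pixels here, so double the estimate.
    return search(channel, base, (2 * dy, 2 * dx), refine_radius)

# Small synthetic check with a two-level pyramid.
rng = np.random.default_rng(0)
base = rng.random((64, 64))
channel = np.roll(base, (8, -12), axis=(0, 1))
print(align_pyramid(channel, base, levels=2, coarse_radius=4))
```

The default levels=4 corresponds to the 1/16 scale described above. Each finer level only checks a 5 x 5 neighborhood around the doubled estimate, so the total cost stays far below that of a single large exhaustive search at full resolution.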
As we can see above, the emir photo was not successfully aligned. This is likely because the color channels
have different brightness levels, which prevents the image matching metric from identifying similarity.
However, by using green as the base channel (and aligning the red and blue channels to it), we see a much
better result.
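Switching the base channel only changes which channel stays fixed. A minimal Python/NumPy sketch of that variant is shown here; the names best_shift and colorize_green_base are hypothetical, and the circular shift via np.roll is again an assumption.

```python
import numpy as np

def best_shift(channel, base, radius=15):
    """Exhaustive SSD search for the (dy, dx) shift that best matches `base`."""
    shifts = [(dy, dx)
              for dy in range(-radius, radius + 1)
              for dx in range(-radius, radius + 1)]
    return min(shifts,
               key=lambda s: np.sum((np.roll(channel, s, axis=(0, 1)) - base) ** 2))

def colorize_green_base(b, g, r, radius=15):
    """Keep G fixed, shift B and R onto it, then stack in R, G, B order."""
    b_aligned = np.roll(b, best_shift(b, g, radius), axis=(0, 1))
    r_aligned = np.roll(r, best_shift(r, g, radius), axis=(0, 1))
    return np.dstack([r_aligned, g, b_aligned])

# Synthetic check: B and R are shifted copies of G.
rng = np.random.default_rng(0)
g = rng.random((32, 32))
b = np.roll(g, (2, -3), axis=(0, 1))
r = np.roll(g, (-1, 4), axis=(0, 1))
print(colorize_green_base(b, g, r, radius=5).shape)
```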