For this project, we needed to colorize Sergei Mikhailovich Prokudin-Gorskii's glass plates from the Russian Empire.
For the lower-resolution images, I used an exhaustive search, as suggested in the project spec. A nested for loop tries every displacement between -15 and 15 pixels along both the x and y axes and keeps the one with the best score. The metric I used to align images was the Sum of Squared Differences (SSD). I aligned the green and red channels to the blue channel. This runs quickly and works well on the low-resolution images, and the results are very good.
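The exhaustive search described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function names `ssd` and `align_exhaustive` are hypothetical, and it assumes the displacement is applied with `np.roll`, so pixels wrap around the borders.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; a lower value means a closer match."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def align_exhaustive(channel, reference, max_shift=15):
    """Try every (dy, dx) in [-max_shift, max_shift] and keep the lowest SSD.

    Returns the shift to apply to `channel` (via np.roll) so that it
    lines up with `reference`.
    """
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With the blue channel as the reference, one would call `align_exhaustive(green, blue)` and `align_exhaustive(red, blue)`, then roll each channel by the returned shift before stacking them.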
The exhaustive search is not feasible for high-resolution images because the channel matrices are extremely large. For these I implemented a multiscale pyramid approach that downscales the images by a factor of 2 at each of 5 levels. It calculates the optimal displacement at the topmost (coarsest) level, scales it, and passes it to the next lower level. There we search over the range (optimal displacement - 1/scaling_factor, optimal displacement + 1/scaling_factor) to refine the displacement. We repeat this until we reach the bottommost level. To align the images, I used Normalized Cross Correlation (NCC) as the measure of how close two images are. It is simply the dot product of the two normalized images. I implemented this as a recursive function. Once the optimal offsets are found, I align the green and red channels to the blue channel. Finally, since rolling the images displaces them and wraps pixels around the edges, we need to crop the borders. I crop 10% of the image from each side.
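A rough sketch of the recursive pyramid search is given below. Several details are assumptions rather than the project's actual code: the names `ncc`, `downscale2`, `search` and `align_pyramid` are hypothetical, downscaling is done by averaging 2x2 blocks instead of a library resize, the images are mean-centered before normalizing, and the refinement window of ±2 pixels assumes a scaling factor of 0.5 (so that 1/scaling_factor = 2).

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation: dot product of the normalized images."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a = a - a.mean()  # assumption: subtract the mean before normalizing
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def downscale2(img):
    """Halve the resolution by averaging 2x2 blocks (stand-in for a resize)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def search(channel, reference, center, radius):
    """Best (dy, dx) within `radius` of `center`, scored by NCC (higher is better)."""
    best_score, best_shift = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, reference, levels=5, coarse_radius=15, refine_radius=2):
    """Recursively estimate the shift that aligns `channel` to `reference`."""
    if levels <= 1:
        # Coarsest level: full search around zero displacement.
        return search(channel, reference, (0, 0), coarse_radius)
    coarse = align_pyramid(downscale2(channel), downscale2(reference),
                           levels - 1, coarse_radius, refine_radius)
    # Scale the coarse estimate up by 2 and refine it in a small window.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(channel, reference, center, refine_radius)
```

Because only the coarsest level runs a wide search, each finer level evaluates just 25 candidate shifts. The coarsest image should stay comfortably larger than the coarse search window, otherwise wrapped shifts become indistinguishable.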
Initially, I had a lot of trouble understanding Normalized Cross Correlation. Eventually, I stumbled upon this, which helped me understand cross correlation and implement the code for it.
Next, I tried Canny edge detection, using the implementation in the OpenCV library. The problem I ran into here was choosing the low and high thresholds for deciding edges. I found Otsu's method (here) to help calculate the thresholds. I found the edges for each of the RGB channels and then the optimal displacement between the edge maps. Then, I rolled the original RGB channels by those displacements.
I tried to implement three different approaches.
1) Canny Edge Detection: To improve the quality of the images, I used a different feature to align them. Instead of aligning the original RGB channels directly, I first detected the edges in each channel and then aligned the edge maps. This significantly improved the quality of the Emir image.
2) Automatic Contrasting: I also implemented automatic contrasting to make the images look better. I used the rescale intensity function with an out range of the 5th and 95th percentiles.
3) Automatic Cropping: My code also automatically crops 10% from each side of the image.
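The contrast and cropping steps in the list above can be approximated with plain NumPy, as in the sketch below. It is a stand-in, not the project's code: the write-up refers to a rescale intensity function (most likely from an image library), whereas the hypothetical `auto_contrast` here stretches the 5th to 95th percentile range to [0, 1] and clips everything outside it. `crop_borders` is likewise an illustrative name.

```python
import numpy as np

def auto_contrast(img, low_pct=5, high_pct=95):
    """Stretch the [low_pct, high_pct] percentile range to [0, 1], clipping the rest."""
    lo, hi = np.percentile(img, (low_pct, high_pct))
    if hi <= lo:
        # Flat image: nothing to stretch.
        return img.astype(np.float64)
    out = (img.astype(np.float64) - lo) / (hi - lo)
    return np.clip(out, 0.0, 1.0)

def crop_borders(img, fraction=0.10):
    """Remove `fraction` of the height and width from each side."""
    h, w = img.shape[:2]
    dy, dx = int(h * fraction), int(w * fraction)
    return img[dy:h - dy, dx:w - dx]
```

Cropping before contrast stretching keeps the wrapped border pixels from influencing the percentiles, although either order works with these helpers.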