Alignment and Color Pyramids
For the naive solution to align the different colored glass plates, I essentially brute forced different [x, y] offsets between the plates and tried to measure how well they aligned. I used the default blue slide as the default image, and tried fitting the green and red images on top of it. After recording the ideal offsets for the green and red slides, I would use them to move the appropriate RGB color channels. The two different metrics to measure image alignment that I tried were Sum of Squared Differences (SDD) and Normalized Cross Correlation (NCC).
While SSD measures the differences between the pixel values themselves, NCC treats them as vectors and measures difference in terms of the angle separating the vectors. Through trying both examples, I found the SSD worked better.
For smaller images, I was able to use this brute force solution. For larger images, simply brute forcing our way and trying large sets of offsets was simply too slow. For this, I implemented an image pyramid which would recursively scale down the size of the image by 2 until it reached a minimum size (in my case under 500 pixels). Then, I would brute force an offset range of [-20, 20] pixels on this smallest image and find the best offset. Once we have the offset at the small scale and as the recursion undos and the images become bigger, I scale up the offset as well and perform a smaller localized search around the predicted offset [-2, +2] to finetune my results. This resulted in significant speedup, since it scaled the runtime by log(n) where n was the original runtime.
Example Images
As you can see the alignment only works in most situations. In emir.tif
in particular, it does quite poorly. In the next section, I will go over certain fixes I used and the end results.
To fix aligment issues, I implemented a few fixes that were not mentioned in the specifications.
Normalizing Color Vectors
Here, I took each color slide as a matrix and performed an element-wise operation to normalize the values by subtracting the mean value of the matrix and dividing by the standard deviation. I did this because in some images, the color ranges are not fully identical and thus not comparable. Converting them to standard units proved to be helpful.
Using Green as Base
The starter code used blue as the base image on which we were to align the red and blue images. However, through experimentation I found that using green tended to work better. Specifically for
emir.tif
, it seems like different parts of the image have different color distributions and matching pixel color values would not work. However, green seemed to be the color whose intensity was typically between the blue and red colors, so it worked as a good base.
Cropping Image
The final trick I used was to only focus the matching near the center of the images. I cropped the images so that the edges were only 65% of their original lengths. Many of the images provided had artifacts at their edges, so removing them and focusing on the main subject of the image (which tend to be near the center) was really helpful.
Using Edge Detection
While I did not try using edge detection, I believe it would have been really useful in handling most of the problems faced when colors themselves had different brightnesses. I did end up using edge detection for autocropping, which I cover in the "Bells and Whistles" section.
Comparing Before and After Fixes
Hover to see the before images