In this project we aim to use Sergey Prokudin-Gorsky's work to recreate color photos from the early 20th century. We do this by taking advantage of his ingenuity: he left us with glass-plate photographs where, for a given scene, he took three separate single-channel exposures through a Blue, Red, and Green filter. With this information we have enough to reconstruct a color depiction of the scene by overlaying the three images.
The problem? The images are not laid out in perfect alignment, and naively splitting the glass plates into three and directly overlaying them produces terrible color images. So our task is to write an algorithm that, given the raw pixel values for each channel, determines the best offsets such that the final color reconstruction is aligned.
The general idea behind my approaches is to independently align two channels onto the remaining channel (e.g. red and green onto blue). I find the offset for each by making a pass over some range of offsets and, for each candidate, computing an image-matching metric (where higher is better). I then simply return the offset that produced the best score.
For smaller, low-resolution images, this can be achieved by simply iterating over some reasonable range of pixels ([-15, 15] in my case). This is computationally too expensive on larger images, so I've implemented a faster search with an image pyramid, where offsets are computed recursively on scaled-down versions of the image. This worked reasonably well for most images.
Exhaustively search over the range of [-15, 15] pixels and return the offset that maximizes the normalized cross-correlation (NCC), which is the dot product between the two normalized images.
An offset (x, y) is defined as shifting the image x rows down and y columns to the right.
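The exhaustive search described above can be sketched as follows. This is a minimal illustration, not the exact implementation; the function and parameter names (`ncc`, `align_exhaustive`, `radius`) are my own, and `np.roll` is used for shifting, which wraps pixels around the border:

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation: the dot product of the two
    zero-mean, unit-norm flattened images (higher is better)."""
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b))

def align_exhaustive(moving, fixed, radius=15):
    """Try every offset in [-radius, radius] on both axes; offset
    (x, y) shifts the moving image x rows down, y columns right."""
    best_offset, best_score = (0, 0), -np.inf
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            shifted = np.roll(moving, (dx, dy), axis=(0, 1))
            score = ncc(shifted, fixed)
            if score > best_score:
                best_score, best_offset = score, (dx, dy)
    return best_offset
```

For a real plate, cropping the borders before scoring (rather than letting `np.roll` wrap them) tends to give cleaner metric values.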
With high-resolution images, instead of searching over a large range directly, I downsize the image and search over a range on the downsized version to find a coarse offset, then use that offset as the center of a smaller search range when computing image-matching metrics on the original image. This process can of course be repeated multiple times, forming an image pyramid.
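A recursive sketch of this pyramid, under my own assumptions (2x downsampling by striding, a fixed base-case size, and a small refinement radius; `ncc` and `search` are hypothetical helper names):

```python
import numpy as np

def ncc(a, b):
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b))

def search(moving, fixed, center, radius):
    """Best NCC offset within `radius` of `center`."""
    cx, cy = center
    best, best_score = center, -np.inf
    for dx in range(cx - radius, cx + radius + 1):
        for dy in range(cy - radius, cy + radius + 1):
            s = ncc(np.roll(moving, (dx, dy), axis=(0, 1)), fixed)
            if s > best_score:
                best_score, best = s, (dx, dy)
    return best

def align_pyramid(moving, fixed, radius=2, min_size=32):
    """Recurse on 2x-downsampled copies; double the coarse offset
    and refine it with a small search at the current resolution."""
    if min(moving.shape) <= min_size:
        return search(moving, fixed, (0, 0), 15)  # exhaustive base case
    cx, cy = align_pyramid(moving[::2, ::2], fixed[::2, ::2], radius, min_size)
    return search(moving, fixed, (2 * cx, 2 * cy), radius)
```

Each level only searches a few offsets around the scaled-up coarse estimate, so the total work stays small even for very large plates.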
Initially I tried this by overlaying Red and Green onto Blue, but I found that for 'emir', this actually worked quite poorly. Based on the glass plates, I think the reason is that in terms of pixel values, Blue and Red are at opposite ends (especially for the clothes, where one is dark while the other is bright). Meanwhile, Green sits somewhere between the two in brightness, so taking the dot product of the other colors against Green is the best compromise.
After some experimentation I found that overlaying Red and Blue onto Green actually worked quite well, even with the standard image pyramid and NCC approach!
Instead of comparing images based on their pixel intensities, I ran a Canny edge detector on each before feeding them into the algorithm. In theory this allows images from different color channels to be compared more consistently. The results were indeed much better than the original, but simply switching to overlaying Red and Blue onto Green worked better:
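To illustrate the idea of matching on edges rather than intensities, here is a dependency-free sketch using a Sobel gradient-magnitude map. This is a simpler stand-in for the Canny detector used in the writeup (in practice one would likely call something like `skimage.feature.canny`); the edge maps would then be passed to the alignment search in place of the raw channels:

```python
import numpy as np

def sobel_edges(img):
    """Gradient-magnitude edge map: a simple stand-in for Canny.
    Correlates the image with the Sobel kernels and returns the
    per-pixel gradient magnitude."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    pad = np.pad(img.astype(float), 1, mode="edge")
    gx = np.zeros(img.shape, dtype=float)
    gy = np.zeros(img.shape, dtype=float)
    for i in range(3):
        for j in range(3):
            window = pad[i:i + img.shape[0], j:j + img.shape[1]]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.hypot(gx, gy)
```

Edge maps are closer to channel-invariant than raw brightness: a boundary between the emir's robe and the background shows up in all three channels even though the intensities on either side differ wildly between them.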
I also implemented automatic white balance with a white-world assumption, stretching the minimum and maximum pixel values to 0 and 255. To do this I wanted the relative position of a value between the minimum and maximum to stay the same, and settled on the formula b = (a - low) / (high - low). The results were very subtle, since the maximum was almost always already 255 and the minimum was usually already very close to 0. Part of this is due to the extreme colors at the borders, so I cropped the outer 15% of each side. Here, it's clear that the clothes are darker:
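The stretch above can be sketched as follows, on one reading of the crop: `low` and `high` are measured on the inner region only (ignoring the outer 15% of each side), and the formula is applied to the whole channel. Values are kept in [0, 1] here rather than [0, 255]; the names `stretch_contrast` and `crop_frac` are my own:

```python
import numpy as np

def stretch_contrast(channel, crop_frac=0.15):
    """Linearly map values so the min/max measured on the inner
    region land on 0 and 1, via b = (a - low) / (high - low).
    Values outside [low, high] (from the cropped border) are clipped."""
    h, w = channel.shape
    ch, cw = int(h * crop_frac), int(w * crop_frac)
    inner = channel[ch:h - ch, cw:w - cw]
    low, high = inner.min(), inner.max()
    return np.clip((channel - low) / (high - low), 0.0, 1.0)
```

Measuring `low` and `high` on the cropped interior is what makes the stretch do anything at all: the plate borders almost always contain pure black and pure white, which would otherwise pin the minimum and maximum in place.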