Russia, 1907. Sergei Mikhailovich Prokudin-Gorskii travels all over Russia, taking photographs through red, green, and blue filters onto glass plates. In this project, I align the resulting black-and-white negatives to produce color photographs, realizing Prokudin-Gorskii's vision.
To run the code, run: python main.py
To run with edge-based alignment for the bells & whistles portion, run: python main.py --edges
For the smaller .jpg images, I used a single-scale alignment process: exhaustively search every candidate offset within a window of 10% of the channel height, score each candidate with normalized cross-correlation (NCC), and keep the offset with the highest score.
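The single-scale search can be sketched as below. This is a minimal illustration, not the project's actual code: the circular shift via np.roll and the default 10%-of-height radius are my assumptions about details not spelled out here.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation: zero-mean dot product divided by the norms."""
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b))

def align_single_scale(channel, base, radius=None):
    """Exhaustively try every shift within +/-radius pixels (default: 10% of the
    channel height) and return the (dy, dx) that maximizes NCC against `base`."""
    if radius is None:
        radius = int(0.10 * channel.shape[0])
    best_score, best_off = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around the border; a real implementation
            # might instead crop the borders before scoring.
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), base)
            if score > best_score:
                best_score, best_off = score, (dy, dx)
    return best_off
```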
For the larger .tif files, searching over this offset range would take far too long (my runs never finished before I got impatient and stopped them). Instead, I implemented a multi-scale recursive search. As before, I set the initial offset range to 10% of the channel height, but I repeatedly halved the image and this offset range, stopping once another halving would drop the image height below 128 pixels. At this coarsest level I found the best offset (maximizing NCC, as in the single-scale case); call it (x_offset, y_offset). When the function returned one level up the image pyramid, the new search range was 2*x_offset +/- (x_offset/4) by 2*y_offset +/- (y_offset/4). This drastically reduced the runtime, since the search space of offsets at each level was so much smaller: the average runtime across all the images was 26.4 seconds.
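The coarse-to-fine recursion described above could look roughly like this (a sketch, not the project's code; the NCC helper repeats the single-scale definition, and the subsampling, wrap-around shifting, and minimum refinement radius of 1 pixel are my assumptions):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equal-size channels."""
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b))

def search(channel, base, dys, dxs):
    """Return the (dy, dx) in dys x dxs whose shift of `channel` maximizes NCC."""
    scores = {(dy, dx): ncc(np.roll(channel, (dy, dx), axis=(0, 1)), base)
              for dy in dys for dx in dxs}
    return max(scores, key=scores.get)

def align_pyramid(channel, base):
    """Coarse-to-fine alignment: recurse on 2x-downsampled copies, then refine
    around the doubled offset with a +/-offset/4 window."""
    if channel.shape[0] < 256:  # another halving would drop the height below 128
        r = int(0.10 * channel.shape[0])  # exhaustive search at the coarsest level
        return search(channel, base, range(-r, r + 1), range(-r, r + 1))
    dy, dx = align_pyramid(channel[::2, ::2], base[::2, ::2])
    dy, dx = 2 * dy, 2 * dx
    ry, rx = max(1, abs(dy) // 4), max(1, abs(dx) // 4)
    return search(channel, base, range(dy - ry, dy + ry + 1),
                  range(dx - rx, dx + rx + 1))
```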
Originally I searched over refinement ranges of +/- offset/2, but even this window gave long runtimes (consistently over a minute) for the large images. Because the selected offsets were generally close to 2*offset anyway, I narrowed the search window to +/- offset/4.
The multi-scale algorithm clearly failed to align the Emir image. This might be because the three negatives have different brightnesses and contrasts, which makes NCC a poor metric here, since it directly compares raw pixel values. The image of the harvesters also has visual artifacts: there is a visible abrasion on the left-hand side of its negatives, which may have thrown off the NCC, again because it directly compares pixel values.
I downloaded the .tif files of the three separated negatives from this collection. Here are the results:
To find better offsets, I first used Canny edge detection to convert each channel from raw pixels into an image of its edges. An example of these edges is shown below, on the three channels of the Emir image. The rest of the implementation was unchanged: I used the same initial offset range (10% of the image height) and still used NCC to find the best offset. The results look much cleaner, as shown below.
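The pre-processing step amounts to a one-line transform, sketched here with scikit-image's Canny detector; the sigma value is my guess at a reasonable smoothing parameter, not necessarily what this project used. The alignment search itself is untouched; it simply scores edge maps instead of raw intensities.

```python
import numpy as np
from skimage import feature

def to_edges(channel, sigma=2.0):
    """Replace raw intensities with a binary Canny edge map, cast to float so
    the same NCC scoring can be reused unchanged."""
    return feature.canny(channel, sigma=sigma).astype(float)
```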
The left images show the original aligned results (from the initial single/multi-scale implementation), and the right images are the ones produced when pre-processing the channels into their edges. The single-scale runs found exactly the same offsets with and without edges, so I do not show them here. I've resized the .tif outputs to (512, 512) just for this webpage, to make them fit side-by-side; the script still produces the full-size aligned images. The average runtime for this section across all the images was 20.7 seconds, faster than the original implementation, since the multi-scale offset windows happened to be smaller with this pre-processing.