Bringing life to images from the Russian Empire! Submission by Harish Palani.
In the early 1900s, Sergei Mikhailovich Prokudin-Gorskii was granted permission to undertake a now-historic project: photograph scenes from Russian life in full color so that later generations could visualize that time in history.
While the technology to render his photographs in color didn't exist back then, Prokudin-Gorskii's RGB-filtered exposures miraculously survived a revolution and multiple wars. They were purchased by the Library of Congress in 1948 and have since been digitized and made available for public access.
With these negatives in hand, we can relive Prokudin-Gorskii's journey in full color!
For the low-resolution images in the set, I implemented a naive alignment algorithm that exhaustively searches a pre-defined window of candidate displacements (in my case, -15 to 15 pixels) to find the shift values that best align the red and green plates with the blue plate.
Both the sum of squared differences (SSD) and normalized cross-correlation (NCC) loss functions performed admirably, producing similar displacements for all three .jpg files when tested on cropped input images.
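The exhaustive search described above can be sketched roughly as follows. This is a minimal illustration rather than the code used for this submission: the function names, the 10% border crop, and the (dy, dx) row/column shift convention are all assumptions made for the example.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means a better match."""
    return float(np.sum((a - b) ** 2))

def ncc(a, b):
    """Normalized cross-correlation of mean-centered images; higher is better."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def crop_interior(img, frac=0.1):
    """Drop a border of `frac` on each side (assumed value) so plate edges
    and wrapped-around pixels do not influence the score."""
    h, w = img.shape
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def align_exhaustive(plate, reference, window=15, metric="ncc"):
    """Try every (dy, dx) in [-window, window]^2 and return the shift that,
    applied to `plate` with np.roll, best matches `reference`."""
    ref = crop_interior(reference)
    best_shift, best_score = (0, 0), None
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = crop_interior(np.roll(plate, (dy, dx), axis=(0, 1)))
            # Negate SSD so that "larger is better" holds for both metrics.
            score = -ssd(shifted, ref) if metric == "ssd" else ncc(shifted, ref)
            if best_score is None or score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With a 31 x 31 window this evaluates 961 candidate shifts per plate, which is cheap for small images but scales poorly with resolution.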
Though the brute-force approach worked well for low-resolution inputs, most of the dataset consists of high-resolution images for which an exhaustive search would be far too expensive. To address this, I implemented an image pyramid that represents the inputs at multiple scales, finding the optimal displacements iteratively from coarse to fine.
This operation proved far more efficient and yielded strong results, outputting aligned color images on par with the naive algorithm using the normalized cross-correlation loss function on cropped inputs.
Red: (66, -801)
Green: (49, 24)
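A coarse-to-fine search of this kind might look like the sketch below. It is an illustration under stated assumptions, not the submission's implementation: the 2x2 block-average downsampling, the stopping size of 64 pixels, the search radii, and all function names are hypothetical, and border cropping is omitted for brevity.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of mean-centered images; higher is better."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (a simple stand-in
    for blurring followed by subsampling)."""
    h, w = img.shape
    h2, w2 = h - h % 2, w - w % 2
    return img[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3))

def search(plate, ref, center, radius):
    """Exhaustively score shifts within `radius` of `center` using NCC."""
    cy, cx = center
    best, best_score = center, -np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            score = ncc(np.roll(plate, (dy, dx), axis=(0, 1)), ref)
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best

def align_pyramid(plate, ref, min_size=64, coarse_radius=15, refine_radius=2):
    """Recursively align at half resolution, then refine at this level."""
    if min(plate.shape) <= min_size:
        # Coarsest level: a full window search is affordable here.
        return search(plate, ref, (0, 0), coarse_radius)
    coarse = align_pyramid(downsample(plate), downsample(ref),
                           min_size, coarse_radius, refine_radius)
    # A shift of k pixels at half resolution corresponds to about 2k here,
    # so only a small neighborhood around that estimate needs checking.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(plate, ref, center, refine_radius)
```

The design choice is that the wide window is only searched on the smallest image; every finer level checks just (2 * refine_radius + 1)^2 candidates, so the cost at full resolution stays small even when the true displacement is large.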
Though this algorithm performed well for the majority of high-resolution images, the Emir of Bukhara proved particularly difficult due to inconsistent brightness values among the three glass plate images.
This can be addressed by changing the alignment method, moving away from scoring raw pixel intensities with the SSD and NCC loss functions and toward an edge-based scoring metric. Using an algorithm like Canny edge detection to generate features will likely yield better results for this image by reducing dependence on absolute brightness values.
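To illustrate the idea, the sketch below aligns on edge maps instead of raw intensities. Note that it swaps in a plain gradient-magnitude edge map (via np.gradient) rather than Canny, purely to keep the example dependency-free; the function names and window size are assumptions, and this was not run on the Emir image.

```python
import numpy as np

def edge_map(img):
    """Gradient magnitude: a crude edge detector used here in place of Canny.
    It responds to intensity changes, not to absolute brightness."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gy, gx)

def ncc(a, b):
    """Normalized cross-correlation of mean-centered images; higher is better."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def align_on_edges(plate, reference, window=15):
    """Exhaustive search as before, but scored on edge maps of both plates."""
    plate_e, ref_e = edge_map(plate), edge_map(reference)
    best_shift, best_score = (0, 0), -np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            score = ncc(np.roll(plate_e, (dy, dx), axis=(0, 1)), ref_e)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Because the edge map depends only on local intensity differences, a plate whose brightness is offset or even inverted relative to the reference produces the same edges, which is the property that should help when the three plates disagree on brightness.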
When considering bells & whistles to add, I wanted to build on my deep learning background and put recent image processing research into practice. Since a number of the low-resolution .jpg images were quite blurry, I was particularly drawn to Zhang et al. (2018), which details a novel approach to image super-resolution that could be applied to enhance these outputs. I used the pre-trained residual dense networks (RDNs) available via the ISR library, with the noise-cancel model weights performing best.
As shown below, the results were quite strong, with image clarity improving significantly across all inputs. The cathedral was perhaps the weakest of the three, at times looking more like a painting than a photograph due to the density of trees and grass. Since these effects are largely confined to a particular family of colors and textures, however, task-specific fine-tuning of what are currently off-the-shelf model weights would likely yield major improvements.
isr_tobolsk.jpg | scaled up via ISR with noise-cancel weights
tobolsk.jpg | aligned via exhaustive search with NCC loss
isr_monastery.jpg | scaled up via ISR with noise-cancel weights
monastery.jpg | aligned via exhaustive search with NCC loss
isr_cathedral.jpg | scaled up via ISR with noise-cancel weights
cathedral.jpg | aligned via exhaustive search with NCC loss