Sergei Mikhailovich Prokudin-Gorskii (1863-1944), a pioneer of color photography, traveled across the Russian Empire and captured color photographs of his surroundings. He was a man ahead of his time who envisioned a future of color photography. He captured each scene by recording three exposures onto a glass plate through red, green, and blue filters. His RGB glass plate negatives survived; the Library of Congress (LoC) purchased them and recently made the digitized versions available online.
The goal of this project is to use image processing techniques to automatically produce a color image from the digitized glass plate images captured by Prokudin-Gorskii.
The digitized glass plate images provided include both high- and low-resolution images. See below for examples.
The digitized glass plate images were colorized by first splitting each image into three color channels, Red (R), Green (G), and Blue (B), then aligning the channels and stacking them to form a single RGB color image. The B channel was used as the base channel to which the R and G channels were aligned. The choice of the B channel as the base is investigated later. This investigation was inspired by Saurav Mittal’s Fall 2020 project submission, in which Saurav experimented with different base channels and obtained interesting results.
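The split-and-stack step can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes each digitized plate is a single grayscale array with the three exposures stacked vertically in B, G, R order from top to bottom, and the function names `split_plate` and `stack_rgb` are hypothetical.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate into (b, g, r) channels.

    Assumption: the exposures are stacked top to bottom as B, G, R,
    each occupying one third of the plate height.
    """
    height = plate.shape[0] // 3
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return b, g, r

def stack_rgb(r, g, b):
    """Stack three equally sized channels into an (H, W, 3) RGB image."""
    return np.dstack([r, g, b])
```

Any leftover rows (when the plate height is not divisible by 3) are simply dropped in this sketch.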
Alignment of the images can be achieved in many ways. The exhaustive search approach was used to align low-res images and the image pyramid approach was used on hi-res images.
To align the RGB channels, the following matching metrics were considered:
Sum of Squared Differences (SSD), i.e. the squared \(L_2\) norm of the difference: \[ SSD = \sum_{i=1}^{n}\sum_{j=1}^{m} (Image_1(i,j) - Image_2(i,j))^2 \]
Normalized Cross-Correlation (NCC): \[
NCC = \frac{Image_{1}}{||Image_{1}||} \cdot \frac{Image_{2}}{||Image_{2}||}
\]
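Both metrics translate directly into NumPy. The sketch below follows the two formulas as written (the NCC here does not subtract the mean); the function names `ssd` and `ncc` are illustrative, and the images are assumed to have the same shape and a non-zero norm.

```python
import numpy as np

def ssd(image_1, image_2):
    """Sum of squared differences: lower means a better match."""
    a = image_1.astype(np.float64)
    b = image_2.astype(np.float64)
    return np.sum((a - b) ** 2)

def ncc(image_1, image_2):
    """Normalized cross-correlation: higher means a better match.

    Each image is flattened and divided by its norm, then the dot
    product of the two unit vectors is taken.
    """
    a = image_1.astype(np.float64).ravel()
    b = image_2.astype(np.float64).ravel()
    return np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b))
```

Note the opposite conventions: an alignment search minimizes SSD but maximizes NCC.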
SSD appeared to align the images better than NCC, and the program ran faster with SSD. Hence SSD is the matching metric used in this project to align the images.
Exhaustive search is a brute-force method for aligning images. After the image is split into RGB channels, the search runs over a window of possible displacements, here [-20,20] pixels along each axis. Each displacement is scored using the matching metric, and the best displacement vector (x,y) is the one that yields the lowest SSD score. That displacement vector is then used to shift the channel so that it aligns with the base channel.
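A minimal sketch of this search is shown below. It assumes shifts are applied with `np.roll`, which wraps pixels around the image edges (the original implementation may handle edges differently), and the function name `align_exhaustive` and its `max_shift` parameter are hypothetical.

```python
import numpy as np

def align_exhaustive(channel, base, max_shift=20):
    """Return the (dx, dy) shift of `channel` that minimizes SSD against `base`.

    Every displacement in [-max_shift, max_shift] on both axes is tried.
    Assumption: np.roll is used for shifting, so pixels wrap around.
    """
    channel = channel.astype(np.float64)
    base = base.astype(np.float64)
    best_score = np.inf
    best_shift = (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - base) ** 2)  # SSD
            if score < best_score:
                best_score = score
                best_shift = (dx, dy)
    return best_shift
```

The aligned channel is then `np.roll(channel, (dy, dx), axis=(0, 1))`. With the default window this evaluates 41 x 41 = 1681 candidate shifts, which is cheap on small images but grows costly on large ones.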
The images below were aligned using the Blue channel as the base channel.
Running an exhaustive search on the hi-res images took a long time to complete. The image pyramid approach is therefore recommended for hi-res images, because it aligns the channels efficiently by working from the coarsest scale of the image up to the finest.
The image pyramid approach repeatedly rescales the channels by a factor of 2 to build a pyramid of scales. Starting at the smallest (coarsest) level, the best displacement vector (x,y) for alignment is identified at each level by minimizing the matching metric. The displacement found at each level is carried over to the next, finer level, and the accumulated displacement is finally applied to shift the original full-resolution channel.
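One way to sketch the coarse-to-fine idea is the recursion below. It is an illustration under stated assumptions rather than the project's implementation: downscaling is done by averaging 2x2 blocks, shifts use `np.roll` (which wraps around), and the names `downscale2`, `best_shift`, `align_pyramid`, `min_size`, `coarse_radius`, and `refine_radius` are all hypothetical.

```python
import numpy as np

def downscale2(img):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def best_shift(channel, base, center, radius):
    """Search shifts within `radius` of `center` (dx, dy); return the SSD minimizer."""
    cx, cy = center
    best, best_score = center, np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - base) ** 2)
            if score < best_score:
                best_score, best = score, (dx, dy)
    return best

def align_pyramid(channel, base, min_size=64, coarse_radius=20, refine_radius=2):
    """Return the (dx, dy) shift aligning `channel` to `base`, coarse to fine."""
    channel = channel.astype(np.float64)
    base = base.astype(np.float64)
    if min(base.shape) <= min_size:
        # Coarsest level: full search around zero displacement.
        return best_shift(channel, base, (0, 0), coarse_radius)
    # Solve the half-resolution problem first.
    dx, dy = align_pyramid(downscale2(channel), downscale2(base),
                           min_size, coarse_radius, refine_radius)
    # Double the coarse estimate and refine it with a small local search.
    return best_shift(channel, base, (2 * dx, 2 * dy), refine_radius)
```

Because each finer level only searches a few pixels around the doubled coarse estimate, the total work stays far below that of a single wide search at full resolution.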
There are many ways that the images can be visually improved. Two were considered here: base channel selection and cropping. These adjustments were implemented from scratch in Python.
As mentioned, this adjustment was inspired by Saurav Mittal’s project submission, which tested the appearance of the images for different base channels. From the images below, aligning the Red and Blue channels to the Green channel improved the appearance of some of the images. The monastery image below shows that using the Green channel as the base channel drastically improved the image’s appearance.
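Switching the base channel only changes which channel stays fixed while the other two are shifted. The sketch below makes the base a parameter; the function name `colorize`, the dictionary layout, and the `align` callback (any function returning a `(dx, dy)` shift) are assumptions made for illustration, and shifting again uses `np.roll`, which wraps around.

```python
import numpy as np

def colorize(channels, base="g", align=None):
    """Align the two non-base channels to the base and stack them as RGB.

    channels: dict with keys 'r', 'g', 'b' holding equally sized 2-D arrays.
    align:    hypothetical callback align(channel, reference) -> (dx, dy).
    """
    if align is None:
        # Placeholder that applies no shift; plug in a real search here.
        align = lambda channel, reference: (0, 0)
    aligned = {}
    for name, channel in channels.items():
        if name == base:
            aligned[name] = channel  # the base channel stays fixed
        else:
            dx, dy = align(channel, channels[base])
            aligned[name] = np.roll(channel, (dy, dx), axis=(0, 1))
    return np.dstack([aligned["r"], aligned["g"], aligned["b"]])
```

Calling it with `base="b"` and then `base="g"` on the same plate gives a direct side-by-side comparison of the two choices.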
The black and white borders around the images make them less appealing. As seen below, removing the borders improves both the alignment and the appearance of the images. 20% of the border was cropped from the RGB channels before alignment, and again from the final image after alignment.
The Emir image below further supports selecting the Green channel as the base for alignment and cropping the channel images before and after alignment. It demonstrates the effect of cropping the borders for different base channels. Using the Green base channel dramatically improved the image’s appearance.
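A fixed-fraction crop can be sketched in a few lines. How the 20% figure is measured is not specified above, so this sketch assumes the fraction is trimmed from each side of the image; the function name `crop_border` and its `fraction` parameter are hypothetical.

```python
import numpy as np

def crop_border(img, fraction=0.2):
    """Trim `fraction` of the height and width from each side.

    Assumption: the fraction applies per side. Works for 2-D channels
    and for (H, W, 3) color images.
    """
    h, w = img.shape[:2]
    dy, dx = int(h * fraction), int(w * fraction)
    return img[dy:h - dy, dx:w - dx]
```

Applying the same crop to each channel before alignment keeps the border artifacts out of the matching metric, and applying it to the stacked result removes the fringes left by shifting.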
These are additional photos I ran through the multiscale algorithm.
Aligning images using the Green base channel drastically improved their appearance, to the point where some of the images need only minimal visual adjustment. Cropping the channels before and after alignment, for both low- and high-resolution images, also improved their appearance. Some of the images have blemishes that could be fixed by denoising, and some are dull and flat; white balancing and contrast adjustment would help make their colors more realistic and vivid. As a challenge for myself, I will look into creating these image adjustments from scratch.