The Sergei Mikhailovich Prokudin-Gorskii Collection features color photographic surveys of the vast Russian Empire made between ca. 1905 and 1915. Frequent subjects among the 2,607 distinct images include people, religious architecture, historic sites, industry and agriculture, public works construction, scenes along water and railway transportation routes, and views of villages and cities. An active photographer and scientist, Prokudin-Gorskii (1863-1944) undertook most of his ambitious color documentary project from 1909 to 1915. The Library of Congress purchased the collection from the photographer's sons in 1948. (https://www.loc.gov/pictures/collection/prok/)

Introduction to the Task:

The goal of this assignment is to digitize the images taken by Prokudin-Gorskii. To do this, I use image processing techniques to combine the three color-channel images into a single, well-aligned RGB image.

Implementations:

The core idea of this project is to search for the displacement that minimizes the difference between color channels. For large images, an exhaustive search becomes prohibitively expensive. A much faster choice is an image pyramid. We first determine the number of pyramid levels from the size of the image and rescale the image for each level. Starting at the coarsest level, we compare candidate alignments within a predefined search range and pass the best displacement on to the next, finer level. The displacement found at the last, full-resolution level is the result we need. The difference between channels is measured with either the Sum of Squared Differences (SSD, the squared L2 norm), which is minimized, or Normalized Cross-Correlation (NCC), which is maximized.
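The coarse-to-fine search described above can be sketched roughly as follows. This is a minimal illustration in NumPy, not the exact code used for this project: the function names, the 2x2 box-filter downsampling, the 10% border crop, and the default radii and minimum size are all assumptions chosen for the example.

```python
import numpy as np

def ssd(a, b):
    """Sum of Squared Differences: lower means more similar."""
    return float(np.sum((a - b) ** 2))

def ncc_loss(a, b):
    """Negated Normalized Cross-Correlation, so that lower is still better."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return -float(np.sum(a * b) / denom) if denom > 0 else 0.0

def crop(img, frac=0.1):
    """Keep only the interior so borders and np.roll wrap-around are ignored."""
    my, mx = int(img.shape[0] * frac), int(img.shape[1] * frac)
    return img[my:img.shape[0] - my, mx:img.shape[1] - mx]

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (an assumed, simple filter)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, reference, center, radius, loss):
    """Try every shift within `radius` of `center`; return the best (dy, dx)."""
    best_score, best_shift = np.inf, center
    ref = crop(reference)
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = loss(crop(shifted), ref)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def pyramid_align(channel, reference, loss=ncc_loss, min_size=64,
                  coarse_radius=8, fine_radius=2):
    """Return the (dy, dx) shift that aligns `channel` to `reference`."""
    # Coarsest level: the image is small, so a wider search is affordable.
    if min(channel.shape) <= min_size:
        return search(channel, reference, (0, 0), coarse_radius, loss)
    # Solve the half-resolution problem first, then refine around 2x its answer.
    dy, dx = pyramid_align(downsample(channel), downsample(reference),
                           loss, min_size, coarse_radius, fine_radius)
    return search(channel, reference, (2 * dy, 2 * dx), fine_radius, loss)
```

In this sketch the number of levels is not computed up front; the recursion simply halves the image until its shorter side is at most `min_size`, which ties the depth to the image size. The returned shift would be applied with `np.roll(channel, shift, axis=(0, 1))` before stacking the channels.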

Bells & Whistles:

Instead of aligning on raw channel intensities, we can also align on edges and gradients. One technique is the Canny edge detector, an edge detection operator that uses a multi-stage algorithm to detect a wide range of edges in images. Before computing the difference between color channels at each pyramid level, we apply Canny, which keeps only the edges above a certain threshold and removes most of the noise. This technique improves the alignment in most cases:
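To illustrate why edge features help, here is a small sketch that uses a thresholded gradient magnitude as a simplified stand-in for Canny (it omits Canny's smoothing, non-maximum suppression, and hysteresis steps). The function name `edge_map` and the relative threshold of 0.1 are assumptions for this example; a full Canny implementation is available as `skimage.feature.canny` in scikit-image.

```python
import numpy as np

def edge_map(img, threshold=0.1):
    """Binary edge map: 1.0 where the gradient magnitude exceeds a
    fraction `threshold` of its maximum, 0.0 elsewhere."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    if mag.max() == 0:
        return np.zeros_like(mag)
    return (mag > threshold * mag.max()).astype(float)

# Two "channels" showing the same square at different brightness and contrast.
square = np.zeros((32, 32))
square[8:24, 8:24] = 1.0
darker = 0.5 * square + 0.2

# Raw intensities differ, but the edge maps are identical.
same_edges = np.array_equal(edge_map(square), edge_map(darker))
```

Because the edge maps depend on structure rather than absolute brightness, a difference measure computed on them is less sensitive to the fact that the same object can appear bright in one color channel and dark in another.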