Project 1 - Images of the Russian Empire

Mian Zhong

Overview

This project explores different methods for aligning three color channels in order to colorize historical photos. The photos were taken on glass plates through red, green and blue filters separately. Thanks to the insightful Russian artist Sergei Mikhailovich Prokudin-Gorskii, I was able to digitally colorize some of his masterpieces and get a view of life in the Russian Empire. The input is a single black-and-white image that combines the three channels in the order B, G, R. After processing, the output is a reasonably well-aligned color photo. I explored different metrics for accurate alignment, and I also wanted the alignment to be efficient for high-resolution images.

Approach

Before using any metric to align the three channels, I first read the combined image, divided it into its three channels and simply stacked them. The results are not ideal: the faces, for example, are not properly aligned, so the image looks very blurry. The input images come in roughly two sizes. The small images are in JPEG format, and a single channel is about 400px by 400px. The large images are in TIFF format, and a single channel is about 4000px by 4000px! See the pictures Monastery (small), Harvesters (large) and Emir (large):

p.s. Though it is not accurate, I enjoy the aesthetic of such naive alignment, which adds another flavor to documentary photos.
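The naive split-and-stack step described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: it assumes the three exposures are stacked vertically in the order B, G, R, and the function name `split_and_stack` is hypothetical.

```python
import numpy as np

def split_and_stack(plate):
    """Split a combined plate into B, G, R thirds and stack them as RGB.

    Assumption: the three exposures are stacked top to bottom as B, G, R.
    """
    height = plate.shape[0] // 3           # leftover rows at the bottom are dropped
    b = plate[0:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return np.dstack([r, g, b])            # RGB channel order for display
```

No shifting is applied here, so any offset between the three exposures shows up as color fringing in the result.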

Metrics

    Since the pixel value in each channel encodes brightness, fixing one channel and shifting the other two until their brightness patterns match should produce a more accurate alignment. Therefore, we need a score that quantifies how well the brightness matches. I mainly implemented two metrics for this evaluation, SSD and NCC.

    1. Sum of Squared Differences (SSD) Distance

    SSD is the squared L2 norm of the difference between two images, computed as sum(sum((image1 - image2)^2)). The lower the SSD, the closer image 1 and image 2 are to each other, so the goal is to minimize it.

    2. Normalized Cross-Correlation (NCC)

    NCC computes the dot product between two normalized vectors, namely image1/||image1|| and image2/||image2||. When using NCC, we have to maximize the score, because a larger product implies that the two images are closer.
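The two metrics above could be written in NumPy roughly as below. This is a sketch under a few assumptions: the function names `ssd` and `ncc` are illustrative, both images have the same shape, and neither image is entirely zero (otherwise the NCC normalization divides by zero). The cast to float avoids wrap-around when the inputs are unsigned integer images.

```python
import numpy as np

def ssd(image1, image2):
    """Sum of squared differences; lower means a better match."""
    diff = image1.astype(np.float64) - image2.astype(np.float64)
    return float(np.sum(diff ** 2))

def ncc(image1, image2):
    """Dot product of the two images after scaling each to unit length;
    higher means a better match."""
    v1 = image1.astype(np.float64).ravel()
    v2 = image2.astype(np.float64).ravel()
    v1 = v1 / np.linalg.norm(v1)
    v2 = v2 / np.linalg.norm(v2)
    return float(np.dot(v1, v2))
```

Note that SSD changes when one image is uniformly brighter than the other, while this form of NCC is unaffected by a constant scale factor.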

Some Observations

Comparison between SSD and NCC

L: SSD, R:NCC

L: SSD, R:NCC

Comparison between different base color channels

L: Blue Fixed; R: Green Fixed

L: Blue Fixed; R: Green Fixed

L: Red fixed, M: Blue Fixed; R: Green Fixed

Efficiency Improvement

While the exhaustive search takes only seconds on the small images, running it on the large images is painfully slow. An image pyramid is therefore really helpful. I used an image pyramid to search for the alignment recursively, which keeps the search space small as the image gets larger.
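One way to sketch such a recursive pyramid search is shown below. It is not the project's actual implementation: the helper names, the 2x2 block-average downsampling, the circular shift via `np.roll`, and the parameters `min_size`, `coarse_radius` and `fine_radius` are all assumptions made for illustration. The coarsest level runs a wide exhaustive search, and each finer level only refines the doubled estimate within a small window.

```python
import numpy as np

def ncc(a, b):
    """Normalized dot product of two same-shaped images (higher is better)."""
    a = a.ravel() / np.linalg.norm(a)
    b = b.ravel() / np.linalg.norm(b)
    return float(np.dot(a, b))

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (odd edges are cropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, base, center, radius):
    """Exhaustively try every shift within `radius` of `center`."""
    best_score, best_shift = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(shifted, base)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def pyramid_align(channel, base, min_size=64, coarse_radius=8, fine_radius=1):
    """Return the (dy, dx) shift that best aligns `channel` to `base`."""
    if min(base.shape) <= min_size:
        # Coarsest level: the image is small, so a wide search is cheap.
        return search(channel, base, (0, 0), coarse_radius)
    dy, dx = pyramid_align(downsample(channel), downsample(base),
                           min_size, coarse_radius, fine_radius)
    # A shift found at half resolution corresponds to twice that shift here.
    return search(channel, base, (2 * dy, 2 * dx), fine_radius)
```

With these assumed defaults, each level above the coarsest one evaluates only 9 candidate shifts, so the cost at full resolution stays small even for a 4000px channel.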

Full Gallery of Colorized Images

Staff Pick

Small Size Photos

Displacement:

Cathedral:R: (-1, 7), G: (-1, 1) / Monastery: R:(1, 6), B:(0, 6)

Displacement:

Nativity: R:(1, 7), G: (1, 3) / Settlers: R:(-1, 14), G:(0, 7)

Large Size Photos

Displacement:

Harvester: R:(9, 112), G:(9, 56) / Lady: R:(1, 112), G:(1, 48) / Turkmen: R:(17, 112), G:(9, 56)

Displacement:

Emir: G:(-16, -56), B:(-48, -96) / Train: R:(25, 88), G:(1, 40) / Icon: R:(17, 88), G:(9, 40)

Displacement:

Self Portrait: R: (1, 96), B:(-24, -87) / Three Generations: R:(1, 112), G:(9, 56) / Village: R: (1, 72), B: (-16, -64)

My Pick

I picked several more photos from the archive to colorize. They portray some blades, a cute vase and another beautiful church. If you are interested in this archive, please check it out here: Prokudin-Gorskii Collection from Library of Congress.

Displacement:

Blades: R:(49, 96), G:(25, 40) / Vase: R: (9, 80), G: (1, 8) / Church: R: (1, 64), G: (1, 16)

Challenges

The first round of try-outs did not produce well-aligned color images for some of the large photos when using NCC and the image pyramid. I then inspected those photos and decided to use a threshold to cut off some pixels from the borders. I manually cropped a 200 px border from the top, bottom, left and right before running the alignment algorithm, and the visual results were much better. I believe the noise in the borders significantly disturbs the metric evaluation, which is why the first attempts did not align well. As a next step, I should implement automatic cropping so that the approach generalizes without manual work.
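The manual border crop described above amounts to a simple array slice. The sketch below uses a hypothetical helper name, `crop_border`, and assumes each channel is a 2-D array that is larger than twice the border in both dimensions.

```python
import numpy as np

def crop_border(channel, border):
    """Drop `border` pixels from the top, bottom, left and right."""
    if border == 0:
        return channel
    return channel[border:-border, border:-border]
```

For a large plate one would call, for example, `crop_border(channel, 200)` on each channel before scoring candidate shifts, so that the noisy plate edges do not contribute to the metric.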

Another difficulty came with the photos Self Portrait and Village: after trying different base color channels and metrics, the output still looked very blurry. I checked the code again and noticed that the search window for the bottom levels of the image pyramid was hard-coded to [-3, 3]. I turned it into a variable and tested a window of [-1, 1] for the bottom levels of the pyramid. The results were then much better.