Images of the Russian Empire

CS194-26 Proj 1 (Project Web Page): Colorizing the Prokudin-Gorskii Photo Collection

Project Overview

In this project, we were given a set of images taken by Sergei Mikhailovich Prokudin-Gorskii in the early 1900s. Ahead of his time, he photographed countless landmarks by taking three exposures of every scene, one each through a red, a green, and a blue filter. There was no way to combine the three photographs into a single color image at the time, so the goal of this project is to align the three exposures, layer them on top of each other, and then display the result using Python's image libraries.

My approach

My approach to this project was to develop a script in a Jupyter notebook that could successfully process a single image, and then turn the algorithm and techniques into abstractions that generalize to all of the images in the data set. I used the boilerplate code provided and noticed that all I needed to implement was the `align` function. I shifted the first image by up to 15 pixels vertically and horizontally and overlaid it on the second image. For each displacement, I calculated a score corresponding to the difference between the images (the lower the score, the more closely the two images are aligned) and then used the displacement with the minimum score. The score functions are based on those provided: SSD (sum of squared differences) and NCC (normalized cross-correlation).
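As a rough illustration of the exhaustive search described above, here is a minimal sketch in NumPy. It is not the project's actual code: the function signatures, the use of `np.roll` (which wraps pixels around the border), and the choice to negate NCC so that a lower score still means a better match are all assumptions made for this example.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: 0 for identical images, larger is worse."""
    return float(np.sum((a - b) ** 2))

def ncc(a, b):
    """Negated normalized cross-correlation (assumption: negated so lower is better)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return -float(np.sum(a * b) / denom) if denom > 0 else 0.0

def align(channel, reference, max_disp=15, score=ssd):
    """Try every (dy, dx) in [-max_disp, max_disp]^2 and keep the lowest score."""
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            # np.roll shifts circularly; pixels pushed off one edge wrap around.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            s = score(shifted, reference)
            if s < best_score:
                best_score, best_shift = s, (dy, dx)
    return best_shift  # (vertical, horizontal) displacement
```

A call such as `align(green, blue)` would return the (vertical, horizontal) shift to apply to `green`; the search costs one full-image comparison per candidate, which is why it becomes slow on large scans.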

Challenges

Once I completed the basic implementation, it worked well for the smaller images, but I had underestimated how long it would take for the larger images. I therefore implemented image pyramid alignment, using an iterative approach as follows:

First, I repeatedly downsampled the two images by 50% until they were less than 300 pixels tall. I use a preset max_displacement to find the best vertical and horizontal displacements for this coarsest (blurriest) image. Then I take the images one scale above, align them using the displacements found at the previous, downsampled scale, and fine-tune the adjustment. I continue this process until I have fine-tuned the alignment for the original images.

Here's how it works:
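In code, the coarse-to-fine procedure described above might look roughly like the following sketch. It is an illustration under stated assumptions, not the project's implementation: the helper names (`search`, `halve`, `pyramid_align`), the 2x2 block averaging used for downsampling, the doubling of the estimate between scales, and the `refine` radius are all hypothetical choices.

```python
import numpy as np

def ssd(a, b):
    return float(np.sum((a - b) ** 2))

def search(channel, reference, center, radius):
    """Exhaustive search in a (2*radius+1)^2 window around `center`."""
    cy, cx = center
    best_score, best_shift = np.inf, center
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            s = ssd(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if s < best_score:
                best_score, best_shift = s, (dy, dx)
    return best_shift

def halve(img):
    """Downsample by 2 by averaging each 2x2 block (odd edges are trimmed)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def pyramid_align(channel, reference, min_height=300, max_disp=8, refine=2):
    # Build the pyramid iteratively: full resolution first, coarsest last.
    levels = [(channel, reference)]
    while levels[-1][0].shape[0] >= min_height:
        c, r = levels[-1]
        levels.append((halve(c), halve(r)))
    # Full search only at the coarsest level.
    c, r = levels[-1]
    shift = search(c, r, (0, 0), max_disp)
    # Walk back up: one coarse pixel is two finer pixels, so double the
    # estimate (an assumption of this sketch) and fine-tune around it.
    for c, r in reversed(levels[:-1]):
        shift = search(c, r, (2 * shift[0], 2 * shift[1]), refine)
    return shift
```

The saving comes from running the wide search only on the smallest image; every finer level checks just a few candidates around the scaled-up estimate.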

Another challenge arose when calculating the score between two images: I originally used the whole images. But the black borders of the images, and the colorful borders introduced by shifting them, added unnecessary noise and made some images seem more similar or more different than they actually were. To solve this, I added another parameter called frame_size: I only compare a frame within each image, a small subsection at the center of each image.
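A minimal sketch of scoring on a central frame only is shown below. The write-up does not say how frame_size is interpreted, so this example assumes it is the side length of a square window in pixels; the helper names `center_frame` and `framed_ssd` are hypothetical.

```python
import numpy as np

def center_frame(img, frame_size):
    """Return the central frame_size x frame_size window of a 2-D image."""
    h, w = img.shape
    fh, fw = min(frame_size, h), min(frame_size, w)
    top, left = (h - fh) // 2, (w - fw) // 2
    return img[top:top + fh, left:left + fw]

def framed_ssd(a, b, frame_size):
    """SSD computed only over the central frame, ignoring the borders."""
    fa, fb = center_frame(a, frame_size), center_frame(b, frame_size)
    return float(np.sum((fa - fb) ** 2))
```

Because only interior pixels contribute, differences confined to the plate borders or to the band introduced by shifting no longer affect the score.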

Results

Small Images

Cathedral

Loss Function: Sum of squared differences

Green and Blue Displacement: Vertical: 5; Horizontal: 2

Red and Blue Displacement: Vertical: 12; Horizontal: 3

Monastery

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: -3; Horizontal: 2

Red and Blue Displacement: Vertical: 3; Horizontal: 2

Settlers

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 7; Horizontal: 0

Red and Blue Displacement: Vertical: 14; Horizontal: -1

Nativity

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 3; Horizontal: 1

Red and Blue Displacement: Vertical: 8; Horizontal: 0

Large Images

Harvesters

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 58; Horizontal: 16

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 124; Horizontal: 16

Downscaled 4 times during pyramid aligning.

Emir

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 47; Horizontal: 23

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 236; Horizontal: -198

Downscaled 4 times during pyramid aligning.

This solution actually required me to use an edge detector. For the original image without edge detection, please refer to the Bells and Whistles section for a comparison between the image aligned without edge detection and the enhanced image.

Icon

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 40; Horizontal: 18

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 88; Horizontal: 23

Downscaled 4 times during pyramid aligning.

Lady

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 51; Horizontal: 6

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 111; Horizontal: 8

Downscaled 4 times during pyramid aligning.

Self Portrait

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 77; Horizontal: 29

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 175; Horizontal: 37

Downscaled 4 times during pyramid aligning.

Note: for this particular image, I had to increase the search area. For all other images, after downscaling, I scan displacements of +/- 8 pixels, but for this image I search +/- 16 pixels (`MAX_DISP` = 16).

Three Generations

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 48; Horizontal: 16

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 108; Horizontal: 11

Downscaled 4 times during pyramid aligning.

Train

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 42; Horizontal: 6

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 84; Horizontal: 32

Downscaled 4 times during pyramid aligning.

Turkmen

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 56; Horizontal: 22

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 116; Horizontal: 28

Downscaled 4 times during pyramid aligning.

Village

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 64; Horizontal: 12

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 136; Horizontal: 22

Downscaled 4 times during pyramid aligning.

Bonus Pictures

Chapel where Ivan the Terrible's Son was Born!

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 70; Horizontal: -3

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 121; Horizontal: -23

Downscaled 4 times during pyramid aligning.

Floodgate and guardhouse of the M.P.S.

Link: http://www.loc.gov/pictures/collection/prok/item/prk2000000005/

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 70; Horizontal: -3

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 121; Horizontal: -23

Downscaled 4 times during pyramid aligning.

Note: this is at the Ministry of Communication and Transportation at Belyie Ozerki

Monument

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 1; Horizontal: -1

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 7; Horizontal: -2

Downscaled 4 times during pyramid aligning.

Bells and Whistles

Edge Detection

Original Aligned Emir (no edge detection)

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 47; Horizontal: 23

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 236; Horizontal: -198

Downscaled 4 times during pyramid aligning.

Aligned Emir (with Canny edge detection)

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 48; Horizontal: 23

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 105; Horizontal: 40

Downscaled 4 times during pyramid aligning.

Notice that the green and blue images were already aligned perfectly without edge detection. The red and blue filtered images are the problem. Once we align versions of the images in which everything except the edges has been filtered out, it works. Let's take a look at what the edges look like.

Emir filtered Red with Canny Edge Detector Applied

Canny Sigma Parameter: 1

Emir filtered Blue with Canny Edge Detector Applied

Canny Sigma Parameter: 1

The Canny edge detector takes a sigma parameter, the standard deviation of its Gaussian smoothing filter, which you can use to control how sensitive the edge detector is. A higher standard deviation returns fewer pixels marked as part of an edge.
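For illustration, here is a minimal sketch of how the sigma parameter is passed, assuming the detector is scikit-image's `skimage.feature.canny` (the write-up does not name the exact function). The synthetic two-tone image is a stand-in for a real color channel, and the variable names are hypothetical.

```python
import numpy as np
from skimage import feature

# Synthetic stand-in for one color channel: a dark half and a bright half.
channel = np.zeros((64, 64))
channel[:, 32:] = 1.0

# sigma is the standard deviation of the Gaussian blur applied before the
# gradients are computed; larger values suppress fine detail.
edges_fine = feature.canny(channel, sigma=1)
edges_coarse = feature.canny(channel, sigma=5)

# The result is a boolean mask; cast it to float before scoring alignments.
edge_image = edges_fine.astype(float)
```

Aligning the edge masks instead of the raw intensities makes the score depend on where structures are rather than on how bright each channel renders them.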

Emir filtered Red with Canny Edge Detector Applied

Canny Sigma Parameter: 5

Emir filtered Blue with Canny Edge Detector Applied

Canny Sigma Parameter: 1

Let's look at the difference in image quality with a sigma of 5 instead of 1.

Original Aligned Emir (no edge detection)

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 47; Horizontal: 23

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 236; Horizontal: -198

Downscaled 4 times during pyramid aligning.

Aligned Emir (with Canny edge detection, sigma = 5)

Loss Function: Normalized cross correlation

Green and Blue Displacement: Vertical: 47; Horizontal: 23

Downscaled 4 times during pyramid aligning.

Red and Blue Displacement: Vertical: 107; Horizontal: 40

Downscaled 4 times during pyramid aligning.

The differences are minimal with this change.

Histogram Equalization

Original Aligned Lady (no Histogram Equalization)

Note: This picture is rather dim. We can see from a histogram that most of the pixels are concentrated at the lower end of the 0 .. 255 range.

Aligned Lady (with histogram equalization)

Note: The original images have been equalized.




Note: This was implemented using the skimage libraries provided.
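To show what histogram equalization does to a dim channel, here is a plain NumPy sketch. It is not the skimage routine used for the results above; the function name `equalize`, the 256-bin histogram, and the 0 .. 255 value range are assumptions made for this example.

```python
import numpy as np

def equalize(channel, bins=256):
    """Map intensities through the normalized cumulative histogram (CDF)."""
    hist, bin_edges = np.histogram(channel, bins=bins, range=(0, 255))
    cdf = hist.cumsum() / hist.sum()
    centers = (bin_edges[:-1] + bin_edges[1:]) / 2
    # Each pixel is replaced by the CDF value at its intensity,
    # rescaled back to the 0..255 range.
    return np.interp(channel, centers, cdf) * 255.0
```

Intensities that are crowded into the dark end of the histogram get spread across the full range, which is why the equalized picture looks brighter and has more contrast.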