CS194-26: Intro to Computer Vision and Computational Photography, Fall 2021

Project 1: Colorizing the Prokudin-Gorskii Photo Collection

Angela Chen



Overview

The Prokudin-Gorskii photo collection contains glass plate images, each made up of three color channel images. The goal of this project is to use image processing techniques to create a single color image with as few visual artifacts as possible. The general idea is to separate each plate into its three color channel images (R, G, and B), align them, and then stack them on top of each other so that a single RGB color image is created. In this report, I first go over a simple single-scale exhaustive search to align the smaller jpg images. Then I go over a multi-scale pyramid search for the larger tiff images.

Part 1: Exhaustive Search

Each glass plate image contains three color channel images. I want to separate the plate into these three images and stack them on top of each other to create a single RGB color image. However, if I just separate and stack without aligning, the result does not look very good: the channels do not line up, so the separate color channels show through as visible color fringes.

The original cathedral.jpg from the Prokudin-Gorskii photo collection. You can see the three color channels. Top image is the blue channel. Middle image is the green channel. Bottom image is the red channel.
The three color channels of cathedral.jpg stacked on top of each other without aligning. You can see clear artifacts. This is why at the very least exhaustive search is important to properly align the images!
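The split-and-stack step described above can be sketched roughly as follows. This is a minimal illustration assuming NumPy and a plate stored as a single 2D array with the B, G, and R exposures stacked top to bottom; the function names split_plate and stack_rgb are hypothetical and are not taken from the project code.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate into (b, g, r) thirds, top to bottom."""
    height = plate.shape[0] // 3  # any leftover rows at the bottom are dropped
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return b, g, r

def stack_rgb(r, g, b):
    """Stack three single-channel images into one H x W x 3 RGB image."""
    return np.dstack([r, g, b])

# Tiny synthetic plate: 6 rows x 2 columns, so each channel is 2 x 2.
plate = np.arange(12, dtype=np.float64).reshape(6, 2)
b, g, r = split_plate(plate)
rgb = stack_rgb(r, g, b)  # unaligned stack, so real plates would show fringes
```

Stacking like this without any alignment is what produces the artifacts in the unaligned cathedral.jpg result.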

To align the images, I searched over a window of displacements in both the x and y directions. I chose a displacement window of [-25, 25] pixels. I shifted one image by each displacement in the window and calculated the Sum of Squared Differences (SSD, defined as sum(sum((image1-image2).^2))) between the shifted image and the base image. The displacement with the lowest SSD score gave the best alignment. I aligned both the R and G channel images to the B channel image. After aligning them and stacking all three channels on top of each other, I cropped out the border by taking the central 90% region of the colorized image so that the image looked nicer.
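A minimal sketch of this exhaustive search, assuming NumPy. The names ssd and align_exhaustive are hypothetical, the (row, column) order of the returned shift is an assumption, and np.roll is used for shifting, which wraps pixels around the border rather than discarding them.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two same-sized images."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def align_exhaustive(moving, base, window=25):
    """Return the (dy, dx) shift of `moving` with the lowest SSD against `base`."""
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # Assumption: circular shift; pixels pushed off one edge reappear
            # on the opposite edge.
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = ssd(shifted, base)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

# Synthetic check: displace a random image, then recover the displacement.
rng = np.random.default_rng(0)
base = rng.random((40, 40))
moving = np.roll(base, (-3, 2), axis=(0, 1))
shift = align_exhaustive(moving, base, window=5)
```

With a window of [-25, 25], the two loops evaluate 51 x 51 = 2601 candidate shifts per channel, which is manageable for the small jpg images.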

My first result with aligning cathedral.jpg over a displacement of [-25, 25] pixels and taking SSD over the whole image.

While the image above is not as bad as no alignment, I think it can look a little better. Since the most important part of the image is the central region, and the plate borders do not match between channels, I decided to calculate SSD only on the central 65% region of both the aligned image (R or G channel image) and the base image (B channel image). The result looked much better.
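Restricting the score to the central region can be sketched like this, assuming NumPy. The names central_crop and central_ssd are hypothetical, and "central 65%" is interpreted here as 65% of each dimension, which is an assumption about the original code.

```python
import numpy as np

def central_crop(image, fraction=0.65):
    """Keep the central `fraction` of the image along each axis."""
    h, w = image.shape[:2]
    margin_y = int(h * (1 - fraction) / 2)
    margin_x = int(w * (1 - fraction) / 2)
    return image[margin_y:h - margin_y, margin_x:w - margin_x]

def central_ssd(a, b, fraction=0.65):
    """SSD computed only over the central region, ignoring the borders."""
    diff = central_crop(a, fraction).astype(np.float64) - central_crop(b, fraction)
    return float(np.sum(diff ** 2))

# Two images that differ only along the top border score a perfect 0.
a = np.zeros((100, 100))
b = a.copy()
b[0, :] = 5.0
border_free_score = central_ssd(a, b)
full_score = float(np.sum((a - b) ** 2))
```

Because the damaged plate edges no longer contribute to the score, they cannot pull the search toward a bad displacement.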

The colorized result of aligning cathedral.jpg over a displacement of [-25, 25] pixels and taking SSD over the central 65% region of each image. G to B offset (5, 2). R to B offset (12, 3).

Then I did the same thing with the other two example jpg images.

The original monastery.jpg from the Prokudin-Gorskii photo collection.
The colorized result of aligning monastery.jpg over a displacement of [-25, 25] pixels and taking SSD over the central 65% region of each image. G to B offset (-3, 2). R to B offset (3, 2).
The original tobolsk.jpg from the Prokudin-Gorskii photo collection.
The colorized result of aligning tobolsk.jpg over a displacement of [-25, 25] pixels and taking SSD over the central 65% region of each image. G to B offset (3, 3). R to B offset (7, 3).

Part 2: Pyramid Search

Since the tiff images are so large, running exhaustive search on them at full resolution would take too much time. This is where pyramid search comes in. For each image, I created an image pyramid and did exhaustive search on each level, starting from the smallest resolution and ending at the largest (the size of the original tiff image).
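A coarse-to-fine search of this kind can be sketched as follows, assuming NumPy. This is an illustration under stated assumptions rather than the exact procedure used here: it fixes the scale factor at 0.5 (2 x 2 block averaging), reuses the same window at every level, and carries the estimate from each coarser level up by doubling it. All function names are hypothetical.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two same-sized float images."""
    return float(np.sum((a - b) ** 2))

def search(moving, base, center, window):
    """Exhaustive SSD search over shifts within `window` pixels of `center`."""
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - window, center[0] + window + 1):
        for dx in range(center[1] - window, center[1] + window + 1):
            score = ssd(np.roll(moving, (dy, dx), axis=(0, 1)), base)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def halve(image):
    """Downscale by 0.5 by averaging each 2 x 2 block of pixels."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(moving, base, levels=5, window=15):
    """Estimate the shift at the coarsest level, then refine level by level."""
    if levels <= 1:
        return search(moving, base, (0, 0), window)
    coarse = align_pyramid(halve(moving), halve(base), levels - 1, window)
    # A 1-pixel shift at half resolution is a 2-pixel shift at this level.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(moving, base, center, window)

# Synthetic check: a shift of (8, -4) is outside a +/-2 window at full size,
# but the three-level pyramid still reaches it.
rng = np.random.default_rng(1)
base = rng.random((64, 64))
moving = np.roll(base, (-8, 4), axis=(0, 1))
shift = align_pyramid(moving, base, levels=3, window=2)
```

The small window at each level is what saves time: most of the displacement is found cheaply on the small images, and the full-resolution level only has to refine it.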

To create an image pyramid, I first stored the original image. Then I scaled that image down by some scale factor (a double between 0.0 and 1.0) and stored the rescaled image. For example, if the scale factor is 0.5, each rescaled image is half the size of the previous one. I kept rescaling the most recent rescaled image until I reached the desired total number of images (aka levels).
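The pyramid construction can be sketched as below, assuming NumPy. The resizer rescale_nearest is a hypothetical stand-in that uses nearest-neighbor sampling to keep the example dependency-free; an anti-aliased resize from an image library would normally be a better choice. Counting the original image as one of the levels is also an assumption.

```python
import numpy as np

def rescale_nearest(image, scale):
    """Resize by `scale` with nearest-neighbor sampling (stand-in resizer)."""
    h, w = image.shape[:2]
    new_h, new_w = max(1, int(h * scale)), max(1, int(w * scale))
    # Map each output row/column back to its nearest source index.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return image[rows][:, cols]

def build_pyramid(image, scale=0.5, levels=5):
    """Return [original, rescaled once, rescaled twice, ...] with `levels` items."""
    pyramid = [image]
    for _ in range(levels - 1):
        # Each level is built from the previous rescaled image.
        pyramid.append(rescale_nearest(pyramid[-1], scale))
    return pyramid

pyramid = build_pyramid(np.zeros((64, 48)), scale=0.5, levels=4)
shapes = [level.shape for level in pyramid]
```

The search then walks this list from the last (smallest) entry back to the first (full-size) entry.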

The following images are all very compressed jpg images of the example tiff images.

The original workshop.tif from the Prokudin-Gorskii photo collection.
Colorized workshop.tif. I aligned R and G to B, used a displacement window of [-15, 15] pixels, took SSD over the central 65% region, and used a scale factor of 0.5 and an image pyramid of 5 levels. G to B offset (53, 0). R to B offset (105, -12).
The original church.tif from the Prokudin-Gorskii photo collection.
Colorized church.tif. I aligned R and G to B, used a displacement window of [-15, 15] pixels, took SSD over the central 65% region, and used a scale factor of 0.75 and an image pyramid of 3 levels. G to B offset (25, 4). R to B offset (58, -4).

The most challenging part of pyramid search for me was getting a fast runtime. The above two images took around 3 minutes each, and this was after varying the scale factor and number of pyramid levels. For the following images, I decreased the displacement window size and the central cropping region, and pyramid search became much faster, taking around a minute or less while still producing good colorized results.

The original harvesters.tif from the Prokudin-Gorskii photo collection.
Colorized harvesters.tif. I aligned R and G to B, used a displacement window of [-15, 15] pixels, took SSD over the central 60% region, and used a scale factor of 0.5 and an image pyramid of 5 levels. G to B offset (59, 17). R to B offset (123, 14).
The original icon.tif from the Prokudin-Gorskii photo collection.
Colorized icon.tif. I aligned R and G to B, used a displacement window of [-10, 10] pixels, took SSD over the central 60% region, and used a scale factor of 0.5 and an image pyramid of 5 levels. G to B offset (40, 17). R to B offset (89, 23).
The original lady.tif from the Prokudin-Gorskii photo collection.
Colorized lady.tif. I aligned R and G to B, used a displacement window of [-10, 10] pixels, took SSD over the central 60% region, and used a scale factor of 0.5 and an image pyramid of 5 levels. G to B offset (55, 8). R to B offset (114, 12).
The original melons.tif from the Prokudin-Gorskii photo collection.
Colorized melons.tif. I aligned R and G to B, used a displacement window of [-10, 10] pixels, took SSD over the central 60% region, and used a scale factor of 0.5 and an image pyramid of 5 levels. G to B offset (81, 10). R to B offset (178, 14).
The original onion_church.tif from the Prokudin-Gorskii photo collection.
Colorized onion_church.tif. I aligned R and G to B, used a displacement window of [-10, 10] pixels, took SSD over the central 60% region, and used a scale factor of 0.5 and an image pyramid of 5 levels. G to B offset (51, 27). R to B offset (108, 37).
The original self_portrait.tif from the Prokudin-Gorskii photo collection.
Colorized self_portrait.tif. I aligned R and G to B, used a displacement window of [-7, 7] pixels, took SSD over the central 60% region, and used a scale factor of 0.5 and an image pyramid of 5 levels. G to B offset (78, 29). R to B offset (176, 37).
The original three_generations.tif from the Prokudin-Gorskii photo collection.
Colorized three_generations.tif. I aligned R and G to B, used a displacement window of [-7, 7] pixels, took SSD over the central 60% region, and used a scale factor of 0.5 and an image pyramid of 5 levels. G to B offset (52, 14). R to B offset (111, 12).
The original train.tif from the Prokudin-Gorskii photo collection.
Colorized train.tif. I aligned R and G to B, used a displacement window of [-7, 7] pixels, took SSD over the central 60% region, and used a scale factor of 0.5 and an image pyramid of 5 levels. G to B offset (43, 6). R to B offset (87, 32).
The original emir.tif from the Prokudin-Gorskii photo collection.
Colorized emir.tif. For this image, I aligned R and B to G, used a displacement window of [-7, 7] pixels, took SSD over the central 60% region, and used a scale factor of 0.5 and an image pyramid of 5 levels. B to G offset (-49, -24). R to G offset (57, 17).

Colorizing more images from the Prokudin-Gorskii photo collection

Here are some more images I downloaded and colorized from the Prokudin-Gorskii photo collection. I downloaded all the images as jpg and ran my exhaustive search algorithm from Part 1 on them.

The original sunset.jpg from the Prokudin-Gorskii photo collection.
The colorized result of aligning sunset.jpg over a displacement of [-25, 25] pixels and taking SSD over the central 45% region of each image. I aligned R and B to G. B to G offset (-5, -2). R to G offset (4, 2).
The original lynx.jpg from the Prokudin-Gorskii photo collection.
The colorized result of aligning lynx.jpg over a displacement of [-25, 25] pixels and taking SSD over the central 45% region of each image. I aligned R and B to G. B to G offset (-6, -2). R to G offset (7, 1).
The original conservatory.jpg from the Prokudin-Gorskii photo collection.
The colorized result of aligning conservatory.jpg over a displacement of [-25, 25] pixels and taking SSD over the central 85% region of each image. I aligned R and B to G. B to G offset (-6, -3). R to G offset (7, 1).
The original peonies.jpg from the Prokudin-Gorskii photo collection.
The colorized result of aligning peonies.jpg over a displacement of [-25, 25] pixels and taking SSD over the central 65% region of each image. I aligned R and B to G. B to G offset (-5, 0). R to G offset (5, -1).
The original tree_stream.jpg from the Prokudin-Gorskii photo collection.
The colorized result of aligning tree_stream.jpg over a displacement of [-25, 25] pixels and taking SSD over the central 65% region of each image. I aligned R and B to G. B to G offset (-3, -3). R to G offset (2, 3).

Attempting Some Bells and Whistles

Some images were kind of dull, so I played around a bit with the color contrast. For the following images, I split each image into its B, G, and R channels. I then equalized the histogram of each color channel and merged the channels back together to create some interesting color contrast.
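Per-channel histogram equalization can be sketched as follows, assuming NumPy and 8-bit images in B, G, R channel order. The functions equalize_channel and equalize_bgr are hypothetical names for a hand-rolled version of the standard technique; a library routine such as OpenCV's cv2.equalizeHist may round slightly differently.

```python
import numpy as np

def equalize_channel(channel):
    """Histogram-equalize one uint8 channel using its cumulative distribution."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]      # CDF value at the darkest intensity present
    denom = cdf[-1] - cdf_min
    if denom == 0:                 # constant channel: nothing to spread out
        return channel.copy()
    # Lookup table that stretches the CDF over the full 0..255 range.
    lut = np.round((cdf - cdf_min) / denom * 255).clip(0, 255).astype(np.uint8)
    return lut[channel]

def equalize_bgr(image):
    """Equalize the B, G, and R channels independently, then merge them."""
    channels = [equalize_channel(image[..., i]) for i in range(3)]
    return np.dstack(channels)

# Low-contrast example: intensities squeezed into the range 50..150.
channel = np.array([[50, 50], [100, 150]], dtype=np.uint8)
image = np.dstack([channel, channel, channel])
equalized = equalize_bgr(image)
```

Because each channel is equalized on its own, the balance between channels changes, which is where the shifted and more contrasty colors come from.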


For tree_stream.jpg, I really wanted to bring out the warm tones and make the image more saturated, to give it more of an autumn feel. I again split tree_stream.jpg into B, G, and R color channels and increased the values in each channel. After merging the color channels back together, I converted the merged image to HSV so I could increase the saturation and decrease the value a little, since I found the merged image to be a little too bright. I then converted the HSV image back to BGR to display and save it.
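The HSV round trip can be sketched like this, assuming NumPy plus Python's standard colorsys module in place of an image library. The function name adjust_saturation_value and the scale values 1.3 and 0.9 are hypothetical placeholders, not the settings used for tree_stream.jpg, and the per-pixel loop is written for clarity rather than speed.

```python
import colorsys
import numpy as np

def adjust_saturation_value(bgr, sat_scale=1.3, val_scale=0.9):
    """Scale saturation and value of a uint8 BGR image via an HSV round trip."""
    flat_in = bgr.reshape(-1, 3) / 255.0
    flat_out = np.empty((flat_in.shape[0], 3), dtype=np.uint8)
    for i, (b, g, r) in enumerate(flat_in):
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        s = min(1.0, s * sat_scale)   # boost saturation, capped at 1
        v = min(1.0, v * val_scale)   # darken slightly
        r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)
        # Store back in B, G, R order as 8-bit values.
        flat_out[i] = np.round(np.array([b2, g2, r2]) * 255)
    return flat_out.reshape(bgr.shape)

# One warm pixel and one gray pixel, in B, G, R order.
pixels = np.array([[[50, 100, 200], [100, 100, 100]]], dtype=np.uint8)
adjusted = adjust_saturation_value(pixels)
```

Gray pixels have zero saturation, so only their brightness changes, while already colorful pixels move further from gray.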