Project 1: Colorizing the Prokudin-Gorskii photo collection

By Samuel Sunghun Lee [CS194-26 - Fall 2020]

Brief Overview

The goal of this assignment is to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image with as few visual artifacts as possible.

Path to Goal

To produce a single RGB color image, we need to extract the three color channel images, place them on top of each other, and align them. A standard naive way to do this is to exhaustively search over a window of possible displacements (e.g. [-20, 20] pixels), score each one using an image matching metric such as the Sum of Squared Differences (SSD) or normalized cross-correlation (NCC), and keep the best-scoring displacement. Then, we can stack the shifted images on top of each other to produce a single RGB image. The drawback of this naive algorithm is that it runs incredibly slowly on larger, higher-resolution images, since every candidate displacement requires a full-image comparison. Thus, an image pyramid algorithm can be used.
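As a rough illustration of the two matching metrics named above, here is a minimal NumPy sketch. The function names `ssd` and `ncc` are hypothetical, and the NCC shown is the zero-mean variant, which is an assumption; the write-up does not say which form of NCC was considered.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: lower means a better match."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def ncc(a, b):
    """Zero-mean normalized cross-correlation: higher means a better match."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # A constant image has zero norm after mean removal; report "no correlation".
    return float(a @ b / denom) if denom > 0 else 0.0
```

SSD is sensitive to overall brightness differences between channels, while this form of NCC is not, which is one reason the two metrics can pick different displacements.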

Small & Low Resolution Images

For small, low-resolution images, the standard exhaustive search over a window of displacements ran reasonably fast (about 30 sec/img) and gave reasonably aesthetic RGB images. In my Align algorithm, I searched the range [-20, 20] along both the X and Y axes and scored each shift with the Sum of Squared Differences (SSD) metric. The Align algorithm was applied to the Red and Green channels with respect to the Blue channel. For each of the two channels, the displacement with the lowest SSD was applied to produce a shifted image. Finally, the R, G, and B images were stacked to produce a single RGB image.
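The exhaustive search described above could be sketched roughly as follows. This is not the original implementation: the names `align_exhaustive` and `colorize` are hypothetical, and the use of `np.roll` (a circular shift that wraps pixels around the edges) is an assumption, since the write-up does not say how the shift was applied.

```python
import numpy as np

def align_exhaustive(channel, reference, radius=20):
    """Return the (dy, dx) shift of `channel` that minimizes SSD against `reference`."""
    ch = channel.astype(np.float64)
    ref = reference.astype(np.float64)
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Assumption: circular shift; wrapped pixels are included in the score.
            shifted = np.roll(ch, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - ref) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def colorize(r, g, b, radius=20):
    """Align R and G to B, then stack the three channels into one RGB array."""
    r_shift = align_exhaustive(r, b, radius)
    g_shift = align_exhaustive(g, b, radius)
    r_aligned = np.roll(r, r_shift, axis=(0, 1))
    g_aligned = np.roll(g, g_shift, axis=(0, 1))
    return np.dstack([r_aligned, g_aligned, b]), r_shift, g_shift
```

With `radius=20` the search evaluates 41 x 41 = 1681 candidate shifts per channel, each requiring a full-image comparison, which is why the cost grows quickly with image size.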

Big & High Resolution Images

For big, high-resolution images, the standard exhaustive search over a window of displacements ran extremely slowly (about 30 min/img). Thus, I implemented a Pyramid algorithm that recursively aligns 50%-scaled copies of the image until the image is <= 500 pixels wide. The displacements found at the coarser levels are then rescaled by a factor of 2 per level on the way back up to full resolution. This approach gave reasonably aesthetic RGB images in a reasonable amount of time (about 30 sec/img). Within the Pyramid algorithm, I used the previously described Align algorithm, with the same [-20, 20] range along both the X and Y axes and the Sum of Squared Differences (SSD) metric.
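A minimal sketch of this kind of coarse-to-fine recursion is shown below, under several assumptions: the names `ssd_search`, `downscale_half`, and `pyramid_align` are hypothetical, the 50% downscale is done by averaging 2x2 blocks (the write-up does not say which resize was used), and shifts are applied with a circular `np.roll`. The optional `refine_radius` parameter is also an assumption; with its default of 0 the coarse estimate is only doubled at each level, which is the behavior the text describes.

```python
import numpy as np

def ssd_search(channel, reference, center=(0, 0), radius=20):
    """Exhaustive SSD search over shifts within `radius` of `center`."""
    ch = channel.astype(np.float64)
    ref = reference.astype(np.float64)
    best_shift, best_score = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = np.sum((np.roll(ch, (dy, dx), axis=(0, 1)) - ref) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downscale_half(img):
    """Halve the resolution by averaging 2x2 blocks (an assumed resize method)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def pyramid_align(channel, reference, min_width=500, radius=20, refine_radius=0):
    """Estimate the (dy, dx) shift of `channel` relative to `reference`, coarse to fine."""
    if channel.shape[1] <= min_width:
        # Base case: the image is small enough for the full exhaustive search.
        return ssd_search(channel, reference, (0, 0), radius)
    coarse = pyramid_align(downscale_half(channel), downscale_half(reference),
                           min_width, radius, refine_radius)
    # One pixel at the coarser level corresponds to two pixels at this level.
    estimate = (2 * coarse[0], 2 * coarse[1])
    if refine_radius > 0:
        # Optional variant: a small local search around the doubled estimate.
        return ssd_search(channel, reference, estimate, refine_radius)
    return estimate
```

Without refinement, a result that passes through k halvings is always a multiple of 2^k in each axis; setting `refine_radius` to 1 or 2 is a common way to recover finer offsets at a small extra cost.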

Image Preprocessing

Some images had noisy borders of various colors that affected both the Align and Pyramid algorithms and thus produced unaesthetic images. To account for this, I cropped the borders of the smaller images by 20 pixels and the borders of the bigger images by 500 pixels before running the algorithms. This greatly improved the results, producing cleaner, well-aligned RGB images.
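A fixed-margin crop like the one described can be written as a simple array slice. The helper name `crop_border` is hypothetical, and the sketch assumes the same margin is removed from all four sides.

```python
import numpy as np

def crop_border(img, margin):
    """Remove `margin` pixels from every side of a 2-D or 3-D image array."""
    if margin < 0:
        raise ValueError("margin must be non-negative")
    h, w = img.shape[:2]
    if 2 * margin >= min(h, w):
        raise ValueError("margin is too large for this image")
    return img[margin:h - margin, margin:w - margin]
```

For example, `crop_border(channel, 20)` would match the 20-pixel margin used for the smaller images, and `crop_border(channel, 500)` the 500-pixel margin used for the bigger ones.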

Produced Images

Small & Low Resolution Images

cathedral monastery tobolsk
Displacement Vectors
cathedral R[5, 2], G[12, 3]
monastery R[-3, 2], G[3, 2]
tobolsk R[3, 3], G[6, 3]
Note: each image was cropped by a 20-pixel margin on every side

Big & High Resolution Images

castle emir harvesters icon lady melons onion_church self_portrait three_generations train workshop
Displacement Vectors
castle R[32, 0], G[96, 0]
emir R[48, 24], G[0, -160]
harvesters R[56, 16], G[120, 16]
icon R[40, 16], G[88, 24]
lady R[48, 8], G[112, 8]
melons R[80, 8], G[152, 8]
onion_church R[48, 24], G[104, 40]
self_portrait R[80, 32], G[152, 32]
three_generations R[48, 16], G[112, 8]
train R[40, 8], G[88, 32]
workshop R[56, 0], G[104, -8]
Note: each image was cropped by a 500-pixel margin on every side

Self-Selected Images

ex1 ex2 ex3
Displacement Vectors
ex1 R[0, -5], G[7, -6]
ex2 R[3, 1], G[12, 1]
ex3 R[2, 1], G[11, 1]
Note: each image was cropped by a 20-pixel margin on every side

Reasons for Aligning Failure

Some images failed to align well. A possible reason is that they had very noisy borders, such as thick random black lines, that interfered with the SSD calculation when aligning the color channels with the Blue channel. I believe that if those borders were removed precisely, the alignment would be much better.