CS194-26 Project 1
Colorizing the Prokudin-Gorskii photo collection
Sudhanvi Koneti

Project Overview

The goal of this project was to take the R, G, and B glass plates of the photos taken by Prokudin-Gorskii and stack them on top of one another. Ideally, this produces a color image out of 3 plates that each look gray-scale on their own. The challenge was to align the plates precisely enough to produce a clear color image.

Small-Scale Images

For small images, an exhaustive search was sufficient. I iterated through a displacement range of [-15, 15] in both the vertical and horizontal directions and, for each combination, calculated a score using the Sum of Squared Differences (SSD). After scoring all combinations, I took the displacement vector with the smallest SSD and used it to align the R and G plates to the B plate.
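The exhaustive search described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function names `ssd` and `align_exhaustive` are hypothetical, and shifting with `np.roll` (a circular shift) is an assumption made to keep the example short.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    diff = a.astype(np.float64) - b.astype(np.float64)  # avoid uint8 overflow
    return float(np.sum(diff ** 2))

def align_exhaustive(plate, reference, radius=15):
    """Return the (dy, dx) shift of `plate` that minimizes SSD against `reference`.

    Every shift in [-radius, radius] is tried in both directions.
    np.roll is a circular shift (an assumption of this sketch).
    """
    best_score, best_shift = float("inf"), (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(plate, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With a radius of 15, this evaluates 31 x 31 = 961 candidate shifts per plate, which is manageable for the small .jpg scans.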
Below are the jpg images that were colorized using the process described above:

cathedral.jpg


R:[-6, 3]
G:[-5, 2]

monastery.jpg


R:[-11, 2]
G:[-10, 2]

tobolsk.jpg


R:[-8, 3]
G:[-4, 2]


Larger Images and Image Pyramid

Searching exhaustively over a [-15,15] range in both the vertical and horizontal directions at full resolution would be too expensive for larger images such as the .tif files. This is where an image pyramid comes in. The idea of an image pyramid is to work on the image at different scales, or levels. Going from level 0 to level 4, each successive level is the previous one rescaled by 0.5. Scaling an image down to a coarser level allows us to process it much faster.
In my algorithm, the base case was the coarsest level of the pyramid. Here I used an approach similar to the naive solution, but instead of searching over a [-15,15] window, I searched over a [-64,64] window. I settled on these numbers because they gave the best results across the majority of images: making the window larger increased the runtime with little improvement in image quality, and making it smaller failed to align the plates properly. On each level after the coarsest, my algorithm searched over a [-1,1] range, using the displacement vector returned from the previous level as a starting point. Note that I had to multiply the returned vectors by 2 to account for the change in scale. At all levels, I scored the displacement vectors using SSD, just as in the naive solution, and selected the best vector in the same way. Once I had reached the original scale, I returned the resulting displacement vector and shifted the plates accordingly.
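The coarse-to-fine procedure above can be sketched as a short recursion. This is an illustrative sketch under stated assumptions, not the project's actual code: the names `downscale_half`, `search`, and `align_pyramid` are hypothetical, rescaling is approximated by averaging 2x2 blocks, and shifts use the circular `np.roll`.

```python
import numpy as np

def downscale_half(img):
    """Halve the resolution by averaging 2x2 blocks (a stand-in for a rescale call)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def search(plate, reference, center, radius):
    """Exhaustive SSD search over shifts within `radius` of `center`."""
    best_score, best_shift = float("inf"), center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(plate, (dy, dx), axis=(0, 1))
            score = float(np.sum((shifted - reference) ** 2))
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(plate, reference, levels=4, coarse_radius=64, fine_radius=1):
    """Estimate the (dy, dx) shift coarse to fine.

    levels=4 gives five levels (0 to 4). The coarsest level uses the wide
    window; every finer level refines within `fine_radius` of twice the
    estimate from the level below it.
    """
    if levels == 0:
        return search(plate, reference, (0, 0), coarse_radius)
    dy, dx = align_pyramid(downscale_half(plate), downscale_half(reference),
                           levels - 1, coarse_radius, fine_radius)
    return search(plate, reference, (2 * dy, 2 * dx), fine_radius)
```

The defaults mirror the numbers given in the text. For a small test image, `coarse_radius` should be reduced so that the window stays smaller than the coarsest image, because a circular shift repeats once it exceeds the image size.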
Below are the tif images that were colorized using the image pyramid process:

*emir.tif


R:[-185, -1039]
G:[-143, 24]

harvesters.tif


R:[-260, 13]
G:[-132, 15]

icon.tif


R:[-298, 22]
G:[-153, 17]

lady.tif


R:[-263, 0]
G:[-133, 0]

melons.tif


R:[-210, 11]
G:[-112, 8]

onion_church.tif


R:[-278, 36]
G:[-141, 25]

self_portrait.tif


R:[-213, 36]
G:[-115, 28]

three_generations.tif


R:[-273, 9]
G:[-138, 12]

train.tif


R:[-301, 30]
G:[-152, 1]

*village.tif


R:[-183, 25]
G:[-130, 12]

workshop.tif


R:[-278, -13]
G:[-139, -1]


A Note and Known Issues

NOTE: For .jpg images I manually cropped 1% of the height from the top and bottom. For .tif I cropped 3% instead. I manually cropped 4% of the image from both sides, regardless of image type.
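A fractional crop like the one in the note can be sketched as below. The helper name `crop_borders` is hypothetical, and the sketch assumes each percentage is removed from every edge it applies to (for example, 3% from the top and 3% from the bottom).

```python
import numpy as np

def crop_borders(img, frac_y, frac_x):
    """Trim `frac_y` of the height from top and bottom, and
    `frac_x` of the width from the left and right edges."""
    h, w = img.shape[:2]
    cy, cx = int(round(h * frac_y)), int(round(w * frac_x))
    return img[cy:h - cy, cx:w - cx]

# Example with the .tif settings from the note: 3% vertically, 4% horizontally.
plate = np.zeros((100, 200))
cropped = crop_borders(plate, 0.03, 0.04)
```

Cropping before scoring keeps the dark plate borders from dominating the SSD.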


The first set of issues concerns two images: emir.tif and village.tif. Because the plates of these images differ in brightness, they are not easily aligned using SSD as a metric. A possible solution would be edge detection, but I did not have time to attempt it.
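One possible form of the edge-based idea is to run the same SSD search on gradient magnitudes instead of raw intensities. The sketch below was not part of the project and is only an illustration: the names `edge_magnitude` and `align_edges` are hypothetical, the gradient is a simple forward difference, and shifts use the circular `np.roll`.

```python
import numpy as np

def edge_magnitude(img):
    """Gradient magnitude from forward differences (circular at the borders)."""
    img = img.astype(np.float64)
    gy = np.roll(img, -1, axis=0) - img
    gx = np.roll(img, -1, axis=1) - img
    return np.hypot(gx, gy)

def align_edges(plate, reference, radius=15):
    """Exhaustive SSD search on edge maps rather than on pixel values."""
    plate_e, ref_e = edge_magnitude(plate), edge_magnitude(reference)
    best_score, best_shift = float("inf"), (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(plate_e, (dy, dx), axis=(0, 1))
            score = float(np.sum((shifted - ref_e) ** 2))
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Because the edge map depends on local differences rather than absolute values, a region that is bright in one plate and dark in another still contributes matching edges. Whether this would fix emir.tif and village.tif specifically is untested here.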


Another crucial issue is that some of my images have shifted too far and thus have wrapped around. As can be seen in the icon.tif image, what appears to be the top of the image shows up at the bottom. Despite hours of debugging, I was unable to find what caused this issue and am left perplexed.
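Wrap-around of this kind is what a circular shift produces: with `np.roll`, rows pushed past one edge re-enter at the opposite edge. Whether that is what happened here is an assumption, since the shifting code is not shown. A minimal sketch of a shift that discards those rows instead, using the hypothetical helper `shift_no_wrap`:

```python
import numpy as np

def shift_no_wrap(img, dy, dx, fill=0):
    """Shift `img` by (dy, dx) without wrapping.

    Pixels moved past an edge are dropped; vacated pixels are set to `fill`.
    """
    h, w = img.shape[:2]
    out = np.full_like(img, fill)
    if abs(dy) >= h or abs(dx) >= w:
        return out  # everything was shifted out of frame
    # Destination region in the output and matching source region in the input.
    dst_y = slice(max(dy, 0), h + min(dy, 0))
    src_y = slice(max(-dy, 0), h + min(-dy, 0))
    dst_x = slice(max(dx, 0), w + min(dx, 0))
    src_x = slice(max(-dx, 0), w + min(-dx, 0))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out

img = np.arange(1, 13).reshape(3, 4)
wrapped = np.roll(img, 1, axis=0)   # last row reappears at the top
clean = shift_no_wrap(img, 1, 0)    # top row is filled with zeros instead
```

The vacated strip can then be cropped away along with the other borders.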


The final issue is rather minor. It concerns the color borders. These could be removed either manually or using an algorithm that detects a continuous strip of one color.
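An automatic version of the border removal could look like the sketch below. It is an illustration only, not something the project implemented: the name `trim_uniform_borders` and the tolerance `tol` are hypothetical, and the sketch assumes the borders are nearly flat strips, which real scans with noisy edges may not satisfy without a looser tolerance.

```python
import numpy as np

def trim_uniform_borders(img, tol=1e-3):
    """Drop leading and trailing rows/columns whose values are nearly constant.

    A row or column counts as border when the standard deviation of its
    (channel-averaged) values is at most `tol`.
    """
    gray = img if img.ndim == 2 else img.mean(axis=2)
    row_varies = gray.std(axis=1) > tol
    col_varies = gray.std(axis=0) > tol
    if not row_varies.any() or not col_varies.any():
        return img  # image is flat everywhere; leave it unchanged
    rows = np.flatnonzero(row_varies)
    cols = np.flatnonzero(col_varies)
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```

Only strips touching the image edge are removed; flat regions in the interior are kept because the crop spans from the first to the last varying row and column.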

More Colorized Images from the Collection

At the sea shore, Batum


R:[-40, 15]
G:[-28, 7]

Plaque


R:[-161, 2]
G:[-110, 6]

Lugano


R:[-101, -29]
G:[-56, -14]

Details of Milan Cathedral


R:[-186, -49]
G:[-105, -19]

Sim River


R:[-91, -4]
G:[-56, 15]