Project 1: Colorizing the Prokudin-Gorskii photo collection

Overview

The goal of this project is to take digitized Prokudin-Gorskii glass plate images and produce a color image by aligning the three plates that represent the B, G, and R channels. Naively overlaying the plates produces a very low-quality color image, so we must apply a displacement to the red and green plates. To do this, we start with an exhaustive search over a window of possible (x, y) displacements and choose the one with the lowest sum of squared differences (SSD), i.e. the squared Euclidean distance, to the blue plate. To make this practical for high-resolution images, we use an image pyramid and search for the best displacement from coarse to fine scale.
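As a rough illustration of the exhaustive search described above, here is a minimal sketch in Python with NumPy. The function names (ssd, align_exhaustive) and the use of np.roll, which wraps pixels around the image edges, are assumptions made for this example and are not necessarily how the project code is written.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def align_exhaustive(plate, reference, radius=15):
    """Return the (dx, dy) shift of `plate` with the lowest SSD to `reference`.

    Hypothetical helper: tries every shift in [-radius, radius] on both axes.
    """
    best_shift, best_score = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around the border (a simplifying assumption).
            shifted = np.roll(plate, shift=(dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_shift, best_score = (dx, dy), score
    return best_shift
```

With the blue plate as the reference, calling align_exhaustive(green, blue) and align_exhaustive(red, blue) would give the two displacements needed to stack the channels.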

Approach

I started with an exhaustive search over a range of [-15, 15] pixels for the possible (x, y) displacements of the red and green plates, keeping the alignments with the lowest SSD to the blue plate. This produced decent results for the lower-resolution .jpg images, but did not scale to the larger .tif images. To remedy this, I implemented an image pyramid that searches through possible displacements recursively. Each pyramid "level" is half the size of the previous one. The alignment starts at the coarsest scale (the smallest image, with height <= 400), where I run the regular exhaustive search over the [-15, 15] range. At each subsequent, finer level, I refine the displacement by searching within 5 pixels on either side of the displacement carried over from the previous level. This procedure produced decent results for most of the .tif images. For larger images, I also found it helpful to crop the border by 3%, which produced slightly better results.
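The coarse-to-fine procedure above can be sketched as follows. This is an illustrative version only: the names downsample, search, and align_pyramid are hypothetical, the 2x2 block averaging is an assumed way of halving the image, and np.roll (which wraps around the edges) stands in for whatever shifting the actual code uses. The displacement found at a coarse level is doubled before it is refined at the next level, since each level has twice the resolution.

```python
import numpy as np

def downsample(img):
    """Halve each dimension by averaging 2x2 blocks (assumed resize method)."""
    h, w = img.shape[0] // 2, img.shape[1] // 2
    blocks = img[:2 * h, :2 * w].astype(np.float64).reshape(h, 2, w, 2)
    return blocks.mean(axis=(1, 3))

def search(plate, reference, center, radius):
    """Lowest-SSD (dx, dy) shift within `radius` pixels of `center`."""
    plate = plate.astype(np.float64)
    reference = reference.astype(np.float64)
    cx, cy = center
    best, best_score = center, float("inf")
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(plate, shift=(dy, dx), axis=(0, 1))
            score = float(np.sum((shifted - reference) ** 2))
            if score < best_score:
                best, best_score = (dx, dy), score
    return best

def align_pyramid(plate, reference, min_height=400,
                  coarse_radius=15, refine_radius=5):
    """Recursive coarse-to-fine alignment; returns the (dx, dy) shift."""
    if plate.shape[0] <= min_height:
        # Coarsest level: full exhaustive search around zero displacement.
        return search(plate, reference, (0, 0), coarse_radius)
    cx, cy = align_pyramid(downsample(plate), downsample(reference),
                           min_height, coarse_radius, refine_radius)
    # Double the coarse estimate, then refine it in a small window.
    return search(plate, reference, (2 * cx, 2 * cy), refine_radius)
```

The benefit of this design is that the expensive 31 x 31 search only ever runs on the smallest image, while the full-resolution image sees just an 11 x 11 window.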

Results

cathedral.jpg

g displacement: (-1, 1), r displacement: (-1, 7)

tobolsk.jpg

g displacement: (2, 3), r displacement: (3, 6)

monastery.jpg

g displacement: (0, -6), r displacement: (1, 9)

icon.tif

g displacement: (17, 40), r displacement: (22, 89)

train.tif

g displacement: (2, 43), r displacement: (29, 87)

village.tif

g displacement: (12, 65), r displacement: (26, 93)

harvesters.tif

g displacement: (16, 60), r displacement: (13, 120)

three_generations.tif

g displacement: (12, 54), r displacement: (9, 111)

onion_church.tif

g displacement: (25, 52), r displacement: (36, 108)

workshop.tif

g displacement: (-1, 53), r displacement: (-13, 101)

melons.tif

g displacement: (8, 82), r displacement: (9, 169)

self_portrait.tif

g displacement: (0, 81), r displacement: (0, 169)

lady.tif

g displacement: (-2, 55), r displacement: (-8, 143)

emir.tif

g displacement: (24, 49), r displacement: (38, 59)

New examples

sunset.tif

g displacement: (-41, 75), r displacement: (-68, 114)

river.tif

g displacement: (-5, 25), r displacement: (-14, 102)

trees.tif

g displacement: (0, 39), r displacement: (12, 72)

Problems encountered

When I first implemented the image pyramid, the algorithm did not work well on the larger images. After some experimenting, it seemed that cropping the edges of the images led to better results by reducing the noise that enters the comparison. In addition, several parameters, such as the number of pyramid levels, the search range at each iteration, and the scale factor between pyramid levels, all affected the runtime and the quality of the results. I found that a scale factor of 1/2 between levels and a range of +/- 6 pixels per iteration led to decent outputs for most images.
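For reference, border cropping of this kind might look like the sketch below. The helper name crop_border is hypothetical, and trimming the same fraction from every side is an assumption about how the crop is applied.

```python
import numpy as np

def crop_border(img, fraction=0.03):
    """Trim `fraction` of the height and width from each side of `img`."""
    h, w = img.shape[:2]
    dy, dx = round(h * fraction), round(w * fraction)
    return img[dy:h - dy, dx:w - dx]
```

Scoring crop_border(shifted) against crop_border(reference) keeps the plate edges out of the SSD while leaving the interior of the image unchanged.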

Failure cases

This algorithm noticeably fails on emir.tif and village.tif. This is possibly due to differences in brightness between the three plates for these images. Specifically, in emir.tif the blue of the man's clothing produces a very bright value in the blue plate and a very dark value in the red plate. SSD therefore gives a poor (high) score even to a correct alignment, because the plates differ in value where they should match. A similar explanation could apply to village.tif.
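The effect can be reproduced on a toy example. The values below are synthetic and chosen purely for illustration (they are not taken from emir.tif): a patch that is bright in the "blue" plate and dark in the "red" plate gets a worse SSD when correctly aligned than when shifted off the patch entirely.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    return float(np.sum((a - b) ** 2))

# Mid-gray background in both synthetic plates.
blue = np.full((20, 20), 0.5)
red = np.full((20, 20), 0.5)

# The same region is very bright in blue and very dark in red.
blue[5:10, 5:10] = 1.0
red[5:10, 5:10] = 0.0

aligned_score = ssd(red, blue)                                # correct alignment
misaligned_score = ssd(np.roll(red, shift=8, axis=1), blue)   # patch moved away

print(aligned_score, misaligned_score)
```

Here the correct alignment scores 25.0 while the misaligned one scores 12.5, so a search that minimizes SSD would prefer the wrong displacement.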