Colorizing the Prokudin-Gorskii photo collection




Exhaustive Search

To form a single RGB color image, I needed to align the three provided color channel images. Over a user-given range (e.g., (-10, 10), (-15, 15), or (-20, 20)), I displaced the R and G color channels along the x and y axes, then scored each displacement against the B color channel, using the Sum of Squared Differences (SSD) as the image matching metric. I also tried Normalized Cross-Correlation (NCC), an alternative image matching metric, but ultimately found that my implementation of it was slower.
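The search described above can be sketched roughly as follows. This is a minimal illustration rather than the exact code behind the results: it assumes the channels are 2-D NumPy arrays of equal size, uses `np.roll` (a circular shift) to displace a channel, and reports displacements as (dx, dy). The names `ssd` and `exhaustive_align` are hypothetical.

```python
import numpy as np


def ssd(a, b):
    """Sum of squared differences between two equally sized arrays (lower is better)."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))


def exhaustive_align(channel, reference, max_shift=15):
    """Return the (dx, dy) in [-max_shift, max_shift]^2 that minimizes SSD."""
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Assumption: circular shift, so pixels pushed off one edge wrap around.
            shifted = np.roll(channel, shift=(dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift


# Synthetic check: displace a random "B" channel, then recover the offset.
rng = np.random.default_rng(0)
blue = rng.random((40, 40))
red = np.roll(blue, shift=(-3, 2), axis=(0, 1))
recovered = exhaustive_align(red, blue, max_shift=5)
```

The cost grows with the square of `max_shift`, since every candidate displacement requires a full pass over the image.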

The initial results were, unfortunately, noticeably misaligned. I attributed this to the large discrepancies at the outer edges of the images, along the glass plate borders.

Displacement: R (-1, 1), G (-1, 7)
Displacement: R (0, -6), G (1, 9)
Displacement: R (2, 3), G (3, 6)

So, I cropped 10% off the edges of each color channel before finding the optimal displacements, then applied those displacements to the uncropped images. That way, I could keep the full size of my color channel images while still optimizing the displacements for the central subject of the image. This method yielded much better results, as shown.

Displacement: R (2, 5), G (3, 12)
Displacement: R (2, -3), G (2, 3)
Displacement: R (3, 3), G (3, 6)
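The cropping step above can be sketched as below. It is a hedged illustration, assuming that "10% off the edges" means 10% of the height and width removed from each side, that channels are 2-D NumPy arrays, and that shifts are circular via `np.roll`. The helpers `crop_border`, `best_shift`, and `align_ignoring_borders` are hypothetical names.

```python
import numpy as np


def crop_border(img, frac=0.10):
    """Drop `frac` of the height and width from each side of the image."""
    h, w = img.shape[:2]
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]


def best_shift(channel, reference, max_shift):
    """Exhaustive SSD search over (dx, dy); circular shifts are an assumption."""
    channel = channel.astype(np.float64)
    reference = reference.astype(np.float64)
    best_score, best = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            diff = np.roll(channel, (dy, dx), axis=(0, 1)) - reference
            score = np.sum(diff * diff)
            if score < best_score:
                best_score, best = score, (dx, dy)
    return best


def align_ignoring_borders(channel, reference, max_shift=15, frac=0.10):
    """Score on the cropped interiors, then shift the full, uncropped channel."""
    dx, dy = best_shift(crop_border(channel, frac),
                        crop_border(reference, frac), max_shift)
    return np.roll(channel, (dy, dx), axis=(0, 1)), (dx, dy)
```

Only the scoring sees the cropped views. The returned channel keeps its original size, so the three channels can still be stacked directly.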



Pyramid Search

For high-resolution images (such as the .tif files), exhaustive search would require a much larger pixel displacement range, and would thus become much more expensive. To combat this, I implemented a faster search procedure, in which the alignment is built up from displacements calculated at several downscaled image resolutions. I start by downscaling the image channel to 1/8 of its original size, then run the exhaustive search on this smaller image. I then use this coarse displacement to align the original image channel, scaling it appropriately (in this case, since I downscaled the channel to 1/8 of its size, I apply the yielded displacement multiplied by 8). I repeat this method three more times, at scales of 1/4, 1/2, and 1. With this pyramid method, I can keep the pixel displacement range relatively small, since the larger alignments are found at the less expensive search resolutions, then scaled up and added together.

Displacement: R (4, 25), G (-4, 58)
Displacement: R (17, 60), G (14, 124)
Displacement: R (26, 52), G (36, 108)
Displacement: R (14, 53), G (11, 112)
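The coarse-to-fine procedure above can be sketched as follows. This is a rough illustration under several assumptions: channels are 2-D NumPy arrays, downscaling is done by averaging blocks (any resampler would do), shifts are circular via `np.roll`, and the per-level search window of ±4 pixels is an arbitrary choice. The names `downscale`, `best_shift`, and `pyramid_align` are hypothetical.

```python
import numpy as np


def downscale(img, factor):
    """Shrink by an integer factor by averaging factor x factor blocks."""
    h = img.shape[0] - img.shape[0] % factor
    w = img.shape[1] - img.shape[1] % factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))


def best_shift(channel, reference, max_shift):
    """Exhaustive SSD search over (dx, dy) within +/- max_shift."""
    best_score, best = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            diff = np.roll(channel, (dy, dx), axis=(0, 1)) - reference
            score = np.sum(diff * diff)
            if score < best_score:
                best_score, best = score, (dx, dy)
    return best


def pyramid_align(channel, reference, factors=(8, 4, 2, 1), max_shift=4):
    """Accumulate a full-resolution (dx, dy) from coarse to fine levels."""
    channel = channel.astype(np.float64)
    reference = reference.astype(np.float64)
    total_dx, total_dy = 0, 0
    for factor in factors:
        # Apply what has been found so far to the original channel...
        moved = np.roll(channel, (total_dy, total_dx), axis=(0, 1))
        # ...then search a small window at this resolution for the remainder.
        dx, dy = best_shift(downscale(moved, factor),
                            downscale(reference, factor), max_shift)
        # A shift of 1 pixel at 1/factor scale is `factor` pixels at full size.
        total_dx += dx * factor
        total_dy += dy * factor
    return total_dx, total_dy
```

With these defaults, the levels can jointly reach displacements of up to 4 × (8 + 4 + 2 + 1) = 60 pixels per axis while each level only tests a 9 × 9 grid of candidates.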



Results

Displacements

emir.tif: R (24, 49), G (55, 103)
cathedral.jpeg: R (2, 5), G (3, 12)
church.tif: R (4, 25), G (-4, 58)
three_generations.tif: R (14, 53), G (11, 112)
melons.tif: R (10, 82), G (13, 178)
monastery.jpeg: R (2, -3), G (2, 3)
onion_church.tif: R (26, 52), G (36, 108)
tobolsk.jpg: R (3, 3), G (3, 6)
icon.tif: R (17, 41), G (23, 90)
self_portrait.tif: R (29, 79), G (37, 176)
harvesters.tif: R (17, 60), G (14, 124)
lady.tif: R (9, 52), G (12, 112)
train.tif: R (6, 42), G (32, 87)
workshop.tif: R (0, 52), G (-12, 104)

More Examples

More examples, downloaded from the Prokudin-Gorskii collection.

Displacement: R (1, 2), G (1, 11)
Displacement: R (-1, 5), G (-3, 9)
Displacement: R (0, 4), G (-1, 15)