Sachit Shroff's Project 1

This project focused on colorizing the Prokudin-Gorskii photo collection, which was taken by the visionary photographer Prokudin-Gorskii in Imperial Russia. The images here are produced from high-resolution digital scans of his original glass negatives. For each image there are three plates, one taken through each of a red, green, and blue filter. Because of the construction of his camera and the way the plates were scanned, aligning the plates to produce correctly colorized images proved non-trivial.

A Summary of my Approach

At a base level, my approach is pretty simple. It involves splitting the scan into the three plates (red, green, and blue) and testing the alignment of the red and green plates against the blue plate at various offsets. I "scored" each alignment (ignoring a set border of pixels around the edge) using a Sum of Squared Differences metric, where the lowest score marks the best offset. I shifted the red and green plates by their best offsets, then stacked the aligned red, aligned green, and original blue plates into a single RGB image, cropped off the sides (based on the max alignment shift), and saved the output colorized images.

For large images, I constructed an image pyramid for each plate, which consists of successively smaller representations of the original image (in this case downscaled by a factor of 2/3 at each level) until a desired size is reached (in this case smaller than 512x512 pixels). I then aligned the red and green pyramids to the blue one by first aligning the smallest representations, then scaling up the best offset found there and searching a small set of offsets around it at the next-larger level, and so on until I reached the original representation. This allowed me to search fewer expensive-to-calculate offsets at the large levels and find alignments quickly (under 50 seconds on my laptop).
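The single-scale steps above can be sketched roughly as follows. This is a minimal illustration rather than my exact code: it assumes the scan stacks the plates top to bottom as blue, green, red, uses np.roll (which wraps around the edges) for shifting, and treats a positive offset as a move right/down. The function names and default parameters are made up for the example.

```python
import numpy as np

def split_plates(scan):
    """Cut a vertically stacked scan into three equal-height plates.
    Assumed order, top to bottom: blue, green, red."""
    h = scan.shape[0] // 3
    return scan[:h], scan[h:2 * h], scan[2 * h:3 * h]

def ssd(a, b, border):
    """Sum of Squared Differences, ignoring `border` pixels on every edge."""
    h, w = a.shape
    d = a[border:h - border, border:w - border] - b[border:h - border, border:w - border]
    return float(np.sum(d * d))

def best_offset(plate, ref, max_shift=15, border=20):
    """Try every (dx, dy) in a square window and keep the lowest SSD."""
    best, best_score = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(plate, (dy, dx), axis=(0, 1))
            score = ssd(shifted, ref, border)
            if score < best_score:
                best, best_score = (dx, dy), score
    return best

def colorize(scan, max_shift=15, border=20):
    """Align red and green to blue, stack to RGB, and crop by the largest shift."""
    b, g, r = split_plates(scan.astype(np.float64))
    gx, gy = best_offset(g, b, max_shift, border)
    rx, ry = best_offset(r, b, max_shift, border)
    g = np.roll(g, (gy, gx), axis=(0, 1))
    r = np.roll(r, (ry, rx), axis=(0, 1))
    rgb = np.dstack([r, g, b])
    m = max(abs(gx), abs(gy), abs(rx), abs(ry))  # crop away wrapped-around edges
    h, w = b.shape
    return rgb[m:h - m, m:w - m], (rx, ry), (gx, gy)
```

Because np.roll wraps pixels around to the opposite edge, the ignored border should be at least as wide as the largest shift tried, otherwise wrapped pixels leak into the score.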

Issues I Ran Into

The first issue I ran into was speed when using the naive approach on the large .tif images. I solved this by using the image pyramid approach described above instead. To speed things up further, I increased my step size through the space of displacements to 2. Since the step applies in both x and y, this meant trying about 1/4 as many displacements at each level. I also noticed that some images needed to be displaced by more than the maximum displacement my search allowed, but mostly in the vertical direction, so I increased my vertical search range while decreasing my horizontal search range. Finally, I saw lots of larger images not being aligned correctly, which I solved by ignoring more pixels near the border (by default, 5% of the image in each direction).
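Here is a rough sketch of the coarse-to-fine search with the tweaks described above: separate horizontal and vertical ranges, a configurable step, and a 5% ignored border. It is an illustration under stated assumptions, not my exact code. To keep it NumPy-only it halves the image at each level by 2x2 block averaging instead of rescaling by 2/3, and the names (halve, search, pyramid_align) and default ranges are hypothetical.

```python
import numpy as np

def ssd_score(a, b, border=0.05):
    """SSD over the interior, ignoring a `border` fraction on each side."""
    h, w = a.shape
    by, bx = int(h * border), int(w * border)
    d = a[by:h - by, bx:w - bx] - b[by:h - by, bx:w - bx]
    return float(np.sum(d * d))

def search(plate, ref, center, rx, ry, step=1):
    """Scan offsets within +/-rx (horizontal) and +/-ry (vertical) of `center`."""
    cx, cy = center
    best, best_score = center, np.inf
    for dy in range(cy - ry, cy + ry + 1, step):
        for dx in range(cx - rx, cx + rx + 1, step):
            score = ssd_score(np.roll(plate, (dy, dx), axis=(0, 1)), ref)
            if score < best_score:
                best, best_score = (dx, dy), score
    return best

def halve(img):
    """Downscale by 2 with 2x2 block averaging (swapped in for the 2/3 rescale)."""
    h, w = img.shape
    h, w = h - h % 2, w - w % 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_align(plate, ref, min_size=512, rx=4, ry=8, step=1):
    """Align at the smallest level first, then refine around the scaled-up offset."""
    if max(plate.shape) <= min_size:
        return search(plate, ref, (0, 0), rx, ry, step)
    cx, cy = pyramid_align(halve(plate), halve(ref), min_size, rx, ry, step)
    # One pixel at the coarser level covers two pixels at this level.
    return search(plate, ref, (2 * cx, 2 * cy), rx, ry, step)
```

With step=2, a window of 17 x 9 = 153 candidates drops to 9 x 5 = 45, the roughly fourfold saving mentioned above. The cost is that the search lands only on every other offset at each level, so the result can be off by a pixel.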

My Example Outputs with Offsets in the Format (x,y) Displacement:

Here are the images, all scaled down to 425 x 600 so they are easily viewable:

NOTE: I didn't do well on some of the larger images at first, and some images (like melons, harvesters) still have some spots of color.

Cathedral: Red (-3,-12), Green (-2,-5)

Monastery: Red (-2,-3), Green (-2,3)

Tobolsk: Red (-3,-6), Green (-3,-3)

Harvesters: Red (-13,-124), Green (-17,-59)

Emir: Red (-54,-103), Green (-24,-50)

3 Generations: Red (-12,-112), Green (-15,-53)

Workshop: Red (12,-105), Green (0,-53)

Lady: Red (-12,-116), Green (-9,-56)

Train: Red (-31,-87), Green (-6,-42)

Icon: Red (-22,-89), Green (-16,-40)

Onion Church: Red (-36,-109), Green (-25,-52)

Melons: Red (-13,-178), Green (-9,-81)

Self Portrait: Red (-37,-177), Green (-28,-79)

Village: Red (-22,-137), Green (-13,-65)

My Outputs on Other Images

Offsets:

House: Red (-40,-36), Green (-21,-9)

Lamp: Red (-26,-59), Green (-13,-4)

Corner: Red (-10,-125), Green (-6,-45)

Boy: Red (18,-103), Green (11,-45)