Project 1: Images of the Russian Empire

Colorizing the Prokudin-Gorskii photo collection

Prokudin-Gorskii's glass plate negatives contain the pixel values for the red, green, and blue (RGB) channels of each scene. These plates hold images of the Russian Empire from the early 1900s. The goal of this project is to convert the original negatives into color images. This process involves aligning the three channel images using Sum of Squared Differences (SSD) and/or Normalized Cross-Correlation (NCC). If the image is particularly large, I use an image pyramid to recursively find the best alignment on smaller versions of the image, updating the search window and displacement at each step as the image gets larger. In addition, edge detection is used to further fine-tune the alignment of the negatives (instead of relying on raw pixels alone). After alignment, I perform automatic cropping and white balancing of the image.

Bells and Whistles: Automatic Cropping, Edge Detection for better features and White Balancing




Approach:

First, I implemented the suggested solution on the raw pixels of each color channel. The single-scale alignment metrics are Sum of Squared Differences (SSD) and Normalized Cross-Correlation (NCC).

Sum of Squared Differences (SSD): minimize sum(sum((image1-image2).^2)) over x and y displacements in the range [-15, 15], between the blue channel and the red or green channel.
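As an illustration of the SSD search, here is a minimal sketch in Python with NumPy. The function names (ssd, align_ssd) are hypothetical and not taken from the project code. The sketch assumes the shift is applied with np.roll, which wraps pixels around the image border, and it reports the displacement as (row shift, column shift).

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def align_ssd(channel, reference, window=15):
    """Try every shift in [-window, window] on both axes and return the
    (row_shift, col_shift) that minimizes SSD against the reference.
    Assumption: shifts wrap around the border because np.roll is used."""
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Here reference would be the blue channel and channel the red or green one. In practice it may help to score only the interior of the image so that wrapped-around borders do not influence the result.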

Normalized Cross-Correlation (NCC): maximize the dot product of the normalized images, image1./||image1|| and image2./||image2||, over x and y displacements in the range [-15, 15], between the blue channel and the red or green channel.
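The NCC score can be sketched the same way: flatten both images, normalize each to unit length, and take the dot product. The names ncc and align_ncc are hypothetical, and np.roll (with wraparound) is again an assumption about how the shift is applied.

```python
import itertools
import numpy as np

def ncc(a, b):
    """Dot product of the two images after scaling each to unit norm."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    norm_a, norm_b = np.linalg.norm(a), np.linalg.norm(b)
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero image carries no alignment signal
    return float(np.dot(a / norm_a, b / norm_b))

def align_ncc(channel, reference, window=15):
    """Return the (row_shift, col_shift) in [-window, window] that
    maximizes NCC between the shifted channel and the reference."""
    shifts = itertools.product(range(-window, window + 1), repeat=2)
    return max(
        shifts,
        key=lambda s: ncc(np.roll(channel, s, axis=(0, 1)), reference),
    )
```

Because each image is normalized, the score does not change if one channel is uniformly brighter than the other, which is not true of SSD.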

For negatives in the TIFF format, a faster search procedure is done using image pyramids. An image pyramid represents the image at multiple scales (usually scaled by a factor of 2). Processing is done sequentially, starting from the coarsest scale (the smallest image) and going down the pyramid, updating the offset values at each successive layer. At each level, the Normalized Cross-Correlation is calculated with a window range of int(np.log2(curr_rows or curr_columns)) at the smallest scale and a range of int(np.log2(curr_rows or curr_columns)/2) for every successive layer. The distinction between curr_rows and curr_columns is important, as the window for a column displacement should not be the same as the one for a row displacement. These numbers were chosen analytically as well as through trial and error. The number of levels is determined by int(round(np.log10(rows))), where rows is the number of rows in the original image. The idea is that the coarsest level provides the general location of the optimal alignment, and each successive level only searches in its vicinity to determine whether a better local position can be found.
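The coarse-to-fine procedure can be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: the function names are hypothetical, downsampling is done by averaging 2x2 blocks, shifts wrap around via np.roll, and the row window is derived from the current row count while the column window is derived from the current column count. The window sizes and the number of levels use the formulas given above.

```python
import numpy as np

def ncc(a, b):
    """Dot product of the two images after scaling each to unit norm."""
    a, b = a.ravel(), b.ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0

def downsample(img):
    """Halve each dimension by averaging 2x2 blocks (assumed scheme)."""
    r, c = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:r, :c]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def search(channel, reference, center, win_rows, win_cols):
    """Best NCC shift within the given window around `center`."""
    best_score, best = -np.inf, center
    for dy in range(center[0] - win_rows, center[0] + win_rows + 1):
        for dx in range(center[1] - win_cols, center[1] + win_cols + 1):
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best

def pyramid_align(channel, reference):
    """Coarse-to-fine alignment; returns (row_shift, col_shift)."""
    channel = channel.astype(np.float64)
    reference = reference.astype(np.float64)
    levels = max(1, int(round(np.log10(reference.shape[0]))))
    chans, refs = [channel], [reference]
    for _ in range(levels - 1):
        chans.append(downsample(chans[-1]))
        refs.append(downsample(refs[-1]))

    shift = (0, 0)
    for level in range(levels - 1, -1, -1):  # coarsest level first
        rows, cols = refs[level].shape
        if level == levels - 1:
            win_r, win_c = int(np.log2(rows)), int(np.log2(cols))
        else:
            # One coarse pixel covers two fine pixels, so double the offset.
            shift = (2 * shift[0], 2 * shift[1])
            win_r, win_c = int(np.log2(rows) / 2), int(np.log2(cols) / 2)
        shift = search(chans[level], refs[level], shift, win_r, win_c)
    return shift
```

The saving comes from the small windows at the finer levels: most of the displacement is found on the smallest image, where each NCC evaluation is cheap.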

Better Features aka Bells and Whistles 1: In addition to using the image pyramid technique for faster alignment on TIFF images, edge detection is used to enhance the alignment mechanism. First, transform each image into an image of edges with the Canny edge detection algorithm. That gives a clear, high-contrast outline of the original image. Then run the original pyramid algorithm on these new features.
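To show why edges make better alignment features than raw pixels, here is a small sketch that uses a thresholded gradient magnitude as a stand-in for Canny. This is a deliberately simpler technique: a full Canny detector also smooths the image, applies non-maximum suppression, and uses hysteresis thresholding, none of which is done here. The function name edge_map and the threshold value are hypothetical.

```python
import numpy as np

def edge_map(img, threshold=0.2):
    """Binary edge image from gradient magnitude (simplified stand-in
    for Canny). Pixels whose gradient magnitude exceeds `threshold`
    times the image's peak magnitude are marked 1.0, all others 0.0."""
    img = img.astype(np.float64)
    grad_rows, grad_cols = np.gradient(img)
    magnitude = np.hypot(grad_rows, grad_cols)
    peak = magnitude.max()
    if peak == 0:
        return np.zeros_like(magnitude)  # flat image: no edges
    return (magnitude / peak > threshold).astype(np.float64)
```

Because the threshold is relative to each image's own peak gradient, two channels that differ in brightness or contrast still produce similar edge maps, which can then be passed to the alignment routine in place of the raw channels.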

Canny Edge Detection and Pyramid Algorithm

Blue Edges

Before

After


Automatic Cropping: After the RGB negatives are aligned, an automatic cropping function conservatively removes extraneous borders. First, extract the saturation of the image and apply a threshold: every pixel with a value larger than 0.93 becomes 1, otherwise 0. Then, for speed, restrict the analysis to the outer 10% of the image on each side. Going from the middle of the image outwards, find the first row/column where the ratio of white pixels to total pixels is greater than 13%. Mark those locations as the ones to be cropped off.
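The cropping steps can be sketched as below. Several details are assumptions on my part rather than facts from the project code: saturation is computed HSV-style as (max - min) / max on an RGB image scaled to [0, 1], the scan on each side starts at the inner edge of the 10% band and moves toward the image edge, and the function names are hypothetical.

```python
import numpy as np

def saturation(rgb):
    """HSV-style saturation, (max - min) / max, for RGB values in [0, 1]."""
    mx = rgb.max(axis=2)
    mn = rgb.min(axis=2)
    safe_mx = np.where(mx > 0, mx, 1.0)  # avoid dividing by zero on black
    return np.where(mx > 0, (mx - mn) / safe_mx, 0.0)

def first_over(ratios, indices, ratio_thresh):
    """First index in `indices` whose ratio exceeds the threshold."""
    for i in indices:
        if ratios[i] > ratio_thresh:
            return i
    return None

def auto_crop_bounds(rgb, sat_thresh=0.93, border_frac=0.10,
                     ratio_thresh=0.13):
    """Return (top, bottom, left, right) so that rgb[top:bottom, left:right]
    is the cropped image."""
    mask = saturation(rgb) > sat_thresh   # binary threshold map
    rows, cols = mask.shape
    row_ratio = mask.mean(axis=1)         # fraction of white pixels per row
    col_ratio = mask.mean(axis=0)         # fraction of white pixels per column
    band_r = max(1, int(rows * border_frac))
    band_c = max(1, int(cols * border_frac))

    # Scan each border band from its inner edge outwards.
    top = first_over(row_ratio, range(band_r - 1, -1, -1), ratio_thresh)
    bottom = first_over(row_ratio, range(rows - band_r, rows), ratio_thresh)
    left = first_over(col_ratio, range(band_c - 1, -1, -1), ratio_thresh)
    right = first_over(col_ratio, range(cols - band_c, cols), ratio_thresh)

    top = 0 if top is None else top + 1
    bottom = rows if bottom is None else bottom
    left = 0 if left is None else left + 1
    right = cols if right is None else right
    return top, bottom, left, right
```

Scanning from the inside out keeps the crop conservative: it stops at the first border-like row or column it meets, so image content further in is never removed.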



White Balancing: After the border is cropped, a white balancing algorithm takes over to enhance the color image. It assumes that either the average color or the brightest color is the illuminant and shifts it to white.
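The two illuminant assumptions correspond to two common white balancing schemes, sketched here for an RGB image scaled to [0, 1]. The function names are hypothetical, and the exact scaling used in the project is not stated. In this sketch the average-color variant maps the average to a neutral gray rather than to pure white, which is the usual "gray world" formulation.

```python
import numpy as np

def gray_world(rgb):
    """Treat the average color as the illuminant: scale each channel so
    that all three channel means become equal (a neutral gray)."""
    rgb = rgb.astype(np.float64)
    means = rgb.reshape(-1, 3).mean(axis=0)
    gains = means.mean() / np.maximum(means, 1e-12)
    return np.clip(rgb * gains, 0.0, 1.0)

def white_patch(rgb):
    """Treat the brightest value in each channel as the illuminant:
    scale each channel so that its maximum becomes 1.0 (white)."""
    rgb = rgb.astype(np.float64)
    peaks = rgb.reshape(-1, 3).max(axis=0)
    return np.clip(rgb / np.maximum(peaks, 1e-12), 0.0, 1.0)
```

The white-patch variant is sensitive to a few very bright pixels, so a high percentile is sometimes used in place of the maximum.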


Offset: R:(0, 24), G:(5, 2)

Offset: R:(86, -2), G:(35, 2)

Offset: R:(118, -14), G:(55, -5)

Offset: R:(107, 40), G:(49, 23)

Offset: R:(123, 15), G:(60, 18)

Offset: R:(88, 22), G:(38, 16)

Offset: R:(97, -27), G:(37, -4)

Offset: R:(59, 44), G:(23, 29)

Offset: R:(120, 13), G:(57, 9)

Offset: R:(3, 2), G:(-3, 2)

Offset: R:(132, -2), G:(64, 4)

Offset: R:(8, 0), G:(3, 1)

Offset: R:(171, 49), G:(77, 29)

Offset: R:(14, -1), G:(7, 0)

Offset: R:(138, 22), G:(67, 13)

Offset: R:(111, 8), G:(55, 12)

Offset: R:(85, 29), G:(40, 8)

Offset: R:(139, 40), G:(60, 31)

Offset: R:(93, 7), G:(43, 10)