Project 1: Colorizing the Prokudin-Gorskii photo collection

Rajita Pujare

Overview

In this project, the task was to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image. To do this, each plate image is divided into three equal parts, and the second and third parts (G and R) are aligned to the first (B).
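The split described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the code used for the project. It assumes the three parts are stacked vertically in the scanned plate, and `split_plate` is a hypothetical helper name.

```python
import numpy as np

def split_plate(plate):
    """Split a plate into (blue, green, red) thirds.

    Assumption: the three parts are stacked top to bottom in B, G, R order.
    """
    height = plate.shape[0] // 3  # drop leftover rows so the parts are equal
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

# Tiny synthetic "plate" with 7 rows: one leftover row is discarded.
plate = np.arange(7 * 4, dtype=np.float64).reshape(7, 4)
b, g, r = split_plate(plate)
print(b.shape, g.shape, r.shape)  # (2, 4) (2, 4) (2, 4)
```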

Exhaustive Searching

For the smaller JPGs, I aligned the channels by exhaustive search. I initially used the blue channel as my base, as it was the one given. To align the parts, I searched over a window of possible displacements in the range [-15, 15] pixels, both vertically and horizontally, and scored each candidate shift with an image matching metric. I tried both the sum of squared differences (SSD) and normalized cross-correlation (NCC) as the metric for similarity between two channel images, and SSD ultimately gave better results. The displacement that minimized SSD was taken as the best alignment.

SSD = sum(sum((image1 - image2).^2))

NCC = dot_product(image1./||image1||, image2./||image2||)
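As a rough Python/NumPy translation of the two formulas above (a sketch, not the project's code): `ssd` is lower for more similar images, while `ncc` is higher. The function names are my own, and `ncc` assumes neither image is all zeros, since it divides by the norm.

```python
import numpy as np

def ssd(image1, image2):
    """Sum of squared differences; lower means more similar."""
    diff = image1.astype(np.float64) - image2.astype(np.float64)
    return float(np.sum(diff ** 2))

def ncc(image1, image2):
    """Dot product of the two images after scaling each to unit norm.

    Higher means more similar. Assumes both images have a non-zero norm.
    """
    a = image1.astype(np.float64).ravel()
    b = image2.astype(np.float64).ravel()
    return float(np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b)))

a = np.array([[1.0, 2.0], [3.0, 4.0]])
print(ssd(a, a))        # 0.0
print(ssd(a, a + 1.0))  # 4.0
print(ncc(a, 2.0 * a))  # very close to 1: NCC ignores overall brightness scale
```

Note that, as written, NCC is unaffected by multiplying an image by a constant, whereas SSD is not.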

After the two channels were appropriately shifted, all three were stacked to produce the final image.
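The exhaustive search and the final stacking step might look like the sketch below. Several details here are assumptions rather than the project's actual implementation: shifts are applied with `np.roll` (which wraps pixels around the border), offsets are stored as (dy, dx) for rows and columns, and the demo uses a small random image with a search radius of 5 instead of 15 to keep it quick.

```python
import numpy as np

def ssd(image1, image2):
    return float(np.sum((image1 - image2) ** 2))

def align_exhaustive(channel, base, radius=15):
    """Return the (dy, dx) shift of `channel` that minimizes SSD against `base`."""
    best_shift, best_score = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps around the edges; a simplification for this sketch.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, base)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift

# Synthetic test data: green and red are shifted copies of blue.
rng = np.random.default_rng(0)
blue = rng.random((40, 40))
green = np.roll(blue, (-3, 2), axis=(0, 1))
red = np.roll(blue, (4, -1), axis=(0, 1))

g_shift = align_exhaustive(green, blue, radius=5)
r_shift = align_exhaustive(red, blue, radius=5)
print(g_shift, r_shift)  # (3, -2) (-4, 1)

# Stack the aligned channels in R, G, B order to form the color image.
color = np.dstack([
    np.roll(red, r_shift, axis=(0, 1)),
    np.roll(green, g_shift, axis=(0, 1)),
    blue,
])
print(color.shape)  # (40, 40, 3)
```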

Output Images

Monastery

Green offset (x, y): -3 2
Red offset (x, y): 3 2

Cathedral

Green offset (x, y): 5 2
Red offset (x, y): 12 3

Tobolsk

Green offset (x, y): 3 3
Red offset (x, y): 7 3

Pyramid Speedup

For the larger TIF images, it was clear that exhaustive search alone was not going to suffice. A window of [-15, 15] px is too small for images at this resolution (the offsets listed below exceed 100 px in several cases), and widening the exhaustive search enough to cover such displacements would take too long.

To address this, I used a coarse-to-fine image pyramid to handle the large TIF images. The image's resolution was repeatedly scaled down by a factor of 2 until it reached 1/16 of its original size, where an exhaustive search was run to find the coarsest offset. The image was then shifted by that offset and, at the next coarsest resolution (1/8 of the original size), the search was run again to adjust and update the calculated offset. Because each level doubles the resolution, an offset found at one level corresponds to twice as many pixels at the next. This was repeated at each finer level.

Finally, the full-resolution level is reached, and the accumulated offset is returned as the optimal displacement.
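A compact sketch of this coarse-to-fine procedure is given below. It is an illustration under assumptions, not the project's code: downscaling is done by averaging 2x2 blocks, shifts use `np.roll` (wraparound), offsets are (dy, dx), and the names `downscale2`, `search`, `align_pyramid`, and the `refine_radius` parameter are hypothetical.

```python
import numpy as np

def ssd(a, b):
    return float(np.sum((a - b) ** 2))

def search(channel, base, center, radius):
    """Exhaustive SSD search over shifts within `radius` of `center`."""
    best_shift, best_score = center, float("inf")
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = ssd(np.roll(channel, (dy, dx), axis=(0, 1)), base)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift

def downscale2(image):
    """Halve the resolution by averaging 2x2 blocks (odd edges are cropped)."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, base, levels=4, coarse_radius=15, refine_radius=2):
    """Coarse-to-fine alignment; levels=4 puts the coarsest level at 1/16 scale."""
    if levels == 0:
        # Coarsest level: full exhaustive search around zero shift.
        return search(channel, base, (0, 0), coarse_radius)
    coarse = align_pyramid(downscale2(channel), downscale2(base),
                           levels - 1, coarse_radius, refine_radius)
    # A 1 px shift at half resolution corresponds to 2 px at this level.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(channel, base, center, refine_radius)

# Toy check. The shift is a multiple of 16 so that every pyramid level
# sees a whole-pixel shift, which keeps this synthetic example exact.
rng = np.random.default_rng(0)
base = rng.random((256, 256))
channel = np.roll(base, (-48, 32), axis=(0, 1))
shift = align_pyramid(channel, base, coarse_radius=6)
print(shift)  # (48, -32)
```

In this sketch the full-resolution level only evaluates a 5x5 window of shifts, instead of the much larger window that a single-level exhaustive search would need to reach a 48 px displacement.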

Output Images

Church

Green offset (x, y): 22 3
Red offset (x, y): 54 -4

Lady

Green offset (x, y): 45 7
Red offset (x, y): 104 9

Melons

Green offset (x, y): 76 8
Red offset (x, y): 167 11

Workshop

Green offset (x, y): 48 0
Red offset (x, y): 92 -10

Onion Church

Green offset (x, y): 46 25
Red offset (x, y): 101 34

Self Portrait

Green offset (x, y): 73 27
Red offset (x, y): 164 34

Three Generations

Green offset (x, y): 45 14
Red offset (x, y): 100 11

Train

Green offset (x, y): 39 6
Red offset (x, y): 80 30

Harvesters

Green offset (x, y): 56 16
Red offset (x, y): 116 14

Icon

Green offset (x, y): 38 16
Red offset (x, y): 84 22

Emir

Green offset (x, y): 45 22
Red offset (x, y): -168 -43

As we can see above, the Emir photo was not successfully aligned. This is likely because the color channels have different levels of brightness, which prevents the image matching metric from identifying similarity.

However, by using green as the base channel (and aligning the red and blue channels to it), we see a much better result.

Emir (Fixed)

Blue offset (x, y): -45 -22
Red offset (x, y): 53 16
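As a quick consistency check on the Emir numbers, the snippet below works through the arithmetic of changing the base channel. It is only a sketch: it assumes offsets behave like pure translations (modeled here with `np.roll`), and the combined red-to-blue value it prints is derived from the listed offsets, not a measured result.

```python
import numpy as np

green_to_blue = (45, 22)    # Emir: green offset with blue as the base
blue_to_green = (-45, -22)  # Emir (Fixed): blue offset with green as the base
red_to_green = (53, 16)     # Emir (Fixed): red offset with green as the base

# Swapping the base and the moving channel negates the offset.
print(blue_to_green == tuple(-v for v in green_to_blue))  # True

# Translations compose by addition: red -> green -> blue.
red_to_blue = tuple(r + g for r, g in zip(red_to_green, green_to_blue))
print(red_to_blue)  # (98, 38)

# Confirm the composition rule on a dummy array.
image = np.arange(200 * 100).reshape(200, 100)
two_steps = np.roll(np.roll(image, red_to_green, axis=(0, 1)),
                    green_to_blue, axis=(0, 1))
one_step = np.roll(image, red_to_blue, axis=(0, 1))
print(np.array_equal(two_steps, one_step))  # True
```

The derived value of (98, 38) is very different from the red offset of (-168, -43) found with blue as the base, which is consistent with that first red alignment having failed.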

Other Images

Apples

Green offset (x, y): 6 1
Red offset (x, y): 15 0

Clouds

Green offset (x, y): 6 0
Red offset (x, y): 13 0

Man Breaking Wood

Green offset (x, y): 6 0
Red offset (x, y): 13 -1

Patterns

Green offset (x, y): 6 6
Red offset (x, y): 13 12