Project 1 - Colorizing the Prokudin-Gorskii photo collection

Overview and Approaches

The Prokudin-Gorskii photo collection features glass plate images, each of which holds three exposures of the same scene taken through red, green and blue filters, such as the following example:

The goal of this project is to separate the red, green and blue channels of each picture and stack them to create a color visualization of the image. Building partly on the starter code provided, I read in each image and convert it to double-valued. Then, I slice the original image into three parts of equal height to obtain three channels of the same size. The main part of the project is alignment, which I spent the most time on and will elaborate on in the following paragraphs.
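The slicing step can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function name `split_channels` is hypothetical, and the blue, green, red order from top to bottom is an assumption about how the plates are laid out.

```python
import numpy as np

def split_channels(plate):
    """Split a vertically stacked plate scan into three equal-height channels."""
    plate = np.asarray(plate)
    # Convert integer pixel data to double-valued intensities in [0, 1].
    if np.issubdtype(plate.dtype, np.integer):
        plate = plate.astype(np.float64) / np.iinfo(plate.dtype).max
    else:
        plate = plate.astype(np.float64)
    # Integer division drops any leftover rows so all three parts match in size.
    height = plate.shape[0] // 3
    blue = plate[:height]                # assumed order: blue on top
    green = plate[height:2 * height]     # green in the middle
    red = plate[2 * height:3 * height]   # red at the bottom
    return blue, green, red
```

If the plate height is not a multiple of three, the one or two extra rows at the bottom are simply discarded.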

I use the sum of squared differences (SSD) to score candidate displacements: the displacement with the smallest SSD gives the best alignment. My first approach was to define a function named 'align', which naively searches over a window of pixels ([-15, 15]) both horizontally and vertically to find the smallest SSD. This worked well for the jpg files, since they are small enough to process quickly. Additionally, I found it more accurate to crop out the edges (10% of the length) before computing the SSD.
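An exhaustive SSD search of this kind might look like the sketch below. It is an illustration under assumptions, not the original 'align': the helper names, the use of `np.roll` (which wraps pixels around the border) to apply a shift, and the choice to crop 10% from every side are all mine.

```python
import numpy as np

def crop_border(img, frac=0.10):
    """Drop a margin of `frac` of each dimension from every side."""
    h, w = img.shape
    dh, dw = int(h * frac), int(w * frac)
    return img[dh:h - dh, dw:w - dw]

def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    return np.sum((a - b) ** 2)

def align(channel, base, window=15):
    """Return the (x, y) shift in [-window, window] that minimizes the SSD."""
    best_shift, best_score = (0, 0), np.inf
    base_cropped = crop_border(base)
    for dy in range(-window, window + 1):        # vertical shift
        for dx in range(-window, window + 1):    # horizontal shift
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(crop_border(shifted), base_cropped)
            if score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift
```

With a window of 15 this evaluates 31 x 31 = 961 candidate shifts, which is cheap for small images but scales poorly to the full-resolution scans.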

To improve performance, I implemented a 'pyramid' function that builds on align using the idea of image pyramids. The function recursively rescales the image by a factor of 1/2 until it is small enough, then at each level searches for the best displacement around the coordinates returned by the call on the previous (coarser) level. In the base case, I use the align function defined above to find the displacement for the smallest image. The main formula is: if (x, y) is the displacement returned from the previous level and (x', y') is the displacement found at the current level, the final displacement returned for the current level is (2x + x', 2y + y'). This significantly reduced the runtime for large images, to under 3 minutes.
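The recursion can be sketched as below. This is a hedged illustration, not the original 'pyramid': the 2x2 block-average downscaling in `halve`, the `min_size` stopping threshold, and the refinement window of 2 pixels per level are assumptions chosen to keep the example self-contained in NumPy.

```python
import numpy as np

def search(channel, base, window):
    """Exhaustive SSD search on the image interior; returns (x, y)."""
    h, w = base.shape
    dh, dw = h // 10, w // 10                      # ignore a 10% border
    inner = (slice(dh, h - dh), slice(dw, w - dw))
    best, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted[inner] - base[inner]) ** 2)
            if score < best_score:
                best_score, best = score, (dx, dy)
    return best

def halve(img):
    """Downscale by 1/2 by averaging 2x2 blocks (assumed rescaling method)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid(channel, base, min_size=64, window=15, refine=2):
    """Coarse-to-fine alignment; returns the (x, y) shift at this resolution."""
    if min(base.shape) <= min_size:
        return search(channel, base, window)       # base case: full search
    x, y = pyramid(halve(channel), halve(base), min_size, window, refine)
    # Apply the doubled coarse estimate, then search a small window around it.
    coarse = np.roll(channel, (2 * y, 2 * x), axis=(0, 1))
    xp, yp = search(coarse, base, refine)
    return 2 * x + xp, 2 * y + yp
```

Only the smallest image pays for the full window; every finer level checks just (2 * refine + 1)^2 candidates, which is where the speedup on large files comes from.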

Results on Example Images

For the displacement coordinates [x, y] listed below, x is the horizontal shift and y is the vertical shift. I use the blue channel as the base channel, and shift the red and green channels to align with it.
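As an illustration of how a pair of [x, y] displacements could be applied to build the final image, here is a minimal sketch. The function name `compose` is hypothetical, and the sign convention (positive x moves the channel right, positive y moves it down) is an assumption, since the write-up does not state it.

```python
import numpy as np

def compose(blue, green, red, green_shift, red_shift):
    """Shift green and red by their [x, y] displacements and stack as RGB."""
    gx, gy = green_shift
    rx, ry = red_shift
    # np.roll takes (rows, columns), so the vertical shift y comes first.
    green_aligned = np.roll(green, (gy, gx), axis=(0, 1))
    red_aligned = np.roll(red, (ry, rx), axis=(0, 1))
    # Blue is the base channel and stays in place.
    return np.dstack([red_aligned, green_aligned, blue])
```

For example, the Cathedral result would correspond to a call like `compose(blue, green, red, (2, 5), (3, 12))` under this convention.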

Cathedral

Green: [2, 5]; Red: [3, 12]

Monastery

Green: [2, -3]; Red: [2, 3]

Tobolsk

Green: [3, 3]; Red: [3, 6]

Three Generations

Green: [14, 53]; Red: [11, 112]

Train

Green: [6, 42]; Red: [32, 87]

Emir

Green: [24, 49]; Red: [-189, 100]

Lady

Green: [9, 52]; Red: [12, 112]

Icon

Green: [17, 41]; Red: [23, 90]

Harvesters

Green: [17, 60]; Red: [14, 124]

Melons

Green: [10, 82]; Red: [13, 178]

Onion Church

Green: [26, 52]; Red: [36, 108]

Self Portrait

Green: [30, 80]; Red: [37, 176]

Village

Green: [12, 64]; Red: [22, 138]

Workshop

Green: [-12, 104]; Red: [0, 52]

Results on Other Images in the Collection

At the 5th water supply control, irrigation canal (aryk) in the Murgab Estate (Low-res)

Green: [2, 7]; Red: [3, 15]

Liesnaia doroga (High-res large file)

Green: [-12, 36]; Red: [7, 97]

Minor Fixes for Improvement

While aligning the Emir image, I found the result to be off by quite a lot. This is likely because the intensity of one of the color channels (G in this case) is too low, which makes the alignment between the other two channels hard to compute because the SSD values are too close.

To fix this, I used the green channel as the base channel instead, aligning the red and blue channels to it, and the result is much better.

Blue: [-24, -49]; Red: [57, 17]
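Switching the base channel can be sketched as follows. This is a simplified, single-scale illustration under assumptions: `best_shift` and `colorize` are hypothetical names, the search is a plain exhaustive SSD search rather than the pyramid, and shifts are applied with `np.roll`.

```python
import numpy as np

def best_shift(channel, base, window=15):
    """Exhaustive SSD search on the image interior; returns (x, y)."""
    h, w = base.shape
    dh, dw = h // 10, w // 10                      # ignore a 10% border
    inner = (slice(dh, h - dh), slice(dw, w - dw))
    best, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted[inner] - base[inner]) ** 2)
            if score < best_score:
                best_score, best = score, (dx, dy)
    return best

def colorize(blue, green, red, base="green", window=15):
    """Align the two non-base channels to `base`; return the RGB image and shifts."""
    channels = {"blue": blue, "green": green, "red": red}
    aligned, shifts = {}, {}
    for name, img in channels.items():
        if name == base:
            aligned[name] = img                    # the base channel stays put
            continue
        dx, dy = best_shift(img, channels[base], window)
        shifts[name] = (dx, dy)
        aligned[name] = np.roll(img, (dy, dx), axis=(0, 1))
    rgb = np.dstack([aligned["red"], aligned["green"], aligned["blue"]])
    return rgb, shifts
```

Passing `base="blue"` reproduces the default setup used for the other images, so the base channel becomes a per-image choice rather than a code change.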