CS 194 - 26: Computational Photography Proj 1

Ritika Shrivastava, cs194-26-ate

Background

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) [Сергей Михайлович Прокудин-Горский] believed early on (1907) that color images were the wave of the future. He travelled across the Russian Empire and took photographs of everything that he saw. He would record three exposures onto a glass plate using a red, green, and blue filter. During his lifetime, it was never possible to print color photographs. However, his RGB glass plate negatives, capturing the last years of the Russian Empire, survived and were purchased in 1948 by the Library of Congress. The LoC has recently digitized the negatives and made them available online.

Project

For this project, I took the image negatives that Prokudin-Gorskii produced and aligned them to create beautiful images of the Russian Empire.

The very first steps were to import the image and split it into three segments. The first block was the blue channel image (B), then came green (G), and last was red (R). Once the images were imported, I began work on alignment.
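The split can be sketched as follows. This is a minimal illustration, assuming the scan is loaded as a 2-D NumPy array with the three exposures stacked vertically in B, G, R order; the helper name `split_plate` and the synthetic plate are hypothetical.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate scan into B, G, R thirds (top to bottom)."""
    height = plate.shape[0] // 3          # drop leftover rows so all thirds match
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return b, g, r

# A synthetic 9x4 "plate" stands in for a real scan (assumption for illustration).
plate = np.arange(36, dtype=float).reshape(9, 4)
b, g, r = split_plate(plate)
```

Integer division keeps the three channels the same shape, which the later scoring steps rely on.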

0-Alignment

To align the images, I started by looking at what they looked like before any alignment. I called this 0-Alignment. As you can see in the result table at the bottom of this page, those images are extremely blurry and very hard to decipher. So much for viewing the Russian Empire.
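Overlaying the channels without any shift might look like the sketch below. The function name `stack_rgb` and its shift parameters are hypothetical, and using `np.roll` to apply a displacement is an assumption rather than something stated here.

```python
import numpy as np

def stack_rgb(r, g, b, g_shift=(0, 0), r_shift=(0, 0)):
    """Overlay three channels as an RGB image, optionally shifting G and R first."""
    g = np.roll(g, g_shift, axis=(0, 1))  # (rows, cols) displacement
    r = np.roll(r, r_shift, axis=(0, 1))
    return np.dstack([r, g, b])           # channel order R, G, B

# 0-Alignment: no shifts at all, on tiny synthetic channels.
b = np.zeros((4, 5))
g = np.ones((4, 5))
r = np.full((4, 5), 0.5)
unaligned = stack_rgb(r, g, b)
```

The same helper can take the G and R alignment vectors found later to build the final color image.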

Scoring Mechanism

The next approach I took was to align the images using a scoring mechanism. I shifted G or R to match the B frame (B was my reference frame) and assigned each alignment a score. The alignment with the best score produced the final image. For these alignments, I tried every shift in [-15, 15] along both X and Y. This meant that just to align G with B, I was comparing 31 × 31 = 961 shifted versions of G with B. This technique worked well on smaller images, but larger images raised two problems. First, [-15, 15] was too small for images with heights and widths in the thousands. These require larger ranges, and the larger the range, the more comparisons. Second, the large images took too much time. One image took me around 480 seconds! So for larger images, I decided to use a different approach.
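The window search can be sketched like this. It is a minimal version under stated assumptions: shifts are applied with `np.roll` (which wraps pixels around the edge), the score is a squared-difference sum where lower is better, and the name `align_exhaustive` is hypothetical.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means a closer match."""
    return np.sum((a - b) ** 2)

def align_exhaustive(channel, reference, radius=15, score=ssd):
    """Try every (dy, dx) in [-radius, radius] and keep the lowest-scoring shift."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            s = score(shifted, reference)
            if s < best_score:
                best_score, best_shift = s, (dy, dx)
    return best_shift

# Synthetic check: displace a random image, then recover the displacement.
rng = np.random.default_rng(0)
reference = rng.random((40, 40))
channel = np.roll(reference, (-3, 5), axis=(0, 1))
shift = align_exhaustive(channel, reference)
```

The cost grows with the square of the radius and with the pixel count, which is consistent with the long run times on the large scans.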

Before describing the approach for larger images, it is important to mention that I used two algorithms to score an alignment. The first was Sum of Squared Differences (SSD).

sum((image1-image2)^2)

The second approach was normalized cross-correlation (NCC). This gave me better results, so I used it for the later approaches as well. Normalized cross-correlation is the dot product of the two images after each is divided by its norm.

image1/||image1|| (dot) image2/||image2||
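Both formulas translate almost directly into NumPy. The sketch below assumes floating-point images of equal shape; the function names are chosen for illustration.

```python
import numpy as np

def ssd(image1, image2):
    """Sum of squared differences: 0 for identical images, lower is better."""
    return float(np.sum((image1 - image2) ** 2))

def ncc(image1, image2):
    """Dot product of the unit-normalized images: higher is better."""
    v1 = image1.ravel() / np.linalg.norm(image1)
    v2 = image2.ravel() / np.linalg.norm(image2)
    return float(np.dot(v1, v2))

a = np.array([[1.0, 2.0], [3.0, 4.0]])
brighter = 2.0 * a  # same picture, doubled brightness
ssd_score = ssd(a, brighter)
ncc_score = ncc(a, brighter)
```

Note the opposite conventions: SSD is minimized while NCC is maximized. Because each image is divided by its norm, NCC is unchanged when one image is uniformly brighter, whereas SSD penalizes the difference. That may be part of why NCC gave better results here.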

For the approach above, I removed 25 pixels from the borders of the smaller images. This removed the black border.

While this approach was good with smaller images, it had several issues with larger images, which were mentioned above. The next approach I took was using an image pyramid.

Image Pyramids

For this approach, I repeatedly shrank each image to half its size until the size was less than 200 pixels.

Then I used the NCC scoring mechanism to match the image with its reference. Once this was done, I scaled up the image and doubled the value of the alignment vector. This gave me really good results.
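A coarse-to-fine search along these lines could be sketched as below. Several details are assumptions made for illustration: halving is done by averaging 2x2 blocks instead of a library rescale, the stopping test uses the smaller image dimension, each finer level re-searches a small window around the doubled vector, and all function names are hypothetical.

```python
import numpy as np

def ncc(a, b):
    return np.dot(a.ravel() / np.linalg.norm(a), b.ravel() / np.linalg.norm(b))

def search(channel, reference, center, radius):
    """Best (dy, dx) within `radius` of `center`, scored by NCC (higher is better)."""
    best_shift, best_score = center, -np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            s = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if s > best_score:
                best_score, best_shift = s, (dy, dx)
    return best_shift

def halve(image):
    """Downscale by 2 by averaging 2x2 blocks (stand-in for a library rescale)."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, reference, min_size=200, coarse_radius=15, refine_radius=2):
    """Align at the coarsest level first, then double and refine the vector."""
    if min(channel.shape) < min_size:
        return search(channel, reference, (0, 0), coarse_radius)
    dy, dx = align_pyramid(halve(channel), halve(reference),
                           min_size, coarse_radius, refine_radius)
    return search(channel, reference, (2 * dy, 2 * dx), refine_radius)

# Synthetic check on a 256x256 image displaced by (-12, 20).
rng = np.random.default_rng(0)
reference = rng.random((256, 256))
channel = np.roll(reference, (-12, 20), axis=(0, 1))
shift = align_pyramid(channel, reference)
```

Only the smallest level pays for the full window; every larger level scores just a handful of shifts, which fits the drop from hundreds of seconds to around ten reported in the results.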

The only issue I had with the result was that the borders were giving me problems. To remove them, I cropped 4% from the sides of the images. This led to most of the images being aligned.
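A percentage crop of this kind could look like the sketch below. It assumes the same fraction is trimmed from all four sides, which is one reading of the description; `crop_border` is a hypothetical name.

```python
import numpy as np

def crop_border(image, fraction=0.04):
    """Trim `fraction` of the height and width from every side."""
    dy = int(round(image.shape[0] * fraction))
    dx = int(round(image.shape[1] * fraction))
    return image[dy:image.shape[0] - dy, dx:image.shape[1] - dx]

image = np.zeros((100, 200))
cropped = crop_border(image)              # 4% per side
cropped_more = crop_border(image, 0.08)   # 8% per side
```

Cropping only needs to happen before scoring, so the alignment vector found on the cropped channels can still be applied to the full-size ones.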

At this point, the only images that were not aligned were Cathedral and Emir. Cathedral was not working because a lot of the border was still present after the 4% cropping. This was fixed when I applied 8% cropping. Emir was not working because its color channels differ in brightness, so raw pixel values do not match well across them.

Fixing Emir

To fix Emir, I applied a strong contrast to R. If a pixel's intensity was less than 0.5, it was assigned 0, and if it was greater than or equal to 0.5, it was assigned 1. This reduced the red image to the black figure of the Emir, with a few highlights on the ground. The result aligned well with the green and blue channels.
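The thresholding rule can be written in one line of NumPy. This sketch assumes intensities are floats in [0, 1]; `binarize` is a hypothetical name.

```python
import numpy as np

def binarize(channel, threshold=0.5):
    """Map intensities below `threshold` to 0 and all others to 1."""
    return (channel >= threshold).astype(float)

r = np.array([[0.1, 0.5], [0.49, 0.9]])
r_contrast = binarize(r)
```

The binarized channel would only be used to find the alignment vector; the original R values are still the ones to place in the final image.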

Future Improvements

In the future, I would like to implement other scoring mechanisms. I would also like to automate border cropping and contrasting.

Results

Key

1st value = Time (seconds)

2nd value = G alignment vector

3rd value = R alignment vector

Final Results

Emir - Emir contrasting [49, 24] [104, 42]
Three Generations - Pyramid w/ Cropping 8.788250207901001 [54, 12] [111, 9]
Workshop - Pyramid w/ Cropping 8.532160997390747 [53, -1] [105, -13]
Harvesters - Pyramid w/ Cropping 11.412437915802002 [60, 16] [124, 13]
Monastery - Pyramid w/ Cropping 0.9431657791137695 [-3, 2] [3, 2]
Onion Church - Pyramid w/ Cropping 10.915465831756592 [52, 25] [108, 36]
Self-Portrait - Pyramid w/ Cropping 11.768255949020386 [78, 28] [176, 36]
Icon - Pyramid w/ Cropping 11.564610242843628 [40, 17] [89, 23]
Tobolsk - Pyramid w/ Cropping 1.387908935546875 [3, 2] [6, 3]
Castle - Pyramid w/ Cropping 9.217112064361572 [35, 2] [98, 3]
Lady - Pyramid w/ Cropping 10.759226083755493 [56, 0] [118, 10]
Train - Pyramid w/ Cropping 11.50697922706604 [42, 1] [86, 30]
Melons - Pyramid w/ Cropping 10.330153942108154 [82, 9] [178, 12]

Cathedral - Pyramid for Small Images [5, 2] [12, 3]

My Selection

Pyramid w/ Cropping 7.5440943241119385 [39, -12] [86, -27]

Pyramid w/ Cropping 7.397198915481567 [8, 8] [60, 13]
Pyramid w/ Cropping 7.033835172653198 [28, 3] [68, 7]
Pyramid w/ Cropping 6.9268810749053955 [48, 12] [115, 30]

Pyramid w/ Cropping 7.39547324180603 [38, 21] [76, 35]

Result Table

- Had to remove some images from the table due to webpage sizing issues

The results for 0-alignment, normalized cross-correlation, and pyramid with cropping are shown below.

The caption of each image contains a few values. The first value is the time in seconds that the program took to align the image with the respective algorithm. The second value is the vector used to align the G channel. The third value is the vector used to align the R channel.

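Three Generations - 0-Alignment [0, 0] [0, 0]
Three Generations - Normalized Cross-Correlation 238.00124073028564 [15, 1] [15, 3]
Three Generations - Pyramid w/ Cropping 8.788250207901001 [54, 12] [111, 9]

Emir - 0-Alignment [0, 0] [0, 0]
Emir - Normalized Cross-Correlation 217.00218415260315 [0, 7] [15, 15]
Emir - Pyramid w/ Cropping 8.0000581741333 [49, 24] [7, 31]

Melons - 0-Alignment [0, 0] [0, 0]
Melons - Normalized Cross-Correlation 259.379695892334 [15, -4] [15, -8]
Melons - Pyramid w/ Cropping 10.330153942108154 [82, 9] [178, 12]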