Project 1: Colorizing the Prokudin-Gorskii photo collection
CS 194-26 Fall 2020
September 08, 2020
Sergei Mikhailovich Prokudin-Gorskii captured images of the Russian Empire from 1907 onwards, ranging from landscapes to people to architecture, on red, green, and blue glass plates with the intention of producing color photographs. When Prokudin-Gorskii left the Russian Empire after the Russian Revolution in 1918, his images were lost. Luckily, the plates were recovered and later purchased by the Library of Congress, and the entire Prokudin-Gorskii collection can now be viewed on the Library of Congress website. In this project, we first aligned the three plates of each image and then stacked them on top of one another to produce the final colorized image that Prokudin-Gorskii intended to create.
The basic idea for aligning the images was to leave the blue plate static and apply an align function to the red and green plates to align each of them to the blue plate. From there, we stacked the blue plate with the aligned red and green plates to get the final image.
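The per-image workflow can be sketched as below. This is a minimal sketch, assuming the scan stacks the blue, green, and red exposures vertically (as in the Library of Congress scans); `align` stands in for whichever alignment function is used:

```python
import numpy as np

def split_plates(scan):
    """Split a vertically stacked glass-plate scan into its three channels.
    Assumes the blue, green, and red exposures are stacked top to bottom."""
    height = scan.shape[0] // 3
    b = scan[:height]
    g = scan[height:2 * height]
    r = scan[2 * height:3 * height]
    return b, g, r

def colorize(scan, align):
    """Keep blue fixed, align red and green to it, and stack into RGB.
    `align` is any function mapping (plate, reference) -> aligned plate."""
    b, g, r = split_plates(scan)
    return np.dstack([align(r, b), align(g, b), b])
```

Keeping `align` as a parameter lets the same stacking code work with either the exhaustive search used for small images or the pyramid search used for the high-resolution scans.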
For small images, I calculated the sum of squared differences (SSD) between the blue plate and vertically and horizontally shifted versions of the current plate. Since the SSD measures the squared difference in pixel values, a smaller SSD means the two plates are closer to matching. So, over all vertical and horizontal shifts, we find the one that results in the minimum SSD and apply that shift to the current plate. I searched a range of [-20, 20] pixels in both x and y to align the small images.
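The exhaustive SSD search can be sketched as follows. This is a simplified version using NumPy, in which `np.roll` wraps pixels around the image edges rather than handling the borders more carefully:

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized plates."""
    return np.sum((a - b) ** 2)

def align_ssd(plate, ref, window=20):
    """Search displacements in [-window, window] along both axes and
    return the (dy, dx) shift of `plate` that minimizes the SSD
    against the reference plate."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            score = ssd(np.roll(plate, (dy, dx), axis=(0, 1)), ref)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The returned shift is then applied to the plate (again with `np.roll`, or an equivalent crop) before stacking.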
Additionally, I found that the borders of the plates would typically throw off the SSD and result in a less clear image (especially for monastery.jpg), so pre-cropping each plate helped focus the SSD on the pixels that actually mattered. The crop used was [20:300, 40:350] for each plate.
For the high-resolution images, the previous method is inefficient. Instead, we use an image pyramid (in this case a Gaussian pyramid). The idea is to start with the original image and repeatedly scale it down until the image at the top level of the pyramid is small enough to search exhaustively. Once we have an x and y displacement from the image at the top of the pyramid, we multiply it by the inverse of the scale factor and then, at the next level down, search only the few pixels around that estimate that the coarser level could not resolve. We continue until we reach the original image at the bottom of the pyramid. The pyramid narrows the search range at each level and therefore lets us align the images in a reasonable amount of time. My implementation uses a five-level pyramid and takes roughly 40-50 seconds per image.
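The coarse-to-fine recursion can be sketched as follows. This is a simplified version: a 2x2 box average stands in for a proper Gaussian blur before subsampling, `np.roll` wraps around the borders, and the refinement window of +-2 pixels is an illustrative choice rather than the exact value from my implementation:

```python
import numpy as np

def exhaustive_align(plate, ref, window):
    """Brute-force SSD search over shifts in [-window, window]."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            score = np.sum((np.roll(plate, (dy, dx), axis=(0, 1)) - ref) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downscale(im):
    """Halve the image by averaging 2x2 blocks (a stand-in for a
    Gaussian blur followed by subsampling)."""
    h, w = im.shape[0] // 2 * 2, im.shape[1] // 2 * 2
    im = im[:h, :w]
    return (im[0::2, 0::2] + im[1::2, 0::2]
            + im[0::2, 1::2] + im[1::2, 1::2]) / 4.0

def pyramid_align(plate, ref, levels=5, window=20):
    """Recurse on half-size images, double the coarse estimate,
    then refine with a small local search at the current level."""
    if levels == 1 or min(plate.shape) < 2 * window:
        return exhaustive_align(plate, ref, window)
    dy, dx = pyramid_align(downscale(plate), downscale(ref),
                           levels - 1, window)
    dy, dx = 2 * dy, 2 * dx  # upscale the coarse displacement
    rdy, rdx = exhaustive_align(np.roll(plate, (dy, dx), axis=(0, 1)),
                                ref, window=2)
    return dy + rdy, dx + rdx
```

Because each level only refines the previous estimate by a couple of pixels, the total work stays close to one exhaustive search on the smallest level plus a handful of cheap local searches.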
As with the previous method, I manually crop each plate before running the align function to remove inconsistencies due to the borders. The crop used was [200:3000, 200:3500] for each plate.
Note: The Emir photo does not align well, as shown above. This is due to brightness inconsistencies across the color channels, which make the SSD metric unreliable.
Castle offset: r[4, 98] g[3, 35]
Emir offset: r[-680, 73] g[24, 49]
Harvesters offset: r[13, 124] g[16, 59]
Icon offset: r[23, 89] g[17, 40]
Lady offset: r[11, 112] g[9, 47]
Melons offset: r[13, 178] g[10, 82]
Onion Church offset: r[36, 108] g[26, 51]
Self Portrait offset: r[36, 175] g[27, 77]
Three Generations offset: r[14, 53] g[11, 111]
Train offset: r[31, 87] g[4, 42]
Workshop offset: r[-12, 104] g[0, 52]
Tobolsk offset: r[3, 6] g[2, 3]
Monastery offset: r[2, 3] g[1, -3]
Cathedral offset: r[-1, 9] g[-1, 2]
Water Lilies offset: r[-28, 117] g[-6, 46]
Mosaic on Wall offset: r[36, 128] g[22, 61]
Apricot Flowers offset: r[33, 117] g[24, 53]