CS 194 Project 1: Colorizing the Prokudin-Gorskii Photo Collection

By Sainan Chen

1 Overview

Sergei Mikhailovich Prokudin-Gorskii, a Russian photographer, recorded three exposures of each of many scenes onto a glass plate using a red, a green, and a blue filter. The pictures on the right (from left to right) show the camera he used to record the exposures, the lantern projector he planned to use to show color pictures, and an example of a glass plate negative. This project aims to automatically crop and align the three glass plate negatives in each input image and produce a single color output image. See more example inputs from the Prokudin-Gorskii collection.

2 Algorithms

The sample input images come in two types: small jpg files and large tif files.

2.1 Sum of Squared Differences (SSD)

For small jpg files, I first divided the input image vertically into three equal-size subparts corresponding to the B (blue), G (green), and R (red) channels. Then I cropped the white and black borders of every subimage so that they would not interfere with alignment. The third step was to align the G channel and the R channel to the B channel, one at a time, by minimizing the Sum of Squared Differences (SSD) between the two matrices. If the two matrices are well aligned, we expect their SSD to be very small; otherwise, noticeable differences between their patterns remain and the SSD is larger. I searched a predefined offset range of [-15, 15] in both the x and y directions and kept the offset that gives the minimum SSD. Finally, the three aligned channels are stacked along the third dimension to produce the color image. A second, post-alignment cropping step then removes the useless borders created by the alignment.
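The splitting and SSD search described above can be sketched as follows. This is a minimal illustration in plain Python, not the Matlab code used for the project. The function names (`split_plate`, `shift`, `ssd`, `align_ssd`) are hypothetical, images are assumed to be nested lists of pixel values, and shifts are assumed to wrap around the image edges, which may differ from how the original implementation treats pixels shifted out of frame.

```python
def split_plate(plate):
    """Split a stacked plate (list of rows) into B, G, R thirds, top to bottom."""
    h = len(plate) // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]


def shift(img, dx, dy):
    """Circularly shift img by dx columns (right) and dy rows (down)."""
    h, w = len(img), len(img[0])
    return [[img[(r - dy) % h][(c - dx) % w] for c in range(w)] for r in range(h)]


def ssd(a, b):
    """Sum of squared differences between two equal-size images."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def align_ssd(channel, reference, radius=15):
    """Return the (dx, dy) within [-radius, radius] that minimizes the SSD."""
    offsets = [(dx, dy)
               for dx in range(-radius, radius + 1)
               for dy in range(-radius, radius + 1)]
    return min(offsets, key=lambda o: ssd(shift(channel, *o), reference))
```

Under these assumptions, `align_ssd(g, b)` would give the shift to apply to the G channel before stacking it with B, and likewise for R.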

2.2 Image Pyramid

For large tif files, however, checking every possible alignment within a relatively large offset range is very time-consuming. The first solution I came up with was to shrink the image to a manageable size and enlarge it again after alignment. However, much of the information in the image is lost during shrinking, as shown by the blurred view and wide black border below. Therefore, I decided to use a search procedure called an image pyramid. The image is repeatedly shrunk by a fixed factor (2 in my case) until it is small enough for the alignment algorithm to run efficiently. Then we move back up through the larger sizes, and at every level we scale the rough x and y offsets found at the previous level by the same factor and refine them by searching only a small range around them.

Shrink image directly
Use image pyramid
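A coarse-to-fine search of this kind might look like the sketch below. It is written in plain Python for illustration and is not the project's Matlab code. The names `downsample`, `best_offset`, and `align_pyramid`, the 2x2 block averaging, the wrap-around shift, and the default values of `min_size`, `coarse_radius`, and `refine_radius` are all assumptions made for the example.

```python
def shift(img, dx, dy):
    """Circularly shift img by dx columns (right) and dy rows (down)."""
    h, w = len(img), len(img[0])
    return [[img[(r - dy) % h][(c - dx) % w] for c in range(w)] for r in range(h)]


def ssd(a, b):
    """Sum of squared differences between two equal-size images."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_offset(channel, reference, center, radius):
    """Exhaustive SSD search over offsets within `radius` of `center`."""
    cx, cy = center
    candidates = [(cx + dx, cy + dy)
                  for dx in range(-radius, radius + 1)
                  for dy in range(-radius, radius + 1)]
    return min(candidates, key=lambda o: ssd(shift(channel, *o), reference))


def downsample(img):
    """Halve each dimension by averaging 2x2 blocks (an odd last row/column is dropped)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4
             for c in range(w)] for r in range(h)]


def align_pyramid(channel, reference, min_size=32, coarse_radius=15, refine_radius=2):
    """Solve at half resolution first, then double that offset and refine it."""
    if min(len(channel), len(channel[0])) <= min_size:
        # Coarsest level: the image is small enough for a full-range search.
        return best_offset(channel, reference, (0, 0), coarse_radius)
    dx, dy = align_pyramid(downsample(channel), downsample(reference),
                           min_size, coarse_radius, refine_radius)
    # An offset of (dx, dy) at half resolution is about (2*dx, 2*dy) here.
    return best_offset(channel, reference, (2 * dx, 2 * dy), refine_radius)
```

The full-range search runs only on the smallest image. Every larger level checks just `(2 * refine_radius + 1) ** 2` candidates, which is where the time saving comes from.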

3 How can we improve?

3.1 Sum of Standard Deviations instead of SSD for different light intensities

In the previous part, I mentioned that I used SSD to find the offsets for the best alignment. However, when I ran the algorithm on the image emir, the output was not ideal (see below). The misalignment is obvious: neither the R nor the G matrix is well aligned. If we look at the input image (at the top), we can clearly see that the light intensity (brightness) of the three plate negatives is noticeably different, which makes the SSD large even for two perfectly matched matrices. Therefore, I used the sum of the standard deviations of the columns of the difference matrix to eliminate the effect of the different light intensities. If two matrices are perfectly aligned, the intensity difference is the same at every position, so the variance of the difference is 0. We can therefore find the optimal offsets by minimizing the sum of standard deviations.

Use SSD for input with different light intensities
R offsets are (43, 105)
G offsets are (25, 50)
Use Sum of Std instead
R offsets are (45, 67)
G offsets are (25, 50)
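The sum-of-standard-deviations metric can be sketched in plain Python as below. This is an illustration rather than the project's Matlab code. The names `sum_of_std` and `align` are hypothetical, the population standard deviation (`statistics.pstdev`) is an assumed choice, and shifts are assumed to wrap around the image edges.

```python
from statistics import pstdev


def shift(img, dx, dy):
    """Circularly shift img by dx columns (right) and dy rows (down)."""
    h, w = len(img), len(img[0])
    return [[img[(r - dy) % h][(c - dx) % w] for c in range(w)] for r in range(h)]


def ssd(a, b):
    """Sum of squared differences between two equal-size images."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def sum_of_std(a, b):
    """Sum over columns of the standard deviation of the difference image a - b."""
    diff = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    return sum(pstdev(column) for column in zip(*diff))


def align(channel, reference, metric, radius=15):
    """Return the (dx, dy) within [-radius, radius] that minimizes `metric`."""
    offsets = [(dx, dy)
               for dx in range(-radius, radius + 1)
               for dy in range(-radius, radius + 1)]
    return min(offsets, key=lambda o: metric(shift(channel, *o), reference))
```

If one channel is a uniformly brighter copy of the other, every column of the difference is constant, so `sum_of_std` is 0 at the correct alignment while `ssd` stays large. Passing either function as `metric` lets the same search compare the two.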

For most inputs, which have similar light intensities across the three plate negatives, I also applied both SSD and Sum of Std to compare the two algorithms. As the images below show, there is no significant difference between the alignments they produce.

Use SSD for alignment
R offsets are (37, 109)
G offsets are (28, 52)
Use Sum of Std for alignment
R offsets are (37, 109)
G offsets are (27, 52)

3.2 Pre and Post Cropping

I performed cropping twice during the whole process.

The first cropping happens after the image is separated into three subparts and before alignment. Its purpose is to remove the white and black borders of the original plate; otherwise the very high or very low intensity of the border pixels would distort the SSD values computed later and hurt accuracy.

The second cropping happens after alignment and concatenation, before the final output is produced. The shifts of the R and G matrices leave colorful borders on the output image. These colors are distracting, so we want to remove them.

I detect and remove these border pixels by deleting every row or column whose standard deviation is very small. A standard deviation close to 0 means the row or column has nearly the same value in every entry, usually all 0 or all 1. Another way to detect an irrelevant border is to check whether the mean value of the row or column is very small or very large in any color channel. That means the row or column is either part of the white or black border of the original image, or lies outside the matrix we want to keep after alignment (where all entries should equal 0). Below is an example of how I removed the border on the left and the top of the output image after alignment and concatenation.

Output without post cropping
Output after post cropping
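The row and column test described above could be sketched as follows. This is plain Python for illustration, not the project's Matlab code. The names `is_border` and `crop_borders` and the threshold values are hypothetical, and pixel values are assumed to lie in [0, 1].

```python
from statistics import mean, pstdev


def is_border(values, std_tol=0.01, low=0.05, high=0.95):
    """True if a row/column is nearly constant, nearly black, or nearly white."""
    return pstdev(values) < std_tol or mean(values) < low or mean(values) > high


def crop_borders(channels, **thresholds):
    """Drop every row/column flagged as border in any channel.

    `channels` is a list of equal-size 2D lists (for example [R, G, B]).
    """
    h, w = len(channels[0]), len(channels[0][0])
    keep_rows = [r for r in range(h)
                 if not any(is_border(ch[r], **thresholds) for ch in channels)]
    keep_cols = [c for c in range(w)
                 if not any(is_border([row[c] for row in ch], **thresholds)
                            for ch in channels)]
    return [[[ch[r][c] for c in keep_cols] for r in keep_rows] for ch in channels]
```

As written, the sketch removes any flagged row or column, including one in the interior of the image. Restricting the test to rows and columns near the edges would be a safer variant for photos with flat regions.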

For pre-cropping, however, detecting the border using the mean or standard deviation is not a general solution. Some images contain sky whose color is very close to white, so a threshold is hard to choose. Therefore, I finally chose to cut the black and white borders of the original plate by a fixed proportion of the image.
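A fixed-proportion crop needs no thresholds at all, as the short Python sketch below shows. The name `crop_fraction` and the 10% default are assumptions for illustration; the proportion actually used in the project is not stated here.

```python
def crop_fraction(img, fraction=0.1):
    """Trim `fraction` of the height and width from each side of a 2D image."""
    h, w = len(img), len(img[0])
    dy, dx = int(h * fraction), int(w * fraction)
    return [row[dx:w - dx] for row in img[dy:h - dy]]
```

Applying the same call to each of the B, G, and R subimages keeps them the same size, which the later alignment step relies on.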

4 Gallery

4.1 Small jpg images

R offsets are (3, 12)
G offsets are (2, 5)
R offsets are (2, 3)
G offsets are (2, -3)
R offsets are (3, 6)
G offsets are (3, 3)

4.2 Large tif images

R offsets are (5, 99)
G offsets are (4, 35)
R offsets are (24, 90)
G offsets are (18, 42)
R offsets are (15, 179)
G offsets are (11, 82)
R offsets are (13, 117)
G offsets are (10, 56)
R offsets are (14, 124)
G offsets are (18, 60)
R offsets are (45, 67)
G offsets are (25, 50)
R offsets are (37, 109)
G offsets are (28, 52)
R offsets are (38, 177)
G offsets are (30, 79)
R offsets are (12, 112)
G offsets are (15, 13)
R offsets are (33, 88)
G offsets are (7, 44)
R offsets are (-11, 106)
G offsets are (1, 54)

4.3 Some more large tif images

R offsets are (2, 42)
G offsets are (7, 14)
R offsets are (20, 39)
G offsets are (16, 15)
R offsets are (36, 78)
G offsets are (22, 39)
R offsets are (5, 121)
G offsets are (4, 27)
R offsets are (16, 42)
G offsets are (14, 15)
R offsets are (12, 101)
G offsets are (3, 41)
R offsets are (-23, 97)
G offsets are (-5, 50)

5 Conclusion

In this project, I refreshed my knowledge of Matlab, which I hadn't used for years. I learned a lot about basic image manipulation as well as more complicated algorithms such as SSD alignment and image pyramids in Matlab. Although the concepts are easy to understand, I spent some time struggling with matrix manipulation and getting all the dimensions to agree. Meanwhile, I had a great time playing with the photos and seeing their real colors after processing, and I am impressed by the sights, figures, and buildings that Prokudin-Gorskii captured.