CS 194-26: Spring 2020
Project 1: Colorizing the Prokudin-Gorskii photo collection
Megan Lee


In this project, I colorized photos taken by Sergei Mikhailovich Prokudin-Gorskii (1863-1944). Starting from the digitized glass plate images, I extracted the three color channel images and wrote an algorithm that aligns and stacks them into a single color image, which let me realize Prokudin-Gorskii's color photography.

Below is an example input and output that I created using my colorizing algorithm.

My Approach: Low Resolution Images

For the smaller JPEG images, alignment was straightforward. I used a naive exhaustive search: I looped over a window of possible (x, y) displacements (I chose [-15, 15] pixels in each direction), scored each displacement with the Sum of Squared Differences (SSD), and kept the displacement with the lowest SSD score.
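The search described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function names `ssd` and `align_exhaustive` are hypothetical, the channels are assumed to be 2-D NumPy arrays of equal size, and `np.roll` is used for shifting, which wraps pixels around the image edges (an assumption; the original shifting method is not stated).

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def align_exhaustive(channel, reference, window=15):
    """Try every (dx, dy) in [-window, window] and keep the lowest SSD.

    dx shifts along columns, dy along rows (an assumed convention).
    """
    best_score = np.inf
    best_shift = (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # np.roll wraps around the borders; assumed here for simplicity.
            shifted = np.roll(channel, shift=(dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift
```

With a [-15, 15] window this scores 31 x 31 = 961 candidate displacements per channel, which is cheap for small images.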

I compared the red channel to the blue channel, and the green channel to the blue channel, keeping the blue channel static. Then I shifted the red and green channels by their lowest-scoring (x, y) displacements and stacked them with the static blue channel to create the color image!
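A rough sketch of the stacking step, under stated assumptions: the scanned plate is taken to hold the three exposures stacked vertically in blue, green, red order from top to bottom, and shifts are applied with `np.roll`. The names `split_plate`, `shift`, and `colorize` are hypothetical and only meant to illustrate the idea.

```python
import numpy as np

def split_plate(plate):
    """Cut a vertically stacked plate into three equal-height channels.

    Assumes top-to-bottom order blue, green, red.
    """
    height = plate.shape[0] // 3
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return b, g, r

def shift(channel, displacement):
    """Shift a channel by (dx, dy): dx along columns, dy along rows."""
    dx, dy = displacement
    return np.roll(channel, shift=(dy, dx), axis=(0, 1))

def colorize(b, g, r, g_shift, r_shift):
    """Keep blue static, shift green and red, and stack as an RGB image."""
    return np.dstack([shift(r, r_shift), shift(g, g_shift), b])
```

For example, `colorize(b, g, r, g_shift=(2, 5), r_shift=(3, 12))` would apply one pair of displacements and return an array of shape (height, width, 3).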

The method worked well for the small jpg images, and was able to return results in mere seconds.

My Approach: High Resolution Images

The above method does not work for high-resolution files such as the tif files in the project. With image dimensions of 3000-4000 pixels, the exhaustive search was far too slow and expensive to run. In fact, I waited 30+ minutes for it to run before giving up and restarting my kernel!

To speed up image alignment, I used an image pyramid. An image pyramid is a multi-level image stack in which each level contains the same image at a different scale. I implemented the algorithm recursively: I start with the smallest image and search for the best (x, y) displacement (again scoring with SSD and keeping the displacement with the lowest score). Then I move down a level to the next larger image and use the displacement found previously to narrow the range I have to search, which reduces the time spent while also making the displacement estimate more accurate.

I used a scaling factor of 0.5 (scaling down the images 2x each level) and a 5 level pyramid.
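The recursion can be sketched along these lines. This is an illustration under assumptions, not the original implementation: images are downscaled by averaging 2x2 blocks (the actual resizing routine is not stated), shifts use `np.roll`, and the names `search`, `downscale`, `align_pyramid`, `coarse_radius`, and `refine_radius` are hypothetical, as are the two radius values.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def search(channel, reference, center, radius):
    """Exhaustive SSD search over (dx, dy) within `radius` of `center`."""
    cx, cy = center
    best_score, best = np.inf, center
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(channel, shift=(dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score, best = score, (dx, dy)
    return best

def downscale(image):
    """Halve each dimension by averaging 2x2 blocks (assumed resizer)."""
    h = image.shape[0] // 2 * 2
    w = image.shape[1] // 2 * 2
    img = image[:h, :w].astype(np.float64)
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, reference, levels=5,
                  coarse_radius=15, refine_radius=2):
    """Estimate (dx, dy) coarse-to-fine over `levels` pyramid levels."""
    if levels <= 1:
        # Smallest image: full search around zero displacement.
        return search(channel, reference, (0, 0), coarse_radius)
    dx, dy = align_pyramid(downscale(channel), downscale(reference),
                           levels - 1, coarse_radius, refine_radius)
    # A shift at half resolution corresponds to twice that shift here,
    # so only a small neighbourhood around it needs to be searched.
    return search(channel, reference, (2 * dx, 2 * dy), refine_radius)
```

With 5 levels and a factor of 0.5, the coarsest level is 16x smaller per side, so a 15-pixel search there covers displacements of roughly 240 pixels at full resolution, while each finer level only checks a handful of candidates.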

Problems I ran into

I noticed that a few color channels had thick black borders that other color channels did not. The inconsistency influenced my algorithm and caused it not to return a clean result in some cases. To fix this, I manually cropped the borders off all the images before feeding them into the algorithm.
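The cropping here was done by hand, but a similar effect could be approximated in code by trimming a fixed fraction from every side before scoring. The sketch below is only an assumption about how that might look; `crop_border` and the 10% default are hypothetical and were not used in the project.

```python
import numpy as np

def crop_border(image, fraction=0.1):
    """Trim `fraction` of the height and width from each side."""
    h, w = image.shape[:2]
    dh, dw = int(h * fraction), int(w * fraction)
    return image[dh:h - dh, dw:w - dw]
```

Cropping every channel by the same amount keeps the channels the same size, so displacements found on the cropped images still apply to the originals.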

I was able to align all the images except emir.tif with my algorithm. The failure was caused by the difference in brightness between the channels. Since the brightness in the three channels differs quite a bit, my algorithm, which aligns the channels by comparing raw pixel values, did not produce a proper alignment. There are a variety of methods that could be used to fix this, one of them being an edge detector. More on this later!


Displacements are given by (dx, dy)

r: (12, 3), g: (5, 2)
r: (6, 3), g: (3, 2)
r: (3, 2), g: (-3, 2)
r: (124, 13), g: (59, 16)
r: (176, 37), g: (78, 28)
r: (4, 98), g: (3, 35)
r: (13, 179), g: (10, 82)
r: (36, 108), g: (26, 51)
r: (23, 89), g: (17, 40)
r: (32, 87), g: (5, 42)
r: (11, 111), g: (9, 49)
r: (-12, 104), g: (0, 52)
r: (11, 112), g: (14, 53)
r: (-492, 149), g: (24, 49)


r: (95, -25), g: (49, -6)
r: (113, -67), g: (75, -41)
r: (104, -6), g: (51, 3)

Bells and Whistles

Because of the difference in brightness between the channels of the emir photo, as discussed in the problems section above, I decided to add an edge detection feature to better align the r, g, b channels. I used the Canny edge detection algorithm from the OpenCV library to do this. As you can see below, the results were great!
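The idea of aligning on edges instead of raw intensities can be illustrated with the sketch below. Note that it swaps in a simple gradient-magnitude edge map in place of OpenCV's Canny detector so that it runs with NumPy alone; it is a stand-in for the technique, not the code used for the results here. The names `edge_map` and `align_edges` are hypothetical, and shifts wrap around via `np.roll`.

```python
import numpy as np

def edge_map(image):
    """Gradient magnitude from wrap-around central differences.

    A simple stand-in for a Canny edge map (assumption, not OpenCV).
    """
    img = image.astype(np.float64)
    gx = np.roll(img, -1, axis=1) - np.roll(img, 1, axis=1)
    gy = np.roll(img, -1, axis=0) - np.roll(img, 1, axis=0)
    return np.hypot(gx, gy)

def align_edges(channel, reference, window=15):
    """Exhaustive SSD search on edge maps rather than pixel values."""
    ch_edges, ref_edges = edge_map(channel), edge_map(reference)
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(ch_edges, shift=(dy, dx), axis=(0, 1))
            score = float(np.sum((shifted - ref_edges) ** 2))
            if score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift
```

Because edge strength depends on local changes rather than absolute brightness, two channels whose overall intensities disagree can still produce matching edge maps, which is why this kind of preprocessing helps on an image like emir.tif.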

r: (-492, 149), g: (24, 49)
r: (107, 40), g: (49, 24)