CS 194-26 Image Manipulation and Computational Photography

Project 1: Images of the Russian Empire

Yin Tang, cs194-26-acd



Overview

For this project, we process the black and white images in the Prokudin-Gorskii collection to produce colorized pictures.

Methodologies

Single-Layer Method

Each image is split into three color channels (R, G, B). The single-layer method runs an exhaustive grid search over a 15-pixel-by-15-pixel window to find the optimal alignment coordinates for the green and red channels relative to the blue channel, using SSD (Sum of Squared Differences) as the search heuristic. By default it crops 10 percent from every side, because the edges of the image carry no useful information and can mislead the search. The single-layer method is used for all .jpg files. I also tried it once on emir.tif with a much larger search range: the image looks good, but the search takes a really long time. The results are shown below.

cathedral : g = [2, 5], r = [3, 12]
monastery : g = [2, -3], r = [2, 3]
tobolsk : g = [3, 3], r = [3, 6]
emir_exhaustive : g = [24, 49], r = [55, 103]
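
The exhaustive search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (crop_border, ssd, align_exhaustive) are hypothetical, the window is assumed to mean offsets from -15 to +15 in each direction, shifts are assumed to wrap around via np.roll, and the returned pair is assumed to follow the [x, y] order used in the result lists.

```python
import numpy as np

def crop_border(channel, fraction=0.10):
    """Drop `fraction` of the rows and columns from every side."""
    h, w = channel.shape
    dy, dx = int(h * fraction), int(w * fraction)
    return channel[dy:h - dy, dx:w - dx]

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def align_exhaustive(channel, base, center=(0, 0), radius=15):
    """Return the (x, y) shift of `channel` that minimizes SSD against `base`.

    Assumption: every integer shift within `radius` of `center` is tried,
    and scoring ignores the outer 10 percent of the image on each side.
    """
    base_c = crop_border(base)
    best_score, best_shift = np.inf, center
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            # np.roll wraps pixels around; the crop hides most of that wrap.
            shifted = np.roll(channel, shift=(dy, dx), axis=(0, 1))
            score = ssd(crop_border(shifted), base_c)
            if score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift
```

With this sketch, the green and red channels would each be passed as `channel` with the blue channel as `base`, and the returned shift applied with np.roll before stacking the three channels into a color image.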

Image Pyramid Method

The image pyramid method speeds up the process for large images. After separating the three channels (R, G, B), it repeatedly rescales each channel by a factor of 0.5 until the height of the matrix is less than 100. Starting from the coarsest scale, with an initial guess of x = 0, y = 0 and a search range of 15, it finds the optimal alignment coordinates at that scale. Working down the pyramid, it sets the next initial guess to twice the previous optimal coordinates (because the scale factor between levels is 0.5), halves the search range (but keeps it at least 5), and continues until the finest scale is finished. It also crops 10 percent from every side by default, and all layers use SSD as the search heuristic. The base layer is just the single-layer method. The image pyramid method is used for all .tif files, and all the results are generated in 20 seconds with perfect alignment except for emir.tif. The results for all .tif files except emir.tif are shown below.

harvesters : g = [16, 59], r = [13, 124]
icon : g = [17, 41], r = [23, 89]
lady : g = [9, 51], r = [11, 112]
melons : g = [10, 81], r = [13, 178]
onion_church : g = [26, 51], r = [36, 108]
self_portrait : g = [29, 78], r = [37, 176]
three_generations : g = [14, 53], r = [11, 112]
train : g = [5, 42], r = [32, 87]
village : g = [12, 64], r = [22, 137]
workshop : g = [0, 53], r = [-12, 105]
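
The coarse-to-fine procedure described above might look roughly like the sketch below. It is an illustration under stated assumptions rather than the project's code: all function names are hypothetical, the 0.5 rescale is replaced by simple 2x2 block averaging (a stand-in for whatever rescale routine was actually used), the search range is halved with integer division (15, 7, 5, ...), and shifts are returned in (x, y) order.

```python
import numpy as np

def crop_border(channel, fraction=0.10):
    """Drop `fraction` of the rows and columns from every side."""
    h, w = channel.shape
    dy, dx = int(h * fraction), int(w * fraction)
    return channel[dy:h - dy, dx:w - dx]

def search(channel, base, center, radius):
    """Exhaustive SSD search over shifts within `radius` of `center` (x, y)."""
    base_c = crop_border(base)
    best_score, best_shift = np.inf, center
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            shifted = crop_border(np.roll(channel, (dy, dx), axis=(0, 1)))
            score = float(np.sum((shifted - base_c) ** 2))
            if score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift

def halve(channel):
    """Downscale by 0.5 with 2x2 block averaging (assumed stand-in rescale)."""
    h, w = channel.shape
    h2, w2 = h - h % 2, w - w % 2  # drop an odd trailing row or column
    c = channel[:h2, :w2]
    return c.reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, base, min_height=100, start_radius=15, min_radius=5):
    """Coarse-to-fine alignment; returns the (x, y) shift at full resolution."""
    # Build the pyramid, finest level first, until the height is below min_height.
    levels = [(channel.astype(np.float64), base.astype(np.float64))]
    while levels[-1][0].shape[0] >= min_height:
        c, b = levels[-1]
        levels.append((halve(c), halve(b)))

    shift, radius = (0, 0), start_radius
    for i, (c, b) in enumerate(reversed(levels)):  # coarsest level first
        if i > 0:
            # One level finer: offsets double, search range halves (floor of 5).
            shift = (2 * shift[0], 2 * shift[1])
            radius = max(radius // 2, min_radius)
        shift = search(c, b, shift, radius)
    return shift
```

The point of the design is that a large displacement at full resolution becomes a small one at the coarsest level, so each level only needs a narrow search around the doubled estimate from the level above it.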

Problems with emir.tif

The result for emir.tif looks a bit weird because of the brightness difference among the three color channels; the original result, without fixing the brightness issue, is shown on the left. I then use the mean of the pixel values in each color channel to represent the brightness of that channel. It turns out the green channel has the lowest brightness, so I use the same algorithm but with the green channel as the base channel, finding the optimal alignment coordinates for the blue and red channels. The result turns out to be really good. I think this solves the problem because the channel with the lowest brightness has greater contrast against the other two color channels, so differences between candidate alignment coordinates show up more clearly and the answer is more accurate.

emir (blue as base) : g = [24, 49], r = [-683, 75]
emir (green as base) : b = [-24, -49], r = [17, 57]
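
The base-channel choice described above can be sketched as follows. This is only an illustration: darkest_channel and align_to_darkest are hypothetical names, the channels are assumed to be held in a dict keyed by "r", "g", "b", and `align` stands for any alignment function that takes a channel and a base and returns an offset.

```python
import numpy as np

def darkest_channel(channels):
    """Return the name of the channel with the lowest mean pixel value."""
    return min(channels, key=lambda name: float(np.mean(channels[name])))

def align_to_darkest(channels, align):
    """Align every other channel to the darkest one.

    `align(channel, base)` is assumed to return an (x, y) offset.
    Returns the base channel name and a dict of offsets for the others.
    """
    base_name = darkest_channel(channels)
    base = channels[base_name]
    shifts = {
        name: align(ch, base)
        for name, ch in channels.items()
        if name != base_name
    }
    return base_name, shifts
```

For emir.tif this selection would pick the green channel, which matches the second result line, where offsets are reported for blue and red instead of green and red.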

Additional Pictures

Five other pictures of my own choosing from the Prokudin-Gorskii collection.

boat : g = [38, 42], r = [79, 134]
Milan_Cathedral : g = [12, 56], r = [24, 107]
Pond : g = [5, 53], r = [-29, 115]
Sepulchre : g = [32, 63], r = [51, 141]
Summer_Cathedral : g = [22, -97], r = [39, -54]