CS194-26: Project 1: Images of the Russian Empire

Ryan Koh

By recording three different exposures of each scene onto a glass plate through red, green, and blue filters, the Russian photographer Sergei Mikhailovich Prokudin-Gorskii sought to document the world as he knew it in color! Using the negatives he captured, we can now process them to recreate the images in color!

Part 1: Naive Alignment

The general idea behind recreating the images in color is to stack the three color channel images on top of each other, after correctly aligning them, to form a single RGB color image. The main difficulty of the project is therefore figuring out how to align the channels. To do this, I used the blue channel as the reference and aligned the other two channels to it. Using the L2 norm as the metric and two nested for loops, I shifted the red and green channels over a displacement range of [-15, 15] pixels in each direction and kept the shift that minimized the norm of the difference between each channel and the blue one. Originally, I found the alignment results unsatisfactory, and I tackled the problem by cropping the images to remove the borders before comparing the channels. This reduced the noise in the comparison and recreated the following small color images with relative precision:

cathedral.jpg: Green Offset = (5, 2), Red Offset = (12, 3)
cathedral.jpg
tobolsk.jpg: Green Offset = (3, 3), Red Offset = (6, 3)
tobolsk.jpg
monastery.jpg: Green Offset = (-3, 2), Red Offset = (3, 2)
monastery.jpg
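
The search described in Part 1 can be sketched roughly as follows. This is a minimal illustration, not the original project code: the function names, the 10% crop fraction, the use of np.roll for shifting, and the (row, column) order of the offsets are all assumptions.

```python
import numpy as np

def crop_interior(channel, frac=0.1):
    """Drop a border of `frac` of each dimension (assumed fraction) before scoring."""
    h, w = channel.shape
    dh, dw = int(h * frac), int(w * frac)
    return channel[dh:h - dh, dw:w - dw]

def align_naive(channel, reference, radius=15, frac=0.1):
    """Return the (row, col) shift of `channel` that minimizes the L2 distance to `reference`."""
    ref = crop_interior(reference, frac)
    best_shift, best_score = (0, 0), np.inf
    # Two nested loops over every displacement in [-radius, radius].
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((crop_interior(shifted, frac) - ref) ** 2)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift

def stack_rgb(red, green, blue, red_shift, green_shift):
    """Shift red and green by their offsets and stack them with blue into an RGB image."""
    r = np.roll(red, red_shift, axis=(0, 1))
    g = np.roll(green, green_shift, axis=(0, 1))
    return np.dstack([r, g, blue])
```

With blue as the reference, align_naive(green, blue) and align_naive(red, blue) would give the two offsets that stack_rgb then applies.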

Part 2: More Efficient Alignment: Image Pyramid

While this naive algorithm worked well for small images, on larger images the overhead of shifting and comparing full-resolution channels made the process incredibly slow. To combat this, I created an image pyramid. I first changed the naive algorithm to return the calculated offset instead of the shifted image. Then I took the original color channels and downsized them by a factor of 2 on each recursive call. At the base case of a 400 x 400 pixel image, I ran the naive algorithm on the coarse image to get an offset. This offset was then propagated back up through the recursive calls, multiplied by 2 at each level to match the larger image. Within every call, I used the scaled offset from the coarser level to narrow the search, looking only within some chosen radius around it. This made the calculation much faster and allowed all the remaining images except emir.tif to be colored efficiently and properly.

workshop.tif: Green Offset = (52, 0), Red Offset = (105, -11)
workshop.tif
three_generations.tif: Green Offset = (50, 14), Red Offset = (106, 25)
three_generations.tif
melons.tif: Green Offset = (69, 0), Red Offset = (167, 3)
melons.tif
onion_church.tif: Green Offset = (50, 26), Red Offset = (108, 37)
onion_church.tif
train.tif: Green Offset = (42, 6), Red Offset = (85, 32)
train.tif
lady.tif: Green Offset = (7, 0), Red Offset = (67, 1)
lady.tif
village.tif: Green Offset = (64, 13), Red Offset = (137, 23)
village.tif
self_portrait.tif: Green Offset = (77, 29), Red Offset = (176, 36)
self_portrait.tif
harvesters.tif: Green Offset = (59, 18), Red Offset = (123, 14)
harvesters.tif
icon.tif: Green Offset = (41, 18), Red Offset = (89, 23)
icon.tif
emir.tif: Green Offset = (48, 23), Red Offset = (9, -15)
emir.tif
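
The coarse-to-fine recursion from Part 2 might look something like the sketch below. It is an illustration under stated assumptions rather than the original code: the helper names, the refinement radius of 2, and downsampling by plain slicing (channel[::2, ::2]) are my own choices. A real implementation would more likely smooth the image before downsizing, for example with a rescale routine from an image library.

```python
import numpy as np

def l2_score(a, b, frac=0.1):
    """Sum of squared differences over the interior, ignoring an assumed 10% border."""
    h, w = a.shape
    dh, dw = int(h * frac), int(w * frac)
    diff = a[dh:h - dh, dw:w - dw] - b[dh:h - dh, dw:w - dw]
    return float(np.sum(diff ** 2))

def search(channel, reference, center, radius):
    """Exhaustively test shifts within `radius` of `center`; return the best (row, col)."""
    cy, cx = center
    best_shift, best_score = (cy, cx), np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            score = l2_score(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift

def align_pyramid(channel, reference, base_size=400, radius=15, refine=2):
    """Recursively halve the images, align the coarsest level, then refine on the way up."""
    if max(channel.shape) <= base_size:
        # Base case: the image is small enough for the full naive search.
        return search(channel, reference, (0, 0), radius)
    # Recurse on half-resolution copies (simple decimation, no smoothing).
    cy, cx = align_pyramid(channel[::2, ::2], reference[::2, ::2],
                           base_size, radius, refine)
    # One coarse pixel is two pixels here, so double the offset and search near it.
    return search(channel, reference, (2 * cy, 2 * cx), refine)
```

Each level above the base only tests (2 * refine + 1)^2 shifts, which is where the speedup over a single full-resolution search comes from.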

Part 3: Aligning Emir through Attempts at Bells and Whistles (Extra Credit)

With emir.tif not aligning very well since its color channels were of different brightness values, I wanted to figure out a different metric to align the channels. The first thing I tried was to simply align the red channel to the adjusted green channel instead of to the reference blue channel. This by itself actually gave significantly better results, although the image still wasn't too sharp:

emir.tif (original approach): Green Offset = (48, 23), Red Offset = (9, -15)
emir.tif
emir.tif (adjusted reference): Green Offset = (48, 23), Red Offset = (106, 40)
emir.tif

However, after rerunning this new algorithm on the rest of the images, I noticed that although the quality of most pictures was unchanged, lady.tif was aligned significantly worse:

lady.tif (original approach): Green Offset = (7, 0), Red Offset = (67, 1)
lady.tif
lady.tif (adjusted reference): Green Offset = (7, 0), Red Offset = (23, 0)
lady.tif

Based on this, I decided to go back to the drawing board and try again. After looking through the different options for bells and whistles on the project spec, and then researching a bit online, I came across the idea of using Canny Edge Detection to align the channels. My new algorithm takes the color channels, crops the edges as usual, applies Canny Edge Detection to get binary (black and white) edge images, and then uses the same L2 norm metric to align those edge images. To do this easily, I imported the feature module from skimage. With this approach, the images that became significantly better aligned were emir.tif, lady.tif, and melons.tif. The rest had either the same result or only a slightly different one:

emir.tif (original approach): Green Offset = (48, 23), Red Offset = (9, -15)
emir.tif
emir.tif (Canny): Green Offset = (48, 23), Red Offset = (106, 41)
emir.tif
lady.tif (original approach): Green Offset = (7, 0), Red Offset = (67, 1)
lady.tif
lady.tif (Canny): Green Offset = (62, 0), Red Offset = (116, -4)
lady.tif
melons.tif (original approach): Green Offset = (69, 0), Red Offset = (167, 3)
melons.tif
melons.tif (Canny): Green Offset = (79, 10), Red Offset = (180, 13)
melons.tif
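
The edge-based alignment described in Part 3 could be sketched like this. It uses feature.canny from skimage, as mentioned above, but everything else is an assumption for illustration: the function names, the crop fraction, the sigma value, and the single-level search (the pyramid is left out to keep the example short).

```python
import numpy as np
from skimage import feature

def edge_map(channel, sigma=1.0, frac=0.1):
    """Crop an assumed 10% border, then return the Canny edges as a 0/1 float image."""
    h, w = channel.shape
    dh, dw = int(h * frac), int(w * frac)
    edges = feature.canny(channel[dh:h - dh, dw:w - dw], sigma=sigma)
    return edges.astype(np.float64)

def align_edges(channel, reference, radius=15, sigma=1.0):
    """Return the (row, col) shift that best matches the two edge maps under the L2 norm."""
    ref_edges = edge_map(reference, sigma)
    mov_edges = edge_map(channel, sigma)
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(mov_edges, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - ref_edges) ** 2)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift
```

Because the comparison is made on binary edge images, a channel that is uniformly darker or brighter than the reference can still match, which is the property that matters for an image like emir.tif.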

Part 4: Additional Pictures from the Prokudin-Gorskii Collection

Below are the results of my algorithm on several pictures I chose from the Prokudin-Gorskii Collection. bamboo.tif had some trouble aligning, possibly because the predominance of green made the L2 norm metric less reliable:

old_cross.tif (Canny): Green Offset = (19, 9), Red Offset = (47, 3)
old_cross.tif
bamboo.tif (Canny): Green Offset = (55, -28), Red Offset = (122, -62)
bamboo.tif
prisoners.tif (Canny): Green Offset = (41, 12), Red Offset = (84, 21)
prisoners.tif