CS 194-26 Image Manipulation and Computational Photography, Fall 2018

Images of the Russian Empire: Colorizing the Prokudin-Gorskii photo collection

The thought process.

The initial thought.

The project description stated that the naive approach is to search a [-15, 15] pixel window of displacements, shifting one channel while keeping another stationary. For each shift, you compute the SSD (sum of squared differences) between the shifted channel and the stationary one, and the shift with the smallest SSD is the one to use. I decided to use the blue channel as the stationary one and move the red and green channels. However, I made a silly mistake. Instead of searching over all possible (vertical, horizontal) displacement pairs, I took the minimum along each direction separately: the best vertical shift on its own and the best horizontal shift on its own. What made this hard to notice was that this approach worked for three out of the four images I tested.
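For reference, here is a minimal sketch of the full search described by the spec, with one loop per direction so that every (vertical, horizontal) pair is tried. It assumes NumPy and wrap-around shifting with np.roll; the names ssd and align_exhaustive are made up for illustration and this is not the actual project code.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def align_exhaustive(moving, fixed, radius=15):
    """Try every (dy, dx) in [-radius, radius]^2 and return the best shift."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):        # outer loop: vertical shift
        for dx in range(-radius, radius + 1):    # inner loop: horizontal shift
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = ssd(shifted, fixed)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift
```

Searching each direction on its own only scores the shifts along two lines of this grid (2 x 31 candidates instead of 31 x 31), so it can land on a pair that the joint search would reject.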

Trying to make something that doesn't work . . . work.

My next thought was that I should just find a better way of analyzing the images. Rather than using the raw pixel values, I decided to use edge detection based on Gaussian blurring. From the spec, I knew we were not allowed to use any high-level outside functions, but we were allowed to use resize. I knew that when an image is downscaled and then scaled back up to its original size, the result is a blurred version of that image, much like a Gaussian blur. Using this fact, I subtracted the downscaled-and-rescaled version of each image from the original to get an edge mask of that image.
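The idea can be sketched roughly as follows. This is an illustration under assumptions, not the project code: block averaging and pixel repetition stand in for the resize function (so the blur is a box blur rather than a true Gaussian), and downscale, upscale, and edge_mask are hypothetical names.

```python
import numpy as np

def downscale(img, factor):
    """Shrink by averaging non-overlapping factor x factor blocks."""
    h, w = img.shape
    h2, w2 = h - h % factor, w - w % factor     # crop so blocks tile evenly
    blocks = img[:h2, :w2].reshape(h2 // factor, factor, w2 // factor, factor)
    return blocks.mean(axis=(1, 3))

def upscale(img, factor):
    """Grow back by repeating each pixel factor times along both axes."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def edge_mask(img, factor=4):
    """Original minus its blurred (down-then-up) copy: a rough high-pass."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    h2, w2 = h - h % factor, w - w % factor
    blurred = upscale(downscale(img, factor), factor)
    return img[:h2, :w2] - blurred
```

Flat regions come out near zero and only places where the brightness changes survive, so the SSD ends up comparing structure instead of raw channel intensity.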

This happened to work on all of the images. However, I was still missing the fact that I was using the wrong displacement algorithm.

The realization

Once I moved on to using an image pyramid to calculate the shifts for the larger images, I knew instantly something was wrong. After trying to perfect my image pyramid algorithm, I realized that there was something fundamentally wrong with what I was doing. I looked far and wide and saw that other people described using 2 loops. I only had 1 loop. Welp, time to start all over.

Restarting

I went back to my naïve approach and worked on fixing it. However, something was still wrong: I was getting the wrong numbers. I thought it could be the edge detection, so I scrapped that. Then I realized something blood-boiling: the roll and resize functions I had been using changed the range of the image from 0-1 to 0-255, so my math was inherently wrong. I fixed this and, what do you know, everything worked beautifully.
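One way to guard against this kind of range mismatch is to normalize every image to the same scale right before comparing. The helper below is a hypothetical sketch, assuming NumPy arrays; the rule that a float image with values above 1 is on a 0-255 scale is my own assumption and would need checking against whatever functions are actually in use.

```python
import numpy as np

def to_unit_float(img):
    """Return img as float64 in [0, 1], whatever range it arrived in."""
    arr = np.asarray(img)
    if np.issubdtype(arr.dtype, np.integer):
        # Integer images: divide by the largest value the dtype can hold.
        return arr.astype(np.float64) / np.iinfo(arr.dtype).max
    arr = arr.astype(np.float64)
    # Assumption: a float image with values above 1 is on a 0-255 scale.
    return arr / 255.0 if arr.max() > 1.0 else arr
```

Calling something like this on both channels before every SSD keeps the two inputs comparable even if an intermediate step silently changes the range.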

The emir problem

All the sample images I tried worked until I tried the emir image. The simple SSD metric just didn’t work. I heard other people had used a different color channel as their base, so I tried that but got worse results. Therefore, I decided to go back to edge detection.

Adding back the masks

Again something was wrong. I looked at my code for hours trying to figure it out and realized that I had forgotten to change my base color channel back to blue. I changed it back and, presto, it worked beautifully.

In Summary

I was able to get all the images to align. I used SSD with a recursive image pyramid. For images that did not align with this, I used edge detection: downscaling and then upscaling the image to create a Gaussian-like blur, which I subtracted from the original image.
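A recursive image pyramid of this kind can be sketched like so. This is a rough illustration under assumptions, not the project code: it uses NumPy only, averages 2x2 blocks in place of a resize call, shifts with wrap-around via np.roll, and the names and default parameters (min_size, coarse_radius, fine_radius) are all hypothetical.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two float arrays."""
    return float(np.sum((a - b) ** 2))

def search(moving, fixed, center, radius):
    """Exhaustive two-loop search around `center`; returns the best (dy, dx)."""
    best, best_score = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = ssd(np.roll(moving, (dy, dx), axis=(0, 1)), fixed)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best

def halve(img):
    """Downscale by 2 by averaging 2x2 blocks (stand-in for a resize call)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(moving, fixed, min_size=32, coarse_radius=8, fine_radius=2):
    """Align at half resolution first, then refine the doubled estimate."""
    moving = np.asarray(moving, dtype=np.float64)
    fixed = np.asarray(fixed, dtype=np.float64)
    if min(fixed.shape) <= min_size:
        # Base case: the image is small enough for a wide exhaustive search.
        return search(moving, fixed, (0, 0), coarse_radius)
    dy, dx = align_pyramid(halve(moving), halve(fixed),
                           min_size, coarse_radius, fine_radius)
    # One pixel at the coarser level is two pixels here, so double and refine.
    return search(moving, fixed, (2 * dy, 2 * dx), fine_radius)
```

The wide search only ever runs on the smallest image, and each finer level checks just a small window around the doubled estimate, which is what makes the large scans tractable.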

A weird bug I have is that I think the images might be misaligned by about a single pixel compared to others’ results, but, as the trial and error above shows, I don’t have the time to look for it at the moment. However, the images are still pretty much perfectly aligned.

Results

Offsets in the form: [y displacement, x displacement]
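To make the convention concrete, here is a small sketch of how a pair of offsets could be applied to build the color image. It assumes NumPy, wrap-around shifting with np.roll, and channels stacked in R, G, B order; compose_rgb is a hypothetical name, not a function from the project.

```python
import numpy as np

def compose_rgb(r, g, b, g_shift, r_shift):
    """Shift G and R onto the stationary B channel and stack as H x W x 3."""
    # Each shift is (y displacement, x displacement), matching the list below.
    g_aligned = np.roll(g, tuple(g_shift), axis=(0, 1))
    r_aligned = np.roll(r, tuple(r_shift), axis=(0, 1))
    return np.dstack([r_aligned, g_aligned, b])
```

For example, the Cathedral offsets would be passed as g_shift=(5, 2) and r_shift=(12, 3).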

Emir without edge detection, G->B: [49 24], R->B: [66 44]

Emir with change of base channel, G->B: does it even matter, R->B: it just messed up

Emir mask, G->B: n/a, R->B: n/a

Emir with edge detection, G->B: [49 24], R->B: [106 40]

Lady, G->B: [55 9], R->B: [116 12]

Icon, G->B: [41 17], R->B: [89 23]

Harvesters, G->B: [59 17], R->B: [124 13]

Train, G->B: [43 6], R->B: [87 32]

Self Portrait, G->B: [78 29], R->B: [176 37]

Turkmen, G->B: [56 21], R->B: [116 28]

Three generations, G->B: [52 14], R->B: [111 11]

Village, G->B: [65 12], R->B: [138 22]

Settlers, G->B: [7 0], R->B: [15 -1]

Monastery, G->B: [-3 2], R->B: [3 2]

Nativity, G->B: [3 1], R->B: [8 0]

Cathedral, G->B: [5 2], R->B: [12 3]

Vidnar, G->B: [18 26], R->B: [105 51]

Krestʹi︠a︡nkamnetlen, G->B: [66 15], R->B: [137 17]

Molodoĭbashkir, G->B: [38 -15], R->B: [90 -35]

This page uses bettermotherfuckingwebsite as a reference