CS194-26 Project 1

Michael Weymouth (cs194-26-adc)

Overview

In this project, I was tasked with aligning the 3 color channels (R, G, and B) of several images from the Prokudin-Gorskii collection, then reassembling the channels into the proper color representation of the original scene.

This process was completed in the following way:

1. The image file is first split into three equal-height segments, stacked top to bottom, representing the three color channels B, G, and R, respectively.

2. The R and G channels are each aligned to the B channel using the procedure in steps 3 through 9.

3. First, the algorithm demeans both channels by subtracting each channel's average pixel value from every pixel in that channel. Then, it takes the absolute value of every pixel in both channels. Instead of relying on raw pixel values, the algorithm now aligns on each pixel's “difference” from the mean value of its channel. Since the same object can have very different values in different color channels, this gives the algorithm a more object-based representation for alignment, which proved to be a better metric overall.

4. It then passes both channels to a recursive pyramid search function, which downsamples them by a factor of 2 on each call until it hits the base case: the channel to align is less than 400 pixels wide. It then runs a displacement search over a range of [-20, 20] pixels, a parameter I determined experimentally.

5. This displacement search translates the channel being aligned in both X and Y directions by a varying amount in the determined range, feeds the two channels being compared into a ranking metric, then returns the translation with the minimum value of the metric.

6. The best metric I found in my experimentation works as follows: first, it crops out the outer 80% of the image to get rid of the borders, which negatively impacted the alignment metric. Then, it flattens both channels to 1-dimensional arrays and normalizes them, padding with zeros if there is a dimension mismatch. Finally, the metric returns the negative of the dot product of the two normalized vectors.

7. The optimal parameters found are then passed up the stack and rescaled to represent the shift necessary for the higher-resolution image, and the image is displaced by that translation.

8. A displacement search is then run over the range [-1, 1], since the optimal shift must now lie within that range of the already-shifted image. The results of this displacement search, added to the previously calculated translation, are then passed up the stack to the next recursive call.

9. This recursive backtracking then continues until the algorithm reaches the original image size, and it has found the optimal offsets. The image is then translated one final time and returned, along with the displacements used, to the alignment function which called the pyramid search function.

10. After both the R and G channels have been aligned to B, they are stacked with the B channel to form the final color image.

11. I also implemented the automatic contrasting functionality, which (if enabled) takes the minimum and maximum of the image and expands the range of all of the channels in the image such that the minimum pixel is at -0.1 and the maximum pixel is at a value of 1.0.

12. The image is then saved to an output file, and the offsets used, along with the time it took to align the channels, are returned.
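The splitting, displacement search, and pyramid recursion in the steps above could be sketched roughly as follows. This is an illustrative sketch, not the code that produced the results in this report. All function names are hypothetical, `np.roll` (a circular shift) stands in for whatever translation was actually used, plain 2x subsampling stands in for the downsampling step, `neg_dot` is a simplified stand-in for the metric in step 6, and shifts are assumed to be ordered as (rows, columns).

```python
import numpy as np

def split_channels(plate):
    """Split a scanned plate into three equal-height channels: B, G, R (top to bottom)."""
    h = plate.shape[0] // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def neg_dot(a, b):
    """Stand-in metric: negative dot product of the two unit-normalized images."""
    a, b = a.ravel(), b.ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return -float(a @ b) / denom if denom > 0 else 0.0

def displacement_search(moving, fixed, radius, metric=neg_dot):
    """Try every (dy, dx) in [-radius, radius]^2; return the shift with the lowest metric."""
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))  # circular shift (assumption)
            score = metric(shifted, fixed)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def pyramid_search(moving, fixed, base_width=400, base_radius=20, metric=neg_dot):
    """Return the (dy, dx) shift that aligns `moving` to `fixed`, coarse to fine."""
    if moving.shape[1] < base_width:
        # Base case: exhaustive search at the coarsest level.
        return displacement_search(moving, fixed, base_radius, metric)
    # Recurse on half-resolution copies (plain subsampling; an assumption).
    dy, dx = pyramid_search(moving[::2, ::2], fixed[::2, ::2],
                            base_width, base_radius, metric)
    dy, dx = 2 * dy, 2 * dx  # rescale the coarse shift to this resolution
    shifted = np.roll(moving, (dy, dx), axis=(0, 1))
    # Refine by at most one pixel around the rescaled estimate.
    ry, rx = displacement_search(shifted, fixed, 1, metric)
    return dy + ry, dx + rx

def colorize(r, g, b):
    """Stack aligned channels into an H x W x 3 RGB image."""
    return np.dstack([r, g, b])
```

With these helpers, aligning one plate would amount to calling `pyramid_search(g, b)` and `pyramid_search(r, b)`, applying the returned shifts with `np.roll`, and passing the results to `colorize`. Because each level only refines by one pixel, the total work is dominated by the single [-20, 20] search at the coarsest level.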

When I was first implementing pyramid search, I ran into quite a bit of trouble with a number of images, which was quickly corrected by adding the cropping step.
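To make the cropping step concrete, here is a minimal sketch of a cropped, normalized dot-product metric in the spirit of step 6. It assumes the crop keeps the central 80% of each dimension, which is only one reading of the description. The names `crop_center`, `unit`, and `ranking_metric` are hypothetical.

```python
import numpy as np

def crop_center(img, keep=0.8):
    """Keep the central `keep` fraction of each dimension (assumed crop behavior)."""
    h, w = img.shape
    my = int(round(h * (1 - keep) / 2))
    mx = int(round(w * (1 - keep) / 2))
    return img[my:h - my, mx:w - mx]

def unit(v):
    """Scale a vector to unit length; leave an all-zero vector unchanged."""
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def ranking_metric(a, b, keep=0.8):
    """Negative dot product of the cropped, flattened, normalized channels."""
    a = crop_center(a, keep).ravel().astype(float)
    b = crop_center(b, keep).ravel().astype(float)
    # Zero-pad the shorter vector if the two sizes differ.
    n = max(a.size, b.size)
    a = np.pad(a, (0, n - a.size))
    b = np.pad(b, (0, n - b.size))
    return -float(unit(a) @ unit(b))
```

Lower values mean better agreement: two identical channels score close to -1, and changes confined to the cropped-away border leave the score untouched, which is why the cropping helps on plates with dark or damaged edges.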

I also ran into some trouble aligning images with intricate patterns of differing colors, such as those found on the robe in emir.tif. This is when I added the demean and absolute value step, which corrected for this by reducing the reliance of the ranking metric on the actual pixel values of the channel.
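A small example may help show why the demean and absolute value step matters. The sketch below uses a hypothetical helper `demean_abs` and a toy 3x3 patch in which an object is bright in one channel and dark in the other. The values are made up for illustration and are not taken from emir.tif.

```python
import numpy as np

def demean_abs(channel):
    """Subtract the channel mean, then take the absolute value of every pixel."""
    channel = channel.astype(float)
    return np.abs(channel - channel.mean())

# Toy patch: the center pixel is bright in "blue" and dark in "red".
blue = np.array([[0.1, 0.1, 0.1],
                 [0.1, 0.9, 0.1],
                 [0.1, 0.1, 0.1]])
red = 1.0 - blue

# The raw values disagree, but the transformed channels coincide.
same_after = np.allclose(demean_abs(blue), demean_abs(red))
```

In this idealized case the raw channels are exact inverses of each other, yet after the transform both mark the same pixel as far from the mean, so a metric comparing them sees the same object in both channels.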

Finally, there were a number of artifacts in some images, which I originally interpreted to be from my alignment procedure. However, upon closer inspection, it would seem that these issues were caused by imperfections in the source material. Perhaps the best example of this is the colorized version of the three_generations.tif image, in which the coat of the man on the left has a haloing color effect on the bottom left. These imperfections are a subject for further study, and would likely be best corrected in a manual restoration.

Example Results

emir_colorized.png

G channel displacement: [49, 23]

R channel displacement: [106, 41]

Alignment time: 9.352622747421265 sec

monastery_colorized.png

G channel displacement: [-3, 2]

R channel displacement: [3, 2]

Alignment time: 5.712937831878662 sec

three_generations_colorized.png

G channel displacement: [54, 15]

R channel displacement: [113, 11]

Alignment time: 9.309688091278076 sec

settlers_colorized.png

G channel displacement: [7, 0]

R channel displacement: [14, -1]

Alignment time: 6.227230072021484 sec

train_colorized.png

G channel displacement: [42, 5]

R channel displacement: [86, 31]

Alignment time: 7.970168828964233 sec

icon_colorized.png

G channel displacement: [40, 17]

R channel displacement: [89, 23]

Alignment time: 9.173699855804443 sec

nativity_colorized.png

G channel displacement: [3, 1]

R channel displacement: [7, 0]

Alignment time: 5.850849151611328 sec

cathedral_colorized.png

G channel displacement: [5, 2]

R channel displacement: [12, 3]

Alignment time: 4.322472095489502 sec

village_colorized.png

G channel displacement: [65, 12]

R channel displacement: [138, 22]

Alignment time: 9.257261753082275 sec

self_portrait_colorized.png

G channel displacement: [79, 30]

R channel displacement: [177, 38]

Alignment time: 8.883623838424683 sec

harvesters_colorized.png

G channel displacement: [60, 17]

R channel displacement: [125, 14]

Alignment time: 8.090704202651978 sec

lady_colorized.png

G channel displacement: [54, 8]

R channel displacement: [118, 12]

Alignment time: 8.588798999786377 sec

turkmen_colorized.png

G channel displacement: [55, 21]

R channel displacement: [115, 29]

Alignment time: 7.547335863113403 sec

Additional Results

00033u_colorized.png

G channel displacement: [53, 10]

R channel displacement: [103, 15]

Alignment time: 8.395071029663086 sec

00859u_colorized.png

G channel displacement: [56, 14]

R channel displacement: [125, 23]

Alignment time: 8.83419919013977 sec

00245u_colorized.png

G channel displacement: [28, -7]

R channel displacement: [108, -19]

Alignment time: 9.065867900848389 sec

00344u_colorized.png

G channel displacement: [34, 0]

R channel displacement: [119, -2]

Alignment time: 8.46563982963562 sec

00016u_colorized.png

G channel displacement: [45, 12]

R channel displacement: [96, 15]

Alignment time: 8.455512046813965 sec

01251u_colorized.png

G channel displacement: [51, 27]

R channel displacement: [108, 36]

Alignment time: 8.783711910247803 sec

Bells & Whistles

I implemented the automatic contrasting feature, which automatically expands the contrast range of the image. All of the above images were produced with automatic contrasting enabled, so below I present a few before-and-after images.

Emir Before

Emir After

Icon Before

Icon After

Village Before

Village After
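The contrast expansion described in this section could be sketched as a single linear rescale. The function name `auto_contrast` and its `low` and `high` parameters are hypothetical. The defaults follow the target range stated in the overview (-0.1 to 1.0), and the minimum and maximum are taken over the whole image so that all channels are stretched by the same amount.

```python
import numpy as np

def auto_contrast(img, low=-0.1, high=1.0):
    """Linearly rescale so the darkest pixel maps to `low` and the brightest to `high`."""
    img = img.astype(float)
    lo, hi = img.min(), img.max()  # global extremes across all channels
    if hi == lo:
        return img  # flat image: nothing to stretch
    return low + (img - lo) * (high - low) / (hi - lo)
```

Since values below 0 cannot be displayed, a caller would presumably clip the result before saving, for example with `np.clip(auto_contrast(img), 0.0, 1.0)`. That clipping step is an assumption and is not described in the text above.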