CS 194 Project

Aditya Yadav


Overview:

In 1907, Sergei Mikhailovich Prokudin-Gorskii set out to travel the Russian Empire and take color photographs of everything he saw, with special permission from the Tsar. He photographed nearly everything, from bridges to landscapes to people, thousands of pictures in total. His idea for capturing color was to record three exposures of each scene onto glass plates, one each through a red, a green, and a blue filter. Although there was no way to print color photographs until much later, he believed that people would eventually be able to use these glass plates to recreate the color images. In 1948 the Library of Congress purchased the plates and has since made them available online.

Our goal for this project was to take the glass plate images and use techniques in the realm of image processing to produce color images. This would be done by extracting the three color channel images and aligning them on top of each other to form one RGB image.
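
As a rough illustration of the extraction and stacking steps, here is a minimal sketch. It assumes, which this write-up does not state, that each scan stacks the blue, green, and red exposures top to bottom in equal thirds. The function names are hypothetical.

```python
import numpy as np

def split_plate(plate):
    """Split a stacked glass-plate scan into three channel images.

    Assumption: the exposures are stacked blue, green, red from top to
    bottom, each taking one third of the plate height.
    """
    height = plate.shape[0] // 3
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return b, g, r

def stack_rgb(r, g, b):
    """Stack three equally sized channels into one H x W x 3 RGB image."""
    return np.dstack([r, g, b])
```

The channels returned by `split_plate` still need to be aligned before `stack_rgb` produces a usable color image.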

A lot of this project also revolves around making the alignment described above fast and efficient, because some of the image files are very large.

My Approach:

Image Pyramid:


I used an image pyramid to find the alignment. First I cropped each image by 10% on all sides. I then worked out how many times the image had to be scaled down by 2x for its height and width to be at most 175 pixels, scaled it down that many times, and ran the alignment algorithm (explained below) to find the offset that best aligns the images at that scale. Next I scaled the original image down one fewer time than in the previous iteration and searched around the displacement found in that previous iteration. If I found a better alignment, I kept that displacement, and I continued down the pyramid in the same way. An important note: as you work through the pyramid, each time the image gets 2x bigger the offsets found so far must also be scaled by 2x so that they still describe the same displacement.
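
The coarse-to-fine loop can be sketched roughly as follows. This is an illustration under assumptions, not my exact code: all function names are hypothetical, the two search steps are passed in as callables, and the downscaling uses plain striding as a stand-in for a proper rescale.

```python
import numpy as np

def crop_fraction(img, frac=0.10):
    """Remove `frac` of the height/width from every side."""
    h, w = img.shape[:2]
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def num_halvings(height, width, max_side=175):
    """Number of 2x downscales needed so both sides are <= max_side."""
    levels = 0
    while max(height, width) / 2 ** levels > max_side:
        levels += 1
    return levels

def pyramid_align(ref, img, coarse_search, refine_search, max_side=175):
    """Return the (x, y) offset that aligns `img` onto `ref`.

    coarse_search(ref, img) -> (x, y) searches a full window.
    refine_search(ref, img, guess) -> (x, y) searches near `guess`.
    """
    ref, img = crop_fraction(ref), crop_fraction(img)
    levels = num_halvings(ref.shape[0], ref.shape[1], max_side)
    step = 2 ** levels
    # Smallest scale: full search. Striding is a crude 2**levels downscale.
    offset = coarse_search(ref[::step, ::step], img[::step, ::step])
    for level in range(levels - 1, -1, -1):
        step = 2 ** level
        # The image is now 2x bigger, so the offset must double as well.
        offset = (2 * offset[0], 2 * offset[1])
        offset = refine_search(ref[::step, ::step], img[::step, ::step], offset)
    return offset
```

Passing the search steps in as callables keeps the pyramid logic separate from the scoring, so the same loop works with a different metric.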

Alignment Algorithm:


What happens when aligning two images depends slightly on which iteration of the image pyramid you are on. On the first iteration, where the image is scaled down as small as it will be, the alignment algorithm simply offsets one of the images, X, by -15 to 15 pixels in each direction, checking every combination of those shifts. Each offset version of image X is compared to image Y, and the sum of squared differences (SSD) between them is computed. An important note about the SSD: I made sure not to compute it over the parts near the edges of the image that could contain pixels wrapped around from the opposite side by the offsetting. For this I simply ignore 15 pixels in from each side. I then return the offset with the best (minimum) SSD.
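
A minimal sketch of this first-iteration search, assuming the shift is done with `np.roll` and that offsets are reported as (X, Y). The function names are hypothetical.

```python
import numpy as np

def ssd_interior(a, b, border=15):
    """SSD over the interior only, skipping `border` pixels on every side."""
    h, w = a.shape
    diff = a[border:h - border, border:w - border] - b[border:h - border, border:w - border]
    return float(np.sum(diff ** 2))

def coarse_align(ref, img, radius=15, border=15):
    """Try every (x, y) shift in [-radius, radius] and keep the lowest SSD."""
    ref = ref.astype(np.float64)  # avoid integer overflow in the subtraction
    img = img.astype(np.float64)
    best_offset, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around; the ignored border hides them.
            shifted = np.roll(img, (dy, dx), axis=(0, 1))
            score = ssd_interior(ref, shifted, border)
            if score < best_score:
                best_offset, best_score = (dx, dy), score
    return best_offset
```

Because the ignored border is as wide as the largest shift, wrapped pixels never enter the score at this level.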

On any iteration past the first, the main difference is that the alignment algorithm is given the offsets to search around and only looks near them: it tries displacements up to 3 pixels in each direction from the given offsets, computes the SSD for each, and picks the best one. That displacement is then returned for the next iteration.
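
The refinement step might look like the sketch below. The function name is hypothetical, and the default ignored border (the largest shift that can be tried) is my assumption for this sketch, since the text above only fixes the 15-pixel border for the first iteration.

```python
import numpy as np

def refine_align(ref, img, guess, radius=3, border=None):
    """Search (x, y) shifts within `radius` of `guess`; return the best one."""
    gx, gy = guess
    if border is None:
        # Assumption: ignore a border as wide as the largest shift tried.
        border = max(abs(gx), abs(gy)) + radius
    ref = ref.astype(np.float64)
    img = img.astype(np.float64)
    h, w = ref.shape
    ref_core = ref[border:h - border, border:w - border]
    best_offset, best_score = (gx, gy), np.inf
    for dy in range(gy - radius, gy + radius + 1):
        for dx in range(gx - radius, gx + radius + 1):
            shifted = np.roll(img, (dy, dx), axis=(0, 1))
            diff = ref_core - shifted[border:h - border, border:w - border]
            score = float(np.sum(diff ** 2))
            if score < best_score:
                best_offset, best_score = (dx, dy), score
    return best_offset
```

With a radius of 3, each level scores only 49 shifts instead of the 961 tried by the full search at the smallest scale.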

Issues Encountered:


While working out the algorithm, I had to tweak parameters constantly. One key problem was that, to make the image pyramid faster, I had to figure out how small the images could be made while there was still a point in aligning them. Also, when calculating the SSD, I had to ignore a window of pixels around the border of the images, as wide as the offset range, to avoid the wrap-around overlap. This helped a lot.

Cropping is explained in the Bells and Whistles section.

Results of Aligning Given Images:



Offsets are represented as (X, Y).
X represents the shift to the right. Negative value means a shift to the left.
Y represents a shift down. Negative value means a shift up.
Offsets are in units of pixels.

The r and g channels were both shifted to align onto b.
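
To make the (X, Y) convention concrete, here is a small sketch of applying an offset to one channel. It assumes the shift is a roll, so pixels pushed off one side wrap around to the other. The function name is hypothetical.

```python
import numpy as np

def apply_offset(channel, offset):
    """Shift `channel` right by X and down by Y pixels, wrapping at the edges."""
    x, y = offset
    # Axis 0 is rows (down), axis 1 is columns (right).
    return np.roll(channel, (y, x), axis=(0, 1))
```

For example, `apply_offset(g, (2, 5))` moves the g channel 2 pixels to the right and 5 pixels down.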

Image                   g offsets    r offsets    Time
cathedral.jpg           (2, 5)       (3, 12)      5.73 seconds
emir.jpg                (24, 49)     (43, 86)     32.59 seconds
harvesters.jpg          (16, 59)     (13, 123)    32.31 seconds
icon.jpg                (17, 41)     (23, 89)     33.56 seconds
lady.jpg                (9, 51)      (11, 112)    33.10 seconds
melons.jpg              (10, 81)     (13, 178)    29.05 seconds
monastery.jpg           (2, -3)      (2, 3)       6.05 seconds
onion_church.jpg        (26, 51)     (36, 108)    33.29 seconds
self_portrait.jpg       (29, 78)     (37, 176)    35.51 seconds
three_generations.jpg   (14, 53)     (11, 112)    33.13 seconds
tobolsk.jpg             (3, 3)       (3, 6)       6.11 seconds
train.jpg               (5, 42)      (32, 87)     31.80 seconds
village.jpg             (12, 64)     (22, 137)    34.83 seconds
workshop.jpg            (0, 53)      (-12, 105)   31.95 seconds

Results of Aligning Images of my Choosing:

Image                   g offsets    r offsets    Time
church.jpg              (19, 9)      (38, 35)     33.23 seconds
pinkhus.jpg             (-22, 52)    (-55, 107)   34.54 seconds
river.jpg               (3, 17)      (19, 138)    33.20 seconds

Challenges Encountered:

Images like emir.tif and pinkhus.tif were pretty hard to align, and the results are still not perfect. For emir.tif I found that computing the SSD on edge-detection images during alignment makes a big difference, but the other images didn't fare as well with that change. For pinkhus.tif, I believe the difficulty is that the water was probably not in the same position across the multiple exposures, which makes it hard to align. Overall, though, the biggest challenge was keeping the processing time down without sacrificing the smaller images and having them stop working; I had to tweak parameters to make both work. Images like harvesters.tif were also hard because there is some rotation between the exposures that cannot be fixed by shifting in the x and y directions alone.

Bells and Whistles:

Cropping:


First Crop:

The first step in my cropping algorithm was to compare the offsets of the g and r images and determine from them how much to crop from each side of the stacked color image, to remove the overlap caused by shifting either image. Because the images were rolled when shifted, the pixels pushed off one side reappear on the opposite side, and I wanted to remove that. To do so, I checked how far g and r each moved in each direction. For example, if they moved down by X and Y pixels respectively, I would take the maximum of X and Y and remove that many pixel rows from the top. I did the same for all four sides, and performed the crop on the stacked color image.
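
The rule above can be sketched as a small function that returns how many rows or columns to cut from the top, bottom, left, and right. The function name is hypothetical; offsets are (X, Y) with positive X meaning right and positive Y meaning down.

```python
def first_crop(g_offset, r_offset):
    """Return (top, bottom, left, right) pixels to cut after rolling g and r."""
    xs = (g_offset[0], r_offset[0])
    ys = (g_offset[1], r_offset[1])
    top = max(0, *ys)                    # downward shifts wrap rows onto the top
    bottom = max(0, *(-y for y in ys))   # upward shifts wrap rows onto the bottom
    left = max(0, *xs)                   # rightward shifts wrap columns onto the left
    right = max(0, *(-x for x in xs))    # leftward shifts wrap columns onto the right
    return top, bottom, left, right
```

For the cathedral.jpg offsets, `first_crop((2, 5), (3, 12))` gives `(12, 0, 3, 0)`, and for monastery.jpg, `first_crop((2, -3), (2, 3))` gives `(3, 3, 2, 0)`.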



Second Crop:

After that initial crop, I cropped the aligned r and g images and the b image in the same way as the stacked image above. I then ran each of them through a Sobel filter to produce edge-detection images. In the outermost 10% of each image on every side, starting from the outer edge and working inward, I checked whether any row or column consisted at least 65% of values greater than 0.20 (for rows) or 0.10 (for columns). Any that did was considered an edge, and for each of the three images I saved the position of the furthest edge found from each direction. I then compared these furthest edges across the images and, similarly to how I chose the largest offsets in the first crop, took the largest value for each side, so that if one image called for a large cut, the final image got a large cut. I performed this final crop with those values on the result of the first crop of the stacked color image. This was the result.
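
A rough sketch of this border search follows. It is written with plain NumPy rather than a library Sobel filter, and it scales the edge image so its maximum is 1, which is an assumption of this sketch (the thresholds above depend on how the edge image is scaled). All function names are hypothetical.

```python
import numpy as np

def sobel_magnitude(img):
    """Sobel gradient magnitude via array slicing, scaled so the max is 1."""
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    mag = np.hypot(gx, gy)
    peak = mag.max()
    return mag / peak if peak > 0 else mag

def border_crop(edges, frac=0.10, coverage=0.65, row_thresh=0.20, col_thresh=0.10):
    """Furthest edge row/column inside the outer `frac` band of each side."""
    h, w = edges.shape
    band_h, band_w = int(h * frac), int(w * frac)
    row_is_edge = (edges > row_thresh).mean(axis=1) >= coverage
    col_is_edge = (edges > col_thresh).mean(axis=0) >= coverage

    def furthest(flags):
        hits = np.flatnonzero(flags)
        return int(hits[-1]) + 1 if hits.size else 0

    top = furthest(row_is_edge[:band_h])
    bottom = furthest(row_is_edge[::-1][:band_h])
    left = furthest(col_is_edge[:band_w])
    right = furthest(col_is_edge[::-1][:band_w])
    return top, bottom, left, right

def second_crop(channels, **kwargs):
    """Largest cut per side over all channels, as (top, bottom, left, right)."""
    crops = [border_crop(sobel_magnitude(c), **kwargs) for c in channels]
    return tuple(max(side) for side in zip(*crops))
```

Taking the maximum per side means one channel with a strong border line is enough to trigger the cut in the final image.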



The algorithm probably isn't the best: it worked well for some images but not all, though it did help at least a bit for every image. It was also a bit computationally heavy.

g and r offsets are represented the same way as before.

The first crop and second crop offsets are in the form (A, B, C, D):
A represents how many pixel rows were cut off from the top.
B represents how many pixel rows were cut off from the bottom.
C represents how many pixel columns were cut off from the left.
D represents how many pixel columns were cut off from the right.
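
As a small illustration of how an (A, B, C, D) tuple maps onto array slicing, here is a sketch with a hypothetical function name. It assumes the image is a NumPy array with rows first and columns second.

```python
import numpy as np

def apply_crop(img, crop):
    """Cut A rows from the top, B from the bottom, C columns left, D right."""
    top, bottom, left, right = crop
    h, w = img.shape[:2]
    return img[top:h - bottom, left:w - right]

# Example on a dummy 100 x 80 color image, using the cathedral.jpg first crop.
dummy = np.zeros((100, 80, 3))
cropped = apply_crop(dummy, (12, 0, 3, 0))  # shape (88, 77, 3)
```

The second crop is applied the same way to the result of the first.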



Image                   g offsets    r offsets    First Crop        Second Crop        Time
cathedral.jpg           (2, 5)       (3, 12)      (12, 0, 3, 0)     (0, 3, 11, 18)     6.61 seconds
emir.jpg                (24, 49)     (43, 86)     (86, 0, 43, 0)    (0, 0, 85, 39)     41.03 seconds
harvesters.jpg          (16, 59)     (13, 123)    (123, 0, 16, 0)   (0, 0, 67, 70)     38.71 seconds
icon.jpg                (17, 41)     (23, 89)     (89, 0, 23, 0)    (0, 0, 71, 0)      42.82 seconds
lady.jpg                (9, 51)      (11, 112)    (112, 0, 11, 0)   (0, 0, 0, 0)       43.11 seconds
melons.jpg              (10, 81)     (13, 178)    (178, 0, 13, 0)   (0, 0, 98, 95)     48.27 seconds
monastery.jpg           (2, -3)      (2, 3)       (3, 3, 2, 0)      (11, 7, 18, 14)    48.27 seconds
onion_church.jpg        (26, 51)     (36, 108)    (108, 0, 36, 0)   (0, 0, 109, 110)   41.95 seconds
self_portrait.jpg       (29, 78)     (37, 176)    (176, 0, 37, 0)   (0, 0, 163, 110)   41.74 seconds
three_generations.jpg   (14, 53)     (11, 112)    (112, 0, 14, 0)   (0, 0, 98, 209)    45.33 seconds
tobolsk.jpg             (3, 3)       (3, 6)       (6, 0, 3, 0)      (3, 10, 18, 14)    6.42 seconds
train.jpg               (5, 42)      (32, 87)     (87, 0, 32, 0)    (0, 0, 69, 0)      36.66 seconds
village.jpg             (12, 64)     (22, 137)    (137, 0, 22, 0)   (0, 0, 0, 68)      41.64 seconds
workshop.jpg            (0, 53)      (-12, 105)   (105, 0, 0, 12)   (0, 0, 0, 85)      38.53 seconds
pinkhus.jpg             (-22, 52)    (-55, 107)   (107, 0, 0, 55)   (0, 0, 98, 49)     38.53 seconds
river.jpg               (3, 17)      (19, 138)    (138, 0, 19, 0)   (0, 0, 0, 185)     32.21 seconds
church.jpg              (19, 9)      (38, 35)     (35, 0, 38, 0)    (10, 0, 51, 63)    31.39 seconds