Vincent Escueta

Sergei Mikhailovich Prokudin-Gorskii took three cameras. He put a blue filter on one, a green filter on another, and a red filter on the last. He placed the green camera on top of the red camera and the blue camera on top of the green one, then photographed scenes with all three cameras. Sergei took thousands of pictures but never got to see them in color. Aligning his photographs with normalized cross correlation finally brings the color photographs that Sergei wanted to life.

Normalized Cross Correlation (NCC)

To produce color images, the blue, green, and red images must be well aligned. To do this, we shift the red and green photographs over the blue one until they align almost perfectly. To find the best alignment, we displaced the red and green images horizontally and vertically (over [-15, 15] pixels in each direction) and, at each displacement, computed the normalized cross correlation of the gray intensities, first between the green image and the blue image, then between the red image and the blue image. The displaced green and red photos that correlate best with the blue photo are then used to create the RGB color image. The equation for the normalized cross correlation is

 < image1 / ||image1||, image2 / ||image2|| >

where image1 is the first image with its pixels treated as a single vector, ||image1|| is the magnitude of that vector, and image2 and ||image2|| are defined the same way for the second image. Dividing each image by its magnitude gives a unit vector, and the dot product of the two unit vectors is the correlation. The closer the dot product is to 1, the better the correlation between the two images.
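The formula above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the code used for the results here; the function name `ncc` and the handling of an all-zero image are assumptions.

```python
import numpy as np

def ncc(image1, image2):
    """Dot product of the two images after flattening each to a unit vector."""
    v1 = np.asarray(image1, dtype=np.float64).ravel()
    v2 = np.asarray(image2, dtype=np.float64).ravel()
    norm1 = np.linalg.norm(v1)
    norm2 = np.linalg.norm(v2)
    if norm1 == 0 or norm2 == 0:
        # Assumption: an all-black image is treated as uncorrelated.
        return 0.0
    return float(np.dot(v1 / norm1, v2 / norm2))

a = np.array([[1.0, 2.0], [3.0, 4.0]])
print(ncc(a, a))        # close to 1: identical images
print(ncc(a, 3.0 * a))  # close to 1: overall brightness is normalized away
```

Because each image is divided by its own magnitude, scaling the brightness of one image does not change the score.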

However, many images did not align well this way. This is because the intensities at pixels of two different colors may be drastically different even where they show the same part of the scene. To account for that, I correlated only the middle 50% of each photograph, on the assumption that the parts of the picture most useful for correlation lie in the middle 50% of the image.
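The exhaustive search restricted to the middle 50% might look like the sketch below. The names `center_crop` and `align`, the (row, column) ordering of the displacement, and the use of `np.roll` (which wraps pixels around the border) are all assumptions made for illustration, not details taken from the original code.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross correlation of two equally sized images."""
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def center_crop(img, keep=0.5):
    """Keep the central `keep` fraction of the image along each axis."""
    h, w = img.shape
    dh = int(h * (1 - keep) / 2)
    dw = int(w * (1 - keep) / 2)
    return img[dh:h - dh, dw:w - dw]

def align(channel, reference, max_shift=15, keep=0.5):
    """Try every (dy, dx) in [-max_shift, max_shift] and return the best one."""
    ref = center_crop(reference, keep)
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(center_crop(shifted, keep), ref)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

# Synthetic check: "green" is "blue" moved up 3 rows and right 5 columns.
rng = np.random.default_rng(0)
blue = rng.random((64, 64))
green = np.roll(blue, (-3, 5), axis=(0, 1))
print(align(green, blue, max_shift=8))  # (3, -5)
```

Scoring only the cropped center means the pixels that `np.roll` wraps around the edges never enter the correlation, as long as the shift is smaller than the cropped margin.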

Below are a few examples of the images produced using NCC. The left images (odd-numbered figures) are the results of running NCC on the whole image. The right images (even-numbered figures) are the results of running NCC on the middle 50% of the image.

Figure 1: all pixels of monastery.jpg: R:[8 1] G:[-6 0] TIME:0.565473079681	Figure 2: middle 50% of pixels monastery.jpg: R:[2 2] G:[-3 2] TIME:0.41522693634
Figure 3: all pixels of nativity.jpg: R:[7 1] G:[4 1] TIME:0.594136953354	Figure 4: middle 50% of pixels nativity.jpg: R:[8 0] G:[3 1] TIME:0.450391054153
Figure 5: all pixels of settlers.jpg: R:[14 -1] G:[7 0] TIME:0.823676109314	Figure 6: middle 50% of pixels settlers.jpg: R:[14 -1] G:[7 0] TIME:0.405462026596

Image Pyramid

The images shown previously were all small photos, about 300 pixels on a side. For larger photos, about 3000 pixels on a side, aligning and coloring took significantly longer because we had to increase the range of displacements and compute NCCs over larger images.

To decrease the time it takes to align and color the photos, we used an image pyramid. We repeatedly rescaled the image down by half, stopping when halving it again would make its height less than 128 pixels. We then aligned and colored this smallest image using NCC. Once we had a displacement, (x, y), we rescaled upward by doubling the size of the current image and ran NCC with a displacement range of [(x-2, x+2), (y-2, y+2)]. We repeated this until we had aligned and colored the original-sized image.
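A recursive sketch of this coarse-to-fine search is given below. It rests on several assumptions: the downscaling is done by 2x2 block averaging (a stand-in for whatever rescale routine was actually used), the displacement found at one level is doubled before the ±2 refinement at the next, and the names `halve`, `search`, and `pyramid_align` are hypothetical.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross correlation of two equally sized images."""
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def search(channel, reference, dy_values, dx_values):
    """Return the (dy, dx) among the candidates with the highest NCC."""
    best_score, best_shift = -np.inf, (0, 0)
    for dy in dy_values:
        for dx in dx_values:
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def halve(img):
    """Downscale by 2 with 2x2 block averaging (assumed rescale method)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_align(channel, reference, min_height=128, coarse=15, refine=2):
    # Coarsest level: halving again would drop the height below min_height,
    # so run the full [-coarse, coarse] search here.
    if reference.shape[0] // 2 < min_height:
        full = range(-coarse, coarse + 1)
        return search(channel, reference, full, full)
    # Otherwise solve the half-size problem first...
    dy, dx = pyramid_align(halve(channel), halve(reference),
                           min_height, coarse, refine)
    # ...then double the estimate and refine it within +/- `refine` pixels.
    dy, dx = 2 * dy, 2 * dx
    return search(channel, reference,
                  range(dy - refine, dy + refine + 1),
                  range(dx - refine, dx + refine + 1))

rng = np.random.default_rng(1)
blue = rng.random((512, 512))
red = np.roll(blue, (-40, 24), axis=(0, 1))
print(pyramid_align(red, blue))  # (40, -24)
```

In this example the full ±15 search runs only on the 128-pixel level, and each larger level checks just 25 candidates, yet the recovered displacement lies well outside the [-15, 15] range.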

The following shows the time difference between the naive scan of the images and the Image Pyramid algorithm.

Figure 7: Naive cathedral.jpg: R:[12 3] G:[5 2] TIME:2.9832880497.	Figure 8: Image Pyramid cathedral.jpg: R:[12 3] G:[5 2] TIME:0.675570011139.
Figure 9: Naive icon.tif: R:[15 12] G:[15 14] TIME:316.376515865.	Figure 10: Image Pyramid icon.tif: R:[89 23] G:[41 17] TIME:17.6947059631

For the small image (cathedral.jpg), the Image Pyramid is about 2.3 seconds faster than the naive search; for the large image (icon.tif), it is about 300 seconds faster. In addition, because the naive search only tries displacements in [-15, 15], it cannot find the best alignment for the large image: in Figure 9 the displacements are capped at 15, while the Image Pyramid finds R:[89 23] and G:[41 17]. Widening the displacement range would make the naive search on larger images even slower.

The following are the larger images that used the Image Pyramid algorithm.

Figure 11: middle 50% of pixels self_portrait.tif: R:[175 37] G:[78 29] TIME:13.4188771248	Figure 12: middle 50% of pixels of three_generations.tif: R:[110 12] G:[52 14] TIME:14.5100548267
Figure 13: middle 50% of pixels turkmen.tif: R:[116 28] G:[55 21] TIME:13.3178520203	Figure 14: middle 50% of pixels village.tif: R:[137 22] G:[64 12] TIME:15.6597101688

Edge Detection

Although most of the images produced with the image pyramid were good, aligning on pixel intensity still produced bad results in some cases, such as in Figure 29. To fix this, instead of computing NCC on raw pixel intensities, we applied the Canny edge filter to each image before computing NCC, which gave the best results for all images. Figure 29 shows the NCC alignment using all the pixels, Figure 30 shows the alignment becoming worse after applying the Image Pyramid algorithm, and Figure 31 shows the alignment becoming perfect after applying the Canny edge filter. The runtime increased significantly, but the results were much better.
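The sketch below illustrates why correlating edges can succeed where raw intensities fail. To stay dependent on NumPy alone, it does not use the Canny filter: a plain gradient-magnitude edge map stands in for it, and a library Canny implementation (for example `skimage.feature.canny`) would take the place of the hypothetical `edge_map` function. The synthetic scene and all function names are assumptions.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross correlation of two equally sized images."""
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def align(channel, reference, max_shift=6):
    """Exhaustive search for the (dy, dx) with the highest NCC."""
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def edge_map(img):
    """Gradient magnitude: a simple stand-in for the Canny edge filter."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)

# One object that is bright in the blue channel but dark in the red channel,
# with the red channel also displaced by (-4, 3).
scene = np.zeros((40, 40))
scene[10:25, 12:30] = 1.0
blue = scene
red = 1.0 - np.roll(scene, (-4, 3), axis=(0, 1))

print(align(red, blue))                      # raw intensities: misaligned
print(align(edge_map(red), edge_map(blue)))  # edges: (4, -3)
```

The object's outline is the same in both channels even though its brightness is inverted, so the edge maps agree at the true displacement while the raw intensities do not.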