Project description

The goal of this assignment is to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image with as few visual artifacts as possible. In order to do this, I extract the three color channel images, place them on top of each other, and align them so that they form a single RGB color image.

Window of displacements

Anchoring on the blue channel, we align the red and green channels by searching over a window of displacements, say [-15, 15] pixels along each axis. For each candidate displacement, we roll the image by that amount and evaluate a distance function between it and the anchor image. We select the displacement with the minimum distance value.
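The search described above can be sketched as follows. This is a minimal illustration, not the code used for the results: the function names `ssd` and `align_single_scale` and the `radius` parameter are made up for this example, and the channels are assumed to be 2-D NumPy arrays of equal shape.

```python
import numpy as np


def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))


def align_single_scale(channel, anchor, radius=15):
    """Return the (dy, dx) roll of `channel` that minimizes the distance to `anchor`.

    Every displacement in [-radius, radius] along both axes is tried.
    """
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around the image border.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, anchor)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift
```

With hypothetical arrays `red` and `blue`, the aligned red channel would then be `np.roll(red, align_single_scale(red, blue), axis=(0, 1))`.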

Distance Function

For this assignment, there are two candidate distance functions: SSD (Sum of Squared Differences) and NCC (Normalized Cross-Correlation). After experimenting with both, I concluded that the results were very similar. I evaluate the distance function only over the center 2/3 of the images, since rolling an image wraps the part that overflows on one side around to the opposite side, where it will not match the anchor image.
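Both distance functions, restricted to the center 2/3 of the images, might look like the sketch below. The helper names are hypothetical. The NCC variant shown subtracts the mean before normalizing, which is one common definition and an assumption here, and it is negated so that a smaller value means a better match, like SSD.

```python
import numpy as np


def center_crop(img, keep=2 / 3):
    """Keep the central `keep` fraction of a 2-D image along each axis."""
    h, w = img.shape
    mh = int(round(h * (1 - keep) / 2))
    mw = int(round(w * (1 - keep) / 2))
    return img[mh:h - mh, mw:w - mw].astype(np.float64)


def ssd(a, b):
    """Sum of squared differences over the image centers (lower is better)."""
    return float(np.sum((center_crop(a) - center_crop(b)) ** 2))


def ncc_distance(a, b):
    """Negated zero-mean normalized cross-correlation (lower is better)."""
    a = center_crop(a).ravel()
    b = center_crop(b).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0  # a constant image carries no correlation information
    return -float(np.dot(a, b) / denom)
```

Unlike SSD, this form of NCC does not change when one image is multiplied by a positive constant or has a constant added to it.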

Pyramid Method

For large images, it is extremely slow to run the window search over a large set of displacements. Instead, we downscale the images and use a small window to align the coarse versions; we then double the scale, carry the displacement found so far (also doubled) over to the larger images, and refine it, repeating until we reach full resolution. At each scale we only search a small window, but the alignment becomes progressively more accurate across the scales.
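A recursive sketch of this coarse-to-fine idea is shown below. It makes several assumptions that are not from the write-up: downscaling is done by averaging 2x2 blocks, the recursion stops once the shorter side is at most `min_size` pixels, and SSD over the center 2/3 is the distance. All function and parameter names are illustrative.

```python
import numpy as np


def downscale2(img):
    """Halve the resolution by averaging 2x2 blocks (odd trailing row/column dropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def center_ssd(a, b):
    """SSD over the central 2/3 of two equally sized images."""
    h, w = a.shape
    mh, mw = h // 6, w // 6
    return float(np.sum((a[mh:h - mh, mw:w - mw] - b[mh:h - mh, mw:w - mw]) ** 2))


def search(channel, anchor, start, radius):
    """Try every shift within `radius` of `start` and return the best one."""
    best, best_score = start, np.inf
    for dy in range(start[0] - radius, start[0] + radius + 1):
        for dx in range(start[1] - radius, start[1] + radius + 1):
            score = center_ssd(np.roll(channel, (dy, dx), axis=(0, 1)), anchor)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best


def align_pyramid(channel, anchor, radius=15, min_size=32):
    """Coarse-to-fine alignment: solve at half resolution, then refine."""
    if min(channel.shape) <= min_size:
        return search(channel, anchor, (0, 0), radius)
    coarse = align_pyramid(downscale2(channel), downscale2(anchor), radius, min_size)
    # A shift of k pixels at half resolution is about 2k pixels at this level.
    return search(channel, anchor, (2 * coarse[0], 2 * coarse[1]), radius)
```

Because each level only refines the doubled estimate from the level below, the reachable displacement grows with the number of levels while the per-level cost stays fixed by `radius`.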

Results on Smaller Images

Original digitized glass plates

cathedral

monastery

nativity

settlers

Single Scale Alignment

Single-scale alignment using a window of displacements works well on small images. Again, we evaluate only the center 2/3 of the images.

As we can see, the boundaries of the merged images are not aligned well due to rolling. This can be resolved by simply cropping out the borders.
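One simple way to crop out the wrapped borders, given the displacements that were applied, is sketched here. The function name and the choice to trim symmetrically by the largest displacement are assumptions for illustration; the symmetric trim removes slightly more than strictly necessary.

```python
import numpy as np


def crop_wrapped_border(rgb, shifts):
    """Trim the rows and columns that np.roll may have wrapped around.

    `rgb` is an (H, W, 3) array and `shifts` is a list of the (dy, dx)
    displacements applied to the non-anchor channels.
    """
    my = max(abs(dy) for dy, _ in shifts)
    mx = max(abs(dx) for _, dx in shifts)
    h, w = rgb.shape[:2]
    return rgb[my:h - my, mx:w - mx]
```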

Results on Larger Images

Pyramid Alignment

I start with the image downscaled to 1/512 of its original size, and then work my way up to the original image. At each level I still use a window of displacements of [-15, 15].

The boundary issue still exists, since we are using the same technique at each level. This method works on most of the images, but there are cases where it does not work well.

Failure

In the case of Emir of Bukhara, the images to be matched do not actually have the same brightness values (they are different color channels). This poses a significant challenge, since raw channel intensity is not a good feature for matching.

One solution is to use a better feature. I tried Canny edge detection to extract edge information for each pixel, and used that for matching.

Edge feature

Instead of looking at the raw channel intensities, I used a Canny edge detector to get an edge matrix for each of the glass plates.

Using edge information as the feature, I ran the same pyramid matching method on the edge matrices to obtain the best displacements, and used them to align the original R and G images.

As can be observed, the alignment is now much better.
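The idea of matching on edges and then applying the resulting displacement to the original channel can be sketched as below. To keep the example self-contained, a plain gradient-magnitude map stands in for the Canny detector, and a single-scale search stands in for the pyramid; both substitutions, and all names, are assumptions made for this illustration.

```python
import numpy as np


def edge_map(img):
    """Gradient magnitude, used here as a simple stand-in for Canny edges."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)


def center_ssd(a, b):
    """SSD over the central 2/3 of two equally sized images."""
    h, w = a.shape
    mh, mw = h // 6, w // 6
    return float(np.sum((a[mh:h - mh, mw:w - mw] - b[mh:h - mh, mw:w - mw]) ** 2))


def best_shift(channel, anchor, radius=15):
    """Exhaustive search for the (dy, dx) roll that minimizes center_ssd."""
    best, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            score = center_ssd(np.roll(channel, (dy, dx), axis=(0, 1)), anchor)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best


def align_by_edges(channel, anchor, radius=15):
    """Find the shift on the edge maps, then apply it to the original channel."""
    shift = best_shift(edge_map(channel), edge_map(anchor), radius)
    return shift, np.roll(channel, shift, axis=(0, 1))
```

Edge strength depends on local intensity changes rather than absolute brightness, so two channels can still match here even when one is much brighter or darker than the other in the same region.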

Automatic cropping

After obtaining the color image, I convert it to grayscale and run edge detection on it. The sum of a row or column of the edge matrix is a good indicator of a border. Using this information, we can implement auto-cropping. Some cropped examples are below.
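A rough sketch of this kind of auto-cropping follows. It is not the code behind the examples: a thresholded gradient magnitude stands in for the edge detector, row and column sums are divided by their length so they can be compared against a fraction, and the names and the values of `threshold`, `search_frac`, and `min_frac` are assumptions chosen for images scaled to [0, 1].

```python
import numpy as np


def edge_mask(gray, threshold=0.1):
    """Binary edge matrix from a thresholded gradient magnitude."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.hypot(gx, gy) > threshold


def border_width(profile, search_frac=0.2, min_frac=0.5):
    """Number of leading entries to trim from an edge profile.

    Looks only at the first `search_frac` of the profile and trims up to and
    including the last line where more than `min_frac` of pixels are edges.
    """
    limit = max(1, int(len(profile) * search_frac))
    strong = np.flatnonzero(profile[:limit] > min_frac)
    return int(strong[-1]) + 1 if strong.size else 0


def auto_crop(rgb):
    """Crop an (H, W, 3) image just inside its detected border lines."""
    gray = rgb.astype(np.float64).mean(axis=2)
    edges = edge_mask(gray)
    rows = edges.mean(axis=1)  # row sum divided by width
    cols = edges.mean(axis=0)  # column sum divided by height
    top, bottom = border_width(rows), border_width(rows[::-1])
    left, right = border_width(cols), border_width(cols[::-1])
    h, w = gray.shape
    return rgb[top:h - bottom, left:w - right]
```

Restricting the search to the outer part of the image keeps strong horizontal or vertical structures in the middle of the photograph from being mistaken for a border.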