Xinyang Geng
Sergei Mikhailovich Prokudin-Gorskii was a Russian photographer ahead of his time. From 1907 to 1915, he traveled around the Russian Empire, taking color images of everything before color photography was invented. He did so by taking three pictures of the same scene with red, green, and blue filters. Later on, these images were purchased by the Library of Congress and made available online.
The fundamental idea behind the image alignment algorithm is to exhaustively shift one of the images and find the best match. To quantify how good a match is, we use a metric called normalized cross correlation. For two vectors $u$ and $v$, the normalized cross correlation is defined as the inner product of the normalized vectors: $\mathrm{NCC}(u, v) = \left\langle \frac{u}{\|u\|}, \frac{v}{\|v\|} \right\rangle$.
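A minimal sketch of this metric, assuming the vectors are flattened grayscale image patches:

```python
import numpy as np

def ncc(u, v):
    """Normalized cross correlation: inner product of the L2-normalized
    (flattened) vectors, so a perfect match scores exactly 1."""
    u = np.asarray(u, dtype=float).ravel()
    v = np.asarray(v, dtype=float).ravel()
    return np.dot(u / np.linalg.norm(u), v / np.linalg.norm(v))
```

Higher is better: identical inputs (up to a positive scale factor) score 1, and any mismatch scores less.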
Our naive algorithm is:
Note that the images all contain black edges that do not include any alignment information, so we crop out 14% of the images before we compute the normalized cross correlation.
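The exact code is not reproduced here, but a minimal sketch of the exhaustive search, assuming a ±15 pixel window, circular shifts via `np.roll`, and the 14% crop described above:

```python
import numpy as np

def crop_interior(im, frac=0.14):
    """Drop frac of the image on every side to ignore the black edges."""
    h, w = im.shape
    dh, dw = int(h * frac), int(w * frac)
    return im[dh:h - dh, dw:w - dw]

def align_naive(im, ref, radius=15):
    """Try every (dy, dx) in [-radius, radius]^2 and return the shift
    of `im` that maximizes NCC against `ref`."""
    best, best_score = (0, 0), -np.inf
    b = crop_interior(ref).ravel()
    b = b / np.linalg.norm(b)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            a = crop_interior(np.roll(np.roll(im, dy, axis=0), dx, axis=1)).ravel()
            score = np.dot(a / np.linalg.norm(a), b)
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best
```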
Let's run our algorithm on the small images.
The main disadvantage of our naive algorithm is that the exhaustive search is slow, especially for large images. One way to improve performance is to find an approximate best offset by searching on smaller-scale versions of the images first. Here is an illustration from Wikipedia:
At each level we reduce the image size by a factor of 2, and we stop rescaling when we hit a 300 pixel limit. At the coarsest level, we still search offsets from -15 to 15. For all the levels above, we only search a narrow window (from -6 to 6) around the coarse estimate.
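A sketch of this coarse-to-fine search. For simplicity it downsamples by averaging 2x2 blocks rather than whatever resampling the original code used, and reuses the cropped-NCC score from before:

```python
import numpy as np

def _score(a, b, frac=0.14):
    """NCC on the cropped interiors of two same-shape images."""
    h, w = a.shape
    dh, dw = int(h * frac), int(w * frac)
    a = a[dh:h - dh, dw:w - dw].ravel()
    b = b[dh:h - dh, dw:w - dw].ravel()
    return np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b))

def _search(im, ref, center, radius):
    """Exhaustive NCC search in a window of `radius` around `center`."""
    (cy, cx), best, best_score = center, center, -np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            s = _score(np.roll(np.roll(im, dy, axis=0), dx, axis=1), ref)
            if s > best_score:
                best_score, best = s, (dy, dx)
    return best

def _halve(im):
    """Downsample by a factor of 2 via 2x2 block averaging."""
    h, w = im.shape[0] // 2 * 2, im.shape[1] // 2 * 2
    im = im[:h, :w]
    return (im[0::2, 0::2] + im[1::2, 0::2] + im[0::2, 1::2] + im[1::2, 1::2]) / 4

def align_pyramid(im, ref, limit=300):
    """Recurse on half-size images until under `limit` pixels, search
    -15..15 at the coarsest level, then refine the doubled offset with
    a -6..6 window at each finer level."""
    if max(im.shape) <= limit:
        return _search(im, ref, (0, 0), 15)
    cy, cx = align_pyramid(_halve(im), _halve(ref), limit)
    return _search(im, ref, (2 * cy, 2 * cx), 6)
```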
Let's run the algorithm on all images.
Sometimes it is not ideal to compute the alignment directly on the raw pixels. Hence I've also tried aligning images with edge features. Here we use the Canny edge detector provided in scikit-image. Let's visualize the edge features.
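For reference, `skimage.feature.canny` takes a 2-D grayscale image and returns a boolean edge map; the `sigma` value below is an assumed setting, not necessarily the one used here:

```python
import numpy as np
from skimage.feature import canny

# A bright square on a dark background; canny should fire on its outline.
img = np.zeros((64, 64))
img[16:48, 16:48] = 1.0
edges = canny(img, sigma=1.0)  # boolean array, True on edge pixels
```

The `sigma` parameter controls the Gaussian smoothing applied before gradients are computed, trading noise suppression for edge localization.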
Now let's run our algorithm on all the images.
We can see that all aligned images contain invalid edges. Here we propose an algorithm to automatically detect these edges and crop them out. We notice that the edges correspond to large changes of pixel value along an axis, and this fact points in the direction of oriented gradients. The gradient of an image along a given axis can be computed as the convolution of a finite difference filter with the image.
Let's take a look at the gradient along the horizontal axis.
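As a concrete example (using scipy's `convolve2d`, an assumption on my part; any 2-D convolution routine would do), convolving a left-to-right ramp with the finite difference filter `[1, -1]` recovers its constant slope:

```python
import numpy as np
from scipy.signal import convolve2d

dx_filter = np.array([[1.0, -1.0]])               # finite difference along x
img = np.tile(np.linspace(0.0, 1.0, 8), (8, 1))   # ramp with slope 1/7
grad_x = convolve2d(img, dx_filter, mode='same')
```

The first column of the output is a boundary artifact of the zero padding; everywhere else the response equals the ramp's slope.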
As we can see, the output is noisy. We can smooth it by applying a Gaussian filter. Let $*$ be the convolution operator, $D$ the finite difference filter, $G$ the Gaussian filter, and $I$ the image. By associativity of convolution, we know that $D * (G * I) = (D * G) * I$.
Hence, we can first take the derivative of the Gaussian filter by convolving it with the finite difference filter, and then convolve the resulting filter with the original image. This results in the derivative-of-Gaussian filter.
Let's visualize it.
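A sketch of building this filter; the kernel size and sigma below are assumed values, not necessarily the ones used in this project:

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.signal import convolve2d

size, sigma = 9, 1.5
impulse = np.zeros((size, size))
impulse[size // 2, size // 2] = 1.0
gauss = gaussian_filter(impulse, sigma)  # discrete Gaussian kernel
# Convolve the Gaussian with the finite difference filter once; the
# result replaces the two-step smooth-then-differentiate on every image.
dog_x = convolve2d(gauss, np.array([[1.0, -1.0]]), mode='same')
```

Like any derivative filter, its coefficients sum to approximately zero, so it responds to intensity changes but not to flat regions.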
Now let's plot the mean edge response along the y axis. The edge signal looks very clear! We can threshold it to detect the edges. We take the center 60% of the image and use the maximum edge response there as the baseline. Then we add 0.5 standard deviations of the edge response to get our threshold.
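A sketch of this rule; the write-up does not show how the crop boundary is chosen once the threshold is set, so this version assumes we cut at the outermost above-threshold positions on each side:

```python
import numpy as np

def find_borders(edge_response):
    """Given a 1-D mean edge response along one axis, return the
    (first, last) index range to keep. Threshold = max of the center
    60% plus 0.5 standard deviations of the whole response."""
    n = len(edge_response)
    center = edge_response[int(0.2 * n):int(0.8 * n)]
    threshold = center.max() + 0.5 * edge_response.std()
    above = np.where(edge_response > threshold)[0]
    left = above[above < n // 2]
    right = above[above >= n // 2]
    lo = left.max() + 1 if left.size else 0
    hi = right.min() if right.size else n
    return lo, hi
```

Running it on both axes yields the rectangle to keep.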
Now we run our auto-cropping algorithm on all the colored images. We can see that in the last image, the algorithm over-estimated the edge.
Let's run our algorithm on some extra images.
We notice some of the images are blurry, so let's apply unsharp masking to sharpen them. Let $*$ be the convolution operator and $G$ be a Gaussian filter. Let $M$ be the input image; unsharp masking computes the output as $M + (M - G * M)$.
The term $M - G * M$ gives us the high-frequency signal, which we add back to the original image.
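A minimal sketch, using scipy's `gaussian_filter` for $G$ (an assumption) and a weight `alpha` on the residual; `alpha = 1` adds the high-frequency signal back once, as described above:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(m, sigma=2.0, alpha=1.0):
    """out = M + alpha * (M - G * M): boost the high-frequency residual."""
    blurred = gaussian_filter(m, sigma)
    return m + alpha * (m - blurred)
```

Note the characteristic overshoot next to strong edges: that halo is exactly what makes the result look sharper.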
Finally, we recolor the image by transferring the color histogram from another image. Here we transfer the RGB distribution of one image to another.