Before color photography was invented, Prokudin-Gorskii found a simple way to capture the colorful world with only a grayscale camera: he photographed each scene three times, through red, green, and blue filters. Although he never realized the color photographs himself, his RGB plates were preserved. This project uses contemporary image processing techniques to automatically align and overlay the three digital negatives into a single color image.
Given the original image containing three channels, I first split it into three separate images, one for each of the R, G, and B channels. Even though these three images have the same size, naively stacking them on top of each other produces blurry, ghosted images, because the channels are slightly offset from one another. The Single-Scale alignment algorithm is intuitively simple: it takes two channels at a time and shifts one of them within a given range to find the best match with the other. My algorithm exhaustively searches displacements in [-15, 15] on both the x and y axes and uses the L2 norm to measure the similarity between the two images. My implementation uses the square root of the Sum of Squared Differences (SSD); because the square root is monotonic, it selects the same displacement as SSD itself. The algorithm records the smallest L2 norm and the corresponding displacement. After using the np.roll() function to shift two of the three channels, I stack all three together to form the final color image.
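The search described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function names are hypothetical, the plate is assumed to be stacked top to bottom in blue, green, red order, and the blue channel is used as the base.

```python
import numpy as np

def l2_distance(a, b):
    """Square root of the sum of squared differences between two images."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return np.sqrt(np.sum(diff ** 2))

def align_single_scale(channel, base, radius=15):
    """Try every (dy, dx) shift in [-radius, radius] and keep the best one."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = l2_distance(shifted, base)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def split_channels(plate):
    """Cut the plate into three equal-height images.
    Assumption: the order from top to bottom is blue, green, red."""
    h = plate.shape[0] // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def colorize(plate, radius=15):
    """Align green and red to blue, then stack the result as an RGB image."""
    b, g, r = split_channels(plate)
    g_aligned = np.roll(g, align_single_scale(g, b, radius), axis=(0, 1))
    r_aligned = np.roll(r, align_single_scale(r, b, radius), axis=(0, 1))
    return np.dstack([r_aligned, g_aligned, b])
```

Note that np.roll() wraps pixels around the image edge, so the wrapped strip also contributes to the score; with a small radius this effect is limited to a thin band along the border.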
For small images like the .jpg files, Single-Scale alignment is able to find suitable displacements that give good visual results. For large images like the .tif files, the exhaustive search becomes computationally inefficient. To handle these high-resolution images, I implemented the Pyramid alignment algorithm, a recursive function that uses Single-Scale alignment as its base case. The algorithm searches for the smallest L2 norm on downscaled images rather than on the originals. After each recursive call, the displacement found so far is rescaled by the downscale ratio and carried into the next level, where it is refined. Starting from a tiny image, the algorithm gradually works back up to the original resolution to produce more precise estimates. Because each level only has to refine the estimate from the coarser level, the search window at each recursive step is [-3, 3] instead of [-15, 15]. The number of recursion levels is a parameter I chose by experiment; it generally works for all of the current high-resolution data.
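A rough sketch of this recursion is shown below. Several details are assumptions made for illustration and are not taken from the project: the downscale ratio is fixed at 2, downscaling is done by averaging 2x2 blocks, and the names search, downscale, align_pyramid, and the default of 4 levels are hypothetical.

```python
import numpy as np

def search(channel, base, center=(0, 0), radius=3):
    """Exhaustive search in a window of the given radius around `center`."""
    cy, cx = center
    best_shift, best_score = center, np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sqrt(np.sum((shifted - base) ** 2))
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downscale(img):
    """Halve the resolution by averaging 2x2 blocks (assumed ratio of 2)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid(channel, base, levels=4, coarse_radius=15, fine_radius=3):
    """Estimate the shift on a coarse image, then refine it level by level."""
    channel = np.asarray(channel, dtype=np.float64)
    base = np.asarray(base, dtype=np.float64)
    if levels == 0:
        # Base case: plain single-scale search on the smallest image.
        return search(channel, base, (0, 0), coarse_radius)
    cy, cx = align_pyramid(downscale(channel), downscale(base),
                           levels - 1, coarse_radius, fine_radius)
    # One pixel at the coarser level corresponds to two pixels here.
    return search(channel, base, (2 * cy, 2 * cx), fine_radius)
```

With this layout, each extra level doubles the displacement range the coarse search can cover while adding only a small 7x7 search at the finer resolution.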
In the original image containing three channels, there are white and black borders around the photograph. In the computation of the L2 norm, these borders contribute spurious values, which lead to imprecise alignment. One way to solve this problem is to crop the borders and then align the cropped images. I implemented an Auto-Crop algorithm that detects the borders by comparing the average value of the rows and columns near the margins against an experimentally chosen threshold. This leads to better alignment and reduces the severity of the single-color bands caused by misalignment.
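One possible way to express this border test is sketched below. It assumes pixel intensities in [0, 1]; the threshold values, the cap on how far inward to look, and the function names are all hypothetical placeholders rather than the values used in the project.

```python
import numpy as np

def count_border(means, dark=0.2, bright=0.9, max_frac=0.1):
    """Count leading rows/columns whose mean looks like a black or white border.
    The scan stops at the first non-border line or after max_frac of the image."""
    limit = int(len(means) * max_frac)
    n = 0
    while n < limit and (means[n] < dark or means[n] > bright):
        n += 1
    return n

def auto_crop_bounds(img, dark=0.2, bright=0.9, max_frac=0.1):
    """Return (top, bottom, left, right) so that img[top:bottom, left:right]
    excludes the detected borders."""
    row_means = img.mean(axis=1)
    col_means = img.mean(axis=0)
    top = count_border(row_means, dark, bright, max_frac)
    bottom = count_border(row_means[::-1], dark, bright, max_frac)
    left = count_border(col_means, dark, bright, max_frac)
    right = count_border(col_means[::-1], dark, bright, max_frac)
    return top, img.shape[0] - bottom, left, img.shape[1] - right

# Synthetic check: a mid-gray picture framed by white and black borders.
demo = np.full((100, 100), 0.5)
demo[:5, :] = 1.0    # white band on top
demo[-3:, :] = 0.0   # black band at the bottom
demo[:, :4] = 0.0    # black band on the left
demo[:, -6:] = 1.0   # white band on the right
top, bottom, left, right = auto_crop_bounds(demo)
cropped = demo[top:bottom, left:right]
```

The cap on the scan depth guards against cropping away a genuinely dark or bright region of the photograph that happens to touch the edge.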
In the skeleton code, the alignment algorithm uses the blue channel as the base channel and shifts the red and green channels to match it. This generally works for images whose channels have similar brightness values. However, for emir.tif, whose channels do not have similar brightness values, blue-based alignment doesn't produce optimal results. I tested emir.tif with green-based and red-based alignment and found that the green base gives a marked improvement. For other images, the difference between the blue base and the green base is not as obvious as it is for emir.tif.
For many images, the contrast is not optimal. I implemented CDF-based histogram equalization, with references listed in my code and readme file. This produces many good-looking images, but for some images it can subtly degrade the results of the alignment algorithm.
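For reference, a common form of CDF-based histogram equalization is sketched below. It is a generic version of the technique, not necessarily the one in the project's code: it assumes a single channel with intensities in [0, 1], and the function name and bin count are illustrative.

```python
import numpy as np

def equalize_histogram(channel, bins=256):
    """Remap intensities through the normalized cumulative histogram (CDF)."""
    flat = np.asarray(channel, dtype=np.float64).ravel()
    hist, bin_edges = np.histogram(flat, bins=bins, range=(0.0, 1.0))
    cdf = hist.cumsum().astype(np.float64)
    cdf /= cdf[-1]  # normalize so the CDF ends at 1
    bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2.0
    # Each pixel is replaced by the CDF value at its intensity.
    return np.interp(flat, bin_centers, cdf).reshape(np.shape(channel))

# A low-contrast example: values squeezed into [0.4, 0.6].
rng = np.random.default_rng(0)
dull = rng.uniform(0.4, 0.6, size=(64, 64))
stretched = equalize_histogram(dull)
```

For a color image, one option is to apply the function to each channel separately, although this can shift the color balance.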
All of these examples use Automatic Cropping and the Green Base Channel, without Automatic Contrast. These results should be reproducible with my main.py by supplying the different input file names.
Here are some examples from the Prokudin-Gorskii collection that were not provided by the course staff. These images were colorized using Automatic Cropping and the Green Base Channel, without Automatic Contrast.