CS194-26: Project 1 - Sean Chen

Project Overview

The Prokudin-Gorskii collection is a photographic survey of Russian culture in the late Tsarist era. At a time when monochrome photography was commonplace, photographer and scientist Prokudin-Gorskii developed a method to capture color images: he placed red, green, and blue glass filters over a camera, so that when he took pictures in quick succession he captured the same scene through three different color filters. Prokudin-Gorskii then showed the developed images through a stacked set of projectors with colored filters. Now that the collection has been digitized, this class project aims to automate the tuning required to align the three color views into a clear, vibrant image.

My Approach

I started by implementing exhaustive search and tested it on the small .jpg images. There were two candidate objectives for the search: cross-correlation of the color images, or mean-squared error (MSE) over different alignments. I tested both quickly, using the built-in numpy implementation of 2D correlation with the intention of implementing it from scratch if the results were good, but the MSE/SSD approach worked well enough that implementing 2D correlation was not necessary. For the MSE approach specifically, I measured the 2-norm of the difference between two images at a time over the candidate alignments, which were searched over a 50 x 50 grid. The 2-norm was not necessarily the optimal choice, so I tested other norm orders, but the 2-norm turned out to work better than the max norm and the 1-norm.
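The exhaustive search described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function name `exhaustive_align`, the `radius` and `norm_ord` parameters, and the use of `np.roll` for shifting are assumptions, and a window of ±25 pixels gives a 51 x 51 grid, close to the 50 x 50 grid mentioned in the text.

```python
import numpy as np

def exhaustive_align(moving, base, radius=25, norm_ord=2):
    """Return the (row, col) shift of `moving` that best matches `base`.

    Every shift in [-radius, radius] on both axes is scored by the norm
    of the pixel-wise difference; the lowest score wins.
    """
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            # Flatten so norm_ord=1, 2, or np.inf are all vector norms.
            score = np.linalg.norm((shifted - base).ravel(), ord=norm_ord)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift
```

Passing `norm_ord=1` or `norm_ord=np.inf` swaps in the 1-norm or max norm, which is one way to compare the objectives discussed above.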
The high-resolution images are roughly 3000 x 3000 pixels, compared to roughly 300 x 300 for the low-resolution ones, so exhaustive search would require a 500 x 500 search grid. Both the larger search space and the larger number of pixels in each MSE computation would make exhaustive search prohibitively slow, so I implemented pyramid search. Here the high-resolution image is scaled down by a factor of 16 to roughly 180 x 180 pixels, and exhaustive search is done on this small image. Once the best alignment is found, the aligned image is scaled up by a factor of two, and the alignment is only adjusted within some slack. This way each level searches a small space, and the coarse levels compute MSE over a small number of pixels. One optimization is that the search stops at the 1/4 scale, and the alignment is scaled up directly from there. Following advice from previous projects, this cutoff avoids computing MSE over the full-resolution image. The results were sufficiently clear with the cutoff.
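A coarse-to-fine search of this kind might look like the sketch below. It is an illustration under stated assumptions, not the project's code: `downscale` uses simple block averaging as a stand-in for area interpolation, the names `search` and `pyramid_align` and the default `slack=2` are hypothetical, and instead of rescaling the aligned image it doubles the running shift at each level, which has the same effect.

```python
import numpy as np

def downscale(img, factor):
    """Block-average downscale by an integer factor (crops any remainder)."""
    h = (img.shape[0] // factor) * factor
    w = (img.shape[1] // factor) * factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def search(moving, base, center, radius):
    """Exhaustive 2-norm search over shifts within `radius` of `center`."""
    best, best_score = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            diff = np.roll(moving, (dy, dx), axis=(0, 1)) - base
            score = np.linalg.norm(diff)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best

def pyramid_align(moving, base, coarsest=16, finest=4, radius=25, slack=2):
    """Full search at 1/coarsest scale, then refine down to 1/finest scale."""
    scale = coarsest
    shift = search(downscale(moving, scale), downscale(base, scale),
                   (0, 0), radius)
    while scale > finest:
        scale //= 2
        center = (2 * shift[0], 2 * shift[1])  # carry the estimate up a level
        shift = search(downscale(moving, scale), downscale(base, scale),
                       center, slack)
    # Stop early and scale the shift straight up to full resolution.
    return (shift[0] * scale, shift[1] * scale)
```

Because the search stops at the 1/4 scale, every returned offset is a multiple of 4.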
As mentioned above, the alignment search is performed over one pair of images at a time, with one color serving as the base that the other two are aligned to. As a result, the choice of base color can affect the quality of the search results. Most images were aligned with red as the base, including all the extra images from the collection. Emir and Workshop aligned well with green as the base, with no other tweaking required. Church did not align well with any choice of base color.
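To make the role of the base color concrete, here is a small sketch of how the two offsets and an order string such as "rgb" or "gbr" could be turned into a final color image. The function `compose`, the dictionary layout, and the reading of each offset as a (row, column) roll are assumptions for illustration only.

```python
import numpy as np

def compose(channels, order, first_offset, second_offset):
    """Stack an RGB image from three plates.

    channels: dict with keys 'r', 'g', 'b' holding 2D arrays.
    order: three letters, base first, e.g. 'rgb' (red base) or 'gbr'.
    first_offset / second_offset: (row, col) rolls applied to the
    second and third plates named in `order`.
    """
    base, first, second = order
    aligned = {
        base: channels[base],  # the base plate is left untouched
        first: np.roll(channels[first], first_offset, axis=(0, 1)),
        second: np.roll(channels[second], second_offset, axis=(0, 1)),
    }
    # Always emit channels in R, G, B order regardless of which was base.
    return np.dstack([aligned["r"], aligned["g"], aligned["b"]])
```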
As suggested in the hints, the borders of the images are either a dark lining or have poor contrast. Since alignment only needs a few distinct features, it works better when these uninformative borders are cropped out before the MSE search. When presenting the final image, it is fine to apply the derived shifts to the raw, uncropped images, because the original images are all the same size and are all cropped by the same amount.
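One way to express this idea in code is to search on cropped interiors and then apply the resulting shift to the raw plate. The sketch below assumes a fixed 10% crop on every side and hypothetical names (`crop_interior`, `align_on_interior`); the crop fraction actually used in the project is not stated.

```python
import numpy as np

def crop_interior(img, frac=0.1):
    """Drop `frac` of the height and width from each side."""
    h, w = img.shape[:2]
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def align_on_interior(moving, base, radius=15, frac=0.1):
    """Find the best (row, col) shift using only the cropped interiors."""
    m, b = crop_interior(moving, frac), crop_interior(base, frac)
    best, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            score = np.linalg.norm(np.roll(m, (dy, dx), axis=(0, 1)) - b)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best

def apply_shift(raw, shift):
    """Both plates were cropped identically, so the shift carries over."""
    return np.roll(raw, shift, axis=(0, 1))
```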

Image Specs

As a side note, despite following the recommendation to use area interpolation when lowering resolution, the compressed tif images still show some aliasing, especially the self-portrait.
Image | Offset Order (base, 1st, 2nd) | First Offset | Second Offset | Runtime (s)
--- | --- | --- | --- | ---
tobolsk.jpg | rgb | (-4, -1) | (-6, -3) | 1.724
harvesters.tif | rgb | (-64, 4) | (-124, -12) | 1.678
monastery.jpg | rgb | (-6, -1) | (-3, -2) | 1.712
lady.tif | rgb | (-64, -4) | (-116, -12) | 1.168
church.tif | rgb | (-32, 8) | (280, 528) | 1.587
icon.tif | rgb | (-48, -4) | (-88, -24) | 1.557
onion_church.tif | rgb | (-56, -12) | (-108, -36) | 1.831
cathedral.jpg | rgb | (-7, -1) | (-12, -3) | 1.678
melons.tif | rgb | (-96, -4) | (-180, -12) | 1.723
self_portrait.tif | rgb | (-96, -8) | (-176, -36) | 1.14
emir.tif | gbr | (-48, -24) | (56, 16) | 1.457
workshop.tif | gbr | (-52, 0) | (52, -12) | 1.677
train.tif | rgb | (-44, -28) | (-88, -32) | 1.1
three_generations.tif | rgb | (-60, 4) | (-112, -12) | 1.262
boat.tif | rgb | (-120, 8) | (-132, 12) | 1.68
pidma.tif | rgb | (-28, -20) | (-36, -40) | 1.073
cross.tif | rgb | (-28, 4) | (-48, -4) | 1.698

Bells & Whistles

Aligning over different colors: covered under My Approach above (red as the default base, green for Emir and Workshop).
Histogram Equalization: The color channels differ considerably in brightness, so it can make sense to balance their brightness before searching. I first tried Contrast-Limited Adaptive Histogram Equalization (CLAHE) using the OpenCV implementation, and this did allow Church to align properly. Later, to speed up processing, I switched to plain histogram equalization, which also worked fine. I suspect that histogram equalization helped with Church specifically because, looking at the final image, the tint is very blue, so the blue channel clearly has a brighter average.
Normalization: Instead of histogram equalization, statistically normalizing each color channel of Church also works well, for the same reasons as equalization. Specifically, the normalization is computed over each color individually, so that each channel has mean brightness 0 and standard deviation 1.
Automatic cropping: Since alignment is done by rolling, an easy way to crop automatically is to remove the rolled amount from the final image. Note that this does not account for the black border originally present along some sides, so some color streaks remain on the sides after automatic cropping. Below are the results of doing this.

Preprocessing for Church.tif

Panels: None, CLAHE, Histogram Equalization, Statistical Normalization
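The two simpler preprocessing variants compared here, global histogram equalization and per-channel statistical normalization, can be sketched in plain NumPy as below. This is not the OpenCV code used in the project (OpenCV provides `cv2.equalizeHist` and `cv2.createCLAHE` for this purpose); the function names and the assumption that plates are float arrays in [0, 1] are mine.

```python
import numpy as np

def equalize_hist(img, bins=256):
    """Global histogram equalization for a float image in [0, 1].

    Each pixel is mapped through the image's empirical CDF, so the
    output brightness is spread roughly uniformly over [0, 1].
    """
    hist, edges = np.histogram(img.ravel(), bins=bins, range=(0.0, 1.0))
    cdf = hist.cumsum() / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    return np.interp(img.ravel(), centers, cdf).reshape(img.shape)

def standardize(channel, eps=1e-8):
    """Shift and scale one channel to mean 0 and standard deviation 1."""
    channel = channel.astype(np.float64)
    return (channel - channel.mean()) / (channel.std() + eps)
```

Either function would be applied to each plate separately before the alignment search, while the final image is still assembled from the unprocessed plates.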

Automatic Cropping
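The automatic cropping described under Bells & Whistles, removing whatever was rolled, might be written as the following sketch. The function name `crop_rolled` and the (row, column) offset convention are assumptions; as noted earlier, this does not remove the original black borders.

```python
import numpy as np

def crop_rolled(img, shifts):
    """Trim the rows and columns that wrapped around under np.roll.

    shifts: iterable of (row, col) offsets that were applied to the
    non-base plates. A positive row shift wraps rows onto the top,
    a negative one wraps rows onto the bottom; columns work the same.
    """
    shifts = list(shifts)
    top = max([0] + [dy for dy, _ in shifts])
    bottom = max([0] + [-dy for dy, _ in shifts])
    left = max([0] + [dx for _, dx in shifts])
    right = max([0] + [-dx for _, dx in shifts])
    h, w = img.shape[:2]
    return img[top:h - bottom, left:w - right]
```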