Sergey Prokudin-Gorskii was an innovative Russian photographer who experimented with color photography well ahead of his time. In the early 1900s, he sought to take photos all over Russia in the hopes of making a documentary of the Russian Empire to teach schoolchildren about the culture and diversity of the empire.
He captured hundreds of scenes, each yielding three photographs: one taken through a red filter, one through a green filter, and one through a blue filter. Though the resulting three photographs are black and white, if they are aligned and projected through filters of the same colors, they reproduce a color version of the scene.
The above is an example of three photos taken of a cathedral, each representing a different color channel. After aligning and stacking the channels (and a few other tricks), we can get the following image:
We can brute force an alignment of the red and green channels onto the blue channel by exhaustively searching over a specified window and calculating the sum of squared differences for every possible alignment in that window. We can then take the alignment with the smallest sum of squared differences loss.
I used a search window of [-15, 15] pixels for both the x and y displacement.
This approach works really well on these small JPG files, but is slow on large images due to the repeated calculation of the sum of squared differences loss.
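A minimal sketch of the exhaustive search described above, not my exact project code. The names `ssd` and `align_brute_force` are hypothetical, shifts are returned as (dy, dx), and using `np.roll` (which wraps pixels around the edges) to apply each candidate shift is an assumption made for brevity.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    # Cast to float so unsigned integer images do not wrap on subtraction.
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sum(diff ** 2))

def align_brute_force(channel, reference, window=15):
    """Return the (dy, dx) shift of `channel` with the smallest SSD to `reference`.

    Every integer shift in [-window, window] is tried on both axes.
    Assumption: shifts are applied with np.roll, so pixels wrap around.
    """
    best_shift, best_loss = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            loss = ssd(shifted, reference)
            if loss < best_loss:
                best_loss, best_shift = loss, (dy, dx)
    return best_shift
```

With a window of 15 this evaluates 31 × 31 = 961 candidate shifts, each costing one full pass over the image, which is why the search gets slow on the large files.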
In order to speed the algorithm up, we can use a coarse-to-fine image pyramid. Here we use the assumption that the alignment for two images is close to the alignment on the two downscaled images (after scaling the resulting displacement vector by 1/(sqrt(scale)) where scale is the scale factor we used when downscaling the image's resolution).
I used a 10-level image pyramid, downscaling the image resolution by a factor of 0.75 at each level. I chose 10 levels because it was a good balance between speed and quality. As you get to lower and lower resolutions, the alignments you find drift further from the best alignment for the original image.
The results from using the image pyramid are pretty remarkable, producing good results for almost every image.
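The coarse-to-fine idea can be sketched as below. This is an illustration under stated assumptions rather than my project code: it swaps in a factor-of-2 pyramid built by 2×2 block averaging (instead of the 0.75 factor and 10 levels I used) so that it needs only NumPy, and the names `downsample2`, `search`, and `align_pyramid` are hypothetical. With a different downscaling factor, the coarse estimate would be multiplied by the corresponding inverse scale rather than by 2.

```python
import numpy as np

def downsample2(img):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(float)
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, reference, center, radius):
    """Exhaustive SSD search over shifts within `radius` of `center`."""
    best_shift, best_loss = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            # Assumption: np.roll wraps pixels around the image edges.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            loss = np.sum((shifted.astype(float) - reference) ** 2)
            if loss < best_loss:
                best_loss, best_shift = loss, (dy, dx)
    return best_shift

def align_pyramid(channel, reference, levels=4, coarse_radius=4, refine_radius=1):
    """Coarse-to-fine alignment; returns (dy, dx) at full resolution."""
    if levels == 0 or min(channel.shape) < 32:
        # Base case: full search around zero on the smallest image.
        return search(channel, reference, (0, 0), coarse_radius)
    coarse = align_pyramid(downsample2(channel), downsample2(reference),
                           levels - 1, coarse_radius, refine_radius)
    # A shift of 1 pixel at half resolution is 2 pixels at this level.
    guess = (2 * coarse[0], 2 * coarse[1])
    # Only a small neighborhood around the scaled-up guess is searched.
    return search(channel, reference, guess, refine_radius)
```

The speedup comes from the fact that only the smallest image gets a wide search; every larger level checks just a handful of shifts around the scaled-up estimate.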
As learned in class, images whose histograms are closer to uniform tend to have higher contrast. We can use skimage's built-in histogram equalization function for this. For some images, the results are very good!
The images in the top row do not have histogram equalization applied, but those in the bottom row do, and the colors in the bottom row are more vibrant. I especially love the middle image of a church.
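To show what histogram equalization does under the hood, here is a NumPy-only sketch that maps each pixel through the image's normalized cumulative histogram. In the project I called skimage's function (`skimage.exposure.equalize_hist`); the function name `equalize_hist_sketch` and the choice of 256 bins here are assumptions for illustration.

```python
import numpy as np

def equalize_hist_sketch(img, nbins=256):
    """Map intensities through the image's normalized CDF.

    Output values lie in (0, 1]; a single-channel image is assumed.
    """
    img = np.asarray(img, dtype=float)
    hist, bin_edges = np.histogram(img.ravel(), bins=nbins)
    # Fraction of pixels at or below each bin.
    cdf = np.cumsum(hist) / img.size
    bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
    # Each pixel is replaced by the CDF value at its intensity.
    out = np.interp(img.ravel(), bin_centers, cdf)
    return out.reshape(img.shape)
```

A low-contrast input whose values all sit in a narrow band comes out spread across nearly the full range, which is the contrast boost visible in the equalized results.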
A naive approach would crop a fixed amount off of every side of the final image. Instead, we want to crop only the parts of the image that are either black or a solid bar of color. The approach that I used was to crop all rows and columns whose average value is less than 0.05 or greater than 0.95.
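The thresholding rule above can be sketched as follows. The function name `crop_borders` is hypothetical, and the sketch assumes float intensities in [0, 1] and averages over color channels before thresholding, which may differ from what my code did.

```python
import numpy as np

def crop_borders(img, low=0.05, high=0.95):
    """Drop rows and columns whose mean intensity is below `low` or above `high`.

    Assumes float intensities in [0, 1]; color images are averaged
    over their channels before the row and column means are taken.
    """
    img = np.asarray(img, dtype=float)
    gray = img.mean(axis=2) if img.ndim == 3 else img
    row_means = gray.mean(axis=1)
    col_means = gray.mean(axis=0)
    keep_rows = (row_means >= low) & (row_means <= high)
    keep_cols = (col_means >= low) & (col_means <= high)
    # Boolean indexing keeps only the rows and columns that pass the test.
    return img[keep_rows][:, keep_cols]
```

Note that this removes any row or column that fails the test, not just those at the edges; a stricter variant could scan inward from each side and stop at the first row or column that passes.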
I had several issues while aligning. My worst image visually was probably the self-portrait image, and I assume it's because it is a very large image with very fine features, so small alignment errors are very noticeable. Using a larger window for my base case would likely give a more visually accurate result.
I tried Canny edge detection, but it performed poorly on several images, such as lady.tif.
The problem spec asks us to align to the blue channel, but that gave me poor results on emir.tif. Aligning to green gave me better results on every image, including emir.tif.
Image title | B Displacement | R Displacement | Time taken (s) |
---|---|---|---|
workshop.tif | (-53, 0) | (51, -11) | 34.88 |
emir.tif | (-48, -24) | (54, 17) | 33.86 |
monastery.jpg | (3, -2) | (6, 1) | 0.43 |
church.tif | (-25, -4) | (33, -8) | 36.50 |
three_generations.tif | (-52, -13) | (56, -2) | 47.81 |
melons.tif | (-76, -6) | (82, 2) | 35.91 |
onion_church.tif | (-50, -27) | (54, 11) | 34.93 |
train.tif | (-42, -6) | (43, 27) | 34.72 |
tobolsk.jpg | (-3, -3) | (4, 1) | 0.49 |
icon.tif | (-41, -17) | (47, 5) | 47.12 |
cathedral.jpg | (-5, -2) | (7, 1) | 0.72 |
self_portrait.tif | (-72, -26) | (83, 6) | 39.34 |
harvesters.tif | (-57, -16) | (60, -3) | 40.98 |
lady.tif | (-52, -8) | (57, 3) | 47.16 |