Introduction

The goal of this project is to recombine the R, G, and B channels that Prokudin-Gorskii captured as separate frames into a single color image, which means aligning the channels and handling large files efficiently.

My basic approach was pretty straightforward: I used normalized cross-correlation (NCC) to score each displacement in [-15, 15]. This works pretty well for the small images. To handle larger images, I implemented an image pyramid that scales by powers of 2.
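
The single-scale search described above can be sketched roughly as follows. This is an illustrative sketch rather than the code behind the results: the function names `ncc` and `best_shift`, the mean-subtracted form of NCC, the (dy, dx) ordering, and the use of `np.roll` (which wraps pixels around the border) are all assumptions.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized grayscale arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def best_shift(moving, fixed, radius=15):
    """Try every integer (dy, dx) in [-radius, radius] and keep the best NCC."""
    best, best_score = (0, 0), -np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps around the border; acceptable for a sketch.
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = ncc(shifted, fixed)
            if score > best_score:
                best, best_score = (dy, dx), score
    return best
```

The cost grows with the square of the radius times the number of pixels, which is why this alone is only practical for the small images.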

To improve speed, I made the following small optimizations:

I dynamically adjusted the number of pyramid levels based on the image resolution. In principle this balances speed against the need for fine-grained refinement.
At the top (coarsest) level, I redo the search with a larger radius if the resulting displacement suggests one is needed, since large-displacement searches are fast to recompute at that level.
Instead of aligning R and G to B, I aligned R to B, then G to AR (the aligned R). This was the magic sauce that fixed Emir for me, so I just kept doing that. My hunch is that since the blue robe is so prominent in Emir, it's easier to match R and G to each other because they are closer in intensity in that area. This meant I didn't need more complex methods such as edge detection to align Emir properly, and there is probably a way to choose the base channel optimally using a "closeness" measure.
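
The three ideas above could look something like the sketch below. Everything specific here is an assumption made for illustration: the helper names (`num_levels`, `search_with_retry`, `align_rgb`), the 256-pixel target for the coarsest level, the "result sits on the window boundary" test for widening, and the radius-doubling rule. The `search` and `align` arguments are placeholder callables standing in for whatever displacement search is used.

```python
import math
import numpy as np

def num_levels(shape, target=256, max_levels=8):
    """Halvings needed so the coarsest level's longer side is about `target` px."""
    longest = max(shape)
    if longest <= target:
        return 0
    return min(max_levels, math.ceil(math.log2(longest / target)))

def search_with_retry(search, radius=5, max_radius=40):
    """Call search(radius) -> (dy, dx); widen while the result hits the boundary."""
    while True:
        dy, dx = search(radius)
        if max(abs(dy), abs(dx)) < radius or radius >= max_radius:
            return (dy, dx), radius
        radius = min(2 * radius, max_radius)

def align_rgb(r, g, b, align):
    """Align R to B, then G to the aligned R. align(moving, fixed) -> (dy, dx)."""
    d_r = align(r, b)
    aligned_r = np.roll(r, d_r, axis=(0, 1))
    # The aligned R already lives in B's frame, so d_g is also relative to B.
    d_g = align(g, aligned_r)
    aligned_g = np.roll(g, d_g, axis=(0, 1))
    return np.dstack([aligned_r, aligned_g, b]), d_g, d_r
```

For example, under these assumed defaults `num_levels((3000, 3700))` gives 4 levels, while an image whose longer side is 390 pixels gets 1.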

As for parameters, at each pyramid level I searched displacements in [-5, 5] in both the x and y directions. For preprocessing, I kept only the middle 60% of the image when scoring, to avoid edge artifacts from the scanning and cropping process.
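
A coarse-to-fine search with these parameters might be sketched as below. This is a sketch under stated assumptions, not the original implementation: the function names, the 2x2 block-average downsampling, the mean-subtracted NCC, and the wrap-around shifting via `np.roll` are all choices made here for illustration.

```python
import numpy as np

def center_crop(img, keep=0.6):
    """Keep the central `keep` fraction of the image along each axis."""
    h, w = img.shape
    mh, mw = int(h * (1 - keep) / 2), int(w * (1 - keep) / 2)
    return img[mh:h - mh, mw:w - mw]

def ncc(a, b):
    """Mean-subtracted normalized cross-correlation."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return (a * b).sum() / denom if denom > 0 else 0.0

def local_search(moving, fixed, center, radius=5):
    """Best (dy, dx) within `radius` of `center`, scored on the central crop."""
    ref = center_crop(fixed)
    best, best_score = center, -np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            cand = center_crop(np.roll(moving, (dy, dx), axis=(0, 1)))
            score = ncc(cand, ref)
            if score > best_score:
                best, best_score = (dy, dx), score
    return best

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (odd trailing row/column dropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def pyramid_align(moving, fixed, levels, radius=5):
    """Estimate the shift at the coarsest level, then double and refine it."""
    if levels == 0:
        return local_search(moving, fixed, (0, 0), radius)
    coarse = pyramid_align(downsample(moving), downsample(fixed), levels - 1, radius)
    return local_search(moving, fixed, (2 * coarse[0], 2 * coarse[1]), radius)
```

With a radius of 5 at every level, each extra level roughly doubles the largest displacement the search can recover while adding only a fixed 11 x 11 grid of candidates.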

My algorithm clocks in at under a minute (as measured by %time in IPython) for all the example images on a semi-modern MacBook. I consider that a success, since my goal was to maximize quality within the time constraint.

Results on examples:

Note: for display purposes I applied a flat 10% crop on each side.

Displacement g:(53, -1) r:(105, -12)

Displacement g:(42, 7) r:(85, 33)

Displacement g:(3, 3) r:(7, 4)

Displacement g:(51, 15) r:(110, 13)

Displacement g:(78, 29) r:(176, 37)

Displacement g:(50, 27) r:(108, 37)

Displacement g:(-3, 2) r:(3, 3)

Displacement g:(83, 10) r:(179, 13)

Displacement g:(53, 8) r:(114, 12)

Displacement g:(41, 18) r:(90, 23)

Displacement g:(59, 17) r:(124, 15)

Displacement g:(48, 24) r:(106, 41)

Displacement g:(5, 2) r:(12, 3)

Displacement g:(33, 2) r:(98, 5)

Images of my own choosing

Displacement g:(12, 5) r:(42, -3)

Displacement g:(13, 18) r:(38, 22)

Displacement g:(-21, 15) r:(36, 22)

Displacement g:(47, 5) r:(220, 5)

Bells and whistles

The post-crop images are already pretty nice, but they look a little flat. To improve this I stretched the dynamic range, allowing at most 0.2% of the image to clip. This mostly just involved finding the 0.1th and 99.9th percentiles of the levels (R+G+B) and rescaling all the values so that those bounds map to the ends of the output range. Here are some comparison images:
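
A percentile stretch of this kind can be sketched as follows. The function name `stretch_contrast`, the float image in [0, 1], and the choice to pool the values of all three channels when computing the percentiles are assumptions; the text above could also be read as taking percentiles of the per-pixel sum R+G+B.

```python
import numpy as np

def stretch_contrast(rgb, low_pct=0.1, high_pct=99.9):
    """Linearly rescale so the given percentiles map to 0 and 1, clipping the rest."""
    # Percentiles are taken over all channel values pooled together (assumption).
    lo, hi = np.percentile(rgb, [low_pct, high_pct])
    if hi <= lo:
        return rgb.copy()  # flat image: nothing to stretch
    return np.clip((rgb - lo) / (hi - lo), 0.0, 1.0)
```

Using one pair of bounds for all three channels keeps the color balance unchanged; stretching each channel separately would shift the hues.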

Conclusion

This was really fun! I just started reading ‘A Gentleman in Moscow’ and so this time period is very interesting to me on a personal level. I hope we get more projects like this in the future.