Project 1. Images of the Russian Empire

Raiymbek Akshulakov

Introduction

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) [Сергей Михайлович Прокудин-Горский, to his Russian friends] was a man well ahead of his time. Convinced, as early as 1907, that color photography was the wave of the future, he won the Tzar's special permission to travel across the vast Russian Empire and take color photographs of everything he saw, including the only color portrait of Leo Tolstoy. And he really photographed everything: people, buildings, landscapes, railroads, bridges... thousands of color pictures! His idea was simple: record three exposures of every scene onto a glass plate using a red, a green, and a blue filter. Never mind that there was no way to print color photographs until much later -- he envisioned special projectors to be installed in "multimedia" classrooms all across Russia where the children would be able to learn about their vast country. Alas, his plans never materialized: he left Russia in 1918, right after the revolution, never to return again. Luckily, his RGB glass plate negatives, capturing the last years of the Russian Empire, survived and were purchased in 1948 by the Library of Congress. The LoC has recently digitized the negatives and made them available on-line.

Here is a sample of the images that were taken:

Our goal is to align the three channel images into a single color image:

My approach

The first step is to crop the image by the crop value on all sides. The edges of the image are much noisier than the center, so it is better to concentrate on the center.
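A minimal sketch of the cropping step, assuming each channel is a 2-D NumPy array. The function name `crop_border` and the demo array are hypothetical and only illustrate the idea; they are not taken from the project code.

```python
import numpy as np


def crop_border(channel: np.ndarray, crop: int) -> np.ndarray:
    """Remove `crop` pixels from every side of a 2-D channel."""
    if crop <= 0:
        return channel
    h, w = channel.shape
    if 2 * crop >= min(h, w):
        raise ValueError("crop is too large for this image")
    return channel[crop:h - crop, crop:w - crop]


# Demo on a made-up 10x8 array: cropping 2 pixels per side leaves 6x4.
demo = np.arange(80, dtype=np.float64).reshape(10, 8)
cropped = crop_border(demo, 2)
print(cropped.shape)  # (6, 4)
```

The same crop value is applied to every channel so that the cropped channels keep identical shapes.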

Then I introduce another hyperparameter, the gap. We define a window that is inset by gap pixels from each boundary of the cropped image. We denote the contents of that window in the blue channel as b_window, and the algorithm tries to find the corresponding window in another channel to get the displacement. In the other channel we define the search window over which we look for the match to b_window. It is a square of side 2*gap+1 centered on the position of the lower left pixel of b_window. We go over every pixel position in that search window and use cross correlation to compare b_window in the blue channel with the rectangle of the same size in the current channel whose lower left pixel is at the current search position, as shown in the scheme above. Once we find the corresponding window by maximizing the cross correlation, we can expand that window by gap in all directions to get the image in the current channel that corresponds to the cropped blue channel. If we set crop to be greater than gap, then we won't go over the initial image boundaries.
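The search described above can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: both channels are taken to be already cropped to the same shape, the score is a normalized cross correlation (the text only says "cross correlation"), and the names `ncc` and `find_displacement` as well as the (dy, dx) sign convention are hypothetical.

```python
import numpy as np


def ncc(a: np.ndarray, b: np.ndarray) -> float:
    """Normalized cross correlation of two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def find_displacement(ref: np.ndarray, other: np.ndarray, gap: int):
    """Return (dy, dx) such that other[y + dy, x + dx] best matches ref[y, x].

    The reference window is `ref` inset by `gap` on every side; the
    candidate window slides over all (2*gap + 1)**2 positions in `other`.
    """
    h, w = ref.shape
    ref_window = ref[gap:h - gap, gap:w - gap]
    wh, ww = ref_window.shape
    best_score, best = -np.inf, (0, 0)
    for dy in range(-gap, gap + 1):
        for dx in range(-gap, gap + 1):
            y0, x0 = gap + dy, gap + dx
            candidate = other[y0:y0 + wh, x0:x0 + ww]
            score = ncc(ref_window, candidate)
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best


# Synthetic check: shift a random image by a known amount and recover it.
rng = np.random.default_rng(0)
base = rng.random((40, 50))
shifted = np.roll(base, shift=(3, -2), axis=(0, 1))
found = find_displacement(base, shifted, gap=5)
print(found)  # (3, -2)
```

The cost grows with both the window area and the number of candidate positions, which is why a large gap on a large image gets slow quickly.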

One Problem

That approach would be the best if we had infinite time, but unfortunately that is not the case for big images. It works fine with smaller images, as you can see over here, but big ones take too much time to process since the inputs to the cross correlation are very large.

Here are some examples of small image alignments:
Displacement : [3, -2]-green, [-3, -2]-red

Displacement : [-3, -3]-green, [-7, -3]-red

Displacement : [-5, -2]-green, [-12, -3]-red

Another low-resolution image from the collection:
Displacement : [-1, -2]-green, [-2, -3]-red

Image Pyramids

One possible workaround is a method called image pyramids, which reuses the brute force method we already built! As the image gets smaller, we can afford a search window that is larger relative to the image.


Imagine a dialog

So in this case, what if we just scale the image down, then get the displacement and multiply it by the scaling factor?

If we do that, we will lose some precision when scaling the displacement.

But then we will already be close to the actual values of the displacement, and that will allow us to use a smaller search window on the bigger images.

That is right. So basically our pyramid algorithm will consist of the following steps. Write it down.


Algorithm:

1. Crop the images by some crop value
2. Scale the image down by a power of two, say 2**depth
3. Do brute force alignment on that smaller image and multiply the displacement by 2**depth
4. Apply this displacement to the original image
5. Decrease the depth and decrease the gap value
6. Rinse and repeat on the already shifted image with the new depth and gap values
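The steps above can be sketched as a coarse-to-fine loop. This is only an illustration under several assumptions that do not come from the project itself: downscaling is done by block averaging, the displacement is applied with a circular `np.roll`, the gap is halved at every level, and cropping is assumed to have happened already. All function names are hypothetical.

```python
import numpy as np


def ncc(a, b):
    """Normalized cross correlation of two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def brute_force(ref, other, gap):
    """(dy, dx) such that other[y + dy, x + dx] best matches ref[y, x]."""
    h, w = ref.shape
    window = ref[gap:h - gap, gap:w - gap]
    wh, ww = window.shape
    best_score, best = -np.inf, (0, 0)
    for dy in range(-gap, gap + 1):
        for dx in range(-gap, gap + 1):
            cand = other[gap + dy:gap + dy + wh, gap + dx:gap + dx + ww]
            score = ncc(window, cand)
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best


def downscale(img, factor):
    """Shrink by an integer factor using block averaging (an assumption)."""
    h, w = img.shape
    h, w = h - h % factor, w - w % factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))


def pyramid_align(ref, other, depth, gap):
    """Coarse-to-fine alignment; returns ((dy, dx), aligned copy of other)."""
    total_dy, total_dx = 0, 0
    aligned = other
    for level in range(depth, -1, -1):
        factor = 2 ** level
        dy, dx = brute_force(downscale(ref, factor),
                             downscale(aligned, factor), gap)
        dy, dx = dy * factor, dx * factor      # back to full resolution
        # Undo the shift found so far, so aligned[y, x] lines up with ref[y, x].
        aligned = np.roll(aligned, shift=(-dy, -dx), axis=(0, 1))
        total_dy, total_dx = total_dy + dy, total_dx + dx
        gap = max(1, gap // 2)                 # smaller search at finer levels
    return (total_dy, total_dx), aligned


# Synthetic check with a displacement too large for a small brute force gap.
rng = np.random.default_rng(0)
base = rng.random((256, 256))
shifted = np.roll(base, shift=(24, -13), axis=(0, 1))
displacement, aligned = pyramid_align(base, shifted, depth=2, gap=8)
print(displacement)  # (24, -13)
```

In this sketch a gap of 8 at the coarsest level covers 32 full-resolution pixels, while the finest level only has to correct a residual of a pixel or two.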

Here you can see the results of that approach on the big images:


Displacement is [-60, -17]-green, [-124, -14]-red
Displacement is [-41, -17]-green, [-89, -23]-red
Displacement is [-51, -26]-green, [-108, -36]-red
Displacement is [-55, -8]-green, [-117, -11]-red
Displacement is [-25, -4]-green, [-58, 4]-red
Displacement is [-53, 0]-green, [-105, 12]-red
Displacement is [-49, -24]-green, [-104, -55]-red
Displacement is [-53, -14]-green, [-112, -11]-red
Displacement is [-79, -29]-green, [-176, -37]-red
Displacement is [-42, -6]-green, [-87, -32]-red

Note: the self portrait image uses a slightly higher crop value, since it is a bit bigger and needs a more centered crop.


Bells and Whistles

As you have probably noticed, one image is a bit blurrier than the others. That is Emir Bukharskii, Alim Khan, ruler of Bukhara, a country that at the time was very close to my country, Kazakhstan. Here the channels differ quite a lot in their brightness values, which is probably why the result is so much worse.

One solution to this problem rests on the fact that we do not need to rely on the brightness values, because we can look at the edges instead. For example, we can use Canny edge detection from the OpenCV Python library. Here is how it looks for the emir:

During the cross correlation step, instead of using the channels themselves we can use their edge detections. That worked out perfectly for the emir, as you can see here:
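A small sketch of why matching on edges helps when the channels disagree in brightness. To keep the example free of extra dependencies it uses a plain gradient magnitude as a stand-in for Canny; with OpenCV available, `cv2.Canny` on 8-bit channels could produce the edge maps instead. The synthetic image, the inverted second channel, and all function names are assumptions made for illustration.

```python
import numpy as np


def ncc(a, b):
    """Normalized cross correlation of two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def find_displacement(ref, other, gap):
    """(dy, dx) such that other[y + dy, x + dx] best matches ref[y, x]."""
    h, w = ref.shape
    window = ref[gap:h - gap, gap:w - gap]
    wh, ww = window.shape
    best_score, best = -np.inf, (0, 0)
    for dy in range(-gap, gap + 1):
        for dx in range(-gap, gap + 1):
            cand = other[gap + dy:gap + dy + wh, gap + dx:gap + dx + ww]
            score = ncc(window, cand)
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best


def edge_map(img):
    """Gradient magnitude; a simple stand-in for a Canny edge map."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)


# Two bright rectangles on a dark background.
base = np.zeros((60, 60))
base[15:35, 20:45] = 1.0
base[40:50, 10:25] = 0.5

# Second channel: same scene, shifted, with inverted brightness.
other = 1.0 - np.roll(base, shift=(4, -3), axis=(0, 1))

raw_result = find_displacement(base, other, gap=6)
edge_result = find_displacement(edge_map(base), edge_map(other), gap=6)
print("raw intensities:", raw_result)
print("edge maps:      ", edge_result)  # (4, -3)
```

With raw intensities the true shift gets the worst possible score here, because the brightness is inverted, while the edge maps are unaffected by the inversion and recover the shift.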