Aligning Prokudin-Gorskii glass plates

Haoyan Huo

The goal of this project is to align the RGB channels of the Prokudin-Gorskii glass plates.

Method

Single-scale version

To align the RGB channels, we first implemented a single-scale version that searches for the windows (translations) of the G and R channels that best align each of them with the B channel. The B, G, and R channels are obtained by slicing the original image into three parts of equal height.
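
As a minimal sketch of the slicing step, the function below cuts a plate into three equal-height parts. It assumes the exposures are stacked top to bottom in B, G, R order and that leftover rows can be dropped when the height is not divisible by 3; the name `split_channels` is hypothetical and not taken from the original code.

```python
import numpy as np

def split_channels(plate):
    """Split a vertically stacked plate into B, G, R channels of equal height.

    Assumption: the exposures are stacked top to bottom as B, G, R.
    Rows left over when the height is not a multiple of 3 are dropped.
    """
    height = plate.shape[0] // 3
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return b, g, r
```

Each returned channel is a view into the original array, so no pixel data is copied.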

To measure the "goodness of alignment", we used SSD/NCC metrics:

SSD metric: the sum of squared differences between two images.
NCC metric: the negative of the dot product of two flattened images, so that, as with SSD, a lower value means a better alignment.
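
The two metrics could be written roughly as follows. This is a sketch, not the original implementation: the text above only mentions a negated dot product, so the mean subtraction and unit-length normalization in `neg_ncc` are assumptions based on the usual definition of normalized cross-correlation. Both function names are hypothetical.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; 0 for identical images, lower is better."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    return float(np.sum((a - b) ** 2))

def neg_ncc(a, b):
    """Negative normalized cross-correlation; -1 for a perfect match.

    Assumption: each image is flattened, mean-centered, and scaled to
    unit length before the dot product is taken.
    """
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0  # a constant image carries no alignment information
    return float(-np.dot(a, b) / denom)
```

Negating the correlation lets the same "pick the minimum" search loop work with either metric.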

One last note on computing the metrics: the borders of the channels caused a lot of trouble. In the end, I used only 70% of the original image to compute the metric values, so that the borders are left out.
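
Putting the pieces together, a single-scale search might look like the sketch below. Several details are assumptions rather than facts from this report: the 70% region is taken as a centered crop, shifts are applied with a circular `np.roll`, SSD is used as the metric, the default search radius of 15 pixels is arbitrary, and shifts are reported as (dy, dx). The names `crop_center` and `align_single_scale` are hypothetical.

```python
import numpy as np

def crop_center(img, keep=0.7):
    """Keep the central `keep` fraction of the image in each dimension."""
    h, w = img.shape
    dh = int(round(h * (1 - keep) / 2))
    dw = int(round(w * (1 - keep) / 2))
    return img[dh:h - dh, dw:w - dw]

def align_single_scale(channel, reference, radius=15, center=(0, 0)):
    """Exhaustively search shifts within `radius` of `center`.

    Returns the (dy, dx) shift of `channel` that minimizes the SSD
    against `reference`, measured on the central crop only.
    """
    ref = crop_center(reference).astype(np.float64)
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            diff = crop_center(shifted).astype(np.float64) - ref
            score = np.sum(diff ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The cost grows with the square of the search radius, which is why this approach alone is only practical for small images.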

Pyramid version

The single-scale version does not work for large images, because aligning a large image requires searching a large range of windows, which is very inefficient.

To solve this, we implemented a pyramid version in which we repeatedly scale the images down by a factor of 2 until the image dimensions are less than 400x400. We then perform a single-scale search on the smallest image and use the result as the center of the search at the next larger size. Assuming the alignment at the lower resolution is successful, the search range for the doubled-dimension image is only [-2, 2], which greatly reduces the computational burden.
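
The coarse-to-fine idea can be sketched recursively as below. This is an illustration under stated assumptions, not the original code: downscaling is done by 2x2 block averaging, the metric is SSD on the central 70%, shifts are circular and reported as (dy, dx), and the coarse search radius of 15 is arbitrary. All function names are hypothetical.

```python
import numpy as np

def score_ssd(a, b, keep=0.7):
    """SSD over the central `keep` fraction of two equally sized images."""
    h, w = a.shape
    dh, dw = int(h * (1 - keep) / 2), int(w * (1 - keep) / 2)
    return np.sum((a[dh:h - dh, dw:w - dw] - b[dh:h - dh, dw:w - dw]) ** 2)

def search(channel, reference, center, radius):
    """Best (dy, dx) shift within `radius` of `center`."""
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = score_ssd(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downscale2(img):
    """Halve both dimensions by averaging 2x2 blocks (odd edges are trimmed)."""
    h, w = img.shape
    h2, w2 = h - h % 2, w - w % 2
    return img[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, reference, min_size=400, coarse_radius=15):
    """Coarse-to-fine alignment; refines by [-2, 2] at each finer level."""
    if max(channel.shape) < min_size:
        return search(channel, reference, (0, 0), coarse_radius)
    cy, cx = align_pyramid(downscale2(channel), downscale2(reference),
                           min_size, coarse_radius)
    # A shift found at half resolution doubles at the current resolution.
    return search(channel, reference, (2 * cy, 2 * cx), 2)
```

Only the smallest level pays for the wide search; every larger level evaluates just 25 candidate shifts.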

Extras

Different features

We can certainly use the raw pixels as input to the metrics; however, this proved less successful for the "emir" image, because the RGB channels of this image differ greatly. To address this, we also computed the image gradients and used the sum of the absolute values of the gradients in the X and Y directions as the input to the metric function.
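
A gradient feature of this kind could be computed as in the sketch below. The choice of `np.gradient` (central differences) as the derivative operator is an assumption, since the report does not say which one was used, and the name `gradient_feature` is hypothetical.

```python
import numpy as np

def gradient_feature(img):
    """Sum of absolute gradients in the Y and X directions.

    Assumption: derivatives are estimated with np.gradient, which uses
    central differences in the interior and one-sided ones at the edges.
    """
    img = img.astype(np.float64)
    gy, gx = np.gradient(img)  # derivatives along axis 0 (Y) and axis 1 (X)
    return np.abs(gx) + np.abs(gy)
```

Adding a constant to a channel leaves this feature unchanged, which is one reason it can cope better than raw pixels when the channels differ in overall brightness.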

I also used the edge detection algorithm provided in the Python OpenCV package (Canny detector). Since it produces a binary image, in which each pixel is either an edge or not, we can treat the edge image as another input for computing the metrics.

Automatic border removal

The borders of these glass plates can be distracting for viewers. To remove them, we propose using gradient information. The idea is that the border has very different RGB values from the actual photograph, which creates large gradients at the boundary between the border and the image. By thresholding the gradient norms, we can find where the borders approximately start and stop, and then remove them.
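
One possible way to turn this idea into code is sketched below for a single channel with values in [0, 1]. The specifics are assumptions and not the original method: gradients are averaged along each row and column, only the outer 10% of the image is searched, the threshold of 0.2 is arbitrary, and the name `find_border_crop` is hypothetical.

```python
import numpy as np

def find_border_crop(img, threshold=0.2, max_frac=0.1):
    """Return (top, bottom, left, right) bounds that exclude the borders.

    A row or column boundary counts as a border edge when the mean
    absolute difference across it exceeds `threshold`. Only the outer
    `max_frac` of the image is searched on each side.
    """
    img = img.astype(np.float64)
    h, w = img.shape
    # Mean absolute difference between neighbouring rows and columns.
    row_edge = np.abs(np.diff(img, axis=0)).mean(axis=1)
    col_edge = np.abs(np.diff(img, axis=1)).mean(axis=0)

    def inner_start(profile, margin):
        # Innermost strong edge inside the margin, or 0 if none is found.
        hits = np.nonzero(profile[:margin] > threshold)[0]
        return int(hits[-1]) + 1 if hits.size else 0

    top = inner_start(row_edge, int(h * max_frac))
    bottom = h - inner_start(row_edge[::-1], int(h * max_frac))
    left = inner_start(col_edge, int(w * max_frac))
    right = w - inner_start(col_edge[::-1], int(w * max_frac))
    return top, bottom, left, right
```

The cropped image is then `img[top:bottom, left:right]`. Raising the threshold makes the crop more conservative, so faint borders are kept.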

Results

Single-scale version

Here are the results of the single-scale version, calculated using raw pixels, gradients, and edges.

The three images are nicely aligned. All methods give the same translation vectors except for "tobolsk", where Grad_NCC gives G(3,2) instead of G(3,3), although the difference in quality of the composite images is really hard to tell.

Image	Pixels_SSD	Pixels_NCC	Grad_NCC	Edge_NCC
cathedral	G(5,2) R(12,3)	G(5,2) R(12,3)	G(5,2) R(12,3)	G(5,2) R(12,3)
monastery	G(-3,2) R(3,2)	G(-3,2) R(3,2)	G(-3,2) R(3,2)	G(-3,2) R(3,2)
tobolsk	G(3,3) R(6,3)	G(3,3) R(6,3)	G(3,2) R(6,3)	G(3,3) R(6,3)

Pyramid version results

Here are the results of the pyramid version, calculated using raw pixels, gradients, and edges. We also added extra images from the Prokudin-Gorskii glass plate collection.

Note that the Grad method now performs better than the Pixels method on the "emir" and "lady" images: we can hardly see any misalignment in the Grad method's output. The failure of the Pixels method is largely due to the different contrasts of the RGB channels.

However, the Grad method performs worse than the Pixels method on the "onion_church", "three_generations", and "train" images. I suspect this is because these images are stretched differently in different channels, and they also contain more textures with rapidly changing gradients. The Grad method focuses more on "detailed" changes in the image, while the Pixels method can better "smooth" over the different stretchings.

Lastly, the Edge method performs reasonably well on all images. I think this is because the edge detector uses gradient information, as the Grad method does, but its post-processing, smoothing, and filtering of edges also make it perform well on images with rapidly changing gradients.

Image	Pixels_SSD	Pixels_NCC	Grad_NCC	Edge_NCC
emir	G(49,24) R(88,45)	G(49,24) R(104,55)	G(49,24) R(106,41)	G(49,24) R(107,40)
harvesters	G(60,16) R(124,13)	G(60,17) R(124,13)	G(60,14) R(124,11)	G(60,17) R(125,11)
icon	G(41,17) R(89,23)	G(41,17) R(89,23)	G(39,16) R(88,23)	G(40,16) R(89,23)
lady	G(56,8) R(110,12)	G(55,8) R(110,12)	G(57,9) R(120,13)	G(56,10) R(120,13)
melons	G(82,11) R(179,13)	G(82,10) R(178,13)	G(79,8) R(176,14)	G(80,10) R(176,14)
onion_church	G(51,26) R(108,36)	G(51,26) R(108,36)	G(51,29) R(107,34)	G(52,24) R(107,35)
self_portrait	G(79,29) R(176,36)	G(79,29) R(176,36)	G(77,29) R(174,36)	G(77,29) R(174,37)
three_generations	G(53,14) R(112,11)	G(53,14) R(112,11)	G(55,12) R(107,7)	G(56,16) R(114,12)
train	G(42,6) R(87,32)	G(42,6) R(87,32)	G(39,-3) R(85,28)	G(41,7) R(85,29)
village	G(65,12) R(138,22)	G(65,12) R(138,22)	G(64,10) R(137,21)	G(65,11) R(137,21)
workshop	G(53,0) R(102,-12)	G(53,0) R(102,-12)	G(53,-2) R(104,-13)	G(52,-1) R(102,-12)
naziya	G(16,2) R(43,4)	G(16,2) R(42,4)	G(16,2) R(40,3)	G(16,1) R(41,3)
flowers	G(23,15) R(52,21)	G(23,15) R(52,21)	G(24,15) R(52,21)	G(24,15) R(52,20)
pidma	G(50,13) R(113,19)	G(50,13) R(113,19)	G(52,11) R(113,19)	G(52,12) R(113,19)

Border removal

The results of border removal are shown in the table below. For some images, we can still see a colored border. This is due to lost pixels or insufficient exposure in certain channels. For example, the "lady" image has a reddish border at the end, probably due to a damaged glass plate. Whether we should remove this "colored border" depends on what kind of image we want, and the behavior can easily be changed by adjusting the gradient cutoff threshold. In my opinion, the colored border still captures real-world objects, so it should not be removed, and it also adds a cool old-fashioned style to the photo collection.

Original (Pixels NCC)	Processed	Original (Pixels NCC)	Processed