Project 1: Colorizing the Prokudin-Gorskii photo collection
Overview
Our goal for this project is to take digitized Prokudin-Gorskii glass plate images and produce a color image by
aligning the 3 plates representing the 3 RGB channels. Naively overlapping the plates results in a very low quality color image, so we must apply some
displacement to the red and green plates. To do this, we start with an exhaustive search over a window of possible (x, y) displacements,
and choose the one resulting in the lowest sum of squared differences (SSD) to the blue plate. To make this practical for
high resolution images, we use an image pyramid, searching for the best displacement from a coarse
to a fine scale.
Approach
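I started with an exhaustive search over a range [-15, 15] of possible (x, y) pixel displacements for the
red and green plates, keeping the alignments with the lowest SSD to the blue plate. This produced decent
results for the lower resolution .jpg images, but did not scale to the larger .tif images: several of them need
displacements of more than 100 pixels (see Results), far outside this window, and widening the window makes the
exhaustive search much more expensive.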
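The exhaustive search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `ssd` and `align_exhaustive`, the `radius` parameter, and the use of `np.roll` (which wraps pixels around the image edges) are all assumptions made for the example.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def align_exhaustive(plate, reference, radius=15):
    """Return the (dx, dy) shift of `plate` with the lowest SSD to `reference`.

    Hypothetical helper: tries every shift in [-radius, radius] on both axes.
    """
    best_shift, best_score = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around the edges (an assumption of this sketch)
            shifted = np.roll(plate, shift=(dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_shift, best_score = (dx, dy), score
    return best_shift
```

With the blue plate as `reference`, calling `align_exhaustive(green, blue)` and `align_exhaustive(red, blue)` would give one displacement per plate. The cost grows with the square of `radius`, which is why this approach alone becomes slow for large search windows.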
To remedy the scalability issue, I implemented an image pyramid to recursively search through possible displacements.
Each pyramid "level" is half the size of the previous one. The alignment starts at the coarsest scale (smallest image, height <= 400),
where I run the regular exhaustive search over the [-15, 15] range.
At each subsequent, finer level, the displacement found so far is doubled to match the new resolution and then refined by
searching 5 pixels on either side of it.
This procedure produced decent results for most of the .tif images.
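A recursive coarse-to-fine search along these lines might look like the sketch below. It is only an illustration under stated assumptions: the names `search`, `halve`, and `align_pyramid` are hypothetical, downscaling is done by averaging 2x2 blocks (the write-up does not say which resampling was used), and `np.roll` wraps pixels around the edges.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def search(plate, reference, center, radius):
    """Best (dx, dy) within `radius` of `center`, scored by SSD."""
    cx, cy = center
    best, best_score = center, float("inf")
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(plate, shift=(dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best, best_score = (dx, dy), score
    return best

def halve(img):
    """Downscale by 2 by averaging 2x2 blocks (one possible choice)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid(plate, reference, base_height=400,
                  base_radius=15, refine_radius=5):
    """Coarse-to-fine alignment; returns (dx, dy) at full resolution."""
    if plate.shape[0] <= base_height:
        # Coarsest level: plain exhaustive search around zero displacement.
        return search(plate, reference, (0, 0), base_radius)
    # Solve the half-size problem first, then double and refine the estimate.
    dx, dy = align_pyramid(halve(plate), halve(reference),
                           base_height, base_radius, refine_radius)
    return search(plate, reference, (2 * dx, 2 * dy), refine_radius)
```

Each level only evaluates a small window around the doubled estimate, so the full-resolution image is never searched exhaustively over the whole displacement range.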
For larger images, I also found it necessary to crop the border by 3%, which produced slightly better results.
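As a rough sketch of the border crop, assuming that "3%" means 3% of the height and width removed from each side (the write-up does not spell this out) and using a hypothetical helper name `crop_border`:

```python
import numpy as np

def crop_border(img, fraction=0.03):
    """Remove `fraction` of the height and width from every side of `img`.

    Assumption of this sketch: the same fraction is cut from all four sides.
    """
    h, w = img.shape[:2]
    dy, dx = int(h * fraction), int(w * fraction)
    return img[dy:h - dy, dx:w - dx]
```

The cropped plates would then be used when scoring candidate displacements, so that the damaged plate borders do not influence the SSD.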
Results
cathedral.jpg
g displacement: (-1, 1),
r displacement: (-1, 7)
tobolsk.jpg
g displacement: (2, 3),
r displacement: (3, 6)
monastery.jpg
g displacement: (0, -6),
r displacement: (1, 9)
icon.tif
g displacement: (17, 40),
r displacement: (22, 89)
train.tif
g displacement: (2, 43),
r displacement: (29, 87)
village.tif
g displacement: (12, 65),
r displacement: (26, 93)
harvesters.tif
g displacement: (16, 60),
r displacement: (13, 120)
three_generations.tif
g displacement: (12, 54),
r displacement: (9, 111)
onion_church.tif
g displacement: (25, 52),
r displacement: (36, 108)
workshop.tif
g displacement: (-1, 53),
r displacement: (-13, 101)
melons.tif
g displacement: (8, 82),
r displacement: (9, 169)
self_portrait.tif
g displacement: (0, 81),
r displacement: (0, 169)
lady.tif
g displacement: (-2, 55),
r displacement: (-8, 143)
emir.tif
g displacement: (24, 49),
r displacement: (38, 59)
New examples
sunset.tif
g displacement: (-41, 75),
r displacement: (-68, 114)
river.tif
g displacement: (-5, 25),
r displacement: (-14, 102)
trees.tif
g displacement: (0, 39),
r displacement: (12, 72)
Problems encountered
When I first implemented the image pyramid, the algorithm did not work well on the larger images. After some experimenting,
I found that cropping the edges of the images led to nicer results, likely because the noisy plate borders no longer contribute to the score. In addition, several parameters,
such as the number of pyramid levels, the search range at each level, and the scale factor between levels, all affected the
runtime and quality of the results. I found that a scale factor of 1/2 between levels and a range of +/- 6 pixels per iteration led to
decent outputs for most images.
Failure cases
This algorithm noticeably does not work well on emir.tif and village.tif. This is possibly due to the difference in
brightness between the three plates for these images. Specifically, for emir.tif, the blue in the man's clothing leads to a very bright value
in the blue plate and a very dark value in the red plate. Because SSD compares raw pixel values, it gives even a "good" alignment a poor score when
the plates differ this much in brightness. A similar explanation could be given for village.tif.
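A toy example may help illustrate this effect. The numbers below are made up for illustration and are not taken from emir.tif: a 1-D "scanline" in which the clothing is bright in the blue plate and dark in the red plate.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized arrays."""
    diff = np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64)
    return float(np.sum(diff * diff))

# Made-up scanline: the clothing occupies positions 2 and 3.
blue = np.array([0.2, 0.2, 1.0, 1.0, 0.2, 0.2, 0.2, 0.2])  # clothing is bright
red = np.array([0.2, 0.2, 0.0, 0.0, 0.2, 0.2, 0.2, 0.2])   # clothing is dark

aligned_score = ssd(red, blue)                 # plates correctly aligned
misaligned_score = ssd(np.roll(red, 2), blue)  # red shifted 2 pixels off

print(aligned_score, misaligned_score)
```

Here the correct alignment scores about 2.0 while the shifted one scores about 1.36, so a search that minimizes SSD would prefer the wrong displacement in this toy case.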