The goal of this project is to take the digitized Prokudin-Gorskii glass
plate images and process them to create a colored picture. This program
extracts the three color channels from the glass plate image and aligns
the green and red filters to the blue filter as a reference. For each
image, the (x,y) displacement values that were used to align the filters
are printed under it.
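As a rough illustration of the channel extraction step, the sketch below
splits a scanned plate into three equal-height strips. It assumes the
plate stores the blue, green, and red exposures stacked from top to
bottom, which the text above does not state, and the name split_plate is
hypothetical.

```python
import numpy as np

def split_plate(plate):
    """Split a 2-D plate scan into (red, green, blue) strips.

    Assumption: the exposures are stacked vertically in B, G, R order.
    Any leftover rows (height not divisible by 3) are dropped.
    """
    height = plate.shape[0] // 3
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return red, green, blue
```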
Approach
The image set for this project can be separated into low-resolution and
high-resolution glass plate images. After dividing each low-resolution
image into its R, G, B color channels, I first cropped 30 pixels from
each border to account for any outlier pixels near the edges that may
disrupt the alignment calculations. Since the low-resolution images have
a relatively small pixel displacement, I aligned the R, G, B channels by
exhaustively searching over a window of -20 to 20 pixels of possible
displacements for the filters and scoring each one with the Sum of
Squared Differences (SSD) image matching metric, where a lower score
means a closer match. Note that SSD is applied to the green and red
filters with respect to the blue filter. After finding the displacement
with the best score, I applied the displacement values to their
respective filters and stacked the R, G, B filters together to create
the colored image.
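The low-resolution procedure can be sketched roughly as follows. This is
a minimal illustration, not the project's actual code: the function names
are hypothetical, shifts are applied with np.roll (which wraps pixels
around the edges), and the (x, y) sign convention is an assumption.

```python
import numpy as np

def crop_border(channel, margin):
    """Remove `margin` pixels from every side of a 2-D channel."""
    h, w = channel.shape
    return channel[margin:h - margin, margin:w - margin]

def ssd(a, b):
    """Sum of Squared Differences; lower means a closer match."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def align_exhaustive(channel, reference, window=20):
    """Try every (dx, dy) in [-window, window] and keep the lowest SSD."""
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # Assumption: a shift is a circular roll of rows (dy) and columns (dx).
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift

def colorize(red, green, blue, margin=30, window=20):
    """Crop, align green and red to blue, then stack into an RGB image."""
    r, g, b = (crop_border(c, margin) for c in (red, green, blue))
    g_shift = align_exhaustive(g, b, window)
    r_shift = align_exhaustive(r, b, window)
    g = np.roll(g, (g_shift[1], g_shift[0]), axis=(0, 1))
    r = np.roll(r, (r_shift[1], r_shift[0]), axis=(0, 1))
    return np.dstack([r, g, b]), g_shift, r_shift
```

With the defaults, the search evaluates 41 x 41 = 1681 candidate
displacements per channel, which is affordable for small images but
grows quickly as the window widens.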
Similar to the approach for low-resolution images, I first cropped the
borders of the high-resolution images by 300 pixels before performing any
alignment calculations. Since high-resolution images have a larger pixel
displacement than low-resolution images, an exhaustive search over the
full image becomes too slow. A more efficient approach is an image
pyramid, which represents the image at multiple scales stacked on top of
each other. The top image is the smallest version of the image and the
bottom is the original-sized image. In my implementation, each level of
the image pyramid is scaled by a factor of 0.3, and the image stops being
rescaled when its dimensions are smaller than 350 by 350 pixels. At the
top of the image pyramid, the image is small enough that its pixel
displacement can be found through an exhaustive SSD search. Using these
displacement values, we can find a new range of displacement values to
search through for the next image in the image pyramid. First, we divide
the displacement value returned from the previous image by our scale
factor. Next, we add and subtract 1 / scale factor from this value, which
gives the minimum and maximum pixel displacement values to search through
for the next image. For example, a displacement of 3 at one level maps to
3 / 0.3 = 10 at the next level, so the search there covers roughly 10
plus or minus 3.3 pixels. This process is repeated down to the original
image, which leaves the R, G, B channels aligned while each level only
searches a small region of displacement values.
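The coarse-to-fine search described above could look roughly like the
sketch below. It is an illustration under several assumptions rather than
the project's code: all function names are hypothetical, downscaling uses
simple nearest-neighbour sampling as a stand-in for a proper library
resize, shifts are circular rolls, and the non-integer margin 1 / scale
is rounded up to a whole pixel.

```python
import math
import numpy as np

def ssd(a, b):
    """Sum of Squared Differences; lower means a closer match."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def search(channel, reference, x_range, y_range):
    """Best (dx, dy) over inclusive integer ranges, scored by SSD."""
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(y_range[0], y_range[1] + 1):
        for dx in range(x_range[0], x_range[1] + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift

def downscale(image, scale):
    """Nearest-neighbour resampling (assumption: stand-in for a real resize)."""
    h, w = image.shape
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return image[np.ix_(rows, cols)]

def build_pyramid(image, scale=0.3, min_size=350):
    """Index 0 is full size; the last level is the first one under min_size."""
    levels = [image]
    while levels[-1].shape[0] >= min_size or levels[-1].shape[1] >= min_size:
        levels.append(downscale(levels[-1], scale))
    return levels

def align_pyramid(channel, reference, scale=0.3, min_size=350, window=20):
    """Exhaustive search at the coarsest level, then refine level by level."""
    chan_levels = build_pyramid(channel, scale, min_size)
    ref_levels = build_pyramid(reference, scale, min_size)
    dx, dy = search(chan_levels[-1], ref_levels[-1],
                    (-window, window), (-window, window))
    margin = math.ceil(1 / scale)  # assumption: round 1 / scale up
    finer = zip(reversed(chan_levels[:-1]), reversed(ref_levels[:-1]))
    for chan, ref in finer:
        # Map the coarse estimate to this level, then search around it.
        cx, cy = round(dx / scale), round(dy / scale)
        dx, dy = search(chan, ref,
                        (cx - margin, cx + margin), (cy - margin, cy + margin))
    return dx, dy
```

With a scale of 0.3 the margin is 4 pixels, so every level below the top
scores only 9 x 9 = 81 candidates instead of a full window.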
Challenges
Some images, such as the emir picture, failed to align properly. This is
due to inconsistent brightness values across the R, G, B channels. In
particular, the blue channel of the emir image is brighter than its red
and green channels, which results in poor SSD scores even when the
channels are aligned properly. As a result, searching for the best
displacement value with the blue channel as the reference produces the
misalignment shown in the image.
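A tiny numeric example can make the scoring problem concrete. The arrays
below are made up for illustration and are not taken from the emir image.
The example only shows that a brightness difference inflates the SSD
score of perfectly aligned channels; it does not reproduce the
misalignment itself.

```python
import numpy as np

def ssd(a, b):
    """Sum of Squared Differences; lower means a closer match."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

# Hypothetical reference channel and a half-as-bright, perfectly aligned copy.
reference = np.arange(16, dtype=np.float64).reshape(4, 4)
dimmer = 0.5 * reference

same_brightness_score = ssd(reference, reference)  # identical channels
dimmer_score = ssd(dimmer, reference)              # aligned, but darker
```

Here same_brightness_score is zero while dimmer_score is well above zero,
even though both comparisons use channels that line up exactly.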
Images
Low Resolution Samples
The green filter used (x,y) displacement values (5, 2). The red
filter used values (12, 3).
Cathedral
The green filter used (x,y) displacement values (-3, 2). The red
filter used values (3, 2).
Monastery
The green filter used (x,y) displacement values (3, 3). The red
filter used values (6, 3).