Project 1: Images of the Russian Empire
CS 194-26 Spring 2020
Colorizing the Prokudin-Gorskii photo collection
Henry Xu, CS194-26-aai
Overview
In this project, I used image processing techniques to automatically recolorize digitized Prokudin-Gorskii glass plate images with special attention to minimizing visual artifacts through proper channel extraction, superposition, and alignment.
To enhance final image quality, I also implemented automatic cropping, contrasting, and white balancing. To improve alignment across all images through better features, I experimented with both a naive median-based heuristic and Sobel edge detection.
Details
I started by reading in the data from the glass plate and splitting the image in thirds, one section for each channel in blue, green, and red, respectively. To minimize the impact of noise at the borders, I performed the alignment optimization with just the middle 80% of each channel, implemented by cropping 10% from each edge. The alignment optimization took two forms: single-scale (Part 1) and multi-scale (Part 2).
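As a minimal sketch of the splitting and border-cropping step described above, assuming the scan is a 2-D NumPy array with the blue, green, and red plates stacked top to bottom (the stacking direction and the function names `split_channels` and `crop_interior` are assumptions for illustration, not the original code):

```python
import numpy as np

def split_channels(plate):
    """Split a stacked scan into (blue, green, red) thirds, top to bottom."""
    h = plate.shape[0] // 3  # height of one channel; leftover rows are dropped
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def crop_interior(channel, frac=0.10):
    """Trim `frac` of the height/width from each edge, keeping the middle 80% by default."""
    h, w = channel.shape
    dy, dx = int(h * frac), int(w * frac)
    return channel[dy:h - dy, dx:w - dx]
```

The cropped interior would only be used for scoring candidate displacements; the full channels are what get shifted and stacked into the final image.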
Part 1: Single-Scale Alignment
Single-scale alignment was performed using an exhaustive grid search over all (x, y) displacements in a previously specified range (for this project I used [-15, 15] x [-15, 15]). Using the blue channel as the base layer, I searched over all displacements of the green and red channels for the one that minimized the Sum of Squared Differences (SSD) computed on raw pixel values (see Remarks section for notes about normalized cross-correlation). SSD is defined as follows:
$$\mathrm{SSD}(A, B) = \sum (A - B)^2$$
where the subtraction is element-wise and the sum is over all elements. The objective function could thus be written as:
$$(x^*, y^*) = \arg\min_{(x, y) \in [-15, 15] \times [-15, 15]} \mathrm{SSD}\big(\mathrm{shift}(c, x, y),\, b\big)$$
where c refers to the channel we’re interested in aligning (i.e., red or green), b is the blue base channel, and shift(c, x, y) denotes c displaced by (x, y). Once the displacement corresponding to the minimum SSD was found for both the green and red channels, the last step was to overlay the now aligned channels on top of each other and display the resulting image.
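The grid search can be sketched as follows. This is an illustration under stated assumptions rather than the original implementation: displacements are applied with `np.roll` (which wraps pixels around the border), shifts are reported as (row, column) pairs, and the names `ssd` and `align_single_scale` are hypothetical.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared element-wise differences between two equal-shape arrays."""
    return np.sum((a - b) ** 2)

def align_single_scale(channel, base, radius=15):
    """Try every (dy, dx) in [-radius, radius]^2 and return the one with minimum SSD."""
    best, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Assumption: circular shift; pixels pushed off one edge re-enter on the other.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, base)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best
```

With a radius of 15 this evaluates 31 x 31 = 961 candidate shifts per channel, which is cheap for the small .jpg inputs but grows quickly with image size.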
Results
Part 2: Multi-Scale Alignment
While adequate for small images, single-scale alignment did not scale well as input image size grew. To tackle this challenge, I implemented multi-scale alignment through the use of image pyramids. The general idea is that larger searches cost much less at the coarser resolutions, so we start from the smallest image and work our way down the pyramid, refining the displacement estimate at each level and thereby avoiding large searches at the higher resolutions. The algorithm, along with some of the parameters I used, is provided below:
Results
cathedral.jpg
Green Alignment: [5 2]
Red Alignment: [12 3]
tobolsk.jpg
Green Alignment: [3 3]
Red Alignment: [6 3]
monastery.jpg
Green Alignment: [-3 2]
Red Alignment: [3 2]
1. Precompute layers of the pyramid by scaling by a factor of 1/2 until the image is smaller than 32 by 32 pixels.
2. At the smallest scale, initialize the starting displacement to be (0, 0) and perform single-scale alignment in a displacement range of [-15, 15] x [-15, 15].
3. Perform single-scale alignment on the next layer of the pyramid, using the optimized displacement found for the previous scale multiplied by 2 as the new displacement starting point and decreasing the search range by 2 on all sides.
4. Repeat step 3 until the bottom of the pyramid is reached, taking care to ensure the search range doesn’t go below [-1, 1] x [-1, 1]. Return the final displacement and image.
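The four steps of the multi-scale algorithm might look roughly like this in NumPy. Several details are assumptions made for the sketch: downscaling is done by averaging 2x2 blocks, "smaller than 32 by 32" is read as the shorter side dropping below `min_size`, shifts use wrap-around `np.roll` and are given as (row, column), and all function names are hypothetical.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (stand-in for any rescale routine)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, base, center, radius):
    """Grid search within `radius` of `center` for the (dy, dx) roll minimizing SSD."""
    best, best_score = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = np.sum((np.roll(channel, (dy, dx), axis=(0, 1)) - base) ** 2)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best

def align_pyramid(channel, base, min_size=32, radius=15):
    # Step 1: build the pyramid, full resolution first, coarsest level last.
    levels = [(channel, base)]
    while min(levels[-1][0].shape) >= min_size:
        levels.append((downsample(levels[-1][0]), downsample(levels[-1][1])))
    # Steps 2-4: walk from the coarsest level to the finest.
    shift = (0, 0)
    for i, (c, b) in enumerate(reversed(levels)):
        if i > 0:
            shift = (2 * shift[0], 2 * shift[1])  # rescale the estimate to this level
            radius = max(radius - 2, 1)           # shrink the window, never below 1
        shift = search(c, b, shift, radius)
    return shift
```

Because the estimate is doubled at every level, a small window at the coarsest scale covers a large displacement at full resolution, while the finest levels only need to correct by a pixel or two.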
Note that alignment parameters are in the form [x y].
Of special note are modifications needed for emir.tif, noted in the Remarks section below.
icon.tif
Green Alignment: [41 17]
Red Alignment: [89 23]
harvesters.tif
Green Alignment: [59 16]
Red Alignment: [123 13]
lady.tif
Green Alignment: [51 9]
Red Alignment: [112 11]
melons.tif
Green Alignment: [81 10]
Red Alignment: [178 13]
onion_church.tif
Green Alignment: [51 26]
Red Alignment: [108 36]
self_portrait.tif
Green Alignment: [78 29]
Red Alignment: [176 37]
village.tif
Green Alignment: [64 12]
Red Alignment: [137 22]
train.tif
Green Alignment: [42 5]
Red Alignment: [87 32]
workshop.tif
Green Alignment: [53 0]
Red Alignment: [105 -12]
three_generations.tif
Green Alignment: [53 14]
Red Alignment: [112 11]
Remarks
emir.tif was a bit of an odd case because the three channels have significantly different brightness values, especially in the subject’s clothing, which is predominantly blue with very little red. Applying the same raw-pixel configuration used for all the other images therefore resulted in suboptimal alignment: we end up aligning the low brightness of the clothing in the red channel with the dark portion of the wall behind the subject. We achieved markedly more successful results by choosing not to crop emir.tif and by adopting a naive median-based pixel transformation, in which all pixels in a channel greater than the median for the channel were set to 1 and all pixels less than the median were set to 0. The prevailing hypotheses for why these changes worked so well are: (1) the black border was quite consistent and accurate across the three channels for this particular image, so cropping actually had adverse effects, and (2) the stringent median cutoff reduces the chance that the varying brightness values lead the algorithm astray, replacing them with a feature that captures only general shapes and trends.
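A minimal sketch of the median-based transformation, assuming each channel is a float NumPy array. The function name `median_binarize` is hypothetical, and sending pixels exactly equal to the median to 0 is an assumption, since the text only covers values above and below it.

```python
import numpy as np

def median_binarize(channel):
    """Map pixels above the channel's median to 1.0 and all others to 0.0."""
    return (channel > np.median(channel)).astype(float)
```

The resulting mask is unchanged by any brightening or darkening that preserves pixel order within a channel, which is consistent with hypothesis (2): two plates that record the same shapes at very different exposure levels still produce comparable binary features.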
emir.tif without custom configuration
Green Alignment: [49 24]
Red Alignment: [ 26 -829]
Bells and Whistles
Automatic Cropping
Automatic White Balance
Automatic Contrasting
Sobel Edge Detection (Better Features)
Aligning and Processing Other Sources
Automatic cropping was achieved by filtering out rows and columns near the borders that contained an unnaturally high number of pixels not matching the majority of pixels in the scene. Specifically, I identified rows and columns in which more than 70% of the pixels had values outside the range spanned by the middle 70% of all pixels in the scene, and cropped every identified row or column that either lay on an edge or was connected to an edge through an unbroken run of other rows or columns to be cropped. Comparisons of a few sample images before and after cropping are below:
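One way the cropping rule could be sketched for a single 2-D brightness array is shown below. Treating "the middle 70%" as the 15th to 85th percentile range, working on one channel rather than a color image, and the name `auto_crop_bounds` are all assumptions made for illustration.

```python
import numpy as np

def auto_crop_bounds(img, line_frac=0.70, middle=0.70):
    """Return ((row_start, row_stop), (col_start, col_stop)) of the region to keep."""
    tail = 50.0 * (1.0 - middle)                      # 15 for the middle 70%
    lo, hi = np.percentile(img, [tail, 100.0 - tail])
    outlier = (img < lo) | (img > hi)                 # pixels outside the middle range
    bad_rows = outlier.mean(axis=1) > line_frac       # rows that are mostly outliers
    bad_cols = outlier.mean(axis=0) > line_frac

    def trim(bad):
        # Only strip flagged lines that form an unbroken run starting at an edge.
        start, stop = 0, len(bad)
        while start < stop and bad[start]:
            start += 1
        while stop > start and bad[stop - 1]:
            stop -= 1
        return start, stop

    return trim(bad_rows), trim(bad_cols)
```

The returned bounds can be applied as `img[r0:r1, c0:c1]`. Flagged rows or columns in the interior are left alone because they are not connected to an edge.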
Automatic white balance was implemented in two forms: grey world and white world. In grey world, we estimate the average color to be the illuminant, and scale the scene to make the illuminant grey by multiplying each channel by 0.5/a_c, where a_c is the average brightness for the channel. In white world, we estimate the brightest color to be the illuminant, and scale the scene to make the illuminant white by multiplying each channel by 1.0/a_c, where a_c is the 99th percentile for the channel (the max is not used because it is 1.0 in many of the images). Results for both methods are shown below:
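Both scalings can be sketched in a few lines, assuming a float image in [0, 1] with the color channels on the last axis. Clipping the result back to [0, 1] is an assumption not stated in the text, and the function names are hypothetical.

```python
import numpy as np

def grey_world(img):
    """Scale each channel by 0.5 / (channel mean) so the average color becomes grey."""
    means = img.reshape(-1, 3).mean(axis=0)
    return np.clip(img * (0.5 / means), 0.0, 1.0)

def white_world(img, percentile=99):
    """Scale each channel by 1.0 / (its 99th percentile) so the brightest color becomes white."""
    ref = np.percentile(img.reshape(-1, 3), percentile, axis=0)
    return np.clip(img * (1.0 / ref), 0.0, 1.0)
```

Using a high percentile rather than the maximum keeps the white-world factor from collapsing to 1.0 when a channel already contains saturated pixels.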
emir.tif
Green Alignment: [51 23]
Red Alignment: [107 39]
emir.tif with custom configuration
Green Alignment: [51 23]
Red Alignment: [107 39]
Additional Examples
Automatic contrasting involves scaling the histogram of brightnesses such that the darkest pixel is zero and the brightest pixel is one. To avoid picking reference darkest and brightest pixels that are already zero and one (in which case the scaling would have no effect), we look at only the middle 99% of pixels. We use the minimum and maximum pixel values of this range to rescale the histogram by subtracting the minimum and dividing by the difference between the max and the min. The effects are a little subtle, but notice the increase in vividness of color in the results below:
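A minimal sketch of this rescaling, assuming "the middle 99%" means the 0.5th to 99.5th percentile range of all pixel values taken together. Clipping values that fall outside that range to [0, 1] is an assumption, and `stretch_contrast` is a hypothetical name.

```python
import numpy as np

def stretch_contrast(img, middle=99.0):
    """Linearly map the middle `middle` percent of brightness values onto [0, 1]."""
    tail = (100.0 - middle) / 2.0                  # 0.5 for the middle 99%
    lo, hi = np.percentile(img, [tail, 100.0 - tail])
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)
```

Because the mapping is linear, the ordering of brightness values is preserved; only the spread between the darkest and brightest regions changes.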
As seen in the previous remark on emir.tif, the use of raw pixels for alignment unfortunately isn’t a one-size-fits-all solution. The images below experience alignment problems similar to those of emir.tif because the channels do not share similar brightness values. However, edge detection algorithms show promise by focusing on general shapes and outlines, reducing the chance that raw pixel values lead us astray. Using Sobel edge detection, which operates by approximating derivatives, we see improved generalizability of our multi-scale alignment algorithm.
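To illustrate the idea, here is a NumPy-only sketch of a Sobel gradient magnitude that could replace raw pixels as the alignment feature. The edge-replicated padding and the name `sobel_magnitude` are assumptions; a library routine would serve equally well.

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude from the 3x3 Sobel kernels, with edge-replicated borders."""
    p = np.pad(img, 1, mode="edge")
    # Horizontal derivative: right column minus left column, rows weighted 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical derivative: bottom row minus top row, columns weighted 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy)
```

Since the kernels sum to zero, a constant brightness offset between channels disappears from the feature entirely, and a brightness scale factor only rescales the edge strengths without moving them.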
Remarks
Normalized cross-correlation (NCC), defined as the dot product between the normalized versions of the channels, was another similarity measure I evaluated. Maximizing NCC produced results that were similar or identical to those from minimizing SSD.
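For reference, NCC as defined above can be written in a few lines. This sketch assumes "normalized" means dividing each flattened channel by its Euclidean norm, with no mean subtraction; `ncc` is a hypothetical name.

```python
import numpy as np

def ncc(a, b):
    """Dot product of the two channels after scaling each to unit Euclidean norm."""
    a = a.ravel() / np.linalg.norm(a)
    b = b.ravel() / np.linalg.norm(b)
    return float(a @ b)
```

Unlike SSD, this score is maximized rather than minimized, and it is unaffected by multiplying either channel by a positive constant.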
piony.tif
Green Alignment: [75 21]
Red Alignment: [156 30]
yekaterinburg.tif
Green Alignment: [58 28]
Red Alignment: [128 35]
lugano.tif
Green Alignment: [ 41 -16]
Red Alignment: [ 92 -29]
adobe.tif
Green Alignment: [32 4]
Red Alignment: [79 7]
Before
After
Before
Grey World
White World
V Malorossii.tif
Green Alignment: [21 -5]
Red Alignment: [235 -9]
V Malorossii.tif
Green Alignment: [21 -5]
Red Alignment: [ 0 342]
lodeinoe_pole_cathedral.tif
Green Alignment: [ 24 -11]
Red Alignment: [ 96 -19]
stone_egg.tif
Green Alignment: [ 61 -17]
Red Alignment: [117 -44]
Before
After
Raw Pixel Values
Sobel Edge Detection
study_near_waterfall.tif
Green Alignment: [36 13]
Red Alignment: [123 24]
study_near_waterfall.tif
Green Alignment: [36 13]
Red Alignment: [0 -9]
Hubble Legacy Archive, NGC 6543
Files:
Red: hst_06943_09_wfpc2_f673n_pc_sci.fits
Green: hst_06943_09_wfpc2_f588n_pc_sci.fits
Blue: hst_06943_09_wfpc2_f487n_pc_sci.fits
Note that these files were converted to .tif format using FITS Liberator and a linear stretch function.
Filters:
Red: 673 nm narrowband
Green: 588 nm narrowband
Blue: 487 nm narrowband
Alignment:
Green Alignment: [0 0]
Red Alignment: [-3 -2]
Alignment Problems
Normalized Cross-Correlation
Better Color Mapping
Some of the colors in the aligned images seemed a little off, which may have been due to the assumption that the original plates corresponded directly to the red, green, and blue channels. To address this issue, I tried changing the color mapping by converting the image from RGB to HSV and manually adjusting the hue. Changing the red channel seemed only to degrade quality, so only the hues for blue and green were adjusted. I couldn’t find an alternative color map that worked for everything, but I did find that shifting green and blue a touch helped bring out a bit more depth in some of the images.
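A per-pixel sketch of this kind of hue adjustment, using Python's standard `colorsys` module. The hue window and the shift amount are hypothetical parameters, since the values actually used are not given here, and `shift_hue` is a name introduced only for illustration.

```python
import colorsys

def shift_hue(rgb, hue_lo, hue_hi, delta):
    """Rotate the hue of one (r, g, b) pixel by `delta` if its hue lies in [hue_lo, hue_hi].

    All values are floats in [0, 1]; in colorsys, red has hue 0, green 1/3, blue 2/3.
    """
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    if hue_lo <= h <= hue_hi:
        h = (h + delta) % 1.0  # hue is circular, so wrap around at 1.0
    return colorsys.hsv_to_rgb(h, s, v)
```

Choosing a window such as [0.25, 0.75] would cover the green and blue hues while leaving reds untouched, in line with the observation that adjusting red only degraded quality.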
Adjusted Color Map
Original Color Map
emir.tif
Green Alignment: [49 24]
Red Alignment: [107 40]
emir.tif
Green Alignment: [49 24]
Red Alignment: [26 -829]