Colorizing Black and White Images

Vivian Liu, cs194-26-aaf

Explanation of Technical Implementation

NCC, Image Pyramid

    The image pyramid recursively quarters the image (halving each dimension) until it reaches a size (250px x 250px) that can be aligned efficiently with the NCC approach. At this scale, the images are aligned using only the interior 80% of each image.

    The NCC, or normalized cross-correlation, is a measure of similarity: the dot product of the normalized, vectorized images. The best alignment is the one that maximizes it. My program rolls one image over the other across a window of possible displacements, [-15, 15] pixels, and keeps the displacement with the highest score.

    After aligning at the highest level of the pyramid (the most downsampled image), the displacement is passed back down the pyramid and scaled 2x at each level. At each level the image is then realigned around that scaled displacement, with a smaller window of possible displacements ([-2, 2] rather than [-15, 15]), to refine the alignment.
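
    The coarse-to-fine recursion can be sketched as follows. This is an illustration under assumptions, not the original program: the helper names are hypothetical, downsampling is done by averaging 2x2 blocks, and the recursion stops once the longer side is at most `base_size`.

```python
import numpy as np

def score(a, b, keep=0.8):
    """Normalized dot product over the interior `keep` fraction of both images."""
    h, w = a.shape
    my, mx = int(h * (1 - keep) / 2), int(w * (1 - keep) / 2)
    a = a[my:h - my, mx:w - mx].ravel()
    b = b[my:h - my, mx:w - mx].ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def search(moving, fixed, center, radius):
    """Best (dy, dx) within `radius` of `center`, found by exhaustive search."""
    candidates = [(dy, dx)
                  for dy in range(center[0] - radius, center[0] + radius + 1)
                  for dx in range(center[1] - radius, center[1] + radius + 1)]
    return max(candidates,
               key=lambda s: score(np.roll(moving, s, axis=(0, 1)), fixed))

def halve(img):
    """Downsample by 2 along each dimension by averaging 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_align(moving, fixed, base_size=250, coarse_radius=15, fine_radius=2):
    """Align at the coarsest level, then double and refine on the way back down."""
    if max(moving.shape) <= base_size:
        return search(moving, fixed, (0, 0), coarse_radius)
    dy, dx = pyramid_align(halve(moving), halve(fixed),
                           base_size, coarse_radius, fine_radius)
    return search(moving, fixed, (2 * dy, 2 * dx), fine_radius)
```

    Each finer level only tests a 5 x 5 window around the doubled estimate, so the cost on the full-resolution .tif stays at 25 comparisons instead of 961.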

Automatic Cropping

    For cropping, I used Canny edge detection and a loose implementation of Otsu thresholding (I think). Otsu's method binarizes an image into foreground and background.

    I summed the Canny edge map row-wise and column-wise to get one value for each row and each column of each color channel. Then I looked for the indices that exceeded my threshold to determine the minimum amount of cropping needed. The threshold was simple: one standard deviation above the mean.
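
    A rough sketch of the row and column profile idea is below. Several parts are assumptions made for illustration: a gradient-magnitude map stands in for the Canny edge map so the example needs only NumPy, border lines are only looked for in the outer `band` fraction of each side, and the crop is placed just inside the innermost strong line. All names are hypothetical.

```python
import numpy as np

def edge_map(channel):
    """Stand-in for a Canny edge map: per-pixel gradient magnitude."""
    gy, gx = np.gradient(channel.astype(float))
    return np.hypot(gx, gy)

def strong_lines(profile):
    """Indices whose summed edge strength exceeds mean + 1 standard deviation."""
    return np.flatnonzero(profile > profile.mean() + profile.std())

def crop_bounds(channel, band=0.1):
    """Return (top, bottom, left, right) so that channel[top:bottom, left:right]
    drops border lines found in the outer `band` fraction of each side."""
    edges = edge_map(channel)
    bounds = []
    for axis in (1, 0):  # axis=1 gives one sum per row, axis=0 one per column
        profile = edges.sum(axis=axis)
        n = profile.size
        limit = int(n * band)
        hits = strong_lines(profile)
        near_start = hits[hits < limit]
        near_end = hits[hits >= n - limit]
        lo = int(near_start.max()) + 1 if near_start.size else 0
        hi = int(near_end.min()) if near_end.size else n
        bounds.append((lo, hi))
    (top, bottom), (left, right) = bounds
    return top, bottom, left, right
```

    With three color channels, one option is to compute the bounds per channel and keep the tightest box, so that a border visible in any one channel is removed.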

    I implemented a second image pyramid, this one purely for cropping, so that the crop amount for the large .tif images could be determined efficiently. Like the NCC image pyramid, this pyramid passes the crop amount down and scales it up by 2x at each level.

    It turned out that the best results came from running this procedure twice, once to remove white space, and the second time to remove as much artifact/actual border as possible.

Top, without autocrop. Bottom, with autocrop.

Color Contrast, Color Balance

Color balance (left), color contrast (middle), just alignment (right).

These were my simpler bells and whistles. I implemented contrast stretching, which rescales the intensity of the image so that all the values lie within a chosen percentile range.
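
Contrast stretching of this kind can be sketched in a few lines of NumPy. The function name and the 2nd/98th percentile defaults are assumptions for illustration; the percentiles actually used are not stated here.

```python
import numpy as np

def stretch_contrast(img, low_pct=2, high_pct=98):
    """Map the [low_pct, high_pct] percentile range of `img` onto [0, 1],
    clipping values that fall outside that range."""
    img = img.astype(float)
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi <= lo:  # flat image: nothing to stretch
        return img.copy()
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)
```

Because the percentiles are taken over the whole image, all three channels are rescaled by the same amount and the overall color cast is left unchanged.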

For color balance, I implemented an algorithm I found in the literature called the simplest color balance algorithm, which scales the color range of each channel so that it spans almost the entire scale.
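
As I understand the simplest color balance idea, a small fraction of the darkest and brightest pixels in each channel is saturated and the rest is stretched to the full range. The sketch below follows that reading; the function name and the 1% saturation level are assumptions, not values taken from this project.

```python
import numpy as np

def simplest_color_balance(rgb, saturate=0.01):
    """Stretch each channel independently so that all but a `saturate`
    fraction of pixels at each end spans [0, 1]."""
    out = np.empty(rgb.shape, dtype=float)
    for c in range(rgb.shape[2]):
        ch = rgb[..., c].astype(float)
        lo, hi = np.quantile(ch, [saturate, 1.0 - saturate])
        if hi > lo:
            out[..., c] = np.clip((ch - lo) / (hi - lo), 0.0, 1.0)
        else:  # flat channel: leave it unchanged
            out[..., c] = ch
    return out
```

Since each channel is stretched on its own, the relative strength of the channels changes, which is what removes a color cast and is also how the result can occasionally overshoot.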

Generally, color balance tended to look better than color contrast. However, sometimes color balance would overdo it and the image could turn out a bit yellow, so in that case I just went with the color contrast version.

Results

Note that for this image, I had to align the red and blue channels against the green one.

*Extra image

Filename	Red Displacement	Green Displacement	Crop Amount	Time
cathedral.jpg	(2, 5)	(3, 12)	18px	1s
selfportrait.tif	(29, 78)	(37, 176)	384px	15s
lady.tif	(9, 51)	(12, 111)	224px	20s
monastery.jpg	(2, -3)	(2, 3)	12px	1s
village.tif	(12, 64)	(22, 137)	224px	17s
icon.tif	(17, 41)	(23, 89)	208px	19s
nativity.jpg	(1, 3)	(0, 7)	20px	0s
harvesters.tif	(16, 59)	(13, 124)	192px	19s
emir.tif (aligned against g channel)	(17, 57)	Blue Displacement: (-24, -49)	176px	16s
settlers.jpg	(0, 7)	(-1, 14)	20px	1s
three_generations.tif	(14, 53)	(11, 112)	208px	11s
train.tif	(5, 42)	(32, 87)	160px	19s
turkmen.tif	(21, 56)	(28, 116)	256px	21s
extraplate.tif	(21, 60)	(20, 142)	192px	19s
extraplate2.tif	(23, 49)	(39, 105)	224px	22s
extraplate3.tif	(10, 60)	(18, 140)	224px	19s