Colorizing Black and White Images

Vivian Liu, cs194-26-aaf

Explanation of Technical Implementation

NCC, Image Pyramid



    The image pyramid recursively quarters the image (halving each dimension) until it reaches a size (250px x 250px) that can be aligned efficiently with the NCC approach. At this scale, the images are aligned using only the interior 80% of each image.

    The NCC, or normalized cross-correlation, is a measure of similarity: the dot product of the two images after each has been vectorized and normalized. The best alignment is the one that maximizes it. My program rolls one image around the other over a window of possible displacements, [-15, 15], and keeps the displacement with the highest score.
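The search described above can be sketched as follows. This is a minimal illustration, not the original program: it assumes each channel is a 2-D NumPy array, and the names `ncc`, `interior`, and `best_shift` are hypothetical. The score is the plain dot product of the unit-normalized vectors, without mean subtraction, to match the description.

```python
import numpy as np

def ncc(a, b):
    """Dot product of the two images after flattening and unit-normalizing."""
    a = a.ravel().astype(float)
    b = b.ravel().astype(float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def interior(img, keep=0.8):
    """Central `keep` fraction of the image along each dimension."""
    h, w = img.shape
    my, mx = int(h * (1 - keep) / 2), int(w * (1 - keep) / 2)
    return img[my:h - my, mx:w - mx]

def best_shift(moving, fixed, radius=15):
    """Try every (dy, dx) in [-radius, radius] and return the best-scoring one."""
    best, best_score = (0, 0), -np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around the border; scoring only the
            # interior keeps the wrapped strip out of the comparison.
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = ncc(interior(shifted), interior(fixed))
            if score > best_score:
                best, best_score = (dy, dx), score
    return best
```

The returned pair is the (row, column) roll that should be applied to `moving` so that it lines up with `fixed`.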

    After aligning at the highest level of the pyramid (the most downsampled image), the displacement is passed back down the pyramid and scaled by 2x at each level. At each level the image is then realigned, albeit with a smaller window of possible displacements ([-2, 2] rather than [-15, 15]), to refine the alignment.
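The coarse-to-fine procedure can be sketched as a short recursion. The sketch makes several assumptions that are not from the original program: downsampling is done by simply keeping every second pixel (`img[::2, ::2]`), the interior 80% is approximated by trimming a tenth of each side, and the names `search` and `pyramid_align` are hypothetical.

```python
import numpy as np

def ncc(a, b):
    """Dot product of the flattened, unit-normalized images."""
    a = a.ravel().astype(float)
    b = b.ravel().astype(float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def search(moving, fixed, center, radius):
    """Best (dy, dx) within `radius` of `center`, scored on the interior 80%."""
    h, w = fixed.shape
    my, mx = h // 10, w // 10  # trim 10% per side
    best, best_score = center, -np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = ncc(shifted[my:h - my, mx:w - mx], fixed[my:h - my, mx:w - mx])
            if score > best_score:
                best, best_score = (dy, dx), score
    return best

def pyramid_align(moving, fixed, min_size=250, coarse=15, fine=2):
    """Align at the coarsest level, then double and refine on the way back down."""
    if max(fixed.shape) <= min_size:
        # Most downsampled level: full search window.
        return search(moving, fixed, (0, 0), coarse)
    # Assumption: naive 2x decimation stands in for a proper rescale.
    dy, dx = pyramid_align(moving[::2, ::2], fixed[::2, ::2], min_size, coarse, fine)
    # Scale the coarser estimate by 2 and refine with the small window.
    return search(moving, fixed, (2 * dy, 2 * dx), fine)
```

Because each level only refines by a few pixels, the cost at full resolution is 25 score evaluations instead of 961 for a [-15, 15] window.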

Automatic Cropping


     For cropping, I used Canny edge detection and a loose implementation of what I believe is Otsu thresholding. Otsu's method binarizes an image into foreground and background.

I summed the Canny edge map row-wise and column-wise to get one value for each row and column of each color channel. Then I looked for the indices that met my threshold to determine the minimum amount of cropping I needed. My threshold was simple: one standard deviation above the average.
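A minimal sketch of this row and column profile idea is shown below. It takes a binary edge map as input (for example, the output of a Canny detector such as `skimage.feature.canny`) and uses the mean plus one standard deviation threshold described above. The function name `crop_bounds`, the `max_frac` limit on how far in from each side to look, and the rule of cropping up to the innermost strong line are all assumptions made for illustration.

```python
import numpy as np

def crop_bounds(edges, max_frac=0.2):
    """Return (top, bottom, left, right) crop amounts in pixels."""
    edges = np.asarray(edges, dtype=float)
    rows = edges.sum(axis=1)  # one value per row
    cols = edges.sum(axis=0)  # one value per column

    def side_crops(profile):
        n = len(profile)
        limit = int(n * max_frac)  # only look near the borders (assumption)
        strong = profile > profile.mean() + profile.std()
        head = np.flatnonzero(strong[:limit])
        tail = np.flatnonzero(strong[n - limit:])
        # Crop through the innermost strong line found on each side.
        lo = int(head[-1]) + 1 if head.size else 0
        hi = limit - int(tail[0]) if tail.size else 0
        return lo, hi

    top, bottom = side_crops(rows)
    left, right = side_crops(cols)
    return top, bottom, left, right
```

Applying it would look like `img[top:h - bottom, left:w - right]`, where `h` and `w` are the image height and width.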

     I implemented a second image pyramid, but this one was purely for cropping, so that I could efficiently determine the crop amount for .tif images. Like the NCC image pyramid, this pyramid passed the crop amount down and scaled it up by 2x at each level.

    It turned out that the best results came from running this procedure twice: once to remove white space, and a second time to remove as much of the artifacts and actual border as possible.

Top, without autocrop. Bottom, with autocrop.

Color Contrast, Color Balance


Color balance (left), color contrast (middle), just alignment (right).

     These were my simpler bells and whistles. I implemented contrast stretching, which rescales the intensity of the image so that all the values lie within a percentile range.
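Contrast stretching of this kind can be sketched in a few lines. The 2nd and 98th percentiles used as defaults here are assumptions, as is the function name `stretch_contrast`; the original percentile range is not stated.

```python
import numpy as np

def stretch_contrast(img, low_pct=2, high_pct=98):
    """Map the [low_pct, high_pct] percentile range of `img` onto [0, 1]."""
    img = np.asarray(img, dtype=float)
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi <= lo:
        # Flat image: nothing to stretch.
        return np.zeros_like(img)
    # Values outside the percentile range are clipped to 0 or 1.
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)
```

The percentiles are computed over the whole image at once, so all three channels are rescaled by the same amount and the overall color cast is left unchanged.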


For color balance, I implemented an algorithm I found in the literature called the simplest color balance algorithm, which scales the color range of each channel so that it spans almost the entire scale.
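The per-channel scaling can be sketched as below, assuming an H x W x 3 float image. The 1% saturation at each end and the name `simplest_color_balance` are assumptions for illustration, not the settings used for these results.

```python
import numpy as np

def simplest_color_balance(rgb, saturate_pct=1.0):
    """Stretch each channel independently so it spans [0, 1].

    The darkest and brightest `saturate_pct` percent of pixels in each
    channel are clipped before stretching.
    """
    rgb = np.asarray(rgb, dtype=float)
    # Percentiles over the two spatial axes give one (lo, hi) pair per channel.
    lo, hi = np.percentile(rgb, [saturate_pct, 100 - saturate_pct], axis=(0, 1))
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero on flat channels
    return np.clip((rgb - lo) / span, 0.0, 1.0)
```

Because every channel is stretched on its own, a channel with a narrow range is amplified more than the others, which may be one reason the result can drift toward a color cast on some images.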

Generally, color balance tended to look better than color contrast. However, color balance would sometimes overdo it and the image could turn out a bit yellow, so in those cases I just went with the color contrast version.

Results

Note that for this image, I had to align the red and blue channels against the green one.

*Extra image

*Extra image

*Extra image

Filename                 Red Displacement   Green Displacement   Crop Amount   Time
cathedral.jpg            (2, 5)             (3, 12)              18px          1s
selfportrait.tif         (29, 78)           (37, 176)            384px         15s
lady.tif                 (9, 51)            (12, 111)            224px         20s
monastery.jpg            (2, -3)            (2, 3)               12px          1s
village.tif              (12, 64)           (22, 137)            224px         17s
icon.tif                 (17, 41)           (23, 89)             208px         19s
nativity.jpg             (1, 3)             (0, 7)               20px          0s
harvesters.tif           (16, 59)           (13, 124)            192px         19s
emir.tif                 (17, 57)           Blue: (-24, -49)     176px         16s
settlers.jpg             (0, 7)             (-1, 14)             20px          1s
three_generations.tif    (14, 53)           (11, 112)            208px         11s
train.tif                (5, 42)            (32, 87)             160px         19s
turkmen.tif              (21, 56)           (28, 116)            256px         21s
extraplate.tif           (21, 60)           (20, 142)            192px         19s
extraplate2.tif          (23, 49)           (39, 105)            224px         22s
extraplate3.tif          (10, 60)           (18, 140)            224px         19s

Note: emir.tif was aligned against the g channel, so its second displacement is for the blue channel.