CS194-26 Project 1 Writeup

Overview:

This project aimed to align black-and-white images captured in the blue, green, and red color channels, drawn from the Prokudin-Gorskii collection, so that the combined output image appears colorized. A very robust alignment algorithm would need to address the many small features and anomalies that arise from the nature of the image collection, but a simple exhaustive search that maximizes a single metric already does a good job on most images. I detail such an approach, which I built in this project, below.

Approach:

The colorization scheme first split the single image containing all the frames into its three color channels. The simple, unoptimized algorithm then performed an exhaustive search over a parameterized space of candidate shifts of the red and green channels relative to the blue channel, using a metric to score how good each shift was.
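The two steps above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function names, the `radius` search window, and the use of `np.roll` (which wraps pixels around the edges) are all assumptions. The metric shown is the sum of squared differences, the first metric tried in this project.

```python
import numpy as np

def split_channels(plate):
    """Split a stacked scan into (blue, green, red) thirds, top to bottom."""
    h = plate.shape[0] // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def ssd(a, b):
    """Sum of squared differences; lower means a better match."""
    return np.sum((a.astype(np.float64) - b) ** 2)

def align_exhaustive(channel, reference, radius=15):
    """Try every (dx, dy) in a square window and return the best shift.

    `radius` is a hypothetical parameter; the shift is returned as (x, y).
    """
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll shifts rows by dy and columns by dx, wrapping at edges.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_shift, best_score = (dx, dy), score
    return best_shift
```

With this sketch, the green and red channels would each be aligned to blue by calling `align_exhaustive(green, blue)` and `align_exhaustive(red, blue)`, then applying the returned shifts before stacking the three channels.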

Two issues arose with this approach. First, the initial metric, the sum of squared differences, did not give very precise alignments; using the normalized cross-correlation (NCC) instead yielded better alignments. Second, the borders around the image frames were unrelated to the image content and interfered with the alignment algorithm. This issue was dealt with by computing the metric only on a central subsection of the picture, which still contained essentially all the structure needed for alignment.
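Both fixes can be sketched in a few lines. The version below is an assumption-laden illustration: the zero-mean form of NCC, the `keep` fraction, and the helper names are not taken from the project, which may have used a different NCC variant or crop size.

```python
import numpy as np

def center_crop(img, keep=0.5):
    """Return the central `keep` fraction of the image along each axis."""
    h, w = img.shape
    dh, dw = int(h * (1 - keep) / 2), int(w * (1 - keep) / 2)
    return img[dh:h - dh, dw:w - dw]

def ncc(a, b):
    """Zero-mean normalized cross-correlation in [-1, 1]; higher is better."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def score_shift(channel, reference, dx, dy, keep=0.5):
    """Score one candidate (dx, dy) shift using only the image interior."""
    shifted = np.roll(channel, (dy, dx), axis=(0, 1))
    return ncc(center_crop(shifted, keep), center_crop(reference, keep))
```

One plausible reason NCC helps here is that this form is unchanged when one channel is uniformly brighter or has higher contrast than the other, whereas the sum of squared differences penalizes such differences directly.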

One final note on the approach: some of the images were very large, so the naive exhaustive search was too slow. I implemented an image pyramid scheme to speed up the algorithm. The image pyramid called the algorithm recursively, aligning first on a manageably small, shrunken version of the input photo. Successive calls searched refined neighborhoods around these shift estimates, scaled up to larger and larger resizes of the original image, until the optimal alignment was finally found at the original resolution.
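A coarse-to-fine recursion of this kind might look like the sketch below. The factor-of-two downsampling by 2x2 block averaging, the `min_size`, `coarse_radius`, and `refine_radius` parameters, and all function names are assumptions made for illustration; the project's actual resize method and search radii are not stated.

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation; higher is better."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def search(channel, reference, center, radius):
    """Exhaustively search (dx, dy) shifts in a window around `center`."""
    cx, cy = center
    best, best_score = center, -np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best, best_score = (dx, dy), score
    return best

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (stand-in for a resize)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, reference, min_size=64,
                  coarse_radius=8, refine_radius=2):
    """Align at the coarsest level first, then refine at each finer level."""
    if min(channel.shape) <= min_size:
        return search(channel, reference, (0, 0), coarse_radius)
    dx, dy = align_pyramid(downsample(channel), downsample(reference),
                           min_size, coarse_radius, refine_radius)
    # A shift found at half resolution doubles at the next finer level.
    return search(channel, reference, (2 * dx, 2 * dy), refine_radius)
```

The saving comes from running the wide search only on the smallest image, while each finer level checks just a small window around the doubled estimate.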

Results:

Listed below are the results of the algorithm on different images, along with the (x, y) shifts of the red and green color channels with respect to the blue color channel.

Example images:
Cathedral shifts: green (2, 5), red (3, 12)

Monastery shifts: green (2, -3), red (2, 3)

Nativity shifts: green (1, 3), red (0, 7)

Settlers shifts: green (0, 7), red (-1, 14)

Emir shifts: green (24, 49), red (-301, 85)

Harvesters shifts: green (17, 59), red (13, 123)

Icon shifts: green (17, 41), red (23, 89)

Lady shifts: green (9, 51), red (12, 112)

Self_portrait shifts: green (29, 79), red (37, 176)

Three_generations shifts: green (14, 53), red (11, 111)

Train shifts: green (5, 42), red (32, 87)

Turkmen shifts: green (21, 56), red (28, 116)

Village shifts: green (12, 64), red (22, 137)

Other images:
Doctors shifts: green (5, 65), red (8, 142)

Tools shifts: green (1, 66), red (-7, 133)

Weights shifts: green (-1, 54), red (5, 104)

Failure cases:

Note that the only colorized image with clear defects attributable to the algorithm is the photo of the emir. To understand this failure case, look at the gray image frames:

The top frame is the blue channel, and the bottom frame is the red channel. Also note that in the image the algorithm output, the red channel is the one that is misaligned. These two observations make sense given that the algorithm maximizes the normalized cross-correlation (NCC) between two color channels. Because the emir's garb takes up a large fraction of the photo and is very bright in the blue channel but very dark in the red channel, the true alignment between the red and blue channels has a low NCC. Thus, the algorithm outputs another (incorrect) alignment instead.
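The effect can be reproduced on synthetic data. The toy example below assumes a zero-mean form of NCC, which may differ from the variant used in the project, and uses a random pattern as a stand-in for the emir's garb: one "channel" is bright exactly where the other is dark, even though the two are perfectly aligned.

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation; higher is better."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

rng = np.random.default_rng(0)
pattern = rng.random((32, 32))

blue = pattern        # region is bright in this channel
red = 1.0 - pattern   # same region, intensities inverted

same = ncc(blue, blue)      # identical content: score close to +1
inverted = ncc(blue, red)   # aligned but inverted: score close to -1
print(same, inverted)
```

Under this metric the correct alignment of such a region scores about as badly as possible, so any other shift that happens to score higher wins the search.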

Bells and whistles:

None implemented.