Lakshya Jain: CS 194 Project 1

The goal of this project was to colorize the negatives from the Prokudin-Gorskii photo collection by aligning their three color channels. I aligned the channels using normalized cross-correlation (NCC) as the matching score, with edge detection added later as an enhancement. Before aligning, I got rid of the borders by removing a tenth of the image from the top, bottom, left, and right, so that only the interior of the image was compared. I then searched displacements in the window [-5, 5] x [-5, 5] (the size of the window was passed in as an argument), shifting the channel being aligned against the reference channel and keeping the displacement with the maximum NCC.
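
The search described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the helper names (crop_interior, ncc, best_shift) are hypothetical, the mean-subtracted form of NCC is an assumption, and shifting with np.roll before cropping is one possible ordering.

```python
import numpy as np

def crop_interior(img, frac=0.1):
    """Drop `frac` of the height and width from each side of a 2-D image."""
    h, w = img.shape
    dh, dw = int(h * frac), int(w * frac)
    return img[dh:h - dh, dw:w - dw]

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays.

    Assumption: the mean of each array is subtracted first.
    """
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def best_shift(channel, reference, radius=5):
    """Return the (row, col) shift in [-radius, radius]^2 with the maximum NCC."""
    ref = crop_interior(reference)
    best, best_score = (0, 0), -np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Shift the whole channel, then compare interiors only.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(crop_interior(shifted), ref)
            if score > best_score:
                best, best_score = (dy, dx), score
    return best
```

Calling best_shift(green, blue) would then give the displacement to apply to the green channel with np.roll, assuming both channels are 2-D float arrays of the same shape.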

Rather than running this search at full resolution, I built an image pyramid to align the images. The number of layers was determined by the formula int(np.ceil(math.log(base_res_x, 2)) - 4), which is essentially the log base 2 of the image size, excluding the coarsest 4 layers (so the base layer is a 16 x 16 image). Each layer holds the image scaled down to the appropriate size using sktransform’s rescale function, and the image size increases by a factor of two from one layer to the next as you traverse the pyramid. A “master vector” of alignments is built up along the way: each layer contributes a correction to the channel displacement, and the accumulated displacement is scaled up by a factor of 2 and applied to the next layer using np.roll before searching for the best displacement once again on top of it. At the end, the original two channels are aligned using the resulting master alignment vector.
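
A rough sketch of this coarse-to-fine loop is shown below. It is written under several assumptions and is not the original implementation: downsample is a numpy block-averaging stand-in for the rescale call mentioned above, ncc_search is a hypothetical single-layer search, and the way layers map to scale factors (factor 2^level, with level 0 being full resolution) is a guess.

```python
import math
import numpy as np

def num_layers(base_res_x):
    # Layer-count formula quoted in the text above.
    return int(np.ceil(math.log(base_res_x, 2)) - 4)

def downsample(img, factor):
    """Block-average by an integer factor (assumed stand-in for rescale)."""
    h = (img.shape[0] // factor) * factor
    w = (img.shape[1] // factor) * factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def ncc_search(ch, ref, radius):
    """Exhaustive NCC search over [-radius, radius]^2 on the image interior."""
    dh, dw = ref.shape[0] // 10, ref.shape[1] // 10
    inner = lambda im: im[dh:im.shape[0] - dh, dw:im.shape[1] - dw]
    r = inner(ref) - inner(ref).mean()
    best, best_score = (0, 0), -np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            c = inner(np.roll(ch, (dy, dx), axis=(0, 1)))
            c = c - c.mean()
            denom = np.linalg.norm(c) * np.linalg.norm(r)
            score = (c * r).sum() / denom if denom > 0 else -np.inf
            if score > best_score:
                best, best_score = (dy, dx), score
    return best

def pyramid_align(channel, reference, radius=5, layers=None):
    """Accumulate a master (row, col) displacement from coarse to fine."""
    if layers is None:
        layers = max(num_layers(reference.shape[1]), 1)
    master = np.zeros(2, dtype=int)
    for level in range(layers - 1, -1, -1):  # coarsest layer first
        factor = 2 ** level
        # Each layer uses a fresh downsample of the original channels.
        ch = downsample(channel, factor)
        ref = downsample(reference, factor)
        master = master * 2  # carry the displacement to the finer layer
        ch = np.roll(ch, tuple(int(v) for v in master), axis=(0, 1))
        master = master + np.array(ncc_search(ch, ref, radius))
    return tuple(int(v) for v in master)
```

Doubling master before each finer layer is what keeps the per-layer search window small: only the residual left over from the coarser estimate has to fit inside [-radius, radius].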

I implemented three bells and whistles: edge detection, automatic contrasting, and white balancing. For scaling the image intensities in automatic contrasting, I first cropped a fixed amount (ten percent) off the borders and took the minimum and maximum intensity of the interior pixels. In the original, non-cropped channel, I then set everything below that minimum to zero and divided the array by (maximum - minimum). I got rid of any extraneous bright pixels by setting anything greater than the maximum in the original, non-cropped channel to 1. This ensured that the intensities stayed between 0 and 1. For white balancing, I found the channel with the maximum average intensity and scaled the other channels up to that level by multiplying each channel by max_brightness/avg(channelx_brightness), where “channelx” is the channel being scaled. Finally, edge detection was done using skimage’s sobel filter (skimage.filters.sobel): instead of computing the NCC on the channels’ raw pixels, I computed it on the channels’ edge images. This allowed me to align the Emir photo much more accurately; before, it did not align because the brightness values differ between its color channels.
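
The two intensity adjustments can be sketched along these lines. This is an illustrative version under stated assumptions, not the code used for the results: auto_contrast here is a standard min-max stretch that subtracts the interior minimum before dividing (a slight variant of the steps described above), white_balance clips its output to [0, 1], and both function names are hypothetical.

```python
import numpy as np

def auto_contrast(channel, frac=0.1):
    """Stretch a 2-D channel using the min/max of its interior pixels."""
    h, w = channel.shape
    dh, dw = int(h * frac), int(w * frac)
    interior = channel[dh:h - dh, dw:w - dw]
    lo, hi = interior.min(), interior.max()
    if hi == lo:
        # Flat interior: nothing to stretch.
        return np.zeros_like(channel, dtype=float)
    # Assumption: subtract the minimum, divide by the range, then clip so
    # border pixels outside [lo, hi] end up at exactly 0 or 1.
    return np.clip((channel - lo) / (hi - lo), 0.0, 1.0)

def white_balance(rgb):
    """Scale each channel of an (H, W, 3) image up to the brightest mean."""
    means = rgb.reshape(-1, 3).mean(axis=0)
    # Gain per channel is max_brightness / avg(channel brightness);
    # the tiny floor only guards against an all-black channel.
    gains = means.max() / np.maximum(means, 1e-12)
    # Assumption: clip so that the scaled values stay in [0, 1].
    return np.clip(rgb * gains, 0.0, 1.0)
```

Both functions expect float images with intensities in [0, 1]; the brightest channel gets a gain of 1 and is left unchanged by white_balance.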

One issue I ran into was that the image was very grainy at first; this was because I was downsampling the image to the base level of the pyramid and then rescaling that downsampled image back up. I got around this by storing copies of the original image downsampled by different factors, so that each layer had a fresh copy downsampled directly from the original. Furthermore, the image pyramid initially produced only minimal shifts and was not actually aligning the images properly. I realized later that this was entirely because I did not store a master “alignment” vector whose magnitude I could scale to carry alignments over to the next layer: a shift of 3 pixels in a 16x16 image corresponds to a shift of 12 pixels in a 64x64 image.

The results of my method on all sample images and a few extra ones are attached below. On the left, you can see the colorized image without any bells and whistles, and on the right, you can see the same image with the aforementioned enhancements applied. The displacement vectors for the green and red channels are listed above each image; blue was used as the reference channel for each of these images. The three images at the bottom (the church, the loedinoe pole, and the palace) are images downloaded from the LOC's archive that I ran my algorithm on. I had to compress many of the larger images from the normal restoration because of the 25MB file size limit, but did not have to compress the Bells and Whistles images. However, for comparison's sake, I have left the icon, train, and turkmen photos uncompressed to make it clear that the compression is not altering the results in any way.

Cathedral - Normal Restoration: green: (5, 2) red: (12, 3). Bells and Whistles (RIGHT): green: (5, 2) red: (12, 3)

Emir - Normal Restoration: green: (49, 24) red: (112, -784). Bells and Whistles (RIGHT): green: (49, 24) red: (107, 40)

Harvesters - Normal Restoration: green: (60, 16) red: (125, 14). Bells and Whistles (RIGHT): green: (60, 17) red: (124, 14)

Icon - Normal Restoration: green: (42, 17) red: (90, 23). Bells and Whistles (RIGHT): green: (42, 17) red: (90, 23)

Lady - Normal Restoration: green: (55, 8) red: (111, 10). Bells and Whistles (RIGHT): green: (56, 9) red: (120, 13)

Monastery - Normal Restoration: green: (-3, 2) red: (3, 2). Bells and Whistles (RIGHT): green: (-3, 2) red: (3, 2)

Nativity - Normal Restoration: green: (3, 1) red: (7, 1). Bells and Whistles (RIGHT): green: (3, 1) red: (8, 0)

Self Portrait - Normal Restoration: green: (79, 28) red: (176, 35). Bells and Whistles (RIGHT): green: (78, 29) red: (176, 37)

Settlers - Normal Restoration: green: (7, 0) red: (14, -1). Bells and Whistles (RIGHT): green: (7, 0) red: (14, -1)

Three Generations - Normal Restoration: green: (55, 14) red: (112, 10). Bells and Whistles (RIGHT): green: (54, 12) red: (111, 9)

Train - Normal Restoration: green: (43, 5) red: (87, 32). Bells and Whistles (RIGHT): green: (42, 2) red: (85, 29)

Turkmen - Normal Restoration: green: (56, 20) red: (114, 26). Bells and Whistles (RIGHT): green: (57, 22) red: (117, 28)

Village - Normal Restoration: green: (65, 12) red: (138, 23). Bells and Whistles (RIGHT): green: (64, 10) red: (137, 21)

Church - Normal Restoration: green: (20, 18) red: (48, 29). Bells and Whistles (RIGHT): green: (16, 19) red: (46, 29)

Loedine_pole - Normal Restoration: green: (27, 11) red: (64, 16). Bells and Whistles (RIGHT): green: (27, 12) red: (63, 15)

Palace - Normal Restoration: green: (27, 6) red: (60, 5). Bells and Whistles (RIGHT): green: (26, 4) red: (59, 4)