CS194-26 Project 1: Colorizing the Prokudin-Gorskii photo collection

Overview

This project combines three single-channel images (red, green, and blue), each slightly displaced from the others, into one consistent color image. The data set includes low-resolution pictures that can be aligned by exhaustive search, as well as high-resolution images that require a smarter algorithm.

Approach

The image is first divided into three equal parts, one per color channel. I then implemented the basic approach: an exhaustive search, within a window of [-15, 15], for the offsets that produce the lowest L2 norm between two images. I chose red as my base, aligned both blue and green to the red channel, and then stacked the three channels together. This works well for the lower-resolution jpg images but takes far too long for the bigger tiff images. Thus, I needed to use the pyramid algorithm.
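The basic approach can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names are hypothetical, the plate is assumed to be stacked top to bottom as blue, green, red, offsets are expressed as (row, column) shifts, and np.roll is used for shifting so pixels wrap around the edges.

```python
import numpy as np

def split_channels(plate):
    """Split a vertically stacked scan into three equal parts.

    Assumes the order from top to bottom is blue, green, red.
    """
    height = plate.shape[0] // 3
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

def align_exhaustive(channel, base, window=15):
    """Try every offset in [-window, window] and keep the lowest L2 score."""
    best_score, best_offset = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - base) ** 2)  # squared L2 norm
            if score < best_score:
                best_score, best_offset = score, (dy, dx)
    return best_offset

def colorize(plate, window=15):
    """Align green and blue to red, then stack into an RGB image."""
    blue, green, red = split_channels(plate)
    g_off = align_exhaustive(green, red, window)
    b_off = align_exhaustive(blue, red, window)
    green = np.roll(green, g_off, axis=(0, 1))
    blue = np.roll(blue, b_off, axis=(0, 1))
    return np.dstack([red, green, blue]), g_off, b_off
```

Each call to align_exhaustive evaluates (2 * window + 1)^2 candidate offsets over the full image, which is why the cost becomes prohibitive on the large tiff scans.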

The pyramid algorithm consists of rescaling the images to a lower resolution, aligning them, and recursively moving up to finer granularity while updating the estimates. I chose the number of pyramid levels according to the size of the image, scaling down by a factor of 2 at each level until the image is smaller than (700, 700), which is a reasonable size for exhaustive search. I then use the relative scaling at each level as the window size, so that at higher (finer) levels the window is smaller, which makes the search more efficient. In theory, higher levels don't require big window sizes because the lower levels already provide the bigger offsets. Through general testing, it turned out that using red as the base channel works the best.
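A minimal sketch of this coarse-to-fine recursion is shown below. It rests on several assumptions: the names search, halve, and align_pyramid are hypothetical, downscaling is done by averaging 2x2 blocks as a stand-in for a library rescale, shifting uses np.roll, and the window at each level is taken to be that level's scale factor, which is one reading of the description above.

```python
import numpy as np

def search(channel, base, window, center=(0, 0)):
    """Exhaustively test offsets within `window` of `center`."""
    best_score, best = np.inf, center
    for dy in range(center[0] - window, center[0] + window + 1):
        for dx in range(center[1] - window, center[1] + window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.mean((shifted - base) ** 2)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best

def halve(image):
    """Downscale by 2 by averaging 2x2 blocks (assumed stand-in for rescaling)."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    image = image[:h, :w]
    return (image[0::2, 0::2] + image[1::2, 0::2]
            + image[0::2, 1::2] + image[1::2, 1::2]) / 4.0

def align_pyramid(channel, base, max_size=700, scale=1):
    """Align at the coarsest level first, then refine at each finer level.

    `scale` is how much smaller the current level is than the original;
    it doubles on the way down and is reused as the search window.
    """
    if max(channel.shape) > max_size:
        coarse = align_pyramid(halve(channel), halve(base), max_size, scale * 2)
        # An offset of k pixels at the coarser level is 2k pixels here.
        center = (2 * coarse[0], 2 * coarse[1])
    else:
        center = (0, 0)
    return search(channel, base, window=scale, center=center)
```

Under this window rule, a pyramid with scales 8, 4, 2, and 1 can reach at most 8*8 + 4*4 + 2*2 + 1 = 85 pixels of total displacement at full resolution, so the choice of window also bounds the largest offset the search can report.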

For the L2 norm scoring, I did not include the rolled parts in my calculations, only the middle part, to give more accurate scoring. Because the compared regions then have different sizes, a raw L2 norm would be biased toward more rolling, so I use the average L2 norm instead, dividing by the number of pixels compared.
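One way to write such a score is sketched below. It assumes that "the middle part" means the region where the shifted channel genuinely overlaps the base, so wrapped-around pixels are never compared; the function name overlap_score and the (dy, dx) convention are hypothetical.

```python
import numpy as np

def overlap_score(channel, base, dy, dx):
    """Average squared difference over the non-wrapped overlap.

    Shifting `channel` by (dy, dx) places channel[i - dy, j - dx] at
    position (i, j); only positions where that index is in bounds count.
    """
    h, w = base.shape
    # Region of `base` covered by genuinely shifted pixels.
    base_rows = slice(max(dy, 0), h + min(dy, 0))
    base_cols = slice(max(dx, 0), w + min(dx, 0))
    # Matching region of the unshifted `channel`.
    chan_rows = slice(max(-dy, 0), h + min(-dy, 0))
    chan_cols = slice(max(-dx, 0), w + min(-dx, 0))
    diff = channel[chan_rows, chan_cols] - base[base_rows, base_cols]
    # Divide by the pixel count so smaller overlaps are not favored.
    return np.sum(diff ** 2) / diff.size
```

Without the division, a larger offset would sum over fewer pixels and could score lower simply because less of the image is compared.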

I also cropped 5% from each edge of each image, because the edges generally showed artifacts left over from rolling.
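The crop itself is a small slicing step. The sketch below assumes it is applied to the final stacked image; crop_border is a hypothetical name.

```python
import numpy as np

def crop_border(image, fraction=0.05):
    """Remove `fraction` of the height and width from each edge."""
    h, w = image.shape[:2]
    dy, dx = int(round(h * fraction)), int(round(w * fraction))
    return image[dy:h - dy, dx:w - dx]
```

With the default fraction, a 100 x 200 image becomes 90 x 180, and any trailing color axis is left untouched.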

Image Results

Naive Exhaustive Search:


Monastery
Green channel image displacement: x: -6 y: -1
Blue channel image displacement: x: -8 y: -1

Cathedral
Green channel image displacement: x: -7 y: 0
Blue channel image displacement: x: -8 y: 1

Tobolsk
Green channel image displacement: x: -4 y: -1
Blue channel image displacement: x: -7 y: -3

Pyramid Search:


Church
Green channel image displacement: x: -34 y: 4
Blue channel image displacement: x: -54 y: 5

Emir
Green channel image displacement: x: -85 y: -10
Blue channel image displacement: x: -85 y: -19
This image is aligned poorly because the three channels have different brightness, which the L2 norm cannot handle well.

Harvesters
Green channel image displacement: x: -66 y: 2
Blue channel image displacement: x: -85 y: -2

Icon
Green channel image displacement: x: -48 y: -5
Blue channel image displacement: x: -85 y: -22

Lady
Green channel image displacement: x: -57 y: 9
Blue channel image displacement: x: -85 y: 17

Melons
Green channel image displacement: x: -85 y: 0
Blue channel image displacement: x: -85 y: 6

Onion_church
Green channel image displacement: x: -57 y: -10
Blue channel image displacement: x: -85 y: -1

Self_portrait
Green channel image displacement: x: -85 y: 1
Blue channel image displacement: x: -85 y: 5
This image is not aligned well, possibly because it contains too many items; without a proper edge detection algorithm, the image is hard to process.

Three_generations
Green channel image displacement: x: -58 y: 0
Blue channel image displacement: x: -85 y: -5

Train
Green channel image displacement: x: -49 y: -3
Blue channel image displacement: x: -85 y: -6

Workshop
Green channel image displacement: x: -49 y: 11
Blue channel image displacement: x: -70 y: 15

Custom Examples from the Prokudin-Gorskii collection


Water
Green channel image displacement: x: 1 y: 0
Blue channel image displacement: x: -1 y: 1

Sun
Green channel image displacement: x: -85 y: 12
Blue channel image displacement: x: -85 y: 22

Field
Green channel image displacement: x: -49 y: 12
Blue channel image displacement: x: -85 y: 2