CS194 Project 1: Images of the Russian Empire

Image Colorization

Prince Wang
Github
linkedin

Overview

The goal of this project is to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image with as few visual artifacts as possible. The challenge is to produce the highest-quality images possible through good alignment algorithms, since displacements among the different color channels of an image degrade its quality.

Approach




I implemented the following algorithm to produce the results as shown below:

Section 1: Color Channel Alignment with Exhaustive Search Algorithm (for small images)

To align the three color channels of smaller images, I exhaustively search over a [-15, 15] window of possible displacements. I first align the green channel to the blue channel, then the red channel to the blue channel. For each candidate displacement, I shift the image accordingly (with wraparound) and compare the two channels by computing the Sum of Squared Differences (SSD) of their pixel values. In other words, SSD serves as a metric for how good a candidate alignment is. Once the best alignments for the green and red channels are found, we stack the three channels accordingly to recreate the color image.

sample result:

Cathedral.jpg (390 x 341)

Note 1: Although exhaustive search is computationally expensive, using it only on small images keeps colorization fast. For the colorization of larger images, see the next section.

Note 2: When calculating the SSD, I crop 8% of the image pixels on each side. Since the frames of the plates can contribute large differences in pixel values, this measure prevents the borders from skewing the SSD.
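A minimal NumPy sketch of this exhaustive search (function names and the exact crop handling are my own; the actual implementation may differ):

```python
import numpy as np

def align_exhaustive(channel, ref, window=15, crop_frac=0.08):
    """Find the (dy, dx) shift of `channel` that best matches `ref`.

    Tries every displacement in [-window, window]^2, scoring each
    candidate by SSD over an interior region: the outer `crop_frac`
    of each side is ignored so the plate borders don't skew the metric.
    """
    h, w = ref.shape
    ch, cw = int(h * crop_frac), int(w * crop_frac)
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            rolled = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((rolled[ch:h - ch, cw:w - cw]
                            - ref[ch:h - ch, cw:w - cw]) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The green and red channels are each aligned to blue with this search, then stacked with `np.roll` applied at the returned shifts.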

Section 2: Color Channel Alignment with Image Pyramid (for larger images)

To align the three color channels of larger images with possibly very large displacements (say, a displacement of [100, 100]), exhaustive search would be too computationally expensive. Thus we use the image pyramid method: we repeatedly scale down the image until it is sufficiently small (in my implementation, smaller than 500×500), then run exhaustive search at this coarsest scale. Once the best alignment at that level is found, we scale back up, updating the alignment at each level, until we reach the original scale of the image with the final alignment.
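A recursive sketch of the coarse-to-fine scheme, assuming the exhaustive SSD search from Section 1 (helper names and the ±2 refinement window are illustrative choices, not necessarily those of my implementation):

```python
import numpy as np

def _ssd_shift_search(channel, ref, window, crop_frac=0.08):
    """Exhaustive SSD search over shifts in [-window, window]^2 (Section 1)."""
    h, w = ref.shape
    ch, cw = int(h * crop_frac), int(w * crop_frac)
    best, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            rolled = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((rolled[ch:h - ch, cw:w - cw]
                            - ref[ch:h - ch, cw:w - cw]) ** 2)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best

def halve(img):
    """Shrink an image by averaging 2x2 blocks (a simple anti-aliased downscale)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid(channel, ref, min_size=500, window=15):
    """Coarse-to-fine alignment: recurse on half-resolution copies until the
    image is small enough for exhaustive search, then refine on the way up."""
    if max(ref.shape) <= min_size:
        return _ssd_shift_search(channel, ref, window)
    # Estimate at half resolution, then double the shift for this level.
    dy, dx = align_pyramid(halve(channel), halve(ref), min_size, window)
    dy, dx = 2 * dy, 2 * dx
    # Refine with a small local search around the scaled-up estimate.
    coarse = np.roll(channel, (dy, dx), axis=(0, 1))
    rdy, rdx = _ssd_shift_search(coarse, ref, window=2)
    return dy + rdy, dx + rdx
```

Each pyramid level only needs a small search window around the doubled coarse estimate, which is what makes this tractable on 3000-pixel-wide plates.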

sample results:

train.jpg (3714 × 3209)
three generations.jpg (3741 × 3238)
workshop.jpg (3741 × 3209)

Section 3: Image Alignment with Edge Detection (Bells & Whistles)

Although I achieved good alignment results on many images, there are still some images where alignment fails. This can be attributed to my algorithm’s inability to compare regions of the image where pixel values vary greatly across channels. For example, the naive algorithm, which compares only raw pixel values, fails to capture the similarity of a region with very high blue-channel values but very low green and red values.

To improve the algorithm, I decided to compare image similarity through gradients. Edge detection lets us circumvent the situation described above and instead focus on aligning the edges of objects in the image.

applying Roberts filter to color channels

original
filtered

The idea is to run our alignment algorithm on the filtered images instead of the raw images. This achieves great results on most images that the naive algorithm previously couldn’t align properly. I experimented with several edge filters, including the Canny and Roberts filters.
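As a sketch, here is a hand-rolled Roberts cross filter and an alignment wrapper around the Section 1 SSD search (both names are my own; a library such as scikit-image provides an equivalent `roberts` filter):

```python
import numpy as np

def roberts(img):
    """Roberts cross gradient magnitude via the two 2x2 difference kernels."""
    img = img.astype(float)
    gx = img[:-1, :-1] - img[1:, 1:]   # diagonal difference
    gy = img[:-1, 1:] - img[1:, :-1]   # anti-diagonal difference
    return np.sqrt(gx ** 2 + gy ** 2)

def _ssd_shift_search(channel, ref, window, crop_frac=0.08):
    """Exhaustive SSD search over shifts in [-window, window]^2 (Section 1)."""
    h, w = ref.shape
    ch, cw = int(h * crop_frac), int(w * crop_frac)
    best, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            rolled = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((rolled[ch:h - ch, cw:w - cw]
                            - ref[ch:h - ch, cw:w - cw]) ** 2)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best

def align_on_edges(channel, ref, window=15):
    """Run the SSD search on edge maps rather than raw intensities, so channels
    with very different brightness but the same object outlines still align."""
    return _ssd_shift_search(roberts(channel), roberts(ref), window)
```

Because the edge maps respond to object boundaries rather than absolute intensity, a region that is bright in blue but dark in red no longer dominates the score.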

improvements on small images using edge detection

raw pixel comparison
edge detection
raw pixel comparison
edge detection

improvements on larger images using edge detection

raw pixel comparison
edge detection

detail

raw pixel comparison
edge detection



raw pixel comparison
edge detection

detail

raw pixel comparison
edge detection

Section 4: Autocropping (Bells & Whistles)

I realized that although most images align well with edge detection, the frames of the color channels do not line up, which makes the image look bad. Thus I implemented code that automatically crops the borders. The algorithm is simple. As mentioned in Section 1, when calculating the SSD between two color channels I crop the image by 8% on each side; it turns out this 8% accounts for the white margin outside each channel’s border. To remove the remaining misaligned frame, we crop by the maximum of the absolute displacements of the two aligned channels, both horizontally and vertically. For example, if the displacements of the green and red channels are [-10, 20] and [7, 25] respectively, we crop the image by 10 pixels horizontally and 25 pixels vertically.
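A sketch of the crop rule (the function name is mine, and shifts are given as (dy, dx) rather than the [horizontal, vertical] convention used in the example above; the arithmetic is the same):

```python
import numpy as np

def autocrop(rgb, g_shift, r_shift):
    """Trim the misaligned frame after alignment.

    `g_shift` and `r_shift` are the (dy, dx) displacements of the green
    and red channels relative to blue. We crop by the largest absolute
    vertical displacement top and bottom, and the largest absolute
    horizontal displacement left and right.
    """
    dy = max(abs(g_shift[0]), abs(r_shift[0]))
    dx = max(abs(g_shift[1]), abs(r_shift[1]))
    h, w = rgb.shape[:2]
    return rgb[dy:h - dy, dx:w - dx]
```

Any pixel within that margin came from at most two of the three channels after shifting, so discarding it removes the colored fringe along the borders.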

uncropped
cropped

detail

Section 5: Results Demo

monastery(G[2,-3], R[2,3])
workshop(G[-1,53], R[-12,105])
tobolsk(G[3,3], R[3,6])
emir(G[24,49], R[40,107])
church(G[2,5], R[3,12])
three generations(G[12,54], R[8,111])
melons(G[13,83], R[18,173])
onion church(G[25,52], R[35,107])
train(G[1,41], R[29,85])
icon(G[17, 41], R[23,90])
village(G[10,64], R[21,137])
self portrait(G[29,78], R[37,175])
harvesters(G[18,60], R[13,124])
lady(G[9,57], R[13,120])

Section 6: Images not in the dataset (Bells & Whistles)

In addition, I tried processing some other images from the collection:

G[22, 57], R[28, 117]
G[-3, 55], R[-1, 124]
G[23, 59], R[31, 124]
G[12, 43], R[22, 104]
G[-25, 93], R[-52, 190]