CS194-26 Proj 1: Images of the Russian Empire

By Mingjun Lim

Overview:

In this project, we attempt to recreate color photos from the Prokudin-Gorskii photo collection using digitized copies of his glass plate images. The focus of the project was the alignment of the 3 color channel images.

Naive Alignment algorithm:

In order to align the 3 images of different colors, I started with the naive approach of comparing 2 images over a fixed displacement range. Using a fixed range of about 15 pixels, we measured the sum of squared differences (SSD) for each of the 225 possible displacements (15 x displacements by 15 y displacements) against the blue picture.

Since we used the np.roll function to shift the image to the desired displacement, pixels pushed past one edge wrap around to the opposite edge. We therefore used a filter matrix to mask out the border of each image before computing the SSD, which excludes the edge portions that were rolled over.
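The search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `interior_ssd` and `align_naive` are hypothetical, the border is excluded by cropping rather than by a filter matrix, and `radius=7` is assumed so that the window covers 15 x 15 = 225 displacements.

```python
import numpy as np

def interior_ssd(a, b, border):
    """SSD over the interior only, skipping `border` pixels on every edge."""
    h, w = a.shape
    a_in = a[border:h - border, border:w - border]
    b_in = b[border:h - border, border:w - border]
    return float(np.sum((a_in - b_in) ** 2))

def align_naive(moving, reference, radius=7):
    """Try every shift in [-radius, radius] on both axes; return the best (dx, dy)."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around, so up to `radius` rows/columns
            # next to each edge are invalid and must be left out of the score.
            shifted = np.roll(moving, shift=(dy, dx), axis=(0, 1))
            score = interior_ssd(shifted, reference, border=radius)
            if score < best_score:
                best_shift, best_score = (dx, dy), score
    return best_shift
```

Calling `align_naive(green, blue)` on two equally sized 2D float arrays returns the (x, y) shift to apply to `green` with np.roll.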

Image Pyramid algorithm

The naive algorithm worked decently well for smaller pictures, but using a hard coded value of 15 pixel displacement was less effective for larger images. In addition, calculating the SSD of these large images (tif files) was extremely computationally expensive.

To solve this problem, I implemented an image pyramid alignment method to speed up the alignment operation. The algorithm first finds an optimal displacement for scaled-down versions of both images. It then applies that displacement, scaled up accordingly, to the larger image, so it only has to search a much smaller space there to recover the detail that the scaled-down version could not resolve. This happens recursively until we reach the original-sized image, at which point the optimal displacement has been found.

For this particular implementation, we used a scaling factor of 0.5 for each layer of the pyramid. In the base case, we used a search space of 0.1*x_size of the coarsest image so we could search a consistent portion of the image regardless of its size.
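A rough sketch of this recursion is shown below, under several assumptions. The function names are hypothetical. Downscaling is done by averaging 2x2 blocks, which gives the 0.5 scaling factor but may differ from the rescaling routine actually used. The refinement radius of 2 pixels per level is an illustrative value that the write-up does not specify.

```python
import numpy as np

def downscale(img):
    """Halve each dimension by averaging 2x2 blocks (scaling factor 0.5)."""
    h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(moving, reference, center, radius):
    """Best (dx, dy) within `radius` of `center`, scored by interior SSD."""
    cx, cy = center
    border = max(abs(cx), abs(cy)) + radius  # hides every wrapped row/column
    h, w = reference.shape
    ref_in = reference[border:h - border, border:w - border]
    best_shift, best_score = center, np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(moving, shift=(dy, dx), axis=(0, 1))
            diff = shifted[border:h - border, border:w - border] - ref_in
            score = float(np.sum(diff ** 2))
            if score < best_score:
                best_shift, best_score = (dx, dy), score
    return best_shift

def align_pyramid(moving, reference, depth=3, refine_radius=2):
    """Coarse-to-fine alignment; returns the (dx, dy) shift for `moving`."""
    if depth <= 1:
        # Base case: search 10% of the coarsest image's width.
        radius = max(1, int(0.1 * reference.shape[1]))
        return search(moving, reference, (0, 0), radius)
    dx, dy = align_pyramid(downscale(moving), downscale(reference),
                           depth - 1, refine_radius)
    # A shift found at half resolution doubles at this level; refine around it.
    return search(moving, reference, (2 * dx, 2 * dy), refine_radius)
```

With this structure, the expensive wide search runs only on the smallest image, and each finer level evaluates just (2 * refine_radius + 1)^2 candidates.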

Using these values, we first used our algorithm on a depth 5 image pyramid. While this worked very quickly (taking about 10s per image), this still resulted in some blurry images. We then refined our algorithm to use a depth 3 image pyramid instead. This took about 55s per image, but we were able to achieve decently clear pictures as a result.

One small issue with using the skeleton code was that the images appeared to have a negative color (compared to the original images). To fix this, we switched the order in np.dstack from BGR to RGB color, aligning the green and blue frames to the red one.
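To illustrate the channel-order fix, here is a small sketch. It assumes the scanned plate stacks the blue, green, and red exposures from top to bottom, as in the course skeleton; `split_plate` and `compose_rgb` are hypothetical names.

```python
import numpy as np

def split_plate(plate):
    """Cut a scanned plate into thirds, assumed to be B, G, R from top to bottom."""
    h = plate.shape[0] // 3
    b, g, r = plate[:h], plate[h:2 * h], plate[2 * h:3 * h]
    return r, g, b

def compose_rgb(r, g, b):
    """Stack channels in R, G, B order, which is what image viewers expect."""
    return np.dstack([r, g, b])
```

Stacking in B, G, R order instead swaps the red and blue channels, which would produce the wrong colors described above.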

Images Produced

icon.tif: Green displacement: (-4, -48), Blue displacement: (-24, -90)

lady.tif: Green displacement: (-4, -60), Blue displacement: (-12, -116)

melons.tif: Green displacement: (-4, -96), Blue displacement: (-4, -163)

monastery.jpg: Green displacement: (0, -8), Blue displacement: (-4, -4)

onion_church.tif: Green displacement: (-12, -56), Blue displacement: (-36, -108)

self_portrait.tif: Green displacement: (-8, -97), Blue displacement: (-31, -163)

three_generations.tif: Green displacement: (3, -60), Blue displacement: (-12, -114)

tobolsk.jpg: Green displacement: (0, -4), Blue displacement: (-4, -8)

train.tif: Green displacement: (-28, -44), Blue displacement: (-32, -88)

workshop.tif: Green displacement: (-1, 52), Blue displacement: (-12, 100)

castle.tif: Green displacement: (0, -64), Blue displacement: (-4, -97)

cathedral.jpg: Green displacement: (-1, -8), Blue displacement: (-4, -12)

emir.tif: Green displacement: (-16, -56), Blue displacement: (-40, -72)

harvesters.tif: Green displacement: (12, -52), Blue displacement: (12, -104)

Extra Images

These extra images were chosen to explore how well the algorithm worked on different types of images. For gold.tif, I wanted to experiment with an image that had a full white background and intricate details in the same color. For kurmy.tif, I chose this image as it had sharp colors on a black background, a combination that was not present in the example images. For painting.tif, I thought it would be interesting to see if the colors and textures on the picture frame would be well aligned. Lastly, for roses, I wanted to experiment with how a largely green image with pops of pink would work with the algorithm.

The algorithm worked well for all the images below.

gold.tif: Green displacement: (14, -48), Blue displacement: (26, -85)

kurmy.tif: Green displacement: (19, -90), Blue displacement: (36, -116)

painting.tif: Green displacement: (-4, -41), Blue displacement: (-8, -68)

roses.jpg: Green displacement: (-16, -52), Blue displacement: (-36, -82)

Analysis of results

For the most part, the algorithm managed to align the images fairly well. Most of the images were sharp, although the color was slightly lackluster in images such as workshop and castle, which seemed a bit washed out. icon.tif also had a color issue: the column in front of the altar seemed oddly blue when it should be white. These color issues could be fixed with proper white balancing and contrast adjustment.
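As one possible form of such a correction (not something implemented in this project), the sketch below applies a gray-world white balance followed by a percentile-based contrast stretch. Both function names and the 1st/99th percentile cutoffs are assumptions, and the input is assumed to be a float RGB image with values in [0, 1].

```python
import numpy as np

def gray_world(rgb):
    """Scale each channel so its mean matches the overall mean (gray-world assumption)."""
    channel_means = rgb.reshape(-1, 3).mean(axis=0)
    gains = channel_means.mean() / np.maximum(channel_means, 1e-8)
    return np.clip(rgb * gains, 0.0, 1.0)

def stretch_contrast(img, low_pct=1, high_pct=99):
    """Linearly map the [low_pct, high_pct] percentile range onto [0, 1]."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi <= lo:
        return img.copy()  # flat image: nothing to stretch
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)
```

A typical use would be `stretch_contrast(gray_world(rgb))` on the aligned image. Gray-world can overcorrect scenes dominated by one color, so results would need to be checked by eye.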

With 2 images, emir.tif and harvesters.tif, there was some difficulty obtaining the correct alignment. In emir, the blue image can be seen to be slightly off in the y direction, creating a blue shadow. As explained in the assignment, this could be because the images to be matched do not have similar brightness values.

In the harvesters image, the generated image is not as sharp. In particular, some of the figures to the left of the middle appear slightly ghostly. This might also be attributed to the same issue where the colors in the images are very stark and do not share the same brightness values between color channels. This issue could have been fixed with a better alignment algorithm (instead of SSD). One idea I experimented with is the use of an edge detection algorithm to align the images, but I struggled to get it to work well on the images.
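For reference, one simple way the edge-based idea could look is sketched below. This is not the version attempted in the project: it assumes gradient magnitude (via np.gradient) as the edge detector, and the names `edge_map` and `align_on_edges` are hypothetical. The point is that edge strength depends on local changes rather than absolute brightness, so an inverted or offset channel yields the same edge map.

```python
import numpy as np

def edge_map(channel):
    """Gradient magnitude; unchanged if the channel is inverted or offset in brightness."""
    gy, gx = np.gradient(channel.astype(float))
    return np.hypot(gx, gy)

def align_on_edges(moving, reference, radius=4):
    """Exhaustive SSD search on edge maps; returns the (dx, dy) shift for `moving`."""
    em, er = edge_map(moving), edge_map(reference)
    border = radius + 2  # skip wrapped pixels and one-sided gradients at the edges
    h, w = er.shape
    ref_in = er[border:h - border, border:w - border]
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(em, shift=(dy, dx), axis=(0, 1))
            diff = shifted[border:h - border, border:w - border] - ref_in
            score = float(np.sum(diff ** 2))
            if score < best_score:
                best_shift, best_score = (dx, dy), score
    return best_shift
```

On real scans, noise and scratches also produce strong gradients, so some smoothing before taking the gradient would likely be needed for this to work well.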

Besides those images, this particular image pyramid algorithm did not produce sharp images when used on the smaller jpg images (monastery, cathedral, and tobolsk). This is because we only use a 10% displacement for the coarsest image of the pyramid. Since these images were small to begin with, the search space at the coarsest level was very small, resulting in a slightly blurry alignment. Using the naive algorithm yielded better results for these images, but we leave in the image pyramid versions for fairness.