Final Pre-canned Project I: Seam Carving

Jamarcus Liu, cs194-26-adh
Yin Tang, cs194-26-acd

Overview

In this project, we reimplement seam carving, which changes the size of an image by gracefully carving out pixels in different parts of the image. In many real-world scenarios, we have to change the aspect ratio of an image or size it down to certain dimensions to fit it on a website or PowerPoint slides. Standard image scaling is oblivious to the image content and typically can be applied only uniformly. Cropping is limited since it can only remove pixels from the image periphery. Compared to those methods, seam carving provides a content-aware approach to image resizing that bypasses these geometric constraints. This approach was first proposed by Shai Avidan and Ariel Shamir in Seam Carving for Content-Aware Image Resizing.

Part I: Energy Functions

The idea of seam carving can be summarized as repeatedly finding and removing the seam with the lowest cumulative “energy” until the desired dimension has been reached. In this project, we mainly use the standard L1 magnitude of the gradient at each pixel as the energy function. However, in the bells and whistles section, we compare different energy functions and see how they perform on different images. Here are some examples of the energy calculated at each pixel:

Image 1	Image 2
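A minimal sketch of the L1 gradient energy described above, assuming NumPy and a grayscale conversion by channel averaging. The function name l1_energy and the use of np.gradient (central differences) are illustrative choices, not necessarily what we used to produce the images.

```python
import numpy as np

def l1_energy(img):
    """L1 gradient-magnitude energy: |dI/dx| + |dI/dy| at each pixel."""
    gray = img.astype(np.float64)
    if gray.ndim == 3:
        # Assumption: collapse color channels by simple averaging.
        gray = gray.mean(axis=2)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    dy, dx = np.gradient(gray)
    return np.abs(dx) + np.abs(dy)

# Tiny example: a vertical edge between columns 1 and 2.
img = np.array([[0, 0, 10, 10],
                [0, 0, 10, 10],
                [0, 0, 10, 10]], dtype=np.float64)
energy = l1_energy(img)
print(energy)
```

The energy is zero in the flat regions and nonzero only in the two columns next to the edge, so a low-energy seam would avoid cutting through the edge.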

Part II: Removing the Minimum Energy Seam

Now that we have the energy value associated with each pixel, we can find the vertical seam of lowest energy. This seam is a path from the top of the image to the bottom with the least total energy. The path also has to be 8-connected, meaning that for each pixel im[i, j], the next pixel can only be im[i + 1, j - 1], im[i + 1, j], or im[i + 1, j + 1]. Finding such a path is an exercise in dynamic programming. We traverse the image row by row and, for each pixel, record the minimum cumulative energy up to that pixel and the column index of the pixel in the previous row that achieves this minimum.
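The row-by-row traversal above can be sketched as follows. This is a plain, unoptimized version under our own naming (cumulative_energy, cost, and back are hypothetical names), meant to show the recurrence rather than reproduce our exact code.

```python
import numpy as np

def cumulative_energy(energy):
    """Return (cost, back) for vertical seams.

    cost[i, j] is the minimum cumulative energy of any 8-connected path
    from the top row down to pixel (i, j); back[i, j] is the column in
    row i - 1 that achieves it.
    """
    h, w = energy.shape
    cost = energy.astype(np.float64).copy()
    back = np.zeros((h, w), dtype=np.int64)
    for i in range(1, h):
        for j in range(w):
            # Candidate parents: columns j - 1, j, j + 1, clipped to the image.
            lo, hi = max(j - 1, 0), min(j + 1, w - 1)
            k = lo + int(np.argmin(cost[i - 1, lo:hi + 1]))
            back[i, j] = k
            cost[i, j] += cost[i - 1, k]
    return cost, back

energy = np.array([[1, 4, 3],
                   [5, 2, 6],
                   [7, 8, 1]])
cost, back = cumulative_energy(energy)
print(cost[-1])  # cumulative energies in the last row
```

In this small example the cheapest seam ends in the last column of the bottom row with total energy 1 + 2 + 1 = 4.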

We can then recover the path by backtracking from the pixel in the last row with the minimum cumulative energy. Once we have identified the pixels that belong to the seam, we can remove them via a boolean mask and indexing. Here are some successful results.

Original Image	Resized Image	Dimension (row, col)
		Original: (1000, 750) Resized: (800, 750)
		Original: (481, 700) Resized: (481, 400)
		Original: (637, 640) Resized: (437, 640)
		Original: (600, 900) Resized: (600, 750)
		Original: (609, 1500) Resized: (609, 1000)
		Original: (718, 900) Resized: (718, 750)
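The backtracking and mask-based removal described before the table can be sketched as follows. The function names and the small hand-written cost and back tables are illustrative assumptions. Removing rows instead of columns, as in the first example, can be done by applying the same routine to the transposed image.

```python
import numpy as np

def backtrack_seam(cost, back):
    """Follow back-pointers from the cheapest pixel in the last row."""
    h = cost.shape[0]
    seam = np.zeros(h, dtype=np.int64)
    seam[-1] = int(np.argmin(cost[-1]))
    for i in range(h - 1, 0, -1):
        seam[i - 1] = back[i, seam[i]]
    return seam  # seam[i] is the column to remove in row i

def remove_vertical_seam(img, seam):
    """Drop one pixel per row using a boolean mask."""
    h, w = img.shape[:2]
    mask = np.ones((h, w), dtype=bool)
    mask[np.arange(h), seam] = False
    # Boolean indexing flattens the kept pixels in row-major order, so we
    # reshape back to (h, w - 1), keeping any trailing channel axis.
    return img[mask].reshape((h, w - 1) + img.shape[2:])

# Hand-written tables for a 3x3 image (hypothetical values).
cost = np.array([[1.0, 4.0, 3.0],
                 [6.0, 3.0, 9.0],
                 [10.0, 11.0, 4.0]])
back = np.array([[0, 0, 0],
                 [0, 0, 2],
                 [1, 1, 1]])
img = np.arange(9).reshape(3, 3)

seam = backtrack_seam(cost, back)
carved = remove_vertical_seam(img, seam)
print(seam, carved.shape)
```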

Unavoidably, there are some cases where the above algorithm fails for various reasons.

Original Image	Resized Image	Dimension (row, col)
		Original: (1000, 750) Resized: (1000, 500)
		Original: (580, 900) Resized: (580, 500)
		Original: (470, 840) Resized: (470, 640)

The three pairs of images above showcase three different situations in which the seam carving algorithm might not work perfectly. In the first image, the string lights zigzag between the two buildings. Therefore, even with small changes in the column dimension, the resized image exhibits noticeable discontinuities around the string lights. The second image, of Doe Library, has a similar issue: we know from real-life experience that many pillars of the library are equidistant; however, seam carving does not work well when there is significant 3D perspective in the image. A similar problem affects the image of a street in Tokyo: over-resizing causes some lines to deform.

Bells & Whistles II: Exploring Energy Functions

In the following section, we compare and contrast the results from different energy functions. Three functions are used: L1, L2, and Harris corners.

L1 energy function: the sum of the absolute values of the horizontal and vertical derivatives.
L2 energy function: the square root of the sum of the squared horizontal and vertical derivatives.
Harris corner metric: the ratio between the trace and the determinant of the matrix M, which is the outer product of [dx dy]^T with itself.
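The three definitions can be sketched side by side as below. The list above does not spell out the window or the exact form of the ratio, so this sketch makes two assumptions: the entries of M are summed over a 3x3 box window (the outer product of a single gradient vector with itself always has zero determinant), and the response is det(M) / trace(M), one common Harris-style measure. The helper names box_sum and energies are hypothetical.

```python
import numpy as np

def box_sum(a, r=1):
    """Sum of `a` over a (2r+1) x (2r+1) window, using edge padding."""
    p = np.pad(a, r, mode="edge")
    h, w = a.shape
    out = np.zeros_like(a, dtype=np.float64)
    for di in range(2 * r + 1):
        for dj in range(2 * r + 1):
            out += p[di:di + h, dj:dj + w]
    return out

def energies(gray):
    """Return (l1, l2, harris) energy maps for a grayscale image."""
    dy, dx = np.gradient(gray.astype(np.float64))
    l1 = np.abs(dx) + np.abs(dy)
    l2 = np.sqrt(dx ** 2 + dy ** 2)
    # Windowed entries of M = sum over the window of [dx dy]^T [dx dy].
    sxx, syy, sxy = box_sum(dx * dx), box_sum(dy * dy), box_sum(dx * dy)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    # Assumption: det / trace response; the small constant avoids 0 / 0.
    harris = det / (trace + 1e-8)
    return l1, l2, harris

# A pure vertical edge versus a corner.
edge = np.zeros((6, 6))
edge[:, 3:] = 10
corner = np.zeros((6, 6))
corner[3:, 3:] = 10

_, _, harris_edge = energies(edge)
l1, l2, harris_corner = energies(corner)
print(harris_edge.max(), harris_corner.max())
```

Under these assumptions the Harris response is zero along a straight edge and positive only near the corner, while L1 and L2 both respond along the whole edge.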

Here are some results of applying them to some example pictures.

Original Image	L1	L2	Harris Corner

From the images above, we notice that the L1 and L2 energy functions produce very similar outputs. The Harris corner (HC) metric performs relatively poorly on the first two images but a little better on the third one compared to the L1/L2 metrics. This is probably because the L1 and L2 functions are in general more sensitive to edges, whereas the HC metric is more sensitive to corners. Therefore, even though only small dimension changes are applied to the second image, the HC metric produces some odd artifacts around the pillars, whereas L1/L2 are very effective at keeping the content intact. The third image, however, does not contain many clearly defined vertical edges; hence, all metrics perform similarly. In conclusion, although over-resizing will result in distortion regardless of the choice of energy function, we can use the L1/L2 energy functions when there are clear, straight edges in the image that need to be preserved, and the Harris corner metric if no such edges exist.

Reflections

This project was an enjoyable and inspiring experience for us. The original paper was easy to follow and turned a simple idea into powerful applications. There were, however, several things we wanted to try but did not have the time for. Currently, generating each image above takes about 3-5 minutes. This process could be sped up if we recalculated the energy matrix, minimum energy matrix, and backtracking matrix every 5 iterations, for instance, rather than every iteration. Of course, doing so would introduce some inaccuracies; therefore, the best use case would be either larger images or small dimension changes.