Final Project

Nancy Shaw, cs194-26-abo
CS 194-26 Spring 2020

Seam Carving


Seam carving is an algorithm for content-aware image resizing. It works by finding and removing a number of seams from an image, where a seam is a connected path of low-energy pixels. The algorithm was first introduced in the paper Seam Carving for Content-Aware Image Resizing (Avidan and Shamir, 2007).

Seam Removal

Vertical Seam Removal

An important part of the algorithm that was only briefly touched on above is the energy function, which determines how "important" each pixel is. Unnoticeable pixels that blend with their surroundings should have low energy values.

The energy function we use is the same as the one described in the seam carving paper, the sum of the absolute horizontal and vertical gradients at each pixel: e(I) = |∂I/∂x| + |∂I/∂y|.
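This gradient energy can be sketched in a few lines of NumPy (a minimal sketch operating on a 2-D grayscale array, not the exact project code; boundary pixels reuse the last row/column so the output keeps the input's shape):

```python
import numpy as np

def energy(gray):
    """Dual-gradient energy |dI/dx| + |dI/dy| for a 2-D grayscale image.

    Appending the last row/column before differencing clamps the
    gradient at the image boundary, so energy has the same shape as gray.
    """
    dx = np.abs(np.diff(gray, axis=1, append=gray[:, -1:]))
    dy = np.abs(np.diff(gray, axis=0, append=gray[-1:, :]))
    return dx + dy
```

For a color image one would typically sum this energy over the channels.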

A vertical seam is a connected path of pixels running from the top of the image to the bottom (one pixel per row); a horizontal seam runs from left to right (one pixel per column). Each iteration, the algorithm chooses the seam with the least total energy to remove. By repeatedly carving out or inserting seams, we can retarget the image to a new size. We use dynamic programming to efficiently find the optimal seam to remove.
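The dynamic-programming search for the minimum-energy vertical seam can be sketched as follows (a NumPy sketch under the assumption that `energy` is a 2-D array of per-pixel energies; not the project's actual implementation):

```python
import numpy as np

def find_vertical_seam(energy):
    """Return, for each row, the column index of the minimum-energy
    vertical seam, found with dynamic programming.

    cost[i, j] holds the cheapest total energy of any seam ending at
    pixel (i, j); each pixel may connect to its three upper neighbors.
    """
    h, w = energy.shape
    cost = energy.astype(float).copy()
    for i in range(1, h):
        for j in range(w):
            lo, hi = max(j - 1, 0), min(j + 1, w - 1)
            cost[i, j] += cost[i - 1, lo:hi + 1].min()
    # Backtrack from the cheapest pixel in the bottom row.
    seam = [int(np.argmin(cost[-1]))]
    for i in range(h - 2, -1, -1):
        j = seam[-1]
        lo, hi = max(j - 1, 0), min(j + 1, w - 1)
        seam.append(lo + int(np.argmin(cost[i, lo:hi + 1])))
    return seam[::-1]
```

Removing the seam then deletes one pixel per row; horizontal seams can reuse the same routine on the transposed image.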

Below are some of the results of vertical seam removal. The original image is on the left, and the seam carved image is on the right.




This failure case happened because too many seams were removed, which forced the algorithm to carve through high-energy areas as well.

Horizontal Seam Removal


Bells and Whistles: Seam Insertion

This was a natural extension of removal. We implemented it according to the paper: we find the seams with the lowest energies and add them to the image in reverse order, setting each inserted pixel to the average of the pixel values across the seam.
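One way to sketch the insertion step (a simplified NumPy version for a 2-D grayscale image; here the new pixel averages the seam pixel's left and right neighbors, which is one reading of the paper's averaging step, not necessarily the project's exact choice):

```python
import numpy as np

def insert_vertical_seam(img, seam):
    """Widen img by one column, duplicating the given vertical seam.

    seam[i] is the column index of the seam pixel in row i; the inserted
    pixel is the average of that pixel's left and right neighbors.
    """
    h, w = img.shape
    out = np.empty((h, w + 1), dtype=float)
    for i, j in enumerate(seam):
        left = img[i, max(j - 1, 0)]
        right = img[i, min(j + 1, w - 1)]
        out[i, :j + 1] = img[i, :j + 1]
        out[i, j + 1] = (left + right) / 2
        out[i, j + 2:] = img[i, j + 1:]
    return out
```

To insert k seams, the paper finds the k lowest-energy seams on the original image first and then duplicates them in reverse order, so earlier insertions don't shift the later seams' coordinates.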

We also include a failure case: when we want to add so many seams that even the seams we insert are relatively high energy.


Style Transfer


Architecture


Model: SqueezeNet, chosen primarily because it is a smaller/faster model and a friend mentioned he had good results with it. Better results can probably be obtained with VGG-19, which the paper uses. Our loss function is a weighted sum of the content loss, style loss, and total variation loss. Content loss is the mean squared difference between the content representations of the target (produced) image and the original image. Style loss is computed from the Gram matrix, which captures the correlations between the responses of each filter. Total variation loss is the sum of squared differences between neighboring pixel values, which helped smooth out the image a bit.

Hyperparameters: We run the Adam optimizer for 200 epochs with an initial learning rate of 3.0, later decayed to 0.1, and with varying style- and content-loss weights.

Comparisons



Here we use Neckarfront as our content image and compare the results for a given style image (leftmost) between the paper's network (middle) and mine (rightmost). My model's results are not as good, but considering that we were using SqueezeNet and not VGG-19, I would consider them pretty decent.

Other Results


We tried running the model on some other styles with the desert image above. Some of these turned out alright, like the anime and wave styles: the colors and shading were captured quite nicely. The last style (pixel) was not captured properly and is most definitely a failure case. We think this is most likely because our smaller model simply lacks the capacity to capture it.