Final Project: Image Quilting

Quilting Comparison

Below are examples of creating a larger texture using the random, overlapping, and seam-finding methods, respectively. In all three algorithms, we find a patch in the original texture to append to the current, larger image of the texture. In the random algorithm, there are no criteria for new patches. In the overlapping algorithm, we check the patch's similarity with adjacent patches. In the seam-finding algorithm, we check the patch's similarity with adjacent patches and then find the lowest-cost path across the new patch's edges, minimizing the disparity between patches. The bricks texture looks quite good with just the overlapping algorithm, while the white rice texture only begins to look natural with the seam-cutting algorithm.

Random Patches
Overlapping
Seam-Cutting
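As a rough sketch of the overlap-matching step described above (the function names, tolerance parameter, and left-overlap-only layout are illustrative rather than my exact implementation), a candidate patch can be scored by the sum of squared differences over its overlap with what has already been placed, and one of the near-best candidates picked at random:

import numpy as np

def overlap_ssd(patch, placed_strip, overlap):
    # Sum of squared differences between the patch's left overlap strip
    # and the strip already placed in the output image.
    diff = patch[:, :overlap].astype(float) - placed_strip.astype(float)
    return np.sum(diff ** 2)

def pick_patch(candidates, placed_strip, overlap, tol=1.1):
    # Choose randomly among candidates whose overlap cost is within
    # tol of the minimum, so the texture does not become repetitive.
    costs = np.array([overlap_ssd(p, placed_strip, overlap) for p in candidates])
    good = np.flatnonzero(costs <= costs.min() * tol)
    return candidates[np.random.choice(good)]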

Seam Cutting Illustration

More images

Here are the textures:

The following shows the results of the random, overlapping, and seam-finding algorithms, respectively. Different patch sizes looked best for different algorithms, hence the difference in size:

Texture Transfer

Texture transfer works by finding blocks that are both most similar to the content image and mix well with the patches already placed in the current image. Below are some examples, along with the texture used:

Heart
Building
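A minimal sketch of the block-selection cost for texture transfer, assuming the overlap cost from quilting has already been computed; the weight alpha and the function name are illustrative:

import numpy as np

def transfer_cost(patch, target_region, overlap_cost, alpha=0.8):
    # Correspondence error: how far the patch is from the content-image
    # region it will cover.
    correspondence = np.sum((patch.astype(float) - target_region.astype(float)) ** 2)
    # Blend quilting's overlap error with the correspondence error.
    return alpha * overlap_cost + (1 - alpha) * correspondence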

Since the white texture had the most variance in color intensity, the images turned out best using that texture.

B&W

I reimplemented the cuts method, searching the boundary-cost matrix with a priority queue:

import heapq

def cut(bndcut, half_overlap):
    # bndcut[i][j] is the cost of placing the seam at column j in row i of the overlap.
    lim = min(half_overlap, bndcut.shape[1])
    # Start a candidate path at each column of the first row.
    pq = [(bndcut[0][j], [j]) for j in range(lim)]
    heapq.heapify(pq)
    seen = set()
    while len(pq) > 0:
        cost, path = heapq.heappop(pq)
        if len(path) >= bndcut.shape[0]:
            return path  # cheapest path that spans every row
        i = len(path)
        j = path[-1]
        # Extend the path to the three neighboring columns in the next row.
        for k in range(j - 1, j + 2):
            if (i, k) not in seen and lim > k >= 0:
                new_cost = cost + bndcut[i][k]
                new_path = path + [k]
                heapq.heappush(pq, (new_cost, new_path))
                seen.add((i, k))

Final Project: Neural Style Transfer

In this project, I used a pretrained VGG network in order to transfer the style of an artwork to my own images.

Creating The Model

An Overview of Neural Style Transfer

The neural style transfer model relies on a pretrained CNN, since trained CNNs are able to represent certain features of an image concisely. Trained CNNs generally have a structure where earlier layers cover more basic, "lower"-level visual items like edges and corners, while later layers appear to encode higher-order information like the texture of fur or the shape of an eye. By keeping the weights of the CNN frozen, we can adjust an input image to make its content or style appear more like another image.

Starting from a pretrained VGG-19 network from PyTorch, I inserted a few new loss layers so that the model optimizes an input image to preserve the content of one image and the style of another.
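Below is a minimal sketch of how such a model can be assembled, assuming ContentLoss and StyleLoss modules like the ones described in the next two sections; the layer naming and the choice of layer indices are illustrative, not my exact configuration:

import torch.nn as nn
import torchvision.models as models

def build_model(content_img, style_img, content_layers, style_layers):
    # Walk through the pretrained VGG-19 feature extractor and insert a loss
    # module after each chosen convolutional layer.
    cnn = models.vgg19(pretrained=True).features.eval()
    model = nn.Sequential()
    content_losses, style_losses = [], []
    conv_idx = 0
    for layer in cnn.children():
        if isinstance(layer, nn.Conv2d):
            conv_idx += 1
            model.add_module(f"conv_{conv_idx}", layer)
            if conv_idx in content_layers:
                # Target features come from running the content image up to this layer.
                cl = ContentLoss(model(content_img).detach())
                model.add_module(f"content_loss_{conv_idx}", cl)
                content_losses.append(cl)
            if conv_idx in style_layers:
                sl = StyleLoss(model(style_img).detach())
                model.add_module(f"style_loss_{conv_idx}", sl)
                style_losses.append(sl)
        elif isinstance(layer, nn.ReLU):
            # Out-of-place ReLU so the saved loss targets are not overwritten.
            model.add_module(f"relu_{conv_idx}", nn.ReLU(inplace=False))
        else:
            model.add_module(f"pool_{conv_idx}", layer)
    return model, content_losses, style_losses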

Content Loss

There are two types of losses that we need to add after convolutional layers in the CNN: content loss and style loss. We compute content loss using some convolutional layer's output, which is a feature representation of our input. In other words, we run both our input image and content image through the pretrained VGG network and use the outputs after some convolutional layer as feature representations of the images. The two feature representations are then compared element-wise, and the mean squared error is saved as the content loss of the input image at that specific convolutional layer.
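A minimal sketch of such a content-loss module, following the description above (the class name is my own):

import torch.nn as nn
import torch.nn.functional as F

class ContentLoss(nn.Module):
    # Stores the content image's feature map and, on every forward pass,
    # records the mean squared error of the input against it while passing
    # the input through unchanged.
    def __init__(self, target):
        super().__init__()
        self.target = target.detach()

    def forward(self, x):
        self.loss = F.mse_loss(x, self.target)
        return x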

Style Loss

For the style loss, we compare the Gram matrices of the feature representations. The element in the i-th row and j-th column of the Gram matrix is the inner product between the i-th and j-th feature maps of the same image. Intuitively, the Gram matrix represents the relationship between different features of the same image: although the dot product removes any geometric information about the features, the overall representation of style is preserved. After it is calculated, the matrix is normalized to prevent brighter images from affecting the style too much. I decided to write the loss function as a PyTorch module, instead of a PyTorch function, for simplicity.

Originally, I placed the style and content losses after each layer, but my model performed quite poorly, with an extremely heavy bias towards the style. Some unfortunate examples of my early, style-heavy model are below. After some research, I changed my model so that it would compute the style loss on only every other convolutional layer, and I reduced the weight on the style loss.

Failures in the early iterations of the model. The outlines of the man's body in the painting are visible in the image.
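Returning to the style loss itself, here is a minimal sketch of the Gram matrix computation and style-loss module described above (the function and class names are my own):

import torch.nn as nn
import torch.nn.functional as F

def gram_matrix(features):
    # features: (batch, channels, height, width)
    b, c, h, w = features.size()
    flat = features.view(b * c, h * w)
    G = flat @ flat.t()            # inner products between feature maps
    return G / (b * c * h * w)     # normalize so brighter/larger images don't dominate

class StyleLoss(nn.Module):
    # Stores the style image's Gram matrix and records the mean squared error
    # against the input's Gram matrix, passing the input through unchanged.
    def __init__(self, target_features):
        super().__init__()
        self.target = gram_matrix(target_features).detach()

    def forward(self, x):
        self.loss = F.mse_loss(gram_matrix(x), self.target)
        return x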

Data Transformations

In my journey to make my images look better, I began to try different data transformations. According to a number of different sources I found, VGG-19 was trained on images normalized with a mean of [0.485, 0.456, 0.406] and a standard deviation of [0.229, 0.224, 0.225], so I normalized my images the same way. Although I do not know if this made a noticeable difference, it was the only significant data transformation I used.
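In code, that normalization is a standard torchvision transform (the resize size here is illustrative):

import torchvision.transforms as T

# Assumed preprocessing: resize, convert to a tensor, then normalize with the
# ImageNet statistics that VGG-19 was trained with.
loader = T.Compose([
    T.Resize(512),                      # illustrative size
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
])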

Running the Model

I ran the model on a few paintings and images, using the L-BFGS optimizer as suggested by the authors. I ultimately found that abstract paintings, like a classic Rothko, transferred best. Paintings with distinct subjects often transferred their style poorly, sometimes transferring ghostly images of the objects themselves, due to the style loss modules placed in later layers of the network.
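A minimal sketch of the optimization loop, assuming the model and loss modules from above; the step count and loss weights are illustrative:

import torch
import torch.optim as optim

def run_style_transfer(model, content_losses, style_losses, input_img,
                       num_steps=300, style_weight=1e6, content_weight=1):
    # L-BFGS optimizes the pixels of input_img directly.
    optimizer = optim.LBFGS([input_img.requires_grad_()])
    step = [0]
    while step[0] < num_steps:
        def closure():
            with torch.no_grad():
                input_img.clamp_(0, 1)   # keep pixel values valid
            optimizer.zero_grad()
            model(input_img)             # forward pass fills each module's .loss
            loss = (style_weight * sum(sl.loss for sl in style_losses) +
                    content_weight * sum(cl.loss for cl in content_losses))
            loss.backward()
            step[0] += 1
            return loss
        optimizer.step(closure)
    with torch.no_grad():
        input_img.clamp_(0, 1)
    return input_img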

Modigliani
Matisse
Miro
Warhol
Sisley
Miro
Pollock
Pollock again