CS 194-26 Final Project

name: Andrew Shieh

#1: Neural Style Transfer

Background
I've always wondered what I'd look like if some famous painter had drawn a portrait of me. Thanks to the power of convolutional neural nets (something I played around with in project 5), I've been able to find out with neural style transfer, detailed in "A Neural Algorithm of Artistic Style" by Gatys et al. In this project, I wanted to implement this algorithm myself.

For visual content, here's an example I found online: starting with the two left images, we can generate the image on the right.


Images
First, I picked out two pictures: one content image and one style image. With these images, the network can extract the "content" of one picture and the "style" of the other, and combine them into a single new image. I picked an example to imitate from the paper, using the Neckarfront in Tübingen as my content image and Starry Night as my style image.


To make the network run, I first passed both images through a dataloader transform that resizes them to the same size. Next, I defined the content loss and style loss as described in the paper. These loss functions drive the optimization so it can match the content of one picture and the style of the other. The content loss is an MSE between the feature maps of the input image and those of the content image at a chosen layer, while the style loss is an MSE between the Gram matrices of the input image's and the style image's feature maps.
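
Here's a minimal sketch of how these two losses can be written as PyTorch modules. This is illustrative rather than my exact code; it follows the common pattern of inserting loss layers into the network and letting them record their loss during a forward pass.

import torch.nn as nn
import torch.nn.functional as F

def gram_matrix(features):
    # Flatten the feature maps and take their correlations,
    # normalized by the number of elements.
    b, c, h, w = features.size()
    flat = features.view(b * c, h * w)
    return flat @ flat.t() / (b * c * h * w)

class ContentLoss(nn.Module):
    """MSE between the input's feature maps and the content image's feature maps."""
    def __init__(self, target):
        super().__init__()
        self.target = target.detach()

    def forward(self, x):
        self.loss = F.mse_loss(x, self.target)
        return x  # pass the features through unchanged

class StyleLoss(nn.Module):
    """MSE between the Gram matrix of the input's features and that of the style image's."""
    def __init__(self, target_features):
        super().__init__()
        self.target = gram_matrix(target_features).detach()

    def forward(self, x):
        self.loss = F.mse_loss(gram_matrix(x), self.target)
        return x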


Model
Next, I had to build the convolutional neural net itself. Following the paper, I used a pretrained VGG-19 model (19 weight layers) as a feature extractor for my images. The paper notes that replacing the max-pooling layers with average-pooling layers and dropping the fully connected (dense) layers produces better-looking images, so I did the same with my network. All in all, my final network architecture looked like:


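For reference, here's a rough sketch of how such a feature extractor can be assembled in PyTorch with torchvision's pretrained VGG-19. The layer-swapping approach below is illustrative and not necessarily the exact code I used:

import torch.nn as nn
from torchvision.models import vgg19

# Load the pretrained convolutional part of VGG-19 and freeze its weights.
vgg = vgg19(pretrained=True).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)

# Replace each max-pool with an average-pool of the same size (as the paper
# recommends); the dense classifier layers are dropped by using `.features` only.
layers = []
for layer in vgg.children():
    if isinstance(layer, nn.MaxPool2d):
        layers.append(nn.AvgPool2d(kernel_size=layer.kernel_size,
                                   stride=layer.stride,
                                   padding=layer.padding))
    else:
        layers.append(layer)
model = nn.Sequential(*layers)
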
Style Transfer!
Finally, I ran my content and style images through the network! Here are my two input images alongside the final result:


Pretty cool! Many elements of Starry Night's style come through (the color scheme, the wavy brushstrokes, the starry sky), and I noticed that it even added the reflection of a star along the river bank.
An alert reader might notice that my final image isn't as vivid as the examples shown in the paper. This is most likely because I didn't run the optimization for very long: 500 iterations on my images (512 px in height) already took more than 30 minutes, even though the loss was still decreasing in the later iterations.
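
For completeness, here's a minimal sketch of the optimization loop itself, in the style of PyTorch's neural style transfer tutorial. It assumes `model` is the VGG feature stack with ContentLoss/StyleLoss modules inserted after the chosen layers, that `content_losses` and `style_losses` are lists of those modules, and that `content_img` is already loaded; the weights shown are illustrative rather than the exact values I used.

import torch

input_img = content_img.clone().requires_grad_(True)
optimizer = torch.optim.LBFGS([input_img])
style_weight, content_weight = 1e6, 1.0

for step in range(500):
    def closure():
        with torch.no_grad():
            input_img.clamp_(0, 1)          # keep pixel values in a valid range
        optimizer.zero_grad()
        model(input_img)                    # forward pass populates each .loss
        style_score = style_weight * sum(sl.loss for sl in style_losses)
        content_score = content_weight * sum(cl.loss for cl in content_losses)
        loss = style_score + content_score
        loss.backward()
        return loss
    optimizer.step(closure)

with torch.no_grad():
    input_img.clamp_(0, 1)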

Gallery
Anyhow, I wanted to run some more examples from the paper as well. Below, behold Neckarfront in the style of "The Shipwreck of the Minotaur" and Neckarfront in the style of "Composition VII". I show the content image, style image, and final image for these examples:



These all turned out pretty well, and although again not as vivid as in the paper, we can still clearly see the styles being transferred onto the content image.

For a final fun experiment, I decided to transfer the style from "The Scream" onto my Facebook profile picture. Have a look:


I look a little demonic so I probably won't be making this my new profile picture. However, the influence of "The Scream" on the colors and wavy lines can be seen, especially in the background of the picture.

Reflection
Overall, this project was quite a bit of fun. Re-implementing a popular paper with some cool results to show for myself was eye-opening, and it was truly astonishing to see the power of CNNs firsthand. I was also able to explore PyTorch in depth, and used parts of its documentation for guidance on this project.

#2: Image Quilting

Background
Image quilting is an algorithm described in a paper co-authored by Prof. Efros, so I thought I'd do some exploration into it! It is something like a precursor to neural style transfer: we can synthesize an "image quilt" of a specified size from a small texture sample, and then transfer a target image into that texture. The cool thing is that the paper was published 20 years ago, yet its results hold up against modern deep-learning techniques.

Randomly Sampled Texture
First, I started by randomly sampling patches from the original texture to generate an image quilt. Below is the input image followed by a randomly quilted image:


Doesn't look so great, but that's expected given how naive this quilting method is. There are many obvious discontinuities and seams in the generated quilt, so a more sophisticated technique is needed.
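
Before moving on, here's a minimal NumPy sketch of this naive baseline (the function and argument names are mine, and images are assumed to be float arrays):

import numpy as np

def quilt_random(texture, out_size, patch_size):
    """Tile an out_size x out_size output with patches sampled uniformly at
    random from the source texture (no overlap, no matching)."""
    h, w = texture.shape[:2]
    out = np.zeros((out_size, out_size) + texture.shape[2:], dtype=texture.dtype)
    for i in range(0, out_size, patch_size):
        for j in range(0, out_size, patch_size):
            y = np.random.randint(h - patch_size + 1)
            x = np.random.randint(w - patch_size + 1)
            patch = texture[y:y + patch_size, x:x + patch_size]
            out[i:i + patch_size, j:j + patch_size] = patch[:out_size - i, :out_size - j]
    return out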

Overlapping Patches
This leads us to the next method of quilting, where we introduce the idea of overlapping patches. In essence, we still sample patches, but now we compute a cost for each candidate (a sum of squared differences over the region where it overlaps what has already been synthesized) and choose among the lowest-cost patches for each position. We also overlap the patches slightly so that the transitions are smoother. This produces better pictures, although it is tougher to implement. Have a look at the original image, the random quilt, and this method:


This is quite a bit better than the naive randomization: we can see that the text is much more aligned, although some seams are still visible. One more improvement on this method will help us generate more natural-looking quilts.
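
Here's a rough sketch of the overlap cost and patch selection described above. The helper names and the tolerance-based sampling are illustrative, not necessarily my exact implementation:

import numpy as np

def ssd_overlap(template, patch, mask):
    """Sum of squared differences between a candidate patch and the pixels already
    placed in the output, restricted to the overlap region where mask == 1."""
    if mask.ndim < patch.ndim:       # broadcast a 2-D mask over color channels
        mask = mask[..., None]
    return float(np.sum(((patch - template) ** 2) * mask))

def choose_patch(texture, template, mask, patch_size, tol=0.1):
    """Score every patch in the texture against the overlap, then sample uniformly
    among patches whose cost is within (1 + tol) of the minimum."""
    h, w = texture.shape[:2]
    costs = np.empty((h - patch_size + 1, w - patch_size + 1))
    for y in range(h - patch_size + 1):
        for x in range(w - patch_size + 1):
            candidate = texture[y:y + patch_size, x:x + patch_size]
            costs[y, x] = ssd_overlap(template, candidate, mask)
    cutoff = costs.min() * (1 + tol) + 1e-8
    ys, xs = np.nonzero(costs <= cutoff)
    k = np.random.randint(len(ys))
    return texture[ys[k]:ys[k] + patch_size, xs[k]:xs[k] + patch_size]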

Seam Finding
Now, for the method described in the paper. Instead of simply overlapping the patches, we can compute a minimum-error boundary cut that finds the best "rip" of each patch to paste alongside its neighbors. In other words, rather than a straight square cut, we allow jagged cuts that follow the path of lowest overlap error, further reducing the "cost" where patches meet.
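
The cut itself can be found with a small dynamic program over the overlap error surface. Here's a minimal sketch for a vertical overlap (the horizontal case is the same idea transposed); the function name is mine:

import numpy as np

def min_cut_path(errors):
    """Find the vertical path of minimum total error through a (rows x cols)
    overlap-error surface. Returns one column index per row: the 'rip' line."""
    rows, cols = errors.shape
    cost = errors.astype(float).copy()
    # Forward pass: each cell accumulates the cheapest way to reach it from the
    # row above (moving straight down or diagonally by one column).
    for i in range(1, rows):
        for j in range(cols):
            lo, hi = max(j - 1, 0), min(j + 2, cols)
            cost[i, j] += cost[i - 1, lo:hi].min()
    # Backtrack from the cheapest cell in the last row.
    path = np.zeros(rows, dtype=int)
    path[-1] = int(np.argmin(cost[-1]))
    for i in range(rows - 2, -1, -1):
        j = path[i + 1]
        lo, hi = max(j - 1, 0), min(j + 2, cols)
        path[i] = lo + int(np.argmin(cost[i, lo:hi]))
    return path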

Here's a visualization of the calculation going on. First is the template, then the patch, then the overlap cost visualized on the patch, and finally the cut that was chosen. Notice how the cut navigates around the high-cost areas.


This cut method gives us quite good results, so take a look at the original image, the random quilt, the simple quilt, and this method alongside each other:


A watchful eye might be able to tell that the final method isn't perfect since the text doesn't make much sense and a small seam might be visible. However, this method overall clearly provides the best quilting.

Gallery
Just seeing it on this one example is a little boring; I ran the same algorithm on a few of the provided sample images as well as some of my own. Each is displayed in the order of original image, random quilt, simple quilt, and cut quilt:






Texture Transfer
Lastly, we can take our texture and transfer a target image into it. For example, the image on the project homepage of a person's face rendered in a loaf of bread.

The idea here is the same as quilting, except we add a second cost term: a correspondence map built from each image's luminance, which measures how well a candidate patch matches the target image at that location. An alpha parameter determines how much weight the overlap cost and the correspondence cost each get in the overall cost map.
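
Here's a minimal sketch of that combined cost (the helper names and the exact luminance weights are illustrative, not necessarily my code):

import numpy as np

def luminance(img):
    """Standard RGB -> luminance conversion, used as the correspondence map."""
    return img @ np.array([0.299, 0.587, 0.114])

def transfer_cost(patch, template, mask, target_patch, alpha):
    """Weighted sum of the usual overlap SSD and a correspondence error comparing
    the candidate patch's luminance against the target image at this location."""
    if mask.ndim < patch.ndim:
        mask = mask[..., None]
    overlap_err = np.sum(((patch - template) ** 2) * mask)
    corr_err = np.sum((luminance(patch) - luminance(target_patch)) ** 2)
    return alpha * overlap_err + (1 - alpha) * corr_err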

Here's an image of Richard Feynman alongside him in a wall of bricks.


You can sort of make out his face and hair, shown in the wall itself. I wanted to do another example with a portrait of myself in the bricks as well. Below is a picture of me, and me in the synthesized brick wall:


This one is a little hazier since there is so much going on in the source image. However, you're still able to make out the lines and angles from the background in the wall itself.

Bells and Whistles: Iterative Transfer
I implemented the iterative version of texture transfer directly in my code, so the two examples above were already generated with it.
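
As a sketch of what that iteration looks like, here's roughly the schedule from the paper, wrapped around a hypothetical single-pass texture_transfer helper (the helper and starting block size are placeholders, not my actual function names):

# Run several transfer passes, shrinking the block size and raising alpha so
# later passes weight agreement with the previous synthesis more heavily.
N = 4                                 # the paper suggests roughly 3-5 passes
block = initial_block_size            # placeholder for the first pass's block size
result = None
for i in range(1, N + 1):
    alpha = 0.8 * (i - 1) / (N - 1) + 0.1
    result = texture_transfer(texture, target, block, alpha, previous=result)
    block = max(3, 2 * block // 3)    # reduce the block size by roughly a third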

Reflection
This project was quite challenging: with so many moving parts, it was hard to keep everything aligned and to get every piece of the algorithm right. However, I quite enjoyed being able to implement something from a paper my professor had written, and it was really awesome to compare this method with the neural style transfer I also implemented, since both techniques can generate images that blend styles and content together. Definitely a thought-provoking final project and a good way to end the semester.