
Final Project: Lightfield Camera and Artistic Style Transfer

Christine Zhu

Lightfield Camera

The objective of this project is to simulate camera refocusing and aperture adjustment using an array of lightfield images. We use images from the Stanford Light Field Archive, which are arranged in a 17x17 grid for a total of 289 images.

Image Depth Refocusing

To simulate depth refocusing, we shift each image relative to a center image (the image at grid position (8, 8)) and average all of the shifted images. The scale factor applied to these shifts determines which depth ends up in focus: with no shift (a scale of zero), the averaged image is focused on a more distant part of the scene, and as we increase the scale factor, the focus moves forward toward the camera. Here are examples on the chess and bracelet image datasets:

Chess dataset:

alpha=0 shift

alpha=1 shift

alpha=2 shift

alpha=3 shift

Bracelet dataset:

alpha=0 shift

alpha=1 shift

alpha=2 shift

alpha=3 shift
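Here is a minimal sketch of the shift-and-average refocusing described above, assuming the 289 sub-aperture images have already been loaded into a NumPy array indexed by their 17x17 grid positions. The function name, array layout, and sign convention are illustrative rather than the exact code behind these results.

```python
import numpy as np
from scipy.ndimage import shift

def refocus(images, alpha, center=(8, 8)):
    """Shift each sub-aperture image toward the center view and average.

    images: array of shape (17, 17, H, W, 3), indexed by grid position (u, v).
    alpha:  scale factor controlling the focal depth; alpha=0 averages the
            images as-is (focus on the far plane), larger alpha moves the
            focus toward the camera.
    """
    cu, cv = center
    acc = np.zeros_like(images[0, 0], dtype=np.float64)
    for u in range(images.shape[0]):
        for v in range(images.shape[1]):
            # Sign and scale of the shift depend on how grid indices map to
            # the camera (u, v) positions recorded in the dataset filenames.
            du, dv = alpha * (cu - u), alpha * (cv - v)
            # Shift only the two spatial axes; leave the color channel alone.
            acc += shift(images[u, v], (du, dv, 0), order=1, mode='nearest')
    return acc / (images.shape[0] * images.shape[1])
```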

Aperture Adjustment

To simulate aperture adjustment, we pick a radius and average only the images whose grid positions lie within that radius of a given point (here, the center image). The larger the radius, the more images are averaged and the stronger the effect, mimicking a wider aperture. Without any shifting, the focus stays on the focal plane toward the back of the chess scene. If we wanted the focus to be in the center instead, we could reuse the code from the first part to shift the images before applying the aperture adjustment.

radius=0

radius=5

radius=10

radius=15

radius=20
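Under the same assumptions as the refocusing sketch, the aperture simulation can be sketched as selecting only the views within a given radius of the center image before averaging; again, the names and array layout below are illustrative.

```python
import numpy as np

def adjust_aperture(images, radius, center=(8, 8)):
    """Average only the sub-aperture images within `radius` of the center view.

    radius=0 keeps just the center image (smallest aperture);
    larger radii blend in more views, blurring everything off the focal plane.
    """
    cu, cv = center
    acc = np.zeros_like(images[0, 0], dtype=np.float64)
    count = 0
    for u in range(images.shape[0]):
        for v in range(images.shape[1]):
            if (u - cu) ** 2 + (v - cv) ** 2 <= radius ** 2:
                acc += images[u, v]
                count += 1
    return acc / count
```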

What I Learned

Previously, I had no idea what went into aperture adjustment effects, and through this project I learned how simple some of these effects really are.

Artistic Style Transfer

The objective of this project is to transfer the style of one image onto another without changing the underlying content. To do this, we replicate results from https://arxiv.org/pdf/1508.06576.pdf. The general idea is that the information encoded at different layers of a CNN can be used to measure differences in style and differences in content, which we then minimize by optimizing the output image. Here, we use a VGG network as described in the paper. The code is roughly based on PyTorch's tutorial for the same task, but several changes were made to make the method more faithful to the original paper.
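As a rough illustration of the two losses (not the exact code used for these results), the content loss compares feature maps directly, while the style loss compares their Gram matrices; the helper names below are my own.

```python
import torch
import torch.nn.functional as F

def gram_matrix(feat):
    """Channel-by-channel correlations of a feature map; this encodes style."""
    b, c, h, w = feat.shape
    f = feat.view(b * c, h * w)
    return f @ f.t() / (b * c * h * w)

def content_loss(gen_feat, content_feat):
    # Match feature activations directly at the chosen content layer.
    return F.mse_loss(gen_feat, content_feat)

def style_loss(gen_feat, style_feat):
    # Match Gram matrices at each chosen style layer.
    return F.mse_loss(gram_matrix(gen_feat), gram_matrix(style_feat))
```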

For one, instead of using the first five convolutional layers as the PyTorch tutorial does, we use the first convolutional layer of each block as the style layers. We also replace max pooling with average pooling in the VGG network, as the paper reports better results with average pooling. Additionally, rather than using the lower-level conv2_2 as the content layer, we use conv4_2. Other experiments included altering the content-versus-style weights, altering the number of optimization steps, and incorporating GPU usage. For some reason, running on a GPU via Google Colab produced different and somewhat worse results than running on a local CPU, so the images included below come from CPU runs. Relative to that change, varying the number of steps and the weights was less significant, with 300 steps already being sufficient.
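The network modifications described above might look roughly like the following, assuming a recent torchvision with pretrained VGG-19 weights; the layer indices follow torchvision's vgg19.features numbering (conv1_1 through conv5_1 for style, conv4_2 for content), and the structure is a sketch rather than the exact code behind the results below.

```python
import torch.nn as nn
from torchvision import models

# Indices into torchvision's vgg19.features for the layers used here.
STYLE_LAYERS = {0: 'conv1_1', 5: 'conv2_1', 10: 'conv3_1', 19: 'conv4_1', 28: 'conv5_1'}
CONTENT_LAYER = 21  # conv4_2

def build_feature_extractor():
    vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features.eval()
    layers = []
    for m in vgg:
        if isinstance(m, nn.MaxPool2d):
            # Swap max pooling for average pooling, as suggested in the paper.
            layers.append(nn.AvgPool2d(kernel_size=2, stride=2))
        elif isinstance(m, nn.ReLU):
            # Non-inplace ReLU so saved conv activations are not overwritten.
            layers.append(nn.ReLU(inplace=False))
        else:
            layers.append(m)
    return nn.Sequential(*layers)

def extract_features(model, image):
    """Run `image` through the modified VGG and collect the needed activations."""
    style_feats, content_feat = {}, None
    x = image
    for i, layer in enumerate(model):
        x = layer(x)
        if i in STYLE_LAYERS:
            style_feats[STYLE_LAYERS[i]] = x
        if i == CONTENT_LAYER:
            content_feat = x
    return style_feats, content_feat
```

These features would then be plugged into the content and style losses above and the output image optimized (e.g., with L-BFGS, as in the PyTorch tutorial).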

Here are some examples below:

style image

content image

transferred image

style image

content image

transferred image

style image

content image

transferred image

Here is an example of a result that did not (in my opinion) turn out so well. The style image does not have as consistent a pattern as the previous style images, so it is possible that our network did not capture and transfer the higher-level patterns of the style image.

style image

content image

transferred image