
CS194-26 Project 5

Depth Refocusing and Aperture Adjustment with Light Field Data

YiDing Jiang

Overview

In this project, I implemented algorithms for refocusing an image at different depths and readjusting its aperture after the pictures have been taken. Both rely on the light field, a 5D data structure that stores one image for every sub-aperture in a 2D array of sub-apertures. I have also implemented an interactive algorithm that allows refocusing at any point of the image. (Videos are tested with Chrome.)

Part 1: Depth Refocusing

First we need to pick the center of the light field. Since all of the light field data sets have 17x17 sub-apertures, we will use sub-aperture $(8,8)$ as the center. For every other sub-aperture $(u,v)$, the shift of the image at $(u,v)$ with respect to the image at $(8,8)$ can be written as

$\Delta (s,t) = C[(u,v)-(8,8)]$

The transformation is applied to images at every sub-aperture of the light field and all images are averaged to produce the refocused image.

$C$ denotes a constant, shared across all sub-apertures, that is related to the location of focus. The physical focus is at $C=0$. A larger positive $C$ means we are refocusing further from the physical focus, while a more negative $C$ means we are refocusing closer to the camera than the physical focus. Intuitively, if an object is closer to the camera, its light rays are not focused at a single point, so the object appears at a different position in each sub-aperture image. Therefore, shifting the images towards the center will bring the object back into focus. The same logic can be applied to the background that is out of focus, because the light rays have separated past the physical focus.
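The shift-and-average step can be sketched as below. This is a minimal illustration, not the project's actual code: the array layout `(U, V, H, W, 3)`, the function name `refocus`, the mapping of $s$ to image rows and $t$ to columns, and the sign of the shift are all assumptions, and `np.roll` wraps pixels around the border where a real implementation would probably pad or crop.

```python
import numpy as np

def refocus(light_field, C, center=(8, 8)):
    """Average integer-shifted sub-aperture images.

    light_field: array of shape (U, V, H, W, 3) (assumed layout).
    C: the scalar focus parameter from the text.
    """
    U, V = light_field.shape[:2]
    acc = np.zeros(light_field.shape[2:], dtype=np.float64)
    for u in range(U):
        for v in range(V):
            # Delta(s, t) = C * [(u, v) - center], rounded to whole pixels
            ds = int(round(C * (u - center[0])))
            dt = int(round(C * (v - center[1])))
            # np.roll wraps around at the image border (a simplification)
            acc += np.roll(light_field[u, v], shift=(ds, dt), axis=(0, 1))
    return acc / (U * V)
```

With $C=0$ every shift is zero, so the result is the plain average of all sub-aperture images, which matches the statement that the physical focus is at $C=0$.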

The following animation is produced by linearly interpolating $C$ from $-3.0$ to $3.0$ over 10 frames.

Lego Knight	Tarot Cards and Crystal Ball

If the shift is not an integer, we can use bilinear interpolation to compute the shifted image. However, because my computer is kind of old, bilinear interpolation is extremely slow and the program constantly runs out of memory even if the images are stored as integers.

Nearest Neighbor Interpolation	Bilinear Interpolation

I did a comparison on the smallest image of the dataset and found that the artifacts from using nearest neighbor are not as significant as I thought they would be, but the performance gain is massive (from about 5 minutes down to about 10 seconds per refocusing). For this reason, the remaining images are all done with nearest neighbor, although interpolation can be activated in the code through a method argument.
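To make the difference between the two modes concrete, here is a small sketch of both kinds of shift. It is not the implementation used for the results above; the function names are hypothetical, and borders wrap around via `np.roll` for brevity. For a pure translation, bilinear interpolation reduces to a weighted blend of the four neighbouring integer shifts, which is what `shift_bilinear` computes.

```python
import numpy as np

def shift_nearest(img, ds, dt):
    """Shift by rounding (ds, dt) to the nearest whole pixel."""
    return np.roll(img, (int(round(ds)), int(round(dt))), axis=(0, 1))

def shift_bilinear(img, ds, dt):
    """Fractional shift as a weighted blend of four integer shifts."""
    s0, t0 = int(np.floor(ds)), int(np.floor(dt))
    fs, ft = ds - s0, dt - t0  # fractional parts in [0, 1)
    img = img.astype(np.float64)
    out = np.zeros_like(img)
    for i, ws in ((0, 1 - fs), (1, fs)):
        for j, wt in ((0, 1 - ft), (1, ft)):
            if ws * wt > 0:
                out += ws * wt * np.roll(img, (s0 + i, t0 + j), axis=(0, 1))
    return out
```

The nearest neighbor version touches each image once, while the bilinear version reads it up to four times and works in floating point, which is one plausible reason for the gap in speed and memory.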

Part 2: Aperture Adjustment

To readjust the aperture of an image, we can average over a subset of the sub-apertures around the center, instead of over the entire grid, to simulate a smaller aperture. Formally, if we want to approximate a circular aperture of variable radius, we include sub-aperture $(u,v)$ in the average if $\sqrt{(u-8)^2+(v-8)^2} \leq r$, where $r$ is the radius of the aperture measured in the grid of sub-apertures.
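The selection rule can be sketched as a small helper that lists the sub-apertures to average, measuring distance from the center sub-aperture $(8,8)$. The function name and its default arguments are illustrative assumptions, not part of the original code.

```python
import math

def aperture_indices(r, grid=17, center=(8, 8)):
    """Return the (u, v) sub-apertures within distance r of the center."""
    return [
        (u, v)
        for u in range(grid)
        for v in range(grid)
        if math.hypot(u - center[0], v - center[1]) <= r
    ]
```

With $r=0$ only the center image is used, and since the corners of a 17x17 grid lie $8\sqrt{2} \approx 11.3$ away from the center, $r=12$ includes every sub-aperture.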

The following animation is produced by linearly interpolating $r$ from $0$ to $12$ at $C=1$ over 10 frames.

Lego Knight	Tarot Cards and Crystal Ball

Bells and Whistles

Interactive Refocusing

We can do interactive refocusing by finding the optimal offset $(\Delta s, \Delta t)$ that aligns a sub-aperture $(u,v)$ with the center. Using this offset we can estimate the depth at $(s,t)$ by $C' = \frac{\Delta s /(u-8) + \Delta t/ (v-8)}{2}$. In principle, the two ratios should be the same, but there are imperfections in the way the cameras are moved, so the two ratios are usually close but not identical to each other. I decided to average them to get a one-sample estimate, although in theory a better estimate can be obtained through linear regression. With $C'$ we can carry out depth refocusing as done above.
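As a sketch of the averaging step only (the alignment search that produces the offset is not shown), the estimate of $C'$ could look like this. The function name is hypothetical. The formula is undefined when $u=8$ or $v=8$, so this sketch skips an axis with zero baseline, which is an assumption added here rather than something stated above.

```python
def estimate_c(ds, dt, u, v, center=(8, 8)):
    """Average the per-axis ratios ds/(u-8) and dt/(v-8)."""
    ratios = []
    if u != center[0]:
        ratios.append(ds / (u - center[0]))
    if v != center[1]:
        ratios.append(dt / (v - center[1]))
    if not ratios:
        # (u, v) is the center itself, so no displacement can be measured
        raise ValueError("the center sub-aperture has no baseline")
    return sum(ratios) / len(ratios)
```

For example, an offset of $(3, 5)$ measured at sub-aperture $(9,9)$ gives ratios $3$ and $5$, so the estimate is $C' = 4$.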

Perspective change

Since the light field naturally encodes different perspectives, we can hallucinate a limited change of perspective by doing a walk in the $(u,v)$ grid.
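One possible walk is a circle around the center sub-aperture, sketched below. The circular path, the function name, and the default radius are illustrative choices; the text does not say which path was used. Showing the sub-aperture image at each returned $(u,v)$ in order gives the frames of the animation.

```python
import math

def circular_walk(n_frames, radius=4, center=(8, 8)):
    """Return n_frames (u, v) indices on a circle around the center."""
    path = []
    for k in range(n_frames):
        angle = 2 * math.pi * k / n_frames
        # Round to the nearest sub-aperture on the integer grid
        u = center[0] + round(radius * math.cos(angle))
        v = center[1] + round(radius * math.sin(angle))
        path.append((u, v))
    return path
```

Any radius up to 8 keeps the walk inside the 17x17 grid.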