CS 194-26 - Project 5: Light Fields

Eli Lipsitz (cs194-26-acw)

Note: Click on any image to enlarge it.

Interactive Re-Focusing Demo

Click on either of the two photographs to refocus at that point!

Note: the depth maps used are approximate and, in some cases, extremely inaccurate. If clicking does not do anything at first, try refreshing the page and trying again.

Part 1: Depth Refocusing

The lightfield data consists of a number of images arranged in a grid, each of which is a photograph of the scene, with the camera positioned at the image's position in the grid (perpendicular to the viewing axis).

Thus, due to parallax, the image at grid position (0, 0) (a corner) will be slightly shifted relative to the image at (8, 8) (the center).

If we shifted the image at (0, 0) so that it lined up with the center image at a specific point, we could average the two together and get an image that is sharp (in focus) at that correspondence point, but blurry (because the images don't match up) everywhere else.

We can extend this by taking *all* of the images in the grid (17 * 17 = 289), shifting each one appropriately, and averaging them.

I call the parameter that scales each image's shift (relative to its offset from the center of the grid) "c". Its value depends on the pixel dimensions of the images (1024 x 1024 for my example images) as well as the size of the grid, so the numbers by themselves are not particularly meaningful. This doesn't matter, however -- by sweeping "c" across a range, we can sweep the focal plane through the scene, producing an interesting effect.
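For concreteness, here is a minimal sketch of this shift-and-average refocusing. It assumes the light field has been loaded as a 17x17 nested list of HxWx3 arrays with the center image at grid position (8, 8); the names (and the sign of the shift) are illustrative, not my original code.

```python
import numpy as np
from scipy.ndimage import shift as nd_shift

def refocus(images, c, center=8):
    """Shift each image by c times its grid offset from the center, then average."""
    acc = np.zeros(images[center][center].shape, dtype=np.float64)
    n = 0
    for i in range(len(images)):            # grid row
        for j in range(len(images[i])):     # grid column
            dy, dx = c * (i - center), c * (j - center)  # sign depends on grid convention
            # Shift only the spatial axes, not the color channel.
            acc += nd_shift(images[i][j].astype(np.float64), (dy, dx, 0),
                            order=1, mode='nearest')
            n += 1
    return acc / n  # divide by the number of images so brightness stays constant
```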

In the following videos, "c" ranges from -4.0 to 3.0 for the Cards and from -3.0 to 5.0 for the Legos.

Part 2: Aperture Adjustment

If, instead of shifting the images (or in addition to shifting them), we simply average a subset of the images together, we can simulate differently sized apertures. Taking a single image "simulates" the aperture of the original photographs (naturally).

If we sum nine images (the center image and the ring of images around it), we simulate an aperture with nine times the area of the original, since light is gathered from an area nine times as large. However, the sum would also be nine times brighter, so we divide by the number of images -- that is, we average them -- to correct for the overexposure.

This can be repeated with any subset of the images. In the images below, I've averaged the center image alone, then 9 images, 25 images, and so on, all the way up to the entire 17x17 grid (289 images).
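Under the same assumptions as the sketch above (a 17x17 grid with the center image at (8, 8)), the aperture simulation is just an average over a block of images around the center; again, the names are illustrative.

```python
import numpy as np

def simulate_aperture(images, radius, center=8):
    """Average the (2*radius + 1)^2 images nearest the center of the grid."""
    subset = [np.asarray(images[i][j], dtype=np.float64)
              for i in range(center - radius, center + radius + 1)
              for j in range(center - radius, center + radius + 1)]
    # Dividing by the number of images keeps the exposure constant.
    return np.mean(np.stack(subset), axis=0)

# radius = 0 -> 1 image, radius = 1 -> 9 images, radius = 2 -> 25 images,
# ..., radius = 8 -> the full 17x17 grid (289 images).
```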

Summary

Before this class, I'd already been exposed (hah!) to light field cameras through Ren Ng's CS 184. However, while we learned a lot about them, I did not personally do any light-field related projects.

It was pretty cool to see how easy it is to achieve interesting effects when you have light-field data. I'm especially happy with the Javascript interactive refocusing. As explained below, I ended up writing a program to calculate depth maps for a set of light field data to produce the real-time Javascript refocusing above.

The depth maps calculated with this method are extremely crude and not especially accurate. I'd be interested in learning whether there is a more intelligent (and hopefully more effective) way of computing depth maps from light field data.

Bells and Whistles: Interactive Refocusing

I started by creating an interactive Python program that would show an image, accept a mouse click, and then, for the roughly 16x16 region around the click, try all possible shifts between the center image and the top-left image, picking the shift with the lowest squared deviation.
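Roughly, that search looks like the sketch below. It assumes integer shifts applied diagonally (the top-left image is offset from the center in both directions) and a click far enough from the image border; this is an illustration of the idea, not the original program.

```python
import numpy as np

def best_shift(center_img, corner_img, x, y, patch=8, max_shift=20):
    """Find the integer shift that best aligns corner_img to center_img near (x, y)."""
    ref = center_img[y - patch:y + patch, x - patch:x + patch].astype(np.float64)
    best, best_err = 0, np.inf
    for s in range(-max_shift, max_shift + 1):
        cand = corner_img[y - patch + s:y + patch + s,
                          x - patch + s:x + patch + s].astype(np.float64)
        err = np.sum((ref - cand) ** 2)  # squared deviation over the patch
        if err < best_err:
            best, best_err = s, err
    return best
```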

This worked pretty well, but wasn't very fast (recomputing the shift after each click took about a second). It was also pretty annoying to run the Python program and reload all of the images every time. I realized that if I precomputed all of the shifts, I could do this much more quickly.

Fortunately, I had already precomputed all of the shifts to create the refocusing videos shown above. I realized that I could do this in Javascript (and embed it into the final report!) if I could somehow compute the shift for each point. That would require loading several images and reimplementing a fair amount of Numpy in Javascript. Instead, I realized I could simply precompute a depth map for the image and use it as a lookup table.
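The lookup-table idea is sketched below in Python for clarity (the demo itself is Javascript). It assumes a list of precomputed refocused frames, one per "c" value, and a per-pixel depth map giving the best "c"; a click then just selects the nearest precomputed frame.

```python
import numpy as np

def frame_for_click(depth_map, c_values, x, y):
    """Index of the precomputed frame whose c is closest to the depth at (x, y)."""
    c = depth_map[y, x]
    return int(np.argmin(np.abs(np.asarray(c_values) - c)))
```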

The depth maps are computed by finding, for each pixel in the center image, the best shift (as described a few paragraphs above). This was very simple but very slow, so I spent a lot of time vectorizing the program to compute the depth maps quickly. I also noticed a lot of noise in the final depth maps. To combat this, instead of computing a single depth map between the center image and one corner image, I computed a depth map against each of the four corners and averaged them. This seemed to help a bit.
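Put together, the (slow, un-vectorized) depth map computation looks roughly like this, reusing the best_shift sketch from above; the per-corner sign conventions and border handling are glossed over.

```python
import numpy as np

def depth_map(center_img, corner_imgs, patch=8, max_shift=20):
    """corner_imgs: the four corner images of the grid."""
    h, w = center_img.shape[:2]
    maps = np.zeros((len(corner_imgs), h, w), dtype=np.float64)
    margin = patch + max_shift  # stay away from the borders
    for k, corner in enumerate(corner_imgs):
        for y in range(margin, h - margin):
            for x in range(margin, w - margin):
                maps[k, y, x] = best_shift(center_img, corner, x, y,
                                           patch=patch, max_shift=max_shift)
    return maps.mean(axis=0)  # average the four per-corner maps to reduce noise
```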

Below are the depth maps I used: you can see how noisy and imperfect they are. Regardless, they work decently enough for the purposes of refocusing (provided you click around enough to land on well-computed depths).