Final Project: Augmented Reality & Light Field Camera

COMPSCI 194-26: Computational Photography & Computer Vision (Fall 2021)
Alina Dan

Poor Man's Augmented Reality

Overview

For this project, I inserted a synthetic object into a video I took. Specifically, I projected a cube onto a box using corresponding 2D and 3D keypoint coordinates along with a per-frame camera projection matrix.

Setup

To begin, I used a box with marked grid lines and filmed a video while rotating my camera around the box:

[Video: rotating the camera around the marked box]

Keypoints with Known 3D World Coordinates

Next, I chose 26 2D keypoints of the box in the first frame and measured the box to get the points' corresponding 3D world coordinates. A sketch of the point selection is below, followed by the chosen keypoints:
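
This is only a minimal sketch of how the points could be collected, assuming matplotlib's ginput for clicking and a hypothetical file frame0.jpg for the first frame; the 3D values shown are placeholders, not my actual measurements:

import numpy as np
import matplotlib.pyplot as plt

first_frame = plt.imread("frame0.jpg")          # hypothetical filename for the first frame
plt.imshow(first_frame)
pts_2d = np.array(plt.ginput(26, timeout=0))    # click the 26 marked corners on the box

# The 3D coordinates come from measuring the box by hand (e.g. in grid-square
# units); the entries below are placeholders, not the real measurements.
pts_3d = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   # ... one (X, Y, Z) entry per clicked point ...
                   ])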

[Image: the 26 selected keypoints on the box]

Propagating Keypoints to Other Images in the Video

Since the 2D points were picked on the first frame of the video, I needed to keep track of them across all of the remaining frames. For this, I used a tracker from OpenCV (cv2.TrackerMedianFlow). The tracking was quite good: no points went missing from the tracker losing them. Here is a sketch of the tracking loop, followed by how the propagation of the keypoints turned out:
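
This is only a minimal sketch of how such a per-point tracking loop might look, assuming one MedianFlow tracker per keypoint, each initialized on a small patch around the point; frames, pts_2d, and box_size are illustrative names, and newer OpenCV builds expose the constructor under cv2.legacy instead:

import cv2
import numpy as np

def track_points(frames, pts_2d, box_size=20):
    # One tracker per keypoint, initialized on a small box around the point
    # in the first frame.
    trackers = []
    for (x, y) in pts_2d:
        tracker = cv2.TrackerMedianFlow_create()   # cv2.legacy.TrackerMedianFlow_create() on OpenCV >= 4.5
        bbox = (int(x - box_size / 2), int(y - box_size / 2), box_size, box_size)
        tracker.init(frames[0], bbox)
        trackers.append(tracker)

    tracked = [np.asarray(pts_2d, dtype=float)]
    for frame in frames[1:]:
        pts = []
        for tracker in trackers:
            ok, (bx, by, bw, bh) = tracker.update(frame)
            pts.append((bx + bw / 2, by + bh / 2))  # keypoint = center of the tracked box
        tracked.append(np.array(pts))
    return tracked                                  # one (26, 2) array per frame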

[Video: keypoints tracked across the frames]

Calibrating the Camera

With the 2D and 3D coordinates, I was able to use least squares to compute a camera projection matrix for each frame. This matrix maps 3D world coordinates to 2D image coordinates, which is what lets a cube defined in 3D be drawn onto the video.
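
A minimal sketch of the least-squares fit, fixing the bottom-right entry of the 3x4 matrix to 1 so the remaining 11 entries can be solved with np.linalg.lstsq (function and variable names are illustrative):

import numpy as np

def fit_projection_matrix(pts_2d, pts_3d):
    # Two equations per 2D/3D correspondence; the 12th entry of the
    # projection matrix is fixed to 1, leaving 11 unknowns.
    A, b = [], []
    for (x, y), (X, Y, Z) in zip(pts_2d, pts_3d):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -x * X, -x * Y, -x * Z])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -y * X, -y * Y, -y * Z])
        b.extend([x, y])
    p, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(p, 1).reshape(3, 4)

One matrix is fit per frame, pairing that frame's tracked 2D points with the fixed 3D measurements.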


Projecting a Cube in the Scene

After computing the projection matrix for each frame, I could then project the cube's 3D corner coordinates into that frame and draw its edges onto the video. Here is a sketch of that step, followed by the finalized video of the box with the projected cube:
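
This is only a minimal sketch of the projection and drawing step; the cube's corner coordinates here are placeholders for wherever the cube actually sits on the box:

import cv2
import numpy as np

def project(P, pts_3d):
    homog = np.hstack([pts_3d, np.ones((len(pts_3d), 1))])   # to homogeneous coordinates
    proj = (P @ homog.T).T
    return proj[:, :2] / proj[:, 2:]                          # divide out the w coordinate

# Corners of a unit cube sitting on top of the box (placement is illustrative).
cube = np.array([[0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1],
                 [0, 0, 2], [1, 0, 2], [1, 1, 2], [0, 1, 2]], dtype=float)
edges = [(0, 1), (1, 2), (2, 3), (3, 0),      # bottom face
         (4, 5), (5, 6), (6, 7), (7, 4),      # top face
         (0, 4), (1, 5), (2, 6), (3, 7)]      # vertical edges

def draw_cube(frame, P):
    pts = project(P, cube)
    for i, j in edges:
        p1 = tuple(int(v) for v in pts[i])
        p2 = tuple(int(v) for v in pts[j])
        cv2.line(frame, p1, p2, (0, 0, 255), 2)
    return frame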

[Video: the box with the projected cube]

Lightfield Camera

Overview

For this project, I delved into the field of lightfield photography: capturing many images of a scene over a regular grid of viewpoints and then combining them to adjust the focus and aperture after the fact.

Depth Refocusing

The first effect I worked on was depth refocusing: shifting the images in the collection toward a common center and averaging them, which imitates focusing at different depth levels. Depending on the shift, the result focuses on the foreground or the background. For this, I took the grid's center image (the image at grid position (8, 8), since we have 17x17 viewpoints) and shifted all the pictures based on the equations:

x_shift = (x_i - x_center) * alpha

y_shift = (y_i - y_center) * alpha

where alpha is a constant that dictates which depth plane ends up in focus (alpha = 0 simply averages the unshifted images). Here is a sketch of the shift-and-average procedure, followed by a GIF depicting focusing at different alpha values:
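
This is only a minimal sketch, assuming images[v][u] holds the photo taken at grid position (u, v) and using scipy.ndimage.shift for the sub-pixel shifts; the sign of the shift may need flipping depending on how the grid coordinates are stored:

import numpy as np
from scipy.ndimage import shift as nd_shift

def refocus(images, alpha, center=(8, 8)):
    # Shift every sub-aperture image toward the center view by
    # (x_i - x_center) * alpha, (y_i - y_center) * alpha, then average.
    acc = np.zeros(images[0][0].shape, dtype=float)
    count = 0
    for v, row in enumerate(images):
        for u, img in enumerate(row):
            dx = (u - center[0]) * alpha
            dy = (v - center[1]) * alpha
            acc += nd_shift(img.astype(float), (dy, dx, 0))   # (row, column, channel) shift
            count += 1
    return acc / count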

[GIF: refocusing results over a range of alpha values]

Aperture Adjustment

The second part of this project involved implementing aperture adjustment for the lightfield photos. For a range of radii, I selected only the photos whose grid position falls within that radius of the center and averaged them. Averaging a large number of images imitates a larger aperture, whereas averaging a small number of images imitates a smaller one. Here is a sketch of the selection, followed by a few examples at different radii:
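
This is only a minimal sketch of the radius-based selection and averaging, with the same illustrative images[v][u] layout as above; it averages the selected images directly, as described in this section:

import numpy as np

def adjust_aperture(images, radius, center=(8, 8)):
    # Average only the photos whose grid position lies within `radius`
    # of the center viewpoint; a larger radius imitates a larger aperture.
    acc = np.zeros(images[0][0].shape, dtype=float)
    count = 0
    for v, row in enumerate(images):
        for u, img in enumerate(row):
            if (u - center[0]) ** 2 + (v - center[1]) ** 2 <= radius ** 2:
                acc += img.astype(float)
                count += 1
    return acc / count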

[Image: aperture with radius = 2]
[Image: aperture with radius = 6]

Learnings

This project was fun! It was nice to further see how we can use code and programming to implement what cameras do in real life. The things we totally abstract away by using a camera are actually more complex and integral than we may think.