Augmented Reality

By Hyun Jae Moon

Introduction

The goal is to project a virtual cube into a video so that it stays fixed in the scene as the camera moves. I used a self-made box marked with 24 evenly spaced points. After assigning known 3D world coordinates to these keypoints, I used TrackerMedianFlow to track them across frames and calibrated the camera to project a cube into the scene.

Setup

Here is the first frame of the video, showing the box. Unfortunately, the original video is too large to upload with the assignment.

[Image: first frame of the video]

As you can see, it is an Amazon delivery box covered with paper towels. I marked the points on the box at equal intervals.

Keypoints with known 3D world coordinates

All I had to do was create a clear mapping between the 2D points in each frame of the video and their real 3D world coordinates, which I measured manually.
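As a minimal sketch of such a mapping, assume a purely hypothetical layout: two visible faces of the box, each carrying a 4 x 3 grid of marks one unit apart, for 24 points in total. The actual layout and measurements of my box are not reproduced here, and the names `face_grid` and `pair_points` are illustrative only.

```python
import numpy as np

def face_grid(origin, u_axis, v_axis, n_u, n_v, spacing=1.0):
    """3D coordinates of an n_u x n_v grid of marks lying on one box face.

    origin is the first mark; u_axis and v_axis are unit directions along the face.
    """
    origin = np.asarray(origin, dtype=float)
    u_axis = np.asarray(u_axis, dtype=float)
    v_axis = np.asarray(v_axis, dtype=float)
    return np.array([origin + spacing * (i * u_axis + j * v_axis)
                     for j in range(n_v) for i in range(n_u)])

# Assumed layout (not the measured one): a front face in the plane y = 0
# and a top face in the plane z = 3, with 12 marks each.
front = face_grid((0, 0, 0), (1, 0, 0), (0, 0, 1), n_u=4, n_v=3)
top = face_grid((0, 1, 3), (1, 0, 0), (0, 1, 0), n_u=4, n_v=3)
world_points = np.vstack([front, top])  # shape (24, 3)

def pair_points(world_points, image_points):
    """Pair world point k with the pixel (u, v) marked for it in one frame."""
    world_points = np.asarray(world_points, dtype=float)
    image_points = np.asarray(image_points, dtype=float)
    if world_points.shape[0] != image_points.shape[0]:
        raise ValueError("each 3D point needs exactly one 2D point")
    return [(tuple(w), tuple(p)) for w, p in zip(world_points, image_points)]
```

The points are deliberately spread over two different planes: if all marks were coplanar, the calibration step later on could not determine a unique projection matrix.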

Propagating Keypoints to other Images in the Video

I used cv2.TrackerMedianFlow to track each point through the subsequent frames from its starting position. As long as I pinpointed the initial points accurately, the tracker did a fairly good job throughout the video. Here is the video:

[Video: tracked keypoints]

As you can tell, the red points are the points being tracked across the frames of the video. Unfortunately, a few points on the right didn't track as intended because they move out of the frame. At this point I could have re-shot the video, but I decided to go along with it because the box had already been discarded. If I had more time, I would re-shoot the video to get a better result.
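The tracking step described above could be sketched roughly as follows. This is not my exact code: it assumes one MedianFlow tracker per keypoint, each initialized with a small square patch around the point. The patch size of 16 pixels and the helper names are assumptions. The tracker lives in the OpenCV contrib modules, under `cv2.legacy` in newer releases, so the sketch tries both locations.

```python
def point_to_bbox(x, y, size=16):
    """Square (x, y, w, h) patch of side `size` centered on a keypoint."""
    half = size / 2.0
    return (x - half, y - half, float(size), float(size))

def bbox_center(bbox):
    """Center of an (x, y, w, h) box, used as the tracked point position."""
    x, y, w, h = bbox
    return (x + w / 2.0, y + h / 2.0)

def make_median_flow_tracker():
    import cv2  # requires an OpenCV build that includes the contrib trackers
    if hasattr(cv2, "legacy") and hasattr(cv2.legacy, "TrackerMedianFlow_create"):
        return cv2.legacy.TrackerMedianFlow_create()
    return cv2.TrackerMedianFlow_create()

def track_points(video_path, initial_points, size=16):
    """Return one list of (x, y) positions per frame; None marks a lost point."""
    import cv2
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    if not ok:
        raise OSError("could not read the first frame")
    trackers = []
    for (x, y) in initial_points:
        tracker = make_median_flow_tracker()
        tracker.init(frame, point_to_bbox(x, y, size))
        trackers.append(tracker)
    tracks = [list(initial_points)]
    while True:
        ok, frame = cap.read()
        if not ok:
            break  # end of video
        positions = []
        for tracker in trackers:
            found, bbox = tracker.update(frame)
            positions.append(bbox_center(bbox) if found else None)
        tracks.append(positions)
    cap.release()
    return tracks
```

Recording `None` for a lost point makes it easy to skip points that leave the frame, like the ones on the right side of my video, instead of feeding bad positions into the calibration.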

Calibrating the Camera

With a distinct mapping from 3D coordinates to 2D coordinates, we can compute the camera's perspective projection matrix by solving the following matrix equation.
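For reference, one common way to write this relation is shown here. It assumes a standard pinhole model with a 3 x 4 matrix $P$; the symbols are my own choice for this note and may differ from those in the figure.

```latex
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
=
\begin{bmatrix}
p_{11} & p_{12} & p_{13} & p_{14} \\
p_{21} & p_{22} & p_{23} & p_{24} \\
p_{31} & p_{32} & p_{33} & p_{34}
\end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
\qquad\Longrightarrow\qquad
u = \frac{p_{11}X + p_{12}Y + p_{13}Z + p_{14}}{p_{31}X + p_{32}Y + p_{33}Z + p_{34}},
\quad
v = \frac{p_{21}X + p_{22}Y + p_{23}Z + p_{24}}{p_{31}X + p_{32}Y + p_{33}Z + p_{34}}
```

Here $(X, Y, Z)$ is a measured world point, $(u, v)$ is its pixel position in the frame, and $s$ is an arbitrary scale factor. Each correspondence gives two linear equations in the twelve entries of $P$, and $P$ is only defined up to scale, so at least six non-coplanar points are needed.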

[Image: projection equation in homogeneous coordinates]
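One standard way to solve for the matrix from one frame's correspondences is the direct linear transform, sketched below with NumPy. I am presenting it as an illustration under the pinhole-model assumption rather than as the exact solver I used; the function names are hypothetical.

```python
import numpy as np

def estimate_projection_matrix(world_points, image_points):
    """Estimate a 3x4 projection matrix P (up to scale) from 3D-2D pairs."""
    world_points = np.asarray(world_points, dtype=float)
    image_points = np.asarray(image_points, dtype=float)
    n = len(world_points)
    if n < 6:
        raise ValueError("need at least 6 correspondences")
    A = np.zeros((2 * n, 12))
    for k, ((X, Y, Z), (u, v)) in enumerate(zip(world_points, image_points)):
        Xh = np.array([X, Y, Z, 1.0])
        # Row for u: (row 1 of P) . Xh - u * (row 3 of P) . Xh = 0
        A[2 * k, 0:4] = Xh
        A[2 * k, 8:12] = -u * Xh
        # Row for v: (row 2 of P) . Xh - v * (row 3 of P) . Xh = 0
        A[2 * k + 1, 4:8] = Xh
        A[2 * k + 1, 8:12] = -v * Xh
    # The right singular vector with the smallest singular value
    # minimizes |A p| subject to |p| = 1.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)

def project(P, world_points):
    """Apply P to 3D points and divide by the third homogeneous coordinate."""
    world_points = np.asarray(world_points, dtype=float)
    homog = np.hstack([world_points, np.ones((len(world_points), 1))])
    proj = homog @ P.T
    return proj[:, :2] / proj[:, 2:3]
```

A quick sanity check is to reproject the measured 3D points with the estimated matrix and compare them with the tracked 2D points; large reprojection errors usually point to a mislabeled or badly tracked keypoint.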

Projecting a cube in the Scene

Using the matrix from the section above, I simply have to define the cube's corners in world coordinates, multiply them by the matrix for each frame, and then draw the cube. My cube looks really weird (and requires some debugging), but we can see that it does follow the physical box.
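The per-frame projection can be sketched like this, assuming `P` is the 3 x 4 matrix estimated for the current frame. The cube's position, its size, and the helper names are illustrative choices, not the values from my video.

```python
import numpy as np

def cube_vertices(origin=(0.0, 0.0, 0.0), side=1.0):
    """Eight corners of an axis-aligned cube; corner index = x + 2*y + 4*z."""
    corners = np.array([[x, y, z] for z in (0, 1) for y in (0, 1) for x in (0, 1)],
                       dtype=float)
    return np.asarray(origin, dtype=float) + side * corners

# Pairs of corner indices that differ in exactly one coordinate.
CUBE_EDGES = [(0, 1), (2, 3), (4, 5), (6, 7),   # edges along x
              (0, 2), (1, 3), (4, 6), (5, 7),   # edges along y
              (0, 4), (1, 5), (2, 6), (3, 7)]   # edges along z

def project_points(P, points):
    """Multiply homogeneous 3D points by P, then divide by the last coordinate."""
    points = np.asarray(points, dtype=float)
    homog = np.hstack([points, np.ones((len(points), 1))])
    proj = homog @ np.asarray(P, dtype=float).T
    return proj[:, :2] / proj[:, 2:3]

def cube_edge_segments(P, origin=(0.0, 0.0, 0.0), side=1.0):
    """2D line segments ((u1, v1), (u2, v2)) for the 12 cube edges in one frame."""
    pixels = project_points(P, cube_vertices(origin, side))
    return [(tuple(pixels[i]), tuple(pixels[j])) for i, j in CUBE_EDGES]
```

Each segment can then be drawn on the frame, for example with `cv2.line` after rounding the endpoints to integers. Connecting corners in the wrong order is an easy way to end up with a strange-looking cube, so an explicit edge list like the one above may help when debugging.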

[Video: cube projected into the scene]

What have I learned?

The most important concept in this project is computing the perspective projection matrix. To project a 3D object onto our 2D screens, these transformations are crucial, and they enable further development in the AR world. With automatic keypoint detection such as Harris interest point detection, AR definitely has real-life applications, including entertainment such as Pokemon Go.