CS 194 Final Project: Poor Man's Augmented Reality

Emily Hu


Input and Output: Side-by-Side

The two videos differ in length because the output video on the right was saved at a slower frame rate.

Intermediate Result: Tracking Points

I picked these points manually on the first frame of the video, recording the (x, y) image coordinates of each one. Then, for every subsequent frame, I used cv2's TrackerMedianFlow to track each point from frame to frame. The result is displayed below.
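As a rough sketch of this tracking step: a MedianFlow tracker follows a bounding box rather than a single pixel, so each picked point can be wrapped in a small square patch and read back as the box center. The helper names (`point_to_bbox`, `bbox_center`, `track_points`) and the patch radius `half=8` are assumptions for illustration, not values from my actual code. MedianFlow ships with opencv-contrib-python and lives under `cv2.legacy` in newer releases.

```python
def point_to_bbox(x, y, half=8):
    """Square (x, y, w, h) box centered on a picked point. `half` is an assumed patch radius."""
    return (float(x - half), float(y - half), float(2 * half), float(2 * half))


def bbox_center(bbox):
    """Center of an (x, y, w, h) box, used as the tracked point location."""
    x, y, w, h = bbox
    return (x + w / 2.0, y + h / 2.0)


def make_median_flow():
    """Create a MedianFlow tracker under either the old or the legacy OpenCV namespace."""
    import cv2  # needs opencv-contrib-python
    if hasattr(cv2, "legacy") and hasattr(cv2.legacy, "TrackerMedianFlow_create"):
        return cv2.legacy.TrackerMedianFlow_create()
    return cv2.TrackerMedianFlow_create()


def track_points(video_path, points, half=8):
    """Track each hand-picked point through the video, one tracker per point.

    Returns a list with one entry per frame; each entry is a list of (x, y)
    tuples, with (nan, nan) where the tracker reported a failure.
    """
    import cv2
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    if not ok:
        raise IOError(f"could not read first frame of {video_path}")

    # Initialize one tracker per point on the first frame.
    trackers = []
    for x, y in points:
        tracker = make_median_flow()
        tracker.init(frame, point_to_bbox(x, y, half))
        trackers.append(tracker)

    tracks = [[(float(x), float(y)) for x, y in points]]
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        row = []
        for tracker in trackers:
            found, bbox = tracker.update(frame)
            row.append(bbox_center(bbox) if found else (float("nan"), float("nan")))
        tracks.append(row)
    cap.release()
    return tracks
```

Recording failures as NaN instead of dropping them keeps every frame the same length, which makes it easier to spot which points misbehave.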

A note about points: When I first picked the 20 points suggested in the project spec, my resulting cube became more and more distorted with each frame. After inspecting the video with the tracked points overlaid, I realized that a few points were being tracked incorrectly. After a couple of rounds of guess-and-check trying to replace the misbehaving points, I tried simply removing them, and since the system was overdetermined to begin with, removing these outliers solved the issue. That is why my tracked-points video displays only 18 points instead of the original 20. The video below shows the original points I chose, including the poorly tracked ones, which are visible towards the end of the video. This was also before I realized the default dpi of matplotlib's savefig function was way too low for a video this size, so sorry about the potato resolution. :(
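To illustrate why a few bad points hurt and why dropping them is safe, here is a minimal sketch. It assumes the per-frame fit is a 3x4 camera projection matrix estimated by least squares from 3D-to-2D correspondences, which this write-up does not spell out. A 3x4 matrix has 11 degrees of freedom and each point gives two equations, so 18 points still leave the system heavily overdetermined. The function names are hypothetical, and the reprojection-error check is an automated stand-in for the visual inspection I actually did.

```python
import numpy as np


def fit_projection(world, image):
    """Least-squares 3x4 projection matrix from N >= 6 correspondences (DLT).

    world: (N, 3) array of 3D points; image: (N, 2) array of pixel coordinates.
    """
    world = np.asarray(world, dtype=float)
    image = np.asarray(image, dtype=float)
    rows = []
    for (X, Y, Z), (u, v) in zip(world, image):
        # Each correspondence contributes two linear equations in the 12 entries of P.
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    A = np.array(rows)
    # Homogeneous least squares: take the right singular vector
    # belonging to the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)


def reprojection_errors(P, world, image):
    """Pixel distance between each tracked point and its reprojection under P."""
    world = np.asarray(world, dtype=float)
    image = np.asarray(image, dtype=float)
    homog = np.hstack([world, np.ones((len(world), 1))])
    proj = homog @ P.T
    proj = proj[:, :2] / proj[:, 2:3]  # divide out the homogeneous coordinate
    return np.linalg.norm(proj - image, axis=1)
```

A point whose reprojection error stays large across many frames is a candidate for removal, which mirrors the manual pruning from 20 points down to 18.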

Bells and Whistles: Harris Tracking Points

Following the Piazza announcement, in place of the bells and whistles I implemented the alternative tracking method that was proposed: Harris corners.

Left: This video shows the same 18 points used with the MedianFlow tracker to produce the final result.

Right: This video shows the Harris tracking method on the 20 points I originally picked, before I removed the poorly tracked points as described in the gray note above.
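This write-up does not describe how the Harris corners were turned into a tracker, so the following is only one plausible scheme, sketched under that assumption: compute a Harris response for each new frame and snap every point to the strongest response within a small search window around its previous location. The function names, the box-filter window `r`, the constant `k=0.05`, and the `search` radius are all illustrative choices, not values from my code. It uses only NumPy so that the mechanics are visible.

```python
import numpy as np


def box_sum(a, r):
    """Sum of `a` over a (2r+1) x (2r+1) window around each pixel (edge-padded)."""
    p = np.pad(a, r, mode="edge")
    c = np.cumsum(np.cumsum(p, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))  # leading zero row/column for the integral image
    k = 2 * r + 1
    return c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]


def harris_response(gray, r=2, k=0.05):
    """Harris corner response R = det(M) - k * trace(M)^2 with a box window."""
    gray = np.asarray(gray, dtype=float)
    iy, ix = np.gradient(gray)  # derivatives along rows (y) and columns (x)
    sxx = box_sum(ix * ix, r)
    syy = box_sum(iy * iy, r)
    sxy = box_sum(ix * iy, r)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2


def snap_to_corner(response, point, search=6):
    """Move (x, y) to the strongest response within `search` pixels of it."""
    x, y = int(round(point[0])), int(round(point[1]))
    h, w = response.shape
    y0, y1 = max(y - search, 0), min(y + search + 1, h)
    x0, x1 = max(x - search, 0), min(x + search + 1, w)
    window = response[y0:y1, x0:x1]
    dy, dx = np.unravel_index(np.argmax(window), window.shape)
    return (float(x0 + dx), float(y0 + dy))
```

Applied frame by frame, each point's snapped location becomes the starting point for the next frame. A scheme like this can jump to a neighboring corner when two strong corners fall inside the same search window.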

Reflection

A general concept that I will definitely be able to apply elsewhere is that, though it often seems that more data gives stronger and more accurate results, a single outlier can do far more harm than one additional good data point does good. This was a lesson in the importance of good inputs, which has been a relevant concept in almost all my other CS classes as well, especially security, databases, and algorithm design.

This project really epitomizes my favorite thing about the projects in this class, which is how simple and understandable the math and logic are behind concepts that previously seemed like they might as well be rocket science. I had never even stopped to consider how the basic mechanism behind something as advanced as augmented reality works, but this class and its projects really drove home the point that with the necessary mentorship, time, and knowledge, my peers and I are just as capable of tackling these awesome and applicable projects.