Final Projects: Seam Carving and Augmented Reality

Jeffrey Huo 

CS194 Final Project 1: Seam Carving

Overview

The goal of this project is to resize images without sacrificing the important content within them. The technique used to achieve this is called seam carving.

Algorithm

To resize the image, we first create an energy map by taking the squared gradient magnitude of the grayscale image. Using this energy map, we apply dynamic programming to build a seam matrix that holds the cumulative cost of the lowest-energy seam ending at each pixel; we recompute this matrix each time a seam is removed. Given the number of pixels we want to remove, we repeat that many times: find the lowest-energy seam, remove it, and stitch the remaining pixels back together. The same procedure works in either dimension, carving the lowest-energy seams horizontally or vertically. To shrink the image in both directions, we simply carve in one direction and then in the other.
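
Here is a minimal numpy sketch of the energy map and dynamic-programming seam search described above (the helper names are illustrative, not the exact ones from my code):

```python
import numpy as np

def energy_map(gray):
    """Squared gradient magnitude of a grayscale image (H x W float array)."""
    dy, dx = np.gradient(gray)
    return dx ** 2 + dy ** 2

def min_vertical_seam(energy):
    """Build the dynamic-programming seam matrix M, where M[i, j] is the cost
    of the cheapest vertical seam ending at pixel (i, j), then backtrack the
    lowest-energy seam (one column index per row)."""
    h, w = energy.shape
    M = energy.astype(float)
    for i in range(1, h):
        left = np.roll(M[i - 1], 1)
        right = np.roll(M[i - 1], -1)
        left[0] = np.inf     # no neighbor beyond the image border
        right[-1] = np.inf
        M[i] += np.minimum(np.minimum(left, M[i - 1]), right)
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(M[-1]))
    for i in range(h - 2, -1, -1):
        j = seam[i + 1]
        lo, hi = max(j - 1, 0), min(j + 2, w)
        seam[i] = lo + int(np.argmin(M[i, lo:hi]))
    return seam

def remove_vertical_seam(img, seam):
    """Drop the seam's pixel from every row and stitch the rows back together."""
    h, w = img.shape[:2]
    mask = np.ones((h, w), dtype=bool)
    mask[np.arange(h), seam] = False
    return img[mask].reshape(h, w - 1, *img.shape[2:])
```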

 

Bells and Whistles

For the bells and whistles, I sped up the algorithm using a min-heap. Given the number of pixels to remove, I multiply that number by a constant to get k, and extract the k lowest-energy seams from a single seam matrix. This avoids recomputing the seam matrix after every removal, which dominates the runtime. Since seams chosen this way can overlap, I require their starting columns to be more than five pixels apart. Because of this approximation, the resulting image is not quite as good as with the original algorithm, but the runtime is more than six times faster.
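
A sketch of the heap-based selection, assuming M is the cumulative seam-cost matrix built in the dynamic-programming step above (the exact heuristics in my implementation may differ):

```python
import heapq
import numpy as np

def k_min_seams(M, k, min_gap=5):
    """Pick k low-energy vertical seams from a single seam matrix M, keeping
    their starting (bottom-row) columns more than min_gap pixels apart."""
    h, w = M.shape
    heap = [(float(M[-1, j]), j) for j in range(w)]   # (seam cost, start column)
    heapq.heapify(heap)
    starts, seams = [], []
    while heap and len(seams) < k:
        cost, j = heapq.heappop(heap)
        if any(abs(j - s) <= min_gap for s in starts):
            continue  # too close to an already chosen seam
        starts.append(j)
        seam = np.empty(h, dtype=int)                 # backtrack through the same M
        seam[-1] = j
        for i in range(h - 2, -1, -1):
            jj = seam[i + 1]
            lo, hi = max(jj - 1, 0), min(jj + 2, w)
            seam[i] = lo + int(np.argmin(M[i, lo:hi]))
        seams.append(seam)
    return seams
```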

Horizontal Carving Examples 

For each example, we have the original image above the carved one.

Vertical Carving Examples 

Combined Carving Examples 

Overall, I think the algorithm did a decent job. However, there are some artifacts due to the complexity of the edges in the images I selected. 

Fast Carving Examples

Failure Cases 

The algorithm is sensitive to complex edges and color variation. As a result, images with simple, easy-to-identify backgrounds perform better. Here are some failure cases.

What I learned

The most important thing I learned from this project is how much the choice of input image affects the quality of the output for these kinds of algorithms. I also learned that different techniques and approaches are necessary to perform better and faster across a larger range of images.

 

CS194 Final Project 2: Augmented Reality

Setup

For this project, I used an old shoebox and created a regular pattern in order to make marking points easier. 

Tracking Keypoints

To track the marked points as the frames change, I used the MedianFlow tracker from the cv2 library, initializing one tracker per point with a bounding box around that point in the first frame. On each successive frame, we update the tracker for each point. The tracking is fairly accurate except for two points that are lost toward the end of the video, but this should be fine since the rest of the points stay on target.
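
A rough sketch of the per-point tracking, assuming the frames are already loaded and that the box size around each point is a tunable choice of mine rather than a value from the project:

```python
import cv2

def track_points(frames, points, box_size=30):
    """Track each marked point with one MedianFlow tracker per point,
    initialized from a small box around the point in the first frame."""
    # The tracker moved under cv2.legacy in newer opencv-contrib builds.
    create = (cv2.legacy.TrackerMedianFlow_create
              if hasattr(cv2, "legacy") else cv2.TrackerMedianFlow_create)
    trackers = []
    for (x, y) in points:
        t = create()
        t.init(frames[0], (int(x - box_size // 2), int(y - box_size // 2),
                           box_size, box_size))
        trackers.append(t)

    tracked = [list(points)]          # per-frame list of (x, y) positions
    for frame in frames[1:]:
        positions = []
        for t, prev in zip(trackers, tracked[-1]):
            ok, (bx, by, bw, bh) = t.update(frame)
            # Fall back to the previous location if the tracker loses the point.
            positions.append((bx + bw / 2, by + bh / 2) if ok else prev)
        tracked.append(positions)
    return tracked
```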

Calibrating the Camera

To project 3D coordinates into 2D image coordinates, we need a projection matrix that maps 3D points on the box onto the 2D image in each video frame. We compute this matrix from our tracked points and their known world coordinates, using least squares to find the best estimate of the projection matrix for each frame.
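
A minimal sketch of the per-frame least-squares estimate, fixing the bottom-right entry of the 3x4 matrix to 1 so the remaining eleven entries can be solved directly (the function name is illustrative):

```python
import numpy as np

def compute_projection(world_pts, image_pts):
    """Least-squares estimate of the 3x4 projection matrix P that maps
    homogeneous 3D world points to 2D image points, fixing P[2, 3] = 1.
    world_pts: (N, 3), image_pts: (N, 2), with N >= 6 correspondences."""
    A, b = [], []
    for (X, Y, Z), (u, v) in zip(world_pts, image_pts):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z]); b.append(u)
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z]); b.append(v)
    p, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return np.append(p, 1.0).reshape(3, 4)
```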

After calculating these matrices, we apply them to the 3D world coordinates of a cube to project it onto each frame of the video. These are the results!
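
Projecting the cube amounts to applying each frame's matrix to the cube's homogeneous corner coordinates and dehomogenizing, roughly like this:

```python
import numpy as np

def project_points(P, world_pts):
    """Apply a 3x4 projection matrix to (N, 3) world points and
    dehomogenize to (N, 2) pixel coordinates."""
    homog = np.hstack([np.asarray(world_pts, float), np.ones((len(world_pts), 1))])
    proj = homog @ P.T               # (N, 3) homogeneous image coordinates
    return proj[:, :2] / proj[:, 2:3]
```

Drawing the cube on a frame is then just a matter of connecting the projected corners along the cube's edges, for example with cv2.line.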

Conclusion

This was a super interesting project, and it goes to show how AR can be simple to implement but difficult to perfect.