CS 194-26 Project 4A

Jaiveer Singh

Part A

Shoot and digitize pictures

The first step of this project was to produce several sets of input images. First, I took a few individual images to be rectified as "unit tests":

I also took sets of images for mosaicing, with high overlap between them (~70%):

Recover homographies

To recover the homographies, the key insight was to rewrite the standard homography equation, p' = H p, in the classic Ax = b form.

This is accomplished by writing out the complete system of equations represented by the p' = H p equation:

... producing the following matrix with many zeroes that encapsulates the same information:
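The original figures are not reproduced in this text. As a sketch of the standard derivation, assuming a single correspondence (x, y) -> (x', y'), the entries of H named a through h (names chosen here for illustration), and the bottom-right entry fixed to 1, the system and its matrix form can be written as:

```latex
\[
\begin{aligned}
w x' &= a x + b y + c \\
w y' &= d x + e y + f \\
w    &= g x + h y + 1
\end{aligned}
\quad\Longrightarrow\quad
\begin{bmatrix}
x & y & 1 & 0 & 0 & 0 & -x x' & -y x' \\
0 & 0 & 0 & x & y & 1 & -x y' & -y y'
\end{bmatrix}
\begin{bmatrix} a \\ b \\ c \\ d \\ e \\ f \\ g \\ h \end{bmatrix}
=
\begin{bmatrix} x' \\ y' \end{bmatrix}
\]
```

Each correspondence contributes two such rows, so four correspondences determine the 8 unknowns and additional correspondences make the system overdetermined.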

This is now in the form Ax = b, the standard least-squares problem. Solving for the vector x with least squares yields the 8 unknown entries of H; appending the fixed 1 and reshaping reconstitutes the full 3x3 H matrix.
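A minimal sketch of this least-squares solve, assuming NumPy and point arrays of shape (n, 2) holding (x, y) coordinates. The function name `compute_h` is a placeholder for illustration, not necessarily what the project code uses.

```python
import numpy as np

def compute_h(src_pts, dst_pts):
    """Estimate H such that dst ~ H @ src, with H[2, 2] fixed to 1."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    n = src.shape[0]
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
        # Two rows of the linear system per correspondence.
        A[2 * i] = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        b[2 * i] = xp
        b[2 * i + 1] = yp
    # Least-squares solution for the 8 unknown entries of H.
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Append the fixed 1 and reshape into the 3x3 matrix.
    return np.append(h, 1.0).reshape(3, 3)
```

With exactly four correspondences the system is square; with more, the least-squares fit averages out small clicking errors in the selected points.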

Warp the images

To verify that this process works, I rectified images of a Post-It note and a whiteboard, which are known to be a "true" square and rectangle, respectively.
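One common way to implement the warp is inverse warping: for every output pixel, map back through the inverse of H and sample the source image. The sketch below assumes NumPy, nearest-neighbor sampling, and an output canvas size supplied by the caller. The write-up does not say which interpolation or canvas logic was actually used, so `warp_image` and its parameters are illustrative only.

```python
import numpy as np

def warp_image(img, H, out_shape):
    """Inverse-warp img by H into a canvas of shape (rows, cols).

    H maps source (x, y) to output (x, y). Pixels that map outside
    the source image are left as zeros.
    """
    rows, cols = out_shape
    ys, xs = np.mgrid[0:rows, 0:cols]
    dst = np.stack([xs.ravel(), ys.ravel(), np.ones(rows * cols)])
    src = np.linalg.inv(H) @ dst
    with np.errstate(divide="ignore", invalid="ignore"):
        sx = src[0] / src[2]
        sy = src[1] / src[2]
        # Keep only finite coordinates that round to a valid source pixel.
        valid = np.isfinite(sx) & np.isfinite(sy)
        valid &= (sx > -0.5) & (sx < img.shape[1] - 0.5)
        valid &= (sy > -0.5) & (sy < img.shape[0] - 0.5)
    sx_i = np.rint(sx[valid]).astype(int)
    sy_i = np.rint(sy[valid]).astype(int)
    out = np.zeros((rows * cols,) + img.shape[2:], dtype=img.dtype)
    out[valid] = img[sy_i, sx_i]
    return out.reshape((rows, cols) + img.shape[2:])
```

For rectification, H would come from matching the four clicked corners of the object to the corners of an axis-aligned square or rectangle.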

Blend images into a mosaic

Finally, combining all of the steps above, I produced 3 mosaics from the image sets shown earlier.

Coolest thing

The coolest thing I learned was how to play with the A channel in RGBA images. I feel like I have a new superpower with being able to composite transparent images now that I can appropriately apply transparency blending.
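The kind of transparency blending described here can be sketched as an alpha-weighted average of warped RGBA layers. This assumes NumPy, float images in [0, 1] that have already been warped onto a shared canvas, and alpha set to 0 outside each image's footprint. The exact blending scheme used in the project is not stated, so treat `alpha_composite` as one plausible approach.

```python
import numpy as np

def alpha_composite(layers):
    """Blend same-shaped RGBA float images by alpha-weighted average."""
    layers = [np.asarray(layer, dtype=float) for layer in layers]
    height, width = layers[0].shape[:2]
    color_sum = np.zeros((height, width, 3))
    weight_sum = np.zeros((height, width))
    for layer in layers:
        alpha = layer[..., 3]
        color_sum += layer[..., :3] * alpha[..., None]
        weight_sum += alpha
    out = np.zeros((height, width, 4))
    covered = weight_sum > 0
    # Normalize color only where at least one layer contributes.
    out[covered, :3] = color_sum[covered] / weight_sum[covered][:, None]
    out[..., 3] = np.clip(weight_sum, 0.0, 1.0)
    return out
```

Tapering alpha toward each image's border (feathering) would soften the seams in the overlap region. That is a design choice layered on top of this sketch.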

Part B

After completing the previous part, it became clear that the slowest step in the process of producing mosaics was choosing matching points between images. Luckily, there exist techniques to perform this feature matching automatically.

First, corners are detected in each image using the standard Harris corner detection algorithm. The intuition is that a corner can be localized unambiguously, whereas a point along an edge is ambiguous as to where exactly on that edge it lies. Points in flat regions, away from any edge, are essentially impossible to match between images.
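To make the intuition concrete, here is a minimal sketch of the Harris response using only NumPy. The box window, the `radius` default, and the constant `k = 0.05` are illustrative assumptions. The project may well have used a Gaussian window or a library routine instead.

```python
import numpy as np

def box_sum(a, radius):
    """Sum each pixel's (2r+1) x (2r+1) neighborhood, zero-padded."""
    padded = np.pad(a, radius)
    out = np.zeros(a.shape, dtype=float)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def harris_response(gray, radius=1, k=0.05):
    """Return det(M) - k * trace(M)^2 for the structure tensor M."""
    gray = np.asarray(gray, dtype=float)
    iy, ix = np.gradient(gray)  # derivatives along rows, then columns
    sxx = box_sum(ix * ix, radius)
    syy = box_sum(iy * iy, radius)
    sxy = box_sum(ix * iy, radius)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2
```

The response is positive where the gradient varies in two directions (corners), negative where it varies in only one (edges), and near zero in flat regions, which mirrors the three cases described above.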

Since Harris detection returns many corners, often clustered together, and since we want points that are spread evenly throughout the image, we can perform Adaptive Non-Maximal Suppression (ANMS) to reduce the number of points. Essentially, each corner is scored by its distance to the nearest significantly stronger corner, and only the corners with the largest such suppression radii are preserved, producing a much better balance across the image.
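A minimal ANMS sketch, assuming NumPy, positive corner strengths, and a robustness constant `c_robust = 0.9` (a commonly cited value, not one stated in this write-up). It builds the full pairwise distance matrix, so it suits a few thousand corners at most.

```python
import numpy as np

def anms(coords, strengths, num_keep, c_robust=0.9):
    """Return indices of the num_keep corners with the largest suppression radii."""
    coords = np.asarray(coords, dtype=float)
    strengths = np.asarray(strengths, dtype=float)
    # Pairwise squared distances between all corners.
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    # stronger[i, j]: corner j is significantly stronger than corner i.
    stronger = strengths[:, None] < c_robust * strengths[None, :]
    dist2[~stronger] = np.inf
    # Squared radius to the nearest significantly stronger corner
    # (infinite for corners that nothing dominates).
    radii2 = dist2.min(axis=1)
    order = np.argsort(-radii2)
    return order[:num_keep]
```

In a case with a strong corner right next to an even stronger one, plus a weak but isolated corner far away, this keeps the isolated corner and drops the crowded one, which is the spreading effect ANMS is meant to achieve.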

Figure panels: Input, Harris, ANMS

After extracting an 8x8 descriptor from a 40x40 pixel patch around each of these ANMS points, we must match these features. First, candidate matches are found by computing the L2 distance between each descriptor in image 1 and each descriptor in image 2. Thresholding with the Lowe ratio heuristic then keeps only matches whose best candidate (their "true love") is much closer than their second-best candidate (their "next best choice") for use in the subsequent calculation.
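The descriptor and matching steps can be sketched as follows, assuming NumPy and a grayscale image. Several details are assumptions rather than facts from the write-up: the 5x5 cell averaging stands in for blur-then-subsample, the bias/gain normalization is borrowed from common practice, and the ratio threshold of 0.6 is a hypothetical default.

```python
import numpy as np

def extract_descriptor(gray, row, col, patch=40, size=8):
    """Return a flattened size x size descriptor, or None near the border."""
    half = patch // 2
    r0, c0 = row - half, col - half
    if r0 < 0 or c0 < 0 or r0 + patch > gray.shape[0] or c0 + patch > gray.shape[1]:
        return None
    window = np.asarray(gray[r0:r0 + patch, c0:c0 + patch], dtype=float)
    step = patch // size
    # Average each step x step cell to shrink 40x40 down to 8x8.
    small = window.reshape(size, step, size, step).mean(axis=(1, 3))
    # Bias/gain normalization (assumed, not stated in the write-up).
    small = (small - small.mean()) / (small.std() + 1e-8)
    return small.ravel()

def match_features(desc1, desc2, ratio=0.6):
    """Return (i, j) index pairs that pass the Lowe ratio test.

    Requires at least two descriptors in desc2.
    """
    desc1 = np.asarray(desc1, dtype=float)
    desc2 = np.asarray(desc2, dtype=float)
    # L2 distance between every descriptor pair.
    dists = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=-1)
    matches = []
    for i, row in enumerate(dists):
        best, second = np.argsort(row)[:2]
        if row[best] < ratio * row[second]:
            matches.append((i, int(best)))
    return matches
```

A lower `ratio` keeps fewer but more reliable matches. Whatever survives is still only a candidate set, since repeated textures can pass the test and still be wrong.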

Finally, the RANSAC algorithm is used to discard any remaining outliers and produce a beautiful, automatic mosaic:
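The RANSAC step can be sketched as below, assuming NumPy and matched point arrays of shape (n, 2). The iteration count, the 2-pixel inlier tolerance, and the helper names are illustrative choices, not values taken from the project.

```python
import numpy as np

def compute_h(src, dst):
    """Least-squares homography with H[2, 2] fixed to 1."""
    rows, rhs = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        rows.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        rhs.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(rows, dtype=float),
                            np.array(rhs, dtype=float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def apply_h(H, pts):
    """Map (n, 2) points through H and divide out the scale factor."""
    pts = np.asarray(pts, dtype=float)
    homog = np.c_[pts, np.ones(len(pts))] @ H.T
    return homog[:, :2] / homog[:, 2:3]

def ransac_homography(src, dst, num_iters=500, tol=2.0, seed=0):
    """Return (H, inlier_mask) using 4-point RANSAC."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(num_iters):
        # Fit an exact homography to a random minimal sample.
        sample = rng.choice(len(src), size=4, replace=False)
        H = compute_h(src[sample], dst[sample])
        with np.errstate(divide="ignore", invalid="ignore"):
            err = np.linalg.norm(apply_h(H, src) - dst, axis=1)
            inliers = err < tol
        # Keep the largest consensus set seen so far.
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all inliers of the best model.
    return compute_h(src[best_inliers], dst[best_inliers]), best_inliers
```

The final refit on the whole consensus set is what makes the result less sensitive to noise in any single four-point sample.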

For the sake of comparison, we can also show mosaics produced using the same input images as the first part's manual selections:

Figure panels: Automatic, Manual

Note that while the automatic and manual outputs for Kitchen (with lots of distinctive features) are closely matched, the automatic output for Hallway (with lots of white wall space) is aligned much more poorly than the manual one. This suggests that image choice is still important, even with automatic feature selection.

Coolest thing

The coolest thing I learned in this part was how to use the RANSAC algorithm to remove outliers probabilistically. I have yet to explore other use cases of this algorithm, but the entire space of "probabilistically correct" algorithms is especially interesting to me.