Project 6, Part B: Feature Matching for Auto-Stitching

Brandon Huang, Fall 2017


Overview

For this part of the project, now that our machinery from Part A for exploiting correspondences to stitch images is working, we would like to automatically find correspondences between the images in a panorama, so that the entire process is automated. The goal is to recover the homographies accurately; once that is done, the code from Part A takes care of the warping and stitching.

Procedure

  1. Detect interest points using the Harris corner detector. Corners are points of high interest because they can be localized in 2D. (Code sketches for steps 1 through 5 follow this list.)
  2. Using the Adaptive Non-Maximal Suppression (ANMS) algorithm, let points with a higher H value (corner response) from the detector suppress nearby points with lower values. This both distributes our interest points nicely across the image and lets us choose exactly how many interest points we keep from each image.
  3. For each interest point, generate a feature descriptor. This is done by subsampling an 8x8 block from a 40x40 patch of the image, using a Gaussian filter for anti-aliasing. The descriptor is bias/gain normalized to zero mean and unit variance.
  4. For two images, match their interest points based on the SSD (squared Euclidean distance) between their feature descriptors. Instead of accepting the nearest neighbor outright, we look at the ratio between the distances to the first and second nearest neighbors. This rests on the assumption that there is at most one correct match, so all other candidates should be comparatively bad.
  5. We now have correspondences, but some of them may be outliers. To filter these out, we use RANSAC to select the group of points with the most agreement on a single homography.
  6. We now have robust correspondences. Reusing the Part A code, stitch the panorama.
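
For step 1, an off-the-shelf Harris implementation works well. Below is a minimal sketch using scikit-image; the library choice and the min_distance setting are my assumptions, not necessarily what the original code used.

    import numpy as np
    from skimage.feature import corner_harris, corner_peaks

    def harris_corners(image):
        """Detect Harris corners in a 2D grayscale image.

        Returns the corner coordinates and their H values.
        """
        h = corner_harris(image)                  # per-pixel corner response H
        coords = corner_peaks(h, min_distance=1)  # local maxima of the response
        strengths = h[coords[:, 0], coords[:, 1]]
        return coords, strengths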
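
For step 2, ANMS assigns each point a suppression radius: the distance to the nearest point that is sufficiently stronger. A minimal sketch, assuming coords and strengths come from the detector above; the robustness constant 0.9 is the value from the MOPS paper.

    import numpy as np

    def anms(coords, strengths, n_keep=250, c_robust=0.9):
        """Keep the n_keep points with the largest suppression radii.

        Point i is suppressed by point j when f_i < c_robust * f_j.
        """
        order = np.argsort(-strengths)            # strongest first
        coords, strengths = coords[order], strengths[order]

        radii = np.full(len(coords), np.inf)      # the strongest point is never suppressed
        for i in range(1, len(coords)):
            suppressors = c_robust * strengths[:i] > strengths[i]
            if suppressors.any():
                d2 = np.sum((coords[:i][suppressors] - coords[i]) ** 2, axis=1)
                radii[i] = d2.min()               # squared distance; ranking is unchanged

        keep = np.argsort(-radii)[:n_keep]
        return coords[keep]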
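
For step 3, the descriptor is just a blurred, subsampled, normalized patch. A sketch; the Gaussian sigma and the assumption that all points lie far enough from the border are mine.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def extract_descriptors(image, coords, patch=40, size=8):
        """Axis-aligned 8x8 descriptors from 40x40 patches."""
        step = patch // size                          # subsample every 5th pixel
        blurred = gaussian_filter(image, sigma=step)  # anti-alias before subsampling (sigma is a guess)
        half = patch // 2
        descriptors = []
        for r, c in coords:                           # assumes points are >= half pixels from the border
            window = blurred[r - half:r + half:step, c - half:c + half:step]
            window = (window - window.mean()) / window.std()  # bias/gain normalization
            descriptors.append(window.ravel())
        return np.array(descriptors)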
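
Step 4 is Lowe's ratio test on SSD distances. A sketch; the 0.6 threshold is illustrative, and in practice it is tuned by looking at the distribution of ratios.

    import numpy as np

    def match_features(desc1, desc2, ratio_thresh=0.6):
        """Match descriptors whose 1-NN is much closer than their 2-NN."""
        # Pairwise SSD distances, shape (len(desc1), len(desc2)).
        d2 = np.sum((desc1[:, None, :] - desc2[None, :, :]) ** 2, axis=2)
        matches = []
        for i, row in enumerate(d2):
            nn1, nn2 = np.argsort(row)[:2]
            if row[nn1] / row[nn2] < ratio_thresh:    # best match clearly better than runner-up
                matches.append((i, nn1))
        return matches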
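
Step 5's RANSAC loop, assuming a compute_homography(src, dst) helper from Part A that fits a homography to four or more point pairs; the iteration count and inlier threshold are illustrative.

    import numpy as np

    def ransac_homography(pts1, pts2, n_iters=1000, eps=2.0):
        """Find the largest set of matches consistent with one homography.

        compute_homography is assumed to exist from Part A.
        """
        best_inliers = np.zeros(len(pts1), dtype=bool)
        pts1_h = np.hstack([pts1, np.ones((len(pts1), 1))])  # homogeneous coordinates
        for _ in range(n_iters):
            sample = np.random.choice(len(pts1), 4, replace=False)
            H = compute_homography(pts1[sample], pts2[sample])
            proj = pts1_h @ H.T                       # project pts1 through H
            proj = proj[:, :2] / proj[:, 2:3]
            inliers = np.linalg.norm(proj - pts2, axis=1) < eps
            if inliers.sum() > best_inliers.sum():
                best_inliers = inliers
        # Re-fit on all inliers for the final estimate.
        return compute_homography(pts1[best_inliers], pts2[best_inliers]), best_inliers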

Procedure breakdown

Harris corners overlaid on one of the images. Many points cluster near the boundaries of prominent features.
Harris corners filtered by ANMS. In this case, we reduced thousands of interest points to just 250.
Left-image correspondences found by feature description, nearest-neighbor matching, and RANSAC.
The corresponding right image points.

Auto-stitching Results

Final results from the examples given above.
Manual results from Part A. Automatic alignment did better!
An attempt to combine far too many images. The ones at the edges seemed to be poorly aligned while the ones in the middle were fine; since the homographies were computed pairwise, this doesn't really make sense. Maybe the staircase in the middle provides more good correspondences than the generic trees to the sides.
Manual results from Part A. Not quite as big, because I hadn't figured out how to automate the stitching at that point.

Bells and Whistles

Automatic panorama detection

The panoramas above were generated by specifying the order of the constituent images from left to right. I also wrote a simple algorithm to detect and order a panorama from a bunch of candidate images. I first compute correspondences between every pair of images and count the number of correspondences generated, i.e., the number of inliers from the RANSAC algorithm. This is a good metric of how likely it is that two images are part of the same panorama. I then pick a random starting image and, going both left and right, repeatedly add the image that has the highest number of inlier correspondences with the current left or right fringe of the panorama (sketched below). Once I find a good ordering, I apply the same method as above to get the panorama. Currently this only works for panoramas that form a linear sequence; the algorithm would need some modifications to handle panoramas with tree-like or more general graph structure.
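
A sketch of the greedy ordering; count_inliers is a hypothetical helper that runs the full matching-plus-RANSAC pipeline on a pair of images and returns the inlier count.

    import numpy as np

    def order_panorama(images, count_inliers):
        """Greedily order images into a linear panorama sequence."""
        n = len(images)
        scores = np.zeros((n, n))                     # pairwise inlier counts
        for i in range(n):
            for j in range(i + 1, n):
                scores[i, j] = scores[j, i] = count_inliers(images[i], images[j])

        remaining = set(range(n))
        order = [remaining.pop()]                     # arbitrary starting image
        while remaining:
            # Extend whichever end of the sequence has the stronger candidate.
            left = max(remaining, key=lambda j: scores[order[0], j])
            right = max(remaining, key=lambda j: scores[order[-1], j])
            if scores[order[0], left] >= scores[order[-1], right]:
                order.insert(0, left)
                remaining.remove(left)
            else:
                order.append(right)
                remaining.remove(right)
        return order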

An automatically detected panorama, giving similar results to the manually ordered, auto-stitched ones.
Manual results from Part A.

What's the most interesting thing I learned from this project?

At first, just using "corners" (whatever that means) as features seems a bit unintuitive. It's a bit magical how they turn out to be the exact right thing to look for in an image. If we think about it like a dimensionality reduction, though, it makes perfect sense: if we want to localize something in a 2D world (i.e. a picture), using a 2D surface gets us nowhere. Using an edge (a 1D object) gets us halfway there as we limit one degree of freedom. Using a corner, which is the intersection of two edges, gets us all the way there because we limit both degrees of freedom in the 2D plane.