Project 6, Part B: Feature Matching for Auto-Stitching
Brandon Huang, Fall 2017
Overview
For this part of the project, now that our machinery from Part A can exploit correspondences to stitch images, we would like to find those correspondences between the images in a panorama automatically, so the entire process requires no manual input. The goal is to recover the homographies accurately; once this is done, the code from Part A takes care of the warping and stitching.
Procedure
Detect interest points using the Harris corner detector. Corners make good interest points because they can be localized precisely in 2D.
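As a rough sketch of what the corner detector computes (the exact implementation details, such as the choice of Gaussian widths and the corner-strength formula, are assumptions here), the Harris response comes from the smoothed structure tensor of the image gradients:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(im, sigma=1.0):
    # Image gradients via Gaussian derivative filters
    Ix = gaussian_filter(im, sigma, order=(0, 1))
    Iy = gaussian_filter(im, sigma, order=(1, 0))
    # Structure tensor entries, smoothed over a local window
    Sxx = gaussian_filter(Ix * Ix, 2 * sigma)
    Syy = gaussian_filter(Iy * Iy, 2 * sigma)
    Sxy = gaussian_filter(Ix * Iy, 2 * sigma)
    # Corner strength: large when the gradient varies in both directions
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det / (trace + 1e-8)
```

Flat regions and edges both score near zero here (an edge has gradient variation in only one direction, so the determinant vanishes); only corners score high.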
Using the Adaptive Non-Maximal Suppression (ANMS) algorithm, have points of higher H value (corner strength) from the corner detection suppress nearby points of lower value. This both distributes our interest points nicely across the image, and lets us choose exactly how many interest points we keep from each image.
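A minimal sketch of ANMS (the robustness constant and its exact use are assumptions): each point gets a suppression radius equal to the distance to the nearest point that is sufficiently stronger, and we keep the points with the largest radii.

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    # coords: (N, 2) point locations; strengths: (N,) Harris responses.
    # Sort by strength so every point that can suppress point i comes first.
    order = np.argsort(-strengths)
    coords, strengths = coords[order], strengths[order]
    radii = np.full(len(coords), np.inf)  # strongest point is never suppressed
    for i in range(1, len(coords)):
        # Points "sufficiently stronger" than i under the robustness factor
        stronger = strengths[:i] * c_robust > strengths[i]
        if stronger.any():
            d = np.linalg.norm(coords[:i][stronger] - coords[i], axis=1)
            radii[i] = d.min()
    keep = np.argsort(-radii)[:n_keep]
    return coords[keep]
```

Keeping the largest radii means a weak corner survives only if no stronger corner is nearby, which is exactly the spatial spreading effect described above.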
For each interest point, generate a feature descriptor. This is done by subsampling an 8x8 descriptor from a 40x40 patch of the image around the point, using a Gaussian filter for anti-aliasing. The descriptor is then bias/gain normalized to zero mean and unit standard deviation.
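The descriptor step can be sketched as follows; the blur width and the exact subsampling grid are assumptions, and the helper name is my own:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mops_descriptor(im, y, x, patch=40, out=8):
    # Axis-aligned 40x40 window around (y, x)
    h = patch // 2
    window = im[y - h:y + h, x - h:x + h]
    # Blur before subsampling to avoid aliasing
    blurred = gaussian_filter(window, sigma=patch / out / 2)
    # Take one sample per 5x5 cell -> an 8x8 descriptor
    step = patch // out
    desc = blurred[step // 2::step, step // 2::step][:out, :out]
    # Bias/gain normalization: zero mean, unit standard deviation
    desc = desc - desc.mean()
    return (desc / (desc.std() + 1e-8)).ravel()
```

The normalization makes the descriptor invariant to additive brightness and multiplicative contrast changes between images, which matters when exposure differs across a panorama.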
For two images, match their interest points based on the sum of squared differences (SSD) between their feature descriptors. Instead of accepting the nearest neighbor outright, we look at the ratio between the first and second nearest-neighbor distances. This is based on the assumption that there is at most one correct match, so all other candidates should be comparatively bad.
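The ratio test can be sketched like this (the threshold value is an assumption; it is typically tuned empirically):

```python
import numpy as np

def ratio_match(desc1, desc2, thresh=0.6):
    # desc1: (N, D) and desc2: (M, D) descriptor arrays.
    # SSD between every pair of descriptors
    d2 = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(-1)
    matches = []
    for i in range(len(desc1)):
        nn = np.argsort(d2[i])
        first, second = d2[i, nn[0]], d2[i, nn[1]]
        # Accept only if the best match is much better than the runner-up
        if first / (second + 1e-12) < thresh:
            matches.append((i, nn[0]))
    return matches
```

A point whose two best candidates are similarly close is ambiguous and gets discarded, which removes many bad matches before RANSAC ever sees them.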
We now have correspondences, but some of these may be outliers. To filter outliers, we use RANSAC to select the group of points with the most agreement on a homography.
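A minimal RANSAC sketch under the usual assumptions (a 4-point DLT homography fit as in Part A; the iteration count and inlier threshold are illustrative choices):

```python
import numpy as np

def fit_homography(p1, p2):
    # Direct linear transform on 4 point pairs (as in Part A)
    A = []
    for (x, y), (u, v) in zip(p1, p2):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.array(A))
    return Vt[-1].reshape(3, 3)  # null vector of A, up to scale

def ransac_homography(pts1, pts2, n_iters=1000, eps=2.0):
    best_inliers = np.zeros(len(pts1), dtype=bool)
    rng = np.random.default_rng(0)
    for _ in range(n_iters):
        # Fit an exact homography to a random minimal sample
        idx = rng.choice(len(pts1), 4, replace=False)
        H = fit_homography(pts1[idx], pts2[idx])
        # Project all of pts1 through H and measure reprojection error
        p = np.hstack([pts1, np.ones((len(pts1), 1))]) @ H.T
        proj = p[:, :2] / p[:, 2:3]
        err = np.linalg.norm(proj - pts2, axis=1)
        inliers = err < eps
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers
```

The largest consensus set wins; the final homography is then typically re-estimated by least squares over all of its inliers.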
We now have robust correspondences. Reusing the Part A code, stitch the panorama.
Procedure breakdown
Auto-stitching Results
Bells and Whistles
Automatic panorama detection
The panoramas above were generated by specifying the order of the constituent images from left to right. I also wrote a simple algorithm to generate a panorama from an unordered collection of candidate images. I first compute correspondences between every pair of images and count the number of correspondences generated, i.e., the number of inliers from the RANSAC algorithm. This is a good metric of how likely it is that two images belong to the same panorama. I then pick a random starting image and, growing both left and right, repeatedly add the image that has the highest number of inlier correspondences with the current left or right fringe of the panorama. Once I have a good ordering, I apply the same method as above to produce the panorama. Currently this only works for panoramas that form a linear sequence; the algorithm would need some modifications to handle panoramas with a tree-like or more general graph structure.
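The greedy ordering step can be sketched as follows, assuming the pairwise inlier counts have already been computed (the function name and tie-breaking are my own choices):

```python
def order_panorama(inlier_counts, start):
    # inlier_counts[i][j]: RANSAC inlier count between images i and j
    n = len(inlier_counts)
    order = [start]
    remaining = set(range(n)) - {start}
    while remaining:
        # Score each candidate by its best connection to either fringe
        best = max(remaining,
                   key=lambda k: max(inlier_counts[order[0]][k],
                                     inlier_counts[order[-1]][k]))
        remaining.remove(best)
        # Attach it to whichever end it matches better
        if inlier_counts[order[-1]][best] >= inlier_counts[order[0]][best]:
            order.append(best)
        else:
            order.insert(0, best)
    return order
```

Because it only ever extends the two ends of a chain, this sketch shares the limitation noted above: it recovers linear sequences but not tree-like panorama layouts.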
What's the most interesting thing I learned out of this project?
At first, just using "corners" (whatever that means) as features seems a bit unintuitive. It's a bit magical how they turn out to be the exact right thing to look for in an image. If we think about it like a dimensionality reduction, though, it makes perfect sense: if we want to localize something in a 2D world (i.e. a picture), using a 2D surface gets us nowhere. Using an edge (a 1D object) gets us halfway there as we limit one degree of freedom. Using a corner, which is the intersection of two edges, gets us all the way there because we limit both degrees of freedom in the 2D plane.