CS194-26 Project 6

Image Warping, Mosaicing, and Feature Matching for Auto-Stitching

Alex Fang

Part A

Introduction

In this part of the project, we create panoramas from multiple photos by using homography transformations to warp the images in a collection to a single common perspective, and then blending the images together into a panorama. Most of the images here I took myself on UC Berkeley's campus.

Recover Homographies

To obtain the homography matrix, we manually label corresponding points between images, and then use least squares to calculate the entries of the matrix. Each point correspondence contributes two linear equations, so we can set the transformation up as an overdetermined linear system and solve for the eight unknowns a through h, as the last entry i is already known to be 1.
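A minimal sketch of this setup in numpy, assuming point arrays of shape (N, 2) in (x, y) order (the function name compute_homography is illustrative, not the project's actual code):

    import numpy as np

    def compute_homography(pts1, pts2):
        """Least-squares homography H mapping pts1 -> pts2.

        Each correspondence (x, y) -> (x', y') gives two rows of the
        system A h = b, derived from x' = (ax + by + c) / (gx + hy + 1)
        and y' = (dx + ey + f) / (gx + hy + 1).
        """
        A, b = [], []
        for (x, y), (xp, yp) in zip(pts1, pts2):
            A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
            A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
            b.extend([xp, yp])
        h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
        return np.append(h, 1).reshape(3, 3)  # i is fixed to 1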

Image rectification

For image rectification, we identify the set of points we want transformed, and by transforming the four corner points we know the range of points that our image will map to. Knowing that, we use the inverse of the homography matrix to perform inverse warping: each output pixel is sampled from the location that H^-1 sends it to in the source image. Below is an example of image rectification.
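A sketch of the inverse warp, assuming the output canvas size has already been computed from the warped corners (function and parameter names are illustrative):

    import numpy as np
    from scipy.ndimage import map_coordinates

    def warp_image(im, H, out_shape):
        """Inverse-warp a color image im into a canvas of out_shape.

        For every output pixel, H^-1 gives the source location to
        sample from; map_coordinates does bilinear interpolation.
        """
        H_inv = np.linalg.inv(H)
        ys, xs = np.indices(out_shape[:2])
        pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
        src = H_inv @ pts
        src = src[:2] / src[2]           # dehomogenize to (x, y)
        out = np.zeros(out_shape)
        for ch in range(im.shape[2]):    # sample each color channel
            out[..., ch] = map_coordinates(
                im[..., ch], [src[1], src[0]], order=1, cval=0
            ).reshape(out_shape[:2])
        return out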

Blending Images into a Mosaic

Using the previous parts, we combine multiple images to form a mosaic. To get rid of the artifacts that come from overlapping photos, we blend them with weighted averaging: each image gets a mask whose value at a pixel depends on how close the pixel is to that image's center, and the masks weight the blend (center weighting). We can also use Laplacian pyramids on top of this as well.
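One way to build such center-weighted masks is a distance transform, which peaks in an image's interior and falls to zero at its boundary. This is a sketch under the assumption that both images have already been warped onto the same canvas, with black (zero) outside their extents; it is a stand-in for the masking scheme, not the project's exact code:

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def blend_pair(im1, im2):
        """Center-weighted average of two aligned images on one canvas.

        Each weight map is the distance to the image's own boundary, a
        2-D bump that peaks in the interior, so seams fade in both the
        horizontal and vertical directions.
        """
        w1 = distance_transform_edt(np.any(im1 > 0, axis=2))
        w2 = distance_transform_edt(np.any(im2 > 0, axis=2))
        total = w1 + w2
        total[total == 0] = 1            # avoid 0/0 outside both images
        alpha = (w1 / total)[..., None]
        return alpha * im1 + (1 - alpha) * im2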

What I've Learned

The most important thing I learned from this part of the project is that blending requires choosing the right approach, which in this case meant a 2D mask in order to get rid of both horizontal and vertical artifacts. The coolest thing I learned is how homographies can be used to get interesting views of a certain part of an image, such as the floor tiling presented in lecture.

Part B

Harris Interest Point Detector

We use the Harris interest point detector to automatically find feature points in our images. It works by checking the gradients at each point of the image: if both the gradient with respect to x and the gradient with respect to y are relatively large, the point is likely a corner. More specifically, the corner strength at a location is the determinant of the Harris matrix divided by its trace, where the Harris matrix is the smoothed outer product of the gradients at that position. This process can be enhanced by running it at multiple levels of a Gaussian pyramid. Below are images overlaid with the detected Harris corners.
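A sketch using skimage, whose corner_harris with method='eps' computes essentially this det/trace response from the smoothed gradient outer products; the min_distance value is an assumption for thinning immediately adjacent peaks:

    import numpy as np
    from skimage.feature import corner_harris, peak_local_max

    def get_harris_corners(im_gray, min_dist=1):
        """Corner response map plus (row, col) peak locations.

        corner_harris(method='eps') uses the det/trace-style response
        described above, built from the smoothed outer product of the
        image gradients.
        """
        strength = corner_harris(im_gray, method='eps', sigma=1)
        coords = peak_local_max(strength, min_distance=min_dist)
        return strength, coords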

Adaptive Non-Maximal Suppression

To reduce the number of Harris corners, we run adaptive non-maximal suppression (ANMS). This algorithm keeps the n points that have the largest minimal suppression radius, which distributes the selected points evenly across the image. The minimal suppression radius of a point is the smallest 2-norm distance to another Harris corner whose corner strength, multiplied by a robustness constant (we use c = 0.9), still exceeds our point's corner strength. Notice that the point with the strongest corner strength can never be suppressed, and thus has a minimal suppression radius of infinity. Below are images overlaid with the Harris corners remaining after ANMS.
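A straightforward O(N^2) sketch of ANMS, with n and the c = 0.9 constant from above as defaults (names are illustrative):

    import numpy as np

    def anms(coords, strengths, n=500, c=0.9):
        """Adaptive non-maximal suppression.

        coords: (N, 2) corner locations; strengths: (N,) responses.
        Keeps the n points with the largest suppression radii.
        """
        N = len(coords)
        radii = np.full(N, np.inf)
        for i in range(N):
            # points strong enough to suppress point i
            stronger = strengths[i] < c * strengths
            if np.any(stronger):
                d = np.linalg.norm(coords[stronger] - coords[i], axis=1)
                radii[i] = d.min()
        keep = np.argsort(-radii)[:n]   # n largest radii
        return coords[keep]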

Feature Descriptor Extraction and Feature Matching

For each point we have left, we perform feature descriptor extraction by taking a 40x40 patch around the point and then moving to a lower frequency by blurring and subsampling down to an 8x8 patch. It is important to then normalize the patch to 0 mean and 1 standard deviation to make the features invariant to intensity. Using these features, we then try to find matches with the features in the image we are trying to stitch, by computing the SSD between every pair of patches from the two images. Theoretically, the best match for a feature is its nearest neighbor, which minimizes the SSD. But incorrect matches are possible, so we require the ratio of the 1-NN SSD to the 2-NN SSD to fall below a threshold (we use 0.5 here); this gives better results because for a correct match, the SSD to the 1-NN should be much smaller than the SSD to the 2-NN. Below is a pair of images overlaid with the remaining feature pairs after thresholding.
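A sketch of both steps, assuming grayscale float images and corner coordinates in (row, col) order; the blur sigma and the border handling are assumptions:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def extract_descriptors(im_gray, coords, patch=40, step=5):
        """40x40 window -> blur -> subsample to 8x8 -> normalize.

        Corners too close to the border for a full window are skipped.
        """
        blurred = gaussian_filter(im_gray, sigma=2)  # assumed anti-alias blur
        half = patch // 2
        kept, descs = [], []
        for r, c in coords:
            window = blurred[r - half:r + half:step, c - half:c + half:step]
            if window.shape != (8, 8):
                continue                             # window ran off the image
            window = (window - window.mean()) / (window.std() + 1e-8)
            kept.append((r, c))
            descs.append(window.ravel())
        return np.array(kept), np.array(descs)

    def match_descriptors(d1, d2, ratio=0.5):
        """Match by SSD with the 1-NN/2-NN ratio test."""
        ssd = ((d1[:, None, :] - d2[None, :, :]) ** 2).sum(axis=-1)
        order = np.argsort(ssd, axis=1)
        matches = []
        for i in range(len(d1)):
            j1, j2 = order[i, 0], order[i, 1]
            if ssd[i, j1] < ratio * ssd[i, j2]:      # ratio test
                matches.append((i, j1))
        return matches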

RANSAC

To compute homography matrices robustly, we use 4-point RANSAC. Each iteration takes four random pairs from the list of matched points and computes an exact homography, then counts how many of all the matched points land within a certain epsilon of their projected locations under that homography, calling them inliers. After a fixed number of iterations, we pick the iteration that had the most inliers and fit a final homography to its inliers with least squares. Below are the results of using this H to blend images, compared to using the manually defined correspondences from Part A.
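A sketch of this loop, reusing the compute_homography function from the homography section; n_iters and eps are assumed values:

    import numpy as np

    def ransac_homography(pts1, pts2, n_iters=1000, eps=2.0):
        """4-point RANSAC: keep the homography with the most inliers.

        pts1, pts2: (N, 2) matched (x, y) points. The final H is fit
        by least squares on the inliers of the best random sample.
        """
        N = len(pts1)
        pts1_h = np.hstack([pts1, np.ones((N, 1))]).T  # 3 x N homogeneous
        best_inliers = np.array([], dtype=int)
        for _ in range(n_iters):
            idx = np.random.choice(N, 4, replace=False)
            H = compute_homography(pts1[idx], pts2[idx])  # exact on 4 pairs
            proj = H @ pts1_h
            proj = (proj[:2] / proj[2]).T                 # dehomogenize
            err = np.linalg.norm(proj - pts2, axis=1)
            inliers = np.where(err < eps)[0]
            if len(inliers) > len(best_inliers):
                best_inliers = inliers
        return compute_homography(pts1[best_inliers], pts2[best_inliers])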

What I've Learned

In this part of the project, the coolest thing I learned is how algorithms were layered on top of each other to improve results. ANMS and Multi-Scale Oriented Patches both build on Harris corners, refining the initial detections into good features, which are then matched to automatically stitch together a panorama.