Image Warping and Mosaicing

Kevin Lin, klinime@berkeley.edu

Introduction

This is part A of the image mosaicing project, primarily focused on stitching photos taken at the same location into a seamless mosaic. We choose a base image and warp the others onto the base image's plane via appropriate homographies computed from human-labeled keypoints. Part B of the project will automate keypoint labeling via feature detectors and feature descriptors.

Shoot the Pictures

The following are photos of my room that I will be stitching, with the middle as the base image:
[Images: left, middle (base), and right photos of my room]
Taken with an iPhone 11 without adjusting any fancy settings because I have no camera 😭.

Recover Homographies

We compute the homographies by least squares on a set of corresponding keypoints. The keypoints:
[Images: hand-labeled keypoints on the three photos]

Some tricks for selecting good keypoints include picking distinctive corners that are clearly identifiable in both images and spreading the points across the overlapping region.

From the keypoints, we solve for the homography:

$$\begin{bmatrix} w x' \\ w y' \\ w \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
We want to perform least squares on the entries of the homography matrix, so we need to rewrite the equations. Taking advantage of $w = gx + hy + 1$, we can derive:

$$\begin{bmatrix} x & y & 1 & 0 & 0 & 0 & -x x' & -y x' \\ 0 & 0 & 0 & x & y & 1 & -x y' & -y y' \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \\ e \\ f \\ g \\ h \end{bmatrix} = \begin{bmatrix} x' \\ y' \end{bmatrix}$$
which is now in a form we can solve with least squares, stacking these two rows for each of the n correspondences (n ≥ 4) into a 2n×8 system.
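As a sanity check on the derivation, here is a minimal least-squares solver for this system — a sketch in NumPy, where compute_homography is just an illustrative name and src, dst are assumed to be (n, 2) arrays of corresponding (x, y) points:

```python
import numpy as np

def compute_homography(src, dst):
    """Fit H = [[a, b, c], [d, e, f], [g, h, 1]] by least squares.
    src, dst: (n, 2) arrays of corresponding (x, y) points, n >= 4."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # Two rows per correspondence, exactly as derived above.
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)  # bottom-right entry fixed to 1
```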

Image Rectification

With the homographies computed, we simply need to warp the pixels with the transformation. In general, we want to first define the pixel positions in our target image and then interpolate the inverse-warped pixel values. Since a homography preserves the convexity of our image, we can forward-warp the corners (i.e. the rectangle) to the target frame (now a convex polygon), then define the pixels within that polygon as our targets and inverse-warp to sample them.
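Here is a minimal sketch of this inverse-warping procedure — warp_image is an illustrative name, and it assumes H maps source (x, y) coordinates into the target canvas and img is a float (H, W, 3) array:

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

def warp_image(img, H, out_shape):
    """Inverse-warp img onto a target canvas of shape out_shape,
    where H maps source (x, y) coordinates into the target frame."""
    h_out, w_out = out_shape
    # Define the target pixel grid first (x = column, y = row) ...
    xs, ys = np.meshgrid(np.arange(w_out), np.arange(h_out))
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    # ... then inverse-warp it back into the source image.
    src = np.linalg.inv(H) @ pts
    src = src[:2] / src[2]  # dehomogenize
    # Bilinear interpolation of source pixel values; out-of-bounds -> 0.
    interp = RegularGridInterpolator(
        (np.arange(img.shape[0]), np.arange(img.shape[1])), img,
        bounds_error=False, fill_value=0)
    out = interp(np.stack([src[1], src[0]], axis=-1))  # (row, col) order
    return out.reshape(h_out, w_out, -1)
```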
Results of rectifying:

[Images: two rectified examples]

Blending

Before blending, I first padded the rectified images so that (1) they all have the same dimensions and (2) corresponding keypoints land at the same positions. We also have the exact mask corresponding to each warped image, but since the masks are not mutually exclusive (indeed they cannot be, or there would be no overlap containing corresponding keypoints), we cannot blend directly in the same fashion as the “Oraple” from our previous project.
Instead, we define the seam between two neighboring images as the pixels that are equidistant from the two images’ edges, and we update the masks according to the seam so that they become mutually exclusive. Then we can blend with a two-level Laplacian stack, i.e. “two-band blending”: the low-frequency band is mixed smoothly across the seam, while the high-frequency band is taken from whichever image owns each pixel.
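Here is a minimal sketch of the two-band blend — two_band_blend is an illustrative name; it assumes mask is the mutually exclusive seam mask (True where the first image wins) and sigma roughly sets where the two bands split:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def two_band_blend(im1, im2, mask, sigma=20):
    """Low frequencies are mixed with a feathered mask; high
    frequencies are picked hard by the mutually exclusive seam mask.
    im1, im2: (H, W, 3) float arrays; mask: (H, W) bool, True -> im1."""
    blur = lambda im: gaussian_filter(im, sigma=(sigma, sigma, 0))
    low1, low2 = blur(im1), blur(im2)
    soft = gaussian_filter(mask.astype(float), sigma=sigma)[..., None]
    low = soft * low1 + (1.0 - soft) * low2                   # smooth low band
    high = np.where(mask[..., None], im1 - low1, im2 - low2)  # hard high band
    return low + high
```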
The stitched mosaic:

[Image: the stitched mosaic]

Conclusion

I was awed by the mosaic at first, but then small hiccups caught my eye, and I could not find the right set of keypoints to eliminate them. I hope part B can take care of that for me so I don’t need to deal with my butterfingers 😣. The “aha moment” when I figured out how to rewrite the projective transformation was very satisfying though 😊.

Feature Matching for Autostitching

Introduction

This is part B of the image mosaicing project, which is focused on automating keypoint matching via feature detectors and feature descriptors. At a high level, corners are detected automatically, image patches around the corners are extracted as feature descriptors, and corners with sufficiently similar descriptors are matched.

Multi-Image Matching using Multi-Scale Oriented Patches

There are six steps to the matching algorithm:

  1. Extract corners with the Harris Corner Detector at x different scales (default x=5).
  2. Perform Adaptive Non-Maximal Suppression (ANMS), which assigns each corner a suppression radius based on its Harris value, and select the top k (default k=500); see the sketch after this list.
  3. Compute the average angle of the 40x40 image patch centered on each corner via arctan2(dy, dx), where dy and dx are the average values of the patch convolved with the y and x Sobel operators.
  4. Rotate each image patch by its angle and extract an 8x8 feature descriptor by downscaling the 40x40 patch. Feature descriptors are demeaned and standardized.
  5. Match corners by computing nearest neighbors on their feature descriptors, and keep only matches whose 1-NN/2-NN distance ratio is below 0.2.
  6. Perform RANSAC for k iterations (default k=100,000) to find the largest subset of matches that warp to coordinates at most epsilon pixels away (default epsilon=20); a sketch appears below, after the match figures.
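Since step 2 was the hardest for me to pin down, here is a minimal, naive O(n²) sketch of ANMS — coords is assumed to be an (n, 2) array of corner positions, strengths their Harris values, and c_robust = 0.9 is the robustness constant from the MOPS paper:

```python
import numpy as np

def anms(coords, strengths, k=500, c_robust=0.9):
    """Keep the k corners with the largest suppression radii.
    A corner's radius is its distance to the nearest corner that is
    sufficiently stronger (strengths[i] < c_robust * strengths[j]),
    which spreads the surviving corners evenly over the image."""
    n = len(coords)
    radii = np.full(n, np.inf)
    for i in range(n):
        stronger = strengths[i] < c_robust * strengths  # corners dominating i
        if stronger.any():
            d2 = np.sum((coords[stronger] - coords[i]) ** 2, axis=1)
            radii[i] = d2.min()  # squared distance is fine for ranking
    keep = np.argsort(-radii)[:k]
    return coords[keep]
```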

Bells & Whistles: multiscale and rotation

The predicted keypoints and matches:
[Images: detected keypoints and matches on the three photos]

where the red dots are the corners kept after ANMS, the blue dots are the matches that survive the ratio threshold, and the yellow dots are the matches kept after RANSAC. The orange lines show the orientations of the patches.
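For completeness, here is a minimal sketch of the RANSAC step (step 6), reusing the compute_homography sketch from Part A; src and dst are assumed to be the (n, 2) arrays of ratio-thresholded matches:

```python
import numpy as np

def ransac_homography(src, dst, iters=100_000, eps=20.0):
    """4-point RANSAC: fit an exact homography to random 4-point samples
    and keep the largest inlier set, where inliers warp to within eps
    pixels of their matched coordinates. Returns a boolean inlier mask."""
    rng = np.random.default_rng(0)
    src_h = np.hstack([src, np.ones((len(src), 1))])  # homogeneous coords
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        sample = rng.choice(len(src), size=4, replace=False)
        H = compute_homography(src[sample], dst[sample])  # Part A sketch
        proj = src_h @ H.T
        proj = proj[:, :2] / proj[:, 2:3]  # dehomogenize
        inliers = np.linalg.norm(proj - dst, axis=1) < eps
        if inliers.sum() > best.sum():
            best = inliers
    return best
```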

Then, stitching proceeds exactly as in Part A, with the homography computed via least squares on the yellow (RANSAC inlier) keypoints. The result is the following:

[Image: the auto-stitched mosaic]

Why did I choose a different set of images? Because, unfortunately, the automatically detected keypoints did not produce as good a mosaic as hand-labeled keypoints 😢
[Image: auto-stitched mosaic of my room]
Hand-labeled:
[Image: hand-labeled mosaic of my room]
I’m fairly sure I moved too much when taking the three photos 😢

Conclusion

Understanding ANMS as described in the paper was a HUGE pain. I think I now understand what Prof. Efros meant by “sometimes research papers are simply not written to be easily read”. However, the results are very cool, especially since this bypasses the whole painful, careful labeling process! This project also introduced me to various keypoint matching algorithms such as SIFT, SURF, and ORB, and to all kinds of tricks for improving their speed and accuracy!