Auto Stitching & Photo Mosaics

COMPSCI 194-26 // Project 5 // Spring 2020

By Naomi Jung // cs194-26-acs

Part 1: Image Warping and Mosaicing

In Part 1, our goal was to make an image mosaic by stitching two overlapping images together. This involved manually choosing overlap points between the two images, recovering the homography between them, warping the images into their rectified versions, and finally blending the two warped images together.

Digitizing Photographs

Below is one pair of initial pictures that I took and digitized for this project so that I could stitch them together into a mosaic.

Image 1

Image 2

Recovering Homographies

After choosing several overlap points between these images, we recovered the homography matrix H between the original points p and new points p', where p' = Hp. To do this, we used least squares to solve Ax = b, where A was a 2n x 8 matrix built from the original coordinates and b was a vector of length 2n consisting of the new points. We then appended a final entry of 1 to the resulting vector x and reshaped it into a 3x3 matrix to recover H.
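As a rough sketch of this setup (assuming the correspondences are supplied as (x, y) NumPy arrays), each point pair contributes two rows to A, and the unknown vector x holds the first eight entries of H:

```python
import numpy as np

def compute_homography(pts1, pts2):
    """Estimate H so that pts2 ~ H * pts1 (in homogeneous coordinates).

    pts1, pts2: (n, 2) arrays of corresponding (x, y) points, n >= 4.
    A sketch of the least-squares setup described above.
    """
    n = len(pts1)
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (xp, yp)) in enumerate(zip(pts1, pts2)):
        A[2 * i]     = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        b[2 * i], b[2 * i + 1] = xp, yp
    h, *_ = np.linalg.lstsq(A, b, rcond=None)   # solve Ax = b in the least-squares sense
    return np.append(h, 1).reshape(3, 3)        # fix the last entry of H to 1
```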

Warping Images

Next, we warped our image using our homography matrix. We used inverse warping to map pixels in the output shape to corresponding pixel values in the original image. We took the dot product of H inverse and p', then normalized by w to recover the original (x, y) coordinates. Finally, we interpolated at these float coordinates to determine the pixel values, which helps avoid aliasing.
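A minimal sketch of this inverse warp for a grayscale image, using scipy.ndimage.map_coordinates for the bilinear lookup (a stand-in for the interp2d-based interpolation described below):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def inverse_warp(img, H, out_shape):
    """Inverse-warp img into an output canvas of shape (out_h, out_w) using H.

    For every output pixel p', compute p = H^-1 p', normalize by w, and
    interpolate img at the resulting float (x, y) coordinates.
    """
    H_inv = np.linalg.inv(H)
    out_h, out_w = out_shape
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(out_h * out_w)])  # 3 x N homogeneous
    src = H_inv @ coords
    src_x = src[0] / src[2]   # normalize by w
    src_y = src[1] / src[2]
    # Bilinear interpolation at float coordinates; pixels mapping outside img become 0
    warped = map_coordinates(img, [src_y, src_x], order=1, cval=0.0)
    return warped.reshape(out_h, out_w)
```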

At this point, we were able to rectify images! By taking an image with planar surfaces and mapping it to a rectangular set of points, we rectified the image so that its plane was frontal-parallel. Below are some of the results of image rectification, with the original image on the left and the rectified image on the right.

Below is a rectified image that highlights the use of interpolation to reduce aliasing effects. We used scipy's interp2d function to interpolate pixel values when the inverse warping mapped to float coordinate values.

Original Image
No Interpolation
With Interpolation
Mosaic Blending

The last step of manual stitching was mosaic blending. After warping the left-hand image into the right-hand image's geometry using the computed homography, we stitched the two images together using alpha blending with a feathered mask, which blends the images by multiplying each pixel of each image by a mask value between 0 and 1.
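A small sketch of this weighted blend, assuming both warped images have already been placed on a common canvas and that the feathered masks (for example, built from a distance transform of each image's footprint) are supplied as float arrays in [0, 1] with shapes that broadcast against the images:

```python
import numpy as np

def feathered_blend(im1, im2, mask1, mask2):
    """Alpha-blend two aligned images using feathered masks.

    im1, im2: float images on a common canvas; mask1, mask2: per-pixel
    weights in [0, 1] that fall off toward each image's boundary.
    """
    weight = mask1 + mask2
    weight = np.where(weight == 0, 1.0, weight)     # avoid division by zero outside both images
    return (im1 * mask1 + im2 * mask2) / weight     # weighted average in the overlap region
```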

Below are some mosaics that I created!

Kitchen Scene

Street View

House Garden
Part 1 Learnings

During this part, I enjoyed learning how to warp images for image rectification. It was similar to some of the algorithms we used in Project 3, but also incorporated new concepts such as homographies and interpolation, which we used to map corresponding points between images and find the best mapping to warp them. Seeing the results after rectifying the images was also really rewarding!

Part 2: Feature Matching for Autostitching

Part 2 extended Part 1 by adding feature matching so that the images could be stitched together automatically, rather than having to define correspondence points for the images manually.

Corner Feature Detection

The first step was to detect corner features in the images. We started with the Harris Interest Point Detector, an algorithm that chooses points whose corner strength is a local maximum within a 3x3 neighborhood. From there, we implemented Adaptive Non-Maximal Suppression (ANMS) to reduce the detections to 500 Harris points, so that feature matching could run on fewer points. The idea behind ANMS is to choose points that are distributed over the entire image, as opposed to simply choosing the points with the highest corner strengths, which may be unevenly clumped together. For each Harris point, we calculate its suppression radius as the minimum distance to any other point with a larger corner strength. We then take the 500 points with the largest suppression radii as our set of points to perform feature matching on.
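A compact sketch of this selection step, assuming the Harris corner locations and strengths are given as NumPy arrays (the c_robust constant defaults to 1 to match the description above; the MOPS paper uses 0.9):

```python
import numpy as np
from scipy.spatial.distance import cdist

def anms(coords, strengths, num_points=500, c_robust=1.0):
    """Adaptive Non-Maximal Suppression over Harris corners.

    coords: (n, 2) array of corner positions; strengths: (n,) corner strengths.
    Returns the num_points corners with the largest suppression radii.
    """
    dists = cdist(coords, coords)                              # pairwise distances, n x n
    # Point j can suppress point i only if it is (sufficiently) stronger
    suppresses = c_robust * strengths[None, :] > strengths[:, None]
    dists[~suppresses] = np.inf                                # ignore non-suppressing points
    radii = dists.min(axis=1)                                  # minimum suppression radius per point
    keep = np.argsort(-radii)[:num_points]                     # largest radii first
    return coords[keep]
```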

In the images below, we see that while the 500 strongest Harris Corners are often clumped together in specific areas of the image, by using ANMS, we can choose strong corners that are also evenly distributed throughout the image.

All Harris Corners
500 Strongest Harris Corners
Adaptive Non-Maximal Suppression
Feature Descriptors

Next, for each of the 500 points determined by ANMS, we computed feature descriptors by downsampling a 40x40 patch centered on each point, so that each point ended up with a blurred 8x8 feature patch associated with it. We made sure to bias/gain normalize these patches so that they would be invariant to changes in intensity.
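A sketch of the descriptor extraction for a single corner, assuming a grayscale image and using skimage.transform.resize for the anti-aliased downsampling (any blur-and-subsample scheme would work similarly):

```python
import numpy as np
from skimage.transform import resize

def extract_descriptor(img, center, patch_size=40, desc_size=8):
    """Build a bias/gain-normalized 8x8 descriptor from a 40x40 patch.

    center: (row, col) corner location, assumed at least patch_size/2
    pixels away from the image border.
    """
    r, c = int(center[0]), int(center[1])
    half = patch_size // 2
    patch = img[r - half:r + half, c - half:c + half]
    # Anti-aliased downsampling of the 40x40 window gives the blurred 8x8 patch
    desc = resize(patch, (desc_size, desc_size), anti_aliasing=True)
    desc = (desc - desc.mean()) / (desc.std() + 1e-8)   # bias/gain normalization
    return desc.ravel()
```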

Feature Matching

After extracting feature descriptors for the corner coordinates of both images, we proceeded to match features between the two images. For each feature in image 1, we searched for its nearest neighbor in image 2 by computing the SSD between feature descriptor vectors and choosing the coordinate associated with the most similar feature vector. Furthermore, to ensure that each match was a true match, we also used Lowe thresholding, where we compared the distance to the nearest neighbor against the distance to the second nearest neighbor. The idea is that a feature with a true match should have a single clear winner, rather than several candidates at similar distances. I played around with different values for this threshold and ended up using 0.4. If the ratio of the nearest-neighbor distance to the second-nearest-neighbor distance was less than this threshold, we classified the pair as a matched feature.
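A sketch of this matching step, assuming the descriptors are flattened into row vectors; the 0.4 ratio threshold is the value mentioned above:

```python
import numpy as np
from scipy.spatial.distance import cdist

def match_features(desc1, desc2, ratio_thresh=0.4):
    """Match descriptors by nearest-neighbor SSD with a Lowe ratio test.

    desc1: (n1, d) and desc2: (n2, d) arrays of flattened descriptors.
    Returns a list of (i, j) index pairs that pass the ratio test.
    """
    ssd = cdist(desc1, desc2, 'sqeuclidean')       # pairwise SSD between descriptors
    matches = []
    for i in range(ssd.shape[0]):
        nn = np.argsort(ssd[i])                    # nearest neighbors of feature i
        best, second = ssd[i, nn[0]], ssd[i, nn[1]]
        if best / second < ratio_thresh:           # clear winner -> accept the match
            matches.append((i, nn[0]))
    return matches
```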

Below are screenshots with matched features highlighted in the two input images for the Kitchen Scene. We see that, for the most part, there are clear matches between points in image 1 and points in image 2. However, we did notice a few outliers after this step, such as the one on the left-hand side of the fridge in image 1.

Image 1 Matched Features
Image 2 Matched Features
Estimating Homography Using RANSAC

The final step of automatically identifying correspondence points was to estimate the homography using Random Sample Consensus (RANSAC). In this process, we selected four matched features at random and computed the homography matrix from them. We then applied this homography to all the matched points and computed the SSD error for each match. Matches with an SSD below some error threshold epsilon were kept as inliers. We repeated this process for 100 iterations and kept the largest set of inliers as the set of correspondence points.
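A sketch of this loop, reusing the compute_homography helper sketched in Part 1 and treating epsilon as a cutoff on the squared reprojection error (an assumption; the exact error metric may differ):

```python
import numpy as np

def ransac_homography(pts1, pts2, n_iters=100, epsilon=0.8):
    """Estimate a homography with RANSAC over matched (x, y) point arrays."""
    n = len(pts1)
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(n_iters):
        sample = np.random.choice(n, 4, replace=False)     # 4 random matches
        H = compute_homography(pts1[sample], pts2[sample])
        homog = np.column_stack([pts1, np.ones(n)])        # n x 3 homogeneous points
        proj = (H @ homog.T).T
        proj = proj[:, :2] / proj[:, 2:]                   # divide by w
        err = np.sum((proj - pts2) ** 2, axis=1)           # SSD per match
        inliers = err < epsilon
        if inliers.sum() > best_inliers.sum():             # keep the largest inlier set
            best_inliers = inliers
    # Refit the homography on the best set of correspondence points
    return compute_homography(pts1[best_inliers], pts2[best_inliers]), best_inliers
```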

Below are screenshots of both images with their correspondence points highlighted after 100 iterations of RANSAC using an epsilon value of 0.8. These points were then used to perform the process from Part 1: computing the homography, warping the images, and blending them.

Image 1 After RANSAC
Image 2 After RANSAC
Autostitched Mosaics

Below are my final results using both manual stitching and automatic stitching! Overall, they look pretty good, and there isn't a very noticeable difference between them. In some of the outputs, the automatic stitching looks better, likely because I wasn't able to use a photo editing program to pick out my correspondence points exactly when choosing them manually, so there was some margin of user input error in the manual stitching.

Kitchen Scene: Manual
Kitchen Scene: Automatic
Street View: Manual
Street View: Automatic
House Garden: Manual
House Garden: Automatic
Concluding Thoughts

I definitely learned a lot through implementing this project! Adaptive Non-Maximal Suppression was a particularly interesting strategy for picking out points that are spread out spatially within the image, so that feature matching wouldn't be restricted to points that were clumped together. I also thought Lowe thresholding was really interesting to learn about, and how it enables us to distinguish between true and false matches during feature matching. Overall, I was really impressed with how well the automatic stitching worked and how it was able to identify appropriate correspondence points to make the mosaics!