Project 4: [Auto]Stitching Photo Mosaics

By Sriharsha Guduguntla

Project 4 entails taking images shot from the same center of projection, with the camera rotated slightly between shots, so that the overlapping images can be stitched together into a panorama (also called a mosaic). Part A of the project involves manually selecting correspondence points between the images we want to stitch, using them to align the images, and then blending them together into one large panoramic mosaic. Part B of the project involves selecting the correspondence points automatically instead of by hand.

I shot pictures for the mosaics!

Here are the pairs of pictures that I shot and will be using for the mosaics.

A House in Berkeley - Left

A House in Berkeley - Right

My desk setup - Left

My desk setup - Right

My Lebron James poster - Left

My Lebron James poster - Right

Rectification Picture

Picture of my laptop from an angle on a desk

Homography Computation

To compute my homographies, I first selected 4 correspondence points in each of the left and right images for the mosaic. Then, I wrote a computeH function that computes the homography matrix H given the correspondence points of the left and right images. For example, here are the two house pictures with their selected correspondence points. I then used this homography matrix H to warp the left image to match the right (explained more in the next section). To compute the homography matrix H in p' = Hp, where p' holds the right image's correspondence points and p holds the left image's, I solved a least-squares problem. For interpolation, I used simple rounding of coordinates (nearest neighbor) rather than interp2d, and it worked out fine for me.
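
In numpy, that least-squares setup looks roughly like this (a simplified sketch; the variable names are illustrative):

```python
import numpy as np

def computeH(src_pts, dst_pts):
    """Least-squares homography H such that p' = Hp.

    src_pts, dst_pts: (n, 2) arrays of (x, y) correspondences, n >= 4.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(src_pts, dst_pts):
        # Two equations per correspondence, with h_33 fixed to 1.
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1).reshape(3, 3)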

Left Image Warped to Right

Right Image Warped to Left

Image Rectification

Here is an example of ground-plane rectification of my laptop on a desk. As you can see, the laptop goes from a picture taken at an angle (original image) to a top-down view after warping. To perform this warp, I set the destination points to a rectangle, which I found by selecting the corners of the laptop and then moving those corners so that the polygon's sides align and its corners become 90 degrees.
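
Reusing computeH from above, the inverse (destination-to-source) warp with rounding can be sketched like this (the rectify helper and its parameters are illustrative, not my exact code):

```python
import numpy as np

def rectify(im, corners, out_w, out_h):
    """Warp `im` so the quadrilateral `corners` becomes an axis-aligned
    out_w x out_h rectangle. corners: (4, 2) array of (x, y) points,
    ordered top-left, top-right, bottom-right, bottom-left.
    """
    rect = np.array([[0, 0], [out_w - 1, 0],
                     [out_w - 1, out_h - 1], [0, out_h - 1]])
    # Map destination -> source so every output pixel gets a value.
    H_inv = computeH(rect, corners)
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    warped = H_inv @ pts
    # Simple rounding = nearest-neighbor interpolation, as described above.
    sx = np.round(warped[0] / warped[2]).astype(int)
    sy = np.round(warped[1] / warped[2]).astype(int)
    valid = (sx >= 0) & (sx < im.shape[1]) & (sy >= 0) & (sy < im.shape[0])
    out = np.zeros((out_h, out_w) + im.shape[2:], dtype=im.dtype)
    out[ys.ravel()[valid], xs.ravel()[valid]] = im[sy[valid], sx[valid]]
    return out
```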

Original Image with correspondence points

Destination Image correspondence points

Top-down View (warped) version of left image

Blending Into a Mosaic

To blend the images into a single mosaic, I first warped my left image to my right image and then had to align the two images so their pixels line up when they are placed side by side. Before aligning them, I padded the right image with zeros on all sides so that it is the same size as the two images put next to each other. Then I found the warped correspondence points of the left image and used them to translate the right image so its correspondence points land on the warped points of the left image. To do this translation, I used np.roll, shifting the right image until its pixels aligned with the left warped image based on the warped correspondence points. Then, to create a smooth blend between the two images, I used the Laplacian blending technique from project 2 with 5 Laplacian layers and a mask that spans half the image.

After stitching with hand-selected corresponding points, I then implemented automatic point selection: Adaptive Non-Maximal Suppression (ANMS), extraction of 8x8 feature descriptors, matching of corresponding feature descriptors between the two images, and the RANSAC algorithm to finalize a group of correspondence points (inliers) for computing a more accurate homography. More details on autostitching below:
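
Before diving into those steps, here is a rough sketch of the Laplacian-stack blending described above (simplified; the helper names are illustrative and the images are assumed to be floats in [0, 1]):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def _blur(im, sigma=2):
    # Blur the two spatial axes only; leave any color axis untouched.
    return gaussian_filter(im, sigma=[sigma, sigma] + [0] * (im.ndim - 2))

def laplacian_blend(im1, im2, mask, levels=5):
    """Blend two aligned, same-size float images with Laplacian stacks."""
    g1, g2 = im1.astype(float), im2.astype(float)
    gm = mask.astype(float)
    if gm.ndim < g1.ndim:                # broadcast a 2-D mask over channels
        gm = gm[..., None]
    out = np.zeros_like(g1)
    for _ in range(levels):
        n1, n2, gm_next = _blur(g1), _blur(g2), _blur(gm)
        # Laplacian layer = current Gaussian level minus the next one.
        out += gm * (g1 - n1) + (1 - gm) * (g2 - n2)
        g1, g2, gm = n1, n2, gm_next
    # Add the low-frequency residual at the coarsest level.
    out += gm * g1 + (1 - gm) * g2
    return np.clip(out, 0, 1)
```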

Adaptive Non-Maximal Suppression (ANMS)

get_anms_corners(im, c_robust=0.9, nip=500)

I used Adaptive Non-Maximal Suppression as outlined in this paper. To do so, I first computed all the corners (aka interest points) for each of the left and right images using the Harris corner detection algorithm. However, this produced far too many points, many of them unnecessary. I needed a way to select corners that were well spread out across the image, so I applied ANMS: for each corner x_i, I compared it to every other corner that is sufficiently stronger than it and calculated the minimum suppression radius r_i for that interest point:

r_i = min_j ||x_i - x_j||, taken over all corners x_j with f(x_i) < c_robust * f(x_j)

where f(x) is the Harris corner strength at x.

I set c_robust = 0.9 as suggested in the paper for best results, and then chose the nip = 500 interest points corresponding to the 500 largest radii.
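
Here is a rough sketch of the ANMS selection (simplified: unlike my actual get_anms_corners, this version takes precomputed corner coordinates and Harris strengths rather than the image itself):

```python
import numpy as np

def anms(corners, strengths, c_robust=0.9, nip=500):
    """Keep the nip corners with the largest suppression radii.

    corners: (n, 2) corner coordinates; strengths: (n,) Harris responses.
    """
    dists = np.linalg.norm(corners[:, None, :] - corners[None, :, :], axis=2)
    radii = np.full(len(corners), np.inf)
    for i in range(len(corners)):
        # Only corners sufficiently stronger than corner i can suppress it.
        stronger = strengths[i] < c_robust * strengths
        if stronger.any():
            radii[i] = dists[i, stronger].min()
    return corners[np.argsort(-radii)[:nip]]
```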

Here is an example of all the corners initially detected by the Harris detector, followed by the 500 points selected after applying ANMS. As you can see, the post-ANMS image has only 500 points and they are well spread out across the image, as we wanted.

Harris Corners - BEFORE applying ANMS

Harris Corners - AFTER applying ANMS (500 points)

Feature Descriptor Extraction

get_feature_descriptors(im, corners, size=40)

After selecting a subset of interest points with the ANMS algorithm, I extracted a feature descriptor from the image for each interest point. I first extracted 40x40 windows and then rescaled each descriptor down to 8x8. Here are images of what the 40x40 descriptors look like for different images. The red square indicates a descriptor and the blue point indicates the interest point associated with that descriptor.
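
A rough sketch of the extraction (simplified: it assumes a grayscale image and corners far enough from the borders, and adds the bias/gain normalization suggested by the paper, which I haven't described above):

```python
import numpy as np

def get_feature_descriptors(im, corners, size=40):
    """Cut a size x size window around each corner, downsample to 8x8."""
    half, step = size // 2, size // 8   # size=40 -> sample every 5th pixel
    descriptors = []
    for x, y in corners:
        x, y = int(x), int(y)
        patch = im[y - half:y + half, x - half:x + half]
        desc = patch[::step, ::step].astype(float)   # 8x8 by subsampling
        # Bias/gain normalization (extra step, following the paper).
        descriptors.append((desc - desc.mean()) / (desc.std() + 1e-8))
    return np.array(descriptors)
```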

Initial 40x40 descriptors for house left image

Initial 40x40 descriptors for desk left image

Initial 40x40 descriptors for Lebron James left image

Matching Feature Descriptors

find_matching_descriptors(left_interest_pts, right_interest_pts, left_feature_descriptors, right_feature_descriptors, lowe_thres=0.5)

After extracting the feature descriptors for both the left and right images, I had to find a way to match feature descriptors/interest points between the two images. To compare them, I ran a matching algorithm that loops through each left descriptor and computes the SSD between it and every right descriptor. Then, I found the 2 nearest neighbors of that left descriptor and calculated the Lowe ratio e_1-NN / e_2-NN (the SSD to the nearest neighbor divided by the SSD to the second-nearest). If the ratio was <= my chosen lowe_thres = 0.5, I kept this pair of interest points/descriptors for the next stage of the algorithm. In the end, I selected the following points for the house example. The blue points are the corresponding points selected by the feature descriptor matching, while the red points are all the original points from ANMS.
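
A rough sketch of the matching loop (simplified; my real function also carries the interest points along with the descriptors):

```python
import numpy as np

def find_matching_descriptors(left_desc, right_desc, lowe_thres=0.5):
    """Match 8x8 descriptors by SSD, keeping pairs that pass Lowe's ratio
    test. Returns (i, j) index pairs into the left/right descriptor lists.
    """
    flat_right = right_desc.reshape(len(right_desc), -1)
    matches = []
    for i, d in enumerate(left_desc):
        ssd = ((flat_right - d.ravel()) ** 2).sum(axis=1)
        nn1, nn2 = np.argsort(ssd)[:2]
        # Keep only if the best match is much better than the second best.
        if ssd[nn1] / ssd[nn2] <= lowe_thres:
            matches.append((i, nn1))
    return matches
```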

Left Image Blue points - corresponding points selected by feature matching, Red points - all points

Right Image Blue points - corresponding points selected by feature matching, Red points - all points

RANSAC

ransac(pairs, num_iters=10000)

After getting a select number of interest points from the feature matching stage, we want to refine them to find the points best suited for computing the homography. To do so, I applied the RANSAC algorithm: for some fixed number of iterations (10,000 in my case), select 4 random pairs, compute a new homography matrix H from those pairs, and then loop through all the original pairs to check how close each destination point is to its source point after the homography is applied. If a pair's points end up within epsilon = 4 pixels of each other, I add that pair to an inliers set. After the 10,000 iterations, I kept the largest inliers set and used those inliers to compute a new (more accurate) homography matrix, then repeated all the warping and stitching steps explained in the first paragraph of the blending section. Here is the house example with the final set of inliers overlaid on the images. As you can see, all the points in the overlapping section of the two images correspond with each other, thanks to feature matching.
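
A rough sketch of the RANSAC loop, reusing computeH from above (simplified: it takes two matched point arrays rather than a list of pairs, and the parameter names are illustrative):

```python
import numpy as np

def ransac(left_pts, right_pts, num_iters=10000, eps=4):
    """Largest-consensus homography from matched (n, 2) point arrays."""
    left_h = np.hstack([left_pts, np.ones((len(left_pts), 1))])  # homogeneous
    best_inliers = np.zeros(len(left_pts), dtype=bool)
    for _ in range(num_iters):
        sample = np.random.choice(len(left_pts), 4, replace=False)
        H = computeH(left_pts[sample], right_pts[sample])
        proj = left_h @ H.T
        proj = proj[:, :2] / proj[:, 2:3]        # back from homogeneous
        inliers = np.linalg.norm(proj - right_pts, axis=1) < eps
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all inliers for the final, more accurate homography.
    return computeH(left_pts[best_inliers], right_pts[best_inliers])
```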

Left Image Inliers

Right Image Inliers

Final Mosaic Examples

My House

Left

Right

Mosaic of both images with hand-selected correspondence points

Mosaic of both images with automatically selected correspondence points

My Desk

Left

Right

Mosaic of both images with hand-selected correspondence points

Mosaic of both images with automatically selected correspondence points

A Lebron Poster on the Wall

Left

Right

Mosaic of both images with hand-selected correspondence points

Mosaic of both images with automatically selected correspondence points

What I Learned

This project was really cool and eye-opening, and the most important thing I learned from Part A is how to warp an image using a homography matrix so that it matches the perspective of any other similar image. This is huge, and it really opens up many possibilities for me to make cool panoramas, optical illusions, etc. I learned how to use the corners to first find a bounding box for the warped image, then use that box to warp the image and interpolate afterwards. The coolest thing I learned from Part B was how easily feature descriptors can be computed and compared via SSD to find corresponding features in two images. This part of the project felt like it was doing a lot of the so-called "magic," because it did a really good job of finding the exact matching features in my images without any of the machine learning techniques I would have guessed were needed.