CS 194 Project 4:

Part A: Image Warping and Mosaicing

Part B: Feature Matching for AutoStitching

Part A


Overview

The goal of this part is to stitch images together into an image mosaic by recovering homographies and using them to warp images into alignment.

Part 1: Shoot the Pictures

The goal of this part is to take pictures that can be aligned and blended into a mosaic. It was important to take the pictures from the same location, rotating only the camera between shots. The pictures should also overlap significantly, so that we can select correspondence points between images for alignment. I also took pictures of items to be used for image rectification. I made sure these items were rectangular, though due to camera perspective they appear slanted in the original images.


Images for rectification:

Music Folder
Laptop

Mosaic Image Sets


Image Set 1: Living Room
Living Room Left
Living Room Right

Image Set 2: Side of HMMB

HMMB Left
HMMB Right

Image Set 3: Wheeler Hall

Wheeler Hall Left
Wheeler Hall Middle
Wheeler Hall Right

Part 2: Recover Homographies

In order to recover homographies between images, we first select points of correspondence between our two images. Then, we can set up the following equation to map points in one image to points in the other:
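
\[
\begin{bmatrix} w x' \\ w y' \\ w \end{bmatrix}
=
\begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]

Here (x, y) and (x', y') are a pair of corresponding points in the two images, and w is the unknown scale factor that comes with homogeneous coordinates.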

We can then rewrite this equation as a system of linear equations, which allows us to use least squares to solve for the elements of H. We solve for the first 8 elements of H, as we assume the lower-right element h33 is always 1.
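
Concretely, expanding the product and dividing out w gives two linear equations per correspondence:

\[
\begin{bmatrix}
x & y & 1 & 0 & 0 & 0 & -x x' & -y x' \\
0 & 0 & 0 & x & y & 1 & -x y' & -y y'
\end{bmatrix}
\begin{bmatrix} h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32} \end{bmatrix}
=
\begin{bmatrix} x' \\ y' \end{bmatrix}
\]

Stacking these two rows for each of four or more correspondences gives an (overdetermined) system A h = b. A minimal numpy sketch of this fit, with compute_H as my own illustrative name:

    import numpy as np

    def compute_H(src, dst):
        # Two rows of A per correspondence; solve A h = b by least squares
        # for the eight unknown entries of H (h33 is fixed to 1).
        A, b = [], []
        for (x, y), (xp, yp) in zip(src, dst):
            A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
            A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
            b.extend([xp, yp])
        h = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)[0]
        return np.append(h, 1).reshape(3, 3)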


Source: https://towardsdatascience.com/estimating-a-homography-matrix-522c70ec4b2c

Part 3: Warp the Images

Now that we have our homography matrix, we can warp images. To warp an image, I create a meshgrid of output coordinates, then apply the inverse transform to find where each output pixel maps back into the source image. I then call cv2.remap with linear interpolation to sample the source image at those mapped coordinates. See the parts below for examples of warping in both image rectification and image stitching.
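
A minimal sketch of this inverse-warping step, reusing compute_H's output (warp_image is a hypothetical name, not necessarily how my code is structured):

    import numpy as np
    import cv2

    def warp_image(img, H, out_shape):
        # Inverse warping: for each output pixel, find where it comes
        # from in the source image, then sample with interpolation.
        h, w = out_shape
        ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
        coords = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
        src = np.linalg.inv(H) @ coords            # inverse transform
        src /= src[2]                              # divide out the w coordinate
        map_x = src[0].reshape(h, w).astype(np.float32)
        map_y = src[1].reshape(h, w).astype(np.float32)
        return cv2.remap(img, map_x, map_y, interpolation=cv2.INTER_LINEAR)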

Part 4: Image Rectification

In image rectification, we want to warp an image so that a planar object of interest appears as if viewed head-on. We select the four corners of the object and find the homography between those points and a rectangular set of points. Then, we warp the image using this homography.
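
A hypothetical usage sketch, reusing compute_H and warp_image from above (the corner coordinates and rectangle size here are made up for illustration, and img is an already-loaded image):

    # Four clicked corners of the slanted object, in a consistent order.
    src_pts = np.array([[120, 85], [410, 70], [430, 390], [105, 400]])
    w, h = 300, 400                              # chosen output rectangle size
    dst_pts = np.array([[0, 0], [w, 0], [w, h], [0, h]])

    H = compute_H(src_pts, dst_pts)              # least-squares homography
    rectified = warp_image(img, H, (h, w))       # object now fills the rectangle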



Examples of rectified images

Original Music Folder
Rectified Music Folder
Original Laptop
Rectified Laptop

Part 5: Image Mosaic

The goal of image mosaicing is to stitch together two or more overlapping images to create one panoramic image. In my implementation, I always warped the left image onto the right image. I first found a homography from the left image to the right image. Then, I applied that transformation to the left image's corners to see where they would land after warping; this told me the x and y shifts needed to keep the warped image fully inside the frame. Next, I transformed the left image's correspondence points using the homography matrix, which told me where those points would lie post-transformation and therefore how far to shift the right image so that the points align properly.

I then created a canvas large enough to hold both images, warped the left image and shifted it into place, and shifted the right image into its place as well. To blend the seam nicely, I used a linearly varying weighted average over the intersection of the two images, and preserved the original pixel values of the left and right images outside of the intersection.
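
A simplified sketch of the blending step, assuming the warped left image and the shifted right image are already placed on same-size canvases (left_canvas, right_canvas) with boolean masks marking where each has pixels, and that the overlap is roughly a vertical strip:

    import numpy as np

    overlap = left_mask & right_mask
    cols = np.where(overlap.any(axis=0))[0]
    x0, x1 = cols[0], cols[-1]                         # column range of the overlap
    alpha = np.zeros(left_canvas.shape[:2])
    alpha[:, x0:x1 + 1] = np.linspace(1.0, 0.0, x1 - x0 + 1)  # linear ramp
    alpha[left_mask & ~right_mask] = 1.0               # pure left outside the overlap
    alpha[right_mask & ~left_mask] = 0.0               # pure right outside the overlap
    mosaic = alpha[..., None] * left_canvas + (1 - alpha[..., None]) * right_canvas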



Living Room

Room Left
Room Right
Room Mosaic


HMMB Side View

HMMB Left
HMMB Right
HMMB Mosaic


Wheeler Hall

For my Wheeler Hall mosaic, I attempted to stitch together three images by first stitching the first two, and then stitching that result with the third. The result of stitching just the first two is relatively clean, but by the time the third image is added, the mosaic has already become increasingly warped, and the blending is not as clean. A better approach would be to warp everything toward the middle image rather than always stitching from left to right.

Wheeler Hall Left
Wheeler Hall Middle
Wheeler Hall Right
Wheeler Hall Left and Middle
Wheeler Hall Full Mosaic


What I Learned from Part A

One thing I learned from this project is that maintaining good alignment in image stitching is very challenging. I had to reselect correspondence points many times when stitching images together, as suboptimal point sets resulted in very poorly aligned images. This makes me excited for part B of this project, as I can imagine that automatic feature detection has the potential to select much higher quality correspondence points.

Part B


Overview

The goal of this part is to automatically select high quality correspondence points between images, rather than selecting them manually, and to use those points to compute the homographies for image stitching.

Part 1: Harris Corners

For this part I used the provided Harris code to detect Harris corners in my input image. I converted the image to grayscale when finding the corners, as the provided code is designed for grayscale inputs. My edge discard parameter is set to 20 for the examples shown below.
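
I can't reproduce the provided course code here, but a rough sketch of what it does, built on skimage (the function name and structure are mine):

    import numpy as np
    from skimage.color import rgb2gray
    from skimage.feature import corner_harris, peak_local_max

    def get_harris_corners(img, edge_discard=20):
        # Harris response on the grayscale image; local maxima become corners.
        h = corner_harris(rgb2gray(img))
        coords = peak_local_max(h, min_distance=1)     # (row, col) pairs
        # Throw away corners within edge_discard pixels of the border.
        keep = ((coords[:, 0] > edge_discard) & (coords[:, 0] < h.shape[0] - edge_discard) &
                (coords[:, 1] > edge_discard) & (coords[:, 1] < h.shape[1] - edge_discard))
        return h, coords[keep]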



Part 2: Adaptive Non-Maximal Suppression

As can be seen from the previous images, the provided Harris function selects a very large number of points. We use adaptive non-maximal suppression (ANMS) to filter these down to 500 points. We first compute the pairwise distances between all Harris corner points. Then, for each point i, we find the minimum distance to any point j that suppresses it, i.e., any point whose corner strength satisfies h_i < c_robust * h_j. Here, we set c_robust to 0.9. We return the 500 points with the largest such suppression radii, since a large radius indicates a point is the strongest corner within a large neighborhood.
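
A minimal sketch of ANMS as described above (anms is a hypothetical name; h is the Harris response map and coords the corner coordinates from the previous step):

    import numpy as np
    from scipy.spatial.distance import cdist

    def anms(coords, h, n_keep=500, c_robust=0.9):
        strengths = h[coords[:, 0], coords[:, 1]]      # corner strength h_i
        dists = cdist(coords, coords)                  # pairwise distances
        # Point j suppresses point i when h_i < c_robust * h_j.
        suppresses = strengths[:, None] < c_robust * strengths[None, :]
        dists[~suppresses] = np.inf                    # ignore non-suppressing points
        radii = dists.min(axis=1)                      # suppression radius per point
        return coords[np.argsort(-radii)[:n_keep]]     # keep the largest radii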



Part 3: Feature Descriptor Extraction

After filtering down our set of points, we can extract a feature descriptor for each one. For our purposes, we extract a 40 x 40 axis-aligned patch around each point. We then downscale this patch to 8 x 8 and flatten it into a length-64 vector. We also perform bias-gain normalization on the vector (subtract the mean, divide by the standard deviation) to get the resulting feature descriptor. Here is an example of what some of the extracted feature patches look like.
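
A sketch of the descriptor extraction, assuming gray is the grayscale image and coords the ANMS output (the edge discard of 20 guarantees every 40 x 40 window fits inside the image):

    import numpy as np
    import cv2

    def extract_descriptors(gray, coords):
        descs = []
        for r, c in coords:
            patch = gray[r - 20:r + 20, c - 20:c + 20]     # 40x40 axis-aligned window
            small = cv2.resize(patch, (8, 8))              # downscale to 8x8
            vec = small.ravel().astype(np.float64)
            vec = (vec - vec.mean()) / vec.std()           # bias-gain normalization
            descs.append(vec)
        return np.array(descs)                             # shape (n, 64)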



Part 4: Feature Matching

Once we've extracted feature descriptors for our two input images, we can match up the features and discard those without a strong match between the images. For each point in image 1, we calculate the squared error between its feature descriptor and the feature descriptors of all points in image 2, keeping track of the lowest and second-lowest errors. We then compute the ratio of the lowest to the second-lowest error, and if it is below a threshold (which I tuned for different images), we consider the corresponding points a match. In accordance with Lowe's thresholding, this greatly reduces our number of points and keeps only those with a clear match.
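
A sketch of the matching step with Lowe's ratio test (match_features and the default threshold value are my own, for illustration):

    import numpy as np

    def match_features(desc1, desc2, ratio_thresh=0.5):
        matches = []
        for i, d in enumerate(desc1):
            errs = ((desc2 - d) ** 2).sum(axis=1)   # SSD to every image-2 descriptor
            order = np.argsort(errs)
            best, second = errs[order[0]], errs[order[1]]
            if best / second < ratio_thresh:        # Lowe's ratio test
                matches.append((i, order[0]))       # (index in image 1, index in image 2)
        return matches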



Part 5: RANSAC

So far, we've filtered out most of the outliers, but RANSAC is the final step needed to remove the remaining outliers and compute a robust homography from only inliers. I ran RANSAC for 1000 iterations. In each iteration, we randomly sample 4 pairs of correspondence points and compute a homography matrix that maps the image 1 points to the image 2 points. We then use this homography to transform all of the image 1 points, and find the squared error between the transformed points and their corresponding image 2 points. Any pair whose error is below a threshold is considered an inlier for that iteration. We keep track of the largest set of inliers seen across all iterations, and return it as the output of RANSAC once all 1000 iterations are complete. This set is the final set of correspondence points we use to compute the homography and stitch the images together. Below are examples of the points RANSAC selected; they match up nicely across the two input images.
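
A sketch of the RANSAC loop, reusing compute_H from Part A (the error threshold here is illustrative, not the exact value I used):

    import numpy as np

    def ransac(pts1, pts2, n_iters=1000, thresh=4.0):
        # pts1, pts2: (n, 2) arrays of matched (x, y) points.
        best_inliers = np.array([], dtype=int)
        pts1_h = np.hstack([pts1, np.ones((len(pts1), 1))]).T   # homogeneous, 3 x n
        for _ in range(n_iters):
            sample = np.random.choice(len(pts1), 4, replace=False)
            H = compute_H(pts1[sample], pts2[sample])           # exact fit to 4 pairs
            proj = H @ pts1_h
            proj = (proj[:2] / proj[2]).T                       # back to 2-D points
            errs = ((proj - pts2) ** 2).sum(axis=1)
            inliers = np.where(errs < thresh)[0]
            if len(inliers) > len(best_inliers):                # keep the largest inlier set
                best_inliers = inliers
        return pts1[best_inliers], pts2[best_inliers]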



Part 6: Autostitched Mosaics

Using the points selected by RANSAC, we can now stitch our images together. Below I show the results of manual stitching compared to automatic stitching. The mosaics are very similar in quality, indicating that automatic stitching does indeed select points that match up well across images and produce good alignment in the stitched panoramas.

Room Manual Stitching
Room Automatic Stitching
Hearst Manual Stitching
Hearst Automatic Stitching
Wheeler Manual Stitching
Wheeler Automatic Stitching

What I Learned from Part B

I think Lowe's thresholding was one of the most interesting ideas I learned about in this project. If I had to design a heuristic for selecting feature matches myself, I might have naively set a threshold and kept feature descriptor pairs whose best error fell below it. However, this would result in a lot of false positives. Lowe's thresholding is such a good idea because it ensures that when a match is accepted, it is a clear match: the second nearest neighbor is much farther away. I also appreciate how the overall pipeline of corner detection, feature descriptors, and RANSAC ends up selecting such good points and makes the alignment process much more streamlined, as manually selecting points in part A was tedious.