Project 5 Part II - Image Warping and Mosaicing

by Ashna Choudhury

Overview

In the second half of Project 5, we worked to implement Automatic Image Mosaicing, which allows our code to automatically find correspondence points, compute a homography matrix, and then warp and blend two images together. This project, in all honesty, was probably the most frustrating one for me: despite spending many days and countless hours on this code, I wasn't able to get a successful final image. In short, even though I tried my best, my code still does not perform as it should. This is of course disappointing, and with unlimited time I would keep trying to get it right; however, my time is not endless, so I have decided to simply show what I have. Though my code is not fully functional, I will still describe the work I did at each step of this project.

Part 1: Detecting Corner Features in an Image

Harris Interest Point Detector

The first main goal of this project was to come up with a way to automatically generate point correspondences between our two images (so that we don't have to manually select every point as we did in previous projects). To start this process, we needed to first generate a large number of candidate points. The best way to do this is to begin with the Harris interest point detector, which identifies key points in an image, namely the "corners" formed where edges meet. Though this method returns an incredibly large number of points, it is a good starting point for identifying good vs. bad correspondence points between the two images.
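As a rough sketch of this step (not my exact code), something like the following works, assuming a grayscale or RGB input image and using scikit-image's corner_harris for the response map; the function and parameter names here are just for illustration:

from skimage.color import rgb2gray
from skimage.feature import corner_harris, peak_local_max

def get_harris_points(im, min_distance=1, threshold_rel=0.01):
    # Corner response at every pixel, then the local maxima of that response.
    gray = rgb2gray(im) if im.ndim == 3 else im
    h = corner_harris(gray)
    coords = peak_local_max(h, min_distance=min_distance, threshold_rel=threshold_rel)
    return h, coords  # the response map h is reused later by ANMS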

Harris Points - Image 1
Harris Points - Image 2

That's a lot of points...let's find a way to improve that!

Adaptive Non-Maximal Suppression (ANMS)

At this point, we have a large number of potential interest points to work with, but we want to start eliminating points that do not give us useful information (i.e. points that will not match well between our two images). We also want to thin out the set so that we don't have to work with potentially hundreds of thousands of points every time we look at our interest points. The best way to do this is to implement Adaptive Non-Maximal Suppression. In this process, we iterate through all of the interest points of a given image and compare the Harris response (H value) of one point (let's call it our interest point) with the H values of the remaining interest points. If the H value of our interest point is less than a certain scalar multiple of the H value of a comparison point, that comparison point counts as a stronger neighbor. The suppression radius of our interest point is then its distance to the nearest such stronger neighbor, and at the end we keep the points with the largest suppression radii. After all of this work, we are left with interest points that better represent the image and are spread more evenly across it, and we have managed to eliminate some of the unnecessary points in the process.
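A minimal sketch of this idea is below, assuming the Harris response map h and the candidate coordinates from the previous step; the brute-force pairwise-distance computation, the number of points kept, and the robustness constant c_robust are illustrative choices rather than necessarily what my code does:

import numpy as np

def anms(coords, h, n_keep=500, c_robust=0.9):
    # Keep the n_keep points with the largest suppression radii, i.e. the
    # distance to the nearest point that is sufficiently stronger.
    strengths = h[coords[:, 0], coords[:, 1]]
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.sum(diffs.astype(float) ** 2, axis=-1)   # pairwise squared distances
    # Point j suppresses point i when f(i) < c_robust * f(j).
    stronger = strengths[None, :] * c_robust > strengths[:, None]
    dists[~stronger] = np.inf
    radii = np.min(dists, axis=1)            # suppression radius of each point
    keep = np.argsort(radii)[::-1][:n_keep]  # largest radii first
    return coords[keep]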

ANMS Interest Points - Image 1
ANMS Interest Points - Image 2

Parts 2 & 3: Extracting & Matching Feature Descriptors

Now that we have our interest points, we want to find a way to create and match correspondence points between the images. This is done by creating features: a small window of pixels sampled around each interest point. By iterating through our interest points, we can create an 8 by 8 patch of pixels as the descriptor for each point. After creating our features, we must match the features from image 1 to the features from image 2. This is done by iterating through the features in image 1 and, for each one, evaluating the ratio of the SSD error of its best match in image 2 to the SSD error of its second-best match. If that ratio is lower than some threshold value (0.4 in my case), then we declare the corresponding interest points in image 1 and image 2 a correspondence pair.
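A sketch of both steps might look like the following, assuming grayscale images; the blurring before sampling and the bias/gain normalization of each patch are standard choices I am assuming here rather than details spelled out above:

import numpy as np
from scipy.ndimage import gaussian_filter

def extract_descriptors(im, coords, patch=8):
    # Sample a normalized 8x8 patch around every interest point.
    blurred = gaussian_filter(im, sigma=2)   # blur before sampling to reduce noise
    half = patch // 2
    descs, kept = [], []
    for r, c in coords:
        if r - half < 0 or c - half < 0 or r + half > im.shape[0] or c + half > im.shape[1]:
            continue                         # skip points too close to the border
        window = blurred[r - half:r + half, c - half:c + half]
        window = (window - window.mean()) / (window.std() + 1e-8)  # bias/gain normalize
        descs.append(window.ravel())
        kept.append((r, c))
    return np.array(descs), np.array(kept)

def match_descriptors(d1, d2, ratio_thresh=0.4):
    # Ratio test on SSD: accept a match only when the best candidate
    # is much better than the second-best one.
    matches = []
    for i, d in enumerate(d1):
        ssd = np.sum((d2 - d) ** 2, axis=1)
        order = np.argsort(ssd)
        best, second = ssd[order[0]], ssd[order[1]]
        if best / (second + 1e-8) < ratio_thresh:
            matches.append((i, order[0]))    # indices into the kept points of image 1 and 2
    return matches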

Correspondence Points - Image 1
Correspondence Points - Image 2

Part 4: RANSAC

After all of this work, we have our correspondence points between image 1 and image 2. We can now use these points to compute our homography matrix H as we did in the previous part. However, in this project, we want to implement RANSAC to estimate the homography. Essentially, we randomly select 4 of our correspondence points and use them to compute a candidate homography. With this homography, we warp the points from image 1 and evaluate the SSD error between the warped points and the original points in image 2. Any correspondence whose SSD error is below a certain threshold is called an inlier. This process is repeated many times (100 times in my example), and at the end of it all, the homography matrix that produces the most inliers is returned as the computed homography. From this point, we can use this homography to warp the image and blend the result as we did in Part I.
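A sketch of this loop is shown below; compute_homography stands in for the 4-point homography solver from Part I (not shown here), and the inlier threshold is just a placeholder value:

import numpy as np

def ransac_homography(pts1, pts2, n_iters=100, thresh=1.0):
    # pts1, pts2: N x 2 arrays of matched (x, y) coordinates.
    best_H, best_inliers = None, np.array([], dtype=int)
    ones = np.ones((len(pts1), 1))
    homog1 = np.hstack([pts1, ones])                   # image-1 points in homogeneous form
    for _ in range(n_iters):
        idx = np.random.choice(len(pts1), 4, replace=False)
        H = compute_homography(pts1[idx], pts2[idx])   # assumed 4-point solver from Part I
        warped = (H @ homog1.T).T
        warped = warped[:, :2] / warped[:, 2:3]        # back to Cartesian coordinates
        errors = np.sum((warped - pts2) ** 2, axis=1)  # SSD per correspondence
        inliers = np.where(errors < thresh)[0]
        if len(inliers) > len(best_inliers):
            best_H, best_inliers = H, inliers
    return best_H, best_inliers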

It was at this part of the project that I had the most trouble: despite spending several days trying to debug and fix my code, it simply does not produce a usable estimated homography, and as a result I am unable to produce the blended image. It is disappointing to go through all of these steps and not have a final product; however, I am hoping you will see that I gave it an earnest effort, both in concepts and in actual implementation.

Observations

From working on this section, I found that what might seem like the simple process of picking correspondence points is in fact incredibly difficult to do automatically. While the manual method is tedious and slow, the automatic process is quite complex, so there is a real tradeoff for the user/programmer.