Homework 6: Image Mosaicing

Jacob Green | cs194-26-afl

Part A: Manual Correspondences

Input Images

Flyer Board Image 1
Flyer Board Image 2
Library Image 1
Library Image 2
Roommate Image 1
Roommate Image 2

Annotated Images

Flyer Board Image 1 Annotated
Flyer Board Image 2 Annotated
Library Image 1 Annotated
Library Image 2 Annotated
Roommate Image 1 Annotated
Roommate Image 2 Annotated

Output Mosaics

Flyer Board Mosaic
Library Mosaic
Roommate Mosaic

Quick Thoughts - Part A

Honestly, this homework was really frustrating: I felt that I understood all of the individual parts of the project (picking good correspondences, computing homographies, warping one image into another), but I had a ton of issues getting them to fit together. I learned the importance of understanding the semantics of the individual functions I write, as the two biggest hurdles for me were figuring out which direction a homography maps points in and the coordinate ordering (x,y vs. y,x) that different data structures expect.

I also learned some interesting techniques for blending: I played around with a couple of approaches and landed on a two-dimensional alpha blending technique that I think works pretty well for two-image mosaics.

Finally, I learned that it is very hard to pick good correspondences in an image with very few obvious corners. In the mosaic of my roommate, you can see the obvious flaws that come from trying to choose accurate correspondences without any clear corners to pick from. With better correspondences, I think that image would have come out much better, similar to the other two.
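For reference, the two-dimensional alpha blend I describe can be sketched roughly like this: a minimal NumPy version where the mask is 1 at an image's center and falls off linearly toward every edge. The function names and the linear falloff are my illustrative choices here, not the exact code I used.

```python
import numpy as np

def alpha_mask(h, w):
    """2-D alpha mask: 1 at the image center, falling linearly to 0 at the
    edges, built as the outer product of two 1-D 'tent' ramps."""
    ry = 1.0 - np.abs(np.linspace(-1, 1, h))
    rx = 1.0 - np.abs(np.linspace(-1, 1, w))
    return np.outer(ry, rx)

def blend(im1, im2, a1, a2):
    """Alpha-weighted average of two aligned (already warped) images.
    Where both alphas are zero the output is simply zero."""
    total = a1 + a2
    total[total == 0] = 1.0  # avoid divide-by-zero outside both images
    return (im1 * a1 + im2 * a2) / total
```

In the overlap region each pixel is weighted by how close it is to its own image's center, which is what smooths out the seam between the two warped images.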

Part B: Automatic Correspondences

For Part B I started with the starter code, which implemented a Harris corner detector. Next, I implemented Adaptive Non-Maximal Suppression to narrow the corners down to a fixed number (500 in most cases), evenly distributed across each image, from which to compute correspondences. After this, I implemented feature descriptor extraction, which basically boils down to taking a 40x40 sample around each corner, downsampling it to 8x8, and performing bias and gain normalization to create a robust feature for the area surrounding the corner. Then I implemented feature matching, finding the closest-matching features between the two images based on these descriptors to create candidate correspondences. Finally, I implemented RANSAC to remove outlier correspondences and produce the final set, which I then passed into my warp function from Part A. I recorded the output at each step for one of my images, and the final output for three different image pairs.

Harris Detector

This part was very straightforward, as the provided starter code implemented this portion for me. The only change I made was to expose a min_distance parameter, which let me vary the number of corners collected in the image to reduce computation time.
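Since the starter code isn't shown here, a NumPy-only sketch of a Harris detector with a min_distance knob might look like the following. The real starter code is skimage-based; the k constant, the box filter, and the greedy peak picking below are all illustrative stand-ins, not the starter code itself.

```python
import numpy as np

def harris_corners(im, min_distance=10, k=0.05):
    """Harris detector sketch: a larger min_distance yields fewer, more
    spread-out corners (and faster downstream steps)."""
    Iy, Ix = np.gradient(im.astype(float))

    def box3(a):
        # crude 3x3 box filter as a stand-in for Gaussian weighting
        p = np.pad(a, 1, mode='edge')
        return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                   for i in range(3) for j in range(3)) / 9.0

    Sxx, Syy, Sxy = box3(Ix * Ix), box3(Iy * Iy), box3(Ix * Iy)
    R = Sxx * Syy - Sxy ** 2 - k * (Sxx + Syy) ** 2  # Harris response

    # greedy peak picking: accept strongest responses first, then suppress a
    # (2 * min_distance + 1)-wide neighborhood around each accepted corner
    ys, xs = np.unravel_index(np.argsort(-R, axis=None), R.shape)
    taken = np.zeros(R.shape, dtype=bool)
    coords = []
    for y, x in zip(ys, xs):
        if R[y, x] <= 0:
            break  # only keep genuinely corner-like responses
        if not taken[y, x]:
            coords.append((y, x))
            y0, x0 = max(0, y - min_distance), max(0, x - min_distance)
            taken[y0:y + min_distance + 1, x0:x + min_distance + 1] = True
    return R, np.array(coords)
```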

Library Image 1 Harris Corners
Library Image 2 Harris Corners

Adaptive Non-Maximal Suppression (ANMS)

ANMS was the hardest piece of this project to implement. It was hard to decipher what I was supposed to loop through and what I was supposed to calculate, but once I realized that I simply needed to compute a suppression radius r for every corner and then take the top num_points corners sorted by r, it made a lot more sense. With that intuition, I used a doubly nested for loop over all the corners (I know, a little ugly, but the runtime was not terrible): for each corner x_i, I calculated the distance to every other corner x_j whose strength dominates x_i's, and took the min of those distances to get r_i. Once I had every r_i, I sorted the corners by r value and returned the first num_points of them. I usually chose num_points to be 500, except for the flyer board image, where I found it helpful to increase it to 750.
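In code, the step above looks roughly like this. It's a sketch, not my exact implementation: the vectorized inner comparison replaces the inner loop of my doubly nested version, and the robustness constant c_robust = 0.9 is the value from the MOPS paper, which may differ from what I used.

```python
import numpy as np

def anms(corners, strengths, num_points=500, c_robust=0.9):
    """Adaptive Non-Maximal Suppression: keep the num_points corners with
    the largest suppression radius r_i, where r_i is the distance from
    corner i to the nearest corner that sufficiently dominates it."""
    n = len(corners)
    radii = np.full(n, np.inf)  # undominated corners keep r = infinity
    for i in range(n):
        # corners x_j whose (robustified) strength dominates corner x_i
        stronger = strengths[i] < c_robust * strengths
        if stronger.any():
            d = np.linalg.norm(corners[stronger] - corners[i], axis=1)
            radii[i] = d.min()
    keep = np.argsort(-radii, kind='stable')[:num_points]
    return corners[keep]
```

Sorting by radius rather than raw strength is what spreads the surviving corners evenly across the image instead of clustering them on the strongest features.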

Library Image 1 Corners after ANMS
Library Image 2 Corners after ANMS

Feature Extraction and Matching

Feature extraction was very simple. I iterated through all the corners, grabbed the 40x40 image patch surrounding each one, downsampled it to 8x8, and subtracted the mean and divided by the standard deviation (bias and gain normalization). This generated robust features for the areas surrounding every corner. Then, I performed feature matching using the 1-NN/2-NN thresholding technique. For every image patch in the first image, I calculated its SSD to all image patches in the second image (using the provided dist2 function). Then, I found the nearest and second-nearest neighbor for every patch in the first image (the two minimum distances), divided the first by the second, and compared the ratio to a threshold. This threshold varied between 0.1 and 0.5 in practice, but I found that being more permissive worked well, since the points would still be passed through RANSAC afterwards to remove outliers.
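A sketch of both steps is below. The strided downsample (instead of a proper blur-then-subsample), the inline distance computation standing in for the provided dist2, and all the names are my simplifications, assuming corners come as (y, x) pairs at least 20 pixels from the border.

```python
import numpy as np

def extract_descriptors(img, corners, patch=40, out=8):
    """For each (y, x) corner, grab a patch x patch window, downsample it
    to out x out by striding, and normalize to zero mean / unit variance."""
    step = patch // out   # 40 -> 8 means keeping every 5th pixel
    half = patch // 2
    descs = []
    for y, x in corners:
        window = img[y - half:y + half, x - half:x + half].astype(float)
        small = window[::step, ::step]
        descs.append(((small - small.mean()) / small.std()).ravel())
    return np.array(descs)

def match_features(d1, d2, ratio=0.5):
    """Lowe-style 1-NN / 2-NN ratio test on squared distances."""
    dists = ((d1[:, None, :] - d2[None, :, :]) ** 2).sum(-1)
    matches = []
    for i, row in enumerate(dists):
        nn = np.argsort(row)[:2]  # closest and second-closest neighbors
        if row[nn[0]] / row[nn[1]] < ratio:
            matches.append((i, int(nn[0])))
    return matches
```

The ratio test keeps a match only when its best candidate is much closer than its runner-up, which discards ambiguous patches (e.g. repeated texture) before RANSAC ever sees them.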

Library Image 1 with matched correspondences
Library Image 2 with matched correspondences

Random Sample Consensus (RANSAC)

RANSAC was relatively easy to implement. I simply ran a loop for a large number of iterations (usually 10,000 in practice). Within this loop I would randomly sample four points from the correspondences, calculate the exact homography between them (using the code from Part A), and then count the inliers for that homography over the entire set of correspondences. To decide if a point was an inlier, I took the L2 norm of the difference between the correspondence in one image and the result of applying the homography to the corresponding point in the other image, and thresholded that value. On each iteration, I checked whether the current set of inliers was larger than the current maximum, and if it was, I saved it as the new maximal set. After the loop, I returned the maximal set.
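The loop can be sketched as follows. The DLT solver here is a stand-in for my exact four-point homography code from Part A, and the threshold and iteration counts are illustrative defaults, not necessarily the values I used.

```python
import numpy as np

def homography_4pt(src, dst):
    """Exact homography from four correspondences via DLT
    (a stand-in for the Part A solver)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
    return Vt[-1].reshape(3, 3)

def ransac_homography(pts1, pts2, n_iters=10000, thresh=2.0, seed=0):
    """Return indices of the largest inlier set found over n_iters
    random four-point samples."""
    rng = np.random.default_rng(seed)
    p1_h = np.hstack([pts1, np.ones((len(pts1), 1))])  # homogeneous coords
    best = np.array([], dtype=int)
    for _ in range(n_iters):
        idx = rng.choice(len(pts1), 4, replace=False)
        H = homography_4pt(pts1[idx], pts2[idx])
        proj = p1_h @ H.T
        with np.errstate(divide='ignore', invalid='ignore'):
            proj = proj[:, :2] / proj[:, 2:3]          # dehomogenize
            err = np.linalg.norm(proj - pts2, axis=1)  # L2 reprojection error
        inliers = np.flatnonzero(err < thresh)
        if len(inliers) > len(best):
            best = inliers
    return best
```

A wrong four-point sample (one containing an outlier) just produces a homography that few points agree with, so it loses to any sample drawn entirely from the true inliers.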

Library Image 1 after RANSAC
Library Image 2 after RANSAC

Final Output

Finally, I took the correspondences output by the RANSAC algorithm and passed them to the warp-and-blend function I created for Part A. The results are below. Although it worked very well for the library mosaic, the flyer board mosaic ended up a little off, so you can see some image doubling, while the mosaic of my roommate was a full disaster. However, I believe the failure of that mosaic is due solely to very weak input data, as there simply aren't enough distinctive corners for the algorithm to detect. By the time RANSAC finished, it usually had at most 12 correspondences, and they often mapped multiple points to the same correspondence point.

Flyer Board Mosaic
Library Mosaic
Roommate Mosaic

Final Thoughts - Part B

The second part of this homework was much more satisfying, as I did not spend hours meticulously inputting points in the hope that I simply needed better data to make my code work, as in Part A. The coolest thing I learned from this project was definitely the power of downsampling for robustness: I had often heard of the technique, but watching it work so effectively with little to no fiddling was really impressive. Overall, this was a really fun second half of the project, as the steps were clear and the results were insightful!