Honestly, this homework was really frustrating: I felt that I understood all of the individual parts of the project (picking good correspondences, computing homographies, warping one image into another), but I had a ton of issues getting them to fit together. I learned the importance of understanding the semantics of the individual functions I write, as the two biggest hurdles for me were figuring out which direction a homography converted points in and the orientation of coordinates (x,y or y,x) for different structures. I also learned some interesting techniques for blending: I played around with a couple of different approaches and landed on a two-dimensional alpha blending technique that I think works well for two-image mosaics. Finally, I learned that it is very hard to pick good correspondences in an image with few obvious corners. In the mosaic of my roommate, you can see the obvious flaws of trying to compute accurate correspondences without any clear corners to choose from. With better correspondences, I think this image would have come out much better, similar to the other two.
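As a rough illustration of the two-dimensional alpha blending mentioned above, here is a minimal numpy sketch. It is an assumption-laden stand-in, not my exact code: it assumes both images have already been warped onto a shared canvas of the same size, works on single-channel float images for simplicity, and uses a linear center-to-edge falloff in both x and y as the weighting.

```python
import numpy as np

def alpha_map(h, w):
    """Weight that falls off linearly from the image center in both x and y.

    Each pixel's weight is its distance to the nearest image edge
    (normalized), so weights peak near the center and approach 0 at
    the borders -- this is the "two-dimensional" part of the blend.
    """
    ys = np.minimum(np.arange(h), np.arange(h)[::-1]) / (h / 2)
    xs = np.minimum(np.arange(w), np.arange(w)[::-1]) / (w / 2)
    return np.minimum.outer(ys, xs)  # 2D map: min of the two 1D falloffs

def blend(im1, im2, mask1, mask2):
    """Blend two warped images on a shared canvas with per-image alphas.

    mask1/mask2 are boolean arrays marking where each warped image has
    valid pixels; outside its mask an image contributes zero weight.
    """
    a1 = alpha_map(*im1.shape[:2]) * mask1
    a2 = alpha_map(*im2.shape[:2]) * mask2
    total = a1 + a2
    total[total == 0] = 1.0  # avoid division by zero where neither image covers
    return (im1 * a1 + im2 * a2) / total
```

In the overlap region each pixel ends up as a weighted average of the two images, with the weight shifting smoothly toward whichever image's center is closer, which is what hides the seam.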
For Part B, I started with the starter code that implemented a Harris corner detector. Next, I implemented Adaptive Non-Maximal Suppression to narrow the corners down to a fixed number (500 in most cases), evenly distributed across the two images, from which to compute correspondences. After this, I implemented feature descriptor extraction, which boils down to taking a 40x40 sample around each corner, downsampling it to 8x8, and performing bias and gain normalization to create a robust feature for the area surrounding the corner. Next, I implemented feature matching, finding the closest-matching features in each image based on these descriptors to create correspondences between the two images. Finally, I implemented RANSAC to remove outlier correspondences and produce the final set, which I then passed to my warp function from Part A. I recorded the output at each step for one of my images, and the final output for three different images.
This part was very straightforward, as the provided starter code implemented it for me. The only change I made was to add a min_distance parameter, which let me vary the number of corners collected in the image to reduce computation time.
|
|
ANMS was the hardest piece of this project to implement. It was difficult to decipher what I was supposed to loop through and what I was supposed to calculate, but once I realized that I simply needed to compute r for every corner and then take the top num_points corners, sorted by r, it made a lot more sense. With this intuition, I used a doubly nested for loop over all the corners (I know, a little ugly, but the runtime was not terrible): for each corner x_i, I calculated the suppression value against every other corner x_j and took the minimum of those values to get r_i, repeating this for every corner. Once I had all the r values, I sorted the corners by r and returned the first num_points of them. I usually chose num_points to be 500, except for the flyer board image, where I found it helpful to increase it to 750.
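The nested loop described above can be sketched roughly like this. It is an illustration under assumptions, not my exact code: `corners` is taken to be an (N, 2) array with matching Harris `strengths`, and I've included the robustness constant from the original MOPS paper (c_robust = 0.9), a detail not spelled out above.

```python
import numpy as np

def anms(corners, strengths, num_points=500, c_robust=0.9):
    """Adaptive Non-Maximal Suppression.

    For each corner x_i, r_i is the distance to the nearest corner x_j
    that is sufficiently stronger (c_robust * strengths[j] > strengths[i]).
    Keeping the corners with the largest r_i yields points that are both
    strong and evenly spread across the image.
    """
    n = len(corners)
    r = np.full(n, np.inf)  # corners with no stronger neighbor keep r = inf
    for i in range(n):
        for j in range(n):
            if c_robust * strengths[j] > strengths[i]:
                d = np.linalg.norm(corners[i] - corners[j])
                r[i] = min(r[i], d)
    # sort by suppression radius, largest first, and keep the top num_points
    order = np.argsort(-r)
    return corners[order[:num_points]]
```

The O(n^2) double loop matches the "a little ugly, but the runtime was not terrible" approach; with a few thousand Harris corners it finishes quickly enough in practice.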
|
|
Feature extraction was very simple. I iterated through all the corners, grabbed the 40x40 image patch surrounding each one, downsampled it to 8x8, then subtracted the mean and divided by the standard deviation (bias and gain normalization). This generated robust features for the area surrounding every corner. Then I performed feature matching using the 1-NN/2-NN thresholding technique: for every image patch in the first image, I calculated its SSD to all image patches in the second image (using the provided dist2 function). I then found the nearest and second-nearest neighbors for every patch in the first image (the two smallest distances), divided the first by the second, and compared the ratio to a threshold. This threshold varied between 0.1 and 0.5 in practice, but I found that being more permissive was logically better, since the points would still be passed to RANSAC afterwards to remove outliers.
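A rough sketch of these two steps follows. The helper names are hypothetical, the strided downsample is a simplification of a proper blurred resize, and the brute-force SSD matrix stands in for the provided dist2 function.

```python
import numpy as np

def extract_descriptors(image, corners, patch=40, down=8):
    """40x40 window around each (y, x) corner, downsampled to 8x8,
    then bias/gain normalized (zero mean, unit standard deviation)."""
    half, step = patch // 2, patch // down
    feats = []
    for y, x in corners:
        window = image[y - half:y + half, x - half:x + half]
        if window.shape != (patch, patch):  # skip corners too close to the border
            continue
        small = window[::step, ::step]      # naive downsample by striding
        feats.append((small - small.mean()) / (small.std() + 1e-8))
    return np.array(feats).reshape(len(feats), -1)

def match_features(f1, f2, threshold=0.5):
    """1-NN/2-NN ratio test: keep a match when the SSD to the nearest
    neighbor is well below the SSD to the second-nearest neighbor."""
    # pairwise SSD between every descriptor in f1 and every one in f2
    d = ((f1[:, None, :] - f2[None, :, :]) ** 2).sum(-1)
    matches = []
    for i in range(len(f1)):
        order = np.argsort(d[i])
        nn1, nn2 = d[i, order[0]], d[i, order[1]]
        if nn1 / (nn2 + 1e-8) < threshold:
            matches.append((i, order[0]))
    return matches
```

A lower threshold keeps only very confident matches; as noted above, a more permissive value works fine here because RANSAC cleans up the survivors anyway.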
|
|
RANSAC was relatively easy to implement. I ran a loop for a large number of iterations (usually 10,000 in practice). Within this loop, I randomly sampled four points from the correspondences, calculated the exact homography between them (using the code from Part A), and then counted the inliers for that homography across the entire set of correspondences. To decide whether a point was an inlier, I took the L2 norm of the difference between the correspondence in one image and the result of applying the homography to the other image's correspondence, and thresholded that value. In each iteration, I checked whether the current set of inliers was larger than the current maximum, and if so, saved it as the new maximal set. Afterwards, I returned the maximal set.
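The loop can be sketched as follows, assuming a `compute_homography(pts1, pts2)` helper from Part A that returns a 3x3 matrix; the helper name and the pixel threshold `eps` are my own placeholders.

```python
import numpy as np

def ransac(pts1, pts2, compute_homography, n_iters=10000, eps=2.0):
    """RANSAC for homography estimation.

    Repeatedly fit an exact homography to 4 random correspondences and
    keep the largest inlier set, where a correspondence is an inlier
    when the L2 norm between the projected point (in inhomogeneous
    coordinates) and its partner is below eps.
    """
    best = np.zeros(len(pts1), dtype=bool)
    for _ in range(n_iters):
        idx = np.random.choice(len(pts1), 4, replace=False)
        H = compute_homography(pts1[idx], pts2[idx])
        # map every pts1 point through H using homogeneous coordinates
        ones = np.ones((len(pts1), 1))
        proj = (H @ np.hstack([pts1, ones]).T).T
        proj = proj[:, :2] / proj[:, 2:3]  # divide out the w coordinate
        inliers = np.linalg.norm(proj - pts2, axis=1) < eps
        if inliers.sum() > best.sum():
            best = inliers
    return pts1[best], pts2[best]
```

Because each sample fits an exact homography to just four points, a single contaminated sample only wastes one iteration; over 10,000 iterations, some sample is very likely to be all inliers, and that one recovers the consensus set.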
|
|
Finally, I took the correspondences output by the RANSAC algorithm and passed them to the warp-and-blend function I created for Part A. The results are below. Although it worked very well for the library mosaic, the flyer board mosaic ended up a little off, so you see some image doubling, while the mosaic of my roommate was a full disaster. However, I believe the failure of that mosaic is due solely to the input data being very weak, as there simply aren't enough distinctive corners for the algorithm to detect. By the time RANSAC finished, it usually had at most 12 correspondences, and they often mapped multiple points to the same correspondence point.
|
|
|
The second part of this homework was much more satisfying, as I did not spend hours meticulously inputting points while hoping that better data was all my code needed, as in Part A. The coolest thing I learned from this project was definitely the power of downsampling for robustness. I had often heard of the technique, but watching it work so effectively with little to no fiddling with the data was really impressive. Overall, this was a really fun second half of the project, as the steps were clear and the results were insightful!