Below are the pictures I took for this project. All pictures were captured on an iPhone X, but were resized digitally for more efficient computation.
Images for Rectification
Images for Stitching
The relationship between two photos taken from the same center of projection can be represented with a perspective projection transformation, which is encoded by a 3x3 homography matrix H.
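As a sketch of how H can be recovered from point correspondences (assuming NumPy; the function name compute_homography is mine, not from the starter code): fixing H[2][2] = 1, each correspondence contributes two linear equations in the remaining eight unknowns, so four or more correspondences give a system solvable by least squares.

```python
import numpy as np

def compute_homography(pts1, pts2):
    """Estimate H such that pts2 ~ H @ pts1 for n >= 4 correspondences.

    pts1, pts2: (n, 2) arrays of (x, y) points. Fixes H[2, 2] = 1 and
    solves the resulting 2n x 8 linear system by least squares.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts1, pts2):
        # Each correspondence gives two rows of the linear system.
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```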
Image Rectification
Example 1: For this image, I selected the four corners of the iPad and chose target coordinates to map them to: [100, 100], [500, 100], [500, 650], [100, 650]. The application of the resulting homography matrix H is shown on the right.
Example 2: For this image, I selected the four corners of the painting and chose target coordinates to map them to: [100, 100], [650, 100], [650, 650], [100, 650]. The application of the resulting homography matrix H is shown on the right.
Because I placed my images on a larger canvas, I was able to warp the left and right images so that they lined up directly with the middle image after the homography transformation. Then, I used alpha feathering to blend the three images together. For the region of overlap between two images (say im1 and im2), I set the pixel values with the following formula: pixel = (alpha)(im1) + (1 - alpha)(im2). Filling in the pixels from left to right (im1 to im2), the alpha value started at 1 and gradually decreased to 0 by the time the pixels closest to im2 were filled. This way, pixels at the edges of the overlap region were weighted toward the image they were closest to, and pixels in the middle of the overlap region were an equal blend of the two images. This also accounts for the faded lines at the top and bottom of the stitched image: they are the remnants of blending an image with the black background.
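A minimal sketch of that feathering step, assuming NumPy, color images already aligned on the shared canvas, and illustrative overlap column bounds x0 and x1:

```python
import numpy as np

def blend_overlap(im1, im2, x0, x1):
    """Linearly feather im1 into im2 over columns [x0, x1)."""
    # Outside the overlap, take im1 to the left and im2 to the right.
    cols = np.arange(im1.shape[1])[None, :, None]
    out = np.where(cols < x0, im1, im2)
    # Alpha ramps from 1 (nearest im1) down to 0 (nearest im2).
    alpha = np.linspace(1.0, 0.0, x1 - x0)[None, :, None]
    out[:, x0:x1] = alpha * im1[:, x0:x1] + (1 - alpha) * im2[:, x0:x1]
    return out
```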
Example 1:
Example 2:
Example 3:
The most challenging part of this project was determining how to set up the images so that the warped images would be easy to blend together in the final step. If you look at my code, I have written a few different warp functions that all perform the same main job but output their results a little differently (in terms of dimensions). For image rectification, I determined the output size of my warp by looking at where the four corners of the original image mapped based on the homography matrix H. However, for actually stitching the images together, I placed the images on a larger canvas and warped them in place, so that each warped image directly overlaid the image its points were being mapped to. This simplified my blending process in the final step.
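A sketch of the warp-in-place idea (a simplified stand-in for my actual warp functions, assuming NumPy and nearest-neighbor sampling): for every canvas pixel, apply the inverse homography and sample the source image, so the warped result lands directly in canvas coordinates and can be blended without further alignment.

```python
import numpy as np

def warp_onto_canvas(im, H, canvas_shape):
    h, w = canvas_shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Homogeneous coordinates of every canvas pixel.
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    # Inverse warp: find where each canvas pixel comes from in im.
    src = np.linalg.inv(H) @ pts
    sx, sy = np.round(src[:2] / src[2]).astype(int)
    valid = (0 <= sx) & (sx < im.shape[1]) & (0 <= sy) & (sy < im.shape[0])
    canvas = np.zeros((h, w) + im.shape[2:], dtype=im.dtype)
    canvas[ys.ravel()[valid], xs.ravel()[valid]] = im[sy[valid], sx[valid]]
    return canvas
```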
It was really interesting to see how you can stitch images together using simple transformations. However, it was definitely tedious to pick all the correspondence points one by one. You can see in some of my panoramas that the trees/foliage are a little blurry because I did not select very many correspondence points there.
I used the provided starter code to implement the Harris Interest Point Detector, requiring a minimum distance of 10 pixels between corners. Here are the Harris corners overlaid on the images I took.
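For reference, a sketch of the detection step using scikit-image's corner_harris and peak_local_max as a stand-in for the provided starter code (the sigma and method parameters are illustrative; only the minimum distance of 10 comes from my setup):

```python
import numpy as np
from skimage.feature import corner_harris, peak_local_max

def get_harris_corners(im_gray, min_distance=10):
    # Harris response map over a 2-D grayscale image in [0, 1].
    h = corner_harris(im_gray, method='eps', sigma=1)
    # Local maxima of the response, at least min_distance pixels apart.
    coords = peak_local_max(h, min_distance=min_distance)
    return h, coords  # coords is an (n, 2) array of (row, col) positions
```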
Here are the results of the Harris Corners algorithm on the images from Example 2:
I implemented the Adaptive Non-Maximal Suppression (ANMS) algorithm described in the Multi-Image Matching using Multi-Scale Oriented Patches paper. For each Harris corner interest point, I calculated the distance to the nearest interest point whose corner strength was sufficiently larger, i.e., the nearest point satisfying H(current interest point) <= 0.9 * H(other interest point). Then, I sorted these suppression radii in descending order and took the first 300 points as the relevant points of interest to use in future transformations.
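A sketch of this ANMS step (assuming NumPy; strengths holds the Harris response at each coordinate, and the O(n^2) pairwise-distance computation is acceptable at this scale):

```python
import numpy as np

def anms(coords, strengths, n_keep=300, c_robust=0.9):
    # Pairwise distances between all interest points, shape (n, n).
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    n = len(coords)
    radii = np.full(n, np.inf)
    for i in range(n):
        # Points sufficiently stronger than point i: H_i <= 0.9 * H_j.
        stronger = strengths[i] <= c_robust * strengths
        if stronger.any():
            radii[i] = dists[i, stronger].min()
    # Keep the n_keep points with the largest suppression radii.
    return coords[np.argsort(-radii)[:n_keep]]
```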
Here are the results of Adaptive Non-Maximal Suppression on the images from Example 2:
For each of the 300 points identified by the Adaptive Non-Maximal Suppression algorithm, I extracted a feature descriptor. The descriptor consisted of a 40x40-pixel window sampled around each interest point, which was subsequently subsampled down to 8x8 pixels. Sampling from a larger window and subsampling makes the descriptor robust to small shifts in the interest point location. I normalized each 8x8 descriptor to account for any changes in brightness between images.
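A sketch of the descriptor extraction (assuming NumPy and an interest point at least 20 pixels from the image border; taking every 5th pixel is a simple stand-in for the exact subsampling scheme I used):

```python
import numpy as np

def extract_descriptor(im_gray, r, c):
    patch = im_gray[r - 20:r + 20, c - 20:c + 20]      # 40x40 window
    desc = patch[::5, ::5]                             # subsample to 8x8
    # Bias/gain normalization so brightness shifts do not affect matching.
    desc = (desc - desc.mean()) / (desc.std() + 1e-8)
    return desc.flatten()
```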
Here is an example of a feature descriptor from one of my images.
In order to find matching features between two images, I used the following approach: I computed the sum of squared differences (SSD) between every pair of feature descriptors, and for each feature in the first image found its nearest and second-nearest neighbors in the second image. Following Lowe's method, I kept a match only when the ratio of the nearest-neighbor error to the second-nearest-neighbor error was below a threshold, since a correct match should be much better than its next-best alternative.
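A sketch of this matching step (assuming NumPy; the ratio threshold of 0.5 is illustrative):

```python
import numpy as np

def match_features(desc1, desc2, ratio_thresh=0.5):
    # SSD between every pair of descriptors: shape (n1, n2).
    ssd = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(axis=-1)
    nn = np.argsort(ssd, axis=1)
    e1 = ssd[np.arange(len(desc1)), nn[:, 0]]   # nearest-neighbor error
    e2 = ssd[np.arange(len(desc1)), nn[:, 1]]   # second-nearest error
    good = e1 / e2 < ratio_thresh               # Lowe's ratio test
    # Return (index into desc1, index into desc2) pairs.
    return np.stack([np.nonzero(good)[0], nn[good, 0]], axis=1)
```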
Here are the results of the feature matching for Example 2:
However, the feature matching algorithm above was not robust to outliers: some of the matches it produced were not actual matches. I implemented the RANSAC algorithm to prevent these outliers from ruining the least squares estimate. I repeatedly drew random samples of 4 matching features, computed an exact homography from each sample, and determined the inliers and outliers under it. After 10,000 random samples, I took the largest set of inliers as the relevant correspondence points for computing the final homography. Then, I proceeded as in Part 1 to warp and blend the images together.
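A sketch of the RANSAC loop (assuming NumPy and the compute_homography sketch from earlier; the inlier threshold eps, in pixels, is an illustrative value):

```python
import numpy as np

def ransac_homography(pts1, pts2, n_iters=10000, eps=2.0):
    best_inliers = np.zeros(len(pts1), dtype=bool)
    p1_h = np.hstack([pts1, np.ones((len(pts1), 1))])   # homogeneous pts1
    for _ in range(n_iters):
        # Fit an exact homography to a random 4-point sample.
        idx = np.random.choice(len(pts1), 4, replace=False)
        H = compute_homography(pts1[idx], pts2[idx])
        # Project all of pts1 by H and measure reprojection error.
        proj = p1_h @ H.T
        proj = proj[:, :2] / proj[:, 2:3]
        inliers = np.linalg.norm(proj - pts2, axis=1) < eps
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the largest inlier set for the final estimate.
    return compute_homography(pts1[best_inliers], pts2[best_inliers])
```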
Example 1:
Example 2:
Example 3:
Even though this part required implementing more algorithms, it was, by far, the most rewarding exercise because I could avoid hand-selecting the correspondence points. It was really interesting to see the process of actually selecting the final correspondence points to be used in calculating the homography. The algorithms were very logical, almost as if they were imitating the steps that a human would take to identify correspondence points. I especially liked RANSAC and Lowe's Method because they were really simple in their premise, but their results were astounding.