In this project, we will be stitching a bunch of photos together to create composite photos, mosaics, and panoramas. The project has two parts, one where we manually define correspondences and one where we automate this process.
Here are the pictures I shot for this project, as plotted in main.ipynb. I used these lower-resolution display images from the notebook rather than the originals I shot so that they fit on this webpage without excessive downscaling.
Note that I didn't actually shoot these, as I didn't have convenient access to many nicely shaped / tiled surfaces. Here are the sources for the following two images that I did use: (1) https://www.hgtv.com/design/remodel/bathroom-remodel/reasons-to-choose-porcelain-tile (2) https://unsplash.com/s/photos/floor-tiles
Next, I defined a function, computeH, as specified in the spec, that computes the homography given two images and two sets of correspondence points (which I specified by hand). Below, we visualize the correspondence points I used for (in this case) the street view images, report the resulting homography matrix, and display the left and right street views each warped using that matrix. To do the latter, I defined a warpImage function, per the spec, that warps an image using a given homography matrix.
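For reference, a least-squares version of computeH might look like the sketch below. This is not necessarily my submitted implementation — the point format ((N, 2) arrays of (x, y) coordinates) is an assumption on my part — but it shows the standard setup: each correspondence contributes two linear equations, and we fix the bottom-right entry of H to 1, leaving eight unknowns.

```python
import numpy as np

def computeH(im1_pts, im2_pts):
    """Least-squares homography H with H @ [x, y, 1]^T ~ [x', y', 1]^T.

    im1_pts, im2_pts: (N, 2) arrays of corresponding (x, y) points, N >= 4.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(im1_pts, im2_pts):
        # Two equations per correspondence; h_33 is fixed to 1,
        # so 8 unknowns are solved in the least-squares sense.
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b += [xp, yp]
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float),
                            rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

With more than four correspondences the system is overdetermined, which is exactly why least squares (rather than an exact solve) is used here.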
Now we can use the warpImage function we defined in the last subpart to "rectify" images. Specifically, we'll start with images containing planar surfaces and warp them so that the plane is seen from a top-down view (rather than from whatever angle we happened to photograph it at). This was made easier by choosing images with square markings on their planar surfaces, whose side lengths I could roughly estimate using Euclidean distances between the marked corners.
As I had already displayed (and credited) the two floor images above, here I'll get right to it by showing the images with the correspondence points I defined, and then the result of warping them onto a scaled version of the unit square [0 0; 0 1; 1 0; 1 1], with the scale chosen by guesstimating side lengths.
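The rectification setup can be sketched as follows. This is an illustrative snippet, not my exact code: the function name `rectify_H`, the `widths`/`scale` parameters, and the corner ordering (top-left, top-right, bottom-left, bottom-right) are all my own choices here. The idea is just to map four guesstimated plane corners onto a scaled axis-aligned square and solve the resulting 8x8 system exactly.

```python
import numpy as np

def rectify_H(corners_px, widths=(1.0, 1.0), scale=300):
    """Homography sending 4 clicked plane corners (tl, tr, bl, br order)
    onto a scaled axis-aligned square whose aspect ratio comes from the
    guesstimated side lengths in `widths` (width, height)."""
    w, h = widths
    dst = scale * np.array([[0, 0], [w, 0], [0, h], [w, h]], float)
    A, b = [], []
    for (x, y), (xp, yp) in zip(corners_px, dst):
        # Two equations per corner; with exactly 4 corners the
        # 8x8 system has a unique solution.
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b += [xp, yp]
    hvec = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(hvec, 1.0).reshape(3, 3), dst
```

The returned H can then be fed to a warpImage-style function to produce the top-down view.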
Finally, in this part we'll create mosaics: we projectively warp the left of the two images into the projection of the right, then use weighted averaging and alpha blending to smoothly blend the two images around the boundary of the composite, yielding a clean, nice mosaic image. Some examples of my results are below!
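One common way to do the weighted averaging is with distance-transform alpha masks, sketched below. To be clear, this is a plausible reconstruction rather than my exact blend function — the signature (pre-aligned images on a shared canvas plus their validity masks) is an assumption. Each pixel's weight is its distance to the nearest edge of that image's valid region, so the weights fall off smoothly toward the seam.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def blend(warped_left, right, left_mask, right_mask):
    """Weighted-average blend of two pre-aligned images on a shared canvas.

    left_mask / right_mask are boolean arrays marking where each image
    has valid pixels; weights come from the distance transform of each.
    """
    wl = distance_transform_edt(left_mask).astype(float)
    wr = distance_transform_edt(right_mask).astype(float)
    total = wl + wr
    total[total == 0] = 1.0          # avoid 0/0 outside both images
    alpha = wl / total               # per-pixel share of the left image
    return alpha[..., None] * warped_left + (1 - alpha[..., None]) * right
```

Outside the overlap, alpha is exactly 0 or 1, so each image passes through untouched; inside the overlap, the two images cross-fade.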
I still find it amazing that we can do what we did in the rectifying images section. That is, that we can take an image at an angle and recover a different view of it (e.g. a top-down view). It's so fascinating to me that all the information we need is already there and just has to be projected a different way to recover it (e.g. using homographies). I wonder if there's a future for art where, instead of an artist having to plan how some intricate design will look at the particular angle they're hoping for, there's some easy-access tool where they design everything in 2D and project it however they want to accomplish the desired effect. Will hand-drawn illustration soon be replaced by some easier, projection-based image compositing software?
Now, we'll be aiming to automate the process of part A (defining correspondences and computing homographies for eventual mosaic stitching) using RANSAC and insights from the paper "Multi-Image Matching using Multi-Scale Oriented Patches" by Brown et al.
In order to automate the process of defining correspondences, we first used the Harris Interest Point Detector to find corner points in our images. I first resized my images to make running Harris less computationally expensive. Then I reformatted the resulting coordinates and plotted them on their original images with a scatter plot. See results below.
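My project used provided Harris detector code, but the response map it produces can be sketched from scratch as below (the function name and parameter defaults here are my own). The structure tensor is built from smoothed products of image gradients, and the corner strength is det minus k times the squared trace.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(im, sigma=1.0, k=0.05):
    """Harris corner-strength map for a grayscale float image."""
    Ix = sobel(im, axis=1)           # horizontal gradient
    Iy = sobel(im, axis=0)           # vertical gradient
    # Smooth the gradient products to form the local structure tensor.
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    # Corners: both eigenvalues large -> large det, moderate trace.
    return det - k * trace ** 2
```

Peaks of this map are the candidate interest points; edges score negatively because det is small while trace is large.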
Next, we used Adaptive Non-maximal Suppression (ANMS) to select the top 500 points out of the Harris corners. This process was eased by the corner-strength (h) values we had already obtained when computing the Harris corners. Below are the top 500 points as determined for my two example images.
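The ANMS selection rule from the Brown et al. paper can be sketched like this (an O(n^2) illustration with my own naming, not my exact code): for each point, find the radius to the nearest point that is sufficiently stronger, then keep the points with the largest radii so the survivors are both strong and spatially spread out.

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive non-maximal suppression (Brown et al. style).

    A point's suppression radius is the distance to the nearest point
    whose strength exceeds strength_i / c_robust; keep the n_keep
    points with the largest radii.
    """
    coords = np.asarray(coords, float)
    strengths = np.asarray(strengths, float)
    n = len(coords)
    radii = np.full(n, np.inf)   # globally strongest point keeps inf
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    for i in range(n):
        stronger = strengths > strengths[i] / c_robust
        if stronger.any():
            radii[i] = np.sqrt(d2[i, stronger].min())
    return np.argsort(-radii)[:n_keep]
```

The c_robust factor (0.9 in the paper) makes suppression require a clearly stronger neighbor, which spreads points more evenly than plain top-500-by-strength.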
In this part, we analyzed both images a patch at a time and at a lower frequency (by convolving the images with a Gaussian) to build a general (rather than overly exact) and efficient feature descriptor. Afterwards, we went about matching the features by computing, for each feature, the ratio between the distances to its nearest and second-nearest neighbors and accepting the match only if this ratio fell below a certain threshold (Lowe's ratio test). Below is the result of our feature matching on our left and right street view images.
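The two steps can be sketched as follows. This is an illustrative version with assumptions of mine baked in: the 40x40-window-subsampled-to-8x8 descriptor size follows the MOPS paper, while the function names, the (row, col) point format, and the default ratio threshold are hypothetical.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def describe(im, points, blur=2.0):
    """8x8 descriptors subsampled from blurred 40x40 windows,
    bias/gain normalized; border points are skipped."""
    smooth = gaussian_filter(im, blur)
    descs, kept = [], []
    for r, c in points:
        r, c = int(r), int(c)
        if r < 20 or c < 20 or r >= im.shape[0] - 20 or c >= im.shape[1] - 20:
            continue
        patch = smooth[r - 20:r + 20:5, c - 20:c + 20:5]  # 8x8 subsample
        d = patch.ravel().astype(float)
        d = (d - d.mean()) / (d.std() + 1e-8)             # bias/gain norm
        descs.append(d)
        kept.append((r, c))
    return np.array(descs), kept

def match(d1, d2, ratio=0.6):
    """Lowe-style ratio test on (squared) descriptor distances: accept a
    pair only when the nearest neighbor is much closer than the second."""
    dists = ((d1[:, None, :] - d2[None, :, :]) ** 2).sum(-1)
    matches = []
    for i, row in enumerate(dists):
        order = np.argsort(row)
        if row[order[0]] < ratio * row[order[1]]:
            matches.append((i, order[0]))
    return matches
```

The blur-then-subsample step is what makes the descriptor tolerant of small misalignments, and the ratio test rejects ambiguous features that look similar to multiple candidates.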
In this part, we used the RANSAC algorithm to compute a homography between our two images. RANSAC essentially served as outlier rejection amongst our matched features: it repeatedly fits homographies to small random samples of matches and keeps the largest set of inliers, so that the final homography is modeled only from features that agree with it. The resulting points / features post-RANSAC are displayed below.
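The 4-point RANSAC loop can be sketched as below (again a sketch with my own naming and defaults, not my exact code): sample four matches, fit an exact homography, count how many matches it reprojects within a pixel threshold, keep the best, and refit on all of its inliers at the end.

```python
import numpy as np

def fit_H(p1, p2):
    """Least-squares homography from >= 4 (x, y) correspondences."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(p1, p2):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b += [xp, yp]
    h, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float),
                            rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def ransac_H(p1, p2, iters=1000, eps=2.0, rng=None):
    """4-point RANSAC: keep the homography with the most inliers,
    then refit it on all of those inliers."""
    rng = np.random.default_rng(rng)
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    p1h = np.hstack([p1, np.ones((len(p1), 1))])
    best_inliers = np.zeros(len(p1), bool)
    for _ in range(iters):
        idx = rng.choice(len(p1), 4, replace=False)
        H = fit_H(p1[idx], p2[idx])
        proj = p1h @ H.T
        proj = proj[:, :2] / proj[:, 2:3]
        inliers = np.linalg.norm(proj - p2, axis=1) < eps
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return fit_H(p1[best_inliers], p2[best_inliers]), best_inliers
```

Because any sample containing even one outlier produces a homography few matches agree with, the largest consensus set reliably corresponds to the true transformation.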
Then came the actual stitching part! Here, we reused the blend function we defined in part A to mosaic our images together. Below are three examples of automosaiced images, along with the manual mosaic we had computed earlier as a comparison / reference.
Still think it's crazy that we can match points / features across images using a simple nearest neighbor ratio and thresholding per Lowe's! How convenient, especially when our images have a lot of potential features!