I used a DSLR on a tripod to shoot images of a wall at my parents' house, with exposure and focus locked and with significant field-of-view overlap. I additionally took a picture of a hallway in my house for rectification, using my iPhone with AE/AF locking enabled. Additional mosaic photos were likewise taken with my iPhone using AE/AF locking.
I first manually defined correspondences between the two images (or, in the case of image rectification, between one image and an artificial rectangle). Overlapping fields of view help with image correspondence, providing a larger area over which to define points, since manual correspondence is easiest on sharp edges and small, well-defined objects. I have visualized the correspondences for a mosaic of my room below:
room-1 correspondence
room-2 correspondence
hallway
Using the image of a hallway within my house, I manually defined separate correspondences around a doorway and a mural, both perpendicular to the camera's shooting direction, then computed the homographies against a manually defined rectangle [[0, 0], [w, 0], [0, h], [w, h]]. I have visualized the rectified images below:
hallway rectified (doorway)
hallway rectified (mural)
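The homography solve behind rectification can be sketched as follows. This is a minimal illustration, not my exact code: `compute_homography` is a hypothetical name, and it uses the standard DLT-style least-squares setup with h33 fixed to 1, which matches the four-corner rectangle setup described above.

```python
import numpy as np

def compute_homography(src, dst):
    """Least-squares homography mapping src points to dst points.

    src, dst: (N, 2) arrays of corresponding (x, y) points, N >= 4.
    Each correspondence contributes two rows of the linear system
    A h = b, with the bottom-right entry h33 fixed to 1.
    """
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

# Rectification example: map four hand-picked corners of a planar
# surface onto an axis-aligned w x h rectangle.
corners = np.array([[10, 10], [90, 20], [15, 95], [95, 100]], float)
rect = np.array([[0, 0], [100, 0], [0, 100], [100, 100]], float)
H = compute_homography(corners, rect)
```

With exactly four generic correspondences the system is square, so the solve is exact; with more points it is the least-squares fit.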
house-1
house-2
room-1
room-2
wall-1
wall-2
After defining correspondences between the images, I designate a reference image and warp the other image to it, as visualized below:
house-1 warped
room-1 warped
wall-1 warped
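The warping step can be sketched as an inverse warp: rather than pushing source pixels forward (which leaves holes), each destination pixel is mapped back through H⁻¹ and sampled. This is an illustrative sketch with nearest-neighbor sampling, not my exact implementation; `warp_image` is a hypothetical name.

```python
import numpy as np

def warp_image(img, H, out_shape):
    """Inverse-warp img into an out_shape = (height, width) canvas.

    H maps source coordinates to destination coordinates, so we apply
    H^-1 to every destination pixel and sample the source with
    nearest-neighbor interpolation, leaving out-of-bounds pixels zero.
    """
    Hh, Ww = out_shape
    ys, xs = np.mgrid[0:Hh, 0:Ww]
    dest = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = np.linalg.inv(H) @ dest          # back-project each pixel
    sx = np.round(src[0] / src[2]).astype(int)
    sy = np.round(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < img.shape[1]) & (sy >= 0) & (sy < img.shape[0])
    out = np.zeros((Hh, Ww) + img.shape[2:], img.dtype)
    out[ys.ravel()[valid], xs.ravel()[valid]] = img[sy[valid], sx[valid]]
    return out
```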
After placing the images (warped and unwarped) into an empty canvas, which required predicting the output size from the warped image corners, I initially used naive image blending to overlap the images, producing strong edge artifacts. I later reused my Project 2 multiresolution blending code to blend the images with a Laplacian pyramid scheme.
house-1 house-2 mosaic naive
house-1 house-2 mosaic
room-1 room-2 mosaic
wall-1 wall-2 mosaic
The coolest thing I learned in this project was the power of homographies. The concept of rectification is somewhat intuitive from a mathematical point of view but absolutely mind-blowing in practice with images. Likewise, the potential for mapping surfaces with homographies, both for mosaicing and for non-projective mapping (which I wish I had time to explore), puts homographies among the coolest techniques I have learned in this class.
green room left
green room mid
green room right
Extending my implementation to support an arbitrary number of images required some fine-tuning to calculate output sizes correctly, but the overall implementation follows the same formula. Currently my implementation only supports one set of correspondences per image pair; extending it to support an arbitrary number of correspondences between arbitrary images would allow full panoramas, as opposed to the current ~180° field of view. I warp images taken to the left/right of a user-defined reference image toward the reference image using homographies, as visualized below:
green room left warped
green room right warped
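The multi-image output-size calculation generalizes the two-image case: warp every image's corners into the reference frame (the reference gets the identity homography) and bound them all at once. A rough sketch, with `global_canvas` as a hypothetical name:

```python
import numpy as np

def global_canvas(shapes, homographies):
    """Canvas bounds for an arbitrary number of images.

    shapes: list of (h, w) per image; homographies: list of 3x3
    matrices into the reference frame (identity for the reference).
    Returns the (height, width) canvas size and the translation that
    shifts all warped coordinates to be non-negative.
    """
    xs, ys = [], []
    for (h, w), H in zip(shapes, homographies):
        corners = np.array([[0, 0, 1], [w, 0, 1], [0, h, 1], [w, h, 1]], float).T
        p = H @ corners
        xs.extend(p[0] / p[2])
        ys.extend(p[1] / p[2])
    min_x, min_y = min(xs), min(ys)
    size = (int(np.ceil(max(ys) - min_y)), int(np.ceil(max(xs) - min_x)))
    offset = np.array([-min_x, -min_y])
    return size, offset
```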
Finally, I used my Project 2 multiresolution blending implementation to iteratively join the warped/reference images, calculating offsets to place them on the same image plane and then a mask for the Laplacian pyramid. The output is visualized below:
green room mosaic
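The Laplacian-pyramid blend itself can be sketched as below. This is a simplified stand-in for my Project 2 code, assuming float images and a binary seam mask; each band of detail is combined under a progressively blurred mask so that low frequencies blend over wider regions than high frequencies.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def laplacian_blend(a, b, mask, levels=4, sigma=2.0):
    """Blend images a and b with a soft seam via Laplacian pyramids.

    a, b: float images of identical shape; mask is 1.0 where a is kept.
    At each level, the Laplacian band (image minus its blur) of each
    input is mixed under the current mask, then image and mask are
    blurred for the next, coarser level.
    """
    ga, gb, gm = a.astype(float), b.astype(float), mask.astype(float)
    out = np.zeros_like(ga)
    for _ in range(levels):
        la = ga - gaussian_filter(ga, sigma)   # detail band of a
        lb = gb - gaussian_filter(gb, sigma)   # detail band of b
        out += gm * la + (1 - gm) * lb
        ga = gaussian_filter(ga, sigma)
        gb = gaussian_filter(gb, sigma)
        gm = gaussian_filter(gm, sigma)        # soften mask each level
    out += gm * ga + (1 - gm) * gb             # blend the low-pass residual
    return out
```

Because the bands telescope back to the original image, blending an image with itself reconstructs it exactly, which makes a handy sanity check.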
clark-kerr-1
clark-kerr-2
skyline-1
skyline-2
I used the given starter code harris.py to generate Harris points for both images. Given the high resolution of modern cameras, I was initially computing upwards of 100,000 Harris points per image. Due to the high computational complexity of the ANMS algorithm, I was forced to resize images. Anecdotally, the optimal number of Harris points seems to be around 10,000, achieved with a resize fraction of 0.2. Visualized below are the Harris points of several source images:
clark-kerr-1 harris
clark-kerr-2 harris
house-1 harris
house-2 harris
room-1 harris
room-2 harris
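For reference, a rough stand-in for what harris.py computes (this is not the starter code itself, just a minimal sketch of the Harris response from the structure tensor, with hypothetical function names):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(img, sigma=1.0, k=0.05):
    """Harris corner response map for a grayscale float image.

    Builds the structure tensor from Gaussian-smoothed products of
    image gradients, then scores each pixel with det - k * trace^2.
    """
    Iy, Ix = np.gradient(img.astype(float))
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

def top_harris_points(img, n=10000):
    """(row, col) coordinates of the n strongest Harris responses."""
    R = harris_response(img)
    flat = np.argsort(R.ravel())[::-1][:n]
    return np.column_stack(np.unravel_index(flat, R.shape))
```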
I implemented ANMS by iterating over points, computing for each point the minimum squared distance to any other point where H_point < 0.9 * H_other. I then sort the list in descending order and choose the top 500, as suggested in the paper. Visualized below are the results of ANMS across a variety of source images:
clark-kerr-1 ANMS
clark-kerr-2 ANMS
house-1 ANMS
house-2 ANMS
room-1 ANMS
room-2 ANMS
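The ANMS step described above can be sketched as follows. This is a simplified O(N²) version under my own naming (`anms` is hypothetical); each point's suppression radius is its squared distance to the nearest sufficiently stronger point, and the points with the largest radii are kept for an even spatial spread.

```python
import numpy as np

def anms(coords, strengths, n_best=500, c_robust=0.9):
    """Adaptive non-maximal suppression (MOPS-style).

    coords: (N, 2) point locations; strengths: (N,) Harris responses.
    A point's radius is the minimum squared distance to any point j
    with strengths[i] < c_robust * strengths[j]; the global maximum
    gets an infinite radius and is always retained.
    """
    n = len(coords)
    radii = np.full(n, np.inf)
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    for i in range(n):
        stronger = strengths[i] < c_robust * strengths  # points dominating i
        if stronger.any():
            radii[i] = d2[i, stronger].min()
    keep = np.argsort(radii)[::-1][:n_best]  # descending radius order
    return coords[keep]
```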
To produce feature descriptors for each point in the ANMS output, we extract an axis-aligned 40x40 patch around each feature, then downsample to an 8x8 patch, which is saved as a length-64 vector. To match features, we apply Lowe's thresholding to the ratio e1-NN/e2-NN, only keeping features where best distance / second-best distance falls below the threshold, usually set around 0.3. Visualized below are the results of Lowe's thresholding across a variety of source images:
clark-kerr-1 lowe
clark-kerr-2 lowe
house-1 lowe
house-2 lowe
room-1 lowe
room-2 lowe
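The descriptor extraction and ratio test can be sketched as below. This is an illustrative version, not my exact code: `describe` and `lowe_match` are hypothetical names, patches that fall off the image edge are simply skipped, and descriptors are bias/gain normalized before matching.

```python
import numpy as np

def describe(img, points, patch=40, out=8):
    """Axis-aligned descriptors: 40x40 window averaged down to 8x8.

    points are (row, col); returns the length-64 descriptors and the
    subset of points whose windows fit inside the image.
    """
    half, step = patch // 2, patch // out
    descs, kept = [], []
    for r, c in points:
        r, c = int(r), int(c)
        if r - half < 0 or c - half < 0 or r + half > img.shape[0] or c + half > img.shape[1]:
            continue
        win = img[r - half:r + half, c - half:c + half]
        small = win.reshape(out, step, out, step).mean(axis=(1, 3))  # downsample
        small = (small - small.mean()) / (small.std() + 1e-8)        # normalize
        descs.append(small.ravel())
        kept.append((r, c))
    return np.array(descs), np.array(kept)

def lowe_match(d1, d2, thresh=0.3):
    """Lowe's ratio test: keep pairs where the best squared distance
    is below thresh times the second-best squared distance."""
    dists = ((d1[:, None, :] - d2[None, :, :]) ** 2).sum(-1)
    matches = []
    for i, row in enumerate(dists):
        order = np.argsort(row)
        if row[order[0]] < thresh * row[order[1]]:
            matches.append((i, order[0]))
    return matches
```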
To implement a RANSAC iteration, I randomly sample a 4-point correspondence to generate a homography, warp all points under it, compare them to their actual counterparts in the other image, and keep only the features with squared error below some threshold, usually around 1 pixel. Finally, we upscale these points to the source image sizes. Visualized below are the results of RANSAC with 1000 iterations across a variety of source images:
clark-kerr-1 RANSAC
clark-kerr-2 RANSAC
house-1 RANSAC
house-2 RANSAC
room-1 RANSAC
room-2 RANSAC
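The RANSAC loop can be sketched as follows; this is a hedged, self-contained sketch (`ransac_homography` and `fit_homography` are hypothetical names) using the same DLT least-squares solve as Part A, an exact fit per 4-point sample, and a final refit on the largest inlier set.

```python
import numpy as np

def fit_homography(src, dst):
    """Least-squares DLT homography with h33 fixed to 1."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def ransac_homography(src, dst, iters=1000, thresh=1.0, seed=0):
    """RANSAC over random 4-point samples.

    Fits an exact homography to each sample, counts points with
    squared reprojection error below thresh, keeps the largest inlier
    set, and refits on all of its inliers.
    """
    rng = np.random.default_rng(seed)
    src_h = np.column_stack([src, np.ones(len(src))])
    best = np.zeros(len(src), bool)
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        proj = H @ src_h.T
        proj = (proj[:2] / proj[2]).T
        inliers = ((proj - dst) ** 2).sum(1) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    return fit_homography(src[best], dst[best]), best
```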
After implementing RANSAC, I spent several days generating tons of 2-image mosaics, but was consistently unable to tune the feature-matching and RANSAC thresholds to choose correct points. My code usually found 2-3 accurate correspondences but failed to properly match 4, producing an incorrect warp that generated several unintended artifacts and usually resulted in an unusable gradient of pixels. Visualized below are an interesting incorrect homography as well as an almost-correct mosaic output from my incorrect implementation:
skyline accidental rectification & green room mosaic
Since my code seemed to almost align the images, I figured my implementation must be mostly correct and that the error lay in the resolution or subject matter of the photos I was taking. I spent several days banging my head against the wall trying to figure this out. It was only after Prof. Efros mentioned his own incorrect implementation, which still produced almost-correct results due to the sheer power of the algorithm, that I examined my code more carefully and realized I had made a similar error: sorting the ANMS minimum distances in ascending rather than descending order.
After fixing this bug, I used my Project 2 multiresolution blending code to produce the following autostitched mosaics:
clark kerr mosaic
skyline mosaic
room mosaic
house mosaic
wall mosaic
green-room mosaic
In general, my autostitching code generates mosaics of equal or better quality than my hand-labeled ones from Part A. There are times (as in room-mosaic) where the autostitched alignment is far more flush and there are fewer edge artifacts than with the manual correspondences:
autostitched room mosaic
manual-correspondence room mosaic
autostitched wall mosaic
manual-correspondence wall mosaic
autostitched green-room mosaic
manual-correspondence green-room mosaic
The entire autostitching algorithm is by far the most interesting technique I have learned, maybe in any class. I was skeptical that such a relatively simple algorithm could identify correct correspondences and generate good-looking mosaics. My expectations, however, were blown away by the quality of the mosaics produced by autostitching, which exceeded even my tedious, intricate manual correspondences from Part A. ANMS and Lowe's thresholding, in particular, were the most mind-blowing algorithms I have seen thus far in this class.