Proj 6: Panoramas

Vivian Liu, cs194-26-aaf

Part 1: Technical Implementation

    For this project, we implemented the projective transform, which requires at least 4 pairs of correspondences. With these four pairs, eight points across the two images, we can synthetically warp images to a common plane even if they were taken at different angles. The important constraint is that all the shots share the same center of projection.

    The math behind the projective transform is as follows. We solve for eight unknowns, a through h, which are the degrees of freedom of the 3x3 projective matrix describing how two planes relate to one another. Though the matrix has nine elements, the last one, i, can be set to 1. Each correspondence initially gives three equations, which we collapse into two by substituting the last equation, w = gx + hy + 1, into the other two. Stacking these equations for all correspondences yields a linear system, which we solve for the eight variables with least squares.
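
     As a concrete illustration, here is a minimal sketch of that least-squares setup in Python, assuming correspondences are given as (N, 2) arrays of (x, y) points; the function name and array layout are my own choices, not taken from the project code.

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate the 3x3 homography (last element fixed to 1) mapping src to dst.

    src, dst: (N, 2) arrays of corresponding (x, y) points, N >= 4.
    Each pair contributes two equations after substituting
    w = g*x + h*y + 1 into the first two rows.
    """
    A, rhs = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # xp = (a*x + b*y + c) / (g*x + h*y + 1)
        # yp = (d*x + e*y + f) / (g*x + h*y + 1)
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        rhs.extend([xp, yp])
    params, *_ = np.linalg.lstsq(np.array(A), np.array(rhs), rcond=None)
    a, b, c, d, e, f, g, h = params
    return np.array([[a, b, c], [d, e, f], [g, h, 1.0]])
```

     Each correspondence contributes two rows, so four pairs give exactly eight equations and any additional pairs make the system overdetermined, which least squares handles naturally.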

     Then we have to warp every image onto the plane of the base image. To do so, I used inverse mapping, which samples from destination to source and avoids the splatting artifacts of forward warping.
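
     A sketch of that inverse mapping, assuming scipy is available for interpolation and that the output canvas size has already been computed from the warped corners (the bounding-box bookkeeping is omitted here):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def inverse_warp(img, H, out_shape):
    """Warp a color image img of shape (H, W, C) onto an output canvas of
    shape out_shape = (rows, cols, C) using homography H.

    For every destination pixel we apply H^-1 to find where it came from
    in the source image and sample there, so nothing needs to be splatted.
    """
    H_inv = np.linalg.inv(H)
    rows, cols = out_shape[:2]
    ys, xs = np.mgrid[0:rows, 0:cols]
    dest = np.stack([xs.ravel(), ys.ravel(), np.ones(rows * cols)])  # (x, y, 1)
    src = H_inv @ dest
    src_x, src_y = src[0] / src[2], src[1] / src[2]   # divide out w

    warped = np.zeros(out_shape)
    for c in range(out_shape[2]):
        # Bilinear interpolation; pixels that map outside the source get 0.
        warped[..., c] = map_coordinates(
            img[..., c], [src_y, src_x], order=1, cval=0.0
        ).reshape(rows, cols)
    return warped
```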

     To stitch the images together into a mosaic, I Laplacian blended the outputs so that pixels near the edges of each image became weighted averages of the overlapping images. To eliminate artifacts, I extrapolated values from the edges of each image.
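
     As a rough illustration of the blending step, here is a simplified two-band sketch in the spirit of Laplacian blending: low frequencies are feathered with distance-based weights and high frequencies are taken from whichever image dominates at that pixel. The grayscale input, mask handling, and choice of sigma are my assumptions, not the project's exact parameters.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt, gaussian_filter

def two_band_blend(im1, im2, mask1, mask2, sigma=5):
    """Blend two warped grayscale images that share one output canvas.

    mask1/mask2 mark where each image has valid pixels. Low frequencies
    are averaged with feathered weights (distance to each image's border);
    high frequencies come from whichever image has the larger weight.
    """
    # Feathered weights: grow toward the interior of each image.
    w1 = distance_transform_edt(mask1)
    w2 = distance_transform_edt(mask2)
    total = w1 + w2
    total[total == 0] = 1              # avoid dividing by zero outside both images
    alpha = w1 / total

    low1, low2 = gaussian_filter(im1, sigma), gaussian_filter(im2, sigma)
    high1, high2 = im1 - low1, im2 - low2

    low = alpha * low1 + (1 - alpha) * low2
    high = np.where(alpha > 0.5, high1, high2)
    return low + high
```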

     The coolest thing I learned from this project is that planes can be related to each other by matrices, even though the resulting mapping is nonlinear in pixel coordinates. I was very excited to see that by changing the coordinate space we operate in, we could also mosaic and stitch images in cylindrical or spherical space.

    Another thing I learned was the importance of not translating the camera in between shots. Many potential panoramas were ruined by the shift in center of projection.

Results

Input Images









Part 2: Technical Implementation

ANMS

    To automate feature selection and matching, we use adaptive non-maximal suppression (ANMS). For each corner, we find the minimum radius to another corner strong enough to suppress it, that is, one whose corner strength, scaled by 0.9, still exceeds the corner's own strength. Corner strength is given by the Harris corner response. Here are all the Harris corners returned on the first image.



     And the graph below shows the Harris corners kept after ANMS. As we can see, the ANMS points are more spaced out on the image canvas, but still quite regular in their overall layout. This is because after computing the suppression radii, we keep the 500 points with the largest radii. These points are the strongest maxima in the sense that they are not easily suppressed by nearby feature points.
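
     A sketch of the ANMS step under these definitions; the 0.9 robustness factor and the cap of 500 points follow the description above, while the brute-force O(N^2) loop is just for clarity:

```python
import numpy as np

def anms(coords, strengths, num_points=500, c_robust=0.9):
    """Adaptive non-maximal suppression.

    coords: (N, 2) array of Harris corner locations.
    strengths: (N,) array of Harris corner responses.
    For each corner, the suppression radius is the distance to the nearest
    corner strong enough to suppress it (its own strength is less than
    c_robust times the other's strength). We keep the corners with the
    largest radii so the survivors are both strong and spread out.
    """
    n = len(coords)
    radii = np.full(n, np.inf)
    for i in range(n):
        # Corners that suppress corner i.
        stronger = strengths[i] < c_robust * strengths
        if stronger.any():
            dists = np.linalg.norm(coords[stronger] - coords[i], axis=1)
            radii[i] = dists.min()
    keep = np.argsort(-radii)[:num_points]
    return coords[keep]
```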



Feature Description

     Afterwards, we create feature descriptors that capture the gist of the image around each point. A window around each point is downsampled and normalized so that we can compare the descriptor sets across two images. An example feature descriptor looks like this.
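
     A sketch of descriptor extraction. The 40x40 window, 8x8 downsampled patch, and zero-mean, unit-variance normalization are the standard MOPS-style choices and are my assumptions about the details, not necessarily the exact parameters used here:

```python
import numpy as np
from skimage.transform import resize

def extract_descriptor(img, y, x, window=40, patch=8):
    """Build a descriptor for the point (y, x) in a grayscale image.

    Samples a window x window region around the point (assumed to be at
    least window/2 pixels from the border), downsamples it to patch x patch,
    then normalizes to zero mean and unit variance so descriptors are
    comparable across images.
    """
    half = window // 2
    region = img[y - half:y + half, x - half:x + half]
    desc = resize(region, (patch, patch), anti_aliasing=True)
    desc = (desc - desc.mean()) / (desc.std() + 1e-8)
    return desc.ravel()
```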


1NN/2NN Thresholding

     Then we compute the SSD between every pair of feature patches across the two images to find, for each patch, its first and second nearest neighbors. Dividing the first-nearest-neighbor distance by the second-nearest-neighbor distance gives a relative discriminative ratio, which we can use as a quantitative indicator of outliers. The lower this ratio, the better, because it means we are much more confident about the first nearest neighbor than about any alternative. By thresholding this ratio, we pick the best pairs of points to run RANSAC on.
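
     A sketch of the matching step with the 1NN/2NN ratio test; the threshold value is illustrative, not the one used in the project:

```python
import numpy as np

def match_features(desc1, desc2, threshold=0.4):
    """Match descriptors with the 1NN/2NN ratio test.

    desc1: (N1, D) and desc2: (N2, D) arrays of flattened descriptors.
    For each descriptor in desc1, compute the SSD to every descriptor in
    desc2 and keep the match only if the best SSD divided by the
    second-best SSD falls below the threshold.
    """
    matches = []
    for i, d in enumerate(desc1):
        ssd = np.sum((desc2 - d) ** 2, axis=1)
        nn = np.argsort(ssd)[:2]           # 1st and 2nd nearest neighbors
        if ssd[nn[0]] / (ssd[nn[1]] + 1e-12) < threshold:
            matches.append((i, nn[0]))
    return matches
```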

     Prior to RANSAC, we have matching pairs that, plotted on an image canvas, look like this.



     (Thank you Ajay Ramesh for open sourcing the debug visualization tool!)

RANSAC

     RANSAC is how we refine our point pairs. We randomly sample a set of 4 point pairs and compute a homography matrix H from them. We then compute Hp for every source point p and measure the SSD between Hp and the corresponding point p' in the other image. Over 1000 iterations, we keep the homography that drives this SSD down for the most points, i.e., the one that grows the largest set of inlier points.
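
     A sketch of that loop, reusing the compute_homography sketch from Part 1; the 4-point sample, 1000 iterations, and SSD inlier test follow the description above, while the pixel tolerance is an assumed value:

```python
import numpy as np

def ransac_homography(src, dst, iters=1000, tol=2.0):
    """Robustly estimate a homography from matched points with RANSAC.

    src, dst: (N, 2) arrays of matched points. Each iteration samples 4
    pairs, fits a homography, maps all src points through it, and counts
    inliers whose squared distance to the corresponding dst point is below
    tol^2. The largest inlier set wins, and a final least-squares fit is
    run on those inliers (assumed to contain at least 4 points).
    """
    n = len(src)
    best_inliers = np.zeros(n, dtype=bool)
    src_h = np.column_stack([src, np.ones(n)])      # homogeneous coordinates
    for _ in range(iters):
        idx = np.random.choice(n, 4, replace=False)
        H = compute_homography(src[idx], dst[idx])
        proj = src_h @ H.T
        proj = proj[:, :2] / proj[:, 2:3]           # divide out w
        err = np.sum((proj - dst) ** 2, axis=1)
        inliers = err < tol ** 2
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return compute_homography(src[best_inliers], dst[best_inliers]), best_inliers
```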

     From there on out, it is the same procedure as in 6A: we warp the coordinates according to the best homography matrix, then interpolate and blend.

Coolest Thing

The coolest thing was implementing the research paper and the automation. At first, the ideas seemed conceptually difficult and distant, but after sitting with them and debugging for hours, they came to seem really elegant and remarkable. The automated feature matching will also come in handy for my own personal projects, so I'm glad to have it.

Results

Source images for the last panorama:

Side by side: manual, automatic