CS180 Project 4

Image Warping and Mosaicing

By Lance Mathias

Introduction

In this project I performed image rectification using homography and created panoramas from sets of multiple images.

Bells and whistles:

  * Multiscale features
  * Rotation-invariant features
  * Auto panorama detection
  * Projection into spherical coordinates
  * Rotation-only model
  * 360 degree panorama

Computing the Homography

First, I marked corresponding keypoints on matching features in each pair of images:

Then, using the coordinates of the keypoints, I computed the homography by solving the following overdetermined system in the least-squares sense with NumPy’s least-squares solver: \[\begin{bmatrix} x_1 & y_1 & 1 & 0 & 0 & 0 & -x'_1 x_1 & -x'_1 y_1 \\ 0 & 0 & 0 & x_1 & y_1 & 1 & -y'_1 x_1 & -y'_1 y_1 \\ x_2 & y_2 & 1 & 0 & 0 & 0 & -x'_2 x_2 & -x'_2 y_2 \\ 0 & 0 & 0 & x_2 & y_2 & 1 & -y'_2 x_2 & -y'_2 y_2 \\ & & & & \vdots\end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \\ h_3 \\ h_4 \\ h_5 \\ h_6 \\ h_7 \\ h_8 \end{bmatrix} = \begin{bmatrix} x'_1 \\ y'_1 \\ x'_2 \\ y'_2 \\ \vdots \end{bmatrix}\]

Once the \(h_i\) coefficients were computed, the homography matrix could be constructed as follows:

\[H = \begin{bmatrix} h_1 & h_2 & h_3 \\ h_4 & h_5 & h_6 \\ h_7 & h_8 & 1 \end{bmatrix}\]
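Concretely, a minimal NumPy sketch of this least-squares setup (the function and variable names are my own, not from the project code):

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate the homography mapping src points to dst points.

    src, dst: (N, 2) arrays of (x, y) correspondences, N >= 4.
    Builds the system shown above and solves it with np.linalg.lstsq.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y])
        A.append([0, 0, 0, x, y, 1, -yp * x, -yp * y])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)   # the last entry of H is fixed to 1
```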

Coordinate warps could be computed using homogeneous coordinates by multiplying by the homography matrix and then scaling:

\[\begin{bmatrix} x' \\ y' \\ w \end{bmatrix} = \begin{bmatrix} h_1 & h_2 & h_3 \\ h_4 & h_5 & h_6 \\ h_7 & h_8 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \text{ transformed coordinates are } \begin{bmatrix} \frac{x'}{w} \\ \frac{y'}{w} \end{bmatrix}\]
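A companion sketch for warping points through H, following the homogeneous-coordinate recipe above:

```python
import numpy as np

def apply_homography(H, pts):
    """Warp (N, 2) points through the 3x3 homography H."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # lift to homogeneous coordinates
    warped = pts_h @ H.T                              # rows are (x', y', w)
    return warped[:, :2] / warped[:, 2:3]             # divide by w to get pixel coords
```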

Image Rectification

To rectify images, I first marked keypoints on the corners of the surface I wanted to rectify. I then computed the homography and used inverse warping with interpolation to map the desired surface onto a flat, front-on plane:
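A rough sketch of the inverse-warping step; the interpolation routine is an assumption on my part (the write-up doesn’t name one), here SciPy’s map_coordinates with bilinear interpolation on a single-channel image:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def inverse_warp(img, H, out_shape):
    """Inverse-warp a single-channel image into an out_shape canvas.

    H maps source-image coordinates to output-canvas coordinates, so we pull
    each output pixel back through H^-1 and sample the source image there.
    """
    H_inv = np.linalg.inv(H)
    ys, xs = np.indices(out_shape)                              # output pixel grid
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])  # (3, P) in (x, y, 1)
    src = H_inv @ coords                                        # pull back to source plane
    src = src[:2] / src[2]                                      # de-homogenize
    # map_coordinates expects (row, col) = (y, x) ordering; order=1 is bilinear
    return map_coordinates(img, [src[1], src[0]], order=1, cval=0.0).reshape(out_shape)
```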

Image Mosaicing

To create a panorama from multiple images, the procedure is pretty similar:

  1. Create a large output image and choose a “reference image” to be on the same plane as the output canvas
  2. Normalize each image’s histogram to account for differences in exposure (see the sketch after this list)
  3. Using keypoints, compute the homography and warp all other images to be on the same plane as the reference image
  4. For each image, compute a “distance to zero” metric using bwdist, then use this metric to create a mask for each image (only include points from each image which are farthest from an edge, i.e. farthest from a zero pixel)
  5. Use Laplacian Pyramid blending on the masks to smoothly blend all images together
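For step 2, a minimal sketch of the exposure normalization; the write-up only says the histograms were normalized, so matching each image to the chosen reference image with skimage’s match_histograms is an assumption on my part:

```python
from skimage import exposure

def normalize_exposure(images, ref_index=0):
    """Match every image's histogram to that of a reference image.

    images: list of (H, W, 3) color arrays. channel_axis=-1 assumes color
    images (skimage >= 0.19); the choice of histogram matching vs. plain
    equalization is an assumption here.
    """
    ref = images[ref_index]
    return [exposure.match_histograms(im, ref, channel_axis=-1) for im in images]
```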

More on the Blending Process

For each image, I inverse warp it to the final image plane. This means that the area that the image warps to is nonzero, but the majority of the rest of the canvas is zero. Below we can see our partial output and an example of a newly warped image side-by-side.

We can then call bwdist on each of these partial images to get this kind of distance map (one for the interpolated image and one for the output image so far):

Then we can make a mask of all the areas where the distance-map value is larger for the newly warped image:
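A sketch of that mask, using SciPy’s distance_transform_edt as a stand-in for bwdist (for each nonzero pixel it returns the distance to the nearest zero pixel, which is exactly the “distance to zero” metric above):

```python
from scipy.ndimage import distance_transform_edt

def seam_mask(new_valid, old_valid):
    """Mask that is True where the newly warped image should win.

    new_valid / old_valid: boolean arrays marking where the warped image and
    the current output canvas have valid (nonzero) pixels.
    """
    dist_new = distance_transform_edt(new_valid)   # distance to nearest invalid pixel
    dist_old = distance_transform_edt(old_valid)
    return dist_new > dist_old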

Then we can use this mask to perform Laplacian Pyramid blending on the two partial images, yielding the results you see below.
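For reference, a compact sketch of the blending step using Gaussian/Laplacian stacks (no downsampling); the write-up doesn’t say which pyramid variant was used, and the level count and blur schedule here are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def laplacian_blend(im1, im2, mask, levels=4, sigma=2.0):
    """Blend im1 over im2 using Laplacian stacks and a progressively blurred mask.

    im1, im2: single-channel float images of the same shape.
    mask: 1.0 where im1 should dominate, 0.0 where im2 should.
    """
    g1, g2, gm = im1.astype(float), im2.astype(float), mask.astype(float)
    out = np.zeros_like(g1)
    for _ in range(levels):
        b1 = gaussian_filter(g1, sigma)
        b2 = gaussian_filter(g2, sigma)
        # band-pass (Laplacian) layers, blended with the current mask level
        out += gm * (g1 - b1) + (1 - gm) * (g2 - b2)
        g1, g2 = b1, b2
        gm = gaussian_filter(gm, sigma)
    out += gm * g1 + (1 - gm) * g2   # blended low-frequency residual
    return out
```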

Results

For each of my image sets, see the raw images, resulting pairwise keypoints, and final panorama below:

San Francisco from Berkeley

Unfortunately, there was a difference in exposure which was too great for our usual tools of histogram normalization and pyramid blending to fix completely.

Some Mountains

Flamingos

Detecting Corner Features in an Image

Running the sample code gives us a map of the “corner strength” at each pixel, as well as a list of candidate corner coordinates whose corner strength is higher than that of their immediate neighbors:

There are far too many corners here to know what to do with them.
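The sample code itself isn’t reproduced here, but a sketch of that detection step with skimage looks roughly like this (the parameters are assumptions):

```python
from skimage.feature import corner_harris, peak_local_max

def get_harris_corners(im, min_distance=1):
    """Corner-strength map plus local-maximum corner coordinates.

    im: single-channel float image. Returns the Harris response and an
    (N, 2) array of (row, col) peak locations.
    """
    h = corner_harris(im, method='eps', sigma=1)            # corner strength per pixel
    coords = peak_local_max(h, min_distance=min_distance)   # local maxima of the response
    return h, coords
```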

Adaptive Non-Maximal Suppression

After getting Harris corners, we can implement ANMS to reduce the set of potential keypoints to strong corners that are roughly evenly spaced throughout the image. See the before/after ANMS comparison:
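A minimal sketch of the standard ANMS procedure (the robustness factor and the number of kept corners are typical values, not necessarily the ones used here):

```python
import numpy as np
from scipy.spatial.distance import cdist

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive non-maximal suppression.

    coords: (N, 2) corner coordinates; strengths: (N,) corner strengths.
    Each corner's suppression radius is its distance to the nearest corner
    that is sufficiently stronger; keep the n_keep largest radii.
    """
    d = cdist(coords, coords)                                  # pairwise distances
    stronger = strengths[None, :] * c_robust > strengths[:, None]
    radii = np.where(stronger, d, np.inf).min(axis=1)          # suppression radius per corner
    keep = np.argsort(-radii)[:n_keep]
    return coords[keep]
```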

Extracting Feature Descriptors

We can extract 40x40 patches around each corner returned by ANMS, and then downsample to get 8x8 feature patches. After bias/gain normalization, patches look something like this:
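A sketch of the patch extraction; the naive strided downsample stands in for whatever resampling was actually used, and the corner is assumed to be at least 20 px from the image border:

```python
import numpy as np

def extract_descriptor(im, r, c):
    """8x8 descriptor from a 40x40 window centred at (r, c), bias/gain normalized."""
    patch = im[r - 20:r + 20, c - 20:c + 20]        # 40x40 window around the corner
    small = patch[::5, ::5]                         # naive 8x8 downsample (every 5th pixel)
    return (small - small.mean()) / (small.std() + 1e-8)
```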

Matching Feature Descriptors

To match feature descriptors, we can compute the 1-NN and 2-NN of each feature patch (treating each feature patch as a 64-dimensional vector and using Euclidean distance via a nearest-neighbor library). We can then divide the distance to the 1-NN by the distance to the 2-NN, and if this ratio is below a certain threshold, we consider the 1-NN to be a match.

I couldn’t come up with a pretty visualization for what essentially boils down to nearest-neighbor queries in a 64-dimensional space, so no pictures here :(
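In lieu of a picture, a minimal sketch of the ratio-test matching with a k-d tree (the ratio threshold is a typical value, not necessarily the one used here):

```python
import numpy as np
from scipy.spatial import cKDTree

def match_features(desc1, desc2, ratio=0.6):
    """Ratio-test matching between two sets of flattened 8x8 descriptors.

    desc1: (N, 64), desc2: (M, 64). Returns (i, j) index pairs of matches.
    """
    tree = cKDTree(desc2)
    dists, idx = tree.query(desc1, k=2)          # 1-NN and 2-NN for each descriptor
    good = dists[:, 0] / dists[:, 1] < ratio     # keep only unambiguous matches
    return [(i, idx[i, 0]) for i in np.nonzero(good)[0]]
```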

Robust Feature Matching

Despite the previous preprocessing, some features are still matched incorrectly. To fix this, we can use RANSAC to filter out “outlier” correspondences that are inconsistent with the dominant homography. To illustrate this, we can plot correspondences between two images before and after applying RANSAC:
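A sketch of the RANSAC loop, reusing the compute_homography and apply_homography sketches from earlier; the iteration count and inlier threshold are assumptions:

```python
import numpy as np

def ransac_homography(src, dst, n_iters=1000, eps=2.0, rng=None):
    """RANSAC over 4-point homographies.

    src, dst: (N, 2) matched points. Returns the homography refit on the
    largest consistent inlier set, plus the inlier mask.
    """
    rng = np.random.default_rng() if rng is None else rng
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        sample = rng.choice(len(src), 4, replace=False)
        H = compute_homography(src[sample], dst[sample])       # exact 4-point fit
        err = np.linalg.norm(apply_homography(H, src) - dst, axis=1)
        inliers = err < eps
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return compute_homography(src[best_inliers], dst[best_inliers]), best_inliers
```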

The Actual Results

Since I didn’t have to mark keypoints by hand, I could try out some bigger and more impressive (different) image sets than before:

SF from Across the Bay

Base images:

Autostitched result:

Berkeley Marina

Base images:

Autostitched result:

Some Mountains

OK, technically these are the same set of images as before, but I was able to use even more of them because I didn’t have to mark keypoints by hand. Thus, I consider it different since the output covers a much wider field of view:

Base images:

Autostitched result:

Bells and Whistles: Multiscale Features

To implement multiscale feature selection, we can run the above steps on multiple levels of a Gaussian pyramid. Here, we visualize the corner features at 4 different scales:

After matching feature pairs between images at each scale, we can combine correspondence pairs from each scale to improve our results vs. single-scale implementations.
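A sketch of the multiscale detection, assuming the get_harris_corners helper from earlier; the number of levels and the blur amount are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multiscale_corners(im, n_levels=4):
    """Harris corners from several pyramid levels, mapped back to full-res coordinates."""
    all_coords = []
    level = im.astype(float)
    for lvl in range(n_levels):
        _, coords = get_harris_corners(level)
        all_coords.append(coords * (2 ** lvl))          # undo the downsampling (approximately)
        level = gaussian_filter(level, 1.0)[::2, ::2]   # blur then subsample for the next level
    return np.vstack(all_coords)
```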

Bells and Whistles: Rotation-Invariant Features

To get rotation invariance, we can compute the orientation of each corner as described in the paper:

We can then extract rotated feature patches, instead of just axis-aligned patches:

Then, we can rotate each feature patch by its corner orientation to get rotation-invariant feature descriptors (note that these are shown before downsampling, for clarity):
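A rough sketch of the oriented-patch extraction; the blur amount, window size, and rotation sign convention are assumptions on my part:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, rotate

def oriented_descriptor(im, r, c):
    """Rotation-invariant 8x8 descriptor around (r, c).

    The orientation is taken from the heavily blurred gradient at the corner;
    an oversized window is rotated to cancel that orientation, then the
    central 40x40 patch is downsampled and normalized. Assumes (r, c) is
    well inside the image.
    """
    gy, gx = np.gradient(gaussian_filter(im.astype(float), 4.5))
    theta = np.degrees(np.arctan2(gy[r, c], gx[r, c]))      # corner orientation
    window = im[r - 30:r + 30, c - 30:c + 30].astype(float)  # oversized 60x60 window
    rotated = rotate(window, theta, reshape=False, order=1)  # sign depends on coordinate convention
    patch = rotated[10:50, 10:50]                             # central 40x40
    small = patch[::5, ::5]
    return (small - small.mean()) / (small.std() + 1e-8)
```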

Auto-Matched Mosaics

Given a set of images in arbitrary order, some of which form panoramas and some of which do not, we can automatically detect which images belong together. To do this, we count the number of robust (post-RANSAC) correspondences between each pair of images and consider a pair to belong to the same panorama if the count exceeds a predetermined threshold. Once we’ve determined the matching pairs, we can build a graph where each node is an image, nodes are connected if the images are adjacent in a panorama, and each connected component is a panorama:

We can then discard all singleton nodes and run the autostitching algorithm from above on each connected component.
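A sketch of that grouping step with SciPy’s connected_components; the match-count threshold is an assumed value:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def group_panoramas(match_counts, thresh=20):
    """Group image indices into panoramas.

    match_counts[i, j]: number of post-RANSAC correspondences between images
    i and j. Returns one array of image indices per panorama, with
    singleton images dropped.
    """
    adj = csr_matrix(match_counts >= thresh)              # edge if enough robust matches
    n_comp, labels = connected_components(adj, directed=False)
    groups = [np.flatnonzero(labels == k) for k in range(n_comp)]
    return [g for g in groups if len(g) > 1]              # discard singleton nodes
```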

Projection into Spherical Coordinates and Rotation-only Model (combined)

For some images, we get mosaics with undesirably warped images, especially if the panorama represents a particularly wide field of view:

By projecting our images into spherical coordinates and finding feature patches and correspondences between the spherical images, we can take advantage of a rotation-only model that fits just two parameters: a pitch offset and an angle (yaw) offset. Since the pitch is the same for all images (I shot them on a tripod), I could in theory get away with only the angle offset, but I didn’t try this since the two variants are nearly identical to implement.
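A sketch of the spherical projection via inverse mapping; the focal length f (in pixels) is an input the write-up doesn’t specify how to obtain, and the principal point is assumed to be the image center:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def to_spherical(im, f):
    """Project a single-channel image onto a sphere of focal length f (pixels)."""
    h, w = im.shape
    yc, xc = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.indices((h, w)).astype(float)
    theta = (xs - xc) / f                       # longitude of each output pixel
    phi = (ys - yc) / f                         # latitude of each output pixel
    # unproject each sphere sample back onto the flat image plane
    x_src = f * np.tan(theta) + xc
    y_src = f * np.tan(phi) / np.cos(theta) + yc
    return map_coordinates(im, [y_src, x_src], order=1, cval=0.0)
```

Once every image is in spherical coordinates, a pure rotation of the camera becomes (approximately) a translation in (theta, phi), which is why the alignment model only needs the two offsets described above.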

This yields a greatly improved mosaic, with noticeably less distortion in the foreground:

360 Degree Panorama

After the previous part was completed, I could use the same techniques as before to easily create a 360 degree panorama. I took 12 images of the Berkeley Marina using a tripod:

And the final result:

Reflection

My favorite part of this project was rectification. I was surprised to learn that even in cases with pretty extreme skewing, we could stretch the image pretty heavily in our homography without running into any aliasing issues. In the image mosaic portion of the project, I thought it was cool that I could apply my Laplacian pyramid algorithm from before to improve my results.

In the second part of the project, I enjoyed seeing the correspondences between two images slowly emerge after doing ANMS, then matching features, and finally performing RANSAC so only the best correspondences remained.

In implementing the bells and whistles, I was impressed with how much I could automate the process. When I was done, I could just feed in a bunch of images and get multiple panoramas out with no input required from me.

I also liked how we didn’t have to manually mark correspondences after implementing the second part of the project.