
# Image Warping and Mosaicing (Part I)

## 1. Recover Homographies

The homography, also known as a perspective transform, operates on homogeneous coordinates. We have

$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} = H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$

where H is an arbitrary 3 x 3 matrix,

$H = \begin{bmatrix} h_{00} & h_{01} & h_{02} \\ h_{10} & h_{11} & h_{12} \\ h_{20} & h_{21} & 1 \end{bmatrix}$

To solve for the 8 unknowns, we set up a linear system of the form Ah = b, where each point correspondence contributes two equations:

$(h_{20}x + h_{21}y + 1)x' = h_{00}x + h_{01}y + h_{02}$

$(h_{20}x + h_{21}y + 1)y' = h_{10}x + h_{11}y + h_{12}$

With n correspondences we get 2n equations; for n > 4 the system is overdetermined and can be solved in the least-squares sense.

Thus, given im1_pts and corresponding im2_pts, the computeH function gives a homography matrix that recovers the transformation. In this first part, the corresponding points are found using a mouse-clicking interface. I labeled the following images, which were taken on my phone.
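Given the two equations per correspondence above, computeH reduces to a small least-squares solve. Here is a numpy sketch (variable names mirror the text; the project's actual code may differ):

```python
import numpy as np

def computeH(im1_pts, im2_pts):
    """Least-squares homography mapping im1_pts to im2_pts.

    Both inputs are (n, 2) arrays of (x, y) points with n >= 4.
    Each correspondence contributes two rows to the system A h = b.
    """
    im1_pts = np.asarray(im1_pts, dtype=float)
    im2_pts = np.asarray(im2_pts, dtype=float)
    n = len(im1_pts)
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (xp, yp)) in enumerate(zip(im1_pts, im2_pts)):
        A[2 * i]     = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        b[2 * i], b[2 * i + 1] = xp, yp
    h, *_ = np.linalg.lstsq(A, b, rcond=None)  # the 8 unknowns
    return np.append(h, 1.0).reshape(3, 3)     # h_22 is fixed to 1
```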

## 2. Warp the Images

For this part, I wrote a function, warpImage, that takes in an image im and a homography matrix H, and computes the warped image.
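One standard way to implement warpImage is inverse warping: iterate over output pixels, map them back through H⁻¹, and sample the source image. The sketch below uses nearest-neighbor sampling and keeps the output canvas the same size as the input (both simplifying assumptions; a full implementation would compute a bounding box for the warped corners and interpolate):

```python
import numpy as np

def warpImage(im, H):
    """Warp im by homography H via inverse warping (nearest neighbor).

    Output pixels whose preimage falls outside im are left as zeros.
    """
    h, w = im.shape[:2]
    Hinv = np.linalg.inv(H)
    # All output pixel coordinates in homogeneous form.
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    # Map each output pixel back into the source image.
    src = Hinv @ coords
    src_x = np.round(src[0] / src[2]).astype(int)
    src_y = np.round(src[1] / src[2]).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(im)
    out_flat = out.reshape(h * w, -1)
    src_idx = src_y[valid] * w + src_x[valid]
    out_flat[np.flatnonzero(valid)] = im.reshape(h * w, -1)[src_idx]
    return out
```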

## 3. Image Rectification

The image warping function also allows us to "rectify" an image, which means transforming images with planar surfaces so that the plane is frontal-parallel. Here are a few results:

#### The Economist (frontal-parallel)

The most notable difference is perhaps the line containing date information. It was barely readable from a distance, but now, after the transformation, it is much more legible.

## 4. Blend the images into a mosaic

Finally, we can blend the images to create a single image mosaic. We leave one image (image 2, the "target" image) unwarped, and warp the other into its projection. Here are the two input images again:

#### Image 2

Here is the mosaic result:

### Example 2

#### Fire Trail 2

Here is the mosaic result:

#### Mosaic

Manually selecting correspondences is both exhausting and inaccurate, which causes the blurry central region in the image. In the second part of the project, I will work on automatically selecting correspondences.
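Independent of how the correspondences are chosen, the blending of the overlap region itself can be sketched as distance-transform feathering. This is a common choice and an assumption here, since the write-up does not name the exact blend used; the function below expects two images already warped onto the same canvas, with zeros where an image has no content:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def feather_blend(im1, im2):
    """Alpha-blend two pre-aligned color images on a shared canvas.

    Each pixel's weight is its distance to the edge of that image's
    coverage, so weights fall off smoothly toward the seam.
    Assumes pixel values are nonnegative and zero means "no content".
    """
    m1 = (im1.sum(axis=-1) > 0).astype(float)
    m2 = (im2.sum(axis=-1) > 0).astype(float)
    w1 = distance_transform_edt(m1)
    w2 = distance_transform_edt(m2)
    total = w1 + w2
    total[total == 0] = 1.0  # avoid division by zero outside both images
    a1 = (w1 / total)[..., None]
    a2 = (w2 / total)[..., None]
    return im1 * a1 + im2 * a2
```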

### Example 3

#### Tyndall Effect 2

Here is the mosaic result:

#### Mosaic

Since the two images are essentially screenshots from a video, they are significantly affected by changes in lighting conditions. Smoothing out these lighting variations to make the two parts of the mosaic appear more natural is another aspect I will address in the upcoming phase of the project.

# Feature Matching for Auto-Stitching (Part II)

In this second part of the project, I re-implemented the algorithms for multi-image matching described by Brown et al., with several simplifications. Specifically, the five steps are:
• Interest Point Detection (with single-scale Harris corners)
• Adaptive Non-Maximal Suppression (ANMS)
• Feature Descriptor Extraction
• Feature Matching
• Random Sample Consensus (RANSAC) for estimating the homography

## 1. Interest Point Detection

Using the harris package provided by staff, I was able to find the Harris corners for each image.
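The staff-provided harris package is not reproduced here, but a generic single-scale Harris response (structure tensor smoothed by a Gaussian window, followed by the det/trace corner measure) can be sketched as:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(im, sigma=1.0):
    """Harris corner response map for a grayscale image.

    A generic implementation for illustration; the staff package
    may differ in smoothing, scale, and corner measure.
    """
    Iy, Ix = np.gradient(im.astype(float))
    # Structure tensor entries, averaged with a Gaussian window.
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det / (trace + 1e-8)  # Harris-Stephens det/trace variant
```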

#### Indoor 2 (Harris Corners)

One immediate thing we can notice here is that there are way too many Harris corners. Ultimately, we are looking for invariant and distinctive point descriptors, so we need to get rid of the noisy corners and only keep the best results.

## 2. ANMS

To this end, we use the adaptive non-maximal suppression (ANMS) method discussed in the original paper. Here are the results of both images after we've applied ANMS to suppress the less optimal corners.
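Following the paper's definition, each corner's suppression radius is its distance to the nearest corner that is sufficiently stronger (by a robustness factor c ≈ 0.9), and we keep the corners with the largest radii. A direct O(n²) sketch, with illustrative parameter names:

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive non-maximal suppression.

    coords: (n, 2) corner positions; strengths: (n,) Harris responses.
    Returns the n_keep corners with the largest suppression radii.
    """
    coords = np.asarray(coords, dtype=float)
    strengths = np.asarray(strengths, dtype=float)
    n = len(coords)
    radii = np.full(n, np.inf)
    for i in range(n):
        # Corners that suppress corner i: strengths[i] < c * strengths[j].
        stronger = strengths > strengths[i] / c_robust
        if stronger.any():
            d2 = np.sum((coords[stronger] - coords[i]) ** 2, axis=1)
            radii[i] = np.sqrt(d2.min())
    keep = np.argsort(-radii)[:n_keep]
    return coords[keep]
```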

## 3. Feature Descriptor Extraction

Next, we extract descriptors of the local image structure to support reliable and efficient matching between images. As described in the MOPS paper, we start with a 40 x 40 window around each interest point and downsample it to 8 x 8 (sampling at a higher pyramid level, i.e. after applying a Gaussian filter, to avoid aliasing). The descriptor vector is then normalized to mean 0 and standard deviation 1.
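The extraction step can be sketched as follows (a simplified version that assumes the 40 x 40 window lies entirely inside the image and uses a fixed blur in place of a proper pyramid level):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def extract_descriptor(im, y, x):
    """MOPS-style descriptor at interest point (y, x) of a grayscale image.

    Blur, take a 40x40 window, subsample every 5th pixel to get 8x8,
    then normalize to zero mean and unit standard deviation.
    """
    blurred = gaussian_filter(im.astype(float), sigma=2.0)
    patch = blurred[y - 20:y + 20, x - 20:x + 20]
    desc = patch[::5, ::5]            # 40x40 -> 8x8
    desc = desc - desc.mean()         # bias normalization
    std = desc.std()
    return (desc / std if std > 0 else desc).ravel()  # gain normalization
```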

## 4. Feature Matching

With the descriptors extracted, we can move to the feature matching stage. We are essentially finding nearest-neighbors for the descriptors. It works as follows:
• For each descriptor in the first image, we compute the distances to all descriptors in the second image.
• Then, we identify the 1-NN and 2-NN and calculate the ratio of the distance of the 1-NN to the 2-NN.
• If this ratio is below the specified threshold (0.7, as shown in the paper), the match is considered good and is added to the list.
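The steps above amount to Lowe-style ratio matching, which can be sketched as (a brute-force version; it assumes the second image has at least two descriptors):

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.7):
    """Ratio-test matching between descriptor sets (n1 x d) and (n2 x d).

    For each descriptor in desc1, accept its nearest neighbor in desc2
    only if dist(1-NN) / dist(2-NN) < ratio. Returns (i, j) index pairs.
    """
    # Pairwise squared Euclidean distances, shape (n1, n2).
    d2 = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(-1)
    matches = []
    for i in range(len(desc1)):
        order = np.argsort(d2[i])
        nn1, nn2 = d2[i, order[0]], d2[i, order[1]]
        if np.sqrt(nn1) < ratio * np.sqrt(nn2):
            matches.append((i, order[0]))
    return matches
```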
Here are the matched features:

#### Indoor 1 (ANMS)

Most matches are good, although there are a few outliers, which we will address using the RANSAC algorithm.

## 5. RANSAC

The RANSAC loop works as follows:

```
repeat N times {
    select four feature pairs at random
    compute homography H from them
    compute the inliers where dist(p_i', H p_i) < ε
}
keep the largest set of inliers
re-compute the least-squares H estimate on all of the inliers
```
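This loop can be sketched in numpy as follows. The fit_H helper is a hypothetical least-squares homography solver included only so the sketch is self-contained, and the iteration count and ε are illustrative defaults, not the project's actual parameters:

```python
import numpy as np

def fit_H(p1, p2):
    """Least-squares homography mapping p1 -> p2 ((n, 2) arrays, n >= 4)."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(p1, p2):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp]); b.append(xp)
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp]); b.append(yp)
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float),
                            rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def ransac_H(p1, p2, n_iters=1000, eps=2.0, seed=0):
    """RANSAC: sample 4 pairs, fit H, count inliers by reprojection
    distance, keep the largest inlier set, then refit H on it."""
    rng = np.random.default_rng(seed)
    p1h = np.column_stack([p1, np.ones(len(p1))])
    best = np.zeros(len(p1), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(p1), 4, replace=False)
        H = fit_H(p1[idx], p2[idx])
        proj = p1h @ H.T
        proj = proj[:, :2] / proj[:, 2:3]
        inliers = np.linalg.norm(proj - p2, axis=1) < eps
        if inliers.sum() > best.sum():
            best = inliers
    return fit_H(p1[best], p2[best]), best
```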
Here is the mosaic generated using homography H computed this way:

#### Indoor (Mosaic)

Let's compare it to the previous version, whose homography was computed using hand-labeled points:

#### Indoor (Manual)

Auto-stitching provides us with a more stable and reliable way to create photo mosaics.

## More Examples

Example 2: Trail

#### Trail (Mosaic)

We can compare and see the improvements:

Example 3: Sky