Ajay Ramesh, November 10, 2018

Auto-Stitching Photo Mosaics

Overview

In Part A, we focus on warping at least two images into the same basis by applying a perspective transform to all but one image. We define feature correspondences by hand.

1. Recover Homographies

A homography describes a perspective transform (perspective projection) between two planes. The homography matrix H has 9 entries, but since a homography is defined only up to scale, we can fix H(3,3) = 1 and solve for the remaining 8 unknowns. Each point correspondence provides two linearly independent equations, so we need at least four point correspondences (no three of them collinear) to recover the 8 unknowns. Since we solve for the elements of H with linear least squares, using more than four correspondences makes the estimate more robust to errors in the hand-labeled points. I'm using the "Direct Linear Transform" technique described in this paper.
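The least-squares setup above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name estimate_homography and the (N, 2) array layout of (x, y) points are assumptions. Each correspondence (x, y) -> (u, v) contributes two rows to a linear system in the 8 unknown entries of H, with H(3,3) fixed to 1.

```python
import numpy as np

def estimate_homography(src, dst):
    """Least-squares fit of a 3x3 H (with H[2, 2] = 1) such that dst ~ H @ src.

    src, dst: (N, 2) arrays of matching (x, y) points, N >= 4.
    The name and argument layout are illustrative assumptions.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2:
        raise ValueError("src and dst must both have shape (N, 2)")
    if len(src) < 4:
        raise ValueError("need at least four correspondences")

    n = len(src)
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (u, v)) in enumerate(zip(src, dst)):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), rearranged to be linear in h
        A[2 * i] = [x, y, 1, 0, 0, 0, -u * x, -u * y]
        # v = (h21 x + h22 y + h23) / (h31 x + h32 y + 1)
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -v * x, -v * y]
        b[2 * i] = u
        b[2 * i + 1] = v

    # Solve the (possibly overdetermined) system in the least-squares sense.
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

With exactly four correspondences the system is square and the fit is exact; with more, the residual is spread across all of the points.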

2. Rectification

I'm using an inverse warp technique that does the following:
  1. Estimate H between the points selected in the image and a user-defined set of "rectified" points
  2. Apply H to the four corners of the image to recover the warped corners
  3. Find the points inside the polygon enclosed by the four warped corners
  4. For each point inside the warped polygon, apply H^-1 to find the coordinate in the original image to sample a color from
  5. Set the value of each point inside the polygon to the interpolated color value at the coordinate found in (4)
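Steps 2 through 5 can be sketched roughly as below. This is an illustrative sketch under stated assumptions rather than the code used for the results: the names apply_homography and inverse_warp are hypothetical, bilinear interpolation is one possible choice of interpolation, and step 3 is simplified by visiting every pixel in the bounding box of the warped corners and zeroing the ones whose source coordinate falls outside the original image. It assumes the image is at least 2x2 pixels.

```python
import numpy as np

def apply_homography(H, pts):
    """Map (N, 2) points (x, y) through H and divide by the homogeneous coordinate."""
    pts = np.asarray(pts, dtype=float)
    p = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return p[:, :2] / p[:, 2:3]

def inverse_warp(img, H, eps=1e-9):
    """Warp img of shape (h, w) or (h, w, c) by H.

    Returns (warped, (x_min, y_min)), where (x_min, y_min) is the position
    of the output's top-left pixel in the warped coordinate frame.
    """
    img = np.asarray(img, dtype=float)
    h, w = img.shape[:2]

    # Step 2: forward-map the four corners to find the extent of the output.
    corners = np.array([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]], dtype=float)
    warped_corners = apply_homography(H, corners)
    x_min, y_min = np.floor(warped_corners.min(axis=0)).astype(int)
    x_max, y_max = np.ceil(warped_corners.max(axis=0)).astype(int)

    # Step 3 (simplified): enumerate every pixel in the bounding box.
    xs, ys = np.meshgrid(np.arange(x_min, x_max + 1), np.arange(y_min, y_max + 1))
    out_pts = np.column_stack([xs.ravel(), ys.ravel()])

    # Step 4: map each output pixel back into the original image.
    src = apply_homography(np.linalg.inv(H), out_pts)
    sx, sy = src[:, 0], src[:, 1]
    inside = (sx >= -eps) & (sx <= w - 1 + eps) & (sy >= -eps) & (sy <= h - 1 + eps)
    sx = np.where(inside, sx, 0.0)
    sy = np.where(inside, sy, 0.0)

    # Step 5: bilinear interpolation between the four neighboring pixels.
    x0 = np.clip(np.floor(sx).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(sy).astype(int), 0, h - 2)
    fx = np.clip(sx - x0, 0.0, 1.0)
    fy = np.clip(sy - y0, 0.0, 1.0)
    if img.ndim == 3:
        fx, fy = fx[:, None], fy[:, None]
    top = img[y0, x0] * (1 - fx) + img[y0, x0 + 1] * fx
    bottom = img[y0 + 1, x0] * (1 - fx) + img[y0 + 1, x0 + 1] * fx
    vals = top * (1 - fy) + bottom * fy
    vals[~inside] = 0.0  # pixels outside the warped polygon stay black

    return vals.reshape(xs.shape + img.shape[2:]), (int(x_min), int(y_min))
```

Looping over output pixels and sampling backwards, rather than pushing input pixels forward, is what guarantees that every pixel inside the warped polygon receives a value and no holes appear.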

For these image pairs, the image points are the four corners of the paper (or iPad), and the rectified points are the four corners of the image itself. I tried mimicking how popular apps like CamScanner and Scanner Pro work.

3. Mosaic

To construct the image mosaics, I used a technique similar to the one described in Section 2. But instead of finding the correspondences between some image and a rectified set of points, I find the correspondences between some image and another image. For these examples, I keep the right image fixed, and warp the left image towards the right image by applying an inverse perspective warp parameterized by the estimated homography matrix. After warping, I used a nonlinear blend (the element-wise max function) to "stitch" the images together, and the resulting mosaics are satisfactory. A homography only aligns two views exactly when the camera rotates about a fixed center (or the scene is planar), and since I took these pictures with my shaky hand, I was unable to ensure that there was no translation between captures. Therefore, you may notice some ghosting artifacts in some images if you look really closely.
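The element-wise max blend can be sketched as follows. The function name max_blend and the offset convention are assumptions made for illustration: offset is the (x, y) position of the warped image's top-left pixel in the fixed image's coordinate frame. The sketch also assumes that empty canvas pixels are zero and that intensities are non-negative, so that the max always prefers image content over the background.

```python
import numpy as np

def max_blend(fixed, warped, offset):
    """Place both images on a shared canvas and take the element-wise max.

    fixed, warped: arrays of shape (h, w) or (h, w, c) with matching channels.
    offset: integer (x, y) of warped's top-left pixel in fixed's frame
            (may be negative). This convention is an illustrative assumption.
    """
    ox, oy = offset
    hf, wf = fixed.shape[:2]
    hw, ww = warped.shape[:2]

    # Extent of the canvas that contains both images.
    x_min, y_min = min(0, ox), min(0, oy)
    x_max, y_max = max(wf, ox + ww), max(hf, oy + hw)
    canvas_shape = (y_max - y_min, x_max - x_min) + fixed.shape[2:]

    a = np.zeros(canvas_shape, dtype=float)
    b = np.zeros(canvas_shape, dtype=float)
    a[-y_min:-y_min + hf, -x_min:-x_min + wf] = fixed
    b[oy - y_min:oy - y_min + hw, ox - x_min:ox - x_min + ww] = warped

    # In the overlap the brighter pixel wins; elsewhere the only image present wins.
    return np.maximum(a, b)
```

The max avoids the visible seam of a hard cut, but it cannot hide misalignment: where the two images disagree in the overlap, both versions of a bright edge survive, which shows up as ghosting.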

4. Summary

I learned how homographies can be used to warp an image into the same "projective basis" as another image. I did not enjoy hand-labeling features.