Ajay Ramesh, November 10, 2018
Auto-Stitching Photo Mosaics
Overview
In Part A, we focus on warping at least two images into the same basis by applying a perspective transform to all but
one image. We define feature correspondences by hand.
1. Recover Homographies
A homography describes a perspective transform, or perspective projection. A homography matrix H has nine entries,
but since a homography is only defined up to scale, we can set H(3,3) = 1 and solve for the remaining 8 degrees
of freedom. Each point correspondence provides two linearly independent equations, so we need at least four
point correspondences to recover the 8 degrees of freedom. Since we will be using linear least squares to solve
for the vector representing the elements of H, we want more than four correspondences so that the estimate is
robust to small errors in the hand-labeled points. I'm using the "Direct Linear Transform" technique described in
this paper.
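The least-squares setup above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for this project: the function names (estimate_homography, apply_homography, solve_linear_system) are hypothetical, points are (x, y) tuples, and the system is solved through the normal equations in plain Python to stay dependency-free. A library least-squares routine would normally be the better choice.

```python
def solve_linear_system(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [[float(val) for val in M[i]] + [float(v[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate correspondences: system is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def estimate_homography(src, dst):
    """Least-squares H (with H(3,3) fixed to 1) mapping src points to dst points."""
    if len(src) != len(dst) or len(src) < 4:
        raise ValueError("need at least four matching point pairs")
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h1 x + h2 y + h3) / (h7 x + h8 y + 1), rearranged to be linear in h
        A.append([x, y, 1, 0, 0, 0, -x * u, -y * u])
        b.append(u)
        # v = (h4 x + h5 y + h6) / (h7 x + h8 y + 1), rearranged the same way
        A.append([0, 0, 0, x, y, 1, -x * v, -y * v])
        b.append(v)
    # Normal equations: (A^T A) h = A^T b
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(8)] for i in range(8)]
    Atb = [sum(row[i] * bi for row, bi in zip(A, b)) for i in range(8)]
    h = solve_linear_system(AtA, Atb)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_homography(H, point):
    """Map one (x, y) point through H, dividing out the homogeneous coordinate."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

Each correspondence contributes two rows to A, so four pairs give an exactly determined 8x8 system and any extra pairs are averaged out by the least-squares fit.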
2. Rectification
I'm using an inverse warp technique that does the following.
- Estimate H between some image and a user-defined set of "rectified" points
- Apply H to the four corners of the image to recover the warped corners
- Find the points inside the polygon enclosed by the four warped corners
- For each point inside the warped polygon, apply H^-1 to find the coordinate in the original image
to sample a color from
- Set the value of each point inside the polygon to the interpolated color value at the coordinate found
in the previous step
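The steps above can be sketched roughly as follows. This is an illustrative sketch with several assumptions, not the project's actual code: the image is a grayscale nested list indexed as image[row][col], all function names are hypothetical, the warped corners are assumed to stay finite, and instead of rasterizing the warped polygon it scans the polygon's bounding box and keeps only the pixels whose back-projected coordinate lands inside the source image.

```python
import math


def invert_3x3(H):
    """Invert a 3x3 matrix using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = H
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("homography is not invertible")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def project(H, x, y):
    """Apply H to (x, y); points sent to infinity come back as (inf, inf)."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        return (math.inf, math.inf)
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def bilinear_sample(image, x, y):
    """Interpolate image at a fractional (x, y) that lies inside its bounds."""
    x0, y0 = math.floor(x), math.floor(y)
    x1 = min(x0 + 1, len(image[0]) - 1)
    y1 = min(y0 + 1, len(image) - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


def inverse_warp(image, H, fill=0.0):
    """Warp image by H; returns (warped image, (x, y) offset of its top-left pixel)."""
    height, width = len(image), len(image[0])
    corners = [project(H, x, y) for x, y in
               [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]]
    min_x = math.floor(min(c[0] for c in corners))
    max_x = math.ceil(max(c[0] for c in corners))
    min_y = math.floor(min(c[1] for c in corners))
    max_y = math.ceil(max(c[1] for c in corners))
    H_inv = invert_3x3(H)
    out = [[fill] * (max_x - min_x + 1) for _ in range(max_y - min_y + 1)]
    for row in range(len(out)):
        for col in range(len(out[0])):
            # Pull each output pixel back into the source image and sample there
            sx, sy = project(H_inv, col + min_x, row + min_y)
            if 0 <= sx <= width - 1 and 0 <= sy <= height - 1:
                out[row][col] = bilinear_sample(image, sx, sy)
    return out, (min_x, min_y)
```

Looping over output pixels and sampling backwards, rather than pushing source pixels forward, means every output pixel gets exactly one value, so the result has no holes.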
For these image pairs, the image points are the four corners of the paper (or iPad), and the rectified points
are the four corners of the image itself. I tried mimicking how popular apps like CamScanner and Scanner Pro work.
3. Mosaic
To construct the image mosaics I used a technique similar to the one described in Section 2 (Rectification). But instead
of finding the correspondences between some image and a rectified set of points, I find the correspondences between
some image and another image. For these examples, I keep the right image fixed, and warp the left image towards the
right image by applying an inverse perspective warp parameterized by the estimated homography matrix. After warping,
I used a nonlinear blend (the element-wise max function) to "stitch" the images together, and the resulting mosaics
are satisfactory. Since I took these pictures with my shaky hand, I was unable to ensure that there was no translation
between captures. Therefore, you may notice ghosting artifacts in some images if you look really closely.
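The element-wise max blend can be illustrated with a small sketch. It assumes grayscale nested-list images with non-negative intensities, integer (x, y) offsets giving each image's top-left corner in a shared coordinate frame, and a hypothetical function name max_blend. None of these details come from the project code.

```python
def max_blend(image_a, offset_a, image_b, offset_b, fill=0.0):
    """Place two images on one canvas and keep the larger value where they overlap."""
    min_x = min(offset_a[0], offset_b[0])
    min_y = min(offset_a[1], offset_b[1])
    max_x = max(offset_a[0] + len(image_a[0]), offset_b[0] + len(image_b[0]))
    max_y = max(offset_a[1] + len(image_a), offset_b[1] + len(image_b))
    # Canvas starts at the fill value, which must not exceed any real intensity
    canvas = [[fill] * (max_x - min_x) for _ in range(max_y - min_y)]
    for image, (off_x, off_y) in ((image_a, offset_a), (image_b, offset_b)):
        for r, row in enumerate(image):
            for c, value in enumerate(row):
                cy, cx = r + off_y - min_y, c + off_x - min_x
                canvas[cy][cx] = max(canvas[cy][cx], value)
    return canvas
```

Taking the max needs no alpha masks because empty regions of a warped image are zero and always lose to real pixels, but any misalignment in the overlap shows up as ghosting rather than being smoothed away.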
4. Summary
I learned how to use homographies to warp images into the same "projective basis" as another image.
I did not enjoy hand-labeling features.