Project 4 - Image Warping and Mosaicing by Matthew Lacayo

Overview

In this project, we explore how to take multiple images that were taken from the same focal center and combine them into a single panorama. The process involves a few steps, which we break down below. First, we must assign keypoint correspondences between the two images that we wish to mosaic, because these keypoints are needed to determine the homography that warps one perspective to the other.

Homography

Second, we must actually compute the homography. The matrix that captures the transformation is 3x3, but a homography is only defined up to scale, so we can fix the bottom-right entry to 1. That leaves 8 entries of the homography matrix to determine, which requires at least 8 equations. Luckily, each keypoint correspondence yields two equations, so we need a minimum of 4 correspondences. More points are better, however: with only 4 points, the recovered homography is very sensitive to small pixel errors when assigning keypoints. With more than 4 points the system is overdetermined, so we turn to least squares to recover the best-fitting homography.

Here is the math for how we recover the homography. We first note that a homography matrix looks like this:

$$ H = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix} $$

Moreover, we have that:

$$ H \begin{bmatrix} x\\ y\\ 1 \end{bmatrix} = \begin{bmatrix} wx'\\ wy'\\ w \end{bmatrix} $$

where (x, y) is a keypoint in image 1, and (x', y') is the corresponding keypoint in image 2. The third row gives the scale factor:

$$ w = gx + hy + 1 $$

Substituting this into the first two rows and rearranging gives two linear equations per correspondence:

$$ ax + by + c - g x x' - h y x' = x', \qquad dx + ey + f - g x y' - h y y' = y' $$

Stacking these equations for every correspondence i, we collect the unknowns into a vector and build a matrix with 2 rows for each data point:

$$ \mathbf{p} = \begin{bmatrix} a\\ b\\ c\\ d\\ e\\ f\\ g\\ h \end{bmatrix}, \quad A = \begin{bmatrix} x_i & y_i & 1 & 0 & 0 & 0 & -x_i x'_i & -y_i x'_i \\ 0 & 0 & 0 & x_i & y_i & 1 & -x_i y'_i & -y_i y'_i \\ & & & \vdots & & & & \end{bmatrix} $$

Then we set:

$$ \mathbf{b} = \begin{bmatrix} x'_i\\ y'_i\\ \vdots \end{bmatrix} $$

This is in the form of a least squares problem, so we can use least squares to solve the following equation:

$$ A\mathbf{p} = \mathbf{b} $$
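As a concrete illustration, the least-squares setup described above can be sketched in NumPy roughly as follows. This is a minimal sketch, not the exact code used for the project: the function name `compute_homography` and the (N, 2) array layout for the points are assumptions made for the example.

```python
import numpy as np

def compute_homography(src_pts, dst_pts):
    """Estimate the 3x3 homography (bottom-right entry fixed to 1) mapping src_pts to dst_pts.

    src_pts, dst_pts: arrays of shape (N, 2) holding (x, y) pairs, with N >= 4.
    (Hypothetical helper; the name and argument layout are assumptions.)
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2 or len(src) < 4:
        raise ValueError("need at least 4 matching (x, y) pairs")

    n = len(src)
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
        # Two rows per correspondence: one for x', one for y'.
        A[2 * i] = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        b[2 * i] = xp
        b[2 * i + 1] = yp

    # Least-squares solution for the 8 unknown entries a..h.
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Append the fixed bottom-right 1 and reshape into the 3x3 matrix.
    return np.append(params, 1.0).reshape(3, 3)
```

With exactly 4 correspondences this solves the 8x8 system directly; with more, `np.linalg.lstsq` returns the solution that minimizes the squared residual of the stacked equations.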

Image Rectification

Now that we can successfully recover a homography, we can apply this technique to a basic use case: image rectification. Here, we assign keypoints to some portion of an image whose true shape we know, and then recover a homography that warps the image to that shape. Here are two examples where we recover a bird's-eye view: (Note: the keypoint pictures are blurry due to cropping)
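To make the rectification setup concrete, here is a small sketch under stated assumptions: the clicked corner coordinates, the 200-pixel target square, and the helper names `fit_homography` and `apply_homography` are all hypothetical and chosen only for illustration. The idea is that the four corners of an object known to be square are mapped to an axis-aligned square.

```python
import numpy as np

def fit_homography(src, dst):
    # Same least-squares setup as in the Homography section (bottom-right entry fixed to 1).
    rows, rhs = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        rows.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        rhs.extend([xp, yp])
    p, *_ = np.linalg.lstsq(np.array(rows, float), np.array(rhs, float), rcond=None)
    return np.append(p, 1.0).reshape(3, 3)

def apply_homography(H, pts):
    # Lift to homogeneous coordinates, transform, then divide by w.
    pts = np.asarray(pts, float)
    homog = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]

# Hypothetical clicked corners of a square floor tile seen at an angle,
# listed clockwise starting from the top-left corner.
tile_corners = np.array([[210, 320], [410, 300], [470, 420], [180, 450]], float)

# The shape we know the tile really has: an axis-aligned square (side length is an assumption).
side = 200
target = np.array([[0, 0], [side, 0], [side, side], [0, side]], float)

H = fit_homography(tile_corners, target)
rectified = apply_homography(H, tile_corners)  # the clicked corners land on the target square
```

Warping every pixel of the image with the same `H` then produces the rectified view.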

Mosaics

Finally, we were tasked with using the homography technique to mosaic multiple images together. The process is as follows: once we have computed our homography matrix, we want to warp the "left" image so that it matches the perspective of the "right" image. We first predict the bounding box that will contain the warped left image by passing each corner of the original image through the homography to see where it ends up. For example, consider the following diagram:
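The bounding-box prediction described above can be sketched as follows. This is an illustrative sketch only: the function name `warped_bounding_box` is hypothetical, pixel coordinates are assumed to run from 0 to width - 1 and 0 to height - 1, and the homography is assumed to keep w positive at all four corners.

```python
import numpy as np

def warped_bounding_box(H, width, height):
    """Return (x_min, y_min, x_max, y_max) of an image's corners after warping by H.

    Assumes w stays positive for every corner, so the warped image is a finite quadrilateral.
    """
    corners = np.array(
        [[0, 0], [width - 1, 0], [width - 1, height - 1], [0, height - 1]], dtype=float
    )
    # Pass each corner through the homography, then dehomogenize.
    homog = np.column_stack([corners, np.ones(4)]) @ H.T
    warped = homog[:, :2] / homog[:, 2:3]
    # Round outward so the box fully contains the warped corners.
    x_min, y_min = np.floor(warped.min(axis=0)).astype(int)
    x_max, y_max = np.ceil(warped.max(axis=0)).astype(int)
    return int(x_min), int(y_min), int(x_max), int(y_max)
```

Negative minimum values indicate that part of the warped left image falls outside the right image's pixel grid.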

When we apply our homography to our left image (and then dehomogenize the coordinates), we end up with an image that lives in the coordinate plane of the right image. There are 3 possibilities for a given coordinate: it is negative (and thus lies outside the area of the right image), it is contained within the dimensions of the right image, or it is greater than the dimensions of the right image (and again lies outside the area of the right image). This distinction is helpful when figuring out how to adjust and pad each image so that they fully reside in the same coordinate system. We achieve this by applying the appropriate translation to both the left and right images, such that the origin corresponds to the minimum x coordinate and minimum y coordinate across the right image and the warped left image.

Once we have the images in the same coordinate system with the correct origin, we can combine them by placing both inside the bounding box that contains them. They will have a region of intersection as well as disjoint regions. We can simply copy the disjoint regions over to our output, but we then need a way of smoothly blending the overlapping region. To do this, I made gradient masks going from left to right and from right to left. Here is an example of two images that I stitched together, along with their intersections and the gradient mask for the intersection:
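The translation into a shared coordinate system described above could look roughly like this. It is a sketch under assumptions: the function name `canvas_layout` is hypothetical, the right image is assumed to occupy pixels (0, 0) through (right_width - 1, right_height - 1), and the warped left image's bounding box is assumed to be given in the right image's coordinates.

```python
import numpy as np

def canvas_layout(warped_box, right_width, right_height):
    """Compute the shared canvas for the warped left image and the right image.

    warped_box: (x_min, y_min, x_max, y_max) of the warped left image,
                expressed in the right image's coordinate system.
    Returns (T, canvas_width, canvas_height), where T is the 3x3 translation
    that moves the shared origin to (0, 0).
    """
    x_min, y_min, x_max, y_max = warped_box
    # New origin: smallest x and y over both images (the right image starts at (0, 0)).
    origin_x = min(x_min, 0)
    origin_y = min(y_min, 0)
    # Far corner: largest x and y over both images.
    far_x = max(x_max, right_width - 1)
    far_y = max(y_max, right_height - 1)

    T = np.array([[1, 0, -origin_x], [0, 1, -origin_y], [0, 0, 1]], dtype=float)
    canvas_width = far_x - origin_x + 1
    canvas_height = far_y - origin_y + 1
    return T, canvas_width, canvas_height
```

In this sketch the left image would be warped with `T @ H` instead of `H`, and the right image would be pasted into the canvas at offset (`-origin_x`, `-origin_y`), which is the translation part of `T`.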

I then apply these gradient masks to the portion of the left image that intersects and the portion of the right image that intersects, and combine them to get the final intersection region that blends nicely into both sides. Putting it all together yields results like this:
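A simple way to realize this kind of gradient blending is a linear cross-fade across the overlap. The sketch below is an assumption about the general idea rather than the exact masks used here: the function name `blend_overlap` is hypothetical, and the overlap is assumed to be a rectangular crop of equal size from both aligned images.

```python
import numpy as np

def blend_overlap(left_region, right_region):
    """Cross-fade two equally sized overlap crops, shaped (H, W) or (H, W, C).

    The left image's weight falls linearly from 1 at the left edge to 0 at the
    right edge; the right image's weight is the complement.
    """
    left = np.asarray(left_region, dtype=float)
    right = np.asarray(right_region, dtype=float)
    if left.shape != right.shape:
        raise ValueError("overlap crops must have the same shape")

    width = left.shape[1]
    ramp = np.linspace(1.0, 0.0, width)
    # Reshape so the ramp broadcasts over rows (and over channels, if present).
    left_mask = ramp.reshape(1, width, *([1] * (left.ndim - 2)))
    right_mask = 1.0 - left_mask
    return left_mask * left + right_mask * right
```

Because the two masks sum to 1 at every pixel, overall brightness is preserved, and each side of the overlap matches the image it borders.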

Multi Mosaics

We can repeat this process for multiple images to create large mosaics. Here are some of my results from stitching 3-4 images together (note that there is minor blur due to poor keypoint selection; I aim to fix this for the next part by taking pictures with more significant overlap):
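One possible way to extend the pairwise process to several images, shown here only as a sketch and not necessarily how the mosaics above were built, is to compose pairwise homographies so that every image maps into one reference frame. The matrices below are made-up values, and `apply_homography` is a hypothetical helper.

```python
import numpy as np

def apply_homography(H, pts):
    # Lift to homogeneous coordinates, transform, then divide by w.
    pts = np.asarray(pts, float)
    homog = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]

# Hypothetical pairwise homographies: image 1 -> image 2, and image 2 -> image 3.
H_1_to_2 = np.array([[1.0, 0.02, 150.0], [0.01, 1.0, -5.0], [1e-5, 0.0, 1.0]])
H_2_to_3 = np.array([[0.98, 0.0, 140.0], [0.0, 1.01, 8.0], [2e-5, 1e-5, 1.0]])

# Composing them maps image 1 directly into image 3's coordinate system.
H_1_to_3 = H_2_to_3 @ H_1_to_2
H_1_to_3 /= H_1_to_3[2, 2]  # rescale so the bottom-right entry is 1 again

pts = np.array([[0.0, 0.0], [320.0, 240.0], [639.0, 479.0]])
step_by_step = apply_homography(H_2_to_3, apply_homography(H_1_to_2, pts))
direct = apply_homography(H_1_to_3, pts)  # agrees with step_by_step up to rounding
```

Warping each image once with its composed homography avoids resampling an already warped mosaic, which may help limit accumulated blur.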