Project 4: Image Warping and Mosaic

Aaron Sun 3033976755 Fall 2021

Shoot the Pictures

First we took photos that we would later rectify: photos of planar surfaces shot at an angle, so that we could eventually warp them until the plane appeared fronto-parallel.

Then we took images which we intended to combine into mosaics:

Recover Homographies

The problem is to recover a 3x3 matrix H = [[a, b, c], [d, e, f], [g, h, 1]] such that H z1 = z2 for each pair of corresponding points, where the equality holds in homogeneous coordinates up to a scale factor w2. Writing out the equations for a given correspondence z1 = (x1, y1) and z2 = (x2, y2), we get:

x2 * w2 = a*x1 + b*y1 + c

y2 * w2 = d*x1 + e*y1 + f

w2 = g*x1 + h*y1 + 1

Substituting the third equation for w2 into the first two and rearranging gives:

-x2 = -a * x1 - b * y1 - c + g * x1 * x2 + h * y1 * x2

-y2 = -d * x1 - e * y1 - f + g * x1 * y2 + h * y1 * y2

Finally we can make this a least squares problem Ax = b for x = [a, b, c, d, e, f, g, h]^T. With eight unknowns and two equations per correspondence, we need at least four correspondences. Each correspondence contributes these rows to A:

[[-x1, -y1, -1, 0, 0, 0, x1 * x2, y1 * x2], [0, 0, 0, -x1, -y1, -1, x1 * y2, y1 * y2]]

and these match rows in b:

[[-x2], [-y2]].
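The system above can be sketched in NumPy as follows. This is a minimal illustration rather than the project's actual code: the function name compute_homography and the (N, 2) point-array layout are assumptions made for the example.

```python
import numpy as np

def compute_homography(pts1, pts2):
    """Least-squares estimate of H (with H[2, 2] fixed to 1) mapping pts1 to pts2.

    pts1, pts2: arrays of shape (N, 2) holding (x, y) pairs, N >= 4.
    The name and argument layout are hypothetical, chosen for this sketch.
    """
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    n = len(pts1)
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x1, y1), (x2, y2)) in enumerate(zip(pts1, pts2)):
        # Two rows per correspondence, matching the rearranged equations.
        A[2 * i] = [-x1, -y1, -1, 0, 0, 0, x1 * x2, y1 * x2]
        A[2 * i + 1] = [0, 0, 0, -x1, -y1, -1, x1 * y2, y1 * y2]
        b[2 * i] = -x2
        b[2 * i + 1] = -y2
    # Solve for [a, b, c, d, e, f, g, h] in the least-squares sense.
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Append the fixed bottom-right entry and reshape into a 3x3 matrix.
    return np.append(params, 1.0).reshape(3, 3)
```

With exactly four correspondences the system is square and the fit is exact; with more, least squares averages out small errors in the clicked points.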

Warp the Images

Then we warped the images using the homographies we recovered. To do this, we applied the homography to the 4 corners of the input image, which gives the bounds of the output image. Then we used the inverse of the homography to find, for each pixel in the output, the location in the original image it maps to. Finally, we used bilinear interpolation to compute the value of each output pixel.
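The warping steps can be sketched roughly as below. This is an illustrative NumPy version, not the project's code: the names bilinear_sample and warp_image, the choice to fill pixels outside the source with zeros, and the returned (x_min, y_min) offset are all assumptions of the sketch. It also assumes the homography does not send any corner to infinity.

```python
import numpy as np

def bilinear_sample(img, xs, ys):
    """Sample img (H x W or H x W x C) at float coordinates; outside -> 0."""
    h, w = img.shape[:2]
    eps = 1e-9  # tolerance so round-off at the border is not treated as outside
    valid = (xs >= -eps) & (xs <= w - 1 + eps) & (ys >= -eps) & (ys <= h - 1 + eps)
    xs = np.clip(xs, 0, w - 1)
    ys = np.clip(ys, 0, h - 1)
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx = xs - x0
    fy = ys - y0
    if img.ndim == 3:  # broadcast the weights over color channels
        fx = fx[..., None]
        fy = fy[..., None]
    top = img[y0, x0] * (1 - fx) + img[y0, x1] * fx
    bottom = img[y1, x0] * (1 - fx) + img[y1, x1] * fx
    out = top * (1 - fy) + bottom * fy
    out[~valid] = 0
    return out

def warp_image(img, H):
    """Warp img by H using inverse mapping; returns (warped, (x_min, y_min))."""
    h, w = img.shape[:2]
    # Forward-map the four corners to find the output bounding box.
    corners = np.array([[0, 0, 1], [w - 1, 0, 1],
                        [w - 1, h - 1, 1], [0, h - 1, 1]], dtype=float).T
    warped = H @ corners
    warped = warped[:2] / warped[2]
    x_min, y_min = np.floor(warped.min(axis=1)).astype(int)
    x_max, y_max = np.ceil(warped.max(axis=1)).astype(int)
    # Inverse-map every output pixel back into the source image.
    xs, ys = np.meshgrid(np.arange(x_min, x_max + 1), np.arange(y_min, y_max + 1))
    out_pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = np.linalg.inv(H) @ out_pts
    src_x = (src[0] / src[2]).reshape(xs.shape)
    src_y = (src[1] / src[2]).reshape(xs.shape)
    result = bilinear_sample(img.astype(float), src_x, src_y)
    return result, (int(x_min), int(y_min))
```

Mapping output pixels backward rather than input pixels forward guarantees that every output pixel gets a value, so the result has no holes.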

Let's look at our results for rectifying images. First, let's look at the sheet of paper on the table.

We mapped the corners of the piece of paper to the points [0, 0], [11, 0], [11, 8.5], [0, 8.5] (scaled by a fixed constant). This uses the fact that we know the actual dimensions of the paper. The output paper comes out very nicely. Let's look at it a bit more closely:

Wow it looks great! We can even read the notes as if we were looking straight at the page.

Let's try the same thing with the keyboard. I estimated the keyboard to have a width to height ratio of 10:3.

And now looking more closely at the keyboard itself:

Blending the Images into a Mosaic

Now we can pick points as correspondences between our images and combine them into mosaics.

We first find the homography H that maps image 1 to image 2. Once image 1 is in "image 2 space", we combine the two images using the blending procedure from project 2.

Choosing a mask automatically was a bit trickier. First, we initialize a mask of all zeros. Then we set the mask to 1 at every pixel where image 1 is nonzero but image 2 is zero.

We want the edge of the mask to fall inside the overlap between the images for good blending, so we need to extend the mask. To do this, we repeatedly dilate the mask using a 3x3 convolution, 20 times in total.
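A rough sketch of the mask construction is given below. The function names are made up for illustration, and the dilation is written as a maximum over the 3x3 neighborhood, which for a binary mask gives the same result as convolving with a 3x3 kernel of ones and thresholding. Both images are assumed to already be placed on the same canvas, with zeros where they have no data.

```python
import numpy as np

def initial_mask(img1, img2):
    """1 where img1 has data and img2 does not, 0 elsewhere."""
    # Collapse color channels so a pixel counts as nonzero if any channel is.
    nz1 = img1.reshape(img1.shape[0], img1.shape[1], -1).any(axis=2)
    nz2 = img2.reshape(img2.shape[0], img2.shape[1], -1).any(axis=2)
    return (nz1 & ~nz2).astype(float)

def dilate3x3(mask):
    """One step of binary dilation with a 3x3 neighborhood."""
    h, w = mask.shape
    padded = np.pad(mask, 1)  # zero padding around the border
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out = np.maximum(out, padded[dy:dy + h, dx:dx + w])
    return out

def grow_mask(mask, iterations=20):
    """Repeat the dilation; 20 iterations is the count used in the text."""
    for _ in range(iterations):
        mask = dilate3x3(mask)
    return mask
```

Each dilation pushes the mask boundary outward by one pixel, so 20 iterations move it about 20 pixels into the overlap region.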

Then we blend the images as in project 2. To save time, we used sigma = 3 and a pyramid of height 2. This gives us the final mosaics.
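The project 2 procedure is not reproduced here. Assuming it is a multiresolution (Laplacian-stack style) blend, a two-level version with sigma = 3 might look like the sketch below. The function names, the edge padding, and the kernel radius of 3 * sigma are all assumptions of this example, not details from the project.

```python
import numpy as np

def gaussian_blur(img, sigma=3.0):
    """Separable Gaussian blur over the first two axes, with edge padding."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()  # normalize so flat regions stay unchanged
    out = np.asarray(img, dtype=float)
    for axis in (0, 1):
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded)
    return out

def two_level_blend(img1, img2, mask, sigma=3.0):
    """Blend two aligned images with a two-level frequency split.

    mask is 1 where img1 should dominate and 0 where img2 should.
    """
    img1 = np.asarray(img1, dtype=float)
    img2 = np.asarray(img2, dtype=float)
    mask = np.asarray(mask, dtype=float)
    soft = gaussian_blur(mask, sigma)  # smoother mask for the low band
    if img1.ndim == 3:  # broadcast masks over color channels
        mask = mask[..., None]
        soft = soft[..., None]
    low1, low2 = gaussian_blur(img1, sigma), gaussian_blur(img2, sigma)
    high1, high2 = img1 - low1, img2 - low2
    # High frequencies use the sharp mask, low frequencies the blurred one.
    high = high1 * mask + high2 * (1 - mask)
    low = low1 * soft + low2 * (1 - soft)
    return high + low
```

Using only two levels keeps the cost to a handful of blurs per image, which fits the stated goal of saving time, at the price of a narrower transition band than a taller pyramid would give.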

Doe library, in a wide view:

The full toprope wall at Pacific Pipe in a single image:

Staircase in the Physics building. Here, my phone camera changed settings between the two shots, causing a drastic lighting difference between them. Otherwise, the image came out well.

What I learned

From this part, I learned that images need to be very well aligned in order to find homographies that work. I originally spent a very long time trying to generate a mosaic from another pair of images, but I was unable to do so (or to figure out why it didn't work). It was probably because the images weren't taken from the same point, since it worked immediately when I simply took a new image. The lesson is to always try several different inputs before drawing any conclusions.