Project 6: Allan Zhou

Part A

Photos

I used my phone to take photos of each scene from approximately the same position, making sure that adjacent photos overlap.

The engineering library:

library

The view outside my apartment complex:

library

Recovering Homographies

For each set of images, I marked 8 corresponding points using Matplotlib's ginput, trying to click corners in the overlapping regions.

When warping a set of images, I artificially defined the destination plane's points as the average of the corresponding points marked in each image. Each image (the source) can then be warped to the destination using that image's marked points. For notation, say a source point is $(x,y)$ and the destination point is $(x', y')$. We want to find a homography from the source plane to the destination plane:

$\begin{bmatrix}x' \\ y' \\ 1\end{bmatrix} \cong \begin{bmatrix}wx' \\ wy' \\ w\end{bmatrix} = H\begin{bmatrix}x \\ y \\ 1\end{bmatrix}, \text{where } H=\begin{bmatrix}a & b & c\\ d & e & f\\ g & h & i\end{bmatrix}$

$H$ is a $3\times 3$ matrix, but it has only 8 free parameters (8 DOF) because scaling $H$ by any nonzero constant does not change the mapping. For example, assuming $i \neq 0$, we can divide by $i$ to set the bottom-right entry to $1$ and show that the effect remains the same:

$\left(\frac{1}{i}H\right)\begin{bmatrix}x \\ y \\ 1\end{bmatrix} = \left(\frac{w}{i}\right)\begin{bmatrix}x' \\ y' \\ 1\end{bmatrix} \cong \begin{bmatrix}x' \\ y' \\ 1\end{bmatrix}$

So let's assume $i=1$ for simplicity, so that we only have to worry about the remaining $8$ parameters, $a$ through $h$. By writing out the equation for each row of the matrix product and doing some algebra, we can rewrite this in the form:

$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix}x & y & 1 & 0 & 0 & 0 & -xx' & -yx'\\ 0 & 0 & 0 & x & y & 1 & -xy' & -yy'\end{bmatrix}\begin{bmatrix}a \\ b \\ c \\ d \\ e \\ f \\ g \\ h \end{bmatrix}$

This is convenient because we can "stack" these equations for each pair of points to form a linear system (over-constrained if you have more than 4 pairs of points) and then solve for the $8$ parameters of $H$ using least squares.

Using this approach, I recovered homographies for mapping each image to my defined destination plane.
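As a rough illustration of the least-squares setup described above, here is a minimal NumPy sketch. Each pair of rows comes from substituting $w = gx + hy + 1$ into $wx' = ax + by + c$ and $wy' = dx + ey + f$ and moving the $g$ and $h$ terms to the right-hand side. The function name `compute_homography` and its argument layout are assumptions made for this example, not necessarily what my project code looks like.

```python
import numpy as np

def compute_homography(src, dst):
    """Least-squares homography mapping src points to dst points.

    src, dst: (N, 2) arrays of (x, y) and (x', y') with N >= 4.
    Returns a 3x3 matrix with the bottom-right entry fixed to 1.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[0]
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for k, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
        # Two stacked equations per correspondence, one for x' and one for y'.
        A[2 * k] = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        A[2 * k + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        b[2 * k] = xp
        b[2 * k + 1] = yp
    # Solve for (a, ..., h) in the least-squares sense.
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Append i = 1 and reshape into H.
    return np.append(params, 1.0).reshape(3, 3)
```

With exactly 4 pairs the system is square and the solution is exact (if the points are in general position); with 8 clicked pairs the extra equations average out some of the clicking error.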

Warping the images

Using the recovered homographies, I warped each image to the destination plane with inverse warping. I use $H$ to find the corners of the image in the destination plane, and then find all the pixels inside the polygon defined by those corners. Since these are the destination pixels, they are the $(x', y')$ from the formula above. I multiply by $H^{-1}$ (and divide by the resulting third coordinate) to find $(x,y)$, and then use bilinear interpolation to sample the value from the source.
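The inverse-warping step can be sketched roughly as follows. This is a simplified version under stated assumptions: instead of rasterizing the polygon of warped corners, it maps every pixel of a given output canvas back through $H^{-1}$ and keeps those that land inside the source image, which yields the same set of pixels on that canvas. The names `bilinear_sample` and `inverse_warp` are made up for this example, and the source image is assumed to be at least 2 pixels in each dimension.

```python
import numpy as np

def bilinear_sample(img, xs, ys):
    """Sample img at float coordinates (xs, ys), all assumed to be in bounds."""
    h, w = img.shape[:2]
    # Top-left neighbor, clipped so that the +1 neighbor stays inside the image.
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 2)
    fx = xs - x0
    fy = ys - y0
    if img.ndim == 3:
        # Let the weights broadcast over the color channels.
        fx = fx[:, None]
        fy = fy[:, None]
    top = img[y0, x0] * (1 - fx) + img[y0, x0 + 1] * fx
    bottom = img[y0 + 1, x0] * (1 - fx) + img[y0 + 1, x0 + 1] * fx
    return top * (1 - fy) + bottom * fy

def inverse_warp(img, H, out_shape):
    """Warp img by H onto a canvas of out_shape = (rows, cols).

    Returns the warped image and a boolean mask of the pixels that were filled.
    """
    out_h, out_w = out_shape
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    # Homogeneous destination coordinates (x', y', 1), one column per pixel.
    dest = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = np.linalg.inv(H) @ dest
    sx = src[0] / src[2]
    sy = src[1] / src[2]
    h, w = img.shape[:2]
    valid = (sx >= 0) & (sx <= w - 1) & (sy >= 0) & (sy <= h - 1)
    out = np.zeros((out_h * out_w,) + img.shape[2:], dtype=float)
    out[valid] = bilinear_sample(img.astype(float), sx[valid], sy[valid])
    return out.reshape((out_h, out_w) + img.shape[2:]), valid.reshape(out_h, out_w)
```

Looping over destination pixels rather than source pixels is what avoids holes in the output: every destination pixel gets exactly one sampled value.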

Image rectification

To rectify an image, I take a single image of a planar surface and apply the same warping procedure to map it to a fronto-parallel plane.

As a test, I use this picture of an intersection from a traffic camera (this is actually the background extracted from traffic cam video):

traffic-cam

I want to create a rectified (top-down) view of the center of the intersection, which is a planar surface. I selected four points at the corners of the intersection using ginput. Since I know the intersection is approximately square, I defined the corresponding destination points to be the corners of the square output image. Applying the warp function to this image gives the top-down view:

intersect-rectified

Another rectification example is an advertisement for a nice rug. The ad helpfully states the true dimensions of the rug, making it easier to choose the destination points.

rug

The rectified top-down view after warping (I rotated the image to conserve space on the page):

rug-rectified

Making mosaics

To make a mosaic, I first create an empty output canvas large enough to contain all the corners of the warped images. Then I warp each image (also computing its mask) and place all the warped images into the mosaic.
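One way to size the canvas is sketched below, assuming each image comes with a homography into a shared destination plane. The helper names `warped_corners` and `canvas_for` are hypothetical. The sketch also returns a translation that shifts the destination plane so that the smallest warped coordinate lands at pixel $(0, 0)$, since warped corners can have negative coordinates.

```python
import numpy as np

def warped_corners(shape, H):
    """Map the four corners of an image of the given shape through H."""
    h, w = shape[:2]
    corners = np.array([[0, 0, 1],
                        [w - 1, 0, 1],
                        [w - 1, h - 1, 1],
                        [0, h - 1, 1]], dtype=float).T
    p = H @ corners
    # Divide by the homogeneous coordinate to get (x', y') per corner.
    return (p[:2] / p[2]).T

def canvas_for(shapes, Hs):
    """Return ((rows, cols), T) for a canvas containing every warped image.

    T is a 3x3 translation; using T @ H for each image places it on the canvas.
    """
    pts = np.vstack([warped_corners(s, H) for s, H in zip(shapes, Hs)])
    x_min, y_min = np.floor(pts.min(axis=0))
    x_max, y_max = np.ceil(pts.max(axis=0))
    T = np.array([[1, 0, -x_min],
                  [0, 1, -y_min],
                  [0, 0, 1]], dtype=float)
    size = (int(y_max - y_min) + 1, int(x_max - x_min) + 1)
    return size, T
```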

The mosaic will have overlapping regions from different images. One way to resolve this is to just choose one image's pixels. For example, here is what happens if you use the naive method of choosing the right image's pixels for any overlap:

library-naive

If you look closely, there is a seam about one third of the way through the image, going left to right.

An improved method is to linearly blend the images together by linearly interpolating the overlapping region from left to right:

library-mosaic

street-mosaic

Linear blending works pretty well compared to the naive method, but you can still notice a little bit of the boundary. Additionally, with either method the images do not line up exactly; they are a couple of pixels off. This is probably because the method of selecting corresponding points (me clicking on things with ginput) is not very accurate.
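The left-to-right linear blend described above can be sketched as follows. This is a minimal version under stated assumptions, not my exact code: both images have already been warped onto the same canvas, the masks are boolean arrays marking which canvas pixels each image covers, and the weight ramp is computed per column over the full horizontal extent of the overlap. The name `linear_blend` is made up for the example.

```python
import numpy as np

def linear_blend(left, right, mask_left, mask_right):
    """Blend two warped images that share a canvas.

    In the overlap, the right image's weight ramps from 0 to 1 going left to
    right; elsewhere each pixel comes from whichever image covers it.
    """
    overlap = mask_left & mask_right
    # Columns that contain at least one overlapping pixel.
    cols = np.flatnonzero(overlap.any(axis=0))
    alpha = np.zeros(mask_left.shape[1])
    if cols.size:
        x0, x1 = cols[0], cols[-1]
        alpha[x0:x1 + 1] = np.linspace(0.0, 1.0, x1 - x0 + 1)
    alpha = np.broadcast_to(alpha, mask_left.shape)
    # Outside the overlap, the weight is just the image's own mask (0 or 1).
    w_right = np.where(overlap, alpha, mask_right.astype(float))
    w_left = np.where(overlap, 1.0 - alpha, mask_left.astype(float))
    if left.ndim == 3:
        # Let the weights broadcast over the color channels.
        w_left = w_left[..., None]
        w_right = w_right[..., None]
    return w_left * left + w_right * right
```

Because the ramp depends only on the column, it ignores how the overlap's width varies from row to row, so the transition at the overlap's slanted edges can still be visible.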

What I learned

Getting the linear blending to work properly without making wedge artifacts above and below the overlap is very tricky.
Creating a rectified view of a single image is surprisingly simple with homographies, and pretty cool.