Project 4. Image Warping and Mosaicing

Michael Wan, SID: 3034012128


Part A

Section 1. Shoot Pictures

To get the data for this project, I took pairs of pictures on my iPhone throughout the day of my Berkeley neighborhood, my apartment building, and my living room.

        
        
        
Image pairs.


Section 2. Define Homographies

We start from the following equations, which we can rearrange into a linear system and solve for the homography matrix $H$ using least squares: $$\begin{bmatrix}x_a \\ y_a \\ z_a \end{bmatrix} = H \begin{bmatrix}x_1 \\ y_1 \\ 1 \end{bmatrix}, \quad \begin{bmatrix}\hat{x_1} \\ \hat{y_1} \\ 1 \end{bmatrix} = \frac{1}{z_a} \begin{bmatrix}x_a \\ y_a \\ z_a \end{bmatrix}$$

We can rewrite these equations to get the following $$ \begin{bmatrix}\hat{x_1} z_a \\ \hat{y_1} z_a \\ z_a \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \\\end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} $$

We can then substitute $z_a = h_{31} x_1 + h_{32} y_1 + h_{33}$ into the equations $\hat{x_1} z_a = h_{11} x_1 + h_{12} y_1 + h_{13}$ and $\hat{y_1} z_a = h_{21} x_1 + h_{22} y_1 + h_{23}$. Setting $h_{33} = 1$ and moving the unknowns to the left-hand side gives two linear equations per correspondence, for example $h_{11} x_1 + h_{12} y_1 + h_{13} - x_1 \hat{x_1} h_{31} - y_1 \hat{x_1} h_{32} = \hat{x_1}$. Repeating this process for all of the correspondence points we've defined, $(x_i, y_i, \hat{x_i}, \hat{y_i})$, yields the following system of equations: $$ \begin{bmatrix} x_1 & y_1 & 1 & 0 & 0 & 0 & -x_1 \hat{x_1} & -y_1 \hat{x_1} \\ 0 & 0 & 0 & x_1 & y_1 & 1 & -x_1 \hat{y_1} & -y_1 \hat{y_1} \\ x_2 & y_2 & 1 & 0 & 0 & 0 & -x_2 \hat{x_2} & -y_2 \hat{x_2} \\ 0 & 0 & 0 & x_2 & y_2 & 1 & -x_2 \hat{y_2} & -y_2 \hat{y_2} \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x_n & y_n & 1 & 0 & 0 & 0 & -x_n \hat{x_n} & -y_n \hat{x_n} \\ 0 & 0 & 0 & x_n & y_n & 1 & -x_n \hat{y_n} & -y_n \hat{y_n} \\ \end{bmatrix} \begin{bmatrix} h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32} \\ \end{bmatrix} = \begin{bmatrix} \hat{x_1} \\ \hat{y_1} \\ \hat{x_2} \\ \hat{y_2} \\ \vdots \\ \hat{x_n} \\ \hat{y_n} \end{bmatrix} $$
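As a rough illustration of the least-squares setup derived from the substitution above, here is a minimal NumPy sketch. The function name `compute_homography` and its argument layout are assumptions for illustration, not the code used in the project. Each correspondence contributes two rows, one for $\hat{x_i}$ and one for $\hat{y_i}$, and $h_{33}$ is fixed to 1.

```python
import numpy as np

def compute_homography(src, dst):
    """Least-squares homography mapping src (n, 2) points onto dst (n, 2) points.

    Hypothetical helper: h33 is fixed to 1, so only 8 unknowns are solved for.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = len(src)
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (xh, yh)) in enumerate(zip(src, dst)):
        # Row for the x-hat equation, then row for the y-hat equation.
        A[2 * i] = [x, y, 1, 0, 0, 0, -x * xh, -y * xh]
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -x * yh, -y * yh]
        b[2 * i] = xh
        b[2 * i + 1] = yh
    # Solve the (possibly overdetermined) system in the least-squares sense.
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

With exactly four correspondences the system is square; with more, least squares averages out small errors in the clicked points.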

    
Correspondence points used to compute homography matrix.


Section 3. Warping

After defining the homography matrix, we can warp the images into the same projection space, which lets us combine them later. Below are the resized im2 images, the warped im1 images, and the original im1 images.

         
         
         
Resized im2, warped im1, and original im1.
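One common way to implement this kind of warp is inverse warping: for every pixel of the output canvas, apply $H^{-1}$ to find where it came from in the source image. The sketch below assumes nearest-neighbor sampling and a caller-supplied canvas size; the name `warp_image` and these choices are illustrative assumptions, not necessarily what was used for the results above.

```python
import numpy as np

def warp_image(im, H, out_shape):
    """Inverse-warp im with homography H onto a canvas of out_shape (rows, cols).

    Assumes nearest-neighbor sampling and that the projective denominator
    does not vanish anywhere on the canvas.
    """
    rows, cols = out_shape
    # Homogeneous (x, y, 1) coordinates of every output pixel.
    ys, xs = np.mgrid[0:rows, 0:cols]
    dst = np.stack([xs.ravel(), ys.ravel(), np.ones(rows * cols)])
    # Map output pixels back into the source image and dehomogenize.
    src = np.linalg.inv(H) @ dst
    sx = np.rint(src[0] / src[2]).astype(int)
    sy = np.rint(src[1] / src[2]).astype(int)
    # Only copy pixels whose source location lies inside the image.
    valid = (sx >= 0) & (sx < im.shape[1]) & (sy >= 0) & (sy < im.shape[0])
    out = np.zeros((rows * cols,) + im.shape[2:], dtype=im.dtype)
    out[valid] = im[sy[valid], sx[valid]]
    return out.reshape((rows, cols) + im.shape[2:])
```

Looping over output pixels rather than input pixels avoids holes in the warped image, since every output pixel is assigned at most one source value.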


Section 4. Image Rectification

For image rectification, I mark the corners of a square-like object in the image as correspondence points and compute the homography that maps them to the manually defined target points $[[0, 0], [w, 0], [w, w], [0, w]]$. Warping with this homography flattens the square and "rectifies" the image.

    
    
Square-like correspondence points and the rectified image.


Section 5. Mosaic Blending

For this part, I programmatically generated alpha masks to blend the resized im2 images with the warped im1 images. The masks were generated such that zones where only im2 pixels exist were set to $\alpha=1$ and zones where only im1 pixels exist were set to $\alpha=0$. Zones where both images have pixels (contested regions) were set to $\alpha$ values that are negatively correlated with the distance to the center of the correspondence points.

$$\text{mask}[i, j] = \begin{cases} 0 & (i,j) \in (\text{im1} \setminus \text{im2}) \\ 1 & (i,j) \in (\text{im2} \setminus \text{im1}) \\ \gamma \sqrt{(i - c_r)^2 + (j - c_c)^2} & (i,j) \in (\text{im2} \cap \text{im1}) \end{cases}, \quad \text{for some } \gamma \in [0, 1]$$

Here $(c_r, c_c)$ is the row and column of the center of the correspondence points.
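Given such a mask, the compositing step is presumably the usual convex combination, since $\alpha=1$ selects im2 and $\alpha=0$ selects im1. The write-up does not spell this formula out, so the sketch below is an assumption, and `alpha_blend` is a hypothetical name. It expects both images to already be aligned on the same canvas.

```python
import numpy as np

def alpha_blend(im2, im1, mask):
    """Composite two aligned images: mask == 1 keeps im2, mask == 0 keeps im1."""
    im2 = np.asarray(im2, dtype=float)
    im1 = np.asarray(im1, dtype=float)
    # Clip so that mask values outside [0, 1] cannot over- or under-shoot.
    alpha = np.clip(np.asarray(mask, dtype=float), 0.0, 1.0)
    if im2.ndim == 3:
        alpha = alpha[..., None]  # broadcast the mask over color channels
    return alpha * im2 + (1.0 - alpha) * im1
```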



Resized im2, new im1, and mask for apartment image pairs.
Mosaic image for apartment image pairs.

Resized im2, new im1, and mask for living room image pairs.
Mosaic image for living room image pairs.

Resized im2, new im1, and mask for neighborhood image pairs.
Mosaic image for neighborhood image pairs.



Section 6. What I Learned

This project was extremely interesting, but I learned how painstakingly difficult it is to define quality correspondence points. Furthermore, the importance of masking really stood out to me, because even if the correspondence points are good, stitching the images together into a seamless photo is still a nontrivial task. While I only used simple alpha masking with a naive mask-creation algorithm, a better implementation would use multiresolution blending with a more sophisticated mask-creation strategy. Lastly, learning about how homography matrices can project images into the same plane was really cool!




Part B

Section 1. Harris Interest Point Detector

In this section, I used the starter code provided in harris.py. However, I modified it so that it only keeps points whose $f_{HM}$ values exceed a threshold parameter, threshold_abs, since the original code only looks for local maxima with no thresholding.

Detected Harris Corners.
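The starter code itself is not reproduced here. As a stand-in, the following NumPy-only sketch shows the idea of combining local-maximum detection with an absolute threshold on a corner-response map. The function name `thresholded_peaks` and the 3x3 neighborhood are assumptions; only the parameter name threshold_abs comes from the text above.

```python
import numpy as np

def thresholded_peaks(response, threshold_abs):
    """Return (row, col) coordinates of 3x3 local maxima above threshold_abs."""
    response = np.asarray(response, dtype=float)
    rows, cols = response.shape
    # Pad with -inf so border pixels can still be compared with 8 neighbors.
    padded = np.pad(response, 1, mode="constant", constant_values=-np.inf)
    neighbors = [
        padded[1 + dr : 1 + dr + rows, 1 + dc : 1 + dc + cols]
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    ]
    # A pixel is a peak if it is at least as large as all of its neighbors.
    is_peak = response >= np.max(neighbors, axis=0)
    # The extra threshold removes weak peaks in flat or noisy regions.
    return np.argwhere(is_peak & (response > threshold_abs))
```

Without the threshold, every flat region would count as a plateau of "peaks", which is why filtering on the response value cuts the number of candidate corners so sharply.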



Section 2. Adaptive Non-Maximal Suppression

As the paper suggests, I set c_robust = 0.9, and I keep the 250 points with the largest suppression radii. Adaptive non-maximal suppression gives us interest points that are well spread out across the image, which lets us define the homography more reliably.

Interest points after applying Adaptive Non-Maximal Suppression.
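A minimal sketch of the suppression-radius computation, under the assumption that each point's radius is its distance to the nearest point that is sufficiently stronger (strength $f_i < c_{robust} \cdot f_j$). The function name `anms` and the brute-force pairwise-distance approach are illustrative choices, not necessarily how it was implemented here.

```python
import numpy as np

def anms(coords, strengths, n_keep=250, c_robust=0.9):
    """Return indices of the n_keep points with the largest suppression radii."""
    coords = np.asarray(coords, dtype=float)
    strengths = np.asarray(strengths, dtype=float)
    # Pairwise squared distances between all interest points (O(n^2) memory).
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    # Point j suppresses point i when f_i < c_robust * f_j.
    suppressed_by = strengths[:, None] < c_robust * strengths[None, :]
    # Radius of i = distance to its nearest suppressor (inf if there is none).
    radii2 = np.where(suppressed_by, dist2, np.inf).min(axis=1)
    # Sort by decreasing radius and keep the first n_keep indices.
    order = np.argsort(-radii2, kind="stable")
    return order[:n_keep]
```

The globally strongest point has no suppressor, so its radius is infinite and it is always kept first.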



Section 3. Feature Descriptor Extraction

To generate the feature descriptors, I extracted 45x45 image patches and then downsampled them into 9x9 feature patches. Here are some examples:
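The extraction step described above could be sketched roughly as follows. The downsampling method is an assumption: this version averages each 5x5 block of a grayscale 45x45 window, which acts as a crude low-pass filter before subsampling. The name `extract_descriptor` is hypothetical, and no bias or gain normalization is applied because the text does not mention one.

```python
import numpy as np

def extract_descriptor(im, row, col, patch=45, out=9):
    """Return a flattened out x out descriptor from a patch x patch window.

    Assumes a 2D grayscale image and that patch is a multiple of out.
    Returns None if the window would fall outside the image.
    """
    half = patch // 2
    r0, r1 = row - half, row + half + 1
    c0, c1 = col - half, col + half + 1
    if r0 < 0 or c0 < 0 or r1 > im.shape[0] or c1 > im.shape[1]:
        return None  # interest point is too close to the border
    window = im[r0:r1, c0:c1].astype(float)
    step = patch // out  # 5 for the 45 -> 9 case
    # Average each step x step block to downsample (assumed method).
    small = window.reshape(out, step, out, step).mean(axis=(1, 3))
    return small.ravel()
```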



Section 4. Feature Matching

By extracting feature descriptors from each of the ANMS interest points, we can compare how similar two points are using SSD as the metric. For each interest point $i$, we measure the ratio $\frac{E_{NN_{i1}}}{E_{NN_{i2}}}$, where $E_{NN_{ij}}$ is the $j$th smallest SSD error between interest point $i$ and the interest points in the other image. If this ratio is below a predetermined threshold, we accept the nearest neighbor as a match.

Matched points using feature descriptor SSD values. Notice that there are some mismatches, e.g., match #18 in the first image and match #0 in the second image.
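The ratio test can be sketched as below. The function name `match_features` and the default threshold value are assumptions for illustration; the write-up does not state the threshold that was used. The comparison is written as $E_{NN_{i1}} < t \cdot E_{NN_{i2}}$ rather than as a division so that a zero second-nearest error cannot cause a divide-by-zero.

```python
import numpy as np

def match_features(desc1, desc2, ratio_thresh=0.5):
    """Match rows of desc1 to rows of desc2 using the nearest-neighbor ratio test.

    Returns an array of (index_in_desc1, index_in_desc2) pairs.
    Assumes desc2 has at least two descriptors.
    """
    desc1 = np.asarray(desc1, dtype=float)
    desc2 = np.asarray(desc2, dtype=float)
    # ssd[i, j] = sum of squared differences between descriptor i and j.
    ssd = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(axis=-1)
    # Indices of the two nearest neighbors in desc2 for each row of desc1.
    nearest = np.argsort(ssd, axis=1)[:, :2]
    idx = np.arange(len(desc1))
    e1 = ssd[idx, nearest[:, 0]]
    e2 = ssd[idx, nearest[:, 1]]
    # Accept only when the best match is clearly better than the runner-up.
    keep = e1 < ratio_thresh * e2
    return np.column_stack([idx[keep], nearest[keep, 0]])
```

A low ratio means the best match is much closer than the second-best, which is what makes it distinctive; ambiguous points with two similar candidates are dropped.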



Section 5. Robust method (RANSAC) to compute homography

Given our candidate matches, we can use a robust method like RANSAC to separate the true matches from the outliers before computing our homography. In my RANSAC implementation, I randomly select four matches, compute the corresponding homography matrix $H$, and count how many of the other pairs satisfy this homography (the inliers). I repeat this process iters times and keep the largest set of inliers, which corresponds to the homography that agrees with the most matches. To determine whether a pair satisfies the homography, we use $H$ to project the first point and measure the distance between this projection and the second point. If the projection error is smaller than a parameter eps, then the pair is an inlier.

RANSAC matched points. All of the bad matches are filtered out!
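The loop described above can be sketched as follows. The parameter names iters and eps come from the text; the function names, default values, fixed random seed, and the final refit on all inliers are assumptions added for illustration. A compact least-squares homography solver is included so that the block is self-contained.

```python
import numpy as np

def fit_homography(src, dst):
    """Least-squares homography from src to dst points, with h33 fixed to 1."""
    A, b = [], []
    for (x, y), (xh, yh) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xh, -y * xh])
        A.append([0, 0, 0, x, y, 1, -x * yh, -y * yh])
        b.extend([xh, yh])
    h, *_ = np.linalg.lstsq(np.array(A, dtype=float),
                            np.array(b, dtype=float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def project(H, pts):
    """Apply homography H to (n, 2) points and dehomogenize."""
    pts_h = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return pts_h[:, :2] / pts_h[:, 2:3]

def ransac_homography(src, dst, iters=1000, eps=2.0, seed=0):
    """Return (H, inlier_mask) for the largest consensus set found."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        # Fit an exact homography to four randomly chosen matches.
        sample = rng.choice(len(src), size=4, replace=False)
        H = fit_homography(src[sample], dst[sample])
        # A pair is an inlier if its projection error is below eps.
        err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = err < eps
        if inliers.sum() > best.sum():
            best = inliers
    # Assumed final step: refit on every inlier of the best sample.
    return fit_homography(src[best], dst[best]), best
```

Degenerate samples, such as nearly collinear points, simply produce a poor $H$ with few inliers and are never selected as the best model.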



Section 6. Autostitching

With our auto-calculated homography matrices and correspondence points (from RANSAC), we can generate auto-stitched mosaic images by recycling code from Project 4A. The results are below.

Manual mosaics (left) vs. Auto-stitched mosaics (right).



Section 7. What I Learned

It was a really great learning experience to implement auto-stitching. I learned how to digest research papers and implement the described methods / algorithms, and I also learned about the practicality and power of RANSAC.