CS 194-26 Project 4A Report: Image Warping and Mosaicing

Kristy Lee

kristylee@berkeley.edu

This project involves warping images into a mosaic using transformations derived from a homography matrix. I take multiple photographs from the same point of view but with different view directions and overlapping fields of view, and use them to create a mosaic. I also learned about image rectification: warping an image so that certain lines are aligned or straightened.

Shoot the Pictures

I use my iPhone camera with standard settings and exposure/focus (AE/AF) lock to take a few photos. Here are the images I've taken:

Whiteboard Left
Whiteboard Right
Stands Left
Stands Right
Door Left
Door Right
Vendor Left
Vendor Center
Vendor Right

Recover Homographies

In this section, I compute the homography matrix that transforms points from one image to another. The goal of the homography matrix is to map each point p = (x,y) in an original image to p' = (x',y') in the target image. H is defined as \[H = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \] For this assignment, the scale factor i is set to 1, so the homography matrix has 8 degrees of freedom. Let \( \textbf{p'} = \begin{bmatrix} wx' \\ wy' \\ w \end{bmatrix} \), \( \textbf{p} = \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \), and define the mapping from p to p' as \[\textbf{p'} = H\textbf{p}\] Expanding this product and substituting \(w = gx + hy + 1\) gives \(x' = ax + by + c - gxx' - hyx'\) and \(y' = dx + ey + f - gxy' - hyy'\). H has 8 unknown variables, so I need 8 equations, which are provided by at least 4 pairs of corresponding points. Realistically, though, using only 4 correspondences makes the homography sensitive to noise and error in the selected points, so it is more typical to select more points and solve the resulting overdetermined system with least squares: \[\begin{bmatrix} x_{1} & y_{1} & 1 & 0 & 0 & 0 & -x_{1}'x_{1} & -x_{1}'y_{1}\\ 0 & 0 & 0 & x_{1} & y_{1} & 1 & -y_{1}'x_{1} & -y_{1}'y_{1}\\ x_{2} & y_{2} & 1 & 0 & 0 & 0 & -x_{2}'x_{2} & -x_{2}'y_{2}\\ 0 & 0 & 0 & x_{2} & y_{2} & 1 & -y_{2}'x_{2} & -y_{2}'y_{2}\\ \vdots & & & & & & & \vdots \\ x_{n} & y_{n} & 1 & 0 & 0 & 0 & -x_{n}'x_{n} & -x_{n}'y_{n}\\ 0 & 0 & 0 & x_{n} & y_{n} & 1 & -y_{n}'x_{n} & -y_{n}'y_{n}\\ \end{bmatrix} \begin{bmatrix} a\\ b\\ c\\ d\\ e\\ f\\ g\\ h \end{bmatrix} = \begin{bmatrix} x_{1}'\\ y_{1}'\\ x_{2}'\\ y_{2}'\\ \vdots \\ x_{n}'\\ y_{n}' \end{bmatrix} \] In my code, I defined H = computeH(im1_pts, im2_pts) to do this computation, where im1_pts are the source points and im2_pts are the destination points. Calling the coefficient matrix above \(A\) and the right-hand side \(b\), I solve \(Ah = b\) for \[h = \begin{bmatrix} a\\ b\\ c\\ d\\ e\\ f\\ g\\ h \end{bmatrix}\] using np.linalg.lstsq(A, b), append \(i = 1\), and reshape the result into \(H\).
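
A minimal sketch of this computation, building A and b row by row and solving with np.linalg.lstsq (this illustrates the setup above, not necessarily my exact code):

```python
import numpy as np

def computeH(im1_pts, im2_pts):
    """Least-squares homography mapping im1_pts (source) to im2_pts
    (destination). Each input is an (N, 2) array of (x, y) points, N >= 4."""
    im1_pts = np.asarray(im1_pts, dtype=float)
    im2_pts = np.asarray(im2_pts, dtype=float)
    A, b = [], []
    for (x, y), (xp, yp) in zip(im1_pts, im2_pts):
        # Two rows per correspondence, matching the system above.
        A.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y])
        A.append([0, 0, 0, x, y, 1, -yp * x, -yp * y])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.asarray(A), np.asarray(b), rcond=None)
    # Append the fixed scale factor i = 1 and reshape into 3x3 H.
    return np.append(h, 1.0).reshape(3, 3)
```

With exactly 4 correspondences in general position this recovers the exact homography; with more points it returns the least-squares fit.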

Warp the Images [2 Examples of Rectified Images]

Knowing how to compute the homography matrix from the previous part, I can now warp one image using a homography computed between a pair of images. In my code, I defined imwarp = warpImage(im, dest_im, H) to perform the warp (dest_im is the image whose coordinate system the source image is mapped to; its dimensions may be padded relative to the source image im). I compute the bounding box region/dimensions of the warped image, adjust the homography matrix accordingly, and then warp: I compute the points corresponding to the source points using the homography matrix H, then use cv2.remap(src, map1=warped_pts, map2=None, interpolation=cv2.INTER_CUBIC) to map the pixels of the source image to the positions defined by warped_pts. After completing the warp code, I worked on image rectification for images containing planar surfaces: I warp those images so that the plane becomes frontal-parallel (i.e., aligned horizontally and/or vertically). For the first example, I see four corners of a floor tile that surround the alcohol bottle. Since the tile is square, I used the four points at the corners of the tile and computed a homography mapping to an arbitrarily defined square (I used coordinates \([[300, 1500], [2000, 1500], [300, 2800], [2000, 2800]]\), which are approximately similar to the tile's coordinates in the unwarped image, \([[300.59, 1694.77], [1958.77, 1192.95], [147.86, 3211.14], [2896.95, 2349.32]]\)). For the second example, I use the front cover of a book and map its corners to the corners of an arbitrary vertical rectangle. Here are the corresponding images:

Alcohol
Alcohol Tile Corner Pts
Alcohol Square Pts
Warped Alcohol Image (by square tile)
OS Book
OS Book Corner Pts
OS Book Rectangle Pts
Warped OS Book (by rectangular shape)
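
The warp itself can be sketched as below. For simplicity this version uses a numpy inverse mapping with nearest-neighbor sampling instead of cv2.remap with cubic interpolation, and warp_image/out_shape are illustrative names (my warpImage takes dest_im instead of an explicit output shape):

```python
import numpy as np

def warp_image(im, H, out_shape):
    """Warp im into an output canvas of shape out_shape, where H maps
    source coordinates (x, y) to destination coordinates."""
    Hinv = np.linalg.inv(H)
    h_out, w_out = out_shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    dest = np.stack([xs, ys, np.ones_like(xs)]).reshape(3, -1)
    src = Hinv @ dest          # pull each destination pixel from the source
    src = src / src[2]         # dehomogenize
    sx = np.round(src[0]).astype(int).reshape(h_out, w_out)
    sy = np.round(src[1]).astype(int).reshape(h_out, w_out)
    valid = (sx >= 0) & (sx < im.shape[1]) & (sy >= 0) & (sy < im.shape[0])
    out = np.zeros(out_shape, dtype=im.dtype)
    out[valid] = im[sy[valid], sx[valid]]
    return out
```

Sampling through the inverse homography guarantees every output pixel gets a value, which is why library warps are usually phrased this way.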

Blend the Images into a Mosaic

Now, I blend the images into a mosaic through weighted averaging in code. In most cases, I leave the right (source2) image fixed while warping the left (source1) image into its projection. For a mosaic with more than two images, I choose one base image, compute two series of images warped consecutively to the base image (one from each direction), and then stitch the results together. Thus, my code supports stitching several images together into a mosaic. Here are some examples:

Whiteboard Left
Whiteboard Right
Whiteboard Left Warped
Whiteboard Right
Whiteboard Left
Whiteboard Right
Whiteboard Mosaic
Stands Left
Stands Right
Stands Left Warped
Stands Right
Stands Left
Stands Center
Stands Mosaic
Door Left
Door Right
Door Left Warped
Door Right
Door Left
Door Right
Door Mosaic
Vendor Left
Vendor Center (Base)
Vendor Right
Vendor Left Warped
Vendor Center (Base)
Vendor Right Warped
Vendor Left
Vendor Center (Base)
Vendor Right
Vendor Mosaic Left
Vendor Mosaic Right
Vendor Mosaic
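
A minimal stand-in for the blending step, assuming both images have already been warped onto a common canvas (the blend helper here is hypothetical, averaging the overlap region; my actual weighted averaging uses smoother weights):

```python
import numpy as np

def blend(base, warped):
    """Average the two aligned canvases where both are nonzero; elsewhere
    keep whichever image covers the pixel."""
    base = base.astype(float)
    warped = warped.astype(float)
    overlap = (base > 0) & (warped > 0)
    out = base + warped                     # union of the two coverages
    out[overlap] = (base[overlap] + warped[overlap]) / 2
    return out
```

Equal-weight averaging can leave visible seams; feathering the weights toward each image's boundary is the usual refinement.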

What I Learned

The coolest thing I learned in this part of the project was creating warps/transformations for projecting source images using the homography matrix computed from a system of equations. Creating the image mosaic by stitching photos that capture the same scene from different view directions, using the homography transformation, was the accomplishment I enjoyed most.

Acknowledgements

  • CS194-26 Project 4A Description
  • I've taken my own photographs for the mosaics and image rectification using my iPhone camera. The images are shown above on the website and are also included in my submission.

CS 194-26 Project 4B Report: Feature Matching for Autostitching

Kristy Lee

kristylee@berkeley.edu

For this portion of the project, I worked on Python code that automatically stitches images together into a mosaic by identifying Harris corners as feature points, determining feature matches, and using a RANSAC loop to find a set of inlier correspondence points from which to compute the homography matrix for the image transformation.

Harris Interest Point Detector (Section 2)

For this section, I use the starter code harris.py, whose get_harris_corners function computes the Harris corner points and the corresponding Harris corner strengths. The Harris matrix at (x,y) is computed from an outer product of gradients, shown below:

Harris Matrix Equation
and the corner strength \(f_{HM}\) returned for a given point (x,y) is the determinant of the Harris matrix divided by its trace.
Corner Strength Function Equation
The interest points are obtained by selecting points where the corner strength \(f_{HM}\) is a local maximum within a 3x3 region. Here is an example pair of images for which I computed Harris interest points:
Stands Left Corner Features
Stands Right Corner Features
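
A sketch of the kind of computation harris.py performs (harris_strength and smooth3 are illustrative names; the starter code differs in details such as the smoothing window):

```python
import numpy as np

def smooth3(a):
    """3x3 box filter with edge padding (a stand-in for Gaussian smoothing)."""
    p = np.pad(a, 1, mode='edge')
    h, w = a.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def harris_strength(im):
    """Per-pixel corner strength f = det(M) / trace(M), where M is the 2x2
    Harris matrix of locally summed outer products of image gradients."""
    iy, ix = np.gradient(im.astype(float))
    sxx, syy, sxy = smooth3(ix * ix), smooth3(iy * iy), smooth3(ix * iy)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det / (trace + 1e-12)   # small epsilon avoids division by zero
```

The det/trace ratio is large only where the gradient covers two directions, i.e. at corners, and near zero on edges and in flat regions.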

Implementing Adaptive Non-Maximal Suppression (Section 3)

Matching interest points between images is expensive, so in this section we implement the adaptive non-maximal suppression (ANMS) method to filter the set of interest points we consider. We compute the minimum suppression radius \(r_i\) for each interest point \(i\) by considering all other interest points \(j \neq i\) and finding the one that minimizes \(r_i\) through the following equation:

Minimum Suppression Radius Equation
Here, we use the corner strength function and a constant \(c_{\text{robust}}\), set to 0.9, as a condition before taking the distance between the points. I keep the list of points for an image sorted in decreasing order of corner strength. Once I find \(r_i\) for all interest points, I keep the 500 interest points with the largest radii. I perform this process on each image individually. Here are the resulting photos with the interest points filtered by ANMS:
Stands Left Interest Points (ANMS)
Stands Right Interest Points (ANMS)

Implementing Feature Descriptor Extraction (Section 4)

For this section, for each interest point (x,y) in a given image, I obtain a 40 x 40 region around the point with patch = im[y-20:y+20, x-20:x+20], normalize the patch so that its entries have mean 0 and standard deviation 1, and then resize the 40 x 40 patch to an 8 x 8 patch using skimage.transform.resize. I perform this operation on the filtered interest points of both images. Here are some example feature descriptors:

Feature Descriptor Stands 1 Example 1
Feature Descriptor Stands 1 Example 2
Feature Descriptor Stands 1 Example 3
Feature Descriptor Stands 2 Example 1
Feature Descriptor Stands 2 Example 2
Feature Descriptor Stands 2 Example 3
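
A sketch of the descriptor extraction; here the 40 x 40 to 8 x 8 downsampling is done with 5 x 5 block averaging instead of skimage.transform.resize, just to keep the example dependency-free:

```python
import numpy as np

def extract_descriptor(im, x, y):
    """8x8 feature descriptor from a 40x40 window around (x, y):
    bias/gain normalize, then downsample by block averaging."""
    patch = im[y - 20:y + 20, x - 20:x + 20].astype(float)
    patch = (patch - patch.mean()) / (patch.std() + 1e-8)  # mean 0, std 1
    # Group the 40x40 patch into 8x8 blocks of 5x5 pixels and average each.
    return patch.reshape(8, 5, 8, 5).mean(axis=(1, 3))
```

The bias/gain normalization makes the descriptor invariant to overall brightness and contrast changes between the two photos.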

Implementing Feature Matching (Section 5)

Here, I identify pairs of features that closely correspond to each other. For each interest point (x,y) of a fixed (anchor) image, I iterate through all of the other image's interest points (x',y'), compute the SSD between the two points' feature descriptors (np.sum((fd1-fd2)**2)), and append (x',y',SSD) to the list NN[(x,y)]. Then I sort NN[(x,y)] by SSD and identify (x,y)'s 1-NN and 2-NN. If (1-NN's SSD) / (2-NN's SSD) \(<\) a threshold (which I set to 0.6), I keep (x,y) and the 1-NN's (x',y') as a pair of potential correspondence points. I repeat this process for each (x,y) to get the complete list of kept interest points in both images, which I display below:

Stands Left Kept Points
Stands Right Kept Points

4-point RANSAC Loop

Here, I compute a homography matrix mapping points from a source image to the coordinate system of another image using the 4-point RANSAC method. The four steps of the RANSAC loop are as follows: 1. Select 4 random indices and use them to pick 4 kept interest points in the first image and the corresponding 4 in the second image. 2. Compute an exact homography H from those 4 correspondences. 3. Compute the inliers, i.e., the pairs satisfying \(\text{dist}(p_i', Hp_i) < \epsilon\), where I define \(\epsilon = 1\). 4. Keep the largest set of inliers across all iterations so far. Once the final set of inliers is obtained, I compute H using those inliers. I found that 1000 iterations produced satisfactory homography matrices.

Stands Left Final Inliers
Stands Right Final Inliers

Comparisons of Mosaics

Having the H matrix from the RANSAC loop, I can now create the mosaic using image warping and mosaic stitching methods similar to part A. Here's a comparison between the Stands mosaic from manual selection of feature points and the Stands mosaic from autostitching:

Stands Mosaic (Manual)
Stands Mosaic (Autostitching)

Notice that autostitching renders the "Center for Connected Learning" stand text readable and sharp, while in the manual version that text is somewhat blurry.

Stands Mosaic (Manual)
Stands Mosaic (Autostitching)

Here is another example:

Whiteboard Mosaic (Manual)
Whiteboard Mosaic (Autostitching)

For the whiteboard, the autostitching method produces clearer text on the whiteboard than the manual method:

Whiteboard Mosaic (Manual)
Whiteboard Mosaic (Autostitching)

Here's another example:

Door Mosaic (Manual)
Door Mosaic (Autostitching)

And finally, here is a comparison between two mosaics, each created from 3 images, for the Vendor scene:

Vendor Mosaic (Manual)
Vendor Mosaic (Autostitching)

Thus I've produced 4 example mosaics through both autostitching and manual selection of feature points.

What I Learned

The coolest things I learned in this part of the project were the math and algorithms around interest points, including computing nearest neighbors and the minimum suppression radius. I also enjoyed understanding the RANSAC loop and how the inliers are programmatically selected to serve as correspondence points for the homography computation.

Acknowledgements

  • CS194-26 Project 4B Description
  • I've taken my own photographs for the mosaics using my iPhone camera. The images are shown above on the website and are also included in my submission.