CS 194-26 Project 4A Report: Image Warping and Mosaicing

Kristy Lee

kristylee@berkeley.edu

This project involves warping images into a mosaic using transformations derived from a homography matrix. I take multiple photographs from the same point of view but with different view directions and overlapping fields of view, and use them to create a mosaic. I also learned about image rectification: warping an image so that certain lines are aligned or straightened.

Shoot the Pictures

I use my iPhone camera with standard settings and exposure/focus (AE/AF) lock to take a few photos. Here are the images I've taken:

Whiteboard Left
Whiteboard Right
Stands Left
Stands Right
Door Left
Door Right
Vendor Left
Vendor Center
Vendor Right

Recover Homographies

In this section, I compute the homography matrix that transforms points from one image to another. The goal of the homography matrix is to map each point p = (x,y) in an original image to p' = (x',y') in the target image. H is defined as \[H = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \] For this assignment, the scale factor i is set to 1, so the homography matrix has 8 degrees of freedom. Let \( \textbf{p'} = \begin{bmatrix} wx' \\ wy' \\ w \end{bmatrix} \), \( \textbf{p} = \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \), and define the mapping from p to p' as \[\textbf{p'} = H\textbf{p}\] Expanding this product and substituting \(w = gx + hy + 1\) gives \(x' = ax + by + c - gxx' - hyx'\) and \(y' = dx + ey + f - gxy' - hyy'\). H has 8 unknown variables, so I need 8 equations, which are provided by at least 4 pairs of corresponding points. Realistically, though, using only 4 correspondences makes the homography sensitive to noise and error in the selected points, so it is more typical to select more points and solve the resulting overdetermined system with least squares: \[\begin{bmatrix} x_{1} & y_{1} & 1 & 0 & 0 & 0 & -x_{1}'x_{1} & -x_{1}'y_{1}\\ 0 & 0 & 0 & x_{1} & y_{1} & 1 & -y_{1}'x_{1} & -y_{1}'y_{1}\\ x_{2} & y_{2} & 1 & 0 & 0 & 0 & -x_{2}'x_{2} & -x_{2}'y_{2}\\ 0 & 0 & 0 & x_{2} & y_{2} & 1 & -y_{2}'x_{2} & -y_{2}'y_{2}\\ \vdots & & & & & & & \vdots \\ x_{n} & y_{n} & 1 & 0 & 0 & 0 & -x_{n}'x_{n} & -x_{n}'y_{n}\\ 0 & 0 & 0 & x_{n} & y_{n} & 1 & -y_{n}'x_{n} & -y_{n}'y_{n}\\ \end{bmatrix} \begin{bmatrix} a\\ b\\ c\\ d\\ e\\ f\\ g\\ h \end{bmatrix} = \begin{bmatrix} x_{1}'\\ y_{1}'\\ x_{2}'\\ y_{2}'\\ \vdots \\ x_{n}'\\ y_{n}' \end{bmatrix} \] In my code, I defined H = computeH(im1_pts, im2_pts) to do this computation, where im1_pts are the source points and im2_pts are the destination points. Calling the coefficient matrix above \(A\) and the right-hand side \(b\), I solve \(Ah = b\) for \[h = \begin{bmatrix} a\\ b\\ c\\ d\\ e\\ f\\ g\\ h \end{bmatrix}\] using np.linalg.lstsq(A, b), append \(i = 1\), and reshape the result into \(H\).
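
A minimal sketch of this computation, building A and b row by row and solving with np.linalg.lstsq (this illustrates the setup above, not necessarily my exact code):

```python
import numpy as np

def computeH(im1_pts, im2_pts):
    """Least-squares homography mapping im1_pts (source) to im2_pts
    (destination). Each input is an (N, 2) array of (x, y) points, N >= 4."""
    im1_pts = np.asarray(im1_pts, dtype=float)
    im2_pts = np.asarray(im2_pts, dtype=float)
    A, b = [], []
    for (x, y), (xp, yp) in zip(im1_pts, im2_pts):
        # Two rows per correspondence, matching the system above.
        A.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y])
        A.append([0, 0, 0, x, y, 1, -yp * x, -yp * y])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.asarray(A), np.asarray(b), rcond=None)
    # Append the fixed scale factor i = 1 and reshape into 3x3 H.
    return np.append(h, 1.0).reshape(3, 3)
```

With exactly 4 correspondences in general position this recovers the exact homography; with more points it returns the least-squares fit.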

Warp the Images [2 Examples of Rectified Images]

Knowing how to compute the homography matrix from the previous part, I can now warp one image using a homography computed between a pair of images. In my code, I defined imwarp = warpImage(im, dest_im, H) to perform the warp (dest_im is the image whose coordinate system the source image is mapped to; its dimensions may be padded relative to the source image im). I compute the bounding box region/dimensions of the warped image, adjust the homography matrix accordingly, and then warp: I compute the points corresponding to the source points using the homography matrix H, then use cv2.remap(src, map1=warped_pts, map2=None, interpolation=cv2.INTER_CUBIC) to map the pixels of the source image to the positions defined by warped_pts. After completing the warp code, I worked on image rectification for images containing planar surfaces: I warp those images so that the plane becomes frontal-parallel (i.e., aligned horizontally and/or vertically). For the first example, I see four corners of a floor tile that surround the alcohol bottle. Since the tile is square, I used the four points at the corners of the tile and computed a homography mapping to an arbitrarily defined square (I used coordinates \([[300, 1500], [2000, 1500], [300, 2800], [2000, 2800]]\), which are approximately similar to the tile's coordinates in the unwarped image, \([[300.59, 1694.77], [1958.77, 1192.95], [147.86, 3211.14], [2896.95, 2349.32]]\)). For the second example, I use the front cover of a book and map its corners to the corners of an arbitrary vertical rectangle. Here are the corresponding images:

Alcohol
Alcohol Tile Corner Pts
Alcohol Square Pts
Warped Alcohol Image (by square tile)
OS Book
OS Book Corner Pts
OS Book Rectangle Pts
Warped OS Book (by rectangular shape)
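
The warp itself can be sketched as below. For simplicity this version uses a numpy inverse mapping with nearest-neighbor sampling instead of cv2.remap with cubic interpolation, and warp_image/out_shape are illustrative names (my warpImage takes dest_im instead of an explicit output shape):

```python
import numpy as np

def warp_image(im, H, out_shape):
    """Warp im into an output canvas of shape out_shape, where H maps
    source coordinates (x, y) to destination coordinates."""
    Hinv = np.linalg.inv(H)
    h_out, w_out = out_shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    dest = np.stack([xs, ys, np.ones_like(xs)]).reshape(3, -1)
    src = Hinv @ dest          # pull each destination pixel from the source
    src = src / src[2]         # dehomogenize
    sx = np.round(src[0]).astype(int).reshape(h_out, w_out)
    sy = np.round(src[1]).astype(int).reshape(h_out, w_out)
    valid = (sx >= 0) & (sx < im.shape[1]) & (sy >= 0) & (sy < im.shape[0])
    out = np.zeros(out_shape, dtype=im.dtype)
    out[valid] = im[sy[valid], sx[valid]]
    return out
```

Sampling through the inverse homography guarantees every output pixel gets a value, which is why library warps are usually phrased this way.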

Blend the Images into a Mosaic

Now, I blend the images into a mosaic through weighted averaging in code. In most cases, I leave the right (source2) image fixed while warping the left (source1) image into its projection. For a mosaic with more than two images, I choose one base image, compute two series of images warped consecutively to the base image (one from each direction), and then stitch the results together. Thus, my code supports stitching several images together into a mosaic. Here are some examples:

Whiteboard Left
Whiteboard Right
Whiteboard Left Warped
Whiteboard Right
Whiteboard Left
Whiteboard Right
Whiteboard Mosaic
Stands Left
Stands Right
Stands Left Warped
Stands Right
Stands Left
Stands Center
Stands Mosaic
Door Left
Door Right
Door Left Warped
Door Right
Door Left
Door Right
Door Mosaic
Vendor Left
Vendor Center (Base)
Vendor Right
Vendor Left Warped
Vendor Center (Base)
Vendor Right Warped
Vendor Left
Vendor Center (Base)
Vendor Right
Vendor Mosaic Left
Vendor Mosaic Right
Vendor Mosaic
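
A minimal stand-in for the blending step, assuming both images have already been warped onto a common canvas (the blend helper here is hypothetical, averaging the overlap region; my actual weighted averaging uses smoother weights):

```python
import numpy as np

def blend(base, warped):
    """Average the two aligned canvases where both are nonzero; elsewhere
    keep whichever image covers the pixel."""
    base = base.astype(float)
    warped = warped.astype(float)
    overlap = (base > 0) & (warped > 0)
    out = base + warped                     # union of the two coverages
    out[overlap] = (base[overlap] + warped[overlap]) / 2
    return out
```

Equal-weight averaging can leave visible seams; feathering the weights toward each image's boundary is the usual refinement.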

What I Learned

The coolest thing I learned in this part of the project was creating warps/transformations for projecting source images using the homography matrix computed from a system of equations. Creating the image mosaic by stitching photos that capture the same scene from different view directions, using the homography transformation, was the accomplishment I enjoyed most.

Acknowledgements

  • CS194-26 Project 4A Description
  • I've taken my own photographs for the mosaics and image rectification using my iPhone camera. The images are shown above on the website and are also included in my submission.

CS 194-26 Project 4B Report: Feature Matching for Autostitching

Kristy Lee

kristylee@berkeley.edu

For this portion of the project, I worked on Python code that automatically stitches images together into a mosaic by identifying Harris corners as feature points, determining feature matches, and using a RANSAC loop to find a set of inlier correspondence points from which to compute the homography matrix for the image transformation.

Harris Interest Point Detector (Section 2)

For this section, I use the starter code harris.py, whose get_harris_corners function computes the Harris corner points and the corresponding Harris corner strengths. The Harris matrix at (x,y) is computed from an outer product of gradients, shown below:

Harris Matrix Equation
and the corner strength \(f_{HM}\) returned for a given point (x,y) is the determinant of the Harris matrix divided by its trace.
Corner Strength Function Equation
The interest points are obtained by selecting points where the corner strength \(f_{HM}\) is a local maximum within a 3x3 region. Here is an example pair of images for which I computed Harris interest points:
Stands Left Corner Features
Stands Right Corner Features
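
A sketch of the kind of computation harris.py performs (harris_strength and smooth3 are illustrative names; the starter code differs in details such as the smoothing window):

```python
import numpy as np

def smooth3(a):
    """3x3 box filter with edge padding (a stand-in for Gaussian smoothing)."""
    p = np.pad(a, 1, mode='edge')
    h, w = a.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def harris_strength(im):
    """Per-pixel corner strength f = det(M) / trace(M), where M is the 2x2
    Harris matrix of locally summed outer products of image gradients."""
    iy, ix = np.gradient(im.astype(float))
    sxx, syy, sxy = smooth3(ix * ix), smooth3(iy * iy), smooth3(ix * iy)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det / (trace + 1e-12)   # small epsilon avoids division by zero
```

The det/trace ratio is large only where the gradient covers two directions, i.e. at corners, and near zero on edges and in flat regions.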

Implementing Adaptive Non-Maximal Suppression (Section 3)

Matching interest points between images is expensive, so in this section we implement the adaptive non-maximal suppression (ANMS) method to filter the set of interest points we consider. We compute the minimum suppression radius \(r_i\) for each interest point \(i\) by considering all other interest points \(j \neq i\) and finding the one that minimizes \(r_i\) through the following equation:

Minimum Suppression Radius Equation
Here, we use the corner strength function and a constant \(c_{\text{robust}}\), set to 0.9, as a condition before taking the distance between the points. I keep the list of points for an image sorted in decreasing order of corner strength. Once I find \(r_i\) for all interest points, I keep the 500 interest points with the largest radii. I perform this process on each image individually. Here are the resulting photos with the interest points filtered by ANMS:
Stands Left Interest Points (ANMS)
Stands Right Interest Points (ANMS)

Implementing Feature Descriptor Extraction (Section 4)

For this section, for each interest point (x,y) in a given image, I obtain a 40 x 40 region around the point with patch = im[y-20:y+20, x-20:x+20], normalize the patch so that its entries have mean 0 and standard deviation 1, and then resize the 40 x 40 patch to an 8 x 8 patch using skimage.transform.resize. I perform this operation on the filtered interest points of both images. Here are some example feature descriptors:

Feature Descriptor Stands 1 Example 1
Feature Descriptor Stands 1 Example 2
Feature Descriptor Stands 1 Example 3
Feature Descriptor Stands 2 Example 1
Feature Descriptor Stands 2 Example 2
Feature Descriptor Stands 2 Example 3
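
A sketch of the descriptor extraction; here the 40 x 40 to 8 x 8 downsampling is done with 5 x 5 block averaging instead of skimage.transform.resize, just to keep the example dependency-free:

```python
import numpy as np

def extract_descriptor(im, x, y):
    """8x8 feature descriptor from a 40x40 window around (x, y):
    bias/gain normalize, then downsample by block averaging."""
    patch = im[y - 20:y + 20, x - 20:x + 20].astype(float)
    patch = (patch - patch.mean()) / (patch.std() + 1e-8)  # mean 0, std 1
    # Group the 40x40 patch into 8x8 blocks of 5x5 pixels and average each.
    return patch.reshape(8, 5, 8, 5).mean(axis=(1, 3))
```

The bias/gain normalization makes the descriptor invariant to overall brightness and contrast changes between the two photos.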

Implementing Feature Matching (Section 5)

Here, I identify pairs of features that closely correspond to each other. For each interest point (x,y) of a fixed (anchor) image, I iterate through all of the other image's interest points (x',y'), compute the SSD between the two points' feature descriptors (np.sum((fd1-fd2)**2)), and append (x',y',SSD) to the list NN[(x,y)]. Then I sort NN[(x,y)] by SSD and identify (x,y)'s 1-NN and 2-NN. If (1-NN's SSD) / (2-NN's SSD) \(<\) a threshold (which I set to 0.6), I keep (x,y) and the 1-NN's (x',y') as a pair of potential correspondence points. I repeat this process for each (x,y) to get the complete list of kept interest points in both images, which I display below:

Stands Left Kept Points
Stands Right Kept Points

4-point RANSAC Loop

Here, I compute a homography matrix mapping points from a source image to the coordinate system of another image using the 4-point RANSAC method. The four steps of the RANSAC loop are as follows: 1. Select 4 random indices and use them to pick 4 kept interest points in the first image and the corresponding 4 in the second image. 2. Compute an exact homography H from those 4 correspondences. 3. Compute the inliers, i.e., the pairs satisfying \(\text{dist}(p_i', Hp_i) < \epsilon\), where I define \(\epsilon = 1\). 4. Keep the largest set of inliers across all iterations so far. Once the final set of inliers is obtained, I compute H using those inliers. I found that 1000 iterations produced satisfactory homography matrices.

Stands Left Final Inliers
Stands Right Final Inliers

Comparisons of Mosaics

Having the H matrix from the RANSAC loop, I can now create the mosaic using image warping and mosaic stitching methods similar to part A. Here's a comparison between the Stands mosaic from manual selection of feature points and the Stands mosaic from autostitching:

Stands Mosaic (Manual)
Stands Mosaic (Autostitching)

Notice that autostitching renders the "Center for Connected Learning" stand text readable and sharp, while in the manual version that text is somewhat blurry.

Stands Mosaic (Manual)
Stands Mosaic (Autostitching)

Here is another example:

Whiteboard Mosaic (Manual)
Whiteboard Mosaic (Autostitching)

For the whiteboard, the autostitching method produces clearer text on the whiteboard than the manual method:

Whiteboard Mosaic (Manual)
Whiteboard Mosaic (Autostitching)

Here's another example:

Door Mosaic (Manual)
Door Mosaic (Autostitching)

And finally, here is a comparison between two mosaics, each created from 3 images, for the Vendor scene:

Vendor Mosaic (Manual)
Vendor Mosaic (Autostitching)

Thus I've produced 4 example mosaics through both autostitching and manual selection of feature points.

What I Learned

The coolest things I learned in this part of the project were the math and algorithms around interest points, including computing nearest neighbors and the minimum suppression radius. I also enjoyed understanding the RANSAC loop and how the inliers are programmatically selected to serve as correspondence points for the homography computation.

Acknowledgements

  • CS194-26 Project 4B Description
  • I've taken my own photographs for the mosaics using my iPhone camera. The images are shown above on the website and are also included in my submission.