Kristy Lee
kristylee@berkeley.edu
In this project, I warp images into a mosaic using transformations derived from a homography matrix. I take multiple photographs from the same point of view but with different view directions and overlapping fields of view, and use them to create a mosaic. I also explore image rectification: warping an image so that certain lines become aligned/straightened.
I use my iPhone camera with default settings and exposure/focus lock (AE/AF lock) to take a few photos. Here are the images I've taken:
In this section, I compute the homography matrix that transforms points from one image to another. The goal of the homography is to map each point p = (x,y) in the original image to p' = (x',y') in the target image. H is defined as
\[H = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \]
For this assignment, the scale factor i is equal to 1 so that the homography matrix has 8 degrees of freedom. Let \( \textbf{p'} = \begin{bmatrix} wx' \\ wy' \\ w \end{bmatrix} \), \( \textbf{p} = \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \).
and the mapping from p to p' is defined as
\[\textbf{p'} = H\textbf{p}\]
Expanding \( \textbf{p'} = H\textbf{p} \) and dividing through by \(w = gx + hy + 1\) gives \(x' = ax + by + c - gxx' - hyx'\) and \(y' = dx + ey + f - gxy' - hyy'\).
In H, there are 8 unknown variables. To solve for the 8 unknowns, I need 8 equations, which at least 4 corresponding point pairs provide (each pair contributes two equations). Realistically, though, using only 4 corresponding points makes the homography estimate prone to noise/error in the selected points. So it is more typical to select more than 4 correspondences and solve the resulting overdetermined system of equations using least squares.
\[\begin{bmatrix}
x_{1} & y_{1} & 1 & 0 & 0 & 0 & -x_{1}'x_{1} & -x_{1}'y_{1}\\
0 & 0 & 0 & x_{1} & y_{1} & 1 & -y_{1}'x_{1} & -y_{1}'y_{1}\\
x_{2} & y_{2} & 1 & 0 & 0 & 0 & -x_{2}'x_{2} & -x_{2}'y_{2}\\
0 & 0 & 0 & x_{2} & y_{2} & 1 & -y_{2}'x_{2} & -y_{2}'y_{2}\\
x_{3} & y_{3} & 1 & 0 & 0 & 0 & -x_{3}'x_{3} & -x_{3}'y_{3}\\
0 & 0 & 0 & x_{3} & y_{3} & 1 & -y_{3}'x_{3} & -y_{3}'y_{3}\\
x_{4} & y_{4} & 1 & 0 & 0 & 0 & -x_{4}'x_{4} & -x_{4}'y_{4}\\
0 & 0 & 0 & x_{4} & y_{4} & 1 & -y_{4}'x_{4} & -y_{4}'y_{4}\\
\vdots
\end{bmatrix}
\begin{bmatrix}
a\\ b\\ c\\ d\\ e\\ f\\ g\\ h
\end{bmatrix}
=
\begin{bmatrix}
x_{1}'\\ y_{1}'\\ x_{2}'\\ y_{2}'\\ x_{3}'\\ y_{3}'\\ x_{4}'\\ y_{4}'\\ \vdots
\end{bmatrix}
\]
In my code, I defined H = computeH(im1_pts, im2_pts) to do the computation, where im1_pts are the source points and im2_pts are the destination points.
If \[A = \begin{bmatrix}
x_{1} & y_{1} & 1 & 0 & 0 & 0 & -x_{1}'x_{1} & -x_{1}'y_{1}\\
0 & 0 & 0 & x_{1} & y_{1} & 1 & -y_{1}'x_{1} & -y_{1}'y_{1}\\
x_{2} & y_{2} & 1 & 0 & 0 & 0 & -x_{2}'x_{2} & -x_{2}'y_{2}\\
0 & 0 & 0 & x_{2} & y_{2} & 1 & -y_{2}'x_{2} & -y_{2}'y_{2}\\
x_{3} & y_{3} & 1 & 0 & 0 & 0 & -x_{3}'x_{3} & -x_{3}'y_{3}\\
0 & 0 & 0 & x_{3} & y_{3} & 1 & -y_{3}'x_{3} & -y_{3}'y_{3}\\
x_{4} & y_{4} & 1 & 0 & 0 & 0 & -x_{4}'x_{4} & -x_{4}'y_{4}\\
0 & 0 & 0 & x_{4} & y_{4} & 1 & -y_{4}'x_{4} & -y_{4}'y_{4}\\
\vdots
\end{bmatrix}, \quad b = \begin{bmatrix}
x_{1}'\\ y_{1}'\\ x_{2}'\\ y_{2}'\\ x_{3}'\\ y_{3}'\\ x_{4}'\\ y_{4}'\\ \vdots
\end{bmatrix}
\]
I solve for \[h = \begin{bmatrix}
a\\ b\\ c\\ d\\ e\\ f\\ g\\ h
\end{bmatrix}\] using np.linalg.lstsq(A, b), then append the scale factor \(i = 1\) and reshape the result to construct \(H\).
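The least-squares setup above can be sketched as follows. This is a minimal illustration of computeH assuming (N, 2) arrays of (x, y) points; the project's actual implementation may differ in details.

```python
import numpy as np

def computeH(im1_pts, im2_pts):
    """Least-squares homography mapping im1_pts -> im2_pts.

    im1_pts, im2_pts: (N, 2) arrays of (x, y) points, N >= 4.
    Builds two rows of A and two entries of b per correspondence,
    solves A h = b for the 8 unknowns, then appends i = 1.
    """
    im1_pts = np.asarray(im1_pts, dtype=float)
    im2_pts = np.asarray(im2_pts, dtype=float)
    A, b = [], []
    for (x, y), (xp, yp) in zip(im1_pts, im2_pts):
        A.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y])
        A.append([0, 0, 0, x, y, 1, -yp * x, -yp * y])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)  # scale factor i = 1
```

With exactly 4 correspondences the system is square and the recovered H is exact; with more points the least-squares fit averages out clicking noise.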
Knowing how to compute the homography matrix from the previous part, I can now warp images using a homography computed between them to perform a forward warp. In my code, I defined imwarp = warpImage(im, dest_im, H) to perform the warp (dest_im is the image whose coordinate system the source image im is mapped into; its dimensions may be padded relative to im). I compute the bounding box region/dimensions of the warped image, adjust the homography matrix accordingly, and then warp: I compute the points corresponding to the source points using the homography H, and use cv2.remap(src, map1=warped_pts, map2=None, interpolation=cv2.INTER_CUBIC) to map the pixels of the source image to the positions defined by warped_pts.
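As a rough, dependency-free illustration of the warping step, here is a plain-NumPy sketch with nearest-neighbor sampling. Supplying per-pixel sample coordinates (as cv2.remap does) amounts to an inverse mapping, so this sketch samples the source through \(H^{-1}\). The function name and signature here are illustrative, not the project's exact code.

```python
import numpy as np

def warp_image_sketch(im, dest_shape, H):
    """Warp `im` onto a canvas of shape dest_shape using homography H
    (H maps source coordinates to destination coordinates).

    For every destination pixel, apply H^-1 to find its source location,
    then copy that pixel (nearest-neighbor; real code would interpolate).
    """
    out_h, out_w = dest_shape[:2]
    Hinv = np.linalg.inv(H)
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    dest_pts = np.stack([xs, ys, np.ones_like(xs)]).reshape(3, -1)  # (x, y, 1)
    src = Hinv @ dest_pts
    src = src / src[2]                    # de-homogenize
    sx = np.rint(src[0]).astype(int)
    sy = np.rint(src[1]).astype(int)
    valid = (sx >= 0) & (sx < im.shape[1]) & (sy >= 0) & (sy < im.shape[0])
    out = np.zeros((out_h, out_w) + im.shape[2:], dtype=im.dtype)
    flat = out.reshape((out_h * out_w,) + im.shape[2:])
    flat[valid] = im[sy[valid], sx[valid]]  # copy only in-bounds samples
    return out
```

Pixels whose preimage falls outside the source stay zero, which is what produces the black padding around a warped image on the larger canvas.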
After completing the warp code, I worked on image rectification for some images containing planar surfaces: I warp each image so that the plane becomes frontal-parallel (i.e., aligned horizontally and/or vertically). For the first example, I see four corners of a floor tile surrounding the alcohol bottle. Since the tile is square, I selected the four corner points of the tile in the unwarped image, approximately \([[300.59, 1694.77], [1958.77, 1192.95], [147.86, 3211.14], [2896.95, 2349.32]]\), and computed a homography mapping them to a square defined arbitrarily at \([[300, 1500], [2000, 1500], [300, 2800], [2000, 2800]]\), coordinates roughly similar to those of the tile in the original image.
For the second example, I use the book's front cover and map its corners to the corners of an arbitrary vertical rectangle. Here are the corresponding images:
Now, I blend the images into a mosaic through weighted averaging. In most cases, I keep the right (source2) image fixed while warping the left (source1) image into its projection. For a mosaic built from more than two images, I choose one base image, compute two series of images warped consecutively toward the base image (one from each direction), and then stitch the images together. Thus, my code supports stitching multiple images into a mosaic. Here are some examples:
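A minimal sketch of the weighted-averaging blend, assuming both images have already been warped onto a shared canvas where empty pixels are 0. The project's actual weighting (e.g. feathered masks) may differ; here each image's binary support mask is its weight, so overlap pixels are averaged.

```python
import numpy as np

def blend_pair(a, b):
    """Weighted-average blend of two same-sized float canvases.

    Where only one image has content, that image wins; where both
    overlap, the result is their average.
    """
    ma = (a > 0).astype(float)          # support mask of image a
    mb = (b > 0).astype(float)          # support mask of image b
    denom = np.maximum(ma + mb, 1e-12)  # avoid division by zero off-canvas
    return (a * ma + b * mb) / denom
```

Replacing the binary masks with masks that ramp down toward each image's border gives the smoother feathered seams used in practice.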
The coolest thing I learned in this part of the project was creating warps/transformations that project source images using a homography matrix computed from a system of equations. Stitching two photos that capture the same scene from different view directions into a single mosaic using that transformation was an accomplishment I especially enjoyed.
For this portion of the project, I worked on Python code that automatically stitches images into a mosaic by detecting Harris corners as feature points, matching features between images, and running a RANSAC loop to find a set of inlier correspondences from which the homography matrix for the image transformation is computed.
For this section, I use the starter code harris.py, whose get_harris_corners function computes the Harris corner points along with their corner strengths. The Harris matrix at (x,y) is built from the outer product of the image gradients, accumulated over a window \(W\) around the point:
\[M = \sum_{(u,v) \in W} \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}\]
Matching interest points between images is expensive, so in this section I implement adaptive non-maximal suppression (ANMS) to limit the number of interest points considered. For each interest point \(i\), I compute the minimum suppression radius \(r_i\): the distance to the nearest interest point \(j \neq i\) whose corner strength sufficiently dominates that of \(i\),
\[r_i = \min_{j} \, \lVert \mathbf{x}_i - \mathbf{x}_j \rVert \quad \text{s.t.} \quad f(\mathbf{x}_i) < c_{\text{robust}} \, f(\mathbf{x}_j)\]
The points with the largest radii are then kept.
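The suppression-radius computation can be sketched in NumPy as follows. This is an O(N²) illustration; the function name and the c_robust value are assumptions, not the project's exact code.

```python
import numpy as np

def anms(coords, strengths, n_keep, c_robust=0.9):
    """Adaptive non-maximal suppression (O(N^2) sketch).

    coords: (N, 2) corner coordinates; strengths: (N,) corner strengths.
    For each point i, r_i is the distance to the nearest point j whose
    strength dominates it (f_i < c_robust * f_j); the n_keep points with
    the largest radii are returned as indices.
    """
    coords = np.asarray(coords, float)
    strengths = np.asarray(strengths, float)
    d2 = np.sum((coords[:, None] - coords[None, :]) ** 2, axis=-1)
    dominates = strengths[None, :] * c_robust > strengths[:, None]  # j dominates i
    d2 = np.where(dominates, d2, np.inf)   # only dominating points suppress
    radii = np.sqrt(d2.min(axis=1))        # inf for the globally strongest point
    return np.argsort(-radii)[:n_keep]
```

Because radii shrink in crowded regions, this keeps strong corners that are also well spread over the image rather than just the n_keep strongest ones.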
For each interest point coordinate (x,y) in a given image, I take the 40 x 40 region of the image around the point with patch = im[y-20:y+20, x-20:x+20], normalize the patch so its entries have mean 0 and standard deviation 1, and then resize the 40 x 40 patch to an 8 x 8 patch using skimage.transform.resize. I perform this operation on the filtered interest points of both images. Here are some example feature descriptors:
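A sketch of this descriptor extraction, using 5 x 5 mean pooling as a dependency-free stand-in for skimage.transform.resize; names are illustrative.

```python
import numpy as np

def describe(im, points):
    """Axis-aligned 8x8 feature descriptors from 40x40 patches.

    For each (x, y): grab the 40x40 window, downsample to 8x8 by 5x5
    mean pooling, and bias/gain-normalize to zero mean, unit std.
    Points too close to the image border are skipped.
    """
    descs = []
    for x, y in points:
        if y < 20 or x < 20 or y + 20 > im.shape[0] or x + 20 > im.shape[1]:
            continue
        patch = im[y - 20:y + 20, x - 20:x + 20]
        small = patch.reshape(8, 5, 8, 5).mean(axis=(1, 3))    # 40x40 -> 8x8
        small = (small - small.mean()) / (small.std() + 1e-8)  # normalize
        descs.append(small.ravel())
    return np.array(descs)
```

The normalization makes the 64-dimensional descriptors invariant to affine changes in brightness, so SSD comparisons between images are meaningful.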
Here, I identify pairs of features that closely correspond to each other. For each interest point (x,y) of a fixed (anchor) image, I iterate through all of the other image's interest points (x',y'), compute the SSD between the two feature descriptors (np.sum((fd1-fd2)**2)), and append (x',y',SSD) to the list NN[(x,y)]. I then sort NN[(x,y)] by SSD and identify (x,y)'s 1-NN and 2-NN. If (1-NN's SSD)/(2-NN's SSD) \(<\) threshold (which I set to 0.6), I keep (x,y) and the 1-NN's (x',y') as a pair of potential correspondence points. I repeat this process for each (x,y) to get the complete list of kept interest points in both images, displayed below:
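The ratio-test matching above can be sketched as follows, vectorizing the SSD computation instead of building the per-point NN dictionary; the 0.6 threshold follows the text, the rest is illustrative.

```python
import numpy as np

def match_features(descs1, descs2, threshold=0.6):
    """Lowe-style ratio-test matching on SSD (sketch).

    Returns (i, j) index pairs where descriptor i in image 1 matches
    descriptor j in image 2 and the 1-NN/2-NN SSD ratio is below threshold.
    """
    matches = []
    for i, fd1 in enumerate(descs1):
        ssd = np.sum((descs2 - fd1) ** 2, axis=1)  # SSD to every descriptor
        order = np.argsort(ssd)
        nn1, nn2 = order[0], order[1]
        if ssd[nn1] / (ssd[nn2] + 1e-12) < threshold:  # ratio test
            matches.append((i, nn1))
    return matches
```

The ratio test discards ambiguous matches: a feature is kept only when its best match is clearly better than its second-best, which removes most repeated-texture false positives.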
Here, I compute a homography matrix mapping points from a source image into the coordinate system of another image using 4-point RANSAC. The steps of the RANSAC loop are: 1. Select 4 random indices and use them to pick 4 kept interest points in each image. 2. Compute an exact homography H from those 4 correspondences. 3. Compute the inliers, i.e., the pairs satisfying \(\text{dist}(p_i', Hp_i) < \epsilon\), where I set \(\epsilon=1\). 4. Keep the largest set of inliers found across all iterations so far. Once the final set of inliers is obtained, I compute H using all of those inliers. I found that 1000 iterations produced satisfactory homography matrices.
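The 4-point RANSAC loop can be sketched as follows, with a minimal least-squares homography helper inlined so the snippet is self-contained. The \(\epsilon = 1\) threshold and 1000-iteration default follow the text; the names are illustrative.

```python
import numpy as np

def compute_h(src, dst):
    # Least-squares homography from (x, y) correspondences (as in Part A).
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y])
        A.append([0, 0, 0, x, y, 1, -yp * x, -yp * y])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def ransac_homography(pts1, pts2, n_iters=1000, eps=1.0, rng=None):
    """4-point RANSAC for a homography (sketch of the loop described above)."""
    rng = np.random.default_rng(rng)
    pts1 = np.asarray(pts1, float)
    pts2 = np.asarray(pts2, float)
    hom1 = np.hstack([pts1, np.ones((len(pts1), 1))])  # homogeneous sources
    best_inliers = np.zeros(len(pts1), bool)
    for _ in range(n_iters):
        idx = rng.choice(len(pts1), 4, replace=False)  # step 1: random sample
        H = compute_h(pts1[idx], pts2[idx])            # step 2: exact H
        proj = hom1 @ H.T
        proj = proj[:, :2] / proj[:, 2:3]              # de-homogenize
        dist = np.linalg.norm(proj - pts2, axis=1)
        inliers = dist < eps                           # step 3: inlier set
        if inliers.sum() > best_inliers.sum():         # step 4: keep the best
            best_inliers = inliers
    return compute_h(pts1[best_inliers], pts2[best_inliers]), best_inliers
```

Because the final H is refit on all inliers, a single bad correspondence that survives feature matching has no influence on the homography used for stitching.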
With the H matrix from the RANSAC loop, I can now create the mosaic using the image warping and mosaic stitching methods from part A. Here's a comparison between the Stands mosaic from manual selection of feature points and the Stands mosaic from autostitching:
Notice that autostitching renders the "Center for Connected Learning" stand text readable and sharp, while in the manual version that text is somewhat blurry.
Here is another example:
For the whiteboard, the autostitching method produces clearer text than the manual method:
Here's another example:
And finally, here is a comparison between two mosaics, each created from 3 images. Vendor:
Thus I've produced 4 example mosaics through autostitching and manual selection of feature points.
The coolest thing I learned in this part of the project was the math and algorithms behind the interest points, including computing nearest neighbors and the minimum suppression radius. I also enjoyed understanding the RANSAC loop and how the inliers are programmatically selected to serve as correspondence points in the homography computation.