Project 6: Auto-Stitching Photo Mosaics

Student Name: Divyansh Agarwal (cs194-26-aba)

Course: CS 194-26 (Computational Photography)

Part A

Background

In this project, we use homographies and blending techniques to create image mosaics.

Shoot and Digitize Images

I shot all the images in this write-up with my iPhone's camera. For the images used to create the mosaics, I made sure to only rotate the iPhone about the camera (its center of projection) and not translate it.

Recover Homographies

In order to align two images to create a mosaic, I first defined a set of 10-16 correspondences between the two images of interest (the number depended on how many clearly visible corners were available), choosing corners as the correspondence points. To determine the transformation needed to warp the first image into alignment with the second, I needed to find a matrix H such that p' = Hp, where p and p' are the (homogeneous) coordinates of corresponding points and H is a 3 by 3 matrix with 8 degrees of freedom. I computed H by solving this linear system of equations using the approach suggested in this document. In theory, only 4 pairs of correspondences are needed, but with only 4 points the homography is unstable and prone to noise, so I used 10-16 correspondences and solved the resulting overdetermined system.
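The system can be set up with two equations per correspondence (fixing the bottom-right entry of H to 1) and solved in the least-squares sense. Below is a minimal sketch of this, assuming the correspondences are given as NumPy arrays of (x, y) points; the function name and exact interface are illustrative rather than my actual code.

    import numpy as np

    def compute_homography(pts1, pts2):
        """Estimate H such that pts2 ~ H * pts1 (in homogeneous coordinates).

        pts1, pts2: (N, 2) arrays of corresponding (x, y) points, N >= 4.
        Solves the overdetermined system in the least-squares sense.
        """
        A, b = [], []
        for (x, y), (xp, yp) in zip(pts1, pts2):
            # Two equations per correspondence, with the last entry of H fixed to 1.
            A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
            A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
            b.extend([xp, yp])
        h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)  # 8 unknowns
        return np.append(h, 1).reshape(3, 3)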

Warp the Images

Using the computed homography matrix, I warped one of my images into the frame of the other. I did this with an inverse warping approach, as follows (without loss of generality, I will refer to one image as Image 1 and the other as Image 2, with this naming consistent throughout):

1. I first computed the homography matrix H that maps coordinates of Image 1 into the frame of Image 2, using the correspondences between the two images.

2. I applied H to the four corners of Image 1 and took the rectangular bounding box of the warped corners.

3. For each coordinate within this bounding box, I applied H^-1 to find the corresponding coordinate in Image 1 from which to sample the pixel.

4. I used interpolation when the recovered coordinates were not integers. If the recovered coordinate fell outside Image 1, I set the pixel value to 0 (this happened when a bounding-box coordinate was not contained within the polygon enclosed by the warped corners). A code sketch of this procedure is given after this list.
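A minimal sketch of the inverse warp is shown below. For simplicity it assumes the warped bounding box has already been shifted so its top-left corner is at the origin, and it uses scipy's map_coordinates for the interpolation; my actual implementation may handle these details differently.

    import numpy as np
    from scipy.ndimage import map_coordinates

    def inverse_warp(im1, H, out_shape):
        """Warp im1 (rows x cols x channels, float) into the frame defined by H.

        out_shape: (rows, cols) of the bounding box of the warped corners.
        Pixels that map outside im1 are set to 0.
        """
        H_inv = np.linalg.inv(H)
        rows, cols = out_shape
        # Grid of homogeneous (x, y, 1) coordinates in the output frame.
        xs, ys = np.meshgrid(np.arange(cols), np.arange(rows))
        coords = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
        # Map each output coordinate back into Image 1.
        src = H_inv @ coords
        src_x, src_y = src[0] / src[2], src[1] / src[2]
        out = np.zeros((rows, cols, im1.shape[2]))
        for c in range(im1.shape[2]):
            # Bilinear interpolation; out-of-range coordinates get cval = 0.
            out[..., c] = map_coordinates(
                im1[..., c], [src_y, src_x], order=1, mode="constant", cval=0.0
            ).reshape(rows, cols)
        return out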

Rectified images can be seen below:

Original Image

Rectified Image

Another pair of original and rectified images is as follows:

Original Image

Rectified Image

Stitching and Blending

I warped one of my images into alignment with the other, and then blended the two to create a mosaic. I found that alpha blending (feathering) gave good results, i.e., the blended region was computed as alpha * im1 + (1 - alpha) * im2. I implemented a version of alpha blending that blends over the entire overlap region between the two images, with the weight given to each image varying linearly from left to right: towards the left of the overlap region, the image corresponding to the left of the mosaic gets a higher weight, and towards the right of the overlap region, the image corresponding to the right of the mosaic gets a higher weight. Overall, the stitched mosaics looked pretty good! (I made a small change to the blending code previously submitted for Part A to reduce the amount of black area towards the right of the mosaic; everything else is the same.)
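A simplified sketch of this linear feathering is shown below. It assumes the two images have already been warped and placed on a common canvas of the same size, and that the overlap region is bounded by two column indices; these names are illustrative.

    import numpy as np

    def feather_blend(left_im, right_im, overlap_start, overlap_end):
        """Blend two aligned, same-size images with a linear alpha ramp.

        Columns [overlap_start, overlap_end) form the overlap region; alpha
        falls linearly from 1 (use only left_im) to 0 (use only right_im).
        """
        cols = left_im.shape[1]
        alpha = np.zeros(cols)
        alpha[:overlap_start] = 1.0
        alpha[overlap_start:overlap_end] = np.linspace(1.0, 0.0, overlap_end - overlap_start)
        alpha = alpha[None, :, None]  # broadcast over rows and channels
        return alpha * left_im + (1 - alpha) * right_im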

The first mosaic is as follows:

Image 1

Image 2

Mosaic

The second mosaic is as follows:

Image 1

Image 2

Mosaic

The third mosaic is as follows:

Image 1

Image 2

Mosaic

What I Learned

I gained useful practice with transformations and warping by working with homographies. The project was challenging to code up, with lots of bugs that took time to track down, so it was a great way to improve my coding skills as well. The output was super cool and satisfying to look at! I also learned the importance of defining correspondences well, and appreciate even more the value of RANSAC for automatically determining correspondences, which is the next part of this project.

Part B

Background

In this part, we try to automatically stitch mosaics together, so that we don't have to manually define correspondences like we did in Part A. This involves automatically determining correspondences between two input images, so that an appropriate homography between the two can be computed.

Feature Detection

We need points of interest in each image in order to eventually determine correspondences automatically. I used the Harris corner detector to detect corners, and enforced that detected corners be at least 10 pixels apart.
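One way to obtain Harris corners with a minimum spacing is via scikit-image, roughly as sketched below; the exact detector parameters I used may differ. The Harris response at each corner location serves as its corner strength in the next section.

    from skimage.color import rgb2gray
    from skimage.feature import corner_harris, corner_peaks

    def harris_corners(im, min_distance=10):
        """Return (row, col) Harris corners at least min_distance pixels apart."""
        gray = rgb2gray(im)
        response = corner_harris(gray)  # corner strength map
        corners = corner_peaks(response, min_distance=min_distance)
        return corners, response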

Example of Harris Corners Detected

Adaptive Non-Maximal Suppression

The Harris corner detector returns a lot of corners, and using all of them would slow computation down significantly. Ideally, we want to select a subset of these corners that is evenly distributed across the image.

To obtain such a subset, I implemented the Adaptive Non-Maximal Suppression (ANMS) algorithm. For each Harris corner A, I defined its suppression radius as the distance to the closest Harris corner B such that the corner strength of A is less than 0.9 times the corner strength of B. I then kept the 400 Harris corners with the largest suppression radii.
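A straightforward (unoptimized) sketch of this suppression-radius computation is below; variable names are illustrative, and this per-corner computation is what I later sped up by parallelizing (mentioned at the end of Part B).

    import numpy as np

    def anms(corners, strengths, num_keep=400, c_robust=0.9):
        """Adaptive non-maximal suppression.

        corners: (N, 2) array of (row, col) corner locations.
        strengths: (N,) corner strengths (the Harris response at each corner).
        Keeps the num_keep corners with the largest suppression radii.
        """
        radii = np.full(len(corners), np.inf)
        for i in range(len(corners)):
            # Corners B that are sufficiently stronger than corner A = corners[i].
            stronger = strengths[i] < c_robust * strengths
            if np.any(stronger):
                dists = np.linalg.norm(corners[stronger] - corners[i], axis=1)
                radii[i] = dists.min()
        keep = np.argsort(-radii)[:num_keep]  # largest radii first
        return corners[keep]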

Example of Corners Chosen by Adaptive Non-Maximal Suppression

The corners chosen in the example image above do appear evenly spaced.

Feature Descriptor Extraction and Feature Matching

To extract feature descriptors, I first took a 40 by 40 Gaussian-blurred window around each corner. From each window, I subsampled every 5th pixel to get an 8 by 8 patch. Each patch was then bias/gain normalized so that it had zero mean and unit standard deviation. To match features between the two images, I computed a matrix of pairwise SSD distances between these normalized patches. For each patch, I looked at the distance to its nearest neighbor (call this 1-NN) and the distance to its second nearest neighbor (call this 2-NN), and computed the ratio 1-NN/2-NN. I declared a match between a patch and its nearest neighbor only if this ratio was less than 0.6. If two patches matched, I said there was a correspondence between the Harris corners those patches came from.
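The descriptor extraction and ratio-test matching can be sketched roughly as follows. The blur sigma here is an illustrative choice, and corners too close to the image border would need to be discarded so that the full 40 by 40 window fits; my actual code may handle these details differently.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def extract_descriptors(gray, corners):
        """8x8 axis-aligned descriptors: blur, take a 40x40 window, keep every 5th pixel."""
        blurred = gaussian_filter(gray, sigma=2)  # sigma is an illustrative choice
        descriptors = []
        for r, c in corners:
            patch = blurred[r - 20:r + 20:5, c - 20:c + 20:5]  # 8 x 8 subsampled patch
            patch = (patch - patch.mean()) / patch.std()       # bias/gain normalization
            descriptors.append(patch.ravel())
        return np.array(descriptors)

    def match_descriptors(desc1, desc2, ratio=0.6):
        """Match via pairwise SSD and the 1-NN/2-NN ratio test; returns index pairs."""
        # ssd[i, j] = squared distance between desc1[i] and desc2[j]
        ssd = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(axis=2)
        matches = []
        for i in range(len(desc1)):
            order = np.argsort(ssd[i])
            nn1, nn2 = ssd[i, order[0]], ssd[i, order[1]]
            if nn1 / nn2 < ratio:
                matches.append((i, order[0]))
        return matches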

Example of Matched Corners

Matched Corners From One Image

Matched Corners From The Other Image

If we look closely at the corners displayed on each of the images above, we can see that they do correspond to the same points of interest.

RANSAC

To robustly estimate the homography needed to warp one image to the other, I implemented the RANSAC algorithm. Without loss of generality, call the two images Image 1 and Image 2, and assume we are warping Image 1 into alignment with Image 2. For each of a pre-specified number of iterations (10000 in this case), I randomly sampled 4 correspondences from those identified in the earlier section and computed an exact homography H from them. I then applied H to all the Image 1 corners in the matched correspondences. A correspondence was considered an inlier if applying H to the point from Image 1 produced a point close to the corresponding point in Image 2; specifically, I computed the Euclidean distance between the warped Image 1 point and the expected point in Image 2, and counted the correspondence as an inlier if this distance was less than 1 pixel. I counted the inliers on every iteration and kept track of the largest inlier set found. Finally, I computed the homography from the correspondences in that largest inlier set, and used it as the homography to warp Image 1 into alignment with Image 2.
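A sketch of this loop is below. It reuses the compute_homography sketch from Part A, and treats the iteration count and 1-pixel threshold as parameters; the structure, rather than the exact code, is the point.

    import numpy as np

    def ransac_homography(pts1, pts2, num_iters=10000, thresh=1.0):
        """Robustly estimate the homography mapping pts1 to pts2 ((N, 2) matched points)."""
        n = len(pts1)
        pts1_h = np.hstack([pts1, np.ones((n, 1))])  # homogeneous coordinates
        best_inliers = np.array([], dtype=int)
        for _ in range(num_iters):
            sample = np.random.choice(n, 4, replace=False)
            H = compute_homography(pts1[sample], pts2[sample])  # exact fit to 4 points
            proj = (H @ pts1_h.T).T
            proj = proj[:, :2] / proj[:, 2:3]
            errors = np.linalg.norm(proj - pts2, axis=1)
            inliers = np.nonzero(errors < thresh)[0]
            if len(inliers) > len(best_inliers):
                best_inliers = inliers
        # Refit the homography on the largest inlier set found.
        return compute_homography(pts1[best_inliers], pts2[best_inliers]), best_inliers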

Example of Largest Set of Inlier Corners from RANSAC

Largest Set of RANSAC Inlier Corners From One Image

Largest Set of RANSAC Inlier Corners From The Other Image

Only a subset of the matched corners has been chosen by RANSAC.

Mosaics

Without loss of generality, call the two images Image 1 and Image 2, and assume we are warping Image 1 into alignment with Image 2. Using the homography computed with RANSAC, I warped Image 1 with inverse warping as in Part A, and then blended the warped Image 1 with Image 2 as in Part A to get a mosaic.


The first mosaic is as follows:

Manually stitched

Automatically stitched

The automatically stitched mosaic is a bit more visually appealing: the manually picked correspondences may have been slightly misaligned, whereas the automatically determined correspondences do not have this problem.

The second mosaic is as follows:

Manually stitched

Automatically stitched

As with the first mosaic, the automatically stitched version looks a bit better, since the automatic correspondences avoid the slight misalignment of the manually picked points.

The third mosaic is as follows:

Manually stitched

Automatically stitched

Again, the automatically stitched mosaic looks a bit better, for the same reason.

What I Learned

From Part B, I learned how to generate features, use them for matching, and implement RANSAC. I also learned how to parallelize my Python code, which I did to speed up Adaptive Non-Maximal Suppression.