Project 5: (Auto)stitching and photo mosaics

CS 194

By Won Ryu

Part A

Overview

This project is about image mosaicing. I created image mosaics by registering overlapping photographs, applying projective warps, resampling, and compositing the results. The core steps are computing homographies between image pairs and using them to warp the images.

Shoot the Pictures

I took pictures of Irvine and the Berkeley campus, shooting each set from the same point of view but in different directions. I made sure the fields of view overlap so that corresponding points could be picked in the overlapping region and used to compute the homographies.

Recover Homographies

The homography is the 3X3 matrix H that maps a point p in one image to the corresponding point p’ in the other image: p’ = Hp, where both points are written in homogeneous coordinates and the equality holds up to scale. H has 8 degrees of freedom because its lower-right element is fixed to 1, so the 8 unknown entries can be stacked into a vector h and the correspondences written as a linear system Ah = b. Each correspondence contributes two equations, so at least 4 correspondences are needed. I solved for h by least squares using the pseudo-inverse and then reshaped it into the 3X3 matrix H.
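The setup of Ah = b can be sketched as follows. This is a minimal sketch, not the project's actual code: the function names `compute_homography` and `apply_homography` and the (x, y) point layout are assumptions made for illustration. Each correspondence (x, y) -> (x', y') yields two rows of A, obtained by multiplying out x' = (h1 x + h2 y + h3) / (h7 x + h8 y + 1) and the matching expression for y'.

```python
import numpy as np

def compute_homography(src, dst):
    """Least-squares 3x3 homography H (with H[2, 2] = 1) so that dst ~ H @ src.

    src, dst: (N, 2) arrays of (x, y) correspondences, N >= 4.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = len(src)
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
        # Two linear equations in the 8 unknowns per correspondence.
        A[2 * i] = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        b[2 * i], b[2 * i + 1] = xp, yp
    h = np.linalg.pinv(A) @ b               # pseudo-inverse least-squares solution
    return np.append(h, 1.0).reshape(3, 3)  # put the fixed 1 back in the corner

def apply_homography(H, pts):
    """Map (N, 2) points through H and divide out the homogeneous scale."""
    pts = np.asarray(pts, dtype=float)
    homog = np.column_stack([pts, np.ones(len(pts))])
    out = homog @ H.T
    return out[:, :2] / out[:, 2:3]
```

With exactly 4 correspondences the system is square and the fit is exact; with more points the pseudo-inverse averages out small clicking errors.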

Warp the Images

The image is warped by first forward warping its corners to find the dimensions of the warped image. Then, for each pixel coordinate in the warped image, I inverse warped back to a coordinate in the original image and computed the value of each color channel by linear interpolation from the original image.
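The two steps can be sketched like this. It is a minimal sketch under assumptions: the name `warp_image`, the returned canvas offset, the zero fill outside the source image, and the small tolerance `eps` are illustrative choices, and H is assumed to keep the homogeneous scale away from zero over the image.

```python
import numpy as np

def warp_image(img, H, eps=1e-9):
    """Inverse-warp img by H (H maps source (x, y) to output (x, y)).

    Returns (warped, (x_min, y_min)), where the offset is the output
    coordinate of the top-left pixel of the warped canvas.
    """
    img = np.asarray(img, dtype=float)
    h, w = img.shape[:2]
    # Forward-warp the four corners to size the output canvas.
    corners = np.array([[0, 0, 1], [w - 1, 0, 1],
                        [w - 1, h - 1, 1], [0, h - 1, 1]], dtype=float).T
    wc = H @ corners
    wc = wc[:2] / wc[2]
    x_min, y_min = np.floor(wc.min(axis=1)).astype(int)
    x_max, y_max = np.ceil(wc.max(axis=1)).astype(int)
    out_w, out_h = x_max - x_min + 1, y_max - y_min + 1

    # Inverse-warp every output pixel back into the source image.
    xs, ys = np.meshgrid(np.arange(x_min, x_max + 1), np.arange(y_min, y_max + 1))
    dst = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = np.linalg.inv(H) @ dst
    sx, sy = src[0] / src[2], src[1] / src[2]
    valid = (sx >= -eps) & (sx <= w - 1 + eps) & (sy >= -eps) & (sy <= h - 1 + eps)
    sx, sy = np.clip(sx, 0, w - 1), np.clip(sy, 0, h - 1)

    # Bilinear interpolation from the four surrounding source pixels.
    x0 = np.minimum(np.floor(sx).astype(int), w - 2)
    y0 = np.minimum(np.floor(sy).astype(int), h - 2)
    fx, fy = (sx - x0)[:, None], (sy - y0)[:, None]
    px = img if img.ndim == 3 else img[..., None]
    vals = (px[y0, x0] * (1 - fx) * (1 - fy) + px[y0, x0 + 1] * fx * (1 - fy)
            + px[y0 + 1, x0] * (1 - fx) * fy + px[y0 + 1, x0 + 1] * fx * fy)
    vals[~valid] = 0  # pixels that fall outside the source stay black
    out = vals.reshape(out_h, out_w, -1)
    if img.ndim == 2:
        out = out[..., 0]
    return out, (int(x_min), int(y_min))
```

Looping over output pixels rather than input pixels is what avoids holes: every output pixel receives exactly one interpolated value.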

Image Rectification

To check that the homography and the warping were done correctly, I rectified an image. The square floor tiles below, which were photographed at an angle, were rectified so that they appear as if they were viewed from directly above.

Bathroom floor

bears

bears

Rectified image

bears

Kitchen tiles

bears

bears

Rectified image

bears

Blend the images into a mosaic

One of the images was warped into the coordinate frame of the other. Then both were placed on a shared canvas and blended using the Laplacian pyramid technique that was used for the Oraple in project 2.
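The blending idea can be sketched as follows. To keep the example short and NumPy-only, it swaps in a Laplacian stack (no downsampling) for the pyramid and uses a simple [1, 2, 1] binomial blur in place of a Gaussian. The names `blur` and `laplacian_stack_blend`, the number of levels, and the grayscale-only input are all assumptions for illustration, not the project's code.

```python
import numpy as np

def blur(img, passes=1):
    """Separable [1, 2, 1] / 4 blur with edge replication, repeated `passes` times."""
    out = np.asarray(img, dtype=float)
    for _ in range(passes):
        p = np.pad(out, ((1, 1), (0, 0)), mode="edge")
        out = (p[:-2] + 2 * p[1:-1] + p[2:]) / 4
        p = np.pad(out, ((0, 0), (1, 1)), mode="edge")
        out = (p[:, :-2] + 2 * p[:, 1:-1] + p[:, 2:]) / 4
    return out

def laplacian_stack_blend(a, b, mask, levels=4):
    """Blend grayscale images a and b; mask is 1 where a should show, 0 for b."""
    ga, gb = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    gm = np.asarray(mask, dtype=float)
    out = np.zeros_like(ga)
    for i in range(levels):
        # Blur more aggressively at each level (coarser frequency band).
        na, nb = blur(ga, 2 ** i), blur(gb, 2 ** i)
        gm = blur(gm, 2 ** i)
        # Band-pass detail of each image, mixed with the matching soft mask.
        out += gm * (ga - na) + (1 - gm) * (gb - nb)
        ga, gb = na, nb
    # Add the remaining low-frequency residual, mixed with the softest mask.
    return out + gm * ga + (1 - gm) * gb
```

Fine detail is mixed across a narrow seam and coarse structure across a wide one, which hides exposure differences without ghosting sharp edges.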

Irvine view

First image.

bears

Second image.

bears

Second image warped to first.

bears

Mosaic

bears

Terrace Cafe view

First image.

bears

Second image.

bears

Second image warped to first.

bears

Mosaic

bears

East Asian Library view

First image.

bears

Second image.

bears

Second image warped to first.

bears

Mosaic

bears

What I’ve learned

I learned a new meaning of linear algebra. I was always aware that matrix multiplications are transformations and that their effect on 2D vectors can be visualized, but it was this project that made me realize the same concept can be used to warp images into other shapes. It’s really cool that linear algebra can be used to rectify images and reshape them the way we want.

Part B

Overview

Now we detect the corresponding points for the warp automatically instead of picking them manually. This part follows this paper: https://inst.eecs.berkeley.edu/~cs194-26/fa20/hw/proj5/Papers/MOPS.pdf

Detecting corner features in an image

To detect points of interest, we first find corners using the Harris detector, which finds the image coordinates where shifting in both the x and y directions produces a significant change in intensity.
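A minimal sketch of a Harris-style corner response, to make "change in both directions" concrete. It is not the detector used for the results here: the names `window_sum` and `harris_response`, the box window of radius `r`, and the det / trace form of the corner strength are assumptions chosen for brevity.

```python
import numpy as np

def window_sum(a, r):
    """Sum of `a` over a (2r+1) x (2r+1) window around each pixel (zero padded)."""
    h, w = a.shape
    p = np.pad(a, r, mode="constant")
    out = np.zeros((h, w), dtype=float)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out

def harris_response(img, r=2, eps=1e-12):
    """Corner strength det(M) / trace(M) of the local structure tensor M."""
    img = np.asarray(img, dtype=float)
    iy, ix = np.gradient(img)          # derivatives along rows (y) and columns (x)
    sxx = window_sum(ix * ix, r)
    syy = window_sum(iy * iy, r)
    sxy = window_sum(ix * iy, r)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    # Large only when the gradient varies in two independent directions.
    return det / (trace + eps)
```

On a straight edge the gradient points in a single direction, so the determinant vanishes and the response is zero; only at a corner are both eigenvalues of M large.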

bears

bears

Adaptive Non-maximal Suppression

The Harris detector returns a lot of points, and we want only the strongest point within each region, so we used Adaptive Non-maximal Suppression (ANMS) to keep only the most significant points spread over the image. We set c_robust to 0.9, and the function f evaluated at each point is the corner strength. I is the set of all points, and for each point x_i we find the suppression radius r_i = min_j |x_i - x_j|, subject to f(x_i) < c_robust * f(x_j), with x_j in I. Then we keep only the 500 points with the highest r_i values.
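A minimal sketch of this selection rule, assuming the corner coordinates and strengths are already available as arrays. The name `anms` and the brute-force all-pairs distance matrix are illustrative assumptions; the matrix needs O(N^2) memory, so a real run over many Harris points may need chunking or a spatial index.

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Return (indices of kept points, suppression radius of every point).

    coords: (N, 2) point coordinates; strengths: (N,) corner strengths.
    """
    coords = np.asarray(coords, dtype=float)
    strengths = np.asarray(strengths, dtype=float)
    # Squared distance between every pair of points.
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    # suppresses[i, j] is True when point j is sufficiently stronger than point i.
    suppresses = strengths[:, None] < c_robust * strengths[None, :]
    d2 = np.where(suppresses, d2, np.inf)
    # r_i: distance to the nearest sufficiently stronger point (inf if none).
    radii = np.sqrt(d2.min(axis=1))
    keep = np.argsort(-radii)[:n_keep]
    return keep, radii
```

Sorting by radius rather than by strength is what spreads the kept points over the image: a weak corner survives if nothing stronger is nearby.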

bears

bears

Extracting a Feature Descriptor for each feature point

For each point we want a feature descriptor. The descriptor is a 40X40 patch around the point that is downsampled to an 8X8 patch, with the values normalized to have a mean of 0 and a standard deviation of 1.
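A minimal sketch of such a descriptor. The name `extract_descriptor`, the 5X5 block averaging used as the downsampling step, and the choice to skip points too close to the border are assumptions for illustration; blurring followed by subsampling would be another reasonable way to downsample.

```python
import numpy as np

def extract_descriptor(img, x, y, patch=40, out=8):
    """Return a normalized (out*out,) descriptor for the point (x, y), or None.

    img: 2D grayscale image; (x, y) are integer column and row coordinates.
    """
    half = patch // 2
    h, w = img.shape
    # Skip points whose patch would leave the image.
    if x - half < 0 or y - half < 0 or x + half > w or y + half > h:
        return None
    window = np.asarray(img[y - half:y + half, x - half:x + half], dtype=float)
    s = patch // out
    # Average each s x s block: 40x40 -> 8x8 with the default sizes.
    small = window.reshape(out, s, out, s).mean(axis=(1, 3))
    std = small.std()
    if std == 0:
        return None  # a flat patch carries no information
    # Bias/gain normalization: zero mean, unit standard deviation.
    return ((small - small.mean()) / std).ravel()
```

The normalization makes the descriptor unchanged under a brightness offset or a contrast scaling of the image, which helps when the two photos were exposed differently.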

bears

bears

Matching these feature descriptors between two images

The points were matched using their feature descriptors. For each point in the first image, we found its first and second nearest neighbors among the descriptors of the second image and computed the ratio of the distance to the first nearest neighbor over the distance to the second. If that ratio was lower than 0.65, the point was matched to its first nearest neighbor in the second image. This ensures that a match is decisive: the first choice must be significantly better than the second choice to be accepted.
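A minimal sketch of the ratio test, assuming the descriptors are stacked as rows of two arrays and that the second image has at least two descriptors. The name `match_features` and the use of plain Euclidean distance are illustrative assumptions.

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.65):
    """Return a list of (i, j) pairs: descriptor i in image 1 matches j in image 2."""
    desc1 = np.asarray(desc1, dtype=float)
    desc2 = np.asarray(desc2, dtype=float)
    # Euclidean distance between every descriptor pair, shape (N1, N2).
    d = np.sqrt(((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(axis=2))
    order = np.argsort(d, axis=1)
    matches = []
    for i in range(len(desc1)):
        nn1, nn2 = order[i, 0], order[i, 1]
        # Accept only if the best match is clearly better than the runner-up.
        if d[i, nn1] < ratio * d[i, nn2]:
            matches.append((i, int(nn1)))
    return matches
```

A point on a repeated texture, such as one of many identical windows, tends to have two almost equally good candidates, so the ratio is close to 1 and the point is left unmatched.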

bears

bears

RANSAC

Now that the points are matched, we want to eliminate the incorrect matches. The idea is that matches that are correct are all correct in the same way, while matches that are incorrect are each incorrect in their own way. We used RANSAC: over 2000 iterations, we picked 4 random matches and computed a homography H from them. Then, with p_i and p’_i being the matched points in image 1 and image 2 respectively, we calculated the distance between p’_i and Hp_i for every match. If this distance was less than 30, we incremented a counter for that match. A match that received more than 50 counts was considered a good match and was kept.
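A minimal sketch of this voting scheme, under stated assumptions. The names `fit_homography` and `ransac_votes`, the fixed random seed, and the choice not to let the four sampled matches vote for the homography they themselves define (they fit it exactly by construction) are illustrative assumptions, not details of the actual project code. The defaults mirror the numbers quoted above.

```python
import numpy as np

def fit_homography(src, dst):
    """Least-squares homography with H[2, 2] = 1 from (N, 2) point arrays."""
    rows, rhs = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        rows.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        rhs.extend([xp, yp])
    h = np.linalg.pinv(np.array(rows, dtype=float)) @ np.array(rhs, dtype=float)
    return np.append(h, 1.0).reshape(3, 3)

def ransac_votes(pts1, pts2, iters=2000, thresh=30.0, min_votes=50, seed=0):
    """Return (indices of kept matches, vote count of every match)."""
    rng = np.random.default_rng(seed)
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    n = len(pts1)
    homog = np.column_stack([pts1, np.ones(n)])
    votes = np.zeros(n, dtype=int)
    for _ in range(iters):
        idx = rng.choice(n, size=4, replace=False)
        H = fit_homography(pts1[idx], pts2[idx])
        proj = homog @ H.T
        with np.errstate(divide="ignore", invalid="ignore"):
            proj = proj[:, :2] / proj[:, 2:3]
            agrees = np.linalg.norm(proj - pts2, axis=1) < thresh
        # Assumption: the sampled matches do not vote for their own fit.
        agrees[idx] = False
        votes += agrees
    return np.flatnonzero(votes > min_votes), votes
```

Correct matches agree with every homography fitted from four other correct matches, so their counters grow quickly, while an incorrect match only agrees with a fitted homography by coincidence.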

bears

bears

Produce a mosaic

Using the matched points that survived RANSAC, a homography was calculated, and the images were warped and then stitched as in Part A.

Irvine view

First image.

bears

Second image.

bears

Second image warped to first.

bears

Auto Mosaic

bears

Manual Mosaic

bears

Terrace Cafe view

First image.

bears

Second image.

bears

Second image warped to first.

bears

Auto Mosaic

bears

Manual Mosaic

bears

East Asian Library view

First image.

bears

Second image.

bears

Second image warped to first.

bears

Auto Mosaic

bears

Manual Mosaic

bears

What I’ve learned

I learned that computational methods can actually detect corners and points of interest more accurately than a human picking them by hand. I also found it cool that life wisdom, such as not choosing either option when you’re having a hard time deciding, applies to computer science: when matching points, we used the ratio between the first and second nearest neighbor distances as opposed to just the nearest neighbor distance.