CS 194-26: Project 4
For this project, we calculated the homography between images using manually selected shared points. The homography was then used to warp one image onto the other so the two could be blended together. By stitching multiple images together, we are able to create a panorama. We can also use the same idea of homography to rectify images, so that objects that are rectangular in the real world appear flat and rectangular in the image as well.
Feature Detection
To know how the images should merge together, we need to select points that exist on both images. These potential points can be selected manually or automatically. Manual selection of the points is simple, but it requires someone to spend the time to figure out which points are in both images and select them in the same order for both images.
Instead, we can use the Harris interest point detector to find corner features in the images automatically. Below are random subsets of Harris points on three images that will make up a mosaic.
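A minimal NumPy sketch of the Harris response (the smoothing scale sigma and the constant k are assumed values, not necessarily the project's):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(im, sigma=1.0, k=0.05):
    """Harris corner response for a grayscale float image (sketch)."""
    # Image gradients via Gaussian derivative filters
    Ix = gaussian_filter(im, sigma, order=(0, 1))
    Iy = gaussian_filter(im, sigma, order=(1, 0))
    # Structure-tensor entries, smoothed over a neighborhood
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2  # large at corners, negative along edges

# A synthetic corner: a bright square on a dark background
im = np.zeros((40, 40))
im[10:30, 10:30] = 1.0
R = harris_response(im)
```

The square's corner pixel should score higher than a pixel in the middle of one of its edges, which is the property ANMS relies on below.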
One problem with the Harris interest point detector is that it outputs a large number of points, which makes it slow to work with. Therefore, we can use adaptive non-maximal suppression (ANMS) to get a subset of points. Instead of randomly selecting points, ANMS gives points that are spaced out and whose corner strengths are locally maximal. It uses the equation below to make the calculation:

    r_i = min_j ‖x_i − x_j‖, subject to f(x_i) < c_robust · f(x_j), for x_j ∈ I

(where x_i is the Harris point, I is the set of all Harris points, c_robust is used to determine the level of suppression, and f(x_j) is the corner strength of the Harris point)
We sort the r values of the points and take the top 500 points for a more manageable subset.
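The suppression radius can be computed directly from pairwise distances; a brute-force O(n²) sketch (parameter values are assumptions):

```python
import numpy as np

def anms(points, strengths, n_keep=500, c_robust=0.9):
    """Keep the n_keep points whose suppression radius r_i -- the distance
    to the nearest sufficiently stronger point -- is largest."""
    pts = np.asarray(points, float)
    f = np.asarray(strengths, float)
    # Pairwise squared distances between all candidate points
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    # Point j suppresses point i only when f(x_i) < c_robust * f(x_j)
    suppresses = c_robust * f[None, :] > f[:, None]
    d2 = np.where(suppresses, d2, np.inf)
    r = np.sqrt(d2.min(axis=1))      # suppression radius (inf if never suppressed)
    return np.argsort(-r)[:n_keep]   # indices of the largest radii

pts = np.array([[0, 0], [10, 0], [0, 10], [1, 1]])
f = np.array([5.0, 4.0, 3.0, 4.9])
idx = anms(pts, f, n_keep=2)
```

Points that no other point suppresses (the strongest ones) get an infinite radius and so are always kept first, which matches the intuition that ANMS preserves the locally dominant corners.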
Descriptor Extraction and Feature Matching
Now that we have a set of points with good corner strengths, we can extract the feature descriptors. A feature descriptor is the area around a feature point, which will be used to match a point from one image to another. The feature descriptors were created by blurring the entire image with a Gaussian filter, then taking the 40x40 area around each point. Each square was then downsampled to 8x8 and bias/gain-normalized.
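A sketch of that extraction step (the blur sigma is an assumed value, and sampling every 5th pixel is one simple way to downsample 40x40 to 8x8):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def extract_descriptor(im, y, x, patch=40, out=8):
    """MOPS-style descriptor: blur, crop a patch x patch window,
    downsample to out x out, then bias/gain-normalize."""
    blurred = gaussian_filter(im, sigma=2.0)  # sigma is an assumed value
    h = patch // 2
    window = blurred[y - h:y + h, x - h:x + h]
    step = patch // out                       # sample every 5th pixel
    desc = window[::step, ::step].astype(float)
    desc = desc - desc.mean()                 # remove bias
    std = desc.std()
    return desc / std if std > 0 else desc    # remove gain

im = np.random.default_rng(0).random((100, 100))
d = extract_descriptor(im, 50, 50)
```

After normalization every descriptor has zero mean and unit variance, so matching is insensitive to brightness and contrast differences between the two images.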
With the feature descriptors on hand, we can compare each feature on one image to every feature on the other image to find which features match the closest with each other on the two images. We used sum of squared differences to determine how close two features are to each other. Since we want matches that we are confident with, we impose a threshold when selecting matches. As the MOPS paper found, using a threshold of 0.66 for the ratio between the first lowest SSD error and the second lowest SSD error produces a fairly good probability distribution for correct matches with limited incorrect matches.
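The SSD matching with the 0.66 first-to-second-nearest ratio test can be sketched as follows (toy 2-D "descriptors" stand in for the real 8x8 ones):

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.66):
    """Match each descriptor in desc1 to its nearest neighbor in desc2 by SSD,
    keeping the match only if 1-NN error / 2-NN error is below the threshold."""
    matches = []
    for i, d in enumerate(desc1):
        ssd = ((desc2 - d) ** 2).sum(axis=1)  # SSD to every descriptor in image 2
        j1, j2 = np.argsort(ssd)[:2]          # best and second-best candidates
        if ssd[j1] / ssd[j2] < ratio:         # only keep confident matches
            matches.append((i, j1))
    return matches

# Feature 0 has one clear match; feature 1 is ambiguous between two candidates
d1 = np.array([[1.0, 0.0], [0.5, 0.5]])
d2 = np.array([[1.0, 0.1], [0.52, 0.5], [0.5, 0.52]])
m = match_features(d1, d2)
```

The ambiguous feature is rejected because its best and second-best SSD errors are nearly equal, which is exactly the failure mode the ratio test is designed to filter out.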
Although there are points that are incorrectly matched, they will be eliminated in the next step.
RANSAC
To make sure outlier points do not ruin the homography, we use RANdom SAmple Consensus (RANSAC) to estimate the homography matrix while avoiding feature-space outliers. We repeat the following steps 10,000 times with the matched feature points:
- Choose 4 random matching features
- Calculate the homography matrix from those 4 correspondences (8 points)
- Translate the feature points on one image using the homography
- Find the inliers using ‖H p_i − p_i′‖ < ε, where p_i is a matched point in the first image, p_i′ is its match in the second image, and ε is a small error threshold
- Store the set that has the largest number of inliers
In the end, we will have the largest set of inlier points. These points can be used to create the homography that will warp one image to the other.
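The loop above can be sketched as follows (a smaller iteration count than the 10,000 used in the project, and an assumed ε; the homography fit is the least-squares solve from the Recover Homographies section):

```python
import numpy as np

def fit_homography(src, dst):
    # Least-squares homography with h33 fixed to 1 (two rows per correspondence)
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp]); b.append(xp)
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp]); b.append(yp)
    h = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)[0]
    return np.append(h, 1.0).reshape(3, 3)

def apply_h(H, pts):
    # Apply H to Nx2 points: homogeneous coordinates, then dehomogenize
    p = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return p[:, :2] / p[:, 2:3]

def ransac_homography(src, dst, n_iter=1000, eps=1.0, seed=0):
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), bool)
    for _ in range(n_iter):
        sample = rng.choice(len(src), 4, replace=False)  # 4 random matches
        H = fit_homography(src[sample], dst[sample])
        err = np.linalg.norm(apply_h(H, src) - dst, axis=1)
        inliers = err < eps            # points that H maps close to their match
        if inliers.sum() > best.sum():
            best = inliers             # keep the largest inlier set
    return fit_homography(src[best], dst[best]), best    # refit on all inliers

src = np.array([[0, 0], [10, 0], [0, 10], [10, 10], [3, 6]], float)
dst = src + [5, 3]                    # a pure translation between the images
src = np.vstack([src, [[7, 2]]])
dst = np.vstack([dst, [[90, 40]]])    # one bad (outlier) match
H, inliers = ransac_homography(src, dst)
```

On this toy data the recovered homography is the true translation and the fabricated bad match is excluded from the inlier set.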
Recover Homographies
By taking pairs of corresponding points, we are able to compute the homography matrix, which can be used to warp an image into the perspective of the other image. We can recover the homography matrix H using the equation p′ = Hp, where p is a vector of the point in the first image and p′ is a vector of the point in the second image. We can represent the equation as a matrix multiplication:

    [w x′]   [a b c] [x]
    [w y′] = [d e f] [y]
    [ w  ]   [g h 1] [1]

We can then rearrange the matrix multiplication to find the homography matrix using multiple points: after dividing out the scale factor w = gx + hy + 1, each pair of corresponding points contributes two linear equations in the eight unknowns a through h,

    a x + b y + c − g x x′ − h y x′ = x′
    d x + e y + f − g x y′ − h y y′ = y′

which we can stack into one linear system over all the point pairs.
With the manually selected corresponding points, we can use least squares to find the homography matrix.
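A minimal least-squares solve under the h33 = 1 parameterization might look like this (function names are my own):

```python
import numpy as np

def compute_homography(pts1, pts2):
    """Least-squares homography with h33 fixed to 1.
    Each correspondence (x, y) -> (x', y') contributes two rows of A."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts1, pts2):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp]); b.append(xp)
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp]); b.append(yp)
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

pts1 = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], float)
pts2 = pts1 * 2 + 1   # a similarity transform: scale by 2, translate by (1, 1)
H = compute_homography(pts1, pts2)
```

With exactly four correspondences the system is square and the solution is exact; with more points, least squares averages out small errors in the selected coordinates.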
Warp the Images
Using the homography matrix, we take its dot product with the output image coordinates to get the x and y coordinate maps. We then use the remap function to interpolate the color values at those coordinates, producing the warped image.
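An inverse-warping sketch of this step, using scipy.ndimage.map_coordinates as a stand-in for the remap call (the function name and output-shape handling are my own):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_image(im, H, out_shape):
    """For each output pixel, apply H^-1 to find where it came from
    in the source image, then interpolate the color there."""
    Hinv = np.linalg.inv(H)
    ys, xs = np.indices(out_shape)
    ones = np.ones_like(xs)
    coords = np.stack([xs.ravel(), ys.ravel(), ones.ravel()])  # homogeneous (x, y, 1)
    src = Hinv @ coords
    src_x = src[0] / src[2]   # dehomogenize to get the source coordinate maps
    src_y = src[1] / src[2]
    # map_coordinates expects (row, col) = (y, x); pixels mapped outside get 0
    out = map_coordinates(im, [src_y, src_x], order=1, cval=0.0)
    return out.reshape(out_shape)

im = np.arange(25, dtype=float).reshape(5, 5)
H = np.array([[1, 0, 1], [0, 1, 0], [0, 0, 1]], float)  # translate x by +1
warped = warp_image(im, H, (5, 5))
```

Inverse warping (looping over destination pixels rather than source pixels) is what avoids holes in the output: every output pixel gets exactly one interpolated value.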
Image Rectification
By manually inputting the corner coordinates of a rectangle, we can warp the image so that a particular element becomes uniformly rectangular.
Blend the Images into a Mosaic
By warping images to a center image, we can then blend the images together to create a panoramic mosaic image. To blend the images, we add alpha values which start at 1 in the center column then linearly decrease to 0 at the ends. We then normalize the overlapping values to sum up to 1 which can then serve as weights for the color values. All the images are added together to create a blended panoramic image.
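The alpha-ramp weighting can be sketched on tiny 1x5 "images" (function names are my own; real mosaics would do this per color channel on the warped images):

```python
import numpy as np

def alpha_ramp(width):
    """Alpha that is 1 at the center column and falls linearly to 0 at the ends."""
    x = np.arange(width)
    center = (width - 1) / 2
    return 1.0 - np.abs(x - center) / center

def blend(images, alphas):
    """Normalize overlapping alphas to sum to 1, then take the weighted sum."""
    images = np.asarray(images, float)
    alphas = np.asarray(alphas, float)
    total = alphas.sum(axis=0)
    # Avoid dividing by zero where no image contributes
    weights = np.divide(alphas, total, out=np.zeros_like(alphas), where=total > 0)
    return (weights * images).sum(axis=0)

# Two constant 1x5 "images" overlapping everywhere
a = alpha_ramp(5)                  # ramp: 0, 0.5, 1, 0.5, 0
im1 = np.full((1, 5), 10.0)
im2 = np.full((1, 5), 20.0)
out = blend([im1, im2], [a[None, :], a[None, :]])
```

Where the two ramps overlap with equal alpha, the normalized weights are 0.5 each and the result is the average of the two images; where neither contributes, the output stays zero.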
Outdoor example
Living Room Example
Kitchen Example
Conclusion
The coolest thing I learned from this part was how homography can be used to align points by warping the perspective. This was particularly interesting when warping images to make objects flat while rectifying the image. This allowed unreadable parts of my monitor to become legible after rectification.
Part B
The coolest thing I learned from this part was the feature matching. It's pretty impressive that something as small as an 8x8 feature descriptor could be used to find matching feature descriptors in other images, resulting in points that visually correspond to the same location.