Christine Zhou, cs194-26-act
Music 26AC Reader
Flour
HMMB
Leconte
UCBSO
In this part, we wanted to find the homography that warps one image into the shape of another. First we defined the correspondences between the two images. Then, using these correspondences as point pairs, we computed the matrix H such that H * p maps each point p in the original image to its corresponding point p' in the warped image. Once we have H, we can use its inverse to do an inverse mapping: for each pixel in the warped image, we look up its value in the original image. Doing this for every pixel of the output warps the original image into the target shape. Below are examples of two images that are warped and rectified (the output correspondences form a rectangle, effectively "rectifying" the image):
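The least-squares setup above can be sketched in NumPy as follows (the function names are illustrative, not the ones from my code). Fixing the bottom-right entry of H to 1 leaves eight unknowns, so four or more correspondences suffice:

```python
import numpy as np

def compute_homography(pts1, pts2):
    """Least-squares H such that H @ [x, y, 1] ~ [x', y', 1].
    pts1, pts2: (N, 2) arrays of corresponding points, N >= 4."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts1, pts2):
        # Each correspondence contributes two linear equations in the
        # eight unknown entries of H (h33 is fixed to 1).
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, pts):
    """Map (N, 2) points through H, dividing out the w coordinate."""
    p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]
```

For the inverse mapping, `apply_homography(np.linalg.inv(H), output_pixels)` gives the source locations to sample from.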
Using the warping discussed in the previous part, we generate mosaics: panoramic-like photos. This is done by aligning the photos and then blending the resulting image.
To do this, we again mark correspondences between the two images. We then keep one image static and warp the other image into the static image's frame. Once we have the static image and the warped image, we overlay them. There will be a visible seam between the two images, but it can be smoothed using alpha blending over the entire overlapping region. Some examples of the mosaicked images are shown below (originals, the warped image, and then the blended images):
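A minimal sketch of the blending step, assuming both images have already been warped onto a shared canvas where black pixels mean "no coverage." For simplicity this uses a constant 0.5 alpha over the whole overlap rather than a feathered ramp:

```python
import numpy as np

def alpha_blend(static, warped):
    """Blend two same-sized canvases (H, W, 3). Non-black pixels mark
    coverage; each image fills its exclusive region, and the overlap
    gets a 50/50 alpha blend (a stand-in for a feathering ramp)."""
    mask_s = static.sum(axis=2) > 0
    mask_w = warped.sum(axis=2) > 0
    overlap = mask_s & mask_w
    # Where only one image covers a pixel, the other contributes zero.
    out = static.astype(float) + warped.astype(float)
    out[overlap] = 0.5 * static[overlap] + 0.5 * warped[overlap]
    return out
```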
HMMB
Leconte
UCBSO (left and middle)
UCBSO (right and middle)
I learned that these panoramic-like photos do not involve math that is too complicated, and with just matrix multiplications we are able to align images together. I also learned that it is very important that the correspondences are carefully picked; if the points are just barely off then the resulting image will not be aligned very well.
For this part of the project, we will find corresponding points between two images automatically. We will then use these corresponding points to warp and create a mosaic of our images.
Using the provided code for finding Harris corners, we detect possible correspondence points in each of the images. The Harris points are shown below:
The detected Harris points are far too numerous, so we need to filter them. The simplest approach is to keep the points with the highest corner strengths, but this gives a poor spatial distribution. Points selected this way are shown below:
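Keeping the strongest corners is a one-line filter; a sketch with illustrative names:

```python
import numpy as np

def top_k_corners(coords, strengths, k):
    """Keep the k Harris points with the largest corner strength.
    coords: (N, 2) point coordinates, strengths: (N,) corner responses."""
    order = np.argsort(strengths)[::-1][:k]  # indices by descending strength
    return coords[order]
```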
To get a better distribution of points, we will use adaptive non-maximal suppression based on corner strength. ANMS is based on the following equation:

r_i = min_j ‖x_i − x_j‖, subject to f(x_i) < c_robust * f(x_j)

where f(x) is the corner strength at point x and c_robust is a constant controlling how much stronger a neighbor must be in order to suppress point i.
The general gist of the algorithm is as follows: we first iterate through each of the Harris points and determine the minimum suppression radius for that point (essentially the distance to the closest other point whose corner strength is sufficiently stronger than the current point's). Once we have calculated this for all points, we pick some number of points with the largest minimum suppression radii; these are the points that are far from any other point with stronger corner strength. The result of performing ANMS on the above images, keeping the top 250 points, is shown below:
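The procedure above can be sketched as a straightforward O(N^2) loop; the function and parameter names are illustrative:

```python
import numpy as np

def anms(coords, strengths, n_keep=250, c_robust=0.9):
    """Adaptive non-maximal suppression. For each point i, r_i is the
    distance to the nearest point j that dominates it, i.e. where
    strengths[i] < c_robust * strengths[j]. Keep the n_keep points
    with the largest r_i."""
    coords = np.asarray(coords, float)
    strengths = np.asarray(strengths, float)
    radii = np.full(len(coords), np.inf)  # global maximum keeps radius = inf
    for i in range(len(coords)):
        stronger = c_robust * strengths > strengths[i]
        if stronger.any():
            d = np.linalg.norm(coords[stronger] - coords[i], axis=1)
            radii[i] = d.min()
    keep = np.argsort(radii)[::-1][:n_keep]  # largest suppression radii
    return coords[keep]
```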
Once we have used ANMS to determine a better distribution of points that have stronger corner strengths, then we want to determine the actual correspondences between the points that remain. To determine the correspondences between points, we will view a 41 x 41 window around each point, resize it down to an 8 x 8 window, and then subtract the mean and divide by the standard deviation of the 8 x 8 window. This will give a good feature descriptor of the patch around each point. Once we calculate the feature descriptors for each of the points in each image, we will use the sum of squared differences to measure the similarity between every patch in image 1 with every patch in image 2.
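A sketch of the descriptor extraction. For simplicity this substitutes a stride-5 subsampling of a 40 x 40 crop for the actual 41 x 41 window and resize (in practice a blurred downsample such as skimage.transform.resize is the better choice):

```python
import numpy as np

def extract_descriptor(img, y, x):
    """Bias/gain-normalized 8x8 descriptor around (y, x) in a grayscale
    image. Simplification: a 40x40 crop subsampled with stride 5 stands
    in for the blurred resize of the 41x41 window described above."""
    patch = img[y - 20:y + 20, x - 20:x + 20]   # 40x40 crop
    small = patch[::5, ::5].astype(float)       # down to 8x8
    # Normalize: subtract the mean, divide by the standard deviation.
    return (small - small.mean()) / small.std()
```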
However, some points in one image may have an incorrect nearest neighbor in the other image, or no true match at all, so we will need to reject some of the points. Once we have the SSDs between each patch in image 1 and each patch in image 2, we use Lowe's feature-space outlier rejection to decide which points to keep. For each point, we divide the SSD of its nearest neighbor by the SSD of its second nearest neighbor. If this ratio is large, the first and second nearest neighbors are nearly equally similar, so the match is ambiguous and the point is not reliable. Because of this, we will only keep points whose ratio between the SSDs of the first and second nearest neighbors is lower than 0.6.
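The matching step can be sketched as follows; the pairwise SSD matrix is vectorized via the expansion |a − b|^2 = |a|^2 + |b|^2 − 2 a·b, and the names are illustrative:

```python
import numpy as np

def match_features(desc1, desc2, ratio_thresh=0.6):
    """Match flattened descriptors (N1, D) against (N2, D) by SSD,
    keeping a match only when SSD(1-NN) / SSD(2-NN) < ratio_thresh
    (Lowe's ratio test). Returns a list of (i, j) index pairs."""
    # Pairwise SSD: |a - b|^2 = |a|^2 + |b|^2 - 2 a.b
    ssd = (np.square(desc1).sum(1)[:, None]
           + np.square(desc2).sum(1)[None, :]
           - 2.0 * desc1 @ desc2.T)
    matches = []
    for i, row in enumerate(ssd):
        nn = np.argsort(row)[:2]  # first and second nearest neighbors
        if row[nn[0]] / row[nn[1]] < ratio_thresh:
            matches.append((i, nn[0]))
    return matches
```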
After doing the above two steps, we are left with the following points:
The following image shows the correspondences of the left and right images:
Some of the points in one image do not have a corresponding point in the other image, resulting in a strange correspondence. The next step is to remove these outliers.
As a final step, we will perform RANSAC on our images to determine the final set of correspondence points. In each iteration, we choose a random subset of feature points, compute a homography matrix H from them, and map every feature point through H. We then calculate the SSD between H * (x, y) and the matched point (x', y') in the other image. Any point with an SSD less than some threshold is labeled an inlier, and we count the number of inliers.
Since there is randomness involved, we repeat this process for many iterations and keep the largest set of inliers found across all iterations. Using this set of inliers as our correspondences, we then use our code from part 6A of this project to warp the images.
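The full loop can be sketched as below (illustrative names, self-contained). This sketch samples the minimal 4 points per iteration, whereas the parameters reported next use 12:

```python
import numpy as np

def fit_h(p1, p2):
    """Least-squares homography from >= 4 correspondences (h33 = 1)."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(p1, p2):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def warp_pts(H, pts):
    p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]

def ransac_homography(pts1, pts2, n_iters=1000, n_sample=4, thresh=0.5, seed=0):
    """Repeatedly fit H to a random sample of matches and keep the
    largest inlier set (squared reprojection error below thresh)."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(pts1), bool)
    for _ in range(n_iters):
        idx = rng.choice(len(pts1), n_sample, replace=False)
        H = fit_h(pts1[idx], pts2[idx])
        err = np.square(warp_pts(H, pts1) - pts2).sum(axis=1)
        inliers = err < thresh
        if inliers.sum() > best.sum():
            best = inliers
    return best  # boolean mask over the input correspondences
```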
The images and points below use iterations = 1000 and number of points = 12 with a threshold of 0.5.
The images shown below are the images blended together using manual correspondences followed by the automatically generated correspondences.
A lot of the images are similar in quality. The biggest difference I saw was with the images of Leconte, at the very top of the blended image. The manual correspondences could have been chosen better to avoid the ghostly top edge of the building; the automatically generated correspondences avoid this issue, and the top edge of the building is very smooth. In addition, in the last set of UCBSO images, the bottommost red poster's text does not line up completely with the manual correspondences but lines up very well with the automatic alignment.
I learned that you can do a lot with very simple techniques such as SSDs or Lowe's ratio. It was really eye-opening to see all the steps help narrow down the correspondence points and produce images similar to the ones from the manually chosen points.