In this project, we use projective transformations to piece together separate pictures into cohesive mosaics.
I took some pictures. Here they are, with keypoints plotted.
We can formulate the problem of finding a projective transformation (homography) between two images as a least-squares problem with 8 unknowns: fixing the bottom-right entry of the 3x3 matrix to 1, each point correspondence contributes two linear equations. I used a least-squares solver (np.linalg.lstsq) to solve the resulting system. To test this out, I rectified two photos of the same subject taken from different angles, shown below.
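The least-squares setup above can be sketched as follows. This is a minimal illustration, not the exact code from the project; the function name `compute_homography` is my own, and the equations assume H[2,2] = 1 with 8 remaining unknowns:

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate a 3x3 homography H mapping src -> dst via least squares.

    src, dst: (N, 2) arrays of corresponding (x, y) points, N >= 4.
    Fixing H[2, 2] = 1 leaves 8 unknowns; each correspondence (x, y) ->
    (x', y') contributes two linear equations:
        x*h11 + y*h12 + h13 - x*x'*h31 - y*x'*h32 = x'
        x*h21 + y*h22 + h23 - x*y'*h31 - y*y'*h32 = y'
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

With more than 4 correspondences the system is overdetermined, and lstsq returns the solution minimizing the squared residual.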
We can create mosaics using the warp method we created above. First, we determine the final size of the mosaic we want. To do this, we forward warp the corners of each input image into the output space (the space of the middle image, in our application), then determine how big we need our output mosaic to be (using the min and max of the Xs and Ys). Then, we can perform inverse warping to fill in our mosaic based on bilinear interpolation of our input images. To properly patch together our images, I used Laplacian pyramids for blending.
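The warping step described above (forward-warp the corners to size the output, then inverse-warp with bilinear interpolation) can be sketched like this. It is a simplified single-channel version with hypothetical names; the real implementation also handles color channels and the Laplacian-pyramid blending:

```python
import numpy as np

def warp_image(img, H):
    """Inverse-warp a grayscale img by homography H (img coords -> output coords).

    Forward-warps the four corners through H to find the output bounding
    box, then fills each output pixel by bilinear interpolation at its
    inverse-mapped source location. Returns the warped image and the
    (x_min, y_min) offset for placing it into the mosaic canvas.
    """
    h, w = img.shape[:2]
    # forward-warp the corners (x, y, 1) to find the output extent
    corners = np.array([[0, 0, 1], [w - 1, 0, 1],
                        [0, h - 1, 1], [w - 1, h - 1, 1]], float).T
    warped = H @ corners
    warped = warped[:2] / warped[2]
    x_min, y_min = np.floor(warped.min(axis=1)).astype(int)
    x_max, y_max = np.ceil(warped.max(axis=1)).astype(int)

    # grid of output pixel coordinates, inverse-mapped into the source
    ys, xs = np.mgrid[y_min:y_max + 1, x_min:x_max + 1]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = np.linalg.inv(H) @ pts
    sx, sy = src[:2] / src[2]

    # bilinear interpolation; out-of-bounds samples become 0
    valid = (sx >= 0) & (sx <= w - 1) & (sy >= 0) & (sy <= h - 1)
    x0 = np.clip(np.floor(sx).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(sy).astype(int), 0, h - 2)
    fx, fy = sx - x0, sy - y0
    vals = (img[y0, x0] * (1 - fx) * (1 - fy) +
            img[y0, x0 + 1] * fx * (1 - fy) +
            img[y0 + 1, x0] * (1 - fx) * fy +
            img[y0 + 1, x0 + 1] * fx * fy)
    out = np.where(valid, vals, 0.0).reshape(ys.shape)
    return out, (x_min, y_min)
```

The offsets from all warped images determine the overall mosaic extent (via their min and max), and each warped image is pasted at its offset before blending.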
The first mosaic turned out better than I expected. My iPhone camera's exposure lock was imperfect, so there is still a slight difference in exposure between the left and right sides of the mosaic. I'm pretty happy with the second mosaic for the most part, despite the slight distortion along the bottom edge of the image. The third mosaic turned out slightly worse. I attribute the error to the fact that the keypoints I chose cover a relatively small proportion of the image, so small labelling errors are amplified. Takeaway: even a small amount of keypoint error can lead to noticeable misalignment. However, with an overdetermined system, we can average out the error and hopefully minimize visual artifacts and inconsistencies.
In Part A, we created mosaics by manually labelling keypoints on each of our images. In Part B, we identify those keypoints automatically. The first step is corner detection: below are the Harris corners detected on each of the three sets of images I captured.
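The Harris response can be computed with plain numpy. Here is a minimal sketch (my own simplified version, using a box window where a Gaussian window is more common in practice): build the structure tensor from image gradients, then score each pixel with det - k * trace^2:

```python
import numpy as np

def harris_response(img, k=0.04, r=2):
    """Harris corner response for a grayscale float image.

    Computes image gradients, averages the structure-tensor entries over
    a (2r+1)x(2r+1) window (box filter here; a Gaussian is typical),
    and returns R = det(M) - k * trace(M)^2. Large positive R = corner,
    negative R = edge, near-zero R = flat region.
    """
    Iy, Ix = np.gradient(img)
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy

    def window_sum_avg(a):
        # average a over a (2r+1)x(2r+1) neighborhood, zero-padded
        out = np.zeros_like(a)
        p = np.pad(a, r)
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                out += p[r + dy:r + dy + a.shape[0],
                         r + dx:r + dx + a.shape[1]]
        return out / (2 * r + 1) ** 2

    Sxx, Syy, Sxy = window_sum_avg(Ixx), window_sum_avg(Iyy), window_sum_avg(Ixy)
    return Sxx * Syy - Sxy ** 2 - k * (Sxx + Syy) ** 2
```

Corners are then taken as local maxima of this response above a threshold.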
To filter out the weaker corners, keeping only the relatively meaningful ones while also ensuring that the survivors are well distributed across the image, we use Adaptive Non-Maximal Suppression (ANMS) to keep only the top 500 corners for each image. Below are the results.
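ANMS can be sketched as follows (a simple O(N^2) version; the function name and the 0.9 robustness factor follow the Brown et al. formulation rather than this project's exact code). Each corner gets a suppression radius, the distance to the nearest significantly stronger corner, and we keep the corners with the largest radii:

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive Non-Maximal Suppression.

    coords: (N, 2) corner positions; strengths: (N,) Harris responses.
    For each corner i, the suppression radius is the distance to the
    nearest corner j with c_robust * strengths[j] > strengths[i].
    Keeping the n_keep corners with the largest radii yields a strong
    AND spatially well-distributed subset.
    """
    n = len(coords)
    radii = np.full(n, np.inf)   # globally strongest corners keep radius inf
    for i in range(n):
        stronger = strengths * c_robust > strengths[i]
        if stronger.any():
            d2 = np.sum((coords[stronger] - coords[i]) ** 2, axis=1)
            radii[i] = np.sqrt(d2.min())
    keep = np.argsort(-radii)[:n_keep]
    return coords[keep]
```

Note how this differs from simply taking the 500 strongest corners: a strong corner right next to an even stronger one gets a tiny radius and is suppressed, freeing slots for corners in sparser regions.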
In order to create correspondences between corners on pairs of images, we extract an 8x8 feature descriptor for each corner on each image. These 8x8 descriptors are subsampled from a larger 40x40 detection patch centered at each corner, which makes them robust to small misalignments. They are used in the matching step below.
Using the feature descriptors extracted in the previous step, we perform the following matching process, which also serves as outlier rejection: for each 8x8 feature in image A, we find the two closest 8x8 features in image B by squared distance. Letting d1 be the distance to the closest feature and d2 the distance to the second closest, we keep the correspondence to the closest feature only if the ratio d1/d2 is below a threshold (0.675, inspired by Brown et al.); otherwise we discard it. Below are the remaining corners after rejecting the outliers.
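The ratio-test matching can be sketched like this (a brute-force version with a hypothetical function name; descriptors are assumed to be flattened into row vectors):

```python
import numpy as np

def match_features(feats_a, feats_b, ratio_thresh=0.675):
    """Ratio-test matching between two descriptor sets.

    feats_a: (Na, D), feats_b: (Nb, D). For each descriptor in A, find
    its two nearest neighbours in B by squared distance; keep the match
    only if d1/d2 < ratio_thresh, i.e. the best match is clearly better
    than the runner-up. Returns a list of (index_in_a, index_in_b).
    """
    # pairwise squared distances, shape (Na, Nb)
    d2 = np.sum((feats_a[:, None, :] - feats_b[None, :, :]) ** 2, axis=2)
    matches = []
    for i, row in enumerate(d2):
        j1, j2 = np.argsort(row)[:2]
        if np.sqrt(row[j1] / row[j2]) < ratio_thresh:   # d1/d2 on distances
            matches.append((i, j1))
    return matches
```

The intuition (from Brown et al.) is that a correct match is usually much closer than the second-best candidate, while a spurious match tends to have several candidates at similar distances, pushing the ratio toward 1.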
Using RANSAC, we attempt to find the best homography for mapping each image in a set to the base image of that set. There are several hyperparameters for this process: the number of iterations, the number of correspondences sampled per iteration, and the inlier distance threshold.
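A minimal sketch of the RANSAC loop, assuming exact correspondences for inliers (function names and default hyperparameter values here are illustrative, not the project's actual settings):

```python
import numpy as np

def _fit_homography(src, dst):
    # least-squares homography with H[2, 2] fixed to 1 (8 unknowns)
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def ransac_homography(src, dst, n_iters=500, thresh=2.0, seed=0):
    """RANSAC over 4-point homographies.

    Each iteration samples 4 correspondences, fits an exact homography,
    and counts inliers whose reprojection error is below thresh pixels.
    The largest inlier set wins; the final homography is refit by least
    squares on all of its inliers.
    """
    rng = np.random.default_rng(seed)
    n = len(src)
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(n, 4, replace=False)
        H = _fit_homography(src[idx], dst[idx])
        proj = H @ np.column_stack([src, np.ones(n)]).T
        proj = (proj[:2] / proj[2]).T
        inliers = np.linalg.norm(proj - dst, axis=1) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return _fit_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

The final least-squares refit on the full inlier set is what ties this back to Part A: RANSAC picks which correspondences to trust, and the overdetermined solve averages out their remaining error.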
Now that we have automatically generated our keypoints for each image, we can use the same method as in Part A to create our mosaics.
I'm mostly surprised by how effective simple corner detection can be when you apply the right metrics for choosing the keypoints. Without any sort of outlier rejection, I'm sure I would not have been able to generate satisfactory results. With a combination of adaptive non-maximal suppression and RANSAC, though, the resulting mosaics are better than the ones created from manually chosen keypoints. For example, for the Evans mosaic, the angles look more realistic in the automatic mosaic than in the manual mosaic. For the Hearst Mining building, the alignment is noticeably better.
| Mosaic Name | Manual Keypoints | Automatic Keypoints |
| --- | --- | --- |