I shot 3 sets of 2 left-to-right pictures: by the beach, at Caffe Strada, and at my desk. I rotated the camera significantly for the beach pictures and zoomed slightly for the desk pictures. My final projection code uses one image as the reference and projects the other onto its space, so for each projected result later in this project to fit, I had to add a significant amount of right and/or bottom padding to the second image in each set.
I used ginput in a separate Python script to display both images in a set and select distinguishing points. I selected 8 points from each image to ensure an overdetermined system when solving for the homography matrix. Then, I wrote a function that takes in correspondences (p, p') and solves for the homography matrix mapping p to p'. Given the matrix of (x, y) coordinates, I set up the matrix of coefficients that multiplies the unknown homography entries for each correspondence, and used SVD, taking the right singular vector associated with the smallest singular value as the least-squares solution for those entries. Since this comes out as a 9-element vector, the last step was to reshape it into a (3, 3) homography matrix.
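The solve described above can be sketched as follows (a minimal version assuming NumPy arrays of corresponding points; the function and variable names are illustrative, not the actual project code):

```python
import numpy as np

def compute_homography(p, p_prime):
    """Solve for H mapping p -> p_prime via the direct linear transform.

    p, p_prime: (N, 2) arrays of corresponding (x, y) points, N >= 4
    (8 points per image, as above, gives an overdetermined system).
    """
    A = []
    for (x, y), (xp, yp) in zip(p, p_prime):
        # Two linear constraints on the 9 homography entries per point pair.
        A.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y, -xp])
        A.append([0, 0, 0, x, y, 1, -yp * x, -yp * y, -yp])
    A = np.array(A)
    # Right singular vector with the smallest singular value is the
    # least-squares solution to Ah = 0.
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize so the bottom-right entry is 1
```

For a pure translation between the point sets, this recovers the expected translation homography up to numerical precision.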
Next, I calculated the homography matrix mapping the right image into the left image's point of view, and used it to warp the right image to that view. I created a grid of all pixel coordinates in the right image, applied the homography transform to get the corresponding coordinates in the target space, and used a map function to transfer each pixel value over. Although the second image was padded to be much bigger, you can see that similar landmarks correspond to similar coordinates between the 2 images.
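The warping step can be sketched as follows. This version uses inverse mapping (sampling each target pixel back into the source via the inverse homography, which avoids holes) rather than the forward grid transfer described above; names and the output shape parameter are illustrative:

```python
import numpy as np

def warp_image(img, H, out_shape):
    """Warp img into an out_shape canvas using homography H (source -> target)."""
    H_out, W_out = out_shape[:2]
    # Grid of all target pixel coordinates, in homogeneous form.
    ys, xs = np.mgrid[0:H_out, 0:W_out]
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    # Map target coordinates back into the source image with H^{-1}.
    src = np.linalg.inv(H) @ coords
    src /= src[2]
    sx = np.round(src[0]).astype(int)  # nearest-neighbor sampling
    sy = np.round(src[1]).astype(int)
    # Keep only coordinates that land inside the source image.
    valid = (sx >= 0) & (sx < img.shape[1]) & (sy >= 0) & (sy < img.shape[0])
    out = np.zeros((H_out, W_out) + img.shape[2:], dtype=img.dtype)
    out[ys.ravel()[valid], xs.ravel()[valid]] = img[sy[valid], sx[valid]]
    return out
```

With the identity homography this reduces to copying the image, which makes a convenient sanity check.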
I verified my warping function by rectification: I selected the 4 corners of a rectangular region on the right wall of this street image, then made a copy of those points in which the top-right and bottom-left corners were snapped onto the axis-aligned rectangle defined by the top-left and bottom-right corners, recreating what the selection would look like if the right wall were parallel to the camera. Then, I calculated the homography matrix from the imaginary parallel coordinates to the actual points I selected, and used it to warp the original image to that imaginary point of view. In the result, the right wall successfully appears parallel to the camera.
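The corner-snapping step can be sketched as follows (a hypothetical helper matching the description above: the clicked top-left and bottom-right points are kept, and the other two corners are derived from their diagonal):

```python
import numpy as np

def rectified_corners(pts):
    """Given 4 clicked corners (TL, TR, BR, BL) of a planar rectangle,
    snap them to the axis-aligned rectangle spanning the TL->BR diagonal."""
    tl, tr, br, bl = np.asarray(pts, dtype=float)
    # Keep TL and BR; make TR and BL axis-aligned with that diagonal.
    new_tr = np.array([br[0], tl[1]])
    new_bl = np.array([tl[0], br[1]])
    return np.stack([tl, new_tr, br, new_bl])
```

The homography from these snapped points to the clicked points then defines the rectifying warp.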
I got a similar result for a second image, rectifying my computer screen to be parallel to the camera view.
I first tried a simple alpha blending method, but it produced significant ghosting, so I reused the Laplacian pyramid code from Project 2 and was able to make the mosaics below out of each set of 2 images.
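For reference, the pyramid blend can be sketched like this. This is a simplified Laplacian-stack version using Gaussian blurs, not the actual Project 2 code; `levels` and `sigma` are illustrative parameters:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def laplacian_blend(im1, im2, mask, levels=4, sigma=2.0):
    """Blend im1/im2 across a soft seam using Laplacian stacks."""
    def stacks(im):
        # Gaussian stack: repeatedly blurred copies at full resolution.
        gauss = [im.astype(float)]
        for _ in range(levels):
            gauss.append(gaussian_filter(gauss[-1], sigma))
        # Laplacian levels are differences of adjacent Gaussian levels,
        # plus the final low-pass residual.
        lap = [g0 - g1 for g0, g1 in zip(gauss[:-1], gauss[1:])]
        return lap + [gauss[-1]]

    # Blur the mask progressively so coarse levels blend over wider seams.
    masks = [mask.astype(float)]
    for _ in range(levels):
        masks.append(gaussian_filter(masks[-1], sigma))
    return sum(mk * l1 + (1 - mk) * l2
               for mk, l1, l2 in zip(masks, stacks(im1), stacks(im2)))
```

Blending each frequency band with a progressively softer mask is what suppresses the ghosting that a single global alpha mask leaves behind.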
The most interesting thing I learned was that, as with the rectified street picture and even my images after warping, applying a transform to change the point of view of an image can yield a completely natural-looking image from the shifted POV. I had assumed that parts at the extremes of the image would appear squished or distended.
First, here are the detected Harris corners for one of my images, and the same corners after Adaptive Non-Maximal Suppression (ANMS). You can see that after ANMS, the corners are spread more thinly while remaining roughly uniform relative to the original points.
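The ANMS step can be sketched as follows (a vectorized version assuming corner positions and Harris response strengths are already computed; `c_robust`, `n_keep`, and the other names are illustrative):

```python
import numpy as np

def anms(corners, strengths, n_keep=500, c_robust=0.9):
    """Adaptive Non-Maximal Suppression over detected Harris corners.

    corners: (N, 2) array of (x, y); strengths: (N,) corner responses.
    Each corner's suppression radius is its distance to the nearest
    corner that is sufficiently stronger; keep the n_keep corners with
    the largest radii, which yields a spatially even selection.
    """
    xy = np.asarray(corners, dtype=float)
    s = np.asarray(strengths, dtype=float)
    # Pairwise squared distances between all corners.
    d2 = ((xy[:, None, :] - xy[None, :, :]) ** 2).sum(-1)
    # Corner j suppresses corner i when c_robust * s[j] > s[i].
    stronger = c_robust * s[None, :] > s[:, None]
    d2 = np.where(stronger, d2, np.inf)
    radii = d2.min(axis=1)  # inf for corners with no stronger neighbor
    keep = np.argsort(-radii)[:n_keep]
    return xy[keep]
```

Corners that nothing suppresses (the strongest responses) get infinite radii and are always kept first, which is why the result stays well spread without discarding the dominant features.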