If you have multiple images from the same XYZ location, but different rotations, you can combine the images using a projective transform.
For a projective transform, given initial coordinates \(x\) and \(y\), we want to calculate a matrix \(H\) such that $$ \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} wx' \\ wy' \\ w \end{bmatrix}$$ That gives us the equations $$ax + by + c = wx'$$ $$dx + ey + f = wy'$$ $$gx + hy + 1 = w$$ \(w\) is not a variable we want to solve for (since it is not an element of the homography), so we substitute \(w = gx + hy + 1\) to turn the three equations above into two: $$ax + by + c - gxx' - hyx' - x' = 0$$ $$dx + ey + f - gxy' - hyy' - y' = 0$$ If we treat the bottom-right entry \(i\) as a ninth unknown, we also want it to equal 1, so we add a \(1000i = 1000\) constraint (scaled by 1000 so that it is sufficiently weighted against the other pixel-valued constraints), and then use least squares on a sufficient number of point correspondences (at least 4; I used 6 points each time).
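In code, the setup above might look something like this (a sketch using np.linalg.lstsq; for simplicity it fixes the bottom-right entry to 1 directly rather than adding the weighted \(1000i = 1000\) row, which is equivalent in the limit):

```python
import numpy as np

def compute_homography(src, dst):
    """Least-squares homography mapping src points to dst points.
    src, dst: (N, 2) arrays of (x, y) coordinates, N >= 4."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # ax + by + c - g*x*x' - h*y*x' = x'
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        b.append(xp)
        # dx + ey + f - g*x*y' - h*y*y' = y'
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.append(yp)
    params, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    # Eight solved parameters plus the fixed bottom-right 1
    return np.append(params, 1).reshape(3, 3)
```

With 6 correspondences this gives a 12×8 system, which lstsq solves in the least-squares sense.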
Once we have a homography \(H\) converting points from image 1 to image 2, we can perform the warp: for each pixel of the output image, apply the inverse homography to find where it came from in the source image, then sample the source there with interpolation (I used scipy.interpolate.RectBivariateSpline).
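An inverse-warp sketch along those lines (assuming a grayscale float image; pixels that map outside the source are zeroed):

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

def warp_image(img, H, out_shape):
    """Inverse-warp a grayscale image by homography H.
    Every output pixel is mapped back through H^-1 and interpolated."""
    h_out, w_out = out_shape
    # Homogeneous coordinates of every output pixel
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    # Map output coordinates back into the source image
    src = np.linalg.inv(H) @ coords
    src_x = src[0] / src[2]
    src_y = src[1] / src[2]
    # Spline over the source grid; evaluated at the back-mapped points
    spline = RectBivariateSpline(np.arange(img.shape[0]),
                                 np.arange(img.shape[1]), img)
    out = spline.ev(src_y, src_x).reshape(h_out, w_out)
    # Zero out samples that fall outside the source image
    valid = ((src_x >= 0) & (src_x <= img.shape[1] - 1) &
             (src_y >= 0) & (src_y <= img.shape[0] - 1)).reshape(h_out, w_out)
    return out * valid
```

For a color image you would run this once per channel (or reuse one spline per channel).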
Once you have a warped image, you can combine it with the image it was warped to and create a mosaic! Here are some examples:
There are some issues with the borders here: I used a simple pixel-by-pixel max blend, which worked pretty well except for the chairs in the Soda 6th floor image and the portrait in the pool table image.
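The max blend itself is tiny; a sketch (the name max_blend is mine, and it assumes grayscale images anchored at the top-left of a shared canvas):

```python
import numpy as np

def max_blend(warped, base):
    """Combine two images by per-pixel maximum on a shared canvas."""
    h = max(warped.shape[0], base.shape[0])
    w = max(warped.shape[1], base.shape[1])
    canvas_a = np.zeros((h, w))
    canvas_b = np.zeros((h, w))
    canvas_a[:warped.shape[0], :warped.shape[1]] = warped
    canvas_b[:base.shape[0], :base.shape[1]] = base
    # Keep the brighter pixel wherever the images overlap
    return np.maximum(canvas_a, canvas_b)
```

Taking the maximum hides the black borders of the warped image, but it also lets bright objects from either image bleed through in the overlap, which is exactly the chair/portrait artifact mentioned above.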
I learned a lot about homographies and transforms, especially how to solve for a homography using least squares.
Selecting points by hand is cumbersome; wouldn't it be nice to do this automatically? One way is to find shared features of each image. One easy feature to look for is corners: the Harris corner detector detects corners of any angle in an image. Read up on it here. Here are some corners of an image:
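A sketch of the detection step using skimage's corner_harris and corner_peaks (the min_distance and threshold_rel values here are illustrative choices, not tuned):

```python
import numpy as np
from skimage.feature import corner_harris, corner_peaks

def get_corners(img, min_distance=5, threshold_rel=0.1):
    """Detect Harris corners in a grayscale float image.
    Returns an (N, 2) array of (row, col) coordinates."""
    response = corner_harris(img)
    # Keep local maxima of the response, at least min_distance apart
    return corner_peaks(response, min_distance=min_distance,
                        threshold_rel=threshold_rel)
```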
So many corners! But let's say we have two sets of corners from two images we want to align. First, we'll want our corners to be roughly evenly spaced throughout the image. You can achieve this with Adaptive Non-Maximal Suppression from the MOPS paper. Here is the code, vectorized!
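A vectorized ANMS sketch along the lines of the MOPS paper (the 0.9 robustness factor follows the paper; the function and parameter names are mine):

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive Non-Maximal Suppression, vectorized.
    coords: (N, 2) corner coordinates; strengths: (N,) Harris responses.
    Keeps the n_keep corners with the largest suppression radii."""
    # Pairwise squared distances between all corners
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = (diff ** 2).sum(-1)
    # Corner j "dominates" corner i if strengths[i] < c_robust * strengths[j]
    dominated = strengths[:, None] < c_robust * strengths[None, :]
    # Radius of i = distance to its nearest dominating corner
    dist2 = np.where(dominated, dist2, np.inf)
    radii = dist2.min(axis=1)
    # The strongest corner has an infinite radius and is always kept
    keep = np.argsort(radii)[::-1][:n_keep]
    return coords[keep]
```

The O(N²) distance matrix is the price of the vectorization; for a few thousand corners it fits in memory comfortably.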
I wrote up an explanation on Piazza that doesn't quite explain how to do this (so as not to give away answers). After ANMS, you extract a feature descriptor for each corner (a zero-mean, unit-variance 8x8 patch sampled every fifth pixel around the corner). Then we compare all pairs of corners between the images and keep the best pairs, but only those where the best match is "notably" the best: its distance score is at most around half that of the next-best match. This was called 1-NN/2-NN in class. After this, we run RANSAC: repeatedly pick 4 random pairs (from the surviving pairs), count how many "inliers" agree with the \(H\) matrix determined by those 4 (well-matched) pairs, remember the largest set of inliers, and compute the final \(H\) from that set. Voila!
This leads to some very nicely aligned images (still using np.maximum for the seam):
Notice how much better aligned the last image is!
I learned that empirical parameters matter a lot: tuning the ANMS and 1-NN/2-NN hyperparameters (as well as the margin and spacing on the Harris corners) makes a big difference here, and a general-purpose program would probably want to try multiple settings.