CS194-26 Project 6
Joey Barreto
Project 6a: Image Warping and Mosaicing
When a sequence of images is taken about a fixed point (like different views from a tripod), the images may overlap if one is trying to construct a panoramic picture. There will be features shared across the pictures, but they will each look different due to the different viewing angles. It is possible to warp any image into any other's perspective using a projective transformation, also called a homography, provided the two images share the same center of projection (i.e. the camera didn't translate between shots). Once corresponding feature points are chosen between two images, the homography relating them can be recovered; the image below shows the form of the homography matrix:

The matrix has 8 unknowns (its bottom-right entry is fixed at 1), and each pair of corresponding points gives us 2 equations, so at least 4 pairs are needed to determine the system; preferably more are given to stabilize the least-squares solution. Since the camera settings often change automatically to some extent between shots, the warped image will need to be blended into the image whose perspective we kept in order to hide seam artifacts.
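The least-squares setup can be sketched in Python as follows. This is an illustrative reconstruction, not the project's actual code: the bottom-right entry of the matrix is fixed at 1, and `numpy.linalg.lstsq` stands in for whatever solver was used.

```python
import numpy as np

def compute_homography(pts1, pts2):
    """Least-squares homography mapping pts1 -> pts2.

    pts1, pts2: (n, 2) arrays of corresponding (x, y) points, n >= 4.
    Fixing h33 = 1 leaves 8 unknowns; each point pair contributes the
    two equations obtained by expanding (x', y', 1) ~ H (x, y, 1).
    """
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    n = len(pts1)
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (xp, yp)) in enumerate(zip(pts1, pts2)):
        A[2 * i]     = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        b[2 * i], b[2 * i + 1] = xp, yp
    # Over-determined when n > 4: least squares stabilizes the solution.
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

With exactly four pairs in general position the system is solved exactly; with more pairs the residual is minimized in the least-squares sense.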


Rectification
Given an image, one can choose any planar shape or object viewed in perspective and warp its corners to a head-on view (hence the name: quadrilaterals become rectangles). The recovered matrix is then applied to the rest of the image to rectify it. Below are two examples: the left portal of the Japanese lantern and the 18th calendar day are rectified.
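Applying the matrix to the rest of the image amounts to an inverse warp: for each output pixel, the inverse homography gives the source location to sample. A grayscale sketch, assuming `scipy.ndimage.map_coordinates` for the bilinear interpolation (not necessarily what the project used):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_image(img, H, out_shape):
    """Inverse-warp a grayscale image by the homography H.

    H maps source-image coordinates (x, y, 1) to output coordinates, so
    for every output pixel we apply H^-1 to find where to sample, then
    interpolate bilinearly (zeros outside the source image).
    """
    Hinv = np.linalg.inv(H)
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # (3, h*w)
    src = Hinv @ coords
    sx = (src[0] / src[2]).reshape(h, w)
    sy = (src[1] / src[2]).reshape(h, w)
    # map_coordinates indexes as (row, col) = (y, x)
    return map_coordinates(img, [sy, sx], order=1, cval=0.0)
```

Inverse warping avoids the holes that forward-mapping source pixels into the output would leave.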


before image

rectified image

before image

rectified image

Warping and Blending
Here three pairs of images taken at different but overlapping viewing angles are combined. In each case, the right image is warped into the left image's perspective, and then linear blending is used to hide the edge artifacts. The tree panorama came out a bit off because it was hard to pick out corresponding points on the bark.
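The linear blending step can be sketched as a column-wise alpha ramp across the overlap region. This is a simplified grayscale version under the assumption that both images have already been warped onto a common canvas; the project's actual feathering may differ.

```python
import numpy as np

def linear_blend(canvas1, canvas2):
    """Blend two same-size grayscale canvases, each holding one warped
    image (zeros elsewhere). In the overlap, the weight on canvas1 ramps
    linearly from 1 at the left edge to 0 at the right edge; everywhere
    else the non-zero canvas wins.
    """
    m1, m2 = canvas1 > 0, canvas2 > 0
    overlap = m1 & m2
    out = np.where(m1, canvas1, canvas2).astype(float)
    if overlap.any():
        cols = np.where(overlap.any(axis=0))[0]
        c0, c1 = cols.min(), cols.max()
        alpha = np.zeros(canvas1.shape[1])
        alpha[c0:c1 + 1] = np.linspace(1.0, 0.0, c1 - c0 + 1)  # left weight
        a = np.broadcast_to(alpha, canvas1.shape)
        out[overlap] = (a * canvas1 + (1 - a) * canvas2)[overlap]
    return out
```

The ramp hides exposure differences at the seam that a hard cut would expose.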

left image

right image

panorama

left image

right image

panorama

left image

right image

panorama



Summary
I really enjoy the fact that homographies can undo sheared patterns in paintings that might otherwise be extremely hard to visualize.
Project 6b: Image Warping and Mosaicing
In part 6a, the feature points in each image had to be chosen manually by the user. In this part of the project, the points are selected automatically, allowing for effortless panorama stitching.
Detecting Corner Features
First, for each pixel we ask how a patch centered on that pixel changes when compared against a shifted version of the same patch.
S(x,y) is this comparison for a shift of (x,y) about the patch centered at (u,v), and it can be approximated by the second equation.


A is known as the Harris matrix. Patches for which S varies strongly under shifts in all directions can be expected to contain some kind of corner; such pixels are called interest points. The condition for being an interest point can be re-expressed as having a large corner response R = det(A) / tr(A).
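The response map can be computed densely from smoothed image gradients. A sketch, where sigma and the small epsilon guarding the division are my own illustrative choices:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(img, sigma=1.0):
    """Dense corner response R = det(A) / tr(A) from the Harris matrix A.

    A is the Gaussian-weighted structure tensor of the image gradients
    at each pixel, built from the smoothed products Ix^2, Iy^2, Ix*Iy.
    """
    Iy, Ix = np.gradient(img.astype(float))
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    tr = Sxx + Syy
    return det / (tr + 1e-10)  # flat regions have tr ~ 0
```

Edges have a large trace but near-zero determinant, so R stays small there; only corner-like patches score highly.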

The first row of images below contains the interest points for two images that we intend to warp. The points displayed are local maxima of R within a 1-pixel radius. We only need a few good matches, and many of the interest points are not strong local maxima, so we threshold the interest points, keeping only those with a response above 0.1. The remaining points are shown in the second row.

We choose the best k local maxima via a method called Adaptive Non-Maximal Suppression (ANMS). For each point, we find the minimum distance to any other point whose response, after being scaled down by a factor c_robust, still exceeds its own. This minimum distance is called the suppression radius (see the equation below); c_robust was set to 0.9, and the points with the k largest suppression radii were kept. A point with a large suppression radius is far from every point with a substantially larger response, and is thus a good local maximum. For each image in the third row, the 500 best points are displayed.
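In code, a brute-force O(n^2) version of ANMS might look like this (a sketch; the project's implementation may differ):

```python
import numpy as np

def anms(points, responses, k=500, c_robust=0.9):
    """Adaptive Non-Maximal Suppression.

    The suppression radius of point i is its distance to the nearest
    point j that dominates it, i.e. responses[i] < c_robust * responses[j].
    The indices of the k points with the largest radii are returned.
    """
    points = np.asarray(points, dtype=float)
    responses = np.asarray(responses, dtype=float)
    n = len(points)
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    radii = np.full(n, np.inf)  # undominated points keep radius = infinity
    for i in range(n):
        dominating = responses[i] < c_robust * responses
        if dominating.any():
            radii[i] = np.sqrt(d2[i, dominating].min())
    return np.argsort(-radii)[:k]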

Left image Harris points, n=12573

Right image Harris points, n=16028

Left image Harris points after cutoff, n=1904

Right image Harris points after cutoff, n=3394

Left image cutoff points after ANMS, n=500

Right image cutoff points after ANMS, n=500

Extracting and Matching Feature Descriptors
For each of the remaining interest points, a feature vector needs to be extracted. The vector is an 8x8 patch of pixels sampled from a 40x40 window centered on the point. To avoid aliasing, the 8x8 patch was taken from a downsampled and interpolated version of the image, so that the 8x8 patch covered the region originally spanned by the 40x40 window. Each feature was bias/gain normalized. Then each patch in the first image is compared to every patch in the second, with the distance between patches given by the sum of squared differences (SSD). Following the MOPS paper, only matches where the ratio of the 1-NN distance to the 2-NN distance was sufficiently low were kept; the chosen cutoff was 0.6. Below, two matched interest points that satisfied the cutoff are displayed along with their feature vectors.
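The SSD matching with the 1-NN/2-NN ratio test can be sketched as follows (descriptor extraction is omitted; the sketch assumes the descriptors are already flattened and bias/gain normalized):

```python
import numpy as np

def match_features(desc1, desc2, ratio_thresh=0.6):
    """Match descriptors by SSD with the nearest-neighbor ratio test.

    desc1: (n1, d) and desc2: (n2, d) descriptor arrays. A match (i, j)
    is kept only when 1-NN SSD / 2-NN SSD < ratio_thresh, i.e. the best
    match is much better than the runner-up.
    """
    matches = []
    for i, f in enumerate(desc1):
        ssd = ((desc2 - f) ** 2).sum(axis=1)
        order = np.argsort(ssd)
        best, second = order[0], order[1]
        if ssd[best] < ratio_thresh * ssd[second]:
            matches.append((i, int(best)))
    return matches
```

The ratio test discards ambiguous features (e.g. repeated texture) even when their absolute SSD is small.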


Two matched feature points and their corresponding feature vectors

Recovering the Homography
The last step is to implement the Random Sample Consensus (RANSAC) algorithm to recover the best possible homography from the matched interest points. Four random pairs of matched points are chosen and their homography is computed. This homography is used to map the remaining points of one image into the other, and the distance between each mapped point and its match is measured; if the error is low, the pair is counted as an inlier. This process is repeated a large number of times, and the largest set of inlier points is kept. The final homography is recomputed on this inlier set and used to warp one image into the other's perspective. I ran 15000 iterations and counted a pair as an inlier if its error was less than 20 (error was also computed with SSD).
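A compact version of this loop, with the least-squares homography fit inlined. The defaults here are scaled-down illustrative choices (as noted above, the project ran 15000 iterations with an error cutoff of 20), and the fixed seed is only for reproducibility of the sketch:

```python
import numpy as np

def fit_homography(p1, p2):
    # Least-squares homography mapping p1 -> p2, bottom-right entry fixed at 1.
    n = len(p1)
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (xp, yp)) in enumerate(zip(p1, p2)):
        A[2 * i]     = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        b[2 * i], b[2 * i + 1] = xp, yp
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def ransac_homography(pts1, pts2, n_iters=1000, thresh=20.0, seed=0):
    """RANSAC: fit H to 4 random pairs, count pairs whose SSD reprojection
    error is below thresh, keep the largest inlier set, and refit on it."""
    pts1, pts2 = np.asarray(pts1, float), np.asarray(pts2, float)
    n = len(pts1)
    P1 = np.hstack([pts1, np.ones((n, 1))])  # homogeneous source points
    rng = np.random.default_rng(seed)
    best = np.zeros(n, dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(n, size=4, replace=False)
        H = fit_homography(pts1[idx], pts2[idx])
        proj = P1 @ H.T
        proj = proj[:, :2] / proj[:, 2:3]
        err = ((proj - pts2) ** 2).sum(axis=1)   # SSD per matched pair
        inliers = err < thresh
        if inliers.sum() > best.sum():
            best = inliers
    return fit_homography(pts1[best], pts2[best]), best
```

Because a single bad match pulls the least-squares fit far off, only samples made entirely of correct matches accumulate large inlier sets, so the outliers are rejected automatically.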


Left image inlier points, n=31

Right image inlier points, n=31

The images stitched in part 6a are compared below to their automatically aligned counterparts. All parameters were the same as above, except that the second pair used cutoff = 0.05 and 1-NN/2-NN threshold = 0.8, and the third pair used cutoff = 0.25 and 1-NN/2-NN threshold = 0.8.


Manual

Automatic

Manual

Automatic

Manual

Automatic

Summary
I think it's just generally cool that this all can be done automatically. I took these panoramic images a long time ago, knowing that I had software on my computer that I could use to stitch them together. I never thought that years later I would learn to implement what that software was doing!