Stitching Photo Mosaics

Image Warping, Mosaicing, and Automation

Gautam Mittal

Scroll down for Project 4B.

Shooting Scenes

Chemistry Library at UC Berkeley.
The interior walls of the Panoramic Berkeley apartment complex.
Philz Coffee at 1 Front Street, San Francisco, CA.

Recover Homographies

The goal of computing a homography is to solve the equation p' = Hp, where p is a set of points in one image and the homography matrix H maps those points to the corresponding points p' in another image. Since H is 3x3, it nominally has 9 unknowns; however, a homography is only defined up to scale, so one entry (h33) can be fixed to 1 for convenience, leaving 8 unknowns. The following system of equations was solved with 6-8 correspondences per pair of images in the experimental results.

System of equations solved to find a homography matrix H given a pair of images and their aligned correspondences. In experiments, we made this an overdetermined system by providing more than 4 correspondences (to deal with noise) and then solving with least squares. h33 is a free parameter that can be set to 1 for convenience.
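The least-squares solve can be sketched as follows (a minimal sketch; the function name and point format are illustrative, not from the original code):

```python
import numpy as np

def compute_homography(pts, pts_prime):
    """Solve p' = Hp in least squares, fixing h33 = 1.

    pts, pts_prime: (N, 2) arrays of corresponding (x, y) points, N >= 4.
    Returns the 3x3 homography matrix H.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts, pts_prime):
        # Each correspondence contributes two equations in the 8 unknowns
        # (h11..h32), obtained by clearing the homogeneous denominator.
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

With more than 4 correspondences the system is overdetermined, and `lstsq` returns the H minimizing the squared reprojection residual.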

Warp the Images and Image Rectification

Inverse warping with the H matrix was used to generate the new scene geometry, and below are some examples of warping being used to rectify an image such that a planar surface is frontal-parallel.

Left: Philz Coffee at 1 Front St, San Francisco. Right: Rectified image of painting in the original frame.
Left: UC Berkeley Chemistry Library. Right: Rectified image of tile on the ground.
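The inverse warp can be sketched as below (a minimal sketch assuming H maps source to destination coordinates, with nearest-neighbor sampling for simplicity; a real implementation would interpolate):

```python
import numpy as np

def inverse_warp(img, H, out_shape):
    """Warp img with homography H using inverse warping.

    Each destination pixel is pulled back through H^-1 and sampled
    from the source image (nearest neighbor); pixels that map outside
    the source stay zero.
    """
    h_out, w_out = out_shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    dest = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = np.linalg.inv(H) @ dest
    src = src[:2] / src[2]                      # de-homogenize
    sx = np.round(src[0]).astype(int)
    sy = np.round(src[1]).astype(int)
    valid = (sx >= 0) & (sx < img.shape[1]) & (sy >= 0) & (sy < img.shape[0])
    out = np.zeros((h_out, w_out) + img.shape[2:], dtype=img.dtype)
    out[ys.ravel()[valid], xs.ravel()[valid]] = img[sy[valid], sx[valid]]
    return out
```

For rectification, H is computed from correspondences between the corners of the planar surface and a hand-chosen frontal-parallel rectangle.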

Blend Images Into a Mosaic

Given a sequence of images, I first set the canvas coordinate system to be a translated version of the center image in the sequence. I then warped each image in the sequence and placed it on the canvas. To blend smoothly between images, I computed the intersecting area between a warped image and the rest of the canvas and used the line splitting the middle of that area as the blend axis between the two images. Blurring the mask with a Gaussian filter along this axis produced a smooth transition between images. Below are results on three different scenes.

Philz Coffee at 1 Front Street, San Francisco, CA.
Interior walls of the Panoramic Berkeley apartment complex.
View of the Chemistry buildings from the entrance to Lewis Hall, UC Berkeley.
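The blending step above can be approximated as follows (a sketch, not the exact mid-line split described above: it blurs a hard occupancy mask with a Gaussian to get a soft alpha near the seam; function name and sigma are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def seam_blend(canvas, warped, mask_canvas, mask_warped, sigma=10):
    """Blend a warped image onto the canvas with a soft seam.

    mask_* are boolean occupancy masks of the same shape as the images.
    A hard mask for the warped image is Gaussian-blurred so the alpha
    ramps smoothly across the overlap, then clamped so regions covered
    by only one image keep that image exactly.
    """
    alpha = gaussian_filter(mask_warped.astype(float), sigma)
    alpha = np.where(mask_warped & ~mask_canvas, 1.0, alpha)
    alpha = np.where(mask_canvas & ~mask_warped, 0.0, alpha)
    return alpha * warped + (1 - alpha) * canvas
```

Only the overlap region ends up with fractional alpha, which is what makes the transition between images appear smooth.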

Lessons Learned

I learned how to effectively leverage homographies to automatically stitch images with manually assigned visual correspondences, and that the algorithm for doing this is surprisingly simple! I also learned a lot about the algorithm's sensitivity to changes in camera settings and to misaligned correspondences across images (it is not very robust to noisy correspondences or images at this point).



Project 4B: Autostitching

Harris Interest Point Detection

Using the starter code provided, we collect a large number of candidate points of interest.

Above: Harris corner points with min_distance=1, edge_discard=20 on the first image from Philz. Below: Harris corner points on the second image from Philz.
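The detection step can be sketched with scikit-image (a sketch, not the actual starter code; the threshold and sigma values are illustrative assumptions):

```python
import numpy as np
from skimage.feature import corner_harris, peak_local_max

def harris_points(gray, min_distance=1, edge_discard=20):
    """Candidate interest points from the Harris corner response.

    Returns (coords, strengths); coords are (row, col) pairs. The
    min_distance / edge_discard parameters mirror the caption above.
    """
    response = corner_harris(gray, method='eps', sigma=1)
    # threshold_rel keeps only reasonably strong local maxima (assumed value).
    coords = peak_local_max(response, min_distance=min_distance,
                            threshold_rel=0.2)
    # Discard points too close to the image border.
    h, w = gray.shape
    keep = ((coords[:, 0] >= edge_discard) & (coords[:, 0] < h - edge_discard) &
            (coords[:, 1] >= edge_discard) & (coords[:, 1] < w - edge_discard))
    coords = coords[keep]
    return coords, response[coords[:, 0], coords[:, 1]]
```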

Adaptive Non-Maximum Suppression

Using the ANMS technique presented in the MOPS paper, I compute the minimum suppression radius for each interest point with c_robust=0.9. The points are then sorted by their minimum suppression radius, and the top 500 points are kept.

The minimum suppression radius for an interest point x_i is defined as r_i = min ||x_i - x_j|| over all points x_j satisfying f(x_i) < c_robust * f(x_j), where f denotes the Harris corner strength.

Above: ANMS result (top 500 points) with c_robust=0.9 on first image. Below: ANMS result (top 500 points) with c_robust=0.9 on second image.
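The selection can be sketched as an O(N^2) pass (a minimal sketch; the function name and array formats are illustrative):

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive non-maximal suppression, MOPS-style.

    For each point, the suppression radius is the distance to the
    nearest point that is sufficiently stronger, i.e. any x_j with
    f(x_i) < c_robust * f(x_j). The n_keep points with the largest
    radii are kept, giving a spatially well-distributed set.
    """
    n = len(coords)
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    radii = np.full(n, np.inf)        # global maximum keeps radius = inf
    for i in range(n):
        stronger = strengths[i] < c_robust * strengths
        if stronger.any():
            radii[i] = np.sqrt(d2[i, stronger].min())
    order = np.argsort(radii)[::-1]   # largest radius first
    return coords[order[:n_keep]]
```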

Feature Extraction and Matching

Feature descriptors are generated by taking a 40x40 window around each point of interest returned by ANMS, blurring the window with a Gaussian filter, and then downsampling to 8x8. Each feature descriptor is standardized to have 0 mean and 1 variance. To match descriptors across images, we compute the SSD between each descriptor and all descriptors in the other image. We use Lowe's trick and only keep a match if the ratio between the 1-NN error and 2-NN error is less than 0.3.

Matched points between the two images.
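The descriptor and matching steps can be sketched as below (a sketch; the blur sigma and small epsilon constants are illustrative assumptions, and points are assumed far enough from the border, which the edge-discard step guarantees):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def describe(img, points, win=40, out=8):
    """8x8 descriptors from blurred 40x40 windows, bias/gain normalized."""
    step = win // out                        # sample every 5th pixel
    blurred = gaussian_filter(img, sigma=2)  # assumed blur strength
    descs = []
    for r, c in points:
        patch = blurred[r - win // 2 : r + win // 2 : step,
                        c - win // 2 : c + win // 2 : step]
        # Standardize to zero mean, unit variance.
        patch = (patch - patch.mean()) / (patch.std() + 1e-8)
        descs.append(patch.ravel())
    return np.array(descs)

def match(d1, d2, ratio=0.3):
    """Lowe's ratio test on SSD: keep i -> j if 1-NN/2-NN error < ratio."""
    ssd = ((d1[:, None, :] - d2[None, :, :]) ** 2).sum(-1)
    matches = []
    for i, row in enumerate(ssd):
        j1, j2 = np.argsort(row)[:2]
        if row[j1] / (row[j2] + 1e-12) < ratio:
            matches.append((i, j1))
    return matches
```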

RANSAC

To remove outliers, RANSAC (as described in lecture) is run for 1000 iterations with epsilon=0.1 to find a robust set of inliers for computing the homography.

RANSAC points used to compute a robust homography between the two images.
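A sketch of this step, assuming points are given as (N, 2) arrays (fit_h repeats the Part A least-squares solve with h33 = 1; eps matches the epsilon above):

```python
import numpy as np

def fit_h(pts, pts_p):
    """Least-squares homography with h33 fixed to 1."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts, pts_p):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def ransac_homography(pts, pts_p, iters=1000, eps=0.1, seed=0):
    """4-point RANSAC: keep the largest inlier set, then refit H on it.

    eps is the reprojection-error threshold in pixel units.
    """
    rng = np.random.default_rng(seed)
    hom = np.column_stack([pts, np.ones(len(pts))])
    best = np.zeros(len(pts), bool)
    for _ in range(iters):
        sample = rng.choice(len(pts), 4, replace=False)
        H = fit_h(pts[sample], pts_p[sample])
        proj = hom @ H.T
        proj = proj[:, :2] / proj[:, 2:3]
        inliers = np.linalg.norm(proj - pts_p, axis=1) < eps
        if inliers.sum() > best.sum():
            best = inliers
    return fit_h(pts[best], pts_p[best]), best
```

Refitting on all inliers at the end is what makes the final homography robust: the outlier matches never influence it.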

Autostitching Mosaics

Combining all steps in this process, autostitched mosaics for each scene can be generated easily. Mosaics created with manually defined correspondences are presented as well for comparison. In all scenes, the autostitched mosaics are of comparable (or better) quality than the hand-created mosaics. The stitching/blending process is identical for both methods; the main difference is how the correspondences for the homography are obtained.

Philz Coffee in San Francisco. Above is mosaic with manually defined correspondences while bottom is mosaic with automatically defined correspondences.
Interior of the Panoramic Berkeley apartment complex. Above is mosaic with manually defined correspondences while bottom is mosaic with automatically defined correspondences.
Chemistry Library at UC Berkeley. Above is mosaic with manually defined correspondences while bottom is mosaic with automatically defined correspondences.

Lessons Learned

This automatic method is surprisingly effective! The fact that it can produce better or comparable results purely from images is mind-blowing and the simplicity of the implementation is especially impressive. The coolest subpart in this project was RANSAC, which is very simple but very useful for computing a robust estimate from noisy data.