Project 5: [Auto]Stitching Photo Mosaics

This two-part project is focused on transforming and combining multiple photographs to form a single, cohesive image.

Part A: Image Warping and Mosaicing

In this part of the project, I built image mosaics from two or more photographs by registering them, applying projective warps, resampling, and compositing the results. Below, I walk through the steps I took to produce my final result for this part.

1. Shoot and Digitize the Pictures

Below are the photograph sets I took. I aimed for 40-70% overlap between consecutive photos, using a tripod to keep the camera position fixed while rotating it to change the viewing direction.

Set 1: Scene

scene 0 scene 1 scene 2
sc0 sc1 sc2

Set 2: Tree

tree 0 tree 1 tree 2
tree0 tree1 tree2

Set 3: Park

park 0 park 1 park 2 park 3
park0 park1 park2 park3

2. Recover Homographies

I warped my images into alignment with a projective transformation, or homography, expressed as the matrix multiplication p' = Hp in homogeneous coordinates. To find the parameters of the transformation, I took corresponding points p from the original image and p' from the target image and solved for H by least squares on Ah = b, where h is a length-8 vector of the unknowns in H (the ninth entry, the bottom-right scale term, is fixed to 1). A homography has eight degrees of freedom and each correspondence contributes two equations, so four or more point pairs are needed; using more than four overdetermines the system and makes the estimate less sensitive to labeling error. As an example, below are two of my photographs from above, each with eight labeled correspondence points.

scene 0 scene 1
sc0 sc1
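
The least-squares setup described above can be sketched as follows. This is a minimal illustration rather than my exact project code: the function name compute_homography and the (x, y) point layout are assumptions made for the example. Each correspondence contributes two rows to A, obtained by multiplying out p' = Hp and moving the unknowns to one side.

```python
import numpy as np

def compute_homography(src_pts, dst_pts):
    """Estimate a 3x3 homography H (with H[2, 2] fixed to 1) so that dst ~ H @ src.

    src_pts, dst_pts: arrays of shape (N, 2) holding (x, y) points, N >= 4.
    (Hypothetical helper name and argument layout, chosen for this sketch.)
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    n = src.shape[0]
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
        # x' = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged to be linear in h
        A[2 * i] = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        # y' = (h3*x + h4*y + h5) / (h6*x + h7*y + 1), rearranged the same way
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        b[2 * i] = xp
        b[2 * i + 1] = yp
    # Least-squares solution of A h = b (exact when N == 4 and points are non-degenerate)
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

With exactly four pairs the system has a unique solution; with eight, as in the labeled example above, the solver averages out small clicking errors.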

3. Warp the Images

Once I had the homography matrix H, I could project any image onto any other by applying the transformation p' = Hp to the original image. To do this, I computed an inverse warp: each pixel of the output image is mapped back through the inverse of H to a location in the source image, and its color is sampled there with bilinear interpolation. Projective warping is a very powerful tool. For example, I can use it to do...

4. Image Rectification

To make sure that my homography transformation implementation is correct, I first took some pictures of planar surfaces and warped them to make those surface planes front-parallel. Below are the original images, side-by-side with their warped counterparts. I have also plotted the reference points I used for rectification.

Before Rectification After Rectification
Window window window_warp
Bridge Sign bridge bridge_warp
Map of London london london_warp
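
The inverse warp with bilinear interpolation used for these rectifications can be sketched roughly as below. The names warp_image and bilinear_sample are hypothetical, the output canvas size is passed in by the caller, and pixels that map outside the source are simply set to 0; my actual implementation may differ in these details.

```python
import numpy as np

def bilinear_sample(img, xs, ys):
    """Sample img (H x W or H x W x C) at float coordinates; out-of-bounds gives 0."""
    h, w = img.shape[:2]
    valid = (xs >= 0) & (xs <= w - 1) & (ys >= 0) & (ys <= h - 1)
    xs = np.clip(xs, 0, w - 1)
    ys = np.clip(ys, 0, h - 1)
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = xs - x0  # horizontal weight of the right-hand neighbor
    wy = ys - y0  # vertical weight of the lower neighbor
    if img.ndim == 3:  # broadcast weights and mask over color channels
        wx, wy, valid = wx[..., None], wy[..., None], valid[..., None]
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bottom = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return np.where(valid, top * (1 - wy) + bottom * wy, 0.0)

def warp_image(img, H, out_shape):
    """Inverse-warp img by homography H onto a canvas of shape (out_h, out_w)."""
    out_h, out_w = out_shape
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    dst = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)]).astype(float)
    src = np.linalg.inv(H) @ dst  # map every output pixel back to the source
    w = np.where(np.abs(src[2]) < 1e-12, 1e-12, src[2])  # guard against divide-by-zero
    src_x = (src[0] / w).reshape(out_h, out_w)
    src_y = (src[1] / w).reshape(out_h, out_w)
    return bilinear_sample(np.asarray(img, dtype=float), src_x, src_y)
```

Looping over output pixels rather than input pixels is what avoids holes: every output pixel gets exactly one interpolated value.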

5. Blend the Images into a Mosaic

Next, I took my three overlapping photograph sets and blended each one into a single, continuous mosaic. First, I projected the other images in each set, one by one, onto a reference image (the second photo in the set) so that their features lined up with it. Then, I combined them using weighted averaging.
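
A minimal sketch of weighted averaging is shown below. It assumes every warped image has already been placed on a common canvas and comes with a per-pixel weight map that is 0 where the image has no data. The center-weighted feather_weights map is one plausible choice of weights, not necessarily the one I used; in practice it would be warped with the same homography as its image.

```python
import numpy as np

def feather_weights(height, width):
    """Weight map equal to 1 at the image center, falling linearly to 0 at the border."""
    ys = 1 - np.abs(np.linspace(-1, 1, height))
    xs = 1 - np.abs(np.linspace(-1, 1, width))
    return np.outer(ys, xs)

def blend_weighted(images, weights):
    """Per-pixel weighted average of images on a shared canvas.

    images:  list of H x W x C arrays.
    weights: list of H x W arrays (0 where the corresponding image is empty).
    """
    num = np.zeros_like(images[0], dtype=float)
    den = np.zeros(images[0].shape[:2], dtype=float)
    for img, w in zip(images, weights):
        num += img * w[..., None]
        den += w
    out = np.zeros_like(num)
    covered = den > 0  # pixels seen by at least one image
    out[covered] = num[covered] / den[covered][:, None]
    return out
```

Where only one image covers a pixel, the average reduces to that image; in the overlap, the image whose center is closer dominates, which softens visible seams.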

Here are the results of the warping:

Mosaic 1: Scene Mosaic

Scene 0 Scene 2
Before Warp sc0 sc2
Scene 1 Labels sc1 sc1
Warped to Match Scene 1 sc0_warp sc2_warp

Here are the pieces before combining:

Scene 0 Scene 1 Scene 2
sc0_warp sc1 sc2_warp

And here is the final, combined result:

final_flat_mosaic

Mosaic 2: Tree Mosaic

Tree 0 Tree 2
Before Warp tree0 tree2
Tree 1 Labels tree1 tree1
Warped to Match Tree 1 tree0_warp tree2_warp

Here are the pieces before combining:

Tree 0 Tree 1 Tree 2
tree0_warp tree1 tree2_warp

And here is the final, combined result:

final_flat_mosaic

Mosaic 3: Park Mosaic

Park 0 Park 2 Park 3
Before Warp park0 park2 park3
Target Labels park1 park1 park1
Warped to Match Park 1 park0_warp park2_warp park3_warp

Here are the pieces before combining:

Park 0 Park 1 Park 2 Park 3
park0_warp park1 park2_warp park3_warp

And here is the final, combined result:

final_flat_mosaic

This mosaic is not quite as well aligned as the others. I suspect the cause is a combination of the wider total angle (four photos combined instead of three) and slight variation in camera position from bumping the tripod.

Tell Us What You've Learned

One of the coolest things I learned from this project is how powerful projective warping is. Before this project (and the lecture covering the requisite material), I never suspected that a simple homography would be enough to completely transform the viewpoint angle of an image.

Part B: Feature Matching for Autostitching

In this second part of the project, I created a system for detecting matching features and automatically stitching images into a mosaic. The approach I used is based on the following paper: https://inst.eecs.berkeley.edu/~cs194-26/fa20/hw/proj5/Papers/MOPS.pdf

Step 1: Detecting Corner Features in an Image

First, I used a Harris Interest Point Detector to detect corners in my source images. Below are my images overlaid with automatically detected interest points.

Scene 0 Scene 1 Scene 2
Original scene scene scene
Harris Points scene scene scene
Tree 0 Tree 1 Tree 2
Original tree tree tree
Harris Points tree tree tree
Park 0 Park 1 Park 2 Park 3
Original park park park park
Harris Points park park park park
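
For reference, a Harris corner response can be computed roughly as in the sketch below. This is an illustration, not the detector I actually ran: it uses the classic det - k * trace^2 response with assumed values sigma = 1.0 and k = 0.05, and a hand-rolled separable Gaussian so that only NumPy is needed. Interest points are then the local maxima of this response map.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Normalized 1D Gaussian truncated at about 3 sigma."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def smooth(img, sigma):
    """Separable Gaussian blur (image must be at least as wide and tall as the kernel)."""
    k = gaussian_kernel_1d(sigma)
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

def harris_response(gray, sigma=1.0, k=0.05):
    """Harris corner strength for a grayscale image (sigma and k are assumed defaults)."""
    gy, gx = np.gradient(gray.astype(float))  # derivatives along rows, then columns
    # Entries of the smoothed second-moment matrix at every pixel
    sxx = smooth(gx * gx, sigma)
    syy = smooth(gy * gy, sigma)
    sxy = smooth(gx * gy, sigma)
    det = sxx * syy - sxy**2
    trace = sxx + syy
    # Large positive at corners, negative along edges, near zero in flat regions
    return det - k * trace**2
```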

Adaptive Non-Maximal Suppression (ANMS):

As you can see, the Harris points above are very dense. To reduce the feature set to a manageable size while keeping the points well distributed across the image, I suppressed every Harris corner that was not the strongest corner within a given radius. I tuned this radius to yield the desired number of points (500). The results are visible below:

Scene 0 Scene 1 Scene 2
Original Harris Points scene scene scene
After ANMS scene scene scene
Tree 0 Tree 1 Tree 2
Original Harris Points tree tree tree
After ANMS tree tree tree
Park 0 Park 1 Park 2 Park 3
Original Harris Points park park park park
After ANMS park park park park
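
The suppression step can be sketched as follows. Instead of tuning one global radius by hand, this version computes for every point the distance to its nearest sufficiently stronger neighbor and keeps the points with the largest such radii, which picks the radius that yields the requested count automatically. The function name anms and the robustness factor c_robust = 0.9 are assumptions for the example.

```python
import numpy as np

def anms(coords, strengths, num_keep=500, c_robust=0.9):
    """Return indices of the num_keep points with the largest suppression radii.

    coords:    (N, 2) array of point locations.
    strengths: (N,) array of corner strengths.
    c_robust:  a neighbor only suppresses a point if the point's strength is
               below c_robust times the neighbor's (assumed value for this sketch).
    """
    coords = np.asarray(coords, dtype=float)
    strengths = np.asarray(strengths, dtype=float)
    # Pairwise squared distances; O(N^2) memory, fine for a few thousand points
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = np.sum(diff**2, axis=-1)
    # stronger[i, j] is True when point j is strong enough to suppress point i
    stronger = strengths[:, None] < c_robust * strengths[None, :]
    dist2 = np.where(stronger, dist2, np.inf)
    # Squared suppression radius: distance to the nearest suppressing neighbor
    radii2 = dist2.min(axis=1)
    order = np.argsort(-radii2, kind="stable")
    return order[:num_keep]
```

The globally strongest corner has an infinite radius and is always kept; weak corners survive only if they are far from anything stronger, which is what spreads the points across the image.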

Step 2: Extracting a Feature Descriptor for Each Feature Point

Next, I converted each corner feature to an 8x8 feature descriptor patch. To do this, I sampled a 40x40 pixel patch around each interest point, downsampled it to 8x8, and normalized it to achieve a mean of 0 and a standard deviation of 1. To avoid aliasing, I performed a Gaussian blur on each patch before downsampling. Below are a few examples of feature descriptors generated from my photos:

desc desc desc desc desc
desc desc desc desc desc
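
The descriptor extraction described above can be sketched as below. The blur strength sigma = 2.0 and the helper name extract_descriptor are assumptions for illustration, and the Gaussian blur is written as a matrix product to keep the example NumPy-only. Points too close to the image border, or with a constant patch, return None.

```python
import numpy as np

def extract_descriptor(gray, x, y, patch=40, out=8, sigma=2.0):
    """Build a normalized out x out descriptor from a patch x patch window.

    gray: 2D grayscale image; (x, y) is the integer interest point location.
    sigma is an assumed blur strength, not a value taken from the write-up.
    """
    h, w = gray.shape
    half = patch // 2
    if x - half < 0 or y - half < 0 or x + half > w or y + half > h:
        return None  # window would fall outside the image
    window = gray[y - half:y + half, x - half:x + half].astype(float)
    # Gaussian blur as a matrix product: row i of B holds the weights for output pixel i
    idx = np.arange(patch)
    B = np.exp(-(idx[:, None] - idx[None, :]) ** 2 / (2 * sigma**2))
    B /= B.sum(axis=1, keepdims=True)
    blurred = B @ window @ B.T
    # Keep every (patch // out)-th pixel, starting near the center of each cell
    step = patch // out
    small = blurred[step // 2::step, step // 2::step]
    std = small.std()
    if std < 1e-8:
        return None  # flat patch, cannot be normalized
    # Bias/gain normalization: zero mean, unit standard deviation
    return (small - small.mean()) / std
```

Normalizing to zero mean and unit standard deviation makes the descriptor insensitive to brightness offsets and contrast changes between photos.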

Step 3: Matching Feature Descriptors Between Two Images

After computing the descriptors, I found unique matching pairs of feature descriptors between adjacent images using approximate nearest neighbors (NN). To do this, I calculated the SSD distance between every pair of feature descriptors across the two images. Then, I used the "Russian Granny Trick", keeping a pair only if its ratio (1-NN Distance) / (2-NN Distance) was below 0.3, meaning the best match had to be much closer than the second-best one. Here are the resulting pairs of images marked with corresponding points:

First Image in Pair Second Image in Pair
pair pair
pair pair
pair pair
pair pair
pair pair
pair pair
pair pair
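
The matching step can be sketched as follows. This version computes exact SSD distances by brute force, applies the 1-NN / 2-NN ratio threshold of 0.3, and enforces uniqueness with a mutual nearest-neighbor check; that last check and the function name match_descriptors are assumptions about one reasonable way to get unique pairs.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.3):
    """Return a list of (index_in_a, index_in_b) matches.

    desc_a, desc_b: arrays of descriptors, one per row after flattening.
    desc_b must contain at least two descriptors for the ratio test.
    """
    a = np.asarray(desc_a, dtype=float).reshape(len(desc_a), -1)
    b = np.asarray(desc_b, dtype=float).reshape(len(desc_b), -1)
    # ssd[i, j] = sum of squared differences between a[i] and b[j]
    ssd = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    order = np.argsort(ssd, axis=1)
    best, second = order[:, 0], order[:, 1]
    rows = np.arange(len(a))
    # Ratio test: the best match must be much closer than the runner-up
    keep = ssd[rows, best] < ratio * ssd[rows, second]
    # Mutual check (assumed): a[i] must also be the nearest neighbor of its match
    back = np.argmin(ssd, axis=0)
    mutual = back[best] == rows
    return [(int(i), int(best[i])) for i in rows[keep & mutual]]
```

A low threshold such as 0.3 throws away many correct matches, but the ones that survive are rarely wrong, which makes the RANSAC step that follows much easier.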

Step 4: Use RANSAC to Compute a Homography

I used RANSAC (RANdom SAmple Consensus) to get rid of any outliers that remained after my feature-space outlier rejection. To do this, I repeatedly selected a random group of four point pairs, computed the exact homography they define, and counted how many of the other pairs agreed with it. I then kept the inliers of the homography with the most agreement among all the point pairs and generated a final homography from those inliers using least squares. Here are the final correspondence point pairs:

First Image in Pair Second Image in Pair
pair pair
pair pair
pair pair
pair pair
pair pair
pair pair
pair pair
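
A compact sketch of this RANSAC loop is given below. The iteration count, the 2-pixel inlier threshold, and all function names are assumptions for illustration; the homography fit is the same least-squares system used in Part A, repeated here so the example runs on its own.

```python
import numpy as np

def fit_homography(src, dst):
    """Least-squares homography (H[2, 2] = 1) mapping src (N, 2) to dst (N, 2)."""
    x, y = src[:, 0], src[:, 1]
    xp, yp = dst[:, 0], dst[:, 1]
    zeros, ones = np.zeros_like(x), np.ones_like(x)
    rows_x = np.column_stack([x, y, ones, zeros, zeros, zeros, -x * xp, -y * xp])
    rows_y = np.column_stack([zeros, zeros, zeros, x, y, ones, -x * yp, -y * yp])
    A = np.vstack([rows_x, rows_y])
    b = np.concatenate([xp, yp])
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, pts):
    """Project (N, 2) points through H and divide out the homogeneous coordinate."""
    proj = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    with np.errstate(divide="ignore", invalid="ignore"):
        return proj[:, :2] / proj[:, 2:3]

def ransac_homography(src, dst, num_iters=1000, threshold=2.0, seed=0):
    """Return (H, inlier_mask); threshold is the reprojection error in pixels."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(num_iters):
        sample = rng.choice(len(src), size=4, replace=False)
        H = fit_homography(src[sample], dst[sample])
        err = np.linalg.norm(apply_homography(H, src) - dst, axis=1)
        inliers = err < threshold  # NaN errors compare as False, so they are outliers
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 4:
        raise ValueError("RANSAC did not find enough inliers")
    # Refit on the whole consensus set for the final estimate
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers
```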

Step 5: Use Homographies to Produce Mosaics

With all the homographies in place, I now had all the pieces I needed to produce photograph mosaics. As before, I warped every photograph to match the geometry of the second photo in its respective set. Below, I show the autostitched images side-by-side with the hand-stitched panoramas from Part A.

Scene:

Manually Stitched scene
Automatically Stitched scene

Tree:

Manually Stitched tree
Automatically Stitched tree

Park:

Manually Stitched park
Automatically Stitched park

Bells and Whistles 1: Cylindrical/Polar Mapping

As an experiment, I decided to project the scene mosaic onto a cylindrical surface. I did inverse sampling from my original, unwarped images to the mosaic. After some testing, I settled on a radius of 1600 and focal length of 1410.
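
The inverse mapping for a single image can be sketched as below. It assumes the radius and focal length are both measured in pixels and that the optical axis passes through the image center; neither assumption is spelled out above, and the function name cylindrical_inverse_map is hypothetical. For each pixel of the cylindrical output it returns the location on the flat image plane to sample from.

```python
import numpy as np

def cylindrical_inverse_map(out_h, out_w, focal, radius):
    """Planar source coordinates for every pixel of a cylindrical image.

    Returns (x_src, y_src), both measured from the planar image's center.
    focal and radius are assumed to be in pixels.
    """
    ys, xs = np.mgrid[0:out_h, 0:out_w].astype(float)
    # Cylinder coordinates relative to the center of the output image
    xc = xs - (out_w - 1) / 2
    yc = ys - (out_h - 1) / 2
    theta = xc / radius   # angle around the cylinder axis
    height = yc / radius  # height on a unit-radius cylinder
    # The ray through (sin(theta), height, cos(theta)) meets the plane z = focal here
    x_src = focal * np.tan(theta)
    y_src = focal * height / np.cos(theta)
    return x_src, y_src
```

Adding the source image's center offset to x_src and y_src and sampling with bilinear interpolation produces the cylindrical version of that image. With the values above, a call would look like cylindrical_inverse_map(out_h, out_w, focal=1410, radius=1600).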

Cylindrical Warps of Each Source Image:

Scene 0 Scene 1 Scene 2
Original scene scene scene
Cylindrical scene scene scene

Cylindrical Mosaic:

Flat Mosaic scene
Cylindrical Mosaic scene

Bells and Whistles 2: Use Homography to Hang Painting on Wall

As a non-standard Bell/Whistle, I decided to use a homography to add some decoration to a room that looked rather spartan. First, I took a photo of a room with blank walls. Then, I projected a painting to match the geometry of one of those walls. By combining the two images, I managed to "hang the painting" on the wall. The painting I used is Sergei Ivanovich Lukin's It Has Come to Pass.

Painting scene
Wall scene

I labeled the points on the wall where I wanted to hang the painting and projected the painting's picture to that shape, as shown below:

Painting scene
Wall scene

Final Result:

new wall

The Coolest Thing I Learned from this Project

As mentioned before, I was very impressed by how versatile homographic transformations are. Other cool things I learned in the course of completing this project include Harris corners and the RANSAC procedure. I think both of these are incredibly clever ways of quantifying seemingly subjective concepts like interesting feature points and matching sets of correspondence points. It is very impressive how "intelligent" image processing procedures can act even without incorporating more recent AI techniques like neural networks.