Image Warping and Mosaicing

Tianhui (Lily) Yang CS194 Spring 2020

Please view on half screen for best visuals.

Introduction

short summary of what this was all about :)

In this project, we utilized a homography (perspective) transformation to rectify images and produce warped versions for later image stitching. Points were manually selected, and sources for images that are not my own are linked below. In the second half, we utilized automatic feature detection and matching, using the same stitching/warping algorithm as in the first half.
Image source: Time Magazine (museum).

Part 1: Image Warping and Mosaicing

A common task is to align images to a standard frame. In this part of the project, we selected images portraying objects with rectangular frames and transformed them into front-view rectangles through a process called rectification. We then extend the relationship between perspective-transformed images into the realm of image stitching to create perspective-corrected panoramas.



[Figure: r_pen]


1.1: Methods

    [w·x']   [a b c] [x]
    [w·y'] = [d e f] [y]
    [ w  ]   [g h 1] [1]

Equation from Lecture Slides

To begin rectification, we first analyzed the homography matrix shown above. It contains 9 values but only 8 degrees of freedom, which we enforce by setting the last entry of the matrix to 1. The last row of the homography gives the scaling factor w, which lets us rewrite the equation as a least-squares problem and approximate the remaining values of the homography matrix. Although only four point correspondences are necessary to determine a homography, using the bare minimum makes the estimate sensitive to noise. To combat this, we overdetermine our system of equations and find the best fit.

[Figure: least-squares setup for the homography solve. Slide courtesy Robert Collins, CSE486, Penn State]

In order to find the correct homography matrix, H, we needed to first define points of correspondence in our start image and our desired image (front view rectangle). We used ginput to annotate the feature points and utilized the four corners of the image as destination points.
After the correspondence points were found, we continued with the following steps:
1. Computed the H matrix.
2. Estimated the output extent by mapping the image corners through H.
3. Retrieved all points in our output image using skimage's polygon.
4. Performed an inverse transformation to map output points back to input points, then divided the points by the scaling factor w.
5. Clipped out-of-bound values and adjusted the image size.
6. Assigned the pixel values from our input image to their respective indices in a blank version of our output image.
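The least-squares homography solve in step 1, and the scale division from step 4, can be sketched as follows (a minimal NumPy version; the function names are my own, not the project's actual code):

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate the 3x3 homography H mapping src -> dst.

    src, dst: (N, 2) arrays of corresponding (x, y) points, N >= 4.
    Solves the (over)determined system in the least-squares sense,
    with H[2, 2] fixed to 1, leaving 8 unknowns.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # From w*xp = a*x + b*y + c with w = g*x + h*y + 1:
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp]); b.append(xp)
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp]); b.append(yp)
    h, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(h, 1).reshape(3, 3)

def apply_homography(H, pts):
    """Apply H to (N, 2) points, dividing out the scale factor w."""
    pts_h = np.column_stack([pts, np.ones(len(pts))])  # homogeneous coordinates
    out = pts_h @ H.T
    return out[:, :2] / out[:, 2:3]                    # divide by w
```

With more than four correspondences the system is overdetermined, and `lstsq` returns the best fit in exactly the sense described above.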


1.2: Results

Here are some outcomes of rectification!

[Figures: kiwi, r_kiwi]


Here we see a Kiwibot. We chose the four corners of the Kiwibot label and made them correspond to the four corners of the image. Now we can see a rectified version, though grainy.

[Figures: art, sideways]

We then looked at art pieces in a museum; rectification was able to restore a distorted painting to its correct proportions.

[Figures: drawing, r_drawing]

I used to like perspective art a lot, guess it's been put to good use.

[Figures: architecture, building]

This result was pretty interesting because it was able to change the view of a building. Some parts of the image are missing due to output image size constraints. In the next parts (stitching and auto-stitching), we introduce methods to account for empty locations and whole-image transformations.

[Figures: face, r_face]

More self art promo :').

1.3: Panoramas

Here we extend the applications of rectification and utilize image stitching to warp between images. Using a tripod and rotating the camera, we took 3 photos from the same position. Between 10 and 16 correspondence points was satisfactory for producing results. First we warped between the left and middle images, then between the left-middle result and the right image. After adjusting dimensions, we alpha blended each pair of images using a Gaussian mask.
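The Gaussian-mask alpha blend can be sketched like this (my own minimal version, assuming the two images are already warped into a shared frame of the same size; the `sigma` knob and function name are assumptions, not the project's actual values):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def feather_blend(im_left, im_right, sigma=20):
    """Blend two pre-aligned, same-size images with a soft alpha mask.

    A hard left/right mask is smoothed with a Gaussian so the seam
    fades gradually; sigma controls the width of the transition band.
    """
    h, w = im_left.shape[:2]
    alpha = np.zeros((h, w))
    alpha[:, : w // 2] = 1.0               # hard seam down the middle
    alpha = gaussian_filter(alpha, sigma)  # soften the seam into a gradient
    alpha = alpha[..., None]               # broadcast over color channels
    return alpha * im_left + (1 - alpha) * im_right
```

In practice the seam location would follow the overlap region of the warped images rather than the exact center.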

[Figure: 2Lane_L]

Below is the result of warping the middle image into the perspective of the left image. Most things line up pretty well. The bounding box prediction is quite useful in determining how big the output image would be. The seam between the two is almost unnoticeable; you can begin to tell where it is by looking at the grass. Some doubling effects occur due to minuscule misalignment.

[Figure: 2Lane_L]

Now the final reveal.

[Figure: pano]

Another one!

[Figures: 1Road_L, pano]

As you can see, there are some ghosting effects, especially in the middle of the image. Some places are also shifted higher than others. This may be due to the need for more correspondence points, or to unintended shifts while taking the images.

And finally, because I couldn't get better pictures: images courtesy of Shivam Parikh, from the course-provided pictures.

[Figures: pano, pano]

Here, manually selected points performed pretty well. The doubling effects near the left-middle side are not immediately visible, but when looking closely, the rods and buildings are doubled. Not great, but from a distance it seems natural.

1.4: Insights

It took me quite a while to debug my homography calculation, but through it, I recognized the importance of the scaling factor w and of a consistent ordering of correspondence points. Overall, this was pretty fun, and I was glad I could go outside to try to take good photos (though not successfully).

Part 2: Feature Matching for Auto-Stitching

After the grueling half of hand-labeling points, we can now sit back and let our computer do the work. In this section, we designed an automatic feature detector and stitched the images together as we did in Part 1. We split the work into 5 main portions:

2.1. Corner detection with the Harris interest point detector
2.2. Identifying the most dominant points with an even spread using Adaptive Non-Maximal Suppression (ANMS)
2.3. Creating feature descriptors and matching them
2.4. Using Random Sample Consensus (RANSAC) to reject outliers
2.5. Mosaicing and stitching the images together

We see the progression of these steps with our first set of photos from the manual section.

2.1: Harris Interest Point Detector

To detect corners, we find locations in the image where significant changes in image content are present in all directions. To measure this change, we form the matrix of outer products of the image gradients and define the corner strength, or corner response, as the determinant of this matrix divided by its trace. Interest points are points where the corner strength is a local maximum in a 3x3 pixel region.

Here we see the large number of interest points that the pre-written Harris points function provides. Setting min_distance to 50 makes the viewing field a lot less cluttered, though still very packed.
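Using skimage's built-in helpers, the detection step might look like this (a sketch; the small `threshold_rel` is my own addition to suppress flat regions, and may differ from the project's actual settings):

```python
import numpy as np
from skimage.feature import corner_harris, peak_local_max

def get_harris_points(gray, min_distance=50):
    """Detect Harris interest points in a 2D grayscale image.

    Returns the corner response map and an (N, 2) array of (row, col)
    points that are local maxima spaced at least min_distance apart.
    """
    h = corner_harris(gray)  # corner strength at every pixel
    coords = peak_local_max(h, min_distance=min_distance, threshold_rel=0.01)
    return h, coords
```

Lowering `min_distance` recovers the dense, cluttered point set described above; raising it thins the field at the cost of missing corners.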

[Figures: pano, pano]

2.2: Implementing Adaptive Non-Maximal Suppression

From the two pictures above, it is clear that we have to be more selective with the points in order to create meaningful results. To do so, we find the top 500 most dominant points. We define dominance with the help of a point's suppression radius: the minimum radius before reaching another point with significantly greater corner strength than the current point. We follow the equation below from the paper Multi-Image Matching using Multi-Scale Oriented Patches by Brown, Szeliski, and Winder:

    r_i = min | x_i - x_j |,  over all x_j in I with f(x_i) < c_robust * f(x_j)

Here r_i represents the minimum suppression radius, the function f represents the corner strength as defined in 2.1, and c_robust is a constant, 0.9, that ensures the stronger point is stronger by a sufficient margin. I represents the set of all points that we selected from 2.1.

We implement this with the following general approach:

   
            For each interest point p:
                calculate the minimum distance to a significantly stronger point
                save this minimum distance together with p
Then, after we map the minimum distances/radii to their respective points, we sort in descending order and take the top N points. For this project, I set N = 500. The results are shown below. Much more pleasing to the eye than the jumble we saw before.
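The ANMS procedure above can be sketched with a vectorized pairwise-distance computation (my own sketch; the variable names are assumptions, not the project's actual code):

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive Non-Maximal Suppression (Brown, Szeliski, and Winder).

    coords: (N, 2) interest-point locations; strengths: (N,) corner
    responses. Each point's suppression radius is its distance to the
    nearest point that is significantly stronger (f(p) < c_robust * f(q));
    the n_keep points with the largest radii are kept.
    """
    coords = np.asarray(coords, float)
    strengths = np.asarray(strengths, float)
    # Pairwise squared distances between all interest points.
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    # stronger[i, j] is True where point j is significantly stronger than i.
    stronger = c_robust * strengths[None, :] > strengths[:, None]
    d2 = np.where(stronger, d2, np.inf)
    radii = d2.min(axis=1)  # squared suppression radius per point
    keep = np.argsort(radii)[::-1][:n_keep]
    return coords[keep]
```

The global maximum has an infinite radius and is always kept first, which matches the intuition that the strongest corner can never be suppressed.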

[Figures: pano, pano]

2.3: Extracting and matching features

Even though we have made significant improvements, it's clear that not all of the points will have a corresponding point in the other image. Here we utilize neighboring information from pixels near each point of interest to give us context for matches.

We first centered a 40x40 patch around every point, then bias/gain normalized each patch by subtracting its mean and dividing by its standard deviation. To reduce disturbances from higher frequencies, we then resized each patch to 8x8.

We utilized the Sum of Squared Differences (SSD) to calculate the magnitude of the difference between a patch from one image and a possible match from another image. Implementing Lowe thresholding, we test whether the ratio of the best (minimum) error to the second-best error is below a predefined cutoff. I specifically found a threshold of around 0.6 to 1 satisfactory. Matching results are shown below. Just by looking at the images, we can see that the majority of these correspondences are indeed correct. A couple of outliers exist; for instance, in the left image, there are points on the road and the tree that have no correspondence in the right image.
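The descriptor extraction and ratio-test matching might be implemented as follows (a sketch with names of my own choosing; it downsamples before normalizing, which achieves the same bias/gain-invariant 8x8 descriptor):

```python
import numpy as np
from skimage.transform import resize

def describe(gray, coords, patch=40, out=8):
    """Build bias/gain-normalized descriptors from axis-aligned patches.

    gray: 2D float image; coords: iterable of (row, col) interest points.
    Returns (descriptors, kept_coords); points whose patch would leave
    the image are dropped.
    """
    half = patch // 2
    descs, kept = [], []
    for r, c in coords:
        r, c = int(r), int(c)
        if r < half or c < half or r + half > gray.shape[0] or c + half > gray.shape[1]:
            continue
        p = gray[r - half:r + half, c - half:c + half]  # 40x40 window
        p = resize(p, (out, out), anti_aliasing=True)   # low-pass + downsample to 8x8
        p = (p - p.mean()) / (p.std() + 1e-8)           # bias/gain normalization
        descs.append(p.ravel())
        kept.append((r, c))
    return np.array(descs), np.array(kept)

def match(d1, d2, ratio=0.6):
    """Match descriptor sets with SSD and Lowe's ratio test.

    Returns (M, 2) index pairs (i in d1, j in d2) whose best-to-second-best
    SSD ratio is below `ratio`.
    """
    ssd = ((d1[:, None, :] - d2[None, :, :]) ** 2).sum(-1)  # all-pairs SSD
    order = np.argsort(ssd, axis=1)
    best, second = order[:, 0], order[:, 1]
    rows = np.arange(len(d1))
    ok = ssd[rows, best] / ssd[rows, second] < ratio
    return np.column_stack([rows[ok], best[ok]])
```

A descriptor with a clear match has a tiny best error and a large second-best error, so its ratio is far below the cutoff; ambiguous descriptors fail the test and are discarded.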

[Figures: pano, pano]

2.4: Handling outliers with Random Sample Consensus (RANSAC)

As we have noticed, some points are matched incorrectly. Since we use least squares to compute our homography as in Part 1, the inclusion of outliers can produce large changes in the transformation. To make sure we only pick informative points, we implement RANSAC, which draws random samples of points, tests the resulting estimate against the expected values, and keeps the points that are predicted correctly within a threshold, effectively maximizing the number of inliers. This process is repeated until the probability of getting correct points falls below a threshold, or for as many iterations as the user decides is enough.

This can be summed up with the following procedure:

            For each iteration:
                Sample 4 random correspondence pairs (4 points from each image's point set)
                Calculate the homography from the sample
                Test the homography's estimates against all the points
                Keep the points whose error falls below a threshold
                Update the global inlier set only when this iteration's inlier set has more elements
            End loop, then calculate the homography from the largest inlier set.
            
This significantly reduces the number of points. I set the number of iterations to 500 and the error bound to 15, but found that 0.5 was sufficient for most images.
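The loop above might be implemented like this (a self-contained sketch that repeats the least-squares homography from Part 1; the iteration count and pixel threshold mirror the values mentioned, but the function names are my own):

```python
import numpy as np

def fit_homography(src, dst):
    """Least-squares homography (H[2, 2] = 1) from (N, 2) correspondences."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp]); b.append(xp)
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp]); b.append(yp)
    h, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(h, 1).reshape(3, 3)

def project(H, pts):
    """Apply H to (N, 2) points, dividing out the scale factor w."""
    p = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return p[:, :2] / p[:, 2:3]

def ransac_homography(src, dst, n_iter=500, eps=15.0, seed=0):
    """Keep the largest consensus set over n_iter random 4-point samples,
    then refit H on all inliers. eps is the reprojection error bound (px)."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), bool)
    for _ in range(n_iter):
        idx = rng.choice(len(src), 4, replace=False)        # minimal sample
        H = fit_homography(src[idx], dst[idx])
        err = np.linalg.norm(project(H, src) - dst, axis=1)  # reprojection error
        inliers = err < eps
        if inliers.sum() > best_inliers.sum():               # keep the largest set
            best_inliers = inliers
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

Because the final homography is refit on the full inlier set, a single clean 4-point sample among the iterations is enough to recover the transformation.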

[Figures: pano, pano]

2.5: Creating a mosaic

Now that we have managed to reduce ~2000 points to fewer than 50, we are ready to create the stitched version of these images. The procedure is the same as that of the panorama performed in Part 1. Here are side-by-side comparisons of the automatically and manually stitched images. The Left image is MANUAL and the Right is AUTO.

[Figures: pano, pano]

Though both manual and auto performed well, we can see some significant discrepancies between the two as we look closer. The left half of the image is similar for both, with doubling effects visible on the left grass patch. However, manual appears more pleasing on the right half. This is because the blurring happens along the sky portion, where colors and branches are lighter and less distinguishable.

[Figures: pano, pano]

Auto performed better in this round. We see that the car is much clearer than in the manual version, though both struggled with aligning the trees and the house at the very center. This could be due to camera shifts.

[Figures: pano, pano]

Once again, auto takes the cake. Though not very visible, we see that manual faces heavy shifting issues on the right side of the image. Auto, on the other hand, is practically pristine, with minimal defects near the roof.

2.6: Final Insights

And yes, we have finally made it. This project was by far my favorite. It taught me that simple approaches may sometimes be the most effective. Though ANMS and RANSAC rest on simple logic, they undoubtedly work very well and required minimal fancy imports or head-breaking algorithms to implement. I also changed the way I implement things: at the beginning of the semester, Prof. Efros told us to avoid for loops, and this project ingrained that mentality into the way I now approach things, as I saw just how much quicker vectorized functions can be.