CS 194-26: Project 4

Image Warping & Mosaicing

Ronak Laddha

Defining Correspondences

For this part, I used matplotlib's ginput() function to select the correspondence points between the two images that would be warped and blended into the panorama. I noted these points on paper so that I could remember the order in which they were selected. When choosing points, I focused on distinctive features in the region shared by the two images (e.g. corners, tops of trees, etc.). Once I had the points, I saved each feature set as a CSV file so I wouldn't have to reselect them every time.

Above you see an example of the feature sets for a collection of photos that will be later blended together for a panorama.
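A minimal sketch of this select-and-cache workflow (the function and file names here are my own, for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

def pick_points(img, n_points, out_csv):
    """Display the image, collect n_points clicks, and cache them as CSV."""
    plt.imshow(img)
    pts = np.array(plt.ginput(n_points, timeout=0))  # (x, y) pairs, in click order
    plt.close()
    np.savetxt(out_csv, pts, delimiter=",")
    return pts

def load_points(csv_path):
    """Reload a cached feature set so points never need reselecting."""
    return np.loadtxt(csv_path, delimiter=",")
```

Because the points are saved in click order, the same CSV can be reloaded later and the rows still correspond one-to-one with the matching image's rows.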

Recover Homographies

A homography is defined as p' = Hp, where H is a 3x3 matrix with 8 degrees of freedom (the lower-right entry is a scale factor fixed to 1). We recover H from the correspondence points found earlier by setting up a linear system of the form Ah = b, where h holds the 8 unknown entries of H and b is built from the target points. To overconstrain the system, we define more than 4 point pairs per image, then solve it with least squares.
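The least-squares setup can be sketched in a few lines of NumPy (variable names are mine; each correspondence contributes two rows to A):

```python
import numpy as np

def compute_homography(src, dst):
    """Solve Ah = b by least squares for the 8 unknowns of H (H[2,2] = 1).

    src, dst: (n, 2) arrays of corresponding (x, y) points, n >= 4.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # Two equations per correspondence, derived from p' = Hp
        # after dividing out the homogeneous coordinate.
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1).reshape(3, 3)
```

With exactly 4 points the system is square and exact; with more points (as used here) least squares finds the best fit to all correspondences.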

The following site was a terrific resource in guiding this process.

Warp the Images

Here is where we use the homography matrix H, previously calculated, to warp one of our images so we can blend the collection together. This is similar to the previous project, but we are no longer doing an affine transformation; rather, this is now a projective transformation. The steps to warp the image are as follows:

  • Take the original corners of the image & use H to find the warped corners
  • Use the warped corners to determine the size of the output image and its offset
  • For each pixel in the output, subtract the offset and apply the inverse of H to find the corresponding source location (an inverse warp)
  • Finally, we interpolate the pixel color values
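The steps above can be sketched roughly as follows (a simplified illustration using nearest-neighbor sampling in place of full interpolation):

```python
import numpy as np

def warp_image(img, H):
    """Projective warp via inverse mapping with nearest-neighbor sampling."""
    h, w = img.shape[:2]
    # Forward-map the corners to find the bounding box of the warped image.
    corners = np.array([[0, 0, 1], [w, 0, 1], [w, h, 1], [0, h, 1]]).T
    warped = H @ corners
    warped = warped[:2] / warped[2]
    xmin, ymin = np.floor(warped.min(axis=1)).astype(int)
    xmax, ymax = np.ceil(warped.max(axis=1)).astype(int)

    # Inverse-map every output pixel (offset by the bounding-box origin).
    ys, xs = np.mgrid[ymin:ymax, xmin:xmax]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = np.linalg.inv(H) @ pts
    sx = np.round(src[0] / src[2]).astype(int)
    sy = np.round(src[1] / src[2]).astype(int)

    # Copy colors for the pixels whose source falls inside the input image.
    out = np.zeros((ymax - ymin, xmax - xmin) + img.shape[2:], img.dtype)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out.reshape(-1, *img.shape[2:])[valid] = img[sy[valid], sx[valid]]
    return out, (xmin, ymin)
```

The returned (xmin, ymin) offset is what lets the warped image be placed correctly relative to the unwarped one when building the mosaic.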
Image Rectification

A bit of a detour from the earlier parts of the project: here we attempt to rectify images. This means taking an image and warping it so that we view its subject "head-on" (there are examples below). Rectification follows a similar process to the warp above, but with one key distinction: we are not warping our image to another image. Instead, we "straighten" it so that the corners line up for a head-on view. To do so, I manually created an array containing the "warped corners", i.e. the straightened corners. Beyond that, the process was the same as before: define correspondences (here we only needed 4, the corners of the object) and then follow the warping steps above, this time toward the straightened corners.
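For illustration, the homography that maps a clicked quadrilateral onto an axis-aligned rectangle might be computed like this (the TL, TR, BR, BL corner ordering and the function name are my own assumptions):

```python
import numpy as np

def rectify_homography(quad, width, height):
    """H mapping the 4 clicked corners of a planar object (assumed in
    TL, TR, BR, BL order) onto an axis-aligned width x height rectangle."""
    target = np.array([[0, 0], [width, 0], [width, height], [0, height]], float)
    # Exactly 4 correspondences: 8 equations, 8 unknowns.
    A, b = [], []
    for (x, y), (xp, yp) in zip(quad, target):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1).reshape(3, 3)
```

Warping with this H then gives the straightened, head-on view.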

Mosaic

Here we put it all together: we combine our warped images and original images to create a mosaic of the scene. The mosaic construction process is as follows:

  • Determine the dimensions of the final mosaic using the dimensions of the original and warped images
  • Use those dimensions to create a left mosaic and right mosaic region
  • Populate the known portions of the left and right mosaics from the provided images
  • Find the overlapping region between the images; this region will be used to compute an alpha-weighted mask
  • Across the overlapping region, create an alpha range that starts at 1 and falls off linearly to 0
  • Compute the mask by weighting the binary values in the overlapping region by this alpha range
  • Finally, apply the mask to the left and right sides of the mosaic and add them together to yield the final mosaic
One thing to note: the overlapping region between the two images appears brighter than the surrounding image. I wasn't able to figure out why; I suspect that somewhere in my mosaic code I add the overlapping region twice, amplifying the pixel intensities. I couldn't find a fix in time, so I put it on the backburner. If you view the images in grayscale, though, it looks perfectly fine :)
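A simplified sketch of the alpha-feathering idea for a purely horizontal overlap (an illustration of the technique, not my exact mosaic code, which as noted above has a brightness bug):

```python
import numpy as np

def feather_blend(left, right):
    """Blend two same-size mosaic layers; in the overlap, the left image's
    weight ramps linearly from 1 down to 0, so the shared region is a
    weighted average rather than a double-counted sum."""
    left_mask = left.sum(axis=-1) > 0 if left.ndim == 3 else left > 0
    right_mask = right.sum(axis=-1) > 0 if right.ndim == 3 else right > 0
    overlap = left_mask & right_mask

    alpha = left_mask.astype(float)            # 1 where only the left image is
    cols = np.where(overlap.any(axis=0))[0]    # columns spanned by the overlap
    if cols.size:
        ramp = np.linspace(1, 0, cols.size)    # linear falloff across overlap
        alpha[:, cols] = np.where(overlap[:, cols], ramp[None, :], alpha[:, cols])
    if left.ndim == 3:
        alpha = alpha[..., None]
    return alpha * left + (1 - alpha) * right
```

Because the weights sum to 1 everywhere, the overlap keeps the same intensity as its surroundings instead of brightening.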

Detecting Corner Features

For this part, I used the starter-code function get_harris_corners() to retrieve the Harris corners and their respective corner strengths. One tweak I did make was to add a threshold parameter to the function as a way to limit the number of points returned (I aimed for ~5k points for each image).
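A sketch of the thresholding tweak (the shape of get_harris_corners's outputs, a strength map plus a (2, n) coordinate array, is my assumption about the starter code):

```python
import numpy as np

def threshold_corners(h, coords, thresh):
    """Keep only Harris corners whose strength exceeds thresh.

    h      : 2-D corner-strength map (assumed output of get_harris_corners)
    coords : (2, n) array of (row, col) corner locations
    """
    strengths = h[coords[0], coords[1]]
    keep = strengths > thresh
    return coords[:, keep], strengths[keep]
```

Raising or lowering thresh is then a simple knob for hitting a target point count.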

Implement Adaptive Non-Maximal Suppression (ANMS)

Now, we'd like to further limit the number of corners by only retaining those that are local maxima. More explicitly, for every corner xi we compute a suppression radius ri: the distance from xi to the nearest corner xj that is sufficiently stronger, i.e. ri = min ||xi - xj|| over all xj with f(xi) < c_robust * f(xj) (the MOPS paper uses c_robust = 0.9).

Each xi now has a radius ri, the radius within which xi is a local maximum. We sort the corners by ri in decreasing order and take the top 500.
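The computation can be vectorized over all pairs, roughly like this (c_robust = 0.9 is the MOPS paper's value, assumed here):

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive non-maximal suppression: for each corner x_i, r_i is the
    distance to the nearest corner that is sufficiently stronger
    (f(x_i) < c_robust * f(x_j)); keep the n_keep largest radii.

    coords: (n, 2) corner positions; strengths: (n,) corner strengths.
    """
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)   # pairwise
    stronger = strengths[None, :] * c_robust > strengths[:, None]   # j beats i
    d2 = np.where(stronger, d2, np.inf)
    radii = np.sqrt(d2.min(axis=1))          # inf for the global maximum
    order = np.argsort(radii)[::-1][:n_keep]
    return coords[order]
```

The O(n^2) pairwise distance matrix is fine at the ~5k points we kept per image.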

Extract Feature Descriptors for each Point

At this point, we extract a description of the area around each point as a way to efficiently and reliably match features across images. We start by convolving the image with a Gaussian to blur it, which avoids aliasing when we subsample. Then, for each point:

  • Sample a 40x40 window around it
  • "Break" the window into 5x5 patches
  • Sample the middle point of each patch (leaving 5 pixels between one sample and the next) to yield a final 8x8 patch
  • Normalize the 8x8 patch
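The extraction for a single point can be sketched as follows (assumes a grayscale image that has already been Gaussian-blurred; bias/gain normalization, i.e. subtract the mean and divide by the standard deviation, is assumed):

```python
import numpy as np

def extract_descriptor(img, r, c):
    """8x8 MOPS-style descriptor: sample a 40x40 window around (r, c)
    at a spacing of 5 pixels, then bias/gain-normalize."""
    window = img[r - 20:r + 20, c - 20:c + 20]
    patch = window[::5, ::5].astype(float)       # 8x8 spaced subsample
    return (patch - patch.mean()) / (patch.std() + 1e-8)
```

Normalizing makes the descriptors invariant to overall brightness and contrast changes between the two photos.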
Match Feature Descriptors Between Images

Once we've extracted the patches, we match them between images to create correspondences. For each patch in the first image, we find its nearest neighbor (nn_1) and second-nearest neighbor (nn_2) in the other image, computing the SSD between our patch and each of them (yielding e1_nn and e2_nn). Finally, we compute the "Lowe ratio", defined as e1_nn / e2_nn. If the ratio falls below some threshold, we add the pair to our list of matched features.
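The ratio test can be sketched as follows (the threshold value here is illustrative):

```python
import numpy as np

def match_features(desc1, desc2, ratio_thresh=0.6):
    """Match flattened descriptors by SSD with Lowe's ratio test.

    desc1: (n1, d), desc2: (n2, d). Returns (i, j) index pairs where the
    best match in desc2 is much better than the second best."""
    matches = []
    for i, d in enumerate(desc1):
        ssd = ((desc2 - d) ** 2).sum(axis=1)
        nn1, nn2 = np.argsort(ssd)[:2]            # best and second best
        if ssd[nn1] / (ssd[nn2] + 1e-12) < ratio_thresh:
            matches.append((i, nn1))
    return matches
```

The intuition: a correct match should be far better than its runner-up, while an ambiguous feature has two similar candidates and a ratio near 1.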

Use RANSAC to Compute Homography

Once we've found the correspondences between features, we need to further prune the feature sets down to the "inliers". This is where RANSAC comes in. The RANSAC algorithm, used to estimate the homography matrix, proceeds as follows:

  • Randomly sample 4 feature pairs (pi, pi') from im1 and im2
  • Use these pairs to solve for a candidate homography matrix, H'
  • For every pair, if dist(H' @ pi, pi') < epsilon (after normalizing the homogeneous coordinate), where epsilon is some small number, add the pair to a running list of inliers
  • If this inlier list is the largest we've seen so far, keep it; otherwise discard it. Repeat until we hit MAX_ITERATIONS = 500
  • At the end, return the largest inlier set and use it to solve (via least squares) for the homography matrix H that we will actually use for the image warping
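The loop above can be sketched as follows (epsilon, iteration count, and the least-squares refit mirror the steps; function names and the fixed seed are mine):

```python
import numpy as np

def fit_homography(src, dst):
    """Least-squares homography (same linear system as in part A)."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1).reshape(3, 3)

def ransac_homography(src, dst, epsilon=2.0, max_iters=500, seed=0):
    """Fit an exact homography to 4 random pairs, count inliers within
    epsilon pixels, keep the best set, then refit on all its inliers."""
    rng = np.random.default_rng(seed)
    best = np.array([], dtype=int)
    for _ in range(max_iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        proj = np.column_stack([src, np.ones(len(src))]) @ H.T
        proj = proj[:, :2] / proj[:, 2:3]         # normalize homogeneous coord
        inliers = np.where(np.linalg.norm(proj - dst, axis=1) < epsilon)[0]
        if len(inliers) > len(best):
            best = inliers
    return fit_homography(src[best], dst[best]), best
```

The final refit over all inliers is what gives the robust H used for the actual warp.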
Produce Mosaics

Finally, we put it all together to, once again, stitch images into a mosaic. We follow a similar procedure to before, but this time all of our features have been selected automatically! Comparisons between the two approaches are highlighted below:

Note: in the keyboard auto-mosaic, there is a wedge artifact that slices through the image. This is a side effect that sometimes occurs with the alpha-blending strategy I used. To eliminate it, I would instead blend the two halves with a Laplacian stack.

What I've Learned

The coolest part of part A of this project was image rectification: I found it really cool that we could warp our perspective of an image from an angled view to a direct, head-on one. I can see how this type of transformation would be especially useful for computer vision systems that need to analyze bank statements, driver's licenses, passports, etc., since a user could upload a crooked image and you'd still be able to extract the key information as if it were taken straight on.

The coolest part of part B of this project was that we could develop code to automatically parse through images and identify the key features that are then used to (quite accurately) estimate the homography matrix for warping and stitching the images, streamlining the whole panorama-creation process tremendously!