CS194-26 Intro to Computer Vision and Computational Photography Project 5a

Image Warping and Mosaicing

Matteo Ciccozzi

Part A

Computing the Homography Matrix

The aim of this part is to find the homography matrix that maps one image onto another. To do this, we first define corresponding points between the two images and then set up a system of equations whose solution gives the elements of the homography matrix. Recall that a homography is only defined up to scale, so although the matrix has 9 entries it has only 8 degrees of freedom; we fix the bottom-right entry to 1 to remove the scale ambiguity.
So the million dollar question is: how exactly do we set up the homography equations? Each pair of corresponding points contributes two equations, so we need at least 4 correspondences to account for the 8 DOF, but it is often better to set up an overdetermined system with more points and solve it with least squares.

Here is how I set up my system of equations; I have found that solving it with least squares gives pretty good results. For simplicity I only show the two rows contributed by a single correspondence (x, y are coordinates in the original image, x_prime, y_prime are the coordinates we want to map to, and h is the vector of the 8 unknown homography entries in row-major order):

[x, y, 1, 0, 0, 0, -x_prime*x, -x_prime*y] @ h = x_prime
[0, 0, 0, x, y, 1, -y_prime*x, -y_prime*y] @ h = y_prime
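
To make the two rows above concrete, here is a minimal NumPy sketch of stacking them for every correspondence and solving with least squares. The function name `compute_homography` and its argument layout are my own choices for illustration, not necessarily the code used in the project.

```python
import numpy as np

def compute_homography(src_pts, dst_pts):
    """Estimate a 3x3 homography (bottom-right entry fixed to 1) mapping src_pts to dst_pts."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2 or len(src) < 4:
        raise ValueError("need at least 4 matching (x, y) pairs")
    n = len(src)
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
        # Two rows per correspondence, exactly as written out above.
        A[2 * i] = [x, y, 1, 0, 0, 0, -xp * x, -xp * y]
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -yp * x, -yp * y]
        b[2 * i] = xp
        b[2 * i + 1] = yp
    # Least squares handles both the exact (4 points) and overdetermined cases.
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Append the fixed scale entry and reshape the 9 values into a 3x3 matrix.
    return np.append(h, 1.0).reshape(3, 3)
```

With exactly 4 correspondences the system is square and the solution is exact; with more points the least-squares solution averages out clicking errors.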

Using this, here is a sample homography computed for the rectification part using my method (on top) and using cv2.findHomography (on bottom). As you can see, there is little difference.

triangulation_matteo

Warping Image

The warping procedure is very similar to that of the average face project, with one major difference: in this case we can actually warp to coordinates that are negative or greater than the image size. To handle this, we first estimate the size of the bounding box by mapping the four corners of the image through the homography, and then pad the image with an appropriate offset. Additionally, I compose the homography matrix with an offset (translation) transformation T, making the new homography matrix T@H, normalized so that the bottom-right entry is still 1. With this in mind we can do a simple remap and write the whole function without for loops.
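
As a rough illustration of the bounding-box and offset idea, here is a loop-free NumPy sketch. The names `warp_bounds` and `inverse_warp` are hypothetical, and I use nearest-neighbor sampling to keep the example short; this is a simplification, not the remap used in the project.

```python
import numpy as np

def warp_bounds(H, height, width):
    """Return (T @ H normalized, (out_h, out_w)) so all warped pixels land at non-negative coordinates."""
    corners = np.array([[0, 0, 1], [width - 1, 0, 1],
                        [width - 1, height - 1, 1], [0, height - 1, 1]], dtype=float).T
    mapped = H @ corners
    mapped = mapped[:2] / mapped[2]              # back to (x, y) from homogeneous coordinates
    x_min, y_min = np.floor(mapped.min(axis=1))
    x_max, y_max = np.ceil(mapped.max(axis=1))
    # Translation that moves the top-left of the bounding box to (0, 0).
    T = np.array([[1, 0, -x_min], [0, 1, -y_min], [0, 0, 1]], dtype=float)
    H_shifted = T @ H
    H_shifted /= H_shifted[2, 2]                 # keep the scale entry equal to 1
    return H_shifted, (int(y_max - y_min) + 1, int(x_max - x_min) + 1)

def inverse_warp(image, H_shifted, out_shape):
    """Fill every output pixel by looking up its source pixel (nearest neighbor)."""
    out_h, out_w = out_shape
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    dest = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = np.linalg.inv(H_shifted) @ dest
    sx = np.rint(src[0] / src[2]).astype(int)
    sy = np.rint(src[1] / src[2]).astype(int)
    # Only copy pixels whose source location falls inside the input image.
    valid = (sx >= 0) & (sx < image.shape[1]) & (sy >= 0) & (sy < image.shape[0])
    out = np.zeros((out_h * out_w,) + image.shape[2:], dtype=image.dtype)
    out[valid] = image[sy[valid], sx[valid]]
    return out.reshape((out_h, out_w) + image.shape[2:])
```

Warping backwards from the output grid avoids holes in the result, since every output pixel gets exactly one lookup.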

Rectification

For this part I decided to rectify my iPad so that it looks like a square. To do this I selected the four corners of the iPad and then manually entered four target points that form a square centered at (600,600) with 400-pixel sides. The first image is the output of my own function and the second is the output of cv2's warp function. As you can see, they are almost identical.
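
For reference, the target square can be generated rather than typed by hand. This is a small sketch with a hypothetical helper `square_corners`; the corner order (top-left, top-right, bottom-right, bottom-left, with y pointing down) is an assumption and has to match the order in which the iPad corners were selected.

```python
def square_corners(center, side):
    """Corners of an axis-aligned square in image coordinates (y grows downward)."""
    cx, cy = center
    half = side / 2
    return [(cx - half, cy - half),   # top-left
            (cx + half, cy - half),   # top-right
            (cx + half, cy + half),   # bottom-right
            (cx - half, cy + half)]   # bottom-left

target = square_corners((600, 600), 400)
```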

triangulation_matteo
triangulation_matteo

Image Mosaicing

For this part I decided to select 8 points in both images so as to obtain a more accurate homography matrix. I then projected both of my images onto the plane defined by the average of the corresponding points and stitched them together. For blending I implemented a Laplacian pyramid, although it turned out that it did not perform as well as simple mask blending where the mask is a 2D array generated using the sigmoid function. To build the mask, I created an array of evenly spaced values from -6 to 6 with the same number of columns as my blended image. I chose -6 to 6 because the sigmoid function goes from roughly 0 (about 0.0025) to roughly 1 (about 0.9975) over this range. I then blended the two images using this mask and the results were really good. Here is a summary of results:
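
The sigmoid mask described above can be sketched in a few lines of NumPy. The function names are mine, and which image receives the mask weight versus one minus the mask is an assumption that depends on which image sits on which side of the mosaic.

```python
import numpy as np

def sigmoid_mask(height, width, lo=-6.0, hi=6.0):
    """2D mask whose columns follow a sigmoid ramp from about 0 (left) to about 1 (right)."""
    ramp = np.linspace(lo, hi, width)          # one value per output column
    weights = 1.0 / (1.0 + np.exp(-ramp))      # sigmoid
    return np.tile(weights, (height, 1))       # repeat the same row for every image row

def blend(left, right, mask):
    """Per-pixel weighted average: left image where mask is near 0, right image where it is near 1."""
    if left.ndim == 3:
        mask = mask[..., None]                 # broadcast over color channels
    return (1.0 - mask) * left + mask * right
```

Compared with a linear ramp, the sigmoid spends most of its width close to 0 or 1, so the visible transition is concentrated around the middle of the mosaic.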

triangulation_matteo
First image projected on average points
triangulation_matteo
Second image projected on average points
triangulation_matteo
Attempt at blending using laplacian stacks
triangulation_matteo
Blending using the sigmoid mask
triangulation_matteo
Summary/Comparison of results

Part B

Harris Corner Points

For this portion I slightly modified the staff code to use corner_points instead of peak_local_max. I found that this, together with a minimum distance of 60, gave a good number of corner points. Here are the corner points for image 1 and image 2:

triangulation_matteo
Harris Points for image 1
triangulation_matteo
Harris Points for image 2

Adaptive Non-Maximal Suppression

For this portion I implemented the adaptive non-maximal suppression (ANMS) method as described in the paper in the project spec. Essentially, for each point we compute a suppression radius, defined as the minimum distance to another Harris point whose corner strength is at least (1/0.9) times its own. Once we have computed the suppression radius for all the Harris interest points, we select the top P points, i.e. the P points with the largest suppression radii. I chose P to be 100 and it worked out very well: it decreases the number of interest points while ensuring they are well spread out over the image. Here are all the Harris points alongside the interest points that remain after applying ANMS.

triangulation_matteo
Blue points are all the harris interest points, and the red points are the remaining ones after applying anms
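
The suppression-radius computation can be written without loops. The following is a sketch under my own naming (`anms`, `c_robust`), assuming corner strengths are positive and that the number of Harris points is small enough for an N x N distance matrix to fit in memory.

```python
import numpy as np

def anms(coords, strengths, num_points=100, c_robust=0.9):
    """Return indices of the num_points corners with the largest suppression radii."""
    coords = np.asarray(coords, dtype=float)
    strengths = np.asarray(strengths, dtype=float)
    # Pairwise squared distances between all corners, shape (N, N).
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    # dominates[i, j] is True when corner j is strong enough to suppress corner i,
    # i.e. strength_j >= strength_i / c_robust.
    dominates = c_robust * strengths[None, :] >= strengths[:, None]
    np.fill_diagonal(dominates, False)         # a corner never suppresses itself
    # Suppression radius: distance to the nearest dominating corner (infinite if none).
    radii2 = np.where(dominates, dist2, np.inf).min(axis=1)
    return np.argsort(-radii2, kind="stable")[:num_points]
```

The globally strongest corner has no dominating neighbor, so its radius is infinite and it is always kept first.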

Feature Description

For this portion the idea is to use the interest points obtained from adaptive non-maximal suppression to create features. We do this by selecting a 60x60 pixel patch centered on the interest point. I then compute the x and y gradients of the patch and average each of them. Using these two average gradients I compute the rotation angle required to make the patch axis aligned. After rotating the patch I crop the central 40x40 pixels, blur this crop with a Gaussian kernel, and finally take every fifth row and column to downsample it to an 8x8 patch. This becomes the feature descriptor for that point.
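
Here is a partial NumPy-only sketch of the descriptor steps. It computes the dominant gradient direction and then performs the crop, blur, and subsample steps; it deliberately leaves out the actual rotation (something like `scipy.ndimage.rotate` could be applied to the 60x60 patch beforehand). The function names and the blur `sigma` are assumptions, since the writeup does not state them.

```python
import numpy as np

def dominant_angle(patch):
    """Direction (degrees) of the average gradient; the rotation to apply is derived from it."""
    gy, gx = np.gradient(patch.astype(float))  # derivatives along rows (y) and columns (x)
    return np.degrees(np.arctan2(gy.mean(), gx.mean()))

def gaussian_kernel_1d(sigma):
    """Normalized 1D Gaussian truncated at about three standard deviations."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def describe(patch60, sigma=2.0):
    """Turn a 60x60 (already axis-aligned) grayscale patch into an 8x8 descriptor."""
    if patch60.shape != (60, 60):
        raise ValueError("expected a 60x60 patch")
    crop = patch60[10:50, 10:50].astype(float)        # central 40x40 pixels
    k = gaussian_kernel_1d(sigma)
    padded = np.pad(crop, len(k) // 2, mode="edge")   # pad so the blur keeps the 40x40 size
    # Separable Gaussian blur: filter the rows, then the columns.
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, blurred)
    return blurred[::5, ::5]                          # every fifth row and column -> 8x8
```

Blurring before subsampling matters: taking every fifth pixel of an unblurred patch would alias and make the descriptor sensitive to small shifts.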

Feature Matching

With the feature descriptors in place, we need to find matches between points in the two images. To do so we use the method outlined in the paper. For each patch in image 1 we compute the SSD with every patch in image 2 and sort the results in increasing order. If the ratio of the smallest SSD to the second smallest SSD is less than 0.45, we declare a match with the patch that produced the smallest SSD. The 0.45 threshold was chosen by looking at figure 6b in the paper: a threshold that is too high results in more false positives, whereas a threshold that is too low does not leave enough points to perform RANSAC. Here are the results after performing feature matching:

triangulation_matteo
Correspondence points for image 1 and image 2
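
The ratio test described above can be sketched as follows. The function name `match_features` is hypothetical; the 0.45 default is the threshold mentioned in the text, and descriptors are flattened to vectors before comparison.

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.45):
    """Return (i, j) pairs where descriptor i of image 1 passes the ratio test against image 2."""
    d1 = np.asarray(desc1, dtype=float).reshape(len(desc1), -1)
    d2 = np.asarray(desc2, dtype=float).reshape(len(desc2), -1)
    if len(d2) < 2:
        raise ValueError("need at least two descriptors in the second image")
    # ssd[i, j] = sum of squared differences between descriptor i and descriptor j.
    ssd = ((d1[:, None, :] - d2[None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(ssd, axis=1)
    rows = np.arange(len(d1))
    best = ssd[rows, order[:, 0]]
    second = ssd[rows, order[:, 1]]
    # best / second < ratio, written without a division so a zero SSD is safe.
    accept = best < ratio * second
    return [(int(i), int(order[i, 0])) for i in np.flatnonzero(accept)]
```

A descriptor that is about equally close to two candidates is rejected, which is exactly the ambiguous case that tends to produce false matches.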

Final Automated Stitches

Here I present my final work. After obtaining the set of matched points and projecting the images onto the average plane with the best homography matrices I could find, I applied the sigmoid mask as in the previous parts of the project. Here is my final result on the kitchen pictures:

triangulation_matteo
Notice how the top part did not blend too well. I believe this comes from the difference in lighting/exposure between the two pictures. If you look at the picture stitched with the manually selected points, this is not as prominent. Perhaps a Laplacian pyramid could fix this? After implementing the Laplacian pyramid the results were still not as expected. This could be because the pictures were taken with a phone camera and no tripod, resulting in different exposures.

Here are other results of the automated stitching process. In total I have produced three stitched images.

triangulation_matteo
triangulation_matteo
triangulation_matteo
triangulation_matteo
triangulation_matteo
triangulation_matteo

As we can see in the last image, the blending worked out really well! I took this picture using my brother's (better) phone and his tripod, which helped me resolve the exposure issues I had in previous pictures. Here is another example of a picture I took with his phone and tripod.

What I learned

This was definitely one of the longer projects in this class and, at least for me, one of the harder ones too. It took me forever to correctly implement the warping function. I don't know why, but the remap function always output some really weird values, so I ended up copying my project 3 code and modifying it a bit. I really liked the homography part, and the RANSAC algorithm was my favorite section. I found it very interesting that we can actually tune the points we pick to make sure the system is numerically stable when solving for the homography "vector". In general, I noticed that avoiding collinear points was a big plus and really helped with the stability of the system. Additionally, before computing the homography matrix one could check whether the system is stable by looking at the condition number (the ratio of the largest to the smallest singular value) of the system matrix. If this value is too big, then we should really find a different set of points. Another cool thing I had in mind but never ended up trying was performing the alpha blending at a lower resolution and then upscaling. I tried implementing the pyramid blending from project 2, but for some reason the blended image was still not good. I ended up doing a simple alpha blending using a sigmoid gradient instead of a linear gradient, and I found this reduced the wedge-like effects quite a bit.
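
The condition-number check mentioned above could look like the sketch below. The function name is mine, the system matrix is built with the same two rows per correspondence as in Part A, and no particular cutoff for "too big" is implied, since I never settled on one.

```python
import numpy as np

def system_condition_number(src_pts, dst_pts):
    """Condition number (largest / smallest singular value) of the homography system matrix."""
    rows = []
    for (x, y), (xp, yp) in zip(src_pts, dst_pts):
        rows.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y])
        rows.append([0, 0, 0, x, y, 1, -yp * x, -yp * y])
    # np.linalg.cond uses the 2-norm by default, i.e. the singular value ratio.
    return np.linalg.cond(np.array(rows, dtype=float))

# Well-spread corners versus four points lying on one line.
spread = [(0, 0), (100, 0), (100, 100), (0, 100)]
collinear = [(0, 0), (1, 1), (2, 2), (3, 3)]
good = system_condition_number(spread, [(5, 3), (110, 8), (95, 120), (-4, 90)])
bad = system_condition_number(collinear, collinear)
```

In this example the collinear set makes two columns of the system matrix identical, so its condition number is orders of magnitude larger than that of the well-spread set.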