I took three pictures for each mosaic (nine images in total). While taking each picture, I kept my hand still and only rotated the phone, to preserve the same center of projection. With a tripod I might have gotten better images, since I could simply rotate it to precise angles.
In this part, I computed the homography matrix H used for warping. H is a 3x3 matrix with eight unknowns (a, b, c, d, e, f, g, h), so we need at least four point correspondences to solve for them. Instead of solving an exact system of eight equations, I used more than four points and solved with least squares. To do so, we set up an equation Ax = b. Below, I show how one point correspondence contributes two rows to A and b. Once A and b are built, least squares gives x, the eight entries of H.
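A minimal sketch of this least-squares setup (the function name and NumPy usage are my own, not the project's actual code). Each correspondence (x, y) → (x', y') contributes the two rows described above, and `np.linalg.lstsq` recovers the eight unknowns:

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate the 3x3 homography mapping src -> dst via least squares.

    src, dst: (N, 2) arrays of corresponding points, N >= 4.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # Two rows per correspondence, derived from
        # xp = (a x + b y + c) / (g x + h y + 1), and similarly for yp.
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    A, b = np.asarray(A, float), np.asarray(b, float)
    h, *_ = np.linalg.lstsq(A, b, rcond=None)   # the 8 unknowns
    return np.append(h, 1.0).reshape(3, 3)       # fix the ninth entry to 1
```

With more than four points, `lstsq` minimizes the residual instead of solving the system exactly, which makes the estimate more robust to small errors in the selected points.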
Using H, I warped images to perform image rectification, which lets us view a scene from a different viewpoint.
This image was taken from the left side of the picture frame, so we cannot see the frame head-on.
However, after image rectification, we can see the picture frame from a frontal view.
This image was taken from the front, so it is hard to read the words written on top of the box.
However, after image rectification, we see the box from above and can read what is written on its top.
From the previous part, we saw how image warping works. In this part, I blend three images taken at different angles into one mosaic. First, I padded the images; without padding, a lot of information falls outside the canvas and is lost in the warped image. After padding, I warped the first and third images into the second image's frame.
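The warping step can be sketched with inverse mapping, which avoids holes in the output: every output pixel is mapped back through H⁻¹ into the source image. This is a simplified version with nearest-neighbor sampling (the function name is my own; a real implementation would interpolate):

```python
import numpy as np

def warp_image(im, H, out_shape):
    """Inverse-warp im onto a canvas of out_shape, where H maps im's
    coordinates into the output's coordinates."""
    out_h, out_w = out_shape[:2]
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    # Map every output pixel back into the source image.
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = np.linalg.inv(H) @ pts
    sx = np.round(src[0] / src[2]).astype(int)   # divide out homogeneous w
    sy = np.round(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < im.shape[1]) & (sy >= 0) & (sy < im.shape[0])
    out = np.zeros(out_shape, im.dtype)
    out.reshape(-1, *im.shape[2:])[valid] = im[sy[valid], sx[valid]]
    return out
```

Padding corresponds to choosing `out_shape` large enough (and composing H with a translation) so that the warped corners of the side images stay inside the canvas.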
After obtaining the warped images, I blended them using a Laplacian stack of depth 3. Since the images had already been enlarged by the warp, I did not need to grow the canvas for the blended result.
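A minimal sketch of 3-level Laplacian-stack blending for two grayscale float images (function name, sigma, and SciPy usage are my own assumptions, not the project's code). Each band-pass layer is combined with a progressively blurrier version of the mask, so low frequencies blend over a wide seam and high frequencies over a narrow one:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def laplacian_blend(im1, im2, mask, levels=3, sigma=2.0):
    """Blend two same-size grayscale images with a Laplacian stack.

    mask is 1.0 where im1 should dominate and 0.0 where im2 should.
    """
    g1, g2, gm = im1.astype(float), im2.astype(float), mask.astype(float)
    out = np.zeros_like(g1)
    for lvl in range(levels):
        if lvl < levels - 1:
            n1, n2 = gaussian_filter(g1, sigma), gaussian_filter(g2, sigma)
            l1, l2 = g1 - n1, g2 - n2   # band-pass (Laplacian) layers
            g1, g2 = n1, n2
        else:
            l1, l2 = g1, g2             # lowest level: residual Gaussian
        out += gm * l1 + (1 - gm) * l2
        gm = gaussian_filter(gm, sigma)  # widen the seam for lower bands
    return out
```

Because the layers telescope back to the original image, a mask of all ones returns `im1` exactly, which is a handy sanity check.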
I used the provided get_harris_corners to find Harris corners in each image. Instead of keeping every corner, I kept the top 1000, 2000, or 4000 Harris corners, depending on the image. From those, I ran the Adaptive Non-Maximal Suppression (ANMS) algorithm: for every Harris corner, I computed its minimum suppression radius, then selected the 500 points with the largest radii as the ANMS points. Below is a comparison between raw Harris corners and ANMS points.
Harris |
ANMS |
Harris |
ANMS |
Harris |
ANMS |
Harris |
ANMS |
Harris |
ANMS |
Harris |
ANMS |
Harris |
ANMS |
Harris |
ANMS |
Harris |
ANMS |
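The ANMS step shown above can be sketched as follows (function name and the robustness constant are my own assumptions). A corner's suppression radius is its distance to the nearest corner that is sufficiently stronger; keeping the largest radii spreads the selected points evenly over the image:

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive Non-Maximal Suppression.

    coords: (N, 2) corner coordinates; strengths: (N,) Harris responses.
    """
    # Pairwise squared distances between all corners.
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    # Corner j suppresses corner i if i is sufficiently weaker than j.
    stronger = strengths[:, None] < c_robust * strengths[None, :]
    d2 = np.where(stronger, d2, np.inf)
    radii = d2.min(axis=1)            # minimum suppression radius per corner
    keep = np.argsort(-radii)[:n_keep]
    return coords[keep]
```

The strongest corner is never suppressed, so its radius is infinite and it is always kept first.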
Then, for every ANMS point, I extracted a 40x40 feature patch and downsampled it to 8x8, which blurs away high-frequency noise and small misalignments. I also normalized each patch to mean 0 and standard deviation 1, making it insensitive to intensity changes. Below are some example patches.
example patch of room (left) |
example patch of hallway (left) |
example patch of door (left) |
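The descriptor extraction can be sketched like this (function name is my own; block averaging stands in for whatever downsampling the project actually used):

```python
import numpy as np

def extract_descriptor(im, y, x, patch=40, size=8):
    """Extract a 40x40 window around (y, x), downsample it to 8x8 by block
    averaging, then normalize to zero mean and unit std."""
    half = patch // 2
    window = im[y - half:y + half, x - half:x + half].astype(float)
    step = patch // size
    # Block-average as a cheap anti-aliased downsample: 40x40 -> 8x8.
    small = window.reshape(size, step, size, step).mean(axis=(1, 3))
    return (small - small.mean()) / (small.std() + 1e-8)
```

Flattening each 8x8 patch to a 64-vector makes the SSD comparison in the next step a simple vectorized operation.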
Once I had the feature descriptors, I matched them between pairs of images. For each feature in one image, I computed the SSD to every feature in the other image and found the smallest distance (NN1) and the second smallest (NN2). Only when NN1 / NN2 < threshold did I accept the nearest neighbor as a match (Lowe's ratio test). Below are the matching points between images.
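The ratio test above can be sketched as follows (function name and threshold value are my own assumptions):

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.6):
    """Lowe-style ratio matching between two sets of flattened descriptors.

    desc1: (N, D), desc2: (M, D). Returns a list of (i, j) index pairs.
    """
    matches = []
    for i, d in enumerate(desc1):
        ssd = ((desc2 - d) ** 2).sum(axis=1)  # SSD to every candidate
        j1, j2 = np.argsort(ssd)[:2]          # nearest and second nearest
        if ssd[j1] / ssd[j2] < ratio:         # accept only confident matches
            matches.append((i, int(j1)))
    return matches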
Although the matches from the previous part look good, some of them are still wrong. To remove these outliers, I implemented RANSAC. From the candidate matches, I randomly sampled 4 pairs of points and computed a homography matrix from them. Using that homography, I transformed the points in one image and computed the SSD between each transformed point and its match in the other image. A point counts as an inlier only when this SSD is less than epsilon. I counted the inliers, and whenever the count exceeded the current maximum, I kept that set as the new best inlier set. I repeated this process 4000 times, then returned the largest inlier set and the homography matrix recomputed from it. Below are the matching points after RANSAC.
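The RANSAC loop described above can be sketched as follows (function names, the seed, and the epsilon default are my own assumptions; the homography fit is the same least-squares setup as in the earlier part):

```python
import numpy as np

def fit_homography(src, dst):
    """Least-squares homography from point correspondences (N >= 4)."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float),
                            rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def apply_h(H, pts):
    """Apply homography H to (N, 2) points, dividing out homogeneous w."""
    p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return p[:, :2] / p[:, 2:]

def ransac_homography(src, dst, n_iters=4000, eps=2.0, seed=0):
    """Keep the homography with the largest inlier set, then refit to it."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), bool)
    for _ in range(n_iters):
        idx = rng.choice(len(src), 4, replace=False)   # minimal sample
        H = fit_homography(src[idx], dst[idx])
        err = ((apply_h(H, src) - dst) ** 2).sum(axis=1)
        inliers = err < eps ** 2                       # within epsilon
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    H = fit_homography(src[best_inliers], dst[best_inliers])
    return H, best_inliers
```

Refitting the homography to all inliers at the end, rather than returning the 4-point fit, makes the final estimate much more stable.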
After RANSAC, I finally had clean correspondences and a homography matrix, so I repeated the process from Part 4-1 to create the mosaics. Below is a comparison between the manually and automatically stitched mosaics.
Manual |
Auto-stitching |
Manual |
Auto-stitching |
Manual |
Auto-stitching |
The coolest thing in this project was image rectification. In the hefty-box example, even though the words on top of the box are unreadable in the original image, they become clearly legible after rectification. It was amazing that warping with the homography matrix H lets me see a scene from a different viewpoint. In auto-stitching, my favorite part was finding well-matched points with RANSAC. Without RANSAC, some outlier pairs did not match well; after RANSAC, all the outliers were removed and only cleanly matching points remained.