CS194-26 Programming Project #4

[Auto]Stitching Photo Mosaics - Part 1 - Image Warping and Mosaicing

Shreyans Sethi (SID: 3034163305)

Part 1: Shoot the Pictures

For this part of the project, I chose to use my DSLR (a Canon EOS 80D) to take the pictures so that I could shoot in manual mode and keep the exposure, shutter speed, and aperture the same, making the photos as consistent as possible. I put the camera on a tripod so I could rotate it without translating its position, and I went with a wide-angle lens since that is the typical use case for a panorama. Finally, I chose to photograph the exterior of the Unit 2 dorm buildings in Berkeley because the windows make easily identifiable correspondence points between the different images.

Camera Setup for Shooting Photos

Note that all of the following images have been compressed for this website. The original images are 6000x4000 pixels each.

Image 1 (Left)
Image 2 (Mid)
Image 3 (Right)


Part 2: Recover Homographies

Since all the images were taken from the same center of projection and the camera was only rotated between shots, the transformation between each pair of images is a homography p' = Hp, where H is a 3x3 matrix with 8 degrees of freedom (the bottom-right entry is fixed to 1, since H is only defined up to scale). To recover the values of H, we can set up linear equations as follows:
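Writing the entries of H as a through h (with the bottom-right entry fixed to 1), each correspondence p = (x, y), p' = (x', y') expands p' = Hp; eliminating the unknown scale w gives two linear equations per pair:

```latex
% p' = Hp in homogeneous coordinates, with the bottom-right entry of H fixed to 1:
\begin{bmatrix} w x' \\ w y' \\ w \end{bmatrix} =
\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
% Substituting w = gx + hy + 1 and rearranging yields, per correspondence:
\begin{bmatrix}
x & y & 1 & 0 & 0 & 0 & -x x' & -y x' \\
0 & 0 & 0 & x & y & 1 & -x y' & -y y'
\end{bmatrix}
\begin{bmatrix} a \\ b \\ c \\ d \\ e \\ f \\ g \\ h \end{bmatrix} =
\begin{bmatrix} x' \\ y' \end{bmatrix}
```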

Since each (p, p') correspondence provides 2 rows of the A matrix above, a minimum of 4 correspondences is needed: this gives 8 equations for 8 unknowns. However, with only 4 (p, p') pairs, any noise in the images can introduce errors into the homography matrix. Therefore, more correspondences were provided (e.g., 8), the A matrix and b vector were set up as above, and least squares was used to find the optimal solution for the h vector.

Below, the defined correspondences can be seen in all 3 images (they were identified as 4 corners of 2 windows present in all the images) and these were collected using ginput. Then, the parameters of the homography were recovered using the system of equations explained above.
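The estimation step described above can be sketched as follows, assuming the correspondences have already been collected as (N, 2) arrays (the function name compute_homography is my own, not from the original code):

```python
import numpy as np

def compute_homography(pts_src, pts_dst):
    """Estimate the 3x3 homography H (with H[2,2] = 1) mapping pts_src
    to pts_dst via least squares. Both inputs are (N, 2) arrays of
    corresponding (x, y) points, with N >= 4."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts_src, pts_dst):
        # Each correspondence contributes two rows to A and two entries to b
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    # Solve A h = b in the least-squares sense for the 8 unknowns
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

With more than 4 pairs the system is overdetermined, and np.linalg.lstsq averages out clicking noise across all of them.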



Part 3: Warp the Images

To warp the images, I first defined a find_bounds function that takes in an image and a homography matrix H and multiplies H by the 4 corners of the original image. It then divides each resulting (wx, wy, w) coordinate by the third value (i.e., w) to bring it back into 2D. From these 4 new coordinates, I found the maximum x and y values and initialized the dimensions of the warped image to (x_max, y_max).
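A minimal sketch of find_bounds, under the assumption that it returns the full bounding box (the minimum values are also worth tracking, since warped corners can land at negative coordinates):

```python
import numpy as np

def find_bounds(img, H):
    """Project the 4 corners of img through H and return the bounding
    box (x_min, y_min, x_max, y_max) of the warped image."""
    h, w = img.shape[:2]
    # Corners in homogeneous coordinates, stacked as columns (3x4)
    corners = np.array([[0, 0, 1],
                        [w - 1, 0, 1],
                        [0, h - 1, 1],
                        [w - 1, h - 1, 1]], dtype=float).T
    warped = H @ corners
    warped /= warped[2]          # divide each (wx, wy, w) by w -> 2D
    xs, ys = warped[0], warped[1]
    return xs.min(), ys.min(), xs.max(), ys.max()
```

The output canvas is then sized from x_max and y_max as described above.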

To fill in this new warped image, I used inverse warping: I take the indices of the new image, multiply them by the inverse of H to find the corresponding coordinates in the original image, and then use cv2.remap to sample pixel values from the original image into the new, warped image. During this process, I had to make sure every (x, y) coordinate was augmented with a third value of 1 (the homogeneous coordinate) before being multiplied by H^(-1), and that after multiplying by H^(-1), I rescaled by w each time.
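The inverse-warping loop can be sketched as below. To keep the sketch dependency-free it uses nearest-neighbor sampling with plain numpy indexing, whereas the actual implementation samples with cv2.remap (which also interpolates); the function name inverse_warp is my own:

```python
import numpy as np

def inverse_warp(img, H, out_w, out_h):
    """Fill an (out_h, out_w) output by inverse warping: map every output
    pixel through H^-1 into the source image and sample from there."""
    H_inv = np.linalg.inv(H)
    # Grid of output pixel coordinates, augmented to homogeneous (x, y, 1)
    xs, ys = np.meshgrid(np.arange(out_w), np.arange(out_h))
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])  # 3xN
    src = H_inv @ coords
    src /= src[2]                      # rescale by w after the multiply
    sx = np.round(src[0]).astype(int)  # nearest-neighbor source columns
    sy = np.round(src[1]).astype(int)  # nearest-neighbor source rows
    out = np.zeros((out_h, out_w) + img.shape[2:], dtype=img.dtype)
    # Only keep samples that land inside the source image
    valid = (sx >= 0) & (sx < img.shape[1]) & (sy >= 0) & (sy < img.shape[0])
    oy, ox = ys.ravel()[valid], xs.ravel()[valid]
    out[oy, ox] = img[sy[valid], sx[valid]]
    return out
```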

Note: For both the boundary calculation and the inverse warping, an H matrix had to be used and this was computed using the function explained in the previous section.

Below, you can see the results of warping the middle image into the shape of the left one, and warping the right image into the shape of the middle one:

New Boundaries for Warping Middle Image
Warped Middle Image into Left Image Projection
New Boundaries for Warping Right Image
Warped Right Image into Middle Image Projection


Part 4: Image Rectification

With the warp algorithm from the previous part working, rectifying images is fairly similar. For this part, I took photos of square objects from an angle and set the correspondence points to the 4 corners of the squares. I then 'rectified' them by defining the 4 corners of a perfect square and warping the angled images onto that square. The results below show that the warping algorithm is working successfully!
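The rectification setup amounts to choosing the target points yourself instead of clicking them in a second image. A sketch, with the helper name rectify_homography and the corner ordering (TL, TR, BL, BR) being my own assumptions:

```python
import numpy as np

def rectify_homography(corners_img, side):
    """Homography mapping the 4 clicked corners of a square object
    (image coords, ordered TL, TR, BL, BR) onto an axis-aligned
    side x side square, solved with the same linear system as before."""
    target = np.array([[0, 0], [side, 0], [0, side], [side, side]], float)
    A, b = [], []
    for (x, y), (xp, yp) in zip(corners_img, target):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

Warping the angled photo with this H (using the inverse-warp routine from the previous part) produces the rectified result.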

Corner Points of Square (Projective Target Image)
Photo with Defined Correspondences
Rectified Result (Cropped for Viewing)


Photo with Defined Correspondences
Rectified Result (Cropped for Viewing)


Photo with Defined Correspondences
Rectified Result (Cropped for Viewing)


Part 5: Blend the Images into a Mosaic

To blend the images into a mosaic, I defined a function that takes in 2 images (one of which has presumably been warped into the other's projection, as explained in the warping section above). This function first finds the size of the mosaic by taking the maximum x and y extents across both images, then creates two blank canvases of this shape, filling the first with input_image1 and the second with input_image2. It then finds the overlapping region by checking the indices where both canvases are non-zero. Since this is a horizontal blend, I found the minimum and maximum x coordinates of the overlap and defined a mask using a numpy linspace over this range (going from 1 down to 0), which acts as an alpha transition. The first image's overlapping area is multiplied by the mask and the second image's overlapping area is multiplied by (1 - mask); the two canvases are then added together to create the final mosaic. Three examples of this algorithm can be seen below:
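The alpha-transition step can be sketched as follows, assuming grayscale canvases that are already the same size with zeros where an image has no content (the function name blend_horizontal is my own):

```python
import numpy as np

def blend_horizontal(c1, c2):
    """Linearly blend two same-shape canvases over their horizontal
    overlap; outside the overlap, copy whichever canvas has content."""
    h, w = c1.shape
    out = np.where(c1 > 0, c1, c2)
    overlap = (c1 > 0) & (c2 > 0)
    if overlap.any():
        xs = np.where(overlap.any(axis=0))[0]
        x0, x1 = xs.min(), xs.max()
        # Alpha ramp from 1 down to 0 across the overlap columns
        alpha = np.zeros(w)
        alpha[x0:x1 + 1] = np.linspace(1, 0, x1 - x0 + 1)
        blended = c1 * alpha + c2 * (1 - alpha)   # broadcast over rows
        out[overlap] = blended[overlap]
    return out
```

For color images the same mask would simply be broadcast across the channel dimension as well.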

You can see that with the outdoor scene of Unit 2, there are some ghosting issues because the trees and the wires move around a little in the wind; even with well-defined correspondences, this leads to some artifacts. On the other hand, the desk and indoor house scenes are much more stationary, so the resulting mosaics look a lot better!

Mosaic 1: Blending Left and Middle Images
Mosaic 2: Blending Right and Middle Images
Mosaic 3: Blending Left and Right Images

Some Other Mosaic Results

Left Image Original
Right Image Original
Mosaic Result
Left Image Original
Right Image Original
Mosaic Result


Tell Us What You Learned

The most important thing I learned from this project was the new set of issues to keep in mind when moving from the affine triangle warps of the last project to the projective transformations in this one. I initially thought the process would be quite similar, and although the ideas of defining correspondences and inverse warping carry over, there were new problems to deal with: What if, after warping, the new boundaries extend into negative coordinates? How can we rearrange the system of equations so that we can solve it without explicitly knowing w for each point p'? Seeing these effects in the transformations themselves helped me understand the differences between projective and affine warps much more clearly.

The coolest part of this project is that I've always wondered how document-scanning apps work so well: they let the user photograph a document at a slight angle, identify the 4 corners of the document, and then return an image that looks like it came from an actual scanner. After working on the image rectification part of this project, I now realize that the user marking the 4 corners of the document is essentially defining the correspondences needed to warp the image into a rectangle!