Image Warping and Mosaicing

Shoot the Pictures

When shooting pictures, I looked for scenes with lots of detail. This matters because capturing images up close and then stitching them preserves more detail than stepping far back to fit the entire scene into a single frame. I shot the images on an iPhone, using burst mode for the panoramas; when photographing a single plane from multiple points of projection, I used the Obscura app, which let me fix the camera settings while taking pictures.

The main difficulties were finding good lighting and moving the camera slowly enough that the images stayed sharp, without drifting too far in unwanted directions.

Recover Homographies

The system of equations was set up as shown below:

img
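In case the image above does not render, the standard form of this system for a single correspondence $(x, y) \mapsto (x', y')$, with $h_{33}$ fixed to 1 (consistent with not having to solve for it), is:

$$
\begin{bmatrix}
x & y & 1 & 0 & 0 & 0 & -x x' & -y x' \\
0 & 0 & 0 & x & y & 1 & -x y' & -y y'
\end{bmatrix}
\begin{bmatrix} h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32} \end{bmatrix}
=
\begin{bmatrix} x' \\ y' \end{bmatrix}
$$

Stacking two such rows per correspondence gives a $2n \times 8$ system that can be solved with least squares when $n \geq 4$.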

As the project description mentioned, it is important to select several points of correspondence, so I chose at least 6 in each scene. Early on, I got strange results that I couldn't trace to any bug in my code; I soon realized they came from not being precise enough when selecting points. After choosing points more carefully, the results were much better.
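A minimal sketch of that least-squares setup (not my exact code; the function name and point format are assumptions):

```python
import numpy as np

def computeH(pts1, pts2):
    """Estimate H such that [x', y', 1]^T ~ H [x, y, 1]^T for each correspondence.

    pts1, pts2: (n, 2) arrays of corresponding (x, y) points, n >= 4.
    h_33 is fixed to 1, so only eight unknowns are solved for.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts1, pts2):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1).reshape(3, 3)
```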

Warp the Images

For this part of the project, I used an inverse warp along with cv2.remap for interpolation. My implementation did not require any for loops. The biggest challenge in this part was keeping track of which output pixels actually received values. To do so, I also warped an all-white image and used the result to create a mask, which I attached as an extra channel of the warped image.
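A rough sketch of this inverse warp (the function name, and the simplification that the output canvas size is given directly, are my own assumptions):

```python
import cv2
import numpy as np

def warp_image(img, H, out_shape):
    """Inverse-warp img by homography H onto a canvas of out_shape (h, w).

    Also warps an all-white image the same way to build a validity mask,
    which is attached as an extra channel of the result.
    """
    h_out, w_out = out_shape
    # Grid of destination pixel coordinates.
    xs, ys = np.meshgrid(np.arange(w_out), np.arange(h_out))
    dest = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    # Inverse warp: map destination coordinates back into the source image.
    src = np.linalg.inv(H) @ dest
    src /= src[2]
    map_x = src[0].reshape(h_out, w_out).astype(np.float32)
    map_y = src[1].reshape(h_out, w_out).astype(np.float32)
    warped = cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR)
    # Warp a white image identically; nonzero pixels mark valid regions.
    white = np.full(img.shape[:2], 255, dtype=np.uint8)
    mask = cv2.remap(white, map_x, map_y, cv2.INTER_LINEAR)
    return np.dstack([warped, mask])
```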

Image Rectification

Fortunately, I didn’t have too much trouble with this part. My warp function worked well, so the only challenge was selecting the corner points accurately. After some trial and error, I was able to get satisfactory results, which are shown below:

img img

Second image (a picture I took at the Vatican):

img img
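For reference, the rectification step itself is just a single warp toward hand-picked rectangle corners. A minimal driver, reusing the computeH and warp_image sketches above (the corner values and output size here are placeholders, not the values I actually used):

```python
import numpy as np

# Hand-picked corners of the planar surface in the photo (placeholder values),
# listed in the same order as the target rectangle below.
src_pts = np.array([[312, 104], [978, 151], [990, 802], [295, 760]])
# Where those corners should land: an axis-aligned 700x500 rectangle.
dst_pts = np.array([[0, 0], [700, 0], [700, 500], [0, 500]])

H = computeH(src_pts, dst_pts)
rectified = warp_image(img, H, out_shape=(520, 720))  # img: the loaded photo
```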

Blend the Images into the Mosaic

I chose to leave one image unwarped, and I warped all of the other images into its projection.

When importing images, I used dstack to add something like an alpha channel to each image, with 0 representing a fully transparent pixel and 1 representing a non-transparent pixel. When running cv2.remap for warping, I start from an output image filled with zeros (i.e. fully transparent). cv2.remap does not modify output pixels that map outside of the input image, so every pixel that wasn't taken from the input image keeps an alpha of 0, while the rest have a value of 1.
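A sketch of that alpha handling, assuming the coordinate maps from the inverse warp described earlier (the borderMode choice is my reading of why remap leaves those pixels untouched):

```python
import cv2
import numpy as np

def warp_with_alpha(img, map_x, map_y):
    """Warp an RGB image with cv2.remap, carrying an alpha channel along.

    map_x, map_y: float32 source-coordinate maps (e.g. from the inverse
    warp above). Pixels never written by remap keep alpha = 0.
    """
    rgb = img.astype(np.float32) / 255.0
    # Alpha = 1 everywhere in the original image.
    rgba = np.dstack([rgb, np.ones(rgb.shape[:2], dtype=np.float32)])
    # Start from an all-zero (fully transparent) canvas; BORDER_TRANSPARENT
    # leaves destination pixels that fall outside the source image untouched.
    out = np.zeros((*map_x.shape, 4), dtype=np.float32)
    out = cv2.remap(rgba, map_x, map_y, cv2.INTER_LINEAR,
                    dst=out, borderMode=cv2.BORDER_TRANSPARENT)
    return out
```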

After warping each of the images, I took their sum and divided it by the sum of the alpha channels. This ensures that a pixel is only averaged with pixels from another image if that other image actually covers that location. I think the blending itself worked pretty well, as it isn't obvious from the results that the images were blended. That said, the seams where the images meet do not look great; I believe this was mostly caused by imprecise point selection and inadvertent camera movement while taking the pictures. The effect could probably be mitigated by feathering the overlapping regions. A sketch of the averaging step is shown below, followed by the results: each original image and then the mosaic, and for the first scene also each of the warped images.
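Roughly, the averaging step looks like this (a minimal sketch assuming the warped RGBA images all share one canvas; the function name is mine):

```python
import numpy as np

def blend(warped_imgs):
    """Average a list of warped RGBA images (same canvas size) by alpha.

    Each pixel of the mosaic is the sum of the RGB values divided by the
    number of images that actually cover that pixel (sum of alphas).
    """
    stack = np.stack(warped_imgs)                 # (n, h, w, 4)
    rgb_sum = stack[..., :3].sum(axis=0)
    alpha_sum = stack[..., 3:].sum(axis=0)
    return rgb_sum / np.maximum(alpha_sum, 1e-8)  # avoid divide-by-zero
```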

img img img img img img img

Set 2:

img img img img img

Set 3:

img img img img

What I learned

I was most impressed by the process of recovering homographies. It was exciting to use slightly more intricate algebra to avoid introducing w_hat and solving for h_33. I also use many applications, such as Google Street View, that rely on image stitching, so it was satisfying to implement the technique myself and to see that my math and programming skills are enough to build something similar.

Bells and Whistles

First, I used a projection to map a painting from Doe Library on campus onto a screen in an amphitheater in Italy; I took the second picture on a trip there a few years ago. To do this, I changed the background of the painting image to black, and when inserting pixels from the painting image, I only used pixels that were not black, thereby removing the background. This worked because none of the pixels in the painting were truly black; if there had been black pixels, I could have used an alpha channel instead, as in the previous parts. A sketch of the compositing step is shown below, followed by the original images and the result.
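A rough sketch of that compositing step, assuming the painting has already been warped onto the amphitheater photo's canvas as in the earlier parts (the names and threshold are mine):

```python
import numpy as np

def paste_non_black(scene, warped_painting, thresh=0):
    """Copy pixels of the warped painting into the scene wherever the
    painting is not black (its background was made black beforehand)."""
    mask = warped_painting.max(axis=2) > thresh   # True where not background
    out = scene.copy()
    out[mask] = warped_painting[mask]
    return out
```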

img img img

Next, I implemented the 3D rotational model. I read the focal length (4.2 mm) from the JPEG's EXIF data, then used the Procrustes algorithm to recover the rotation R and applied the corresponding transformation; a sketch of that construction is shown below. As the results after it show, this attempt was unsuccessful.
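The rotational model replaces the general homography with H = K R K^{-1}, where K encodes the focal length in pixels and R comes from Procrustes. A rough sketch of that construction (the conversion from 4.2 mm to pixels is omitted, and all names are assumptions):

```python
import numpy as np

def rotational_homography(R, focal_px, cx, cy):
    """Homography induced by a pure camera rotation R: H = K R K^{-1}.

    focal_px: focal length converted from mm to pixels; (cx, cy): image center.
    """
    K = np.array([[focal_px, 0, cx],
                  [0, focal_px, cy],
                  [0, 0, 1.0]])
    return K @ R @ np.linalg.inv(K)

def procrustes_rotation(p, q):
    """Best rotation R (orthogonal Procrustes via SVD) mapping rays p to rays q.

    p, q: (n, 3) arrays of corresponding calibrated ray directions.
    """
    U, _, Vt = np.linalg.svd(q.T @ p)
    R = U @ Vt
    if np.linalg.det(R) < 0:        # keep a proper rotation (det = +1)
        U[:, -1] *= -1
        R = U @ Vt
    return R
```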

img img img img