Introduction

This project is fundamentally about how to "mix" two images. This can be achieved by creating hybrid images, like a half-man half-cat, or by seamlessly placing an object in a completely different environment, like a salt dispenser the size of a human.

Warmup: Image Sharpening

The first technique we will explore is image sharpening using unsharp masking. We start by low-pass filtering the image with a Gaussian blur. Then, the blurred image is subtracted from the original to get the Laplacian. This is essentially equivalent to applying a high-pass filter, so we can isolate the high frequencies. By adding the high frequencies back into the original image, we can "sharpen" it.
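The unsharp-masking step above can be sketched in a few lines. This is a minimal version assuming a grayscale float image and SciPy's gaussian_filter; the sigma and alpha values are illustrative, not the ones used for the figures below:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(image, sigma=2.0, alpha=1.0):
    """Sharpen by adding back the high frequencies (original - blurred)."""
    blurred = gaussian_filter(image, sigma=sigma)   # low-pass (Gaussian)
    high_freq = image - blurred                     # high-pass (Laplacian)
    return image + alpha * high_freq                # boost the details

# Tiny example: a vertical step edge gets over/undershoot after sharpening.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
sharp = unsharp_mask(img, sigma=1.0, alpha=1.5)
```

Larger alpha exaggerates the high frequencies; the overshoot at edges (values pushed above 1 and below 0) is what makes the result look crisper.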

  • Original

  • Original

  • Blurred

  • High Frequency (Laplacian)

  • Sharpening

Hybrid Images

After implementing sharpening, we can start experimenting with hybrid images. These images change in perception depending on the viewer's distance from the picture. In the following photos, viewing the image up close will show the high-frequency image, while viewing it from afar reveals the low-frequency image. We create the hybrid image by averaging a Laplacian and a Gaussian generated as above: the low frequencies of one image combined with the high frequencies of the other.
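The hybrid construction can be sketched as follows; this is a minimal version that averages the two frequency bands as described above, with illustrative cutoff sigmas:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(im_low, im_high, sigma_low=8.0, sigma_high=4.0):
    """Low frequencies of im_low plus high frequencies of im_high."""
    low = gaussian_filter(im_low, sigma_low)                # Gaussian (low-pass)
    high = im_high - gaussian_filter(im_high, sigma_high)   # Laplacian (high-pass)
    return (low + high) / 2.0                               # average the two bands

# Illustrative call on random "images"; real inputs should be aligned first.
rng = np.random.default_rng(0)
derek, cat = rng.random((32, 32)), rng.random((32, 32))
h = hybrid_image(derek, cat)
```

The two sigmas control the crossover: sigma_low decides how blurry the far-away image is, and sigma_high decides how much fine detail the close-up image keeps.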

  • Derek

  • A cat

  • Result

Fourier Analysis

Below, the Fourier analysis of each step is shown. The first is the log FFT of the picture of Derek; the second is the log FFT of the cat. The next two are the same pictures after a Gaussian blur has been applied: clearly, much of the green (the higher-frequency content in these plots) has been removed, which indicates a low-pass filter. The next image is the Laplacian of the cat photo, shown by an excess of green. Finally, the last image is the FFT of the hybrid photo.
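The log-FFT visualizations above come down to a short helper; a minimal sketch using NumPy (the small epsilon is only there to avoid log of zero):

```python
import numpy as np

def log_fft(image):
    """Log-magnitude spectrum with the zero frequency shifted to the center."""
    return np.log(np.abs(np.fft.fftshift(np.fft.fft2(image))) + 1e-8)

# A constant image has all its energy at the (centered) DC bin.
spec = log_fft(np.ones((8, 8)))
```

Plotting spec with any colormap gives the figures shown here: low frequencies at the center, high frequencies toward the edges, so a low-pass filter visibly darkens everything away from the center.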

  • FFT Derek

  • FFT cat

  • FFT Low Pass Derek Aligned

  • FFT Low Pass Cat Aligned

  • FFT High Pass Cat Aligned

  • FFT Hybrid Image

Other Hybrid Images

  • Efros

  • Marilyn Monroe

  • Marilyn Efros

Failure Case

  • Panda

  • Black Bear

  • Hybrid

As expected, this example fails because the photos are too similar. Since the panda has such distinct features, the high frequencies of the black bear don't show much when overlaid on the panda. Instead, the result is a panda no matter how close or far you look. Even the distinctly colored nose of the black bear is not visible.

Gaussian Stacks

Here we simply apply the Gaussian blur at multiple levels without downsampling. We can also show the Laplacians of the respective images. Notice how the stacks affect the Efros-Monroe hybrid image to the point where Monroe's high frequencies disappear.
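Both stacks can be built with a few lines. A minimal sketch, using the same sigma schedule as the figures below (a Laplacian stack built this way telescopes, so summing its layers recovers the original image exactly):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(image, sigmas=(2, 4, 8, 16, 32)):
    """Blur at increasing sigma, keeping full resolution (no downsampling)."""
    return [gaussian_filter(image, s) for s in sigmas]

def laplacian_stack(image, sigmas=(2, 4, 8, 16, 32)):
    """Band-pass layers: differences of successive Gaussian levels,
    plus the final (coarsest) Gaussian level as the residual."""
    g = [image] + gaussian_stack(image, sigmas)
    return [g[i] - g[i + 1] for i in range(len(sigmas))] + [g[-1]]

# Sanity check on a random image: the Laplacian stack sums back to the input.
rng = np.random.default_rng(0)
img = rng.random((16, 16))
lap = laplacian_stack(img, sigmas=(1, 2, 4))
```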

  • Sigma: 2

  • Sigma: 4

  • Sigma: 8

  • Sigma: 16

  • Sigma: 32

  • Sigma: 2

  • Sigma: 4

  • Sigma: 8

  • Sigma: 16

  • Sigma: 32

  • Sigma: 2

  • Sigma: 4

  • Sigma: 8

  • Sigma: 16

  • Sigma: 32

Laplacian Stacks

Multiresolution Blending

We can use the stacks from the previous part to blend images together seamlessly. To do so, we blur a mask with a Gaussian and combine the Laplacians of the two images level by level, masking the respective parts, using the following formula:

LS_l(i, j) = G_l(i, j) * LA_l(i, j) + (1 - G_l(i, j)) * LB_l(i, j)

where G_l is level l of the Gaussian stack of the mask, and LA_l and LB_l are level l of the Laplacian stacks of the two images. Summing the blended layers LS_l gives the final result.
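A minimal sketch of the level-by-level blend, assuming grayscale float images and the stack construction from the previous section (the sigma schedule and the pairing of mask levels with Laplacian layers are one reasonable choice, not necessarily the exact one used for the figures):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend(im_a, im_b, mask, sigmas=(2, 4, 8, 16, 32)):
    """Multiresolution blend: mask-weighted sum of Laplacian layers."""
    def gstack(im):
        return [im] + [gaussian_filter(im, s) for s in sigmas]
    ga, gb, gm = gstack(im_a), gstack(im_b), gstack(mask.astype(float))
    # Coarsest level: blend the residual Gaussians under the blurriest mask.
    out = gm[-1] * ga[-1] + (1 - gm[-1]) * gb[-1]
    for i in range(len(sigmas)):
        la = ga[i] - ga[i + 1]              # Laplacian layer of image A
        lb = gb[i] - gb[i + 1]              # Laplacian layer of image B
        out += gm[i] * la + (1 - gm[i]) * lb
    return out

# Illustrative call: a vertical half-and-half mask on two flat images.
im_a, im_b = np.ones((16, 16)), np.zeros((16, 16))
mask = np.zeros((16, 16))
mask[:, :8] = 1.0
result = blend(im_a, im_b, mask, sigmas=(1, 2, 4))
```

Because the mask is blurred more at coarser levels, low frequencies transition over a wide band while fine details switch over sharply, which is what hides the seam.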

Young to Old Obama

Toy Problem

Starting with this problem, we now shift focus to gradient domain processing. This technique allows blending similar to multiresolution blending, and its fundamentals are relatively simple. Given a source image (let's say a rainbow we want to place into a scene) and a target, we can take the gradients of the source and smooth the seam with a least-squares solver. This is simplified in the toy problem, where we only try to match the top-left pixel and reconstruct the image using the gradients from the original. To do so, we minimize:

  • ( v(x+1,y) - v(x,y) - (s(x+1,y) - s(x,y)) )^2, so the x-gradients of v closely match the x-gradients of s;

  • ( v(x,y+1) - v(x,y) - (s(x,y+1) - s(x,y)) )^2, so the y-gradients of v closely match the y-gradients of s;

  • ( v(1,1) - s(1,1) )^2, so the top-left corners of the two images are the same color.

The general problem of Poisson blending is mathematically described by the following equation, where we solve for v:

v = argmin_v  sum over i in S, j in N_i within S of ((v_i - v_j) - (s_i - s_j))^2  +  sum over i in S, j in N_i outside S of ((v_i - t_j) - (s_i - s_j))^2

where S is the source region, N_i are the four neighbors of pixel i, s is the source, and t is the target.
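The toy-problem reconstruction can be set up directly as a sparse least-squares system; a minimal sketch (one equation per gradient constraint plus the top-left anchor, solved with SciPy's lsqr):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

def reconstruct(s):
    """Rebuild an image from its x/y gradients plus one anchored pixel."""
    h, w = s.shape
    idx = np.arange(h * w).reshape(h, w)    # pixel -> unknown index
    rows, cols, vals, b = [], [], [], []
    eq = 0
    # x-gradient constraints: v(x+1,y) - v(x,y) = s(x+1,y) - s(x,y)
    for y in range(h):
        for x in range(w - 1):
            rows += [eq, eq]; cols += [idx[y, x + 1], idx[y, x]]
            vals += [1.0, -1.0]; b.append(s[y, x + 1] - s[y, x]); eq += 1
    # y-gradient constraints: v(x,y+1) - v(x,y) = s(x,y+1) - s(x,y)
    for y in range(h - 1):
        for x in range(w):
            rows += [eq, eq]; cols += [idx[y + 1, x], idx[y, x]]
            vals += [1.0, -1.0]; b.append(s[y + 1, x] - s[y, x]); eq += 1
    # anchor the top-left pixel so the solution is unique
    rows.append(eq); cols.append(idx[0, 0]); vals.append(1.0)
    b.append(s[0, 0]); eq += 1
    A = sp.csr_matrix((vals, (rows, cols)), shape=(eq, h * w))
    v = lsqr(A, np.array(b))[0]
    return v.reshape(h, w)

# The system is consistent, so the input is recovered (up to solver tolerance).
s = np.arange(12, dtype=float).reshape(3, 4)
r = reconstruct(s)
```

The same machinery extends to Poisson blending: the unknowns become the source-region pixels, and constraints on the region boundary pull toward target pixel values instead of other unknowns.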

  • Original

  • Reconstructed

Poisson Blending

We can see the results of the above formulation here. In the first example you can see the entire process of generating the blended images, from getting the masks to cropping the source into the target to the final blending.

Failure Case

Although the blending around the source image worked as expected, the original color of the source image was not preserved at all. This is likely because the grass color and the sand color differ so much; as a result, the colors completely shift as the green tries to match the tan of the sand.

Poisson vs. Multiresolution vs. Mixed

As shown, Poisson blending has the best results of the three. The border of the source is much smoother than in the multiresolution result, and the mixed result is semi-transparent over the hand, which is not what we want.

  • Poisson

  • Multiresolution

  • Mixed

Bells and Whistles: Mixed Gradients

Here are some other examples of mixed gradients. They suffer from issues similar to the hand example, in that the source images are semi-transparent, although mixed gradients are superior at isolating complex shapes like text. To achieve this, we follow a modified version of Poisson blending that uses whichever of the source or target gradients has the larger magnitude as the guide, rather than always the source gradient. This is shown in the following formula:

d_ij = s_i - s_j  if |s_i - s_j| > |t_i - t_j|,  else  t_i - t_j

where d_ij replaces (s_i - s_j) in the Poisson least-squares objective.
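The gradient-selection rule itself is a one-liner; a minimal sketch operating on precomputed gradient arrays (the function name is illustrative):

```python
import numpy as np

def mixed_gradient(s_grad, t_grad):
    """Guide with whichever gradient (source or target) has larger magnitude."""
    return np.where(np.abs(s_grad) >= np.abs(t_grad), s_grad, t_grad)

# Element-wise: |3| >= |-2| keeps 3; |-1| < |4| takes 4.
g = mixed_gradient(np.array([3.0, -1.0]), np.array([-2.0, 4.0]))
```

Plugging these mixed gradients into the right-hand side of the Poisson system is what lets strong target texture (like text or fence wires) show through the pasted region.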

Struggles

The biggest issues were the tediousness of normalizing the images and the Poisson algorithm itself. With frequencies and gradients there isn't much useful feedback when part of the algorithm is wrong. For example, even sharpening at first displayed only noise for the Laplacian, which was fixed by normalization; without suggestions from the rest of the class, that fix would have been a guess. Similarly, with blending, the source image would come out either completely discolored or incomprehensible. Much of the debugging was just carefully going through the code over and over again, line by line.