Image Blending

Daniel Geng

We use some pretty cool techniques to blend pictures together.

1.1: Warmup

Sharpening an image is relatively simple. We first blur the original image; this effectively applies a low-pass filter to it.

Blurred image

Next we subtract the blurred image from the original image. Because the blurred image contains only the low frequencies, the difference is the complement: a high-pass filter of the image.

High frequencies

Finally, we take these high frequencies and add them back to our original image. This emphasizes the higher-frequency portions of the image. Because edges are high-frequency structures, we get a kind of sharpening happening.

Sharpened image
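In code, the whole warmup fits in a few lines. A minimal sketch, assuming a grayscale float image in [0, 1] and SciPy's Gaussian filter (the sigma and alpha values here are illustrative, not the ones used for the images above):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpen(image, sigma=2.0, alpha=1.0):
    """Unsharp masking: add the high frequencies back to the image.

    image: grayscale float array in [0, 1]; sigma and alpha are illustrative.
    """
    blurred = gaussian_filter(image, sigma)   # low-pass filter
    high_freq = image - blurred               # high-pass = original - low-pass
    return np.clip(image + alpha * high_freq, 0, 1)
```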

1.2: Hybrid Images

To produce hybrid images we take a low-pass filter of one image and add it to a high-pass filter of another image. Because human perception is more sensitive to high frequencies when close to an object and more sensitive to low frequencies when far from it, the resulting image is perceived as two different objects at two different distances.
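A minimal sketch of the construction, again assuming grayscale float images (the two cutoff sigmas are hypothetical knobs; picking them per image pair is most of the work):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(im_far, im_near, sigma_low=8.0, sigma_high=4.0):
    """Low frequencies of im_far (seen from a distance) plus
    high frequencies of im_near (seen up close)."""
    low = gaussian_filter(im_far, sigma_low)               # keep low frequencies
    high = im_near - gaussian_filter(im_near, sigma_high)  # keep high frequencies
    return np.clip(low + high, 0, 1)
```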

Our first attempt at combining Beyonce and Nicolas Cage resulted in a failure:

Blurry Cage

High frequency Beyonce

Combined

The resulting image kind of works. The main problem is that because we aligned the eyes so well, the mouths are very mismatched. Nick's face is just too long! Thus, Beyonce looks like she has a gigantic smudge on her chin and Cage looks like he has a smile for a mustache. This goes to show that alignment is actually fairly important (and a simple two-point transformation is probably not sufficient for most cases).

Our second attempt involves a slightly more complicated transformation in which we match three points. Thus, Cage's face gets slightly squished.
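For reference, three point correspondences exactly determine an affine transform. A sketch of solving for it (the point arrays are hypothetical user clicks; warping with the resulting matrix can then be done with e.g. skimage.transform.warp, minding its inverse-map and (row, column) conventions):

```python
import numpy as np

def affine_from_points(src, dst):
    """Solve for the 2x3 affine matrix M mapping three src points to dst.

    src, dst: (3, 2) arrays of (x, y) correspondences (e.g. two eyes and a mouth).
    """
    # Each correspondence gives two linear equations in the six unknowns.
    A = np.hstack([src, np.ones((3, 1))])  # (3, 3): rows are [x, y, 1]
    # Solve A @ M.T = dst; the system is exact since three points
    # determine an affine map.
    M = np.linalg.solve(A, dst).T          # (2, 3) affine matrix
    return M
```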

Blurry Cage

High frequency Beyonce

The hybrid photo looks much better now!

And of course Derek and Nutmeg:

Blurry Derek and High Frequency Nutmeg

Catman

The following are the Fourier plots for the Derek and Nutmeg hybrid:

Left: Derek's FFT | Right: Blurry Derek's FFT

Left: Nutmeg's FFT | Right: High Frequency Nutmeg's FFT

FFT of Hybrid
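These plots follow the standard recipe: the log magnitude of the shifted 2D FFT. A sketch, assuming a grayscale float image:

```python
import numpy as np
import matplotlib.pyplot as plt

def show_fft(image):
    """Display the log-magnitude spectrum of a grayscale image."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))  # zero frequency at center
    plt.imshow(np.log(np.abs(spectrum) + 1e-8), cmap="gray")
    plt.axis("off")
    plt.show()
```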

1.3: Laplacian Stack

Laplacian stack of Derek and Nutmeg, showing the hidden structure at each frequency
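A sketch of how such stacks can be built (a stack, unlike a pyramid, keeps full resolution at every level; the doubling-sigma schedule here is one common choice, not necessarily the one used above):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def laplacian_stack(image, levels=5, sigma=2.0):
    """Gaussian stack by progressively blurring the original image;
    Laplacian stack by taking differences of adjacent levels."""
    gaussians = [image]
    for i in range(levels):
        gaussians.append(gaussian_filter(image, sigma * 2 ** i))
    # Each Laplacian level holds the band of frequencies between two blurs.
    laplacians = [g1 - g2 for g1, g2 in zip(gaussians, gaussians[1:])]
    laplacians.append(gaussians[-1])  # keep the residual low frequencies
    return laplacians                 # the levels sum back to the image
```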

1.4: Multiresolution Blending

We use the technique described by Burt and Adelson to create multiresolution blends of different images.

An orpple!
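The core of the method, as a sketch reusing the laplacian_stack function from the previous section: blend each frequency band separately, under a progressively blurred mask (parameters are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend(im1, im2, mask, levels=5, sigma=2.0):
    """Burt-Adelson multiresolution blending.

    mask: float array that is 1 where im1 should show and 0 where im2 should.
    Assumes laplacian_stack from the sketch above is in scope.
    """
    L1 = laplacian_stack(im1, levels, sigma)
    L2 = laplacian_stack(im2, levels, sigma)
    out = np.zeros_like(im1)
    for i, (l1, l2) in enumerate(zip(L1, L2)):
        # Blur the mask more at coarser levels: low frequencies blend over
        # a wide seam, high frequencies over a narrow one.
        m = gaussian_filter(mask, sigma * 2 ** i)
        out += m * l1 + (1 - m) * l2
    return np.clip(out, 0, 1)
```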

Next we combine Daniel Radcliffe and Elijah Wood:

Elijah Radcliffe (or Daniel Wood?)

We also created a blended image using an irregular boundary. Behold, Daniel with Elijah’s right eye:

Combined image

The mask was created by taking the convex hull of points chosen by the user.

The mask used to construct the above image
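One way to build such a mask, sketched with SciPy's ConvexHull and Matplotlib's point-in-polygon test (the clicked points are hypothetical):

```python
import numpy as np
from scipy.spatial import ConvexHull
from matplotlib.path import Path

def hull_mask(points, shape):
    """Binary mask: 1 inside the convex hull of user-clicked points.

    points: (N, 2) array of (x, y) clicks; shape: (height, width) of the image.
    """
    hull = ConvexHull(points)
    poly = Path(points[hull.vertices])  # hull boundary as a polygon
    ys, xs = np.mgrid[:shape[0], :shape[1]]
    coords = np.stack([xs.ravel(), ys.ravel()], axis=1)
    return poly.contains_points(coords).reshape(shape).astype(float)
```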

2.1: Toy Problem

We first solve a toy problem: reconstructing an image using only its gradients and a single pixel value. The single pixel value serves as our "boundary condition." Setting up the linear system and solving, we get the following result:

Reconstructed image

As you can see, the reconstructed result is very, very similar to the original. The only differences are some odd high-frequency artifacts in the grey background, likely due to the sparse linear solver we used to solve the system.
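A sketch of the setup: one equation per horizontal gradient, one per vertical gradient, and one pinning a single pixel, solved with SciPy's sparse lsqr (the solver we blame for the artifacts):

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def reconstruct(image):
    """Recover a grayscale image from its gradients plus one pixel value."""
    h, w = image.shape
    n_eqs = h * (w - 1) + (h - 1) * w + 1
    A = lil_matrix((n_eqs, h * w))
    b = np.zeros(n_eqs)
    idx = lambda y, x: y * w + x          # flatten (row, col) to variable index
    e = 0
    for y in range(h):                    # x-gradient equations
        for x in range(w - 1):
            A[e, idx(y, x + 1)] = 1
            A[e, idx(y, x)] = -1
            b[e] = image[y, x + 1] - image[y, x]
            e += 1
    for y in range(h - 1):                # y-gradient equations
        for x in range(w):
            A[e, idx(y + 1, x)] = 1
            A[e, idx(y, x)] = -1
            b[e] = image[y + 1, x] - image[y, x]
            e += 1
    A[e, idx(0, 0)] = 1                   # boundary condition: pin one pixel
    b[e] = image[0, 0]
    v = lsqr(A.tocsr(), b)[0]
    return v.reshape(h, w)
```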

2.2: Poisson Blending

Finally, we implement Poisson blending. We find new pixel values $v$ such that

$$v = \operatorname*{argmin}_{v} \sum_{i \in S,\; j \in N_i \cap S} \big((v_i - v_j) - (s_i - s_j)\big)^2 \;+\; \sum_{i \in S,\; j \in N_i \cap \partial S} \big((v_i - t_j) - (s_i - s_j)\big)^2$$

Essentially, we want to match the source gradients as closely as possible inside the region:

$$(v_i - v_j) \approx (s_i - s_j) \quad \text{for } i \in S,\; j \in N_i \cap S$$

and also match the boundary pixel values as closely as possible:

$$(v_i - t_j) \approx (s_i - s_j) \quad \text{for } i \in S,\; j \in N_i \cap \partial S$$

Here $S$ is the source region, $N_i$ is the set of four neighbors of pixel $i$, $s$ is the source image, and $t$ is the target image. These equations combined result in an overconstrained system. We can use a least squares solver to find the appropriate $v$. The results:

Blended image
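Concretely, the system can be assembled as in this sketch (grayscale, run once per color channel; it assumes the mask does not touch the image border, and the variable names are ours):

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def poisson_blend(source, target, mask):
    """Least-squares Poisson blending of source into target where mask == 1.

    All arrays are 2D floats of the same shape; mask is 0/1 and must not
    touch the image border.
    """
    inside = {p: k for k, p in enumerate(zip(*np.nonzero(mask)))}
    eqs = []  # one (sparse row, rhs) pair per equation
    for (y, x), k in inside.items():
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            grad = source[y, x] - source[ny, nx]
            if (ny, nx) in inside:   # interior neighbor: match source gradient
                eqs.append(({k: 1.0, inside[(ny, nx)]: -1.0}, grad))
            else:                    # boundary neighbor: a known target pixel
                eqs.append(({k: 1.0}, grad + target[ny, nx]))
    A = lil_matrix((len(eqs), len(inside)))
    b = np.zeros(len(eqs))
    for e, (row, rhs) in enumerate(eqs):
        for k, coef in row.items():
            A[e, k] = coef
        b[e] = rhs
    v = lsqr(A.tocsr(), b)[0]
    out = target.copy()
    for (y, x), k in inside.items():
        out[y, x] = v[k]
    return np.clip(out, 0, 1)
```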

We can also graph the errors we got:

Rescaled to show texture

Using the other sample image:

Blended image

We could consider this image a failure case, in that it changes the color of the pasted image too drastically: at the end of the process the penguin chick is not even remotely close to white anymore.

And the errors:

Rescaled to show texture

Now we have some fun:

and

And we get:

A Faster Solver (with physics (and GPUs!)!)

The naive least squares solver takes a very, very long time in some cases. In fact, for the image above with the penguin chick, the algorithm took 1420 seconds (almost 24 minutes) to solve. The question arises: can we make this any faster? The answer is yes (with a bit of physics)!

In Poisson blending, we are solving the following optimization problem:

$$v^* = \operatorname*{argmin}_{v} \; \|\nabla v - \nabla s\|_2^2$$

subject to

$$v\big|_{\partial S} = t\big|_{\partial S}$$

where $v$ are our new pixels, $\nabla s$ is the gradient of our source image $s$, $S$ is our crop region, $\partial S$ is the boundary of the crop region, and $t$ is the target image.

Essentially, we want to find new pixels that match the gradients of the source image while keeping the values of the target image on the boundaries.

Now we can write $v = s + p$, where $s$ is the original source image and $p$ is a perturbation. The idea here is that we only need to solve for the perturbation of the source image. We also have the nice identity

$$\nabla v - \nabla s = \nabla(s + p) - \nabla s = \nabla p$$

because $\nabla s$ is, by definition, the gradient of the source image. Substituting into our optimization problem, we obtain

$$p^* = \operatorname*{argmin}_{p} \; \|\nabla p\|_2^2 \quad \text{subject to} \quad p\big|_{\partial S} = (t - s)\big|_{\partial S}$$

Using some multivariate calculus voodoo, we realize that minimizing $\|\nabla p\|_2^2$ is equivalent to minimizing

$$\|\nabla \cdot \nabla p\|_2^2 = \|\nabla^2 p\|_2^2$$

(because the two-norm is the sum of squared components of a vector, and summing the components is just the divergence if the vector is a gradient)

Now the smallest value that this could possibly take on is $0$, as norms can't be negative. So we can just write

$$\nabla^2 p = 0$$

such that

$$p\big|_{\partial S} = (t - s)\big|_{\partial S}$$

and thus we have our new optimization problem. At this point it seems we've only managed to complicate things…

As it turns out, the above problem actually describes the process of diffusion (if we take $p$ to be the density of the diffusing material) with certain (Dirichlet) boundary conditions. This is a well-known problem (the Laplace equation, a special case of the Poisson equation) and can be solved using many methods. One such method is solving a linear system. However, because we know the problem models diffusion, why don't we just simulate diffusion?

As a side note, it is pretty evident from the images of the errors above that a diffusion-like process is going on:

Notice how the sides seem to be diffusing

The way we’ll simulate diffusion is by setting the boundaries of a grid to the boundary conditions. Then we will convolve the grid with the kernel

this simulates diffusion with a diffusion rate of , the following equation:

This is all well and good, but convolutions are very slow using just NumPy. Can we somehow get fast convolutions? Yes! We can take advantage of all this machine learning hype and use a GPU to calculate the convolutions. Our implementation uses PyTorch to calculate convolutions in roughly 1/1000 of the time NumPy takes. In the end, our algorithm brings the running time down from 1420 seconds to 6 seconds (and the majority of that time is spent loading tensors onto the GPU)!

Our algorithm simulating diffusion
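A sketch of the diffusion loop in PyTorch: conv2d with the averaging kernel above, re-imposing the Dirichlet boundary after every step. The iteration count is a knob, and device="cuda" assumes a GPU is available (use "cpu" otherwise):

```python
import torch
import torch.nn.functional as F

def diffuse(boundary, mask, iters=10000, device="cuda"):
    """Solve the Laplace equation for p by simulating diffusion.

    boundary: 2D tensor holding t - s on the boundary ring (zero elsewhere);
    mask: 1 strictly inside the crop region, 0 on and outside the boundary.
    """
    kernel = torch.tensor([[0.00, 0.25, 0.00],
                           [0.25, 0.00, 0.25],
                           [0.00, 0.25, 0.00]], device=device).view(1, 1, 3, 3)
    p = boundary.to(device).view(1, 1, *boundary.shape)
    mask = mask.to(device).view(1, 1, *mask.shape)
    bc = p * (1 - mask)                    # fixed Dirichlet boundary values
    for _ in range(iters):
        p = F.conv2d(p, kernel, padding=1) # one diffusion step
        p = p * mask + bc                  # clamp boundary values back
    return p.view(*boundary.shape).cpu()
```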

And the results:

Blended image

and the errors:

Errors: $p$ from the algorithm above