CS194-26 FA18 // David Xiong `(cs194-26-abr)`

In this project, we explored various ways of using frequencies and gradients to combine and blend images. We created hybrid images by combining the high- and low-frequency portions of two images, and performed multiresolution blending using our implementation of Gaussian and Laplacian stacks. We then implemented Poisson blending and used it to transpose and blend objects from a source image into a target image.

For the warmup, we sharpened an image using the unsharp masking technique. We first take the image and produce a smoothed version using a Gaussian filter, then subtract the smoothed version from the original image to isolate the details. We then multiply these details by some factor `ALPHA` and add them back to the original image to get a sharpened version. Here's a look at the process for `ALPHA = 25`:

Original | Smoothed | Details | Re-Combined |
---|---|---|---|

And here are the results for `ALPHA` = 10, 25, 60, and 120, respectively.

`alpha = 10` |
`alpha = 25` |
`alpha = 60` |
`alpha = 120` |
---|---|---|---|
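The unsharp masking steps above fit in a few lines; here's a minimal sketch assuming `scipy` and a float image in [0, 1] (the function name and default parameters are my own, not from the project code):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(im, sigma=3, alpha=25):
    """Sharpen a float image in [0, 1] by amplifying its high frequencies."""
    smoothed = gaussian_filter(im, sigma)   # low-pass version of the image
    details = im - smoothed                 # isolate the high frequencies
    return np.clip(im + alpha * details, 0, 1)  # add amplified details back
```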

For this part, the goal was to create hybrid images using the approach described in the SIGGRAPH 2006 paper by Oliva, Torralba, and Schyns. Essentially, we want to exploit the fact that humans aren't great at identifying high frequencies at long distances by isolating the high and low frequencies from two respective images and layering them on top of each other. At near distances our perception of the image will be dominated by the high frequencies, but as we move away, the low frequencies become more noticeable as we become unable to see the higher ones.

How do we accomplish this? We perform steps similar to the ones we did in Part 1.1. The low frequencies can be found by applying a Gaussian filter, and the high frequencies can be found by subtracting the low frequencies from the original image. We can vary the sigma values we pass into the low and high pass filter functions to achieve different blends of the two images.
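These two filters can be sketched as follows; a minimal single-channel version assuming `scipy`, with hypothetical function names and no alignment step:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(im_low, im_high, sigma_low, sigma_high):
    """Combine the low frequencies of one image with the highs of another."""
    low = gaussian_filter(im_low, sigma_low)               # low-pass filter
    high = im_high - gaussian_filter(im_high, sigma_high)  # high-pass filter
    return np.clip(low + high, 0, 1)
```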

Here's my favorite result: the low frequencies of my friend Eva's portrait (`sigma = 2`) blended with the high frequencies of a tiger (`sigma = 45`).

This is the process that I used to achieve this result:

Source (Eva) | Low Frequencies | Fourier Transform |
---|---|---|

Source (Tiger) | High Frequencies | Fourier Transform |
---|---|---|

Here's another example of a good hybrid image. This one is the low frequencies of Prof. Efros' portrait (`sigma = 3`) blended with the high frequencies of Prof. Hilfinger's portrait (`sigma = 15`).

Source (Efros) | Low Frequencies | Fourier Transform |
---|---|---|

Source (Hilfinger) | High Frequencies | Fourier Transform |
---|---|---|

However, not all the hybrid images I processed turned out well. Take this image made up of two pictures of me (`sigmas = (3, 4)`). It's very hard to tell the difference between the low and high frequencies once blended, because the faces in the two images are so similar that there aren't enough distinctive features to differentiate the two.

Source (San Francisco) | Low Frequencies | Fourier Transform |
---|---|---|

Source (Yosemite) | High Frequencies | Fourier Transform |
---|---|---|

For this part, I implemented a method to extract Gaussian and Laplacian stacks from an image. We obtain the Gaussian stack by repeatedly applying a Gaussian filter to the image, and the Laplacian stack by subtracting each Gaussian level from the level before it.
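A minimal sketch of both stacks (names and defaults are my own assumptions); note that keeping the final Gaussian level at the bottom of the Laplacian stack lets the stack sum back to the original image:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, n=5, sigma=2):
    """Repeatedly blur the image; unlike a pyramid, no downsampling."""
    stack = [im]
    for _ in range(n - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(gstack):
    """Each level is the difference between consecutive Gaussian levels;
    the last Gaussian level is kept so the stack sums to the original."""
    return [g0 - g1 for g0, g1 in zip(gstack, gstack[1:])] + [gstack[-1]]
```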

Here are two examples, one of Dalí's Gala Contemplating and the other of Da Vinci's Mona Lisa.

Gaussian (N=1) | Gaussian (N=2) | Gaussian (N=3) | Gaussian (N=4) | Gaussian (N=5) |
---|---|---|---|---|

Laplacian (N=1) | Laplacian (N=2) | Laplacian (N=3) | Laplacian (N=4) | Laplacian (N=5) |
---|---|---|---|---|

Gaussian (N=1) | Gaussian (N=2) | Gaussian (N=3) | Gaussian (N=4) | Gaussian (N=5) |
---|---|---|---|---|

Laplacian (N=1) | Laplacian (N=2) | Laplacian (N=3) | Laplacian (N=4) | Laplacian (N=5) |
---|---|---|---|---|

Here, I've calculated the Gaussian and Laplacian stacks for one of my hybrid images shown above.

Gaussian (N=1) | Gaussian (N=2) | Gaussian (N=3) | Gaussian (N=4) | Gaussian (N=5) |
---|---|---|---|---|

Laplacian (N=1) | Laplacian (N=2) | Laplacian (N=3) | Laplacian (N=4) | Laplacian (N=5) |
---|---|---|---|---|

The higher up the Gaussian stack we go, the more the low frequencies are emphasized, and thus the more prominent Eva's face becomes. The tiger, however, is most prominent at the lower levels of the Laplacian stack. Since the Laplacian stack is calculated by taking the difference between levels in the Gaussian stack, it makes sense that the most noticeable high frequencies would be captured in the first few layers.

Now, we can apply our implementation of Gaussian and Laplacian stacks to perform multiresolution blending, as described in Burt and Adelson's 1983 paper. First, we define a mask region and calculate its Gaussian stack \(G_m\), then we calculate the Laplacian stacks of our source and target images as \(L_s\) and \(L_t\) respectively. We then combine each layer of the stacks to form a custom Laplacian stack \(L\) according to \(L(i,j) = G_m(i,j)L_s(i,j) + (1-G_m(i,j))L_t(i,j)\). Adding all the layers together gives us our final blended image.
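Putting the formula above together with the stack code, here's a minimal single-channel sketch (helper names and parameters are my own assumptions, not the project's API):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, n=5, sigma=2):
    stack = [im]
    for _ in range(n - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(gstack):
    # Last Gaussian level is kept so the stack sums back to the image.
    return [g0 - g1 for g0, g1 in zip(gstack, gstack[1:])] + [gstack[-1]]

def multires_blend(source, target, mask, n=5, sigma=2):
    """L(i) = G_m(i) * L_s(i) + (1 - G_m(i)) * L_t(i), summed over levels."""
    gm = gaussian_stack(mask, n, sigma)
    ls = laplacian_stack(gaussian_stack(source, n, sigma))
    lt = laplacian_stack(gaussian_stack(target, n, sigma))
    return sum(m * s + (1 - m) * t for m, s, t in zip(gm, ls, lt))
```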

Here's a recreation of the apple/orange example provided in the paper:

Source | Target |
---|---|

Apple Stack (N=1) | Apple Stack (N=2) | Apple Stack (N=3) | Apple Stack (N=4) | Apple Stack (N=5) |
---|---|---|---|---|

Orange Stack (N=1) | Orange Stack (N=2) | Orange Stack (N=3) | Orange Stack (N=4) | Orange Stack (N=5) |
---|---|---|---|---|

We can also specify a custom mask, allowing us to blend custom regions together. Rather than hard-coding a mask covering half the image, we can define a region as an image, and utilize the same technique as seen above.

Source | Target | Mask |
---|---|---|

Source | Target | Mask |
---|---|---|

Source | Target | Mask |
---|---|---|

This part is where the real fun begins! The goal here was to understand and implement gradient-domain processing; we attempt to maintain the colors of the target image while following the gradients of the source.

Turns out, we can model this as a constrained optimization problem. Here's the equation for Poisson blending, where \(i\) is a pixel in source region \(S\) and \(j\) ranges over the neighbors \(N_i\) of \(i\).

\(\textbf{v} = \underset{v}{\operatorname{argmin}} \sum_{i \in S,\, j \in N_i \cap S} \big((v_i - v_j) - (s_i - s_j)\big)^2 + \sum_{i \in S,\, j \in N_i \cap \neg S} \big((v_i - t_j) - (s_i - s_j)\big)^2\)

But first, a proof of concept. We are given a toy problem where we must recover a grayscale image given the color of the pixel at (0,0) and the gradients of the image. We can use Poisson blending for this - we set up the constraints as a system of linear equations (\(Ax = b\)), using a sparse matrix to represent \(A\). We'll also need a helper matrix, `im2var`, to map every pixel to an index; this helps us match the x- and y-gradients of the values we solve for to those of the source image. Once we've done that, we can use a least-squares solver such as `scipy.sparse.linalg.lsqr` to solve for \(x\).
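The setup might look like this; a sketch under my own naming, with one equation per x- and y-gradient plus one anchoring the top-left intensity:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

def toy_reconstruct(s):
    """Recover a grayscale image from its gradients plus one known pixel."""
    h, w = s.shape
    im2var = np.arange(h * w).reshape(h, w)  # map each pixel to an unknown
    rows, cols, vals, b = [], [], [], []
    eq = 0
    for y in range(h):                       # x-gradient constraints
        for x in range(w - 1):
            rows += [eq, eq]; cols += [im2var[y, x + 1], im2var[y, x]]
            vals += [1.0, -1.0]; b.append(s[y, x + 1] - s[y, x]); eq += 1
    for y in range(h - 1):                   # y-gradient constraints
        for x in range(w):
            rows += [eq, eq]; cols += [im2var[y + 1, x], im2var[y, x]]
            vals += [1.0, -1.0]; b.append(s[y + 1, x] - s[y, x]); eq += 1
    rows.append(eq); cols.append(im2var[0, 0])   # anchor pixel (0, 0)
    vals.append(1.0); b.append(s[0, 0]); eq += 1
    A = sp.csr_matrix((vals, (rows, cols)), shape=(eq, h * w))
    return lsqr(A, np.array(b))[0].reshape(h, w)
```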

Here are the final results:

Source | Reconstructed |
---|---|

Now, it's just a matter of generalizing our implementation of the toy problem to larger images and irregular masks. We need to do some more work when initializing \(A\) and \(b\) (and `im2var`, by extension) since we need to take the mask into account, but the general steps are the same.
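A single-channel sketch of that generalization (the structure is my own assumption; it also assumes the mask does not touch the image border, so every masked pixel has four in-bounds neighbors):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

def poisson_blend(source, target, mask):
    """Solve for pixel values inside the mask that follow the source
    gradients while agreeing with the target at the mask boundary."""
    h, w = target.shape
    idx = -np.ones((h, w), dtype=int)
    ys, xs = np.nonzero(mask)
    idx[ys, xs] = np.arange(len(ys))          # im2var for masked pixels only
    rows, cols, vals, b = [], [], [], []
    eq = 0
    for y, x in zip(ys, xs):
        for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            ny, nx = y + dy, x + dx
            grad = source[y, x] - source[ny, nx]
            rows.append(eq); cols.append(idx[y, x]); vals.append(1.0)
            if mask[ny, nx]:                  # neighbor is also an unknown
                rows.append(eq); cols.append(idx[ny, nx]); vals.append(-1.0)
                b.append(grad)
            else:                             # neighbor is a known target pixel
                b.append(grad + target[ny, nx])
            eq += 1
    A = sp.csr_matrix((vals, (rows, cols)), shape=(eq, len(ys)))
    v = lsqr(A, np.array(b))[0]
    out = target.copy()
    out[ys, xs] = v
    return out
```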

This is my favorite blending result; my implementation was able to fairly accurately match the color of the lighting on the source image cutout to that of the target image.

Source | Target |
---|---|

Mask | Naive Result |
---|---|

Here are some additional blending results:

Source | Target | Mask |
---|---|---|

Source | Target | Mask |
---|---|---|

Source | Target | Mask |
---|---|---|

Source | Target | Mask |
---|---|---|

Here's an example of a blend that didn't turn out too well:

Source | Target | Mask |
---|---|---|

The reason this blend turned out badly is that the backgrounds of the two images are very different. The areas around the mask in both the source and the target are also quite detailed, and as a result there is a visible halo around the subject in the final blended image.

Finally, let's compare two results we got from multiresolution blending with the ones from Poisson blending:

Multiresolution Blending | Poisson Blending |
---|---|

Multiresolution Blending | Poisson Blending |
---|---|

We can see here that multiresolution blending gives us a more seamless blend, but Poisson blending preserves our target image's color. Poisson blending works best if the target and source images share a similar background texture and color.

Writeup by David Xiong, for CS194-26 FA18 Image Manipulation and Computational Photography