Andrew Millman, CS194-26 Project 3

This project involved playing with frequencies and using filters and gradients to blend images together. The techniques covered include image sharpening, hybrid image creation, Gaussian and Laplacian stacks, multiresolution blending, and Poisson blending.

Part 1.1: Warm up

To sharpen an image, we apply a Gaussian (low-pass) filter to the image and subtract the blurred result from the original. This subtraction acts as a high-pass filter, producing what is called the detail image: it contains only the high frequencies. Adding the detail image back into the original creates a "sharp" effect; this technique is known as unsharp masking.
chicken.jpg
Original image
detail_chicken.jpg
High pass filter / detail of the chicken
sharp_chicken.jpg
Sharpened image of the chicken (original + detail)
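The whole pipeline fits in a few lines. Below is a minimal sketch using scipy.ndimage.gaussian_filter; the sigma and alpha values are illustrative, not necessarily the ones I used:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpen(img, sigma=2.0, alpha=1.0):
    """Unsharp masking: add the high-frequency detail back to the image."""
    blurred = gaussian_filter(img, sigma)   # low-pass version of the image
    detail = img - blurred                  # high frequencies only
    return np.clip(img + alpha * detail, 0.0, 1.0)

# toy example: a horizontal intensity ramp (values in [0, 1])
img = np.linspace(0.0, 1.0, 64).reshape(1, -1).repeat(8, axis=0)
sharp = sharpen(img)
```

Increasing alpha exaggerates the detail image and makes the sharpening stronger.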

Part 1.2: Hybrid Images

In this part, we try to create "hybrid images", which combine two aligned images: the high frequencies of one and the low frequencies of the other. I applied a high-pass filter to the first image and a low-pass filter to the second, then summed the results; up close the high-frequency image dominates, while from a distance the low-frequency image takes over. The hybrid image for the Derek/Nutmeg example is below:
derek.jpg
nutmeg.jpg
derek_nutmeg.jpg
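In code, a hybrid image is just a sum of a high-pass and a low-pass image. A rough sketch (the cutoff sigmas here are placeholders; in practice they are tuned per image pair):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(im_high, im_low, sigma_high=5.0, sigma_low=5.0):
    # High frequencies of the first image (original minus its blur)...
    high = im_high - gaussian_filter(im_high, sigma_high)
    # ...plus the low frequencies of the second (just its blur).
    low = gaussian_filter(im_low, sigma_low)
    return np.clip(high + low, 0.0, 1.0)

# synthetic stand-ins for two aligned grayscale images
rng = np.random.default_rng(0)
h = hybrid(rng.random((64, 64)), rng.random((64, 64)))
```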

Hybrid Examples

Below is an example with Fourier analysis.
putin.jpg
High pass component
Log magnitude of the original Putin FFT
Log magnitude of the high pass Putin FFT
obama.jpg
Low pass component
Log magnitude of the original Obama FFT
Log magnitude of the low pass Obama FFT
putin_obama.jpg
Hybrid image
Log magnitude of the hybrid FFT
Below is another successful example.
mars.jpg
High pass
coin.jpg
Low pass
mars_coin.jpg
Hybrid

Failure Case

Below is a failure case between a cat and a hamster. It failed mainly because the hamster image has barely any high frequencies, so the low-pass filter washed almost everything out; the cat is just the opposite, with many high frequencies.
cat.jpg
High pass
hamster.jpg
Low pass
cat_hamster.jpg
Hybrid

Part 1.3: Gaussian and Laplacian Stacks

In this part, we build Gaussian and Laplacian stacks. A Gaussian stack is defined as GS(i) = GAUSSIAN(img, k*i), so the i-th element of the stack is the original image blurred by a Gaussian filter whose width is proportional to its index (GS(0) is the original image itself). A Laplacian stack is defined as LS(i) = GS(i) - GS(i-1): the difference between the Gaussian stack element at a position and the one preceding it, which isolates (up to sign) the band of frequencies removed at that blur step.
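The stacks can be built directly from these definitions. A short sketch (the factor k and the number of levels are illustrative); note that the Laplacian stack telescopes, so GS(0) plus the sum of all Laplacian levels recovers the most-blurred Gaussian level:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(img, levels, k=2.0):
    # GS(i) = img blurred with sigma = k*i; GS(0) is the original image.
    return [img if i == 0 else gaussian_filter(img, k * i)
            for i in range(levels)]

def laplacian_stack(gs):
    # LS(i) = GS(i) - GS(i-1), for i = 1 .. levels-1.
    return [gs[i] - gs[i - 1] for i in range(1, len(gs))]

rng = np.random.default_rng(0)
img = rng.random((32, 32))
gs = gaussian_stack(img, levels=4)
ls = laplacian_stack(gs)
# the differences telescope: GS(0) + sum of all LS levels = GS(levels-1)
recon = gs[0] + sum(ls)
```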

Gaussian Lincoln Stack


Laplacian Lincoln Stack


Gaussian Putin/Obama Stack


Laplacian Putin/Obama Stack


Part 1.4: Multiresolution Blending

In this part, we smoothly blend two objects together using multiresolution blending with masks. This means computing the Laplacian stacks of both images and the Gaussian stack of the mask. Then, at each level i of the stack, the i-th mask level is used to combine the two images: CS(i) = LS1(i)*(1 - GR(i)) + LS2(i)*GR(i), where GR(i) is the i-th level of the mask's Gaussian stack. The elements of the combined stack CS are then summed to reconstruct the final blended image.
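A sketch of the blending loop. It uses the usual finer-minus-coarser sign convention for the Laplacian bands (equivalent to the definition in Part 1.3 up to sign), with the coarsest Gaussian level appended so each stack sums back to its image; levels and k are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend(im1, im2, mask, levels=4, k=2.0):
    # Gaussian stacks of both images and of the mask (level 0 = unblurred).
    gs1 = [im1] + [gaussian_filter(im1, k * i) for i in range(1, levels)]
    gs2 = [im2] + [gaussian_filter(im2, k * i) for i in range(1, levels)]
    gm = [mask] + [gaussian_filter(mask, k * i) for i in range(1, levels)]
    # Laplacian bands (finer minus coarser), plus the coarsest Gaussian
    # level, so each stack sums exactly back to its original image.
    ls1 = [gs1[i] - gs1[i + 1] for i in range(levels - 1)] + [gs1[-1]]
    ls2 = [gs2[i] - gs2[i + 1] for i in range(levels - 1)] + [gs2[-1]]
    # Blend each band with the correspondingly blurred mask, then sum.
    out = sum(l1 * (1 - m) + l2 * m for l1, l2, m in zip(ls1, ls2, gm))
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
im1, im2 = rng.random((32, 32)), rng.random((32, 32))
blended = blend(im1, im2, np.ones((32, 32)))  # all-ones mask picks im2
```

The blurred mask is what makes the seam soft: coarse bands are mixed over a wide transition region, fine bands over a narrow one.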

Mask/Apple/Orange


Mask/Mars/Earth


Mask/John Cena/Alexei Efros


Part 2: Gradient Domain Fusion

This part explores more advanced blending techniques based on gradients. We learned how to break an image down into its x and y gradients and reconstruct it by solving a system of equations, since the gradient at each pixel is known. This is a powerful technique: we can take a source image's gradients, place them over a target image, and use the target image's information to reconstruct the source in the context of the target.

Part 2.1: Toy Problem

In this part, we reconstruct an image from its gradient data. As described in the project spec, we solve for an image v such that the difference between the gradients of v and the gradients of the source image s is minimized.
toy.png
Original
reconstructed_toy.png
Reconstructed
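A sketch of the least-squares setup: one equation per x-gradient, one per y-gradient, plus one equation pinning a corner pixel so the solution's overall brightness matches the source. (The explicit loops here are for clarity; a real implementation would vectorize the matrix construction.)

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def reconstruct(s):
    """Solve for v minimizing ||grad(v) - grad(s)||^2, pinning one pixel."""
    h, w = s.shape
    idx = np.arange(h * w).reshape(h, w)   # pixel -> unknown index
    n_eq = h * (w - 1) + (h - 1) * w + 1
    A = lil_matrix((n_eq, h * w))
    b = np.zeros(n_eq)
    e = 0
    for y in range(h):                      # x-direction gradients
        for x in range(w - 1):
            A[e, idx[y, x + 1]] = 1
            A[e, idx[y, x]] = -1
            b[e] = s[y, x + 1] - s[y, x]
            e += 1
    for y in range(h - 1):                  # y-direction gradients
        for x in range(w):
            A[e, idx[y + 1, x]] = 1
            A[e, idx[y, x]] = -1
            b[e] = s[y + 1, x] - s[y, x]
            e += 1
    A[e, idx[0, 0]] = 1                     # pin the top-left pixel
    b[e] = s[0, 0]
    return lsqr(A.tocsr(), b)[0].reshape(h, w)

rng = np.random.default_rng(0)
s = rng.random((8, 8))
v = reconstruct(s)   # should match s up to solver tolerance
```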

Part 2.2: Poisson Blending

In this part, we use the provided equation to blend a source image into the target image. The logic is similar to Part 2.1 in that we set up the system Ax = b, but when building b, instead of setting each entry equal to a single directional gradient, we set it equal to the sum of the gradients around a given pixel. If a pixel is not within the mask, we constrain its value to the target's pixel value at that position. This is how the target gives context to the source and forces it to adjust its tone: the solver minimizes error by dispersing it among all the other pixels. Below is the result of the penguin example:
Source/Target/Simple concatenation/Poisson Blending
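A single-channel sketch of the linear system described above; it assumes the mask does not touch the image border, so every masked pixel has four in-bounds neighbors:

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

def poisson_blend(source, target, mask):
    """4*v(p) - sum of masked neighbors v(q) = sum of source gradients
    (s(p) - s(q)) + target values of unmasked neighbors; outside the
    mask, v is fixed to the target. Assumes mask avoids the border."""
    h, w = target.shape
    idx = np.arange(h * w).reshape(h, w)
    A = lil_matrix((h * w, h * w))
    b = np.zeros(h * w)
    for y in range(h):
        for x in range(w):
            i = idx[y, x]
            if mask[y, x]:
                A[i, i] = 4
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    b[i] += source[y, x] - source[ny, nx]
                    if mask[ny, nx]:
                        A[i, idx[ny, nx]] = -1
                    else:           # boundary: known target value
                        b[i] += target[ny, nx]
            else:                   # outside the mask: copy the target
                A[i, i] = 1
                b[i] = target[y, x]
    return spsolve(A.tocsc(), b).reshape(h, w)

rng = np.random.default_rng(0)
target = rng.random((8, 8))
source = target.copy()              # identical images: blend is a no-op
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True
out = poisson_blend(source, target, mask)
```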

Favorite result: John Cena on Moon

Source
Target
Simple Concatenation
Poisson Blending

Another successful result:

Source
Target
Simple Concatenation
Poisson Blending

Failure case:

I think this one didn't work out as well because the color of the skyscraper wasn't preserved. The bird image was more forgiving because the bird is supposed to be underwater, so we expect some color distortion; here, however, the result just looks awkward. The sand also bleeds into the blended region and creates an unnatural effect.
Source
Target
Simple Concatenation
Poisson Blending

Multiresolution vs. Poisson

I tried to redo the John Cena face on Alexei Efros's head with Poisson blending, and I think both results are pretty close. The Poisson blend looks a bit more natural: multiresolution blending does a simple gradual fade, so features from one face slowly bleed into the other, while Poisson blending adapts the source to the target's context and avoids that issue.
Cena
Alexei
John Efros (Multiresolution)
John Efros (Poisson)