Fun with Frequencies and Gradients!
cs194-26-aea
In this project, we play around with two different transformations of images
for a variety of different ends.
Part 1: Frequency Domain
For this part of the project, we take a cue from signal processing and manipulate our images
in the frequency domain. First we look at the simple trick
of using the Laplacian of an image (its high-frequency detail) to sharpen it.
Next, we create hybrid images by combining the high frequency components of one image with the low
frequency components of another image.
Finally, we use Gaussian and Laplacian stacks to blend two images seamlessly through a technique known
as multi-resolution blending.
1.1 Warmup: Image Sharpening
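The sharpening trick boils down to unsharp masking: blur the image, subtract the blur to isolate the high frequencies, and add a scaled copy of them back. A minimal sketch (the `sigma` and `alpha` values here are illustrative, not the ones used for the results on this page):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(img, sigma=2.0, alpha=1.0):
    """Sharpen by adding back the high frequencies (image minus its blur)."""
    low = gaussian_filter(img, sigma=sigma)   # low-pass via Gaussian blur
    high = img - low                          # high-frequency (Laplacian-like) detail
    return np.clip(img + alpha * high, 0.0, 1.0)
```

Larger `alpha` exaggerates edges more aggressively; too much starts to amplify noise.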
1.2 Hybrid Images
Hybrid images are a technique developed by
Oliva et al. that exploits how the human perceptual system works. The basic realization is that at close range, humans easily perceive
the high-frequency components of an image, while at a distance, they perceive only the low-frequency
components.
As a result, we can make a single image that reads as two different images at different viewing distances
by taking the high-frequency components of one image and combining them with the low-frequency
components of another image.
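In code this is just a low-pass and a high-pass filter added together. A minimal sketch using Gaussian filtering (the two cutoff sigmas are illustrative; in practice they are tuned per image pair):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(im_high, im_low, sigma_high=5.0, sigma_low=7.0):
    """Combine the high frequencies of im_high with the low frequencies of im_low."""
    high = im_high - gaussian_filter(im_high, sigma=sigma_high)  # high-pass
    low = gaussian_filter(im_low, sigma=sigma_low)               # low-pass
    return np.clip(high + low, 0.0, 1.0)
```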
Did she just wink at me?
High Freq
Low freq
Hybrid up close
Hybrid (at a distance)
Model T and Model S
High Freq
Low freq
Hybrid up close
Hybrid (at a distance)
Doggo and Pupper
High Freq
Low freq
Hybrid up close
Hybrid (at a distance)
Fourier Analysis of Doggo and Pupper
Doggo in Frequency Domain
Puppy in Frequency Domain
Cory and Soda (Failure Case)
High Freq
Low freq
Hybrid up close
Hybrid (at a distance)
As you can see in the blended image, it's fairly easy to distinguish Soda even when occluded by
Cory. This is because the occlusion is not complete and the angles of the buildings do not line up, leaving
a large area of high frequencies without a matching background. This kind of mismatch was discussed as a failure case in
Oliva et al.
Multiresolution Blending
No blending
Seamless blend
The frequency domain provides a nice tool for smooth blending. If we wanted to blend two images together,
a naive approach might be to simply copy and paste one image on top of the other. However, this results
in pretty terrible-looking images with a very obvious seam in the center. Instead, we would
like to combine these images automatically without any obvious seam.
Multiresolution blending allows us to do this via Laplacian and Gaussian stacks/pyramids.
A Laplacian stack is essentially a decomposition of an image into several frequency bands.
Multiresolution blending merges the two images band by band, blurring the mask boundary by a different amount at each frequency.
The blended bands are then summed for the final result.
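A sketch of the stack construction and the band-by-band blend (the number of levels and the blur sigma are illustrative choices, not necessarily the ones used for the results below):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(img, levels, sigma=2.0):
    """Repeatedly blur without downsampling (a stack rather than a pyramid)."""
    stack = [img]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma=sigma))
    return stack

def laplacian_stack(img, levels, sigma=2.0):
    """Band-pass layers (differences of successive blurs) plus the low-pass residual."""
    g = gaussian_stack(img, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

def multires_blend(im1, im2, mask, levels=5, sigma=2.0):
    """Blend each frequency band with a progressively blurred mask, then sum."""
    l1 = laplacian_stack(im1, levels, sigma)
    l2 = laplacian_stack(im2, levels, sigma)
    gm = gaussian_stack(mask.astype(float), levels, sigma)
    blended = [m * a + (1 - m) * b for a, b, m in zip(l1, l2, gm)]
    return np.clip(sum(blended), 0.0, 1.0)
```

The key point is that the mask itself comes from a Gaussian stack, so low frequencies get a wide, soft transition while high frequencies get a sharp one.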
BearTiger
Source Image 1
Source Image 2
Naive copy
Blended Image
Catface
Source Image 1
Source Image 2
Naive copy
Blended Image
We can also analyze the blended image across its different frequency bands.
Hand-nose (failure case)
Source Image 1
Source Image 2
Naive copy
Blended Image
This failure case likely occurred because frequency-domain blending softens the boundary of the mask and then
effectively takes a weighted average in the higher frequencies of the image. With all the white space in the
masked region, the algorithm picks up too many misleading color clues and produces an unintended faded
image.
Part 2: Gradient Domain
The gradient domain provides an alternative method of blending that also produces very realistic results.
The basic idea is that you take the gradients from a source image and paste them into a new image, which acts as the
source of lighting. We copy the gradients from the source image in the masked area and take color clues from the target image
at the pixels on the boundary of the mask. We then solve an optimization problem that minimizes the difference between
the gradients of the source and of the new image while balancing the colors of the surrounding areas.
Toy Problem
As a test of this idea, we first run a toy implementation: we calculate the gradient at each pixel
with respect to the pixel immediately above and the pixel immediately to the right, use a color clue from
a single pixel, and then solve a least-squares problem under these constraints to reconstruct the image.
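A sketch of the toy reconstruction, building one least-squares row per gradient constraint plus the single intensity clue (I use scipy's sparse `lsqr` here; the structure follows the description above, but the function name is my own):

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def toy_reconstruct(img):
    """Recover a grayscale image from its x/y gradients plus one pixel's intensity."""
    h, w = img.shape
    n = h * w
    idx = lambda y, x: y * w + x      # flatten 2D pixel coordinates to a column index
    rows = []                         # each entry: (list of (col, coef), target value)
    for y in range(h):                # horizontal gradient constraints
        for x in range(w - 1):
            rows.append(([(idx(y, x + 1), 1), (idx(y, x), -1)],
                         img[y, x + 1] - img[y, x]))
    for y in range(h - 1):            # vertical gradient constraints
        for x in range(w):
            rows.append(([(idx(y + 1, x), 1), (idx(y, x), -1)],
                         img[y + 1, x] - img[y, x]))
    rows.append(([(idx(0, 0), 1)], img[0, 0]))  # single intensity "color" clue
    A = lil_matrix((len(rows), n))
    b = np.zeros(len(rows))
    for i, (coefs, val) in enumerate(rows):
        for col, c in coefs:
            A[i, col] = c
        b[i] = val
    v = lsqr(A.tocsr(), b)[0]
    return v.reshape(h, w)
```

Because the gradients come from the image itself, the reconstruction should match the original almost exactly, which makes this a good sanity check before tackling real blending.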
Toy Image
Single "color" constraint
Reconstructed Image
Poisson Blending
Now we get into the meaty stuff. Poisson blending is a gradient-domain blending method that extends the toy
problem to color images and masked regions.
The basic premise is that you calculate the gradients at each pixel inside the mask and minimize the difference
between those gradients and the gradients of the result inside the mask. You also add a constraint for the boundary pixels, tying them
to the neighboring pixels just outside the mask boundary, $\partial\Omega$. This boils down to the following:
for each pixel $p$, with a neighborhood $N_p$ (the pixels immediately above, below, left, and right of $p$ that are also in the mask),
we satisfy two constraints:
$$ |N_p|f_p - \sum_{q \in N_p} f_q = \sum_{q \in N_p} v_{pq}$$
and
$$ f_p = \sum_{q \in N_p \cap \partial \Omega} f^*_q $$
We then use least squares to find the pixel values $\hat{f}$ that minimize the error in these equations.
I originally tried to optimize the runtime of this algorithm by calculating these constraints only on the borders;
however, after days of frustration I decided to actually follow the old CS mantra of avoiding premature optimization,
and ended up with a reasonably fast algorithm for the images I was using (typically under 1000px in width and height).
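For a single color channel, the constraints above can be assembled into a sparse linear system with one row per masked pixel. This sketch follows the standard Poisson formulation (not necessarily the exact variant I ended up using, as described below the first result); the helper name is my own:

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

def poisson_blend_channel(src, tgt, mask):
    """Solve the discrete Poisson equation for one color channel.

    For each masked pixel p: |N_p| f_p minus its unknown in-mask neighbors
    equals the summed source gradients v_pq plus any known target values
    just outside the mask boundary.
    """
    h, w = tgt.shape
    ys, xs = np.nonzero(mask)
    index = {(y, x): i for i, (y, x) in enumerate(zip(ys, xs))}
    n = len(ys)
    A = lil_matrix((n, n))
    b = np.zeros(n)
    for i, (y, x) in enumerate(zip(ys, xs)):
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            q = (y + dy, x + dx)
            if not (0 <= q[0] < h and 0 <= q[1] < w):
                continue
            A[i, i] += 1
            b[i] += src[y, x] - src[q]      # source gradient v_pq
            if q in index:
                A[i, index[q]] -= 1         # unknown interior neighbor
            else:
                b[i] += tgt[q]              # known boundary value from the target
    f = spsolve(A.tocsr(), b)
    out = tgt.copy()
    out[ys, xs] = np.clip(f, 0.0, 1.0)
    return out
```

Running this once per RGB channel gives the full color result; the system is sparse, so even the unoptimized version stays tractable at the image sizes above.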
On an alien planet
Foreground Image
Background Image
Naive Copy and Paste
Blending result
The above result was the light at the end of a very, very dark tunnel. I had played around for hours upon hours with
different constraints until I finally tried using just the sum of the target pixels on their own. It's definitely a step away
from the equations provided in the project spec and the original paper, yet this method appears to work!
Mini-surfer on a calm lake
Foreground Image
Background Image
Naive Copy and Paste
Blending result
Lake Latte
Foreground Image
Background Image
Naive Copy and Paste
Blending result
A failure case: the edge blending is pretty bad, and the color contrast between the source and target
images leaves some lingering elements of the coffee's original brown.
Now for the question of the hour: how does multiresolution blending compare with our new gradient-based Poisson
blending method? We grab our beartiger image from the previous example for reference and try
blending it with Poisson.
Bear Tiger comparison of Multiresolution and Poisson Blending
Foreground Image
Background Image
Naive Copy and Paste
Multiresolution
Poisson
As you can see, multiresolution blending produced the best result. There are a number of possible causes, but based on
my experience in these projects, I believe that for image patches, Laplacian blending outshines Poisson.
Typically for patches, we worry that the textures of the two images might not match up when we try to blend them.
The Laplacian stack method blends the high frequencies across the boundary, creating a subtle transition of textures
between images. Poisson blending, on the other hand, tries to preserve the gradients as much as possible but cares less about
color. I believe that's why the seams are much more apparent in the Poisson result: the gradients of the two images clash.
It's possible that using a mixed-gradients approach might fix this issue, but I did not have the time
to evaluate it.