Fun with Frequencies and Gradients!

Phillip Kuznetsov

cs194-26-aea

In this project, we play around with two different transformations of images for a variety of ends.

Part 1: Frequency Domain

For this part of the project, we take a cue from signal processing and manipulate our images using tricks in the frequency domain. First we look at the very simple trick of using the Laplacian of an image to sharpen it. Next, we create hybrid images by combining the high-frequency components of one image with the low-frequency components of another. Finally, we use Gaussian and Laplacian stacks to blend two images seamlessly through a technique known as multiresolution blending.

1.1 Warmup: Image Sharpening

Original Image (Greyscale)
Laplacian Image
Sharpened Image
Original image vs sharpened image.
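For reference, a minimal sketch of the sharpening trick (the Laplacian layer is taken as the image minus its Gaussian blur, then added back with a weight); the sigma and weight below are illustrative choices, not necessarily the values used for the result above. A greyscale float image in [0, 1] is assumed.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpen(im, sigma=2.0, alpha=1.0):
    """Unsharp masking: add the Laplacian (detail) layer back onto the image."""
    blurred = gaussian_filter(im, sigma)   # low-pass version of the image
    laplacian = im - blurred               # high-frequency detail layer
    return np.clip(im + alpha * laplacian, 0, 1)
```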

1.2 Hybrid Images

Hybrid images are a development by Oliva et al. that exploits how the human perceptual system works. The basic realization is that up close, humans easily perceive the high-frequency components of an image, while at a distance, they perceive only the low-frequency components. As a result, we can make an image that reads as two different images at different viewing distances by taking the high-frequency components of one image and combining them with the low-frequency components of another.
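A rough sketch of this construction, assuming two aligned greyscale float images, with im1 supplying the high frequencies and im2 the low frequencies; the cutoff sigmas here are arbitrary placeholders rather than the values used for the results below.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(im1, im2, sigma_high=4.0, sigma_low=8.0):
    """Combine the high frequencies of im1 with the low frequencies of im2."""
    high = im1 - gaussian_filter(im1, sigma_high)  # high-pass = image minus its blur
    low = gaussian_filter(im2, sigma_low)          # low-pass = Gaussian blur
    return np.clip(high + low, 0, 1)
```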

Did she just wink at me?

High Freq
Low freq
Hybrid up close
Hybrid (at a distance)

Model T and Model S

High Freq
Low freq
Hybrid up close
Hybrid (at a distance)

Doggo and Pupper

High Freq
Low freq
Hybrid up close
Hybrid (at a distance)

Fourier Analysis of Doggo and Pupper

Doggo in Frequency Domain
Pupper in Frequency Domain
High frequencies of Doggo
Low frequencies of Pupper
Composite
We can observe the properties of the frequency domain from these results. For the high-pass filter, there is a noticeable black dot in the center of the image; increasing the size of the low-pass filter used to build the high-pass filter controls the size of that center dot. Conversely, the low-pass filter produces a frequency representation with very few values away from the center, as expected. What's nice, however, is that the hybrid image has an even distribution, similar to that of the unfiltered images.
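Frequency-domain images like these can be produced with the usual log-magnitude visualization of the 2D Fourier transform; a minimal sketch, assuming a greyscale float image:

```python
import numpy as np

def fourier_magnitude(im):
    """Log-magnitude spectrum with the zero frequency shifted to the center."""
    spectrum = np.fft.fftshift(np.fft.fft2(im))
    return np.log(np.abs(spectrum) + 1e-8)  # small epsilon avoids log(0)
```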

Cory and Soda (Failure Case)

High Freq
Low freq
Hybrid up close
Hybrid (at a distance)
As you can see in the blended image, it's fairly easy to distinguish Soda even when it is occluded by Cory. This is because the occlusion is not complete and the angles of the buildings do not line up, leaving a large region of high-frequency content without a matching background. This kind of misalignment was discussed as a failure case by Oliva et al.

1.3 Multiresolution Blending

No blending
Seamless blend
The magical Oraple
The frequency domain provides a nice tool for smooth blending. If we wanted to blend two images together, a naive approach would be to simply copy and paste one image on top of the other. However, this produces a pretty terrible-looking result with a very obvious seam down the middle. Instead, we would like to combine the images without any obvious seam. Multiresolution blending lets us do this via Laplacian and Gaussian stacks/pyramids. A Laplacian stack is essentially a decomposition of an image into different bands of frequencies. Multiresolution blending merges the two images band by band, blurring the mask boundary at each frequency, and then sums the merged bands for the final result.
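A sketch of how the stacks and the blend fit together, assuming two aligned greyscale float images A and B and a mask of the same size; the number of levels, the sigma, and the function names are my own illustrative choices.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, levels=5, sigma=2.0):
    """Repeatedly blur without downsampling (a stack, not a pyramid)."""
    stack = [im]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(im, levels=5, sigma=2.0):
    """Differences of successive Gaussian-stack levels, plus the final blurred level."""
    g = gaussian_stack(im, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

def blend(A, B, mask, levels=5, sigma=2.0):
    """Multiresolution blending: merge each band with a progressively blurred mask."""
    la, lb = laplacian_stack(A, levels, sigma), laplacian_stack(B, levels, sigma)
    gm = gaussian_stack(mask.astype(float), levels, sigma)
    blended = [m * a + (1 - m) * b for a, b, m in zip(la, lb, gm)]
    return np.clip(sum(blended), 0, 1)
```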

BearTiger

Source Image 1
Source Image 2
Naive copy
Blended Image

Catface

Source Image 1
Source Image 2
Naive copy
Blended Image
Now we can also analyze the different frequency bands that go into this blend.
High Frequencies
Medium Frequencies
Low Frequencies
Analysis of the different frequencies of the masked image of the human.
High Frequencies
Medium Frequencies
Low Frequencies
Analysis of the different frequencies of the masked cat image. As you can see, the mask bleeds more at lower frequencies, blending in more information at those levels.
High Frequencies
Medium Frequencies
Low Frequencies
Analysis of the different frequencies of the blended image.

Hand-nose (failure case)

Source Image 1
Source Image 2
Naive copy
Blended Image
This failure case likely occurred because frequency-domain blending softens the mask boundary and then effectively takes a weighted average within each frequency band. With all the white space in the masked region, the algorithm picks up far too many misleading color clues and produces an unintended faded result.

Part 2: Gradient Domain

The gradient domain provides an alternative method of blending that also produces very realistic results. The basic idea is that you take the gradients from a source image and paste them into a new image, which acts as the source of lighting. We copy the gradients from the source image inside the masked area and take color clues from the target image at the pixels on the boundary of the mask. We then solve an optimization problem that minimizes the difference between the gradients of the source and the new image while balancing the colors of the surrounding area.

Toy Problem

As a test of this idea, we first run a toy implementation: we compute the gradient of each pixel with respect to the pixel immediately above it and the pixel immediately to its right, use a color clue from a single pixel, and then solve a least-squares problem under these constraints to recover the pixel values of the reconstructed image.
Toy Image
Single "color" constraint
Reconstructed Image
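A sketch of the toy reconstruction posed as a sparse least-squares problem, assuming a greyscale float image s; for simplicity the gradients here are taken toward the right and downward neighbors, and all names are my own.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def toy_reconstruct(s):
    """Rebuild s from its x/y gradients plus a single intensity constraint."""
    h, w = s.shape
    idx = np.arange(h * w).reshape(h, w)           # flat index of each pixel
    n_eq = h * (w - 1) + (h - 1) * w + 1           # x-grads + y-grads + 1 anchor
    A = lil_matrix((n_eq, h * w))
    b = np.zeros(n_eq)
    e = 0
    for y in range(h):                             # horizontal gradient constraints
        for x in range(w - 1):
            A[e, idx[y, x + 1]] = 1
            A[e, idx[y, x]] = -1
            b[e] = s[y, x + 1] - s[y, x]
            e += 1
    for y in range(h - 1):                         # vertical gradient constraints
        for x in range(w):
            A[e, idx[y + 1, x]] = 1
            A[e, idx[y, x]] = -1
            b[e] = s[y + 1, x] - s[y, x]
            e += 1
    A[e, idx[0, 0]] = 1                            # single "color" constraint
    b[e] = s[0, 0]
    v = lsqr(A.tocsr(), b)[0]
    return v.reshape(h, w)
```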

Poisson Blending

Now we get into the meaty stuff. Poisson blending is a gradient-domain blending method that extends the toy problem to color images and masked regions. The basic premise is that you compute the gradients at each pixel inside the mask and try to minimize the difference between those gradients and the gradients of the resulting image inside the mask. You also have an added constraint for boundary pixels: they must agree with the neighboring pixels that lie just outside the mask boundary, $\partial\Omega$. This boils down to the following. For each pixel $p$ with a neighborhood $N_p$ (the pixels immediately above, below, left, and right of $p$ that are also in the mask), we satisfy two constraints: $$ |N_p|f_p - \sum_{q \in N_p} f_q = \sum_{q \in N_p} v_{pq}$$ and $$ f_p = \sum_{q \in N_p \cap \partial \Omega} f^*_q $$ We then use least squares to find the values $f_p$ that minimize the error in these equations.

I originally tried to optimize the runtime of this algorithm by only calculating these constraints on the borders, but after days of frustration I decided to follow the old CS mantra of avoiding premature optimization and ended up with a reasonably fast algorithm for the images I was using (typically less than 1000 px in width and height).
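For reference, a condensed sketch of one way to assemble and solve such a system per color channel with sparse least squares. It folds the gradient and boundary terms into a single equation per pixel, in the style of the standard discrete Poisson formulation; the names source, target, and mask (aligned single-channel float arrays in [0, 1]) are my own, and my final boundary handling, described below, ended up slightly different.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def poisson_blend_channel(source, target, mask):
    """Solve for the masked pixels of one channel, keeping target elsewhere."""
    h, w = target.shape
    ys, xs = np.nonzero(mask)
    index = {(y, x): i for i, (y, x) in enumerate(zip(ys, xs))}  # mask pixel -> unknown
    n = len(index)
    A = lil_matrix((n, n))
    b = np.zeros(n)
    for (y, x), i in index.items():
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):        # 4-neighborhood
            ny, nx = y + dy, x + dx
            if not (0 <= ny < h and 0 <= nx < w):
                continue
            A[i, i] += 1
            b[i] += source[y, x] - source[ny, nx]                # gradient term v_pq
            if (ny, nx) in index:
                A[i, index[(ny, nx)]] = -1                       # unknown neighbor f_q
            else:
                b[i] += target[ny, nx]                           # boundary value f*_q
    f = lsqr(A.tocsr(), b)[0]
    result = target.copy()
    result[ys, xs] = np.clip(f, 0, 1)
    return result
```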

On an alien planet

Foreground Image
Background Image
Naive Copy and Paste
Blending result
The above result was a light at the end of a very, very dark tunnel. I had played around for hours upon hours with the different constraints until I finally tried using just the sum of the target pixels on their own. It's definitely a step away from the equations provided in the project spec and the original paper, yet this method appears to work!

Mini-surfer on a calm lake

Foreground Image
Background Image
Naive Copy and Paste
Blending result

Lake Latte

Foreground Image
Background Image
Naive Copy and Paste
Blending result
Failure case. The edge blending is quite poor, and the color contrast between the source and target images leaves lingering traces of the coffee's original brown.
Now for the question of the hour: how does Laplacian multiresolution blending compare against our new gradient-based Poisson blending? We take the beartiger image from the earlier example and blend it with the Poisson method for comparison.

Bear Tiger comparison of Multiresolution and Poisson Blending

Foreground Image
Background Image
Naive Copy and Paste
Multiresolution
Poisson
As you can see, multiresolution blending produced the best result. There are a number of possible causes, but based on my experience in this project, I believe that for image patches, Laplacian blending outshines Poisson blending. With patches, the worry is that the textures of the two images might not match up when we blend them. The Laplacian stack method blends the high frequencies as well, creating a subtle transition of textures between the images. Poisson blending, on the other hand, tries to preserve the gradients as much as possible but cares less about color; I believe that is why the seams are much more apparent in the Poisson result, since the gradients of the two images clash. It's possible that a mixed-gradients approach would fix this issue, but I did not have time to evaluate it.