Project 2: Fun With Filters and Frequencies

Aaron Sun 3033976755 Fall 2021

Part 1: Fun with Filters

1.1: Finite Difference Operator

Suppose we want to take the derivative of an image. We can approximate the derivative with the finite difference f'(a) ≈ f(a+1) - f(a). Since our images are two-dimensional, we apply this in both the x and y directions in order to find the gradient of the image.

We can use this to detect edges on an image. Consider this image of a cameraman:

We use a convolution to find the x derivative using the 1x2 array [[1, -1]] and the y derivative using the 2x1 array [[1], [-1]].
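A minimal sketch of these two convolutions, assuming a grayscale image loaded as a float numpy array (the helper name is illustrative):

```python
import numpy as np
from scipy.signal import convolve2d

def image_gradients(im):
    Dx = np.array([[1.0, -1.0]])    # 1x2 finite difference in x
    Dy = np.array([[1.0], [-1.0]])  # 2x1 finite difference in y
    dx = convolve2d(im, Dx, mode="same", boundary="symm")
    dy = convolve2d(im, Dy, mode="same", boundary="symm")
    return dx, dy
```

The symmetric boundary avoids spurious edge responses at the image border.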

Notice how the y derivative doesn't catch the middle of the tripod because its edge is completely vertical.

We can combine the x and y derivatives to find the magnitude of the gradient, sqrt(dx^2 + dy^2). Here's what that looks like:

And finally, we can make the image binary in order to find exactly which pixels are edges. Here we used a threshold of 50 after some experimentation - lower thresholds include noise and other artifacts which we want to remove.
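The magnitude-and-threshold step can be sketched as follows, assuming dx and dy computed as above and pixel values in [0, 255]; the threshold of 50 is the value from the text:

```python
import numpy as np

def edge_image(dx, dy, thresh=50):
    mag = np.sqrt(dx**2 + dy**2)            # gradient magnitude
    return (mag > thresh).astype(np.uint8)  # 1 where an edge, 0 elsewhere
```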

1.2: Derivative of Gaussian Filter

Rather than choosing the threshold by hand, we can try to improve our binary image by first smoothing the image. Using a Gaussian kernel we can blur the image and remove any high frequency noise, since the Gaussian kernel is effectively a low-pass filter.

Using the same process as before on the blurred image, we get a much less noisy gradient magnitude. In particular, where the previous image showed plenty of small noise from the grass in the photo, after blurring that noise vanishes entirely from the gradient magnitude.

Because there is so little noise, we can achieve a similar result as before with a threshold of just 1:

However, this process is slightly inefficient. If we wanted to do this on multiple images, we would have to blur each one then take the derivative in x and y.

We can actually save computation by combining these two steps. Since convolution is associative, we can convolve Dx and Dy with the Gaussian first, then apply the resulting kernel to the input. That is, (image * gauss) * Dx = image * (gauss * Dx). We can reuse (gauss * Dx) for all images and save ourselves a lot of work.
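A sketch of the derivative-of-Gaussian trick: build the Gaussian, convolve it with the difference operators once, and reuse the resulting kernels for every image. The gaussian_kernel helper and the kernel size are illustrative choices, not from the original code:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    ax = np.arange(ksize) - (ksize - 1) / 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)  # separable 2D Gaussian, sums to 1

def dog_filters(ksize=11, sigma=2):
    gauss = gaussian_kernel(ksize, sigma)
    Dx = np.array([[1.0, -1.0]])
    Dy = np.array([[1.0], [-1.0]])
    # gauss * Dx and gauss * Dy: computed once, reused for all images
    return convolve2d(gauss, Dx), convolve2d(gauss, Dy)
```

A quick sanity check on these filters: because the Gaussian sums to 1 and each difference operator sums to 0, each combined kernel sums to 0, so it gives no response on constant regions.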

We see this gives us identical results:

Part 2: Fun with Frequencies

2.1: Sharpening

Using frequencies, we are able to sharpen an image. That is, we increase the high frequencies on an image to make it appear sharper. When used in moderation, this effect can be excellent.

How do we extract the high frequencies to amplify them? We make use of the fact that the image = high frequencies + low frequencies. We can find the high frequencies by subtracting the low frequencies (a blurred version of the image) from the original. Doing that with sigma = 3 for our Gaussian, we get:

Then adding some factor of this back to the original image gives us our sharpened image (here we used 3 for our sigma, 2 for our sharpening factor):

As the title of the above image may allude to, this process is inefficient. Once again, we can take advantage of the associativity, distributivity, and commutativity of convolution. If we have our image f, Gaussian g, unit impulse e, and sharpening factor s, our output is f + s * (f - (f * g)) = f * (e + s * (e - g)) = f * ((1 + s) * e - s * g).
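The combined kernel (1 + s) * e - s * g can be sketched like this, where e is a unit impulse the same size as the Gaussian; the gaussian_kernel helper and kernel size are illustrative:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    ax = np.arange(ksize) - (ksize - 1) / 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def unsharp_kernel(ksize=15, sigma=3, s=2):
    gauss = gaussian_kernel(ksize, sigma)
    e = np.zeros_like(gauss)
    e[ksize // 2, ksize // 2] = 1.0  # unit impulse e
    return (1 + s) * e - s * gauss

# sharpened = convolve2d(im, unsharp_kernel(), mode="same", boundary="symm")
```

Note the kernel sums to (1 + s) - s = 1, so flat regions of the image are left unchanged while edges are amplified.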

So we are able to do this entire operation in one step. That's good to know! Let's try that:

And we see the results look identical, as promised by the math.

Let's try this on one of my personal photos. This is a Townsend's Warbler I saw at Lake Merritt:

Let's now sharpen it. I used a sigma of 10 since the image is much bigger, but kept the sharpening factor at 2.

I have a blurred image I took of a mystery bird from Point Isabel Regional Shoreline.

It might be hard to identify it when the image is so blurred. Maybe if we could sharpen it, we would be able to identify the bird. Let's try that now!

Looks like a house sparrow to me! This is actually an image which I blurred with sigma = 5 and then sharpened with sigma = 10. This effect isn't able to make the image look exactly as it was before the convolution, but it certainly makes the image edges a bit sharper.

2.2: Hybrid Images

There's a super cool effect where an image looks like one thing up close and another thing far away. This is called a hybrid image, and it comes about when an image has the high frequencies from one image and the low frequencies from another.

Let's try blending these two images together. Taking the high frequencies of nutmeg (the cat) and the low frequencies of Derek and averaging the result gives us the following image:

Cool! Here we used sigma = 10 for the low frequency image and sigma = 50 for the high frequencies. We can visualize these frequencies using the FFT. Here, the left image is the low frequencies from Derek and the right image is the high frequencies from nutmeg.
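The hybrid construction can be sketched as below, assuming two aligned grayscale float images of the same size; the sigmas follow the values used above, and the function name is illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(im_low, im_high, sigma_low=10, sigma_high=50):
    low = gaussian_filter(im_low, sigma_low)               # keep low frequencies
    high = im_high - gaussian_filter(im_high, sigma_high)  # keep high frequencies
    return (low + high) / 2                                # average the result
```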

Let's try the same thing, but with birds. Here are two images I took at Lake Merritt - one is a California Gull, the other is an American Coot.

Here are the results.

It kind of... didn't work. The problem seems to be that the coot and the gull differ too much in color to blend well. From afar, the gull is pretty good. But from close up, the coot is clearly surrounded by a gull halo. Unfortunately, it looks like these images aren't compatible with one another.

I tried the same thing in color, but got similar results as before.

We can experiment by having high frequencies in color and low frequencies in grayscale, and vice versa. First, high frequencies in color and low frequencies in grayscale:

This one might be a bit better, since the colors on the coot stand out more from up close.

This one is even worse! The coot is more drab in grayscale, so the gull becomes far more obvious. I would speculate that it's better to have the high frequencies in color, since the colors are more apparent from up close, enhancing the high frequency effect. Meanwhile, as you get farther away, the colors become less apparent and the black and white image stands out more.

Let's try the same thing with some slightly more similar objects. I used an orange and an apple (these will come back in the next section):

Because these images are much more similar, we get a much more pleasant result:

From up close, we definitely see an orange-like texture. But once we look from afar, that all smooths out and the fruit becomes more like an apple.

2.3 Gaussian And Laplacian Stacks

In this section we make a Gaussian stack: a sequence of copies of a single image, each blurred with a progressively larger sigma. Let's try this on the apple image from before:

We can make use of this to make a Laplacian stack, which is a frequency-based stack. We simply take the differences between consecutive layers of a Gaussian stack in order to find the Laplacian stack. Note that the last image in the stack is the same as the last image in the Gaussian stack.
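The two stacks can be sketched as follows; the doubling-sigma schedule is an illustrative choice, not necessarily the one used for the figures:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, levels=5, sigma=2):
    # level 0 is the original; each later level is blurred with a larger sigma
    return [im] + [gaussian_filter(im, sigma * 2**i) for i in range(levels - 1)]

def laplacian_stack(gstack):
    # differences of consecutive Gaussian levels; the last level keeps the
    # remaining low frequencies, so the stack sums back to the original image
    return [gstack[i] - gstack[i + 1] for i in range(len(gstack) - 1)] + [gstack[-1]]
```

Because the differences telescope, summing all levels of the Laplacian stack reconstructs the original image exactly, which is what makes the blending in the next section work.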

Cool! We can do the same thing in color too, this time with the orange.

2.4 Multiresolution Blending

One neat way to blend images is to combine them at separate frequencies using a mask. Here, we use a 0-1 mask, then combine each layer of the frequency stack separately with a blurred version of the mask (blurred to a degree matching the level of the stack).
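The whole pipeline can be sketched in one function: Laplacian stacks of the two images, a Gaussian stack of the mask, combined level by level and summed. Helper names and the sigma schedule are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend(im1, im2, mask, levels=5, sigma=2):
    def gstack(x):
        return [x] + [gaussian_filter(x, sigma * 2**i) for i in range(levels - 1)]

    def lstack(g):
        return [g[i] - g[i + 1] for i in range(len(g) - 1)] + [g[-1]]

    l1, l2 = lstack(gstack(im1)), lstack(gstack(im2))
    gm = gstack(mask.astype(float))  # progressively blurred 0-1 mask
    # blend each frequency band with its matching blurred mask, then sum
    return sum(gm[i] * l1[i] + (1 - gm[i]) * l2[i] for i in range(levels))
```

Blurring the mask more at the coarser levels is what makes the seam wide for low frequencies and narrow for high frequencies, giving the smooth transition.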

We can visualize each layer of the stack and how it combines:

These sum to give us our two image halves:

Now let's try this on our two birds from earlier which refused to hybridize. This time, we need to use a custom mask, which I've shown below. We will paste the coot head onto the gull.

The results are both hysterical and monstrous.

When I see this, what comes to mind is one of the most iconic quotes from Heart of Darkness: "The horror! The horror!"

We see that the blending worked, in theory. The coot's head was correctly pasted onto the gull. The head may be a bit smaller, but the fit on the body is pretty good somehow, and the transition on the neck is quite smooth between the gull and the coot.

Here are the separated layers of the Laplacian stack with the mask applied. And finally, below is the layer-by-layer sum of the two arrays above.

Reflections

My favorite part of this project was when I was able to create my own Frankenstein of the coot and gull. I thought it was very cool how using differently blurred masks on different frequencies allowed us to create a smoothly blended image, much better than if we chose to work in the spatial domain.