
CS 194: Project 2

Fun with Filters and Frequencies

Using convolutions and kernels to play with images

Introduction

In this project, we tried our hand at putting together convolutions of gaussians, laplacians-of-gaussians, and other types of filter kernels to create interesting effects. Armed with nothing more than these filters and a few images, we set about making all manner of fascinating (and unsettling) images, combining different filters and bands of frequencies in our quest to create never-before-seen images, for the viewing pleasure of our esteemed faculty.

So. Without further ado, I present... my Project 2 menagerie!

Part 1.1: The Finite Difference Gradient Operator

The finite difference operator is one of the simplest kernels in image processing, consisting of nothing but a 1-by-2 or 2-by-1 matrix, with one of the entries having value 1, and the other having value -1. This operator essentially computes the difference between two adjacent pixels, that is, $x_1 - x_2$. This allows the finite difference operator to act as a makeshift edge detector: its output is far from zero only when the two adjacent pixels differ greatly in value, so it peaks visibly where neighboring pixels change sharply and stays flat where they do not. This effect can be best illustrated through images, rather than words:

The image in the top-left of this collage is the original image, showing a cameraman with his camera and other equipment. To the right and bottom of the image are the x-based gradient operator ($\begin{bmatrix}1 & -1\end{bmatrix}$) and y-based gradient operator ($\begin{bmatrix}1 \\ -1\end{bmatrix}$), respectively, convolved with the original image. Finally, in the bottom-right is the sum of the absolute values of the gradients, which lets us see the locations of large changes in magnitude, regardless of x/y orientation or direction of change.

Finally, below is the edge image, thresholded so as to only show the largest changes in magnitude, which mark the edges across which change is greatest.
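As a rough sketch of the steps described above (not necessarily the exact code behind these images), the two gradients, their combined magnitude, and the thresholded edge image can be computed with plain NumPy slicing. The function name `finite_difference_edges`, the choice to leave the first row and column at zero, and the `threshold` parameter are all assumptions made for illustration, and the input is assumed to be a 2D grayscale array.

```python
import numpy as np

def finite_difference_edges(img, threshold):
    """Return x/y finite differences, their combined magnitude, and a binary edge map."""
    img = np.asarray(img, dtype=float)
    dx = np.zeros_like(img)
    dy = np.zeros_like(img)
    # Convolving a row with [1, -1] gives img[x] - img[x - 1] at each position.
    # The first column has no left neighbour, so it is left at zero (an assumption).
    dx[:, 1:] = img[:, 1:] - img[:, :-1]
    # Same idea down the columns for the 2-by-1 kernel.
    dy[1:, :] = img[1:, :] - img[:-1, :]
    # Sum of absolute values: large wherever either gradient is large.
    magnitude = np.abs(dx) + np.abs(dy)
    # Keep only the largest changes.
    edges = magnitude > threshold
    return dx, dy, magnitude, edges
```

The threshold has to be picked by eye for each image: too low and noise shows up as edges, too high and real edges disappear.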

Part 1.2: The Derivative of Gaussian Filter

However, we note that the images returned using the finite difference operator are a little bit noisy and do not appear to be particularly smooth. With this in mind, we turn to the Derivative of Gaussian filter. We get its effect by convolving the results of our finite difference operator with a gaussian filter, which is one of our basic smoothing operators. The results speak for themselves:

Notice how much smoother these images are when passed through the gaussian filter. Our new images work better for outlining our subject, and create a more natural-looking edge detector.

Note that this same effect can be achieved through the use of only a single filter, by first convolving the finite difference operator with a gaussian filter, and then using that new kernel on your image. The kernels used to create these images look like this:

Notice that these kernels look just like smoothed versions of the $\begin{bmatrix}1 & -1\end{bmatrix}$ and $\begin{bmatrix}1 \\ -1\end{bmatrix}$ derivative kernels, which is because they are! Each one was obtained by convolving a gaussian with a derivative kernel, so its values show the same positive-to-negative step as the derivative, just spread out and softened by the gaussian.
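The single-filter shortcut works because convolution is associative. Below is a minimal sketch that builds a Derivative of Gaussian kernel and can be used to check that one pass with it matches the two-step version. The helper names (`gaussian_kernel_2d`, `conv2d_full`) and the 3-sigma kernel radius are assumptions for illustration, not the original implementation.

```python
import numpy as np

def gaussian_kernel_2d(sigma):
    """2D gaussian kernel as the outer product of a normalized 1D gaussian."""
    radius = int(3 * sigma + 0.5)  # assumed cutoff at roughly 3 sigma
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def conv2d_full(a, b):
    """Plain 'full' 2D convolution: shift-and-add a copy of `a` for every entry of `b`."""
    out = np.zeros((a.shape[0] + b.shape[0] - 1, a.shape[1] + b.shape[1] - 1))
    for i in range(b.shape[0]):
        for j in range(b.shape[1]):
            out[i:i + a.shape[0], j:j + a.shape[1]] += b[i, j] * a
    return out

D_x = np.array([[1.0, -1.0]])    # x-based finite difference
D_y = np.array([[1.0], [-1.0]])  # y-based finite difference

G = gaussian_kernel_2d(1.0)
dog_x = conv2d_full(G, D_x)  # Derivative of Gaussian kernel in x
dog_y = conv2d_full(G, D_y)  # Derivative of Gaussian kernel in y
```

Convolving an image with `G` and then `D_x` gives the same result as convolving it once with `dog_x`, up to floating-point error. In practice a library convolution would be much faster than the loop above.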

Part 1.3: Image Straightening

Another interesting application of these gradient filters involves their directions. With an estimate of the gradient of an edge in both the x- and y-directions (like what we obtain when convolving our image with both the vertical and horizontal DoG filter) we can easily guess the orientation of the edge. Armed with that orientation, we can test different rotations and keep the one that lines up as many of the edges as possible with vertical or horizontal lines.
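One way to sketch this search, under several stated assumptions: `np.gradient` stands in for the DoG responses, only the strongest 10% of gradients count as edges, and instead of actually rotating the image for each candidate we shift the gradient angles by the candidate rotation, which is a shortcut that ignores cropping and interpolation effects. All function names and default parameters here are hypothetical.

```python
import numpy as np

def edge_angles(img, magnitude_quantile=0.9):
    """Gradient angles (degrees) at the strongest-gradient pixels of a 2D grayscale image."""
    img = np.asarray(img, dtype=float)
    # np.gradient returns the derivative along axis 0 (y) first, then axis 1 (x).
    dy, dx = np.gradient(img)
    magnitude = np.hypot(dx, dy)
    strong = magnitude > np.quantile(magnitude, magnitude_quantile)
    return np.degrees(np.arctan2(dy[strong], dx[strong]))

def straightness_score(angles_deg, rotation_deg, tolerance_deg=0.5):
    """Fraction of edge pixels whose shifted gradient angle is within tolerance of a multiple of 90."""
    shifted = (angles_deg + rotation_deg) % 90.0
    distance = np.minimum(shifted, 90.0 - shifted)
    return np.mean(distance < tolerance_deg)

def best_rotation(img, candidates=np.arange(-15, 16)):
    """Candidate angle shift (degrees) that aligns the most edges with horizontal or vertical."""
    angles = edge_angles(img)
    scores = [straightness_score(angles, r) for r in candidates]
    return candidates[int(np.argmax(scores))]
```

The returned value is a shift in gradient angle; whether that means rotating the image clockwise or counterclockwise depends on the sign convention of whatever rotation routine is used. Note that this score treats every strong edge equally, so a long tilted edge can outvote the rest of the scene.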

Some Examples:

The facade of this gorgeous building:

A Mondrian rotated completely out of whack:

The no-longer-quite-so-leaning tower of Pisa:

A shot of beautiful San Francisco:

In this case, our algorithm actually fails to find the optimal rotation, instead choosing an orientation that aligns the ridge of mountains in the background with horizontal and leaves the city looking like it just had another earthquake.

2.1: Image Sharpening

Using our Gaussian Filter helps us blur out the sharpest details of our images, leaving only the lower frequencies. But what if we wanted the opposite effect? What if we wanted to remove some of our lower frequencies, and leave only the highest ones?

Well, it turns out that the answer is surprisingly simple! Just subtract out the low frequencies obtained by gaussian filtering, and all that's left will be the high-frequency details! Adding those details back on top of the original image is known as "unsharp masking," and it is a way to make the high-frequency data in your image more apparent. It is also commonly referred to on phones as "sharpening," which is a somewhat deceptive term, as it is only increasing the prevalence of the high-frequency information, not somehow gaining more detailed information.
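A minimal sketch of this idea, assuming a 2D grayscale image with values in [0, 1]: blur, subtract to get the high frequencies, then add a scaled copy of them back to the original. The helper `gaussian_blur`, the strength parameter `alpha`, and the default values are assumptions for illustration rather than the settings used for the images here.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable gaussian blur of a 2D array, with edge padding (assumed boundary handling)."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    padded = np.pad(np.asarray(img, dtype=float), radius, mode="edge")
    # Blur along rows, then along columns.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def unsharp_mask(img, sigma=2.0, alpha=1.0):
    """Boost high frequencies: img + alpha * (img - blurred img), clipped to [0, 1]."""
    img = np.asarray(img, dtype=float)
    low = gaussian_blur(img, sigma)   # low frequencies only
    high = img - low                  # what the blur removed
    return np.clip(img + alpha * high, 0.0, 1.0)
```

Larger `alpha` gives a stronger effect, but it amplifies noise along with detail, and values pushed outside [0, 1] get clipped.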

After the unsharp masking procedure, the details of the stonework and facade become much more apparent. These high-frequency details are hard to spot in the original image, but become much more accentuated after being given a little bit more room in the frequency mix.

Some other (subtler) examples:

Lunch Atop a Skyscraper:

Notice the accentuation of some of the windows and architectural details of the buildings in the background. It gives the old photograph a little bit more "pop."

The extra high-frequency information in this portrait of Salvador Dali brings out some extra details in the splashing water droplets. However, it also accentuates the noise in the background, which is a common drawback of this unsharp masking procedure.

To demonstrate that this operation is similar to, but not the exact inverse of the gaussian blur, we can first blur, and then attempt to "unblur" our image using the unsharp masking technique.

However, it quickly becomes clear that the recovered image is not identical to the original image. Despite the sharpening performed by the laplacian convolution, the resulting image has fuzzier edges, since the high-frequency information was removed before we attempted to "add it back in." Because the unsharp masking technique cannot uncover information that is not already in the image, we are left with an imperfect replica of our original.

2.2: Hybrid Images

Now that we have a tool for isolating out the high and low frequencies of an image, we can use our newfound power to create hybrid images, with frequency information from different sources. These images look like one thing from far away, and another thing up close. In theory, this is because when high-frequency information is available, our brains tend to rely on it to construct the details of an image. But when it is unavailable, such as when a picture is small or far away, our brains default to low frequency information. This allows us to create images that look different, depending on where they're viewed from.
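In code, the recipe might look something like the sketch below: low-pass one image, high-pass the other, and add the two. It assumes two aligned 2D grayscale images of the same size with values in [0, 1]; the function names and the two cutoff sigmas are hypothetical and would need tuning per image pair.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable gaussian blur of a 2D array, with edge padding (assumed boundary handling)."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    padded = np.pad(np.asarray(img, dtype=float), radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def hybrid_image(far_img, near_img, sigma_low=6.0, sigma_high=3.0):
    """Combine the low frequencies of far_img with the high frequencies of near_img."""
    far_img = np.asarray(far_img, dtype=float)
    near_img = np.asarray(near_img, dtype=float)
    low = gaussian_blur(far_img, sigma_low)               # visible from far away
    high = near_img - gaussian_blur(near_img, sigma_high) # visible up close
    return np.clip(low + high, 0.0, 1.0)
```

The image passed as `far_img` is the one seen from a distance, and `near_img` is the one seen up close. Aligning the two images beforehand matters a lot for how convincing the result looks.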

Examples:

Regina George with Pearl Earring:

Catazawa:

When using this technique with colored images, it is perfectly possible to keep the color from both the high frequency and the low frequency. However, lower-frequency color tends to appear as large blocks, while high-frequency color tends to only appear as noise around the sharpest edges. Because of that, color is best used on the low-frequency images, as using it with the higher-frequency images is a subtle effect at best.

To illustrate the effects of our recombination technique, we will create plots of the frequencies present in our image in the Fourier domain (using Regina and Girl with Pearl Earring as our example):

Fourier plot of Regina George
Fourier plot of Girl with Pearl Earring
Fourier plot of low-frequency image after filtering
Fourier plot of high-frequency image after filtering
Fourier plot of resulting combined image. Notice the combination of the signatures of both the low-frequency and high-frequency images.
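Plots like these are commonly produced by taking the log magnitude of the shifted 2D Fourier transform. The sketch below assumes a 2D grayscale input; the function name and the small constant added before the log (to avoid taking the log of zero) are assumptions, not necessarily what was used for the plots above.

```python
import numpy as np

def log_magnitude_spectrum(gray):
    """Log magnitude of the 2D FFT, with the zero frequency shifted to the center."""
    gray = np.asarray(gray, dtype=float)
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    # Small epsilon (assumed) keeps the log finite where the magnitude is exactly zero.
    return np.log(np.abs(spectrum) + 1e-8)
```

The result can be displayed with any image viewer, for example matplotlib's `imshow`. Low frequencies sit near the center of the plot and high frequencies toward the edges, so a low-pass filtered image lights up mostly in the middle.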

2.3: Gaussian and Laplacian Stacks

Another use of our high- and low-frequency filters is to apply them repeatedly to split an image into different frequency ranges. Repeated application of a gaussian results in a gaussian stack (a pyramid without the downsampling), which displays lower and lower frequencies of an image. Subtracting adjacent layers of a gaussian stack results in a laplacian stack, which gives you different bands of frequency from the image.
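A small sketch of both constructions, assuming a 2D grayscale image and no downsampling between levels. The function names, the number of levels, and the fixed sigma are assumptions for illustration. Keeping the final blurred level at the end of the laplacian stack is a common convention that lets the levels sum back to the original image.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable gaussian blur of a 2D array, with edge padding (assumed boundary handling)."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    padded = np.pad(np.asarray(img, dtype=float), radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def gaussian_stack(img, levels=5, sigma=2.0):
    """Level 0 is the image itself; each later level blurs the previous one again."""
    stack = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        stack.append(gaussian_blur(stack[-1], sigma))
    return stack

def laplacian_stack(g_stack):
    """Differences of adjacent gaussian levels, plus the final blurred level as the last entry."""
    bands = [g_stack[i] - g_stack[i + 1] for i in range(len(g_stack) - 1)]
    return bands + [g_stack[-1]]
```

Because each band is the difference between neighbouring levels, the sum telescopes: adding up every entry of the laplacian stack gives back the original image.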

A gaussian stack of Dali's famous "Lincoln in Dalivision" painting is shown below. Notice that the naked figure becomes less obvious as we go down the stack, while Lincoln's head becomes more apparent.

Looking instead at the Laplacian stack, notice which details are included at each frequency level of the image, and how they become more or less apparent as the bands of frequency change. Also notice that colors begin to appear as the frequency bands decrease:

Here is the same stack process for Regina with Pearl Earring, as constructed above:

Gaussian:

Laplacian:

2.4: The Almighty Orapple

One final application of this technique is the ability to blend images without disrupting patterns too obviously, by blending each layer of their laplacian stacks separately and then recombining the layers. This process creates a blended image in each frequency band, making for a more subtle shift between the two images.
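A rough sketch of per-level blending is below. It assumes 2D grayscale images of equal size and a mask with values in [0, 1], and it smooths the mask with its own gaussian stack so that lower-frequency bands get a wider transition. That mask smoothing is a common choice that we assume here; the function names and default parameters are hypothetical as well.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable gaussian blur of a 2D array, with edge padding (assumed boundary handling)."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    padded = np.pad(np.asarray(img, dtype=float), radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def gaussian_stack(img, levels, sigma):
    """Level 0 is the image itself; each later level blurs the previous one again."""
    stack = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        stack.append(gaussian_blur(stack[-1], sigma))
    return stack

def laplacian_stack(g_stack):
    """Differences of adjacent gaussian levels, plus the final blurred level."""
    return [g_stack[i] - g_stack[i + 1] for i in range(len(g_stack) - 1)] + [g_stack[-1]]

def blend(img_a, img_b, mask, levels=5, sigma=2.0):
    """Blend img_a (where mask is 1) into img_b (where mask is 0), one frequency band at a time."""
    la = laplacian_stack(gaussian_stack(img_a, levels, sigma))
    lb = laplacian_stack(gaussian_stack(img_b, levels, sigma))
    gm = gaussian_stack(mask, levels, sigma)  # progressively softer masks
    blended = [m * a + (1.0 - m) * b for a, b, m in zip(la, lb, gm)]
    # Summing the blended bands collapses the stack back into a single image.
    return np.sum(blended, axis=0)
```

For a simple left/right split, the mask is 1 on the left half and 0 on the right. Irregular masks work the same way.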

For example, this can be used to combine an image of an orange and an apple into an orapple:

Notice the cleaner blending between the two halves, including the more subtle shifts in color and pattern as one crosses the seam as compared to simply cropping the two images together.

This technique can be used to combine all sorts of things, as long as their edges and shapes somewhat align, simply by adjusting the shape of the mask:

Examples:

Efrozawa:

Tigertiger:

Kamyas the Tank Engine:

Blending stack for Kamyas: