CS 194-26 Project 2: Fun with Filters and Frequencies!

Zachary Wu

Introduction

One of the simplest yet most powerful operations in our image processing toolkit is the convolution. With some very simple linear operations, i.e. just multiplying and adding numbers together, we can separate an image into its many different frequencies and create some fun effects. Let's see some examples.

Part 1: Fun with Filters

Part 1.1 Finite Difference Operator

The first thing we can do with our filters is explore the gradients of images, which lets us do some edge detection. We can start with some very simple two-element filters that are sensitive to large changes in the x and y directions, respectively.

We can convolve these filters with our image and combine the results to form a pseudo edge detector. After experimenting, a threshold value of 0.22 allows us to get an image with only the edges.

cameraman

camera_x camera_y cam_grad

The original image, finite difference in x, finite difference in y, and gradient magnitude image

We now binarize the image, setting all pixels greater than 0.22 to 1 and everything else to 0. This keeps the "strongest" gradient magnitudes in the image and gives us a majority of the edges.
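The whole pipeline can be sketched as below. This is a minimal sketch using NumPy and SciPy, assuming a grayscale image with values in [0, 1]; the boundary handling is one reasonable choice, not necessarily the one used above.

```python
import numpy as np
from scipy.signal import convolve2d

def gradient_edges(im, threshold=0.22):
    """Binarized gradient-magnitude edge map from finite differences."""
    dx = np.array([[1, -1]])         # finite difference in x
    dy = np.array([[1], [-1]])       # finite difference in y
    gx = convolve2d(im, dx, mode="same", boundary="symm")
    gy = convolve2d(im, dy, mode="same", boundary="symm")
    mag = np.sqrt(gx**2 + gy**2)     # gradient magnitude
    return (mag > threshold).astype(np.uint8)
```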

camera_edge

Part 1.2: Derivative of Gaussian Filter

The results with just the difference operator were rather noisy, with lots of random speckles and dots, especially in the ground area. Is there any way to remove these?

We will now add the Gaussian filter to the mix. Before finding the gradient magnitude, we first convolve the image with a Gaussian filter to add a sort of blurring effect.

camera_blur

We then repeat the same process: taking the finite differences, computing the gradient magnitude, and finally binarizing the image to get the edges. Note that because we applied a Gaussian first, the magnitude values are lower, so we correspondingly lower our threshold to 0.05.

camera_blur_edge

We notice that the edges are now a lot thicker and more pronounced, and a lot of the speckles in the ground and other noise have been removed.

In the above example, we perform two convolutions: one with a Gaussian kernel, and another with the finite difference filters. Because convolution is associative, we can instead convolve the Gaussian and finite difference filters together first, then apply just one convolution to the image. This leads to the same result while only convolving the image once.
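A sketch of the derivative-of-Gaussian approach is below. The kernel size and sigma here are illustrative assumptions, not necessarily the values used for the results above.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize=9, sigma=1.5):
    """2D Gaussian kernel via the outer product of 1D Gaussians."""
    ax = np.arange(ksize) - ksize // 2
    g1 = np.exp(-ax**2 / (2 * sigma**2))
    g1 /= g1.sum()
    return np.outer(g1, g1)

# Convolve the Gaussian with the finite-difference filters once...
g = gaussian_kernel()
dog_x = convolve2d(g, np.array([[1, -1]]))
dog_y = convolve2d(g, np.array([[1], [-1]]))

# ...so each image needs only a single convolution per direction.
def dog_gradient_magnitude(im):
    gx = convolve2d(im, dog_x, mode="same", boundary="symm")
    gy = convolve2d(im, dog_y, mode="same", boundary="symm")
    return np.sqrt(gx**2 + gy**2)
```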

gauss_dx gauss_dy

The Gaussian convolved with the x and y finite-difference filters, respectively.

gauss_cam_edge

Part 2: Fun with Frequencies

Part 2.1: Image "Sharpening"

In this portion of the project, we can manipulate the frequencies of the images and separate them to get some pretty interesting results.

The first thing we will do is "sharpen" an image by isolating its high frequencies and adding more of them back to the original image.

First, we take the original image, blur it with a Gaussian filter, and subtract the two to keep only the high frequencies of the image.

taj taj_blur taj_high

The original image, the image blurred with a Gaussian, and the difference between the two.

Taking the high frequencies, we can add them back to the original image, multiplied by a factor alpha for various levels of "sharpening".

taj_sharp2 taj_sharp5 taj_sharp10

Sharpening with alpha = 2, 5, 10.

Notice that at high levels of alpha, a lot of artifacts are introduced. While heavy sharpening looks rather bad, at alpha = 2 the image does appear sharper, with features on the building being much more pronounced.
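The two-step sharpening can be sketched as follows; this is a minimal sketch assuming a grayscale image in [0, 1], and the sigma is an illustrative choice.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpen(im, alpha=2.0, sigma=2.0):
    """Unsharp masking: add back alpha times the high frequencies."""
    blurred = gaussian_filter(im, sigma)   # low frequencies
    high = im - blurred                    # high frequencies
    return np.clip(im + alpha * high, 0, 1)
```

A flat image has no high frequencies, so it passes through unchanged, while edges get amplified.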

Because convolution is linear and associative, we can actually combine the two steps into a single filter. Following the info on the last slide from lecture, we can use only one combined filter and get the same results.

taj_sharp2 taj_sharp5 taj_sharp10

Sharpening with alpha = 2, 5, 10.
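One way to build the combined filter is (1 + alpha) times a unit impulse minus alpha times a Gaussian, so a single convolution is equivalent to "image + alpha * (image - blurred)". The kernel size and sigma below are illustrative assumptions.

```python
import numpy as np
from scipy.signal import convolve2d

def unsharp_mask_kernel(alpha=2.0, ksize=9, sigma=2.0):
    """(1 + alpha) * unit impulse - alpha * Gaussian, as one kernel."""
    ax = np.arange(ksize) - ksize // 2
    g1 = np.exp(-ax**2 / (2 * sigma**2))
    g1 /= g1.sum()
    g = np.outer(g1, g1)
    impulse = np.zeros((ksize, ksize))
    impulse[ksize // 2, ksize // 2] = 1.0
    return (1 + alpha) * impulse - alpha * g
```

The kernel sums to 1, so flat regions are preserved while edges are boosted.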

Here are some other examples of this sharpening process.

amusement amusement amusement amusement

original, then sharpening with alpha = 2,5,10

wonkawonkaamusementamusement

original, then sharpening with alpha = 2,5,10

While in some cases the sharpening process works decently well, it is not magic: it cannot make an already blurry photo sharp again. As an example, we will blur two of the above examples and try to sharpen the blurred photos with the same process described.

taj_blurtajblursharp

Sharpening a blurry photo, alpha = 10.

We can see that even after sharpening, the photo is still blurry when compared to the original photo.

Part 2.2: Hybrid Images

We will now make some fun hybrid images based on the SIGGRAPH 2006 paper by Oliva, Torralba, and Schyns.

These images work by taking the low frequencies of one picture and overlaying them with the high frequencies of another picture. Depending on the viewing distance, one or the other image stands out, which is a pretty cool effect.

We first align the images on key features (in this case the eyes) using the provided code.

derek cat

Afterwards, we blur one image while extracting the high frequencies of the other. Then we can combine them for our final hybrid image.
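The hybrid step can be sketched as below, assuming two aligned grayscale images in [0, 1]; the two sigmas are illustrative cutoff choices, not the values used for the results here.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(im_low, im_high, sigma_low=6.0, sigma_high=3.0):
    """Low-pass one image, high-pass the other, then sum them."""
    low = gaussian_filter(im_low, sigma_low)                 # keep low frequencies
    high = im_high - gaussian_filter(im_high, sigma_high)    # keep high frequencies
    return np.clip(low + high, 0, 1)
```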

derekcat

cat_derek

We will also try the same thing on portraits of Emma Watson and Emma Stone, creating a hybrid picture of Emma.

watsonstonewatsonstoneemma

For this picture, we can also break down the image into the Fourier domain to see how separating the frequencies affects certain aspects of the image.

watsonstonewatsonstoneemma
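A common way to produce these frequency visualizations is the log-magnitude of the 2D Fourier transform, shifted so the DC component sits at the center; a minimal sketch:

```python
import numpy as np

def log_fft(im):
    """Log-magnitude Fourier spectrum, DC centered, for visualization."""
    return np.log(np.abs(np.fft.fftshift(np.fft.fft2(im))) + 1e-8)
```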

We have one more example, with a dog and a spider. Unfortunately, this case was a bit of a failure, as the features of the two do not match up that well. They are quite different, and even the width between the eyes makes the hybrid image less than ideal.

 

dogspiderdogspiderspiderdog

 

We can see that eyes are aligned, but the rest of the features (body and legs) are quite different, making the effect not as pronounced.

Part 2.3: Gaussian and Laplacian Stacks

Now we explore using frequencies for blending. One of the things we can do is separate frequencies by subtracting different levels of blurring to isolate a particular band of frequencies. We can then apply a corresponding mask that smoothly blends the lower frequencies while keeping the detail in the high frequencies of the two images.

In our implementation, we use a 5-level stack to produce the results. At each step, we blur the image and subtract the result from the previous level to get the band of frequencies at that level. At each level, we also apply a Gaussian blur to the mask that separates the two images.
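The stacks described above (no downsampling, unlike a pyramid) can be sketched as follows; the sigma is an illustrative assumption. A useful sanity check is that the Laplacian stack telescopes: summing all its levels reconstructs the original image.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, levels=5, sigma=2.0):
    """Repeatedly blur (keeping full resolution) to build a Gaussian stack."""
    stack = [im]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(im, levels=5, sigma=2.0):
    """Band-pass levels are differences of adjacent Gaussian levels;
    the final level is the most-blurred image itself."""
    g = gaussian_stack(im, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]
```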

On the left are the lower levels, and on the right is the final 5th level, which is just our fully blurred image.

01234

01234

01234

 

Below is another laplacian stack for some mountains, which we will also blend in the following part.

01234

01234

01234

Part 2.4: Multiresolution Blending

Now that we have our Laplacian stacks, each level weighted by a mask, we can perform our blend. All we have to do is add all the levels of the stacks together and normalize. Because the mask gets blurred at each level, each frequency band is blended with a different amount of smoothing. This results in some pretty good blended images, and a yummy orapple!
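The per-level blend can be sketched as below, assuming grayscale images and a mask in [0, 1]; the number of levels and sigma are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend(im1, im2, mask, levels=5, sigma=2.0):
    """Multiresolution blending: mask-weighted sum of Laplacian stack
    levels, where the mask comes from its own Gaussian stack."""
    def gstack(x):
        s = [x]
        for _ in range(levels - 1):
            s.append(gaussian_filter(s[-1], sigma))
        return s
    def lstack(x):
        g = gstack(x)
        return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]
    l1, l2, gm = lstack(im1), lstack(im2), gstack(mask)
    out = sum(m * a + (1 - m) * b for a, b, m in zip(l1, l2, gm))
    return np.clip(out, 0, 1)
```

With an all-ones mask the result is just the first image, and blending an image with itself returns it unchanged, which makes the function easy to sanity-check.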

oraple

b&w: color

In order to perform the operation on color images, I had to convolve our Gaussian filters across each color channel separately, as well as create an additional 3-dimensional mask (which is the same across all color channels). To get the Laplacian and the final image back at each level, I simply stack all the color channels back together, then normalize the image across all channels for the end result. Although the channels are treated separately, they undergo the same operations at each step and can be combined easily.
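The channel-wise pattern can be sketched with a small helper that applies any single-channel operation to each channel and restacks the result (`per_channel` is a hypothetical name, not from the original code):

```python
import numpy as np

def per_channel(op, im_rgb):
    """Apply a single-channel operation to each color channel, restack."""
    return np.dstack([op(im_rgb[..., c]) for c in range(im_rgb.shape[-1])])
```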

 

Here is the merged mountain.

season

 

I also performed the same process on grayscale images to put a group of iconic construction workers over a forest instead of the city in the original photo.

constructionforestconstruction_forest

The results seem decent, although not very convincing because part of the city is still there. If I had the computer chops, I could probably create a better mask that fully masks out the city for a much more convincing effect.

 

What I learned from this project

This project was incredibly helpful in helping me understand images in the frequency domain, allowing for operations beyond pixel-by-pixel computations. This opens up a whole lot of wondrous and cool things to do with images.

 

I think the coolest part of the project for me is the hybrid images: not only extracting different frequencies, but also seeing how human perception comes into play. By mixing frequencies from two images, we can see two different things depending on viewing conditions and distance. It's like the Mona Lisa shown in lecture, whose actual expression remains an incredible mystery. Seeing that, and how humans perceive different frequencies, was a big takeaway. At the end of the day, human perception of images is ultimately what dictates the interactions, experiences, and feelings we get from viewing images.