Fun with Filters and Frequencies!

Roma Desai | CS-194 Project 2

 

Part 1.1: Finite Difference Operator

 

For this section, I used finite difference operators and the gradient magnitude to detect the edges of an image. The process starts by taking the derivative of the image in the x and y directions, using the two finite difference filters below:

 

Dx = [1, -1] Filter

Dy = [1, -1]^T Filter

 

Then, to turn this into an edge image, I computed the magnitude of the image gradient by calculating the following for every pixel:

‖∇I‖ = √( (∂I/∂x)² + (∂I/∂y)² )

Once I had the magnitudes of the gradient vectors, I binarized the output, assigning all values above 0.3 to 1 (edge) and all values at or below 0.3 to 0, which suppresses low-magnitude noise. Because the gradient magnitude is large wherever intensity changes rapidly, thresholding it picks out edges regardless of their orientation.
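In code, the whole pipeline comes down to a few lines. Here is a minimal sketch, assuming a grayscale float image im with values in [0, 1] (the helper name is illustrative; the 0.3 threshold matches the value above):

import numpy as np
from scipy.signal import convolve2d

# Finite difference operators
Dx = np.array([[1, -1]])
Dy = np.array([[1], [-1]])

def edge_image(im, thresh=0.3):
    """Gradient-magnitude edge detection on a grayscale float image."""
    gx = convolve2d(im, Dx, mode='same')  # partial derivative in x
    gy = convolve2d(im, Dy, mode='same')  # partial derivative in y
    mag = np.sqrt(gx**2 + gy**2)          # gradient magnitude per pixel
    return (mag > thresh).astype(float)   # binarize: edge = 1, background = 0

Here is the resulting edge image: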

 

 

 

Part 1.2: Derivative of Gaussian (DoG) Filter

 

Because the edge detection in the previous section was a little noisy, in this section I added Gaussian smoothing before detecting the edges. This operation consists of creating a Gaussian filter, convolving it with the image, and then convolving the result with the finite difference operators from the previous part. We get the following result:

 

Compared to the result from 1.1, the edges here are much more pronounced and there is much less noise. However, this approach required two convolutions over the entire image. Because convolution is associative, we can make it faster by convolving the Gaussian with each derivative filter first: (I ∗ G) ∗ D = I ∗ (G ∗ D), so one small filter per direction does both jobs. We get the two DoG filters below:

 

D_x of Gaussian Filter | D_y of Gaussian Filter
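Building these filters takes only one extra convolution with the small Gaussian kernel. A sketch, using OpenCV's getGaussianKernel to construct the 2D Gaussian via an outer product (the kernel size and sigma below are illustrative, not my exact values):

import numpy as np
import cv2
from scipy.signal import convolve2d

k, sigma = 11, 2                       # example values; tune per image
g1d = cv2.getGaussianKernel(k, sigma)  # k-by-1 1D Gaussian
G = g1d @ g1d.T                        # 2D Gaussian from the outer product

# Convolving the small Gaussian with each derivative filter once gives
# reusable DoG filters, since (I * G) * D == I * (G * D).
DoG_x = convolve2d(G, np.array([[1, -1]]))
DoG_y = convolve2d(G, np.array([[1], [-1]]))

# Now one convolution per direction both smooths and differentiates:
# gx = convolve2d(im, DoG_x, mode='same')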

 

Applying these filters to the original image, we get the same edge image as above:

 

 

 

 

Part 1.3: Image Straightening

 

For this section, I used the gradient angle to straighten images. By taking the gradient and finding the angle at every pixel, you can count how many edges are in the “correct” horizontal or vertical orientation. I made a histogram of angles for each image with the bin edges allocated in the following way: [-185, -170, -95, -85, -5, 5, 85, 95, 175, 185]. This puts all the lines at -180°, -90°, 0°, 90°, and 180° into separate buckets, with a small tolerance around each angle to account for noise; all other angles fall into larger catch-all buckets. In the end, I chose the rotation angle that gave the largest percentage of horizontal and vertical edges out of all edges in the image.
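A rough sketch of this search on a grayscale float image; for brevity it scores angles with a tolerance directly rather than the histogram bins above, and it skips cropping the black borders that rotation introduces (sigma, tolerance, and the candidate range are illustrative):

import numpy as np
from scipy import ndimage

def straightness_score(im, tol=5):
    """Fraction of gradient angles within tol degrees of 0, ±90, or 180."""
    gx = ndimage.gaussian_filter(im, 2, order=(0, 1))  # smoothed d/dx
    gy = ndimage.gaussian_filter(im, 2, order=(1, 0))  # smoothed d/dy
    angles = np.degrees(np.arctan2(gy, gx)).ravel()
    targets = np.array([-180, -90, 0, 90, 180])
    return (np.min(np.abs(angles[:, None] - targets), axis=1) < tol).mean()

def straighten(im, candidates=np.arange(-10, 10.5, 0.5)):
    """Try each candidate rotation; keep the one maximizing the score."""
    best = max(candidates, key=lambda a: straightness_score(
        ndimage.rotate(im, a, reshape=False)))
    return ndimage.rotate(im, best, reshape=False), best

I got the following results: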

 

Original Image | Histogram | Straightened Image | New Histogram

As you can see, not all the images were successfully straightened. The straightened dog image looks very similar to the original input because the dog image does not have many horizontal or vertical lines; it is mostly made of irregular contours, so when the algorithm searches for edges at 0°, ±90°, or 180°, it finds very few.

 

 

 

 

Part 2.1: Image “Sharpening”

 

For this section, I implemented image “sharpening” by extracting the high frequencies of the image and adding them back to the original to increase its apparent clarity. Although no new information is added to the image, boosting the high-frequency content looks sharper to the human eye. I first convolved the image with a Gaussian filter to extract the low frequencies, then subtracted the result from the original image to get the high frequencies. I scaled the high frequencies by a weight and added them back to the original image. All of these operations combine into a single filter, which I then applied to each image with a single convolution.
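That combined filter is the classic unsharp mask, (1 + α)·impulse − α·Gaussian. A minimal sketch, with illustrative kernel size, sigma, and α:

import numpy as np
import cv2
from scipy.signal import convolve2d

def unsharp_filter(ksize=9, sigma=2, alpha=1.0):
    """Single kernel computing image + alpha * (image - blurred image)."""
    g1d = cv2.getGaussianKernel(ksize, sigma)
    G = g1d @ g1d.T                           # 2D lowpass (Gaussian)
    impulse = np.zeros_like(G)
    impulse[ksize // 2, ksize // 2] = 1.0     # identity under convolution
    return (1 + alpha) * impulse - alpha * G  # high-frequency boost in one kernel

# One convolution applies the whole operation (per color channel):
# sharp = np.clip(convolve2d(im, unsharp_filter(), mode='same'), 0, 1)

The results are shown below: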

 

 

Original | “Sharpened”

 

 

To test my implementation further, I took a sharp image, blurred it, and then tried to sharpen the blurred image by boosting its high frequencies. Surprisingly (or not surprisingly!), the blurred-then-sharpened image looks very similar to the original, even though the detail removed by the blur is not truly recovered.

 

 

Original Image | Blurred then Sharpened

 

 

 

Part 2.2: Hybrid Images

 

In this part, I created several different hybrid images. These images are special because up close you see one image, but from farther away you see a completely different image. This comes down to the frequencies contained in the images. To create a hybrid, I lowpass filtered one image, highpass filtered the other, and averaged the two together. Because high frequencies dominate what we see up close, the highpass-filtered image appears when you are near; because only the low frequencies survive at a distance, the lowpass-filtered image appears when viewing from afar.

 

How much of each image shows through can be adjusted by changing the cutoff frequency, the value that decides which frequencies to keep and which to throw out. With Gaussian filtering, this is encoded in the kernel size and sigma: a large kernel with a high sigma averages over a wider neighborhood, eliminating more high frequencies, while a very low sigma makes each pixel depend less on its neighbors, letting more high frequencies through.
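A minimal sketch of the construction, assuming the two grayscale float images are already aligned and the same size (the sigmas are illustrative cutoffs):

import cv2

def hybrid(im_high, im_low, sigma_high=3, sigma_low=7):
    """Average a highpass-filtered image with a lowpass-filtered one."""
    low = cv2.GaussianBlur(im_low, (0, 0), sigma_low)               # keep low frequencies
    high = im_high - cv2.GaussianBlur(im_high, (0, 0), sigma_high)  # remove low frequencies
    return (low + high) / 2

The results are shown below: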

 

 

Image 1 (highpass filtered) | Image 2 (lowpass filtered) | Hybrid

Nutmeg | Derek | Nutrek
Obama | Puppy | Puppama
Steve | Bill | Steve Gates

 

 

As you can see, Puppama (Obama + Puppy) didn’t work as well as the other hybrid images. Because both source images have hard, high-frequency lines, they were less compatible than the other pairs. Also, because the faces in the two images are different sizes, the individual outlines are much easier to see.

 

 

 

Fourier Domain

 

For the Steve Jobs and Bill Gates images, I also displayed each stage in the log Fourier domain. The highpass-filtered spectrum has its energy spread away from the center, indicating more high frequencies, while the lowpass-filtered spectrum is concentrated near the middle, where the low frequencies live.
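Each spectrum is just the log magnitude of the 2D FFT, shifted so the low frequencies sit at the center. A one-line sketch (gray is a grayscale float image; the small epsilon avoiding log(0) is my addition):

import numpy as np

def log_fourier(gray):
    """Log-magnitude spectrum with low frequencies shifted to the center."""
    return np.log(np.abs(np.fft.fftshift(np.fft.fft2(gray))) + 1e-8)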

 

 

Image 1 Fourier | Image 2 Fourier | Lowpass Fourier | Highpass Fourier | Hybrid Fourier

 

 

 

2.2 Bells and Whistles

 

I combined the Steve and Bill images entirely in grayscale, and again with one image in color and the other in grayscale. I found that keeping the lowpass image in color works much better than the all-grayscale version. I believe this is because color is usually not a quickly changing variable: it tends to change slowly across an image, so it is carried mostly by the low frequencies. Because of this, our eyes read the hybrid better when only the lowpass-filtered image is in color.

 

 

Gray | Lowpass Image Colored

 

 

 

Part 2.3: Gaussian and Laplacian Stacks

 

For this part, I implemented a Gaussian and a Laplacian stack to analyze the frequency composition of an image. To build the Gaussian stack, at each level I applied another Gaussian blur on top of the previous level’s result (without downsampling). At the same time, I subtracted consecutive Gaussian levels from one another to create the Laplacian stack.
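A minimal sketch of both stacks; no downsampling happens, which is what distinguishes a stack from a pyramid (the level count and sigma are illustrative):

import cv2

def gaussian_stack(im, levels=5, sigma=2):
    """Repeatedly blur without downsampling."""
    stack = [im]
    for _ in range(levels - 1):
        stack.append(cv2.GaussianBlur(stack[-1], (0, 0), sigma))
    return stack

def laplacian_stack(im, levels=5, sigma=2):
    """Differences of consecutive Gaussian levels; the last entry is the lowpass residual."""
    g = gaussian_stack(im, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

The result is shown below: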

 

Gaussian Stack


Laplacian Stack


 

In the original image, “Lincoln and Gala,” Lincoln consists mostly of low frequencies, so he can only be seen from a distance, while Gala is made of high frequencies, so she can only be seen up close. Decomposing the image shows exactly that: in the Gaussian stack we see Lincoln much more clearly, while in the Laplacian stack we see Gala more clearly.

 

I also created Gaussian and Laplacian stacks for my “Steve Gates” image from part 2.2. Similarly, Steve (highpass filtered) is much more apparent in the Laplacian stack, while Bill (lowpass filtered) becomes more apparent in the Gaussian stack.

 

 

Gaussian Stack


Laplacian Stack


 

 

 

Part 2.4: Multiresolution Blending

 

For this section, I implemented multiresolution blending to seamlessly blend together two different images. For each pair of images, I created a mask that determines how the images are combined: the mask is white (pixel value = 1) where the first image should appear and black (pixel value = 0) where the second should. To do the actual blending, I computed the Laplacian stack of both images and the Gaussian stack of the mask, then stitched the stacks together level by level, using the mask’s Gaussian stack as a weight function: at each level, blended = G·A + (1 − G)·B, so each pixel comes from one image, the other, or a soft combination near the seam. Because the combination happens level by level, each frequency band is blended separately with a correspondingly softened seam, which gives a much smoother result than blending the images all at once.
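A minimal sketch, reusing the stack helpers from Part 2.3 above (assuming grayscale float images and a float mask in [0, 1]):

def blend(im1, im2, mask, levels=5, sigma=2):
    """Blend each frequency band with the Gaussian-smoothed mask as weights."""
    l1 = laplacian_stack(im1, levels, sigma)   # frequency bands of image 1
    l2 = laplacian_stack(im2, levels, sigma)   # frequency bands of image 2
    gm = gaussian_stack(mask, levels, sigma)   # progressively softer mask
    return sum(g * a + (1 - g) * b for g, a, b in zip(gm, l1, l2))

Here are some of the results: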

 

 

Image 1 | Image 2 | Mask | Result

Apple | Orange | (mask) | Orapple
My Backyard | Campanile | (mask) | UC Berkeley @ Home!!
Sunflower | My Dog | (mask) | Sundog

 

 

To illustrate the process a little better, here is the Laplacian stack output for my “UC Berkeley @ Home” image. As you can see below, a different frequency band is extracted at each level, with the highest frequencies at the first level, and the bands are blended together level by level.

 

 

Level 0 | Level 1 | Level 2 | Level 3 | Level 4

 

 

Final Thoughts:

 

One thing I found super interesting was the power of the Gaussian filter. While it appears to be a simple averaging function, it produces many amazing results. I thought it was really neat how, despite the Gaussian “blurring” an image, it led to better edge detection by removing noise: simply adding the Gaussian drastically improved the derivative filters. I also thought the Gaussian was especially interesting in its ability to separate the low and high frequencies of an image, which let us create some really cool effects such as hybrid images and seamless image blending! Overall, the versatility of the Gaussian filter is what surprised me the most.