Project 2:
Fun with Filters and Frequencies!


Author: Isaac Bae
Class: CS 194-26 (UC Berkeley)
Date: 9/23/21


Overview


In this project, I used concepts like 2D convolution, Gaussian filtering, and Laplacian stacks to implement image manipulation and creation procedures. All images used in this project are from Google Images.



Part 1: Fun with Filters


Part 1.1: Finite Difference Operator


To compute the gradient magnitude of an image, we must first get the partial derivatives in the x and y direction by convolving the image with finite difference operators:

Dx = [1 -1] and Dy = [1 -1]^T

Let's use the cameraman image as an example.
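As a quick sketch of this step (not my exact code), the partial derivatives can be computed with scipy's 2D convolution; the array im below is just a random stand-in for a grayscale image with values in [0, 1]:

```python
import numpy as np
from scipy.signal import convolve2d

# Stand-in for a grayscale image with values in [0, 1].
im = np.random.rand(64, 64)

# Finite difference operators.
Dx = np.array([[1, -1]])    # horizontal difference
Dy = np.array([[1], [-1]])  # vertical difference

# Partial derivatives via 2D convolution ('same' keeps the image size).
im_dx = convolve2d(im, Dx, mode='same')
im_dy = convolve2d(im, Dy, mode='same')
```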


Original Image

cameraman
cman_pd1

Then, we need to use the Pythagorean theorem with the partial derivatives to calculate the gradient magnitude. Basically,

c^2 = a^2 + b^2

where a and b are the partial derivatives in x and y, and c is the gradient magnitude.

Additionally, to turn this into an edge image, we can binarize the gradient magnitude image by picking a threshold. The goal is to keep the real edges while suppressing the noise. For my edge image, I chose a threshold of 0.25, with the gradient magnitude values lying in the inclusive range [0, 1].
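Putting the two steps together, a sketch of the gradient magnitude and its binarization (with im again a random stand-in image) might look like:

```python
import numpy as np
from scipy.signal import convolve2d

im = np.random.rand(64, 64)  # stand-in grayscale image in [0, 1]

# Partial derivatives from the finite difference operators.
im_dx = convolve2d(im, np.array([[1, -1]]), mode='same')
im_dy = convolve2d(im, np.array([[1], [-1]]), mode='same')

# Gradient magnitude via the Pythagorean theorem.
grad_mag = np.sqrt(im_dx**2 + im_dy**2)

# Binarize with a threshold (0.25 worked well for the cameraman image).
edges = (grad_mag > 0.25).astype(float)
```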


Gradient Magnitude

cman_gm1

Binarized Gradient Magnitude

cman_bgm1

Part 1.2: Derivative of Gaussian (DoG) Filter


As you can probably tell, the results from before were quite noisy. This is because we were using only the difference operators. For this section, we will use a Gaussian filter G to smooth (i.e., blur) our image(s).

To create the Gaussian filter, I referred to the "rule of thumb" from lecture, which stated that the filter half-width should be around 3*σ. With that rule, I thought of a natural formula to calculate the kernel size:

ksize = 2 * ceil(3σ) + 1

The ceil() makes ksize a (positive) integer, since σ can be any positive real number, and the +1 makes ksize odd, so the Gaussian has a single peak value at its center. This formula is used throughout this project.
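A sketch of building the 2D Gaussian with this formula (the function name is mine; taking the outer product of a 1D Gaussian with itself yields the 2D kernel):

```python
import numpy as np

def gaussian_kernel(sigma):
    # ksize = 2 * ceil(3 * sigma) + 1: positive, odd, ~3-sigma half-width.
    ksize = 2 * int(np.ceil(3 * sigma)) + 1
    x = np.arange(ksize) - ksize // 2
    g1d = np.exp(-x**2 / (2 * sigma**2))
    g1d /= g1d.sum()  # normalize so the kernel sums to 1
    # Outer product of the 1D Gaussian with itself gives the 2D kernel.
    return np.outer(g1d, g1d)

G = gaussian_kernel(2)  # ksize = 2 * 6 + 1 = 13
```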

Now, we can start off by using G to make a blurred image from the original. Then, we just repeat what we did before. Let's see how that changes our results. For the edge image, I used a threshold of 0.0275.


Blurred Image (σ = 2)

blurred_cman
cman_pd2

Gradient Magnitude

cman_gm2

Binarized Gradient Magnitude

cman_bgm2

Quite a drastic difference, right? In the partial derivatives, the edges are much more pronounced and clear, so we can see some edges that were too noisy or faint before. Along the same lines, the outlines in the gradient magnitude and its binarized version are much cleaner. Specifically for the binarized version, it seems that we are able to see more of the real edges before the noise starts creeping in (as we adjust the threshold).

To reduce the work in this process, we can convolve the Gaussian with each finite difference operator to create a single derivative-of-Gaussian filter for each direction. Here are the results from doing so.

Partial Derivatives

cman_pd3

Gradient Magnitude

cman_gm3

Binarized Gradient Magnitude

cman_bgm3

As you can see, we were able to skip the separate blurring step and achieve the same results as before. This really shows the power of convolution, as this idea can be extended to combine many operations into a single filter.
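This equivalence follows from the associativity of convolution, which a small sketch can verify (using 'full' convolution so the boundary handling matches on both paths; the Gaussian helper is my own):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(sigma):
    # Same ksize = 2 * ceil(3 * sigma) + 1 rule as before.
    ksize = 2 * int(np.ceil(3 * sigma)) + 1
    x = np.arange(ksize) - ksize // 2
    g1d = np.exp(-x**2 / (2 * sigma**2))
    g1d /= g1d.sum()
    return np.outer(g1d, g1d)

im = np.random.rand(32, 32)  # stand-in grayscale image
G = gaussian_kernel(2)
Dx = np.array([[1, -1]])

# A single derivative-of-Gaussian filter: convolve G with Dx once.
DoG_x = convolve2d(G, Dx)

# Because convolution is associative, one pass with DoG_x
# matches blurring first and then differentiating.
two_step = convolve2d(convolve2d(im, G), Dx)
one_step = convolve2d(im, DoG_x)
```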


Part 2: Fun with Frequencies


Part 2.1: Image "Sharpening"


We are now moving into manipulating images through the frequency domain. In this section, we will be sharpening an image using a single convolution operation called the unsharp mask filter. It is defined by this formula:

UMF = (1 + α)e - αg

where α = scaling factor for the details, e = unit impulse (identity), and g = Gaussian filter.
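A sketch of building and applying this filter (the helper names are mine; im is a random stand-in image):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(sigma):
    ksize = 2 * int(np.ceil(3 * sigma)) + 1
    x = np.arange(ksize) - ksize // 2
    g1d = np.exp(-x**2 / (2 * sigma**2))
    g1d /= g1d.sum()
    return np.outer(g1d, g1d)

def unsharp_mask_filter(sigma, alpha):
    g = gaussian_kernel(sigma)
    # Unit impulse e: all zeros except a 1 at the kernel center.
    e = np.zeros_like(g)
    e[g.shape[0] // 2, g.shape[1] // 2] = 1.0
    # UMF = (1 + alpha) * e - alpha * g
    return (1 + alpha) * e - alpha * g

im = np.random.rand(64, 64)  # stand-in image in [0, 1]
umf = unsharp_mask_filter(sigma=4, alpha=1)
sharp = np.clip(convolve2d(im, umf, mode='same'), 0, 1)
```

Note that the filter sums to (1 + α) - α = 1, so flat regions of the image are left unchanged; only the details get amplified.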

Let's see this play out with a couple of examples.


Original Image

taj

Sharpened Image (σ = 4, α = 1)

sharp_taj

Original Image

pop_cat

Sharpened Image (σ = 4, α = 3)

sharp_cat

Now, let's look at an example where I blurred a sharp image, and tried to sharpen it.


Original Image

totoro

Blurred Image (σ = 4)

blurred_totoro1

Sharpened Image (σ = 0.6, α = 300)

sharp_totoro1

Blurred Image (σ = 0.6)

blurred_totoro2

Sharpened Image (σ = 0.6, α = 10)

sharp_totoro2

To clarify the σ values: the σ listed with each blurred image is the one I used to blur it, while the σ listed with each sharpened image is the one I used for the Gaussian in the UMF.

Compared to the strong blurring, the weak blurring and sharpening is far more fine-grained. However, both versions have crisper outlines for the characters and objects (than their respective blurred images), which gives more emphasis to the details, and everything simply got brighter (not hazy anymore!).

Part 2.2: Hybrid Images


By averaging the high frequencies of one image with the low frequencies of another, we can create a hybrid image, assuming the two images are properly aligned. Here are some examples of that:
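Before the examples, here is a minimal sketch of the construction, with random arrays standing in for two aligned grayscale images and scipy's gaussian_filter as the low-pass filter (the cutoff σ values here are illustrative, not the ones I tuned per pair):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

im1 = np.random.rand(64, 64)  # stand-in for the high-frequency image
im2 = np.random.rand(64, 64)  # stand-in for the low-frequency image

# Low frequencies: just a Gaussian blur of im2.
low = gaussian_filter(im2, sigma=6)

# High frequencies: the image minus its own blur (a simple high-pass).
high = im1 - gaussian_filter(im1, sigma=3)

# Average the two frequency bands to form the hybrid image.
hybrid = (low + high) / 2
```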


Image 1

nutmeg

Hybrid Image

gg_class_sample

Image 2

DerekPicture

Image 1

george_w_bush

Hybrid Image

george_obama

Image 2

barack_obama

Image 1

hard_droplet

Hybrid Image

splash

Image 2

soft_droplet

Image 1

waldo

Hybrid Image

waldo_on_wall

Image 2

brick_wall

Here is a failure case, which swaps the two inputs of the last example. This fails because it is simply too hard to recognize that there is a crowd, since recognizing it relies mostly on the high frequencies (details) of the many people and objects.

Image 1

brick_wall

Hybrid Image

wall_on_waldo

Image 2

waldo

My favorite was the one with George W. Bush and Barack Obama. For that pair, here are the images of the log magnitude of the Fourier transform for the two input images, the filtered images, and the hybrid image.
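Each log-magnitude spectrum can be computed with an expression along these lines (im standing in for any of the grayscale images; the small constant guards against log(0)):

```python
import numpy as np

im = np.random.rand(64, 64)  # stand-in for a grayscale image

# 2D FFT -> shift the DC component to the center -> log of the magnitude.
log_mag = np.log(np.abs(np.fft.fftshift(np.fft.fft2(im))) + 1e-8)
```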

Aligned Image 1

aligned_im1

Aligned Image 2

aligned_im2

High Pass Filtered Image 1

hp_im1

Low Pass Filtered Image 2

lp_im2

Hybrid Image

george_obama
fa_aligned_im1
fa_aligned_im2
fa_hp_im1
fa_lp_im2
fa_hybrid_im

How does color enhance the effect? Let's look at colorized versions of the class sample. Specifically, the first one is where the cat is colorized, the second one is where Derek is colorized, and the third one is where both are colorized.


cg_class_sample
gc_class_sample
cc_class_sample

From this specific example, the second one takes the win. In general, colors tend to survive blurring (low-pass filtering) better than high-pass filtering. As such, Derek's face really pops out when I look from far away, and the color helps the cat disappear more. Up close, I can see the cat with no problem in all three. However, I think all three are more interesting and vibrant than the simple gray-scale version, as color gives the eyes more to take in.


Part 2.3-4: Multi-Resolution Blending and the Oraple Journey

The goal of this part of the assignment is to blend two images seamlessly using multi-resolution blending. This involves creating Gaussian and Laplacian stacks for the two images, and blending them together with the help of the completed stacks. Here are the examples (including the oraple!).
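A sketch of the stack construction and blending, assuming scipy's gaussian_filter, a per-level σ of 2, and 5 levels (all of these names and parameters are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, levels, sigma=2):
    # Unlike a pyramid, a stack keeps full resolution at every level.
    stack = [im]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(im, levels, sigma=2):
    g = gaussian_stack(im, levels, sigma)
    # Band-pass levels are differences of adjacent Gaussian levels;
    # the last level keeps the residual low frequencies.
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

def blend(im1, im2, mask, levels=5, sigma=2):
    l1 = laplacian_stack(im1, levels, sigma)
    l2 = laplacian_stack(im2, levels, sigma)
    gm = gaussian_stack(mask, levels, sigma)
    # Weight each band by the progressively blurred mask, then sum.
    return sum(m * a + (1 - m) * b for a, b, m in zip(l1, l2, gm))
```

Summing a Laplacian stack telescopes back to the original image, which is why blending band by band and summing reconstructs a seamless result.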


apple
gray_oraple
orange
aligned_plane
plane_quarter
plane_mask
aligned_quarter
aligned_ms
sword
aligned_rs