In this project, I used various concepts like 2D convolution, Gaussian filtering, and Laplacian stacks to implement image manipulation and creation procedures. All images used in this project are from Google Images.
To compute the gradient magnitude of an image, we must first get the partial derivatives in
the x and y direction by convolving the image with finite difference operators:
Dx = [1 -1] and Dy = [1 -1]^T
Let's use the cameraman image as an example.
Original Image
Then, we need to use Pythagorean Theorem with the partial derivatives to calculate the
gradient magnitude. Basically,
c² = a² + b²
where a, b = partial derivatives in x and y, and c = gradient magnitude.
Additionally, to turn this into an edge image, we could binarize the gradient magnitude image
by picking a certain threshold. The goal is to keep the real edges, while suppressing the
noise. For my edge image, I chose a threshold of 0.25, where the pixels in my gradient
magnitude were inclusively between 0 and 1.
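The steps above can be sketched as a short function. This is a minimal grayscale sketch, assuming the image is a 2D float array scaled to [0, 1]; the 0.25 threshold matches the value I chose above.

```python
import numpy as np
from scipy.signal import convolve2d

def gradient_edges(im, threshold=0.25):
    """Gradient magnitude via finite differences, then binarized edges."""
    Dx = np.array([[1, -1]])      # finite difference operator in x
    Dy = np.array([[1], [-1]])    # finite difference operator in y
    dx = convolve2d(im, Dx, mode="same")   # partial derivative in x
    dy = convolve2d(im, Dy, mode="same")   # partial derivative in y
    mag = np.sqrt(dx**2 + dy**2)           # Pythagorean combination
    return mag, (mag > threshold)          # magnitude and binarized edges
```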
Gradient Magnitude
Binarized Gradient Magnitude
As you can probably tell, the results from before were quite noisy. However, this is because we were only using the difference operators. For this section, we will use the Gaussian filter to smooth the image before taking derivatives.
To create the Gaussian filter, I referred to the "rule of thumb" from lecture, which stated
that the filter half-width should be around 3*σ. With that rule, I thought of a natural
formula to calculate the kernel size:
ksize = 2 * ceil(3σ) + 1
The ceil() is there to make the ksize a (positive) integer, since σ can be any positive real number, and the +1 makes the ksize odd so that the Gaussian has a single peak at the center pixel. This formula is used throughout this project.
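The kernel-size formula translates directly into code. A minimal sketch of building the 2D Gaussian as an outer product of a normalized 1D Gaussian (the separable construction; the exact helper the writeup used is not specified):

```python
import numpy as np

def gaussian_kernel(sigma):
    """2D Gaussian filter sized by the rule of thumb: half-width ≈ 3σ."""
    ksize = 2 * int(np.ceil(3 * sigma)) + 1   # odd size, peak on a pixel
    ax = np.arange(ksize) - ksize // 2        # coordinates centered at 0
    g1d = np.exp(-ax**2 / (2 * sigma**2))     # 1D Gaussian weights
    g1d /= g1d.sum()                          # normalize to sum to 1
    return np.outer(g1d, g1d)                 # separable -> 2D kernel
```

Because the 1D weights sum to 1, the 2D kernel does too, so blurring preserves overall image brightness.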
Now, we can start off by blurring the image with the Gaussian filter, then applying the difference operators as before.
Blurred Image (σ = 2)
Gradient Magnitude
Binarized Gradient Magnitude
Quite a drastic difference, right? In the partial derivatives, the edges are much more pronounced and clear, so we can see some edges that were too noisy or faint before. Along the same lines, there are much better outlines in the gradient magnitude and its binarized version. Specifically for the binarized version, it seems that we are able to see more of the real edges before the noise starts creeping in (from increasing the threshold).
Blurred
Gradient Magnitude
Binarized Gradient Magnitude
As you can see, we were able to skip the separate blurring step and achieve the same results as before. This really shows the power of convolution, as this idea can be extended to fold many operations into a single filter.
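Folding the blur into the difference operators gives the derivative-of-Gaussian (DoG) filters described above. A minimal sketch, reusing the same kernel-size formula; filter names are my own:

```python
import numpy as np
from scipy.signal import convolve2d

def dog_filters(sigma):
    """Convolve the Gaussian with Dx and Dy once, so a single filter
    per direction performs both the blur and the derivative."""
    ksize = 2 * int(np.ceil(3 * sigma)) + 1
    ax = np.arange(ksize) - ksize // 2
    g1d = np.exp(-ax**2 / (2 * sigma**2))
    g1d /= g1d.sum()
    G = np.outer(g1d, g1d)            # 2D Gaussian
    Dx = np.array([[1, -1]])
    Dy = np.array([[1], [-1]])
    DoGx = convolve2d(G, Dx)          # blur + d/dx in one filter
    DoGy = convolve2d(G, Dy)          # blur + d/dy in one filter
    return DoGx, DoGy
```

By associativity of convolution, convolving the image with DoGx is equivalent to blurring first and then differencing.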
We are now moving into manipulating images through the frequency domain. In this section, we
will be sharpening an image using a single convolution operation called the unsharp mask
filter. It is defined by this formula:
UMF = (1 + α)e - αg
where α = scaling factor for the details, e = unit impulse (identity), and g = Gaussian
filter.
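The formula above can be built as a literal kernel. A minimal sketch, assuming a grayscale float image; the unit impulse is a kernel of zeros with a 1 at the center:

```python
import numpy as np
from scipy.signal import convolve2d

def unsharp_mask_filter(sigma, alpha):
    """Single filter implementing UMF = (1 + α)e - αg."""
    ksize = 2 * int(np.ceil(3 * sigma)) + 1
    ax = np.arange(ksize) - ksize // 2
    g1d = np.exp(-ax**2 / (2 * sigma**2))
    g1d /= g1d.sum()
    g = np.outer(g1d, g1d)                 # Gaussian filter
    e = np.zeros_like(g)
    e[ksize // 2, ksize // 2] = 1.0        # unit impulse (identity)
    return (1 + alpha) * e - alpha * g     # the unsharp mask filter

def sharpen(im, sigma, alpha):
    """Sharpen with one convolution against the UMF kernel."""
    return convolve2d(im, unsharp_mask_filter(sigma, alpha), mode="same")
```

Since both e and g sum to 1, the UMF kernel sums to (1 + α) − α = 1, so overall brightness is preserved while details are amplified.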
Let's see this play out with a couple of examples.
Original Image
Sharpened Image (σ = 4, α = 1)
Original Image
Sharpened Image (σ = 4, α = 3)
Now, let's look at an example where I blurred a sharp image, and tried to sharpen it.
Original Image
Blurred Image (σ = 4)
Sharpened Image (σ = 0.6, α = 300)
Blurred Image (σ = 0.6)
Sharpened Image (σ = 0.6, α = 10)
By averaging the high frequencies of one image with the low frequencies of another, we can create a hybrid image, assuming the two are properly aligned. Here are some examples of that:
Image 1
Hybrid Image
Image 2
Image 1
Hybrid Image
Image 2
Image 1
Hybrid Image
Image 2
Image 1
Hybrid Image
Image 2
Image 1
Hybrid Image
Image 2
Aligned Image 1
Aligned Image 2
High Pass Filtered Image 1
Low Pass Filtered Image 2
Hybrid Image
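The high-pass/low-pass pipeline shown above can be sketched as follows. This is a minimal grayscale version under my own assumptions: the high-pass is the image minus its Gaussian blur, the low-pass is a Gaussian blur, and the two bands are averaged as described; the σ values per image are free parameters.

```python
import numpy as np
from scipy.signal import convolve2d

def blur(im, sigma):
    """Gaussian blur using the same kernel-size rule of thumb."""
    ksize = 2 * int(np.ceil(3 * sigma)) + 1
    ax = np.arange(ksize) - ksize // 2
    g1d = np.exp(-ax**2 / (2 * sigma**2))
    g1d /= g1d.sum()
    return convolve2d(im, np.outer(g1d, g1d), mode="same")

def hybrid(im1, im2, sigma1, sigma2):
    """im1 contributes high frequencies, im2 low frequencies.
    Assumes the two images are already aligned and the same size."""
    high = im1 - blur(im1, sigma1)   # high-pass: image minus its blur
    low = blur(im2, sigma2)          # low-pass: Gaussian blur
    return (high + low) / 2          # average the two frequency bands
```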
How does color enhance the effect? Let's look at colorized versions of the class sample. Specifically, the first one is where the cat is colorized, the second one is where Derek is colorized, and the third one is where both are colorized.
From this specific example, the second one takes the win. In general, colors tend to survive blurring better than high-pass filtering. As such, Derek's face really pops when I look from far away, and the color helps make the cat disappear more. Up close, I can see the cat without a problem in all three. However, I think all three are more interesting and vibrant than the simple gray-scale one, as color gives the eyes more to latch onto.
The goal of this part of the assignment is to blend two images seamlessly using multiresolution blending. This involves creating Gaussian and Laplacian stacks for the two images, then blending them together with the help of the completed stacks. Here are the examples (including the oraple!).
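The stacks-and-blend procedure can be sketched as below. This is a grayscale sketch under stated assumptions: each Laplacian level is the difference between consecutive Gaussian levels, the mask is a float image in [0, 1] (1 where im1 should show), and the level count and σ are illustrative choices, not the writeup's exact values.

```python
import numpy as np
from scipy.signal import convolve2d

def blur(im, sigma):
    """Gaussian blur using the kernel-size rule of thumb."""
    ksize = 2 * int(np.ceil(3 * sigma)) + 1
    ax = np.arange(ksize) - ksize // 2
    g1d = np.exp(-ax**2 / (2 * sigma**2))
    g1d /= g1d.sum()
    return convolve2d(im, np.outer(g1d, g1d), mode="same")

def multires_blend(im1, im2, mask, levels=3, sigma=2):
    """Blend each Laplacian band with a progressively blurred mask,
    then add back the blended low-pass residual."""
    g1, g2, gm = im1.astype(float), im2.astype(float), mask.astype(float)
    out = np.zeros_like(g1)
    for _ in range(levels):
        b1, b2 = blur(g1, sigma), blur(g2, sigma)
        l1, l2 = g1 - b1, g2 - b2          # Laplacian level: band of detail
        out += gm * l1 + (1 - gm) * l2     # blend this band with current mask
        g1, g2, gm = b1, b2, blur(gm, sigma)
    out += gm * g1 + (1 - gm) * g2         # blended low-frequency residual
    return out
```

Blurring the mask at each level is what hides the seam: coarse bands transition gradually while fine bands switch over sharply.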