Fun with Filters and Frequencies!
Roma Desai | CS-194
Project 2
Part 1.1: Finite Difference Operator
For this section, I used finite difference operators and the gradient magnitude to detect the edges of an image. The first step was to take the derivative of the image in the x and y directions by convolving it with the finite difference filters Dx and Dy. We get the following result:
[Figures: Dx = [1, -1] filter | Dy = [1, -1]^T filter]
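Then, to turn this into an edge image, I computed the magnitude of the image gradient by calculating the following for every pixel:
$$\|\nabla I\| = \sqrt{\left(\frac{\partial I}{\partial x}\right)^2 + \left(\frac{\partial I}{\partial y}\right)^2}$$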
Once I had the magnitudes of the gradient vectors, I binarized the output by setting all values above 0.3 to 0 and all values below 0.3 to 1, which suppresses weak, noisy responses. Because the gradient vector points in the direction of the most rapid increase in intensity, its magnitude is largest where the intensity changes sharply, which is where the edges are. Here is the resulting edge image:
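The steps of this part can be sketched in a few lines of Python. This is a minimal sketch rather than the code used for the project: it assumes a grayscale float image in [0, 1], uses SciPy's `convolve2d` with a symmetric boundary (an assumption), and the function name `edge_image` is made up for illustration. The 0.3 threshold and the 0/1 assignment follow the text.

```python
import numpy as np
from scipy.signal import convolve2d

def edge_image(img, threshold=0.3):
    """Return (gradient magnitude, binarized edge image) for a 2-D float image."""
    dx = np.array([[1.0, -1.0]])   # finite difference in x
    dy = dx.T                      # finite difference in y
    gx = convolve2d(img, dx, mode="same", boundary="symm")
    gy = convolve2d(img, dy, mode="same", boundary="symm")
    magnitude = np.sqrt(gx**2 + gy**2)
    # Follow the text: strong responses (edges) become 0, everything else 1.
    binary = np.where(magnitude > threshold, 0, 1)
    return magnitude, binary

# Synthetic test image: dark left half, bright right half (one vertical edge).
img = np.zeros((6, 6))
img[:, 3:] = 1.0
magnitude, binary = edge_image(img)
```

On the synthetic image, the only pixels set to 0 lie along the single vertical edge, one per row.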
Part 1.2: Derivative of Gaussian (DoG) Filter
Because the edge detection in the previous section was a little noisy, in this section I added Gaussian smoothing before the derivative step to reduce the noise. This operation consists of creating a Gaussian filter, convolving it with the image, and then convolving the result with the finite difference operators from the previous part. We get the following result:
Compared to the result from 1.1, the edges here are much more pronounced and there is much less noise. However, this approach requires two convolutions over the entire image for each direction. Because convolution is associative, we can make the operation faster by convolving the Gaussian with the finite difference operator first and then convolving the image with the result only once. We get the two DoG filters below:
[Figures: D_x of Gaussian filter | D_y of Gaussian filter]
Applying these filters
to the original image, we get the same edge image as above:
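To check that the two routes really agree, here is a small sketch that builds the DoG filters and compares the two-pass and one-pass results on a random test image. The kernel size (7) and sigma (1.5) are assumed values, not the ones used for the results above, and `gaussian_kernel` is a helper written only for this example.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size, sigma):
    """Normalized 2-D Gaussian kernel of shape (size, size)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g1d = np.exp(-ax**2 / (2.0 * sigma**2))
    kernel = np.outer(g1d, g1d)
    return kernel / kernel.sum()

dx = np.array([[1.0, -1.0]])
gauss = gaussian_kernel(size=7, sigma=1.5)   # assumed values

# One small kernel-with-kernel convolution builds each DoG filter up front.
dog_x = convolve2d(gauss, dx)      # default "full" mode keeps every tap
dog_y = convolve2d(gauss, dx.T)

rng = np.random.default_rng(0)
img = rng.random((32, 32))

# Two passes over the image: blur, then differentiate.
two_step = convolve2d(convolve2d(img, gauss), dx)
# One pass over the image with the precomputed DoG filter.
one_step = convolve2d(img, dog_x)

print(np.allclose(two_step, one_step))
```

The comparison uses the default "full" mode with zero padding, where associativity holds exactly up to floating-point error. With "same" mode or other boundary handling, the two results can differ slightly near the image border.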
Part 1.3: Image Straightening
For this section, I used the gradient angle to straighten images. By taking the gradient and finding its angle at every pixel, you can count the number of horizontal and vertical edges that are in the “correct” orientation. I made a histogram of angles for each image with the bin edges allocated in the following way: [-185, -170, -95, -85, -5, 5, 85, 95, 175, 185]. This made sure to count all the lines that were at -180°, -90°, 0°, 90°, and 180° and put them into separate buckets, with a tolerance around each angle to account for noise. All angles outside these ranges were put into the larger buckets in between. In the end, I chose the rotation angle that gave the largest percentage of horizontal and vertical edges out of all the edges in the image. I got the following results:
[Figures, one row per image: Original Image | Histogram | Straightened Image | New Histogram]
As you can see, not all the images were successfully rotated. The dog image looks very similar to the original input image. This is because the dog image does not have many horizontal or vertical lines. It is mostly made up of irregular lines, so when the algorithm searches for lines at 0°, ±90°, or ±180°, it finds hardly any.
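The scoring idea behind the straightening can be sketched as follows. This is an illustrative sketch, not the project code. Instead of an explicit histogram, it measures each gradient angle's distance to the nearest multiple of 90°, which matches the ±5° buckets. The candidate angles, the magnitude threshold that decides what counts as an edge, and the central crop (to ignore rotation artifacts at the border) are all assumptions, and the function names are made up for this example.

```python
import numpy as np
from scipy.ndimage import rotate

def axis_aligned_fraction(img, tol_deg=5.0, mag_thresh=0.1):
    """Fraction of edge pixels whose gradient angle is within tol_deg of a multiple of 90 degrees."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    angles = np.degrees(np.arctan2(gy, gx))[mag > mag_thresh]
    if angles.size == 0:
        return 0.0
    # Distance from each angle to the nearest multiple of 90 degrees.
    dist = np.abs((angles + 45.0) % 90.0 - 45.0)
    return float(np.mean(dist <= tol_deg))

def best_rotation(img, candidates=range(-10, 11)):
    """Return (best angle in degrees, all scores) over the candidate rotations."""
    scores = {}
    for angle in candidates:
        rotated = rotate(img, angle, reshape=False, order=1, mode="nearest")
        # Score only the central crop so border artifacts are ignored.
        h, w = rotated.shape
        crop = rotated[h // 4 : 3 * h // 4, w // 4 : 3 * w // 4]
        scores[angle] = axis_aligned_fraction(crop)
    return max(scores, key=scores.get), scores

# Synthetic example: an upright vertical edge should need no rotation.
step = np.zeros((64, 64))
step[:, 32:] = 1.0
best, scores = best_rotation(step, candidates=(-30, 0, 30))
```

An image with few straight axis-aligned edges, like the dog photo, gives low scores at every candidate angle, so the chosen rotation is close to arbitrary.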
Part 2.1: Image “Sharpening”
For this section, I implemented image “sharpening” by extracting the high frequencies of the image and adding them back to the original to increase its clarity. Although no new information is added to the image, boosting the high frequencies makes the image appear sharper to the human eye. I first convolved the image with a Gaussian filter to extract the low frequencies. Then I subtracted this from the original image to get the high frequencies. Once I had the high frequencies, I scaled them by an appropriate amount and then added them back to the original image. I combined all these operations into a single filter, which was then applied to each image with a single convolution. The results are shown below:
[Figures: Original | “Sharpened”]
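The single-filter form of the sharpening step can be written as (1 + alpha) * impulse - alpha * Gaussian, where the impulse is the kernel that leaves an image unchanged. The sketch below builds that kernel and checks it against the step-by-step version. The kernel size, sigma, and alpha are assumed values, and the helper names are made up for this example.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size, sigma):
    """Normalized 2-D Gaussian kernel of shape (size, size)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g1d = np.exp(-ax**2 / (2.0 * sigma**2))
    kernel = np.outer(g1d, g1d)
    return kernel / kernel.sum()

def unsharp_kernel(size=9, sigma=2.0, alpha=1.0):
    """Single sharpening kernel; size must be odd so the impulse is centered."""
    gauss = gaussian_kernel(size, sigma)
    impulse = np.zeros_like(gauss)
    impulse[size // 2, size // 2] = 1.0   # unit impulse: identity under convolution
    return (1.0 + alpha) * impulse - alpha * gauss

rng = np.random.default_rng(0)
img = rng.random((20, 20))
alpha = 1.5                               # assumed sharpening strength

# Step by step: blur, subtract to get the high frequencies, scale, add back.
blurred = convolve2d(img, gaussian_kernel(9, 2.0), mode="same", boundary="symm")
manual = img + alpha * (img - blurred)

# Same result with one convolution.
single = convolve2d(img, unsharp_kernel(9, 2.0, alpha), mode="same", boundary="symm")
sharpened = np.clip(single, 0.0, 1.0)     # keep values in the displayable range
```

The kernel sums to 1, so flat regions keep their brightness and only areas with detail are changed.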
To test my implementation further, I took a clear image, blurred it, and then tried to sharpen the blurred image by boosting its high frequencies. Surprisingly (or not surprisingly!), the blurred-then-sharpened image looks very similar to the original image.
[Figures: Original Image | Blurred then Sharpened]
Part 2.2: Hybrid Images
In this part, I created several hybrid images. These images are special because from up close you see one image, but from farther away you see a completely different image. This is due to the frequencies contained in the images. To create the hybrid images, I low-pass filtered one image, high-pass filtered the other image, and averaged the two results together. Because our eyes can only resolve high frequencies up close, the high-pass filtered image appears when you are close to the image. From far away only the low frequencies remain visible, so the low-pass filtered image appears when viewing from a distance.
How strongly each of the two images shows through in the hybrid image can be adjusted by changing the cutoff frequency. This is the value that decides which frequencies to let through and which to throw out. In terms of Gaussian filtering, the cutoff was encoded through the kernel size and sigma values. If the kernel was large and the sigma value was high, then the Gaussian averaged over a larger neighborhood, which eliminated more high frequencies. If the sigma value was very low, then the value of each pixel depended less on the surrounding ones, so more high frequencies were let through. The results are shown below:
Image 1 (highpass filtered) | Image 2 (lowpass filtered) | Hybrid
Nutmeg | Derek | Nutrek
Obama | Puppy | Puppama
Steve | Bill | Steve Gates
As you can see, Puppama (Obama + Puppy) did not work as well as the other hybrid images. Because both images have hard, high-frequency lines, they were not as compatible as the other pairs. Also, because the faces in the two images were different sizes, it is much easier to see the individual outlines.
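The hybrid construction described in this part can be sketched as below, assuming aligned grayscale float images of the same shape. The two sigma values stand in for the cutoff frequencies and are assumed, not the ones used for the results above. `hybrid_image` is a made-up name, and SciPy's `gaussian_filter` is used for the blurring as a convenience.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(img_high, img_low, sigma_high=3.0, sigma_low=6.0):
    """Average the high frequencies of img_high with the low frequencies of img_low."""
    low = gaussian_filter(img_low, sigma_low)                 # low-pass
    high = img_high - gaussian_filter(img_high, sigma_high)   # high-pass = image - blur
    return (low + high) / 2.0, low, high

# Tiny usage example with random stand-in images.
rng = np.random.default_rng(0)
a = rng.random((32, 32))
b = rng.random((32, 32))
hybrid, low, high = hybrid_image(a, b)
```

Raising `sigma_low` removes more detail from the far-away image, while raising `sigma_high` lets more of the close-up image through. For display, the result may need rescaling since the high-pass part is centered around zero.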
Fourier Domain
For the Steve Jobs and Bill Gates images, I displayed the images in the log Fourier domain. The high-pass Fourier image has more energy at high frequencies, visible as points spread far from the center, while the points of the low-pass Fourier image are concentrated around the middle.
[Figures: Image 1 Fourier | Image 2 Fourier | Lowpass Fourier | Highpass Fourier | Hybrid Fourier]
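A log Fourier view like the ones above can be computed with NumPy's FFT routines. This is a minimal sketch for a grayscale image. The small offset inside the logarithm is an assumption added to avoid taking the log of zero, and `log_spectrum` is a made-up name.

```python
import numpy as np

def log_spectrum(img):
    """Log magnitude of the centered 2-D Fourier transform of a grayscale image."""
    spectrum = np.fft.fftshift(np.fft.fft2(img))   # move the zero frequency to the center
    return np.log(np.abs(spectrum) + 1e-8)         # small offset avoids log(0)

# A constant image has all of its energy at the zero frequency (the center).
flat = np.ones((8, 8))
spec = log_spectrum(flat)
```

After `fftshift`, low frequencies sit in the middle of the plot and high frequencies toward the borders, which is why the low-pass image looks concentrated around the center.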
2.2 Bells and Whistles
I combined the Steve and Bill images in grayscale, and also with one image in color and the other in grayscale. I found that keeping the low-pass image in color makes the hybrid look much better than the regular version. I believe this is because color usually does not change quickly. It tends to vary slowly throughout an image and is more closely tied to the low frequencies. Because of this, our eyes see the image better if only the low-pass filtered image is in color.
[Figures: Gray | Lowpass Image Colored]
Part 2.3: Gaussian and Laplacian Stacks
For this part, I implemented a Gaussian and a Laplacian stack to analyze the frequency composition of an image. To create the Gaussian stack, I filtered the image with a Gaussian filter at each level, applying another Gaussian filter to the output of the previous level. At the same time, I subtracted consecutive levels of the Gaussian stack to create the Laplacian stack. The result is shown below:
[Figures: Gaussian Stack]
[Figures: Laplacian Stack]
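The two stacks can be sketched as below. This is an illustrative sketch: the number of levels and the sigma are assumed values, and keeping the last Gaussian level as the final Laplacian entry is a common convention assumed here, not something stated in the text. Unlike a pyramid, a stack never downsamples, so every level keeps the input size.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(img, levels=5, sigma=2.0):
    """Blur repeatedly without downsampling; level 0 is the input image."""
    stack = [img.astype(float)]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(img, levels=5, sigma=2.0):
    """Differences of neighboring Gaussian levels, plus the last blurred level as a residual."""
    g = gaussian_stack(img, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

rng = np.random.default_rng(0)
img = rng.random((32, 32))
g_stack = gaussian_stack(img)
l_stack = laplacian_stack(img)
reconstructed = sum(l_stack)   # the levels telescope back to the input
```

Because the differences telescope, summing all Laplacian levels (including the residual) recovers the original image.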
In the original image “Lincoln and Gala”, Lincoln mostly consists of low frequencies, so he can only be seen from a distance, while Gala is made of high frequencies, so she can only be seen from up close. Decomposing the image shows just that. In the Gaussian stack, we can see Lincoln much more clearly, while in the Laplacian stack, we can see Gala more clearly.
I also created a Gaussian and Laplacian stack for my “Steve Gates” image from part 2.2. Similarly, we can see how Steve (high-pass filtered) is much more apparent in the Laplacian stack while Bill (low-pass filtered) becomes more apparent in the Gaussian stack.
[Figures: Gaussian Stack]
[Figures: Laplacian Stack]
Part 2.4: Multiresolution Blending
For this section, I implemented multiresolution blending to seamlessly blend together two different images. For each pair of images, I created a mask image that determined how the images would be blended together. To do the actual blending, I calculated the Laplacian stack of both images and the Gaussian stack of the mask, and stitched together the individual levels by using the mask's Gaussian stack as a weight function. The mask image was white (pixel value = 1) where the images should overlap and black (pixel value = 0) otherwise. Multiplying by the mask lets certain image pixels appear while others are zeroed out wherever the mask pixel value is zero. By adding the images together level by level, we blend them one frequency band at a time instead of all at once, which results in a much smoother look. Here are some of the results:
Image 1 | Image 2 | Mask | Result
Apple | Orange | (mask image) | Orapple
My Backyard | Campanile | (mask image) | UC Berkeley @ Home!!
Sunflower | My Dog | (mask image) | Sundog
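The blending procedure can be sketched as follows for grayscale float images and a mask with values in [0, 1]. This is a sketch under assumptions: the number of levels and sigma are made-up values, the last Gaussian level is kept as the Laplacian residual, and the weighting `m * a + (1 - m) * b` is the usual way to apply the mask as a weight function. The exact weighting used for the results above is not spelled out in the text.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(img, levels, sigma):
    """Blur repeatedly without downsampling; level 0 is the input."""
    stack = [img.astype(float)]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(img, levels, sigma):
    """Band-pass levels plus the final blurred level as a residual."""
    g = gaussian_stack(img, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

def blend(img_a, img_b, mask, levels=5, sigma=2.0):
    """Blend img_a (where mask is 1) with img_b (where mask is 0), band by band."""
    la = laplacian_stack(img_a, levels, sigma)
    lb = laplacian_stack(img_b, levels, sigma)
    gm = gaussian_stack(mask, levels, sigma)   # progressively softer mask
    bands = [m * a + (1.0 - m) * b for a, b, m in zip(la, lb, gm)]
    return sum(bands)

rng = np.random.default_rng(0)
img_a = rng.random((32, 32))
img_b = rng.random((32, 32))
mask = np.zeros((32, 32))
mask[:, :16] = 1.0                             # left half from img_a
result = blend(img_a, img_b, mask)
```

Low-frequency bands are mixed with a heavily blurred mask and high-frequency bands with a sharper one, which is what hides the seam.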
To illustrate the process a little better, here is the Laplacian stack output for my “UC Berkeley @ Home” image. As you can see below, a different frequency band is extracted at each level, with the highest frequencies at the first level. The bands are blended separately at each level.
[Figures: Level 0 | Level 1 | Level 2 | Level 3 | Level 4]
Final Thoughts:
One thing I found super interesting was the power of the Gaussian filter. While it appears to be a simple averaging function, it makes many amazing results possible. I thought it was really neat how, despite the Gaussian actually “blurring” an image, it resulted in better edge detection due to the removal of noise. Simply adding the Gaussian filter drastically improved the results of the derivative filters, which I thought was very cool. I also found the Gaussian filter especially interesting for its ability to separate the low and high image frequencies, which let us create some really cool effects such as the hybrid images and seamless image blending! Overall, I think the versatility of the Gaussian filter is what surprised me the most.