Leon Ming's Project 2

CS 194 Fall 2020

Project 2: Fun with Filters and Frequencies!

Leon Ming

Overview

In this project, we perform several related tasks involving filters:

Part 1.1: Finite Difference Operator
Part 1.2: Derivative of Gaussian (DoG) Filter
Part 1.3: Image Straightening
Part 2.1: Image "Sharpening"
Part 2.2: Hybrid Images
Part 2.3: Gaussian and Laplacian Stacks
Part 2.4: Multiresolution Blending

Part 1.1: Finite Difference Operator

First, we look at what we get by applying the difference operators in the x and y directions to our cameraman photo.

Next, we can turn the above results into a gradient magnitude visualization. This computation is performed by taking the (element-wise) square root of the sum of squares of the above two arrays.

Although the binarized version on the right looks cleaner than the raw gradient magnitude, it still contains some noise, and the edge lines are faint.
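The computation above can be sketched as follows. This is a minimal illustration rather than the exact code used for the figures: the kernel sign convention, the boundary handling, and the `threshold` parameter are assumptions.

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference kernels in x and y (sign convention is an assumption).
D_X = np.array([[1.0, -1.0]])
D_Y = np.array([[1.0], [-1.0]])

def gradient_magnitude(image):
    """Return (dx, dy, magnitude) for a 2D grayscale float image."""
    dx = convolve2d(image, D_X, mode="same", boundary="symm")
    dy = convolve2d(image, D_Y, mode="same", boundary="symm")
    # Element-wise square root of the sum of squares of the two derivatives.
    return dx, dy, np.sqrt(dx ** 2 + dy ** 2)

def binarize(magnitude, threshold):
    """Keep only pixels whose gradient magnitude exceeds a hand-picked threshold."""
    return magnitude > threshold
```

In practice the threshold is tuned by eye: too low and noise survives, too high and real edges disappear.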

Part 1.2: Derivative of Gaussian (DoG) Filter

By first blurring the image with a Gaussian filter of size 11 and standard deviation 2, we can perform the same operation as in the previous part and get better results.

As I hoped, the edges are much better defined, and no additional noise is introduced. For efficiency, we can also convolve the Gaussian filter with the difference filters first, then apply the combined filters to the original image. This is what the resulting Derivative of Gaussian filters look like.

This is the result we get from using the DoG filters on the cameraman photo.

As expected, the results look the same as if we were to apply the filters one after the other.
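The equivalence follows from the associativity of convolution. Here is a small sketch of how the DoG filters could be built, using the size 11 and standard deviation 2 from above; the helper names and the way the Gaussian kernel is constructed are assumptions.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size=11, sigma=2.0):
    """2D Gaussian kernel built as the outer product of a normalized 1D Gaussian."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

D_X = np.array([[1.0, -1.0]])
D_Y = np.array([[1.0], [-1.0]])

G = gaussian_kernel()
# Convolve the Gaussian with each difference operator once, up front.
DOG_X = convolve2d(G, D_X)  # full convolution, shape (11, 12)
DOG_Y = convolve2d(G, D_Y)  # full convolution, shape (12, 11)

def dog_gradient_magnitude(image):
    """Gradient magnitude using a single convolution per direction."""
    dx = convolve2d(image, DOG_X, mode="same")
    dy = convolve2d(image, DOG_Y, mode="same")
    return np.sqrt(dx ** 2 + dy ** 2)
```

With full (zero-padded) convolutions the one-step and two-step results agree up to floating point error; with cropped output modes they can differ slightly near the image border.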

Part 1.3: Image Straightening

For facade.jpg, I tried several rotation candidates ranging from -4 degrees to +3 degrees, in 1 degree increments. I also cropped each side by 5% of the image width. My candidate ranking function chose -3 degrees as the best rotation, based on the proportion of edge angles within 6 degrees of vertical or horizontal.
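A ranking function of this kind might look like the sketch below. Only the 5% crop and the 6 degree tolerance come from the description above; the smoothing step, the `mag_quantile` cutoff for "strong" edges, and the function names are assumptions for illustration.

```python
import numpy as np
from scipy import ndimage

def straightness_score(image, angle_deg, crop_frac=0.05, tol_deg=6.0,
                       blur_sigma=2.0, mag_quantile=0.9):
    """Fraction of strong gradients lying within tol_deg of an image axis."""
    rotated = ndimage.rotate(image, angle_deg, reshape=False, order=1)
    # Crop each side by a fraction of the image width.
    margin = int(round(crop_frac * image.shape[1]))
    if margin > 0:
        rotated = rotated[margin:-margin, margin:-margin]
    smoothed = ndimage.gaussian_filter(rotated, blur_sigma)
    dy, dx = np.gradient(smoothed)
    magnitude = np.hypot(dx, dy)
    # Only consider the strongest gradients (assumed cutoff).
    strong = magnitude > np.quantile(magnitude, mag_quantile)
    if not strong.any():
        return 0.0
    # Fold angles into [0, 90) and measure distance to the nearest axis.
    angles = np.degrees(np.arctan2(dy[strong], dx[strong])) % 90.0
    off_axis = np.minimum(angles, 90.0 - angles)
    return float(np.mean(off_axis <= tol_deg))

def best_rotation(image, candidates):
    """Return the candidate angle (in degrees) with the highest score."""
    scores = [straightness_score(image, a) for a in candidates]
    return candidates[int(np.argmax(scores))]
```

For example, `best_rotation(image, list(range(-4, 4)))` would cover the candidate range used for facade.jpg.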

In this next image, houses.jpg, my algorithm tried -20 degrees to +4 degrees, in increments of 2 degrees. According to my heuristic, -12 degrees was the best rotation.

In monkeys.jpg, my algorithm tried -30 degrees to +30 degrees, in increments of 5 degrees. The optimal rotation was 15 degrees. I found this one particularly interesting. Despite the tree branch resting at an angle, the algorithm still got the correct orientation, likely due to the left monkey's tail.

My algorithm failed on beijing.jpg, a picture of the Beijing National Stadium, which by design has many edges at interesting angles. The algorithm recommended a rotation of 0 degrees, but in fact the best rotation should have been around -12 degrees. This failure is expected, since the architectural design means that few of the edges are vertical or horizontal.

Part 2.1: Image "Sharpening"

This is the result of sharpening the blurry Taj image. I simply added the high frequency component of the image once in the second panel and twice in the third panel.
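This operation is commonly called unsharp masking. A minimal sketch, assuming a grayscale float image in [0, 1] and an illustrative `sigma` (the blur width actually used is not stated here):

```python
import numpy as np
from scipy import ndimage

def sharpen(image, alpha=1.0, sigma=2.0):
    """Add alpha times the high frequency component back to the image."""
    blurred = ndimage.gaussian_filter(image, sigma)  # low frequencies
    high = image - blurred                           # high frequencies
    return np.clip(image + alpha * high, 0.0, 1.0)
```

Under these assumptions, `alpha=1.0` corresponds to adding the high frequencies once and `alpha=2.0` to adding them twice.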

Below is a comparable result I obtained by performing the same operation on a blurry image of a dog.

I also tried to blur an image, then sharpen it using the same tactic as above. Below is the result.

Careful observation shows that although the 3rd image in each row appears sharper than the 2nd, it has in fact lost information compared to the 1st. This is unavoidable, as information lost to blurring cannot simply be regained.

Part 2.2: Hybrid Images

Example #1:

Example #2:

Example #3:

Example #4 (failure):

It seems that hybrid images work best when two images are similar in overall structure/texture. In example #4, a squirrel and seagull may be too different to create a hybrid image.

Below is the frequency domain visualization of example #3. As expected, a low-pass filter decreases high frequencies and a high-pass filter decreases low frequencies (relative to higher frequencies). The Fourier transform of the hybrid image has a balanced combination of both.
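For reference, a hybrid image and its frequency visualization can be sketched as below. The cutoff values `sigma_low` and `sigma_high` are placeholders rather than the values used for the examples, and both inputs are assumed to be aligned grayscale float images of the same shape.

```python
import numpy as np
from scipy import ndimage

def low_pass(image, sigma):
    """Keep low frequencies with a Gaussian blur."""
    return ndimage.gaussian_filter(image, sigma)

def high_pass(image, sigma):
    """Keep high frequencies by subtracting the blurred image."""
    return image - ndimage.gaussian_filter(image, sigma)

def hybrid_image(far_image, near_image, sigma_low=6.0, sigma_high=3.0):
    """Low frequencies of far_image plus high frequencies of near_image."""
    return low_pass(far_image, sigma_low) + high_pass(near_image, sigma_high)

def log_spectrum(image):
    """Log magnitude of the centered 2D Fourier transform, for display."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    return np.log(np.abs(spectrum) + 1e-8)  # small offset avoids log(0)
```

Plotting `log_spectrum` for the two inputs, the two filtered images, and the hybrid gives the kind of five-panel comparison described above.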

Part 2.3: Gaussian and Laplacian Stacks

Using Gaussian filters of size 80 with standard deviations of 2, 4, 8, 16, and 32, I produced the following Gaussian (top) and Laplacian (bottom) stacks for a cool image I found. To improve the visibility of the Laplacian stack, I rescaled its intensities to range from 0.0 to 1.0.

I then used the same technique to visualize one of my results from part 2.2.
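One way to build such stacks is sketched below. It assumes each Gaussian level blurs the original image directly with its own standard deviation, and that the last Laplacian level holds the coarsest Gaussian so the stack sums back to the input. Note that `scipy.ndimage.gaussian_filter` derives its kernel width from the standard deviation, so the fixed size of 80 is not reproduced exactly.

```python
import numpy as np
from scipy import ndimage

SIGMAS = (2, 4, 8, 16, 32)

def gaussian_stack(image, sigmas=SIGMAS):
    """Original image followed by progressively blurred copies (no downsampling)."""
    return [image] + [ndimage.gaussian_filter(image, s) for s in sigmas]

def laplacian_stack(image, sigmas=SIGMAS):
    """Differences of successive Gaussian levels, plus the coarsest level."""
    g = gaussian_stack(image, sigmas)
    bands = [g[i] - g[i + 1] for i in range(len(g) - 1)]
    return bands + [g[-1]]

def rescale(level):
    """Map a level to the range [0.0, 1.0] for display."""
    lo, hi = level.min(), level.max()
    if hi == lo:
        return np.zeros_like(level)
    return (level - lo) / (hi - lo)
```

Because the differences telescope, summing every level of the Laplacian stack recovers the original image.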

Part 2.4: Multiresolution Blending

Using the technique described in the paper by Burt and Adelson, I produced multiresolution blends of several pairs of images, the first of which is the example shown in class.
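A stack-based version of this blend can be sketched as follows: each frequency band of the two images is combined using a correspondingly blurred copy of the mask, and the bands are then summed. The standard deviations and function names are assumptions, and the inputs are assumed to be same-sized grayscale float arrays with a mask in [0, 1].

```python
import numpy as np
from scipy import ndimage

SIGMAS = (2, 4, 8, 16, 32)

def gaussian_stack(image, sigmas=SIGMAS):
    return [image] + [ndimage.gaussian_filter(image, s) for s in sigmas]

def laplacian_stack(image, sigmas=SIGMAS):
    g = gaussian_stack(image, sigmas)
    return [g[i] - g[i + 1] for i in range(len(g) - 1)] + [g[-1]]

def multires_blend(image_a, image_b, mask, sigmas=SIGMAS):
    """Blend image_a (where mask is 1) with image_b (where mask is 0)."""
    la = laplacian_stack(image_a, sigmas)
    lb = laplacian_stack(image_b, sigmas)
    gm = gaussian_stack(mask, sigmas)  # softer mask for coarser bands
    blended = np.zeros(image_a.shape, dtype=float)
    for band_a, band_b, m in zip(la, lb, gm):
        blended += m * band_a + (1.0 - m) * band_b
    return blended
```

For color images, the same function can be applied to each channel separately with a shared mask.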

It was important for the success of the blending that each pair of images was fairly similar in color and texture. I also observed that Robert Downey Jr.'s face looks less attractive on Chris Evans' head. Below are the Laplacian stacks for the lion, the dog, and the liondog.