First, show the partial derivatives in x and y of the cameraman image by convolving the image with the finite difference operators D_x and D_y.
|
|
Now compute and show the gradient magnitude image.
To turn this into an edge image, let's binarize the gradient magnitude image by picking the appropriate threshold.
The threshold that worked best for me was 0.2.
Include a brief description of gradient magnitude computation.
The gradient magnitude at a point (x, y) can be computed by taking the partial derivative in each direction, giving the gradient vector (grad_x, grad_y), and then taking the magnitude of this vector. We can apply this operation to the entire image at once by treating the image as a matrix, computing grad_x and grad_y as matrices by convolving with the finite difference operators, and calculating the magnitude sqrt(grad_x^2 + grad_y^2) pointwise. This yields the gradient magnitude at each point in the image.
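A minimal sketch of this computation (assuming a grayscale image with values in [0, 1]; the kernel orientation, boundary handling, and threshold default here are illustrative choices):

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference operators, as in this section.
D_x = np.array([[1.0, -1.0]])
D_y = np.array([[1.0], [-1.0]])

def gradient_magnitude(im):
    """Convolve with D_x and D_y, then take the pointwise L2 norm."""
    gx = convolve2d(im, D_x, mode="same", boundary="symm")
    gy = convolve2d(im, D_y, mode="same", boundary="symm")
    return np.sqrt(gx**2 + gy**2)

def edge_image(im, threshold=0.2):
    """Binarize the gradient magnitude (0.2 worked best for this image)."""
    return (gradient_magnitude(im) > threshold).astype(float)
```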
Create a blurred version of the original image by convolving with a Gaussian, and repeat the procedure from the previous part.
|
|
What differences do you see?
The Gaussian blur smooths the image and reduces fine-grained noise, weakening insignificant edges to produce a clearer edge detection. Since the blur spreads intensity across the entire image, another side effect is that the edges in the binarized image appear thicker and more solid.
Convolve the Gaussian with D_x and D_y and display the resulting DoG filters as images. Verify that you get the same result as before.
|
|
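The verification rests on the associativity of convolution: blurring and then differentiating equals convolving once with a derivative-of-Gaussian filter. A sketch of that check (kernel size and sigma are illustrative values):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize=9, sigma=1.5):
    """2-D Gaussian built as the outer product of a normalized 1-D Gaussian."""
    x = np.arange(ksize) - ksize // 2
    g1 = np.exp(-x**2 / (2 * sigma**2))
    g1 /= g1.sum()
    return np.outer(g1, g1)

D_x = np.array([[1.0, -1.0]])
G = gaussian_kernel()
DoG_x = convolve2d(G, D_x)  # derivative-of-Gaussian filter

# Associativity: (im * G) * D_x == im * (G * D_x), so one DoG pass
# matches blur-then-difference (exactly, up to float error, in 'full' mode).
im = np.random.rand(32, 32)
two_step = convolve2d(convolve2d(im, G), D_x)
one_step = convolve2d(im, DoG_x)
```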
Show the orientation histogram and straightening result for the facade image.
|
|
|
|
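One plausible way to build the orientation histogram and score straightness is sketched below; the angle range, threshold, and "mass near 0/90/180 degrees" metric are assumptions for illustration, not necessarily the exact scoring I used:

```python
import numpy as np
from scipy.signal import convolve2d
from scipy.ndimage import rotate

def orientation_histogram(im, bins=90, mag_thresh=0.1):
    """Histogram of gradient angles over pixels with a strong gradient."""
    gx = convolve2d(im, np.array([[1.0, -1.0]]), mode="same")
    gy = convolve2d(im, np.array([[1.0], [-1.0]]), mode="same")
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180  # fold angles into [0, 180)
    hist, _ = np.histogram(ang[mag > mag_thresh], bins=bins, range=(0, 180))
    return hist

def straightness(im, tol_bins=2):
    """Fraction of edge mass near 0/90/180 degrees (axis-aligned edges)."""
    h = orientation_histogram(im)
    n = len(h)
    idx = (list(range(tol_bins))
           + list(range(n // 2 - tol_bins, n // 2 + tol_bins))
           + list(range(n - tol_bins, n)))
    return h[idx].sum() / max(h.sum(), 1)

def straighten(im, angles=np.arange(-10, 10.5, 0.5)):
    """Try candidate rotations; keep the one with the most axis-aligned edges."""
    best = max(angles, key=lambda a: straightness(rotate(im, a, reshape=False)))
    return rotate(im, best, reshape=False)
```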
Show the original image, the straightened image, and the two edge orientation histograms for at least 3 images, of which at least one should be a failure case.
Vegas:
|
|
|
|
Street (manually rotated 2.5 degrees, and confirmed that the algorithm could rotate it back):
|
|
|
|
|
|
Train (fails as the original image is already aligned, but has a noisy texture and not many straight lines):
|
|
|
|
|
Show the progression of the original image to the sharpened image for the Taj image and an image of your choice.
Taj:
|
|
Snail:
|
|
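The sharpening step is the standard unsharp mask: add back a scaled copy of the high frequencies. A minimal sketch (the sigma and alpha defaults are illustrative, not the values used for these images):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(im, sigma=2.0, alpha=1.0):
    """Sharpen as im + alpha * (im - blur(im)), clipped back to [0, 1]."""
    low = gaussian_filter(im, sigma)   # low-pass version
    high = im - low                    # high-frequency residual
    return np.clip(im + alpha * high, 0.0, 1.0)
```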
Pick a sharp image, blur it and then try to sharpen it again. Compare the original and the sharpened image and report your observation.
|
|
|
Not all of the details were recovered, since the highest frequencies were erased by the low-pass filter, so the sharpened image appears lower-resolution. However, it no longer looks clearly blurry like the intermediate blurred image, and it has a slightly higher color contrast.
Try creating 2-3 hybrid images (including at least one failure). Show the input image and hybrid result per example.
Derek + Nutmeg:
|
|
Happy + unsure:
|
|
Happy + sad (fails to look convincing from either perspective):
|
|
I think this looks bad because the two images have similar overall composition and positioning, but every individual feature differs. Since the low- and high-frequency components neither overlap completely nor separate cleanly, they interfere with each other visually instead of combining well.
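The hybrid construction itself is a low-pass of one image plus a high-pass of the other. A sketch, assuming grayscale inputs in [0, 1] (the cutoff sigmas here are illustrative, not the ones tuned per pair):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(im_far, im_near, sigma_low=6.0, sigma_high=3.0):
    """Low frequencies of im_far (seen from far away) plus
    high frequencies of im_near (seen up close)."""
    low = gaussian_filter(im_far, sigma_low)
    high = im_near - gaussian_filter(im_near, sigma_high)
    return np.clip(low + high, 0.0, 1.0)
```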
For your favorite result, show the log magnitude of the Fourier transform of the two input images, the filtered images, and the hybrid image.
Elephant + cheetah (example from paper):
|
|
|
|
|
|
Try using color to enhance the effect. Does it work better to use color for the high-frequency component, the low-frequency component, or both?
|
|
|
|
I applied different combinations of grayscale and color to the happy + unsure expression hybrid, and found that the completely grayscale version and the color low frequency + grayscale high frequency options display the effect best. Colorizing the low-frequency component restores large solid color patches that are helpful in recognizing that input image from far away. On the other hand, there isn't much use in colorizing the high frequency component since it only shows up at image edges; it may also produce slivers of color that interfere with the appearance of the image from far away (like the pink bottom lip in the high-frequency component above). If color is to be added to enhance the effect, it would be most effective to colorize the low frequency component only.
Apply your Gaussian and Laplacian stacks to one interesting image that contains structure in multiple resolutions.
Illustrate the process you took to create your hybrid images in part 2 by applying your Gaussian and Laplacian stacks to one example.
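A sketch of the stacks (no downsampling, unlike a pyramid); doubling sigma per level is one common schedule and an illustrative choice here. The Laplacian stack keeps the final Gaussian level as its last entry so the levels sum back to the input:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, levels=4, sigma=2.0):
    """Repeatedly blur without downsampling, doubling sigma each level."""
    stack = [im.astype(float)]
    for i in range(levels):
        stack.append(gaussian_filter(stack[-1], sigma * 2 ** i))
    return stack

def laplacian_stack(im, levels=4, sigma=2.0):
    """Band-pass differences plus the low-pass residual; sums back to im."""
    g = gaussian_stack(im, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels)] + [g[-1]]
```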
Pick two pairs of images to blend together with an irregular mask, as is demonstrated in figure 8 in the paper.
Apple and orange:
|
|
|
|
Hand and eye (example from paper):
|
|
|
|
Guy 1 and guy 2:
|
|
|
|
Illustrate the process by applying your Laplacian stack and displaying it for your favorite result and the masked input images that created it.
I used a 4-level stack for each blend. For the bottom two sequences, images 1-4 are Laplacian levels, which together capture all frequencies above a certain cutoff. To make the blend visually consistent, it was necessary to add a 5th component, the final level of the Gaussian stack (in other words, all frequencies below that cutoff), to fill in the missing low-frequency content from each input. This is what the 5th image represents.
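Putting the pieces together, the blend combines each Laplacian level under the correspondingly blurred mask level, with the low-frequency Gaussian residual handled as the last entry. A self-contained sketch (grayscale inputs in [0, 1]; the sigma schedule is illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, levels=4, sigma=2.0):
    """Repeatedly blur without downsampling, doubling sigma each level."""
    stack = [im.astype(float)]
    for i in range(levels):
        stack.append(gaussian_filter(stack[-1], sigma * 2 ** i))
    return stack

def laplacian_stack(im, levels=4, sigma=2.0):
    """Band-pass differences plus the low-pass residual; sums back to im."""
    g = gaussian_stack(im, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels)] + [g[-1]]

def blend(im1, im2, mask, levels=4):
    """Multiresolution blend: mix each level under the blurred mask,
    including the final low-frequency Gaussian residual as the last level."""
    la, lb = laplacian_stack(im1, levels), laplacian_stack(im2, levels)
    gm = gaussian_stack(mask, levels)
    out = sum(gm[i] * la[i] + (1.0 - gm[i]) * lb[i] for i in range(levels + 1))
    return np.clip(out, 0.0, 1.0)
```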
Try using color to enhance the effect.
I implemented the method in color, which involved extending the mask's Gaussian stack to 3D (one copy for each RGB channel). While color can enhance the effect, it can also make the blend look more unnatural if the hues of the two images along the edges of the blend mask are dissimilar. A grayscale blend, on the other hand, looks unnatural only if the brightness values, rather than the hues, are dissimilar.
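Rather than storing three explicit copies of the mask stack, the same effect can be sketched with numpy broadcasting, applying a 2-D mask level across the RGB channels of one stack level (the function name is mine, for illustration):

```python
import numpy as np

def blend_level_color(la_i, lb_i, gm_i):
    """Blend one stack level of two (H, W, 3) color images with a
    2-D mask level by broadcasting the mask over the RGB channels."""
    m = gm_i[..., None]  # shape (H, W, 1), broadcasts against (H, W, 3)
    return m * la_i + (1.0 - m) * lb_i
```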
This project was fun! My favorite part was writing efficient code for counting the number of straight lines in an image, and the task/algorithm I thought was coolest was the one for multiresolution blending. I think my most important takeaways were the advantages of frequency-domain visualization, and the commutative, associative, and distributive properties of convolutions.