Project 2

Fun with Filters and Frequencies by Amy Hung

CS 194-26: Image Manipulation and Computational Photography, Fall 2020

Part 1: Fun with Filters

In this part, we will build intuitions about 2D convolutions and filtering.

Part 1.1: Finite Difference Operator

I obtained the partial derivatives in the x and y directions by convolving the image with the finite difference operators Dx = [[1, -1]] and Dy = [[1], [-1]]. From there, I calculated the gradient magnitude and binarized it by picking a threshold. In this case, I set the threshold to 0.1, so that values > 0.1 map to 1 and values <= 0.1 map to 0. I picked this value by tweaking the threshold until enough noise was suppressed while all of the real edges still showed. Even so, the resulting edges are quite noisy and choppy, an artifact of the finite difference filters.
[Figures: partial derivative in x · partial derivative in y · binarized gradient magnitude (finite difference)]
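A minimal sketch of this step, assuming the image has already been loaded as a grayscale float array (the kernel values and threshold match the description above):

```python
import numpy as np
from scipy.signal import convolve2d

def binarized_gradient_magnitude(im, threshold=0.1):
    """Edge map via finite differences; im is a grayscale float image."""
    Dx = np.array([[1, -1]])    # horizontal finite difference operator
    Dy = np.array([[1], [-1]])  # vertical finite difference operator
    dx = convolve2d(im, Dx, mode='same')  # partial derivative in x
    dy = convolve2d(im, Dy, mode='same')  # partial derivative in y
    grad_mag = np.sqrt(dx ** 2 + dy ** 2)
    return (grad_mag > threshold).astype(float)
```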

Part 1.2: Derivative of Gaussian (DoG) Filter

In this step, we smooth the image with a Gaussian filter before calculating the partial derivatives and gradient magnitude. With this, we get much smoother, less noisy edges. In particular, with the same threshold value of 0.1, this method also gives us bolder edges.
[Figures: partial derivative in x · partial derivative in y · binarized gradient magnitude (finite difference after blurring)]
After simplifying the smoothing process to require only one convolution (instead of two) by creating derivative of Gaussian filters, we get the two DoG filters below and a binarized gradient magnitude identical to the one before.
[Figures: DoG filter in x · DoG filter in y · binarized gradient magnitude (derivative of Gaussian filter)]
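A sketch of how the two DoG filters can be built, assuming a small gaussian_2d helper (the kernel size and sigma here are illustrative, not the exact values I used):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_2d(ksize=9, sigma=1.5):
    """Separable 2D Gaussian: outer product of two 1D Gaussians."""
    ax = np.arange(ksize) - (ksize - 1) / 2
    g = np.exp(-ax ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

G = gaussian_2d()
dog_x = convolve2d(G, np.array([[1, -1]]))    # derivative of Gaussian in x
dog_y = convolve2d(G, np.array([[1], [-1]]))  # derivative of Gaussian in y
# now a single convolution per direction gives the smoothed derivatives:
# dx = convolve2d(im, dog_x, mode='same')
# dy = convolve2d(im, dog_y, mode='same')
```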

Part 1.3: Image Straightening

In this part, we automate the image straightening process by computing the partial derivatives of an image again, but then calculating the gradient angle (rather than the gradient magnitude). This tells us how each edge is oriented, and we can use that information to straighten out our input images.

For each angle in the range [-10, 10] degrees, I first rotate the image by that angle, crop it to remove the borders that could negatively influence the later computations, and calculate the gradient angle of the edges in the image. I then collect the gradient angles into a histogram and compute what proportion of the total number of angles falls close to -180, -90, 0, 90, or 180 degrees (the values corresponding to vertical and horizontal lines). The rotation angle with the maximum proportion of vertical and horizontal edges is then used to straighten the image.
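A sketch of the scoring loop, reusing dog_x, dog_y, and convolve2d from the Part 1.2 sketch; gray stands for the loaded grayscale image, and the crop fraction, edge threshold, and angle tolerance are illustrative placeholders rather than my exact parameters:

```python
import numpy as np
from scipy.ndimage import rotate

def straightness_score(im, angle, tol=2.0):
    """Fraction of edge-pixel gradient angles near horizontal/vertical."""
    rot = rotate(im, angle, reshape=False)
    h, w = rot.shape
    rot = rot[h // 6 : -h // 6, w // 6 : -w // 6]  # crop away rotation borders
    dx = convolve2d(rot, dog_x, mode='same')
    dy = convolve2d(rot, dog_y, mode='same')
    mag = np.sqrt(dx ** 2 + dy ** 2)
    theta = np.degrees(np.arctan2(dy, dx))[mag > 0.1]  # angles at edge pixels only
    near = np.zeros_like(theta, dtype=bool)
    for t in (-180, -90, 0, 90, 180):
        near |= np.abs(theta - t) < tol
    return near.mean()

best_angle = max(np.arange(-10, 10.5, 0.5), key=lambda a: straightness_score(gray, a))
```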
[Figures: original images and their straightened results]
This last case was a bit of a failure: the algorithm aligned to the buildings/skyscrapers, but because of the photo's perspective, this left the resulting photo's horizon tilted. The algorithm also straightened out the Leaning Tower of Pisa, which tilts the rest of that photo. The current implementation would also fail on any image that needs to be rotated by more than 10 degrees in either direction.

Part 2: Fun with Frequencies

Part 2.1: Image "Sharpening"

In this part, we're "sharpening" images using the unsharp masking technique. We're essentially taking the process of: subtracting the blurred version of an image from the original image to isolate the higher frequencies then adding them back to the original image, and compacting it into a single convolution operation. We then use an alpha scalar value to determine how much to "sharpen" the image by. We can observe here that when the alpha value is too high, it can make the image look a little "fried" and quite unnatural.
[Figures: original image and sharpened versions at alpha = 0.5, 1, 5, and 10; two more examples at alpha = 5 and alpha = 1]
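A sketch of the single-convolution unsharp mask, reusing the gaussian_2d helper from the Part 1.2 sketch (kernel size and sigma are again illustrative, and the image is assumed to be a float RGB array in [0, 1]):

```python
import numpy as np
from scipy.signal import convolve2d

def unsharp_mask_filter(alpha, ksize=9, sigma=1.5):
    """One kernel: (1 + alpha) * unit impulse - alpha * Gaussian."""
    G = gaussian_2d(ksize, sigma)
    impulse = np.zeros_like(G)
    impulse[ksize // 2, ksize // 2] = 1.0  # identity (unit impulse) kernel
    return (1 + alpha) * impulse - alpha * G

def sharpen(im, alpha):
    """Convolve each color channel with the unsharp mask, then clip to [0, 1]."""
    f = unsharp_mask_filter(alpha)
    channels = [convolve2d(im[..., c], f, mode='same') for c in range(im.shape[-1])]
    return np.clip(np.stack(channels, axis=-1), 0, 1)
```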

We can also take an image, blur it, and see how well we can "sharpen" it back to its original quality. In this example, we regained some of the "sharpness" of the image by alpha = 2. However, there are some differences in color saturation and edge thickness (e.g. the border lines around SpongeBob's body are thicker and more intense in color), which makes sense since we're artificially boosting the intensity of the higher frequencies and edges. Overall, we still can't recover the information lost when the Gaussian blur was applied, as expected.
[Figures: original image · blurred image · re-sharpened at alpha = 1, 2, 3, and 5]

Part 2.2: Hybrid Images

In this part, we're manipulating low- and high-frequency images to create a "hybrid" illusion. The basic idea is that when you view an image up close, the higher frequencies dominate your perception of it, whereas at a distance the lower frequencies dominate.

The first step is to align the images based on a specified set of points (usually distinct features that you'd like the images to align on when combined). In my case, this was usually the pair of eyes in each image, to align faces. From there, blur one of the images (leaving only the lower frequencies) and extract the high frequencies from the other. Then we sum these two filtered images to create the hybrid image.
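A minimal sketch of the frequency split and sum, assuming the two images are already aligned float RGB arrays in [0, 1] (sigma is given per axis so channels aren't blurred into each other):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(im1, im2, sigma1, sigma2):
    """im1 contributes the high frequencies, im2 the low frequencies."""
    blur = lambda im, s: gaussian_filter(im, (s, s, 0))  # don't blur across channels
    low = blur(im2, sigma2)         # low-pass im2
    high = im1 - blur(im1, sigma1)  # high-pass im1 = im1 minus its blurred self
    return np.clip(low + high, 0, 1)
```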

Bells and Whistles: Here, I used color to enhance the hybridization effect. I experimented with color in the high-frequency image, in the low-frequency image, and in both. Adding color only to the high frequencies didn't do much, adding color only to the low frequencies greatly increased the believability of the hybrid image, and adding color to both looked the best, in my opinion, at tying the two images together.
[Figures, example 1: original and high-frequency img1, original and low-frequency img2, hybrid result, and the corresponding Fourier spectra]
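For reference, each Fourier visualization above is just the log-magnitude spectrum of a grayscale version of the image, e.g.:

```python
import numpy as np
import matplotlib.pyplot as plt

def show_fft(gray):
    """Log-magnitude spectrum, zero frequency shifted to the center."""
    spectrum = np.log(np.abs(np.fft.fftshift(np.fft.fft2(gray))) + 1e-8)
    plt.imshow(spectrum, cmap='gray')
    plt.axis('off')
```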

[Figures, example 2: the same layout of inputs, filtered versions, hybrid result, and Fourier spectra]

[Figures, example 3: the same layout of inputs, filtered versions, hybrid result, and Fourier spectra]

Part 2.3: Gaussian and Laplacian Stacks

In this part, we implemented Gaussian and Laplacian stacks to examine images that contain different structures at different frequencies. Unlike a pyramid, a stack keeps every level at the original image resolution; only the amount of blur changes from level to level.
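A sketch of both stacks, assuming grayscale float inputs (for RGB, use a per-axis sigma as in Part 2.2; the level count and sigma here are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, levels=5, sigma=2):
    """Repeatedly blur without downsampling; every level stays full resolution."""
    stack = [im]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(im, levels=5, sigma=2):
    """Band-pass levels: differences of consecutive Gaussian levels, plus the final blur."""
    g = gaussian_stack(im, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]
```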

Salvador Dali's Lincoln painting (top to bottom: Gaussian stack, Laplacian stack)
[Figures: original image and its stack levels]

Pikachu x Ash hybrid image (top to bottom: Gaussian stack, Laplacian stack)

With this, we can observe more closely the effect of focusing on high vs low frequencies in these hybrid images.
[Figures: original hybrid image and its stack levels]

Part 2.4: Multiresolution Blending

In this part, our aim is to seamlessly blend two images together using multiresolution blending. In particular, we implement the algorithm described in the 1983 paper by Burt and Adelson, using a mask over half of one of the images to force the blending border.
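A sketch of the blend, reusing the stack helpers from Part 2.3; mask has the same shape as the images (stack its channels for RGB), is 1 where the first image should show and 0 where the second should, and blurring it through its own Gaussian stack is what softens the seam at each band:

```python
import numpy as np

def multires_blend(im1, im2, mask, levels=5, sigma=2):
    """Burt-Adelson blending: combine each Laplacian band under a blurred mask."""
    L1 = laplacian_stack(im1, levels, sigma)
    L2 = laplacian_stack(im2, levels, sigma)
    Gm = gaussian_stack(mask.astype(float), levels, sigma)
    bands = [gm * l1 + (1 - gm) * l2 for gm, l1, l2 in zip(Gm, L1, L2)]
    return np.clip(sum(bands), 0, 1)  # summing the bands reconstructs the image
```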
[Figures: Laplacian stacks of the blended "dog uwu" image and the masked input images]

Conclusion

Though this project did get tedious in parts, especially when tweaking parameters, I thoroughly enjoyed being able to play around with hybridizing images, as it was a way for me to joke around and actually apply the class teachings to achieve fun and creative results. Overall, I learned a lot through working on this project, and will hopefully find other opportunities to use these tools in the future when I want to edit photos!