Fun with Filters and Frequencies!

Sravya Basvapatri // Fall 2021 // CS 194-26 Computational Photography

Objective

In this project, we play with the frequency domain to create interesting effects in the visual space. We start with basic Derivative of Gaussian (DoG) filters for edge detection, and see how they improve on simply using gradients on an unblurred image. From there, we explore image sharpening using an unsharp mask filter. After, we explore the creation of hybrid and splined images by stitching together layers of images filtered to focus on specific frequency bands. I hope you enjoy the end images!

Contents

Part 1: Fun with Filters

Part 2: Fun with Frequencies!

Part 1: Fun with Filters

1.1 Finite Difference Operator

For the operations below, we use the cameraman image shown below, along with the finite difference operators D_x = [1, -1] and D_y = [1, -1]^T.

I started by using the 2-D finite difference matrices as x- and y-direction filters for the cameraman image, convolving each filter with the original image (results shown below). D_x reveals vertical edges (changes along the horizontal direction), while D_y reveals horizontal edges. From there, I took the gradient magnitude (defined below) to show the edges within the cameraman image.

Gradient magnitude - The gradient magnitude is computed by taking the L2-norm of the horizontal and vertical gradients: sqrt((I * D_x)^2 + (I * D_y)^2), where * denotes convolution. This is displayed below on the far right.

From above, we can see that there is a lot of noise coming from the softer details in the image. I experimented with different thresholds to binarize the gradient magnitude image, and found that thresholds between 0.2 and 0.3 (out of a pixel value range of [0, 1]) looked best. Even after this, the edges are a bit spotty-- this isn't the best we can do for edge detection.
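In code, this step looks roughly like the following sketch (assuming the image is already loaded as a grayscale float array in [0, 1]; the helper name and the exact threshold are my own choices):

    import numpy as np
    from scipy.signal import convolve2d

    # Finite difference operators.
    D_x = np.array([[1, -1]])       # horizontal difference
    D_y = np.array([[1], [-1]])     # vertical difference

    def gradient_magnitude(im, threshold=None):
        """L2-norm of the horizontal and vertical finite differences."""
        grad_x = convolve2d(im, D_x, mode='same', boundary='symm')
        grad_y = convolve2d(im, D_y, mode='same', boundary='symm')
        mag = np.sqrt(grad_x ** 2 + grad_y ** 2)
        if threshold is not None:
            mag = (mag > threshold).astype(float)   # binarize the edge image
        return mag

    # edges = gradient_magnitude(cameraman, threshold=0.25)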

1.2 Derivative of Gaussian (DoG) Filter

Noticing that the edges from the previous image with just our finite difference operator were quite noisy, I then used a Gaussian convolution before convolving with the finite difference operators to make these edges pop and reduce noise. When the image is softened first, we find that the gradient changes more slowly, allowing the final gradient magnitude image to have thicker and more defined edges. 

To simplify this operation, we can convolve the Gaussian filter with the finite difference operators D_x and D_y to create Derivative of Gaussian (DoG) filters that we can apply directly to the original image to find the horizontal and vertical edges. From there, we can find the gradient magnitude of these two images. The filters are shown below along with the result of applying them to our cameraman image.

By comparing the result of the two-step filter ("Gradient Magnitude") from above and the result of the DoG filters ("Magnitude $DoG$ on Cameraman"), we can see that both ways of processing the image yield the same result. The latter is easier because we can have a DoG filter ready-- we only need to change the image that we convolve the filter with.  
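A sketch of this single-filter version, under the same assumptions as above (the kernel size and sigma here are illustrative, and gaussian_kernel_2d is a hypothetical helper):

    import numpy as np
    from scipy.signal import convolve2d

    def gaussian_kernel_2d(ksize, sigma):
        """2D Gaussian built as the outer product of two 1D Gaussians."""
        ax = np.arange(ksize) - (ksize - 1) / 2.0
        g = np.exp(-ax ** 2 / (2 * sigma ** 2))
        g /= g.sum()
        return np.outer(g, g)

    D_x = np.array([[1, -1]])
    D_y = np.array([[1], [-1]])
    G = gaussian_kernel_2d(9, sigma=1.5)

    # Fold the blur into the difference operators once...
    DoG_x = convolve2d(G, D_x)
    DoG_y = convolve2d(G, D_y)

    # ...then each new image (im, a grayscale float array) needs only
    # one convolution per direction.
    mag = np.sqrt(convolve2d(im, DoG_x, mode='same', boundary='symm') ** 2
                  + convolve2d(im, DoG_y, mode='same', boundary='symm') ** 2)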

 

Part 2: Fun with Frequencies!

2.1 Image "Sharpening"

Next, I implemented a sharpening algorithm using an unsharp mask filter. The filter first subtracts a Gaussian blurred image from the original, leaving just high frequency components. From there, we can amplify this high frequency image by a chosen parameter alpha, and add it back to the original image. Below, I display the unsharp mask filter on the Taj Mahal image with different alpha values. 
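A minimal sketch of the unsharp mask, assuming a grayscale float image in [0, 1] (the default sigma and the clipping range are assumptions):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def unsharp_mask(im, sigma=2.0, alpha=1.0):
        """Sharpen by amplifying high frequencies: im + alpha * (im - blurred)."""
        blurred = gaussian_filter(im, sigma=sigma)
        high_freq = im - blurred                 # what the blur removed
        return np.clip(im + alpha * high_freq, 0, 1)

    # taj_sharp = unsharp_mask(taj, sigma=2.0, alpha=2.0)

A color image would be handled per channel (or by passing sigma=(s, s, 0) so the blur doesn't mix channels).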

Through playing with the alpha and kernel size parameters, I found that a smaller kernel size tends to look better, because it removes fewer low frequencies from the image before amplifying the high frequencies. I also found that while a larger alpha generally yields a sharper, better-looking image, past a certain point the clipping function makes the image look distorted rather than sharper.

Below, I apply the sharpening filter to a slightly blurred version of an image I took at Nevada Falls at Yosemite National Park. I needed to apply a larger Gaussian blur filter as the original image size was so large. I felt that after blurring the original image, the unsharp mask filter wasn't able to fully recover the details of the original image. It artificially creates a sharpened look by amplifying the remaining high frequency components, but the original details are irrecoverably lost.

Then, I tried it on a few additional photos. Below is a picture of a building in Bangkok. I didn't feel that the filter worked great here due to the metal in the picture-- amplifying higher frequencies simply made the colors in the metal look distorted.

The one below is a painting by John A. Fitzgerald of a nature scene. I felt that the sharpening here really made the details pop-- an alpha parameter of 2 is ideal so that the original painting is still honored. 

 

2.2: Hybrid Images

From there, I experimented with creating hybrid images using the approach described in the SIGGRAPH 2006 paper by Oliva, Torralba, and Schyns. Hybrid images are static images that change in interpretation as a function of the viewing distance. Close up, only the high frequencies are interpreted, and low frequencies are dismissed in interpretation as noise. Similarly, at a far distance, low frequencies are interpreted, and high frequencies overlooked. 
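A compact sketch of the construction, assuming two aligned grayscale float images of the same shape (the two cutoff sigmas are illustrative and need tuning per image pair):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def hybrid_image(im_far, im_near, sigma_low=6.0, sigma_high=4.0):
        """Keep low frequencies of im_far and high frequencies of im_near."""
        low = gaussian_filter(im_far, sigma_low)                 # low-pass
        high = im_near - gaussian_filter(im_near, sigma_high)    # high-pass residual
        return np.clip(low + high, 0, 1)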

Nutmeg and Derek

I first started with a provided example using a man and a cat named Nutmeg. The original images are shown below, along with the high pass and low pass versions and the hybrid image. 

As the image is viewed at smaller sizes, we can see how the low pass details come into view.

I then analyzed the frequencies within each image, both before and after filters were applied. 
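The frequency plots are log-magnitude images of the centered 2D Fourier transform; a small sketch (the epsilon guard against log(0) is mine):

    import numpy as np

    def log_magnitude_spectrum(im):
        """Log magnitude of the centered 2D Fourier transform, for display."""
        return np.log(np.abs(np.fft.fftshift(np.fft.fft2(im))) + 1e-8)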

Tiger and Pug

From there, I chose additional photos to turn into hybrid images. First, I tried to combine an image of a tiger with various dogs. I started with a pug.

I would consider the example below a failed hybrid image because the combined image has many low pass components that are hard to dismiss as low frequency noise. This is because the tiger's head is much larger than the pug's. From this, I learned that it is important to choose objects with similar size and similarly placed edges.

I retried the same tiger example with a bigger dog.

I think this example worked better, but the faces still aren't aligned perfectly.

 

Happy and Sad Baby

This next example I think worked the best because the babies both have a similar face shape. I think it could be improved a bit by removing the background crib from the first image, so that the background doesn't distract from the changing interpretations of the image.

This was my favorite hybrid image I created. I went ahead and again analyzed the frequencies within each image, before and after filtering. 

Part 2.3 Gaussian and Laplacian Stacks

In this next part, I created Gaussian and Laplacian stacks on a variety of images. A Gaussian stack is implemented by convolving the original image with Gaussians of increasing sigma; I started with sigma = 2 and doubled the value at each level of the stack. The Laplacian stack is derived from the Gaussian stack: each level is the difference between two successive levels of the Gaussian stack. The last level of the Laplacian stack, however, is simply equal to the last level of the Gaussian stack. This ensures that all frequencies are present in the stack, so the summation of all the images in the stack reconstructs the original image.
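A sketch of both stacks, assuming level 0 of the Gaussian stack is the unblurred image so that the telescoping sum of the Laplacian stack recovers the original:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def gaussian_stack(im, levels=5, sigma0=2.0):
        """Same resolution at every level; sigma doubles level to level."""
        stack, sigma = [im], sigma0
        for _ in range(levels - 1):
            stack.append(gaussian_filter(im, sigma))
            sigma *= 2
        return stack

    def laplacian_stack(g_stack):
        """Differences of successive Gaussian levels, plus the last Gaussian
        level, so that summing the stack returns the original image."""
        diffs = [g_stack[i] - g_stack[i + 1] for i in range(len(g_stack) - 1)]
        return diffs + [g_stack[-1]]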

Below, I create Gaussian stacks (first row) and Laplacian stacks (second row) for a variety of images, including the happy baby, the provided oraple image, and the provided apple and orange images. The latter two will be used in Part 2.4 to recreate the multi-resolution blending of the oraple, shown in Figure 3.42 of Szeliski (2nd ed.), page 167.

Part 2.4 Multi-Resolution Blending

The goal now was to blend two images seamlessly using multi-resolution blending as described in the 1983 paper by Burt and Adelson. An image spline is a smooth seam joining two images together by gently distorting them. Multi-resolution blending computes a gentle seam between the two images separately at each band of image frequencies, resulting in a much smoother seam.

My splining function works by taking the Laplacian stacks of both images and a Gaussian stack of the mask with the same number of levels. From there, it applies the mask and its inverse to each level of the Laplacian stacks (shown in the first two columns). The third column shows the summation of the first two-- the blend of one band of frequencies. The last row is reserved for the summation of each column, so the bottom-right image shows the final blended image.
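Reusing gaussian_stack and laplacian_stack from the sketch in Part 2.3, the splining function might look like this (assuming the mask is a float array in [0, 1] of the same shape as the images, with 1 selecting the first image):

    import numpy as np

    def blend(im1, im2, mask, levels=5, sigma0=2.0):
        """Combine the Laplacian stacks of two images, weighted per level by
        a Gaussian stack of the mask, then sum the blended levels."""
        l1 = laplacian_stack(gaussian_stack(im1, levels, sigma0))
        l2 = laplacian_stack(gaussian_stack(im2, levels, sigma0))
        g_mask = gaussian_stack(mask, levels, sigma0)   # progressively softer seam
        blended = [m * a + (1 - m) * b for a, b, m in zip(l1, l2, g_mask)]
        return np.clip(sum(blended), 0, 1)

Blurring the mask more at the coarser levels is what lets the low frequencies mix over a wide band while the fine details keep a crisp transition.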

Oraple (Orange and Apple)

For the oraple, I had already found the Gaussian and Laplacian stacks in Part 2.3, so I started by finding the Gaussian stack of the mask in order to apply to different layers of the orange and apple Laplacian stacks. 

From here, we want to create a stack of combined images, weighted by the masks. I first followed the procedure shown in Figure 3.42 of Szeliski, which uses levels 0, 2, and 4 of the orange and apple Laplacian stacks. Below, each row represents one level of the Laplacian stack, first with the mask applied to the orange, then to the apple, and then with the two halves summed together. The last row shows the columns summed together, with the bottom right representing the final oraple image.

I found that the final image was lighter than I would have liked because some frequency bands were missing from the final summation.

Because of this, I recreated the image using all five computed levels of the Laplacian stack. This produced an overall darker image with more contrast.

 

The Quing (King and Queen of Spades)

Next, I wanted to try another blend, this time with a slightly more complicated diagonal mask. I chose images that would be easy to line up-- two playing cards, one with a king and the other with a queen.

As before, I compute the Gaussian and Laplacian stacks for the king and queen. 

I also show the Gaussian stack for my diagonal mask.

From there, I apply my splining algorithm described above. I found that this result worked fairly well, but there are some larger details in the image that make it extremely obvious that the image is stitched. To improve on this, I think an irregular mask might help obscure some of the image details that were chopped by my diagonal mask.

 

Third Eye (Face and Eye, with some Bells and Whistles!)

Inspired by the Hindu Lord Shiva and third eye spiritual concept, I wanted to attempt to give myself a third eye. This one was a bit more complicated than the first two because I needed to create an elliptical mask and resize one of the images.

To start, I tried creating an array the size of my face and placing the resized eye in the spot it would need to be to blend onto my forehead. 

As before, I computed the Gaussian and Laplacian stacks and came up with this final image. I quickly saw that the sharp edges from the resized eye image were hurting the final product, so I decided to apply a thick, oval-shaped blur to the outside of the eye image and try again.
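A sketch of how such a feathered elliptical mask can be built (the function name and the feathering sigma are my own choices):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def elliptical_mask(shape, center, axes, feather_sigma=15):
        """Binary ellipse at `center` with semi-axes `axes`, softened by a
        Gaussian blur so the seam fades out gradually."""
        rows, cols = np.ogrid[:shape[0], :shape[1]]
        cy, cx = center
        ry, rx = axes
        ellipse = (((rows - cy) / ry) ** 2 + ((cols - cx) / rx) ** 2) <= 1.0
        return gaussian_filter(ellipse.astype(float), feather_sigma)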

Here are the final Gaussian and Laplacian stacks for the images and the mask:

The coolest thing I learned during this project is definitely how much multi-resolution blending improves image stitching. Throughout the process, I tried stitching without splitting the images into separate frequency bands, and the result was not nearly as smooth.

Finally, here are the three final images that I created using multi-resolution blending!