Fun with Filters and Frequencies

CS 194-26: Intro to Computer Vision and Computational Photography

Gregory Du, CS194-26-aec

Overview

This project explores how we can process images using various filters. We examine the range of frequencies found in images, and see examples of how working with subsets of those frequencies can lead to some really interesting results.

Filters

1.1: Finite Difference Operators

Computing the gradient magnitude image is actually quite simple. First we convolve the image with the finite difference operator in the x dimension, [1, -1], then we convolve the image with the finite difference operator in the y dimension, [[1], [-1]]. Calling these partial derivative images p_dx and p_dy respectively, the gradient magnitude is just the per-pixel Euclidean norm of the two: np.sqrt(p_dx**2 + p_dy**2). Thresholding that magnitude gives the binarized edge image.

Camera Man: Partial Derivative w.r.t x	Camera Man: Partial Derivative w.r.t y
Camera Man: Gradient Magnitude	Camera Man: Gradient Magnitude Binarized
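
The steps above can be sketched roughly as follows. This is a minimal sketch rather than the exact project code: the boundary="symm" setting, the helper names gradient_magnitude and binarize, the threshold value, and the toy image are all assumptions made for illustration.

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference operators from the text.
D_X = np.array([[1, -1]])
D_Y = np.array([[1], [-1]])

def gradient_magnitude(image):
    """Return (p_dx, p_dy, magnitude) for a 2D grayscale float image."""
    # mode="same" keeps the output the size of the image. boundary="symm"
    # mirrors the border; that choice is an assumption, not from the write-up.
    p_dx = convolve2d(image, D_X, mode="same", boundary="symm")
    p_dy = convolve2d(image, D_Y, mode="same", boundary="symm")
    return p_dx, p_dy, np.sqrt(p_dx**2 + p_dy**2)

def binarize(magnitude, threshold):
    """Keep only the pixels whose gradient magnitude exceeds the threshold."""
    return (magnitude > threshold).astype(np.float64)

# Toy image: dark left half, bright right half, so there is one vertical edge.
toy = np.zeros((6, 6))
toy[:, 3:] = 1.0
p_dx, p_dy, mag = gradient_magnitude(toy)
edges = binarize(mag, threshold=0.5)
```

On a real photograph the threshold has to be tuned by hand, trading off suppressed noise against lost faint edges.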

1.2: Derivative of Gaussian Filter

By first applying the Gaussian filter, we remove a lot of the graininess from our original gradient magnitude image. Blurring first keeps isolated bright pixels from passing our derivative filter and threshold checks: they aren't actually edges, so once they blend in with their surroundings their intensities no longer differ much from their neighbors', whereas the two sides of a true edge still do. We can also see that the edges are not quite as sharp, which stands to reason: by blending edge pixels with the pixels around them, we've widened the band of pixels that our derivative filter picks up as an edge.

Camera Man: Partial Derivative w.r.t x	Camera Man: Partial Derivative w.r.t y
Camera Man: Gradient Magnitude	Camera Man: Gradient Magnitude Binarized
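
Here is a small sketch of the blur-then-differentiate idea on synthetic data. The kernel size (9), sigma (1.5), noise level, and the gaussian_kernel helper are assumptions chosen for illustration; the write-up does not state the values actually used.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size, sigma):
    """2D Gaussian kernel: outer product of a normalized 1D Gaussian."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax**2) / (2.0 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def gradient_magnitude(image):
    """Per-pixel gradient magnitude from the two finite difference filters."""
    p_dx = convolve2d(image, np.array([[1, -1]]), mode="same", boundary="symm")
    p_dy = convolve2d(image, np.array([[1], [-1]]), mode="same", boundary="symm")
    return np.sqrt(p_dx**2 + p_dy**2)

# Synthetic test image: one vertical step edge plus grainy noise.
rng = np.random.default_rng(0)
noisy = np.zeros((64, 64))
noisy[:, 32:] = 1.0
noisy += rng.normal(scale=0.1, size=noisy.shape)

# Blur first, then take the gradient magnitude.
blurred = convolve2d(noisy, gaussian_kernel(9, 1.5), mode="same", boundary="symm")
raw_mag = gradient_magnitude(noisy)
smooth_mag = gradient_magnitude(blurred)
```

Comparing raw_mag and smooth_mag away from the edge shows the noise response dropping, while the peak at the edge gets lower and wider, matching the thicker edges described above.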

For some odd reason though, precomposing my Gaussian and finite difference filters yielded some odd and blatantly incorrect results. convolve2d is supposed to compute 2D convolution, which is commutative; in practice, however, the result depended on the order of the arguments. I've displayed my preconvolved filters and the final convolution below.

Gaussian Preconvolved Dx

Gaussian Preconvolved Dy

Final Convolution
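
One thing worth checking here, offered as a guess since the write-up doesn't say which mode was used: scipy.signal.convolve2d with mode="same" returns an output the size of its first argument, so swapping the arguments changes what gets cropped. With the default mode="full" the operation is commutative, and precomposing the filters matches blurring and then differentiating. The sketch below uses assumed kernel parameters and a random stand-in image.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size, sigma):
    """2D Gaussian kernel: outer product of a normalized 1D Gaussian."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax**2) / (2.0 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

G = gaussian_kernel(9, 1.5)
D_X = np.array([[1.0, -1.0]])

# mode="full" (the default) is commutative.
full_gd = convolve2d(G, D_X)
full_dg = convolve2d(D_X, G)

# mode="same" takes the shape of the FIRST argument, so order matters.
same_gd = convolve2d(G, D_X, mode="same")   # shaped like G
same_dg = convolve2d(D_X, G, mode="same")   # shaped like D_X: most of the filter is lost

# Precomposed derivative-of-Gaussian filter versus the two-step version.
rng = np.random.default_rng(0)
img = rng.random((32, 32))
two_step = convolve2d(convolve2d(img, G), D_X)
one_step = convolve2d(img, full_gd)
```

If the precomposed filter was built with the small difference operator as the first argument and mode="same", it would be cropped down to two entries, which could explain a badly wrong final convolution.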

1.3: Image Straightening

We search over proposed rotations from -5 to 5 degrees. This isn't a huge range, nor is it particularly fine-grained, and we assume that the initial image is not tilted beyond it (this will result in a failure case we will see below). The reasoning behind this choice is purely performance driven: checking fewer proposed rotations keeps computation times down. Below from left to right, we will see the original image, the straightened image, as well as the angle histograms for the source and straightened image.

Facade

facade

facade original angle histogram

facade straightened

facade straightened angle histogram

Drawn City

drawn city

drawn city original angle histogram

drawn city straightened

drawn city straightened angle histogram

France

france

france original angle histogram

france straightened

france straightened angle histogram

Flan

flan

flan original angle histogram

flan straightened

flan straightened angle histogram

Not all images are created equal, and image straightening does not work equally well on all of them. It worked decently well on the first three, although you will notice that these are all architectural scenes or cityscapes. The fourth image is of a flan, which, although delicious looking, is not particularly amenable to image straightening. There are two major reasons. First, the image was rotated 9 degrees, which is farther than the proposed range of rotations our algorithm searches. More fundamentally, however, this image has a noticeable lack of straight lines. Based on the original image, it's almost impossible to tell that it was rotated at all, simply because there isn't really a well defined notion of what a straightened circular dessert should even be. This happens easily with organic objects, since there aren't many lines for our algorithm to latch onto.
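
The search described in this section can be sketched as follows. This is a sketch under assumptions, not the project's actual code: the scoring rule (fraction of strong edge pixels whose gradient angle is within tol_deg of horizontal or vertical), the center crop, the smoothing sigma, the 1 degree step, and every function name here are illustrative choices.

```python
import numpy as np
from scipy import ndimage
from scipy.signal import convolve2d

def straightness_score(image, tol_deg=1.0, crop=0.2, sigma=2.0):
    """Fraction of strong edges whose gradient angle is near a multiple of 90 degrees."""
    h, w = image.shape
    dh, dw = int(h * crop), int(w * crop)
    center = image[dh:h - dh, dw:w - dw]          # ignore the borders added by rotation
    center = ndimage.gaussian_filter(center, sigma)
    dx = convolve2d(center, np.array([[1, -1]]), mode="same", boundary="symm")
    dy = convolve2d(center, np.array([[1], [-1]]), mode="same", boundary="symm")
    mag = np.sqrt(dx**2 + dy**2)
    strong = mag > 0.25 * mag.max()
    if not strong.any():
        return 0.0
    angles = np.degrees(np.arctan2(dy[strong], dx[strong]))
    # Distance (in degrees) from each angle to the nearest multiple of 90.
    deviation = np.abs((angles + 45.0) % 90.0 - 45.0)
    return float(np.mean(deviation < tol_deg))

def find_best_rotation(image, max_deg=5):
    """Try each whole-degree rotation in [-max_deg, max_deg]; return the best one."""
    candidates = np.arange(-max_deg, max_deg + 1)
    scores = []
    for angle in candidates:
        rotated = ndimage.rotate(image, angle, reshape=False, order=1)
        scores.append(straightness_score(rotated))
    return int(candidates[int(np.argmax(scores))])

# Toy example: an axis-aligned rectangle tilted by 3 degrees.
toy = np.zeros((120, 120))
toy[40:80, 30:90] = 1.0
tilted = ndimage.rotate(toy, 3, reshape=False, order=1)
best = find_best_rotation(tilted)
straightened = ndimage.rotate(tilted, best, reshape=False, order=1)
```

A tilt outside the candidate range, like the 9 degree flan, can never be recovered by this search. An image with few straight edges gives nearly flat scores across all candidates, so the argmax is close to arbitrary.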

Frequencies

2.1: Image "Sharpening"

We've implemented the unsharp mask filter below, both naively and through efficient filter pre-composition. I've included some examples of sharpened images; however, you may notice that the last image isn't really sharper, it's just a little brighter. This is a consequence of the pseudo-sharpening we're using: we only boost the weight of the high frequency signals already present in the original image, which is not a genuine sharpening of the image. In a drawn image, where there isn't the blurry grain of traditional photographs and the resolution is unbounded, the unsharp mask filter basically just brightens the image by additively increasing pixel brightness.

Taj Mahal

Taj Mahal: Sharpened Naively

Taj Mahal: Sharpened Pre-composed

Cherry Blossoms

Cherry Blossoms: Sharpened Naively

Cherry Blossoms: Sharpened Pre-composed

Woods

Woods: Sharpened Naively

Woods: Sharpened Pre-composed

Drawn

Drawn: Sharpened Naively

Drawn: Sharpened Pre-composed
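
The two variants of the unsharp mask can be sketched like this. The formula image + alpha * (image - blurred) is the standard unsharp mask; the kernel size, sigma, alpha, and function names are assumptions, since the write-up does not list the values used. Clipping to the valid intensity range is left out for brevity.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size, sigma):
    """2D Gaussian kernel: outer product of a normalized 1D Gaussian."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax**2) / (2.0 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def unsharp_naive(image, kernel, alpha):
    """Blur, subtract to get the high frequencies, then add them back scaled by alpha."""
    blurred = convolve2d(image, kernel, mode="same", boundary="symm")
    return image + alpha * (image - blurred)

def unsharp_kernel(kernel, alpha):
    """Single filter equivalent to the naive version: (1 + alpha) * impulse - alpha * G."""
    impulse = np.zeros_like(kernel)
    impulse[kernel.shape[0] // 2, kernel.shape[1] // 2] = 1.0   # needs an odd-sized kernel
    return (1.0 + alpha) * impulse - alpha * kernel

def unsharp_precomposed(image, kernel, alpha):
    """One convolution with the pre-composed unsharp mask filter."""
    return convolve2d(image, unsharp_kernel(kernel, alpha), mode="same", boundary="symm")

rng = np.random.default_rng(0)
img = rng.random((32, 32))
G = gaussian_kernel(9, 2.0)
naive = unsharp_naive(img, G, alpha=1.5)
composed = unsharp_precomposed(img, G, alpha=1.5)
```

Because convolution is linear, both paths give the same result, but the pre-composed version needs only one pass over the image.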

2.2: Hybrid Images

By artificially lerping between select frequencies from two different images, we can generate hybrids, images that will transform as a function of viewing distance.

Let's take a look at our original two images, and their Fourier analyses:

Eiffel Tower	Eiffel Tower: Fourier Analysis
Washington Monument	Washington Monument: Fourier Analysis

Now what happens if we high pass filter the Eiffel Tower, and low pass filter the Washington Monument? Let's take a look at their Fourier Analyses now:

Eiffel Tower: High Pass Filtered

Washington Monument: Low Pass Filtered

Let's now average the filtered versions of the Eiffel Tower and Washington Monument to get the Eiffel Monument!

Eiffel Monument

Eiffel Monument: Fourier Analysis
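
The recipe used for the Eiffel Monument can be sketched as below, assuming aligned grayscale float images of the same shape. The cutoff sigmas and the names hybrid_image and log_spectrum are illustrative assumptions; scipy.ndimage.gaussian_filter stands in for whatever low pass filter the project actually used.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(high_source, low_source, sigma_high=3.0, sigma_low=6.0):
    """Average the high frequencies of one image with the low frequencies of another."""
    high = high_source - gaussian_filter(high_source, sigma_high)   # high pass
    low = gaussian_filter(low_source, sigma_low)                    # low pass
    return (high + low) / 2.0

def log_spectrum(image):
    """Log magnitude of the centered 2D Fourier transform, for visualization."""
    # The small constant avoids log(0); it is an assumption added for safety.
    return np.log(np.abs(np.fft.fftshift(np.fft.fft2(image))) + 1e-8)

# Stand-in images: random detail for the high pass source, a smooth ramp for the low pass one.
rng = np.random.default_rng(0)
detail = rng.random((64, 64))
ramp = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
hybrid = hybrid_image(detail, ramp)
spectrum = log_spectrum(hybrid)
```

The two sigmas control where each image's frequency band is cut off. They usually need tuning per image pair so that the close-up and far-away readings are both convincing.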

Here are two more cool hybrids that we generated. You may recognize the movie stars from Roman Holiday:

Derek	Nutmeg	Derek & Nutmeg
Audrey Hepburn	Gregory Peck	Roman Holiday

This doesn't mean you should just go around making hybrid images of random source subjects, though. If they don't align naturally, the hybrid photo won't be that good. Take a look at this pumpkin-penguin for example. How would you even go about aligning this? It's not that the effect doesn't work, it's just that the alignment isn't really meaningful. Perhaps another day for the punguin.

Penguin

Pumpkin

Punguin

2.3: Gaussian and Laplacian Pyramids

Let's take a look at some Gaussian and Laplacian stacks, first applied to the Mona Lisa:

Now let's see the Laplacian and Gaussian stack for the Eiffel Monument hybrid we generated above.
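
For reference, here is a minimal sketch of how such stacks can be built. Unlike a pyramid, a stack never downsamples, so every level keeps the original size. The number of levels, the sigma, and the choice to re-blur the previous level at each step are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(image, levels=5, sigma=2.0):
    """Level 0 is the image itself; each later level blurs the previous one again."""
    stack = [image]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(image, levels=5, sigma=2.0):
    """Differences of consecutive Gaussian levels, plus the blurriest level as a residual."""
    g = gaussian_stack(image, levels, sigma)
    bands = [g[i] - g[i + 1] for i in range(levels - 1)]
    bands.append(g[-1])
    return bands

rng = np.random.default_rng(0)
img = rng.random((48, 48))
g_stack = gaussian_stack(img)
l_stack = laplacian_stack(img)
reconstructed = np.sum(l_stack, axis=0)
```

Keeping the final Gaussian level as the last Laplacian entry means the bands sum back to the original image, which the blending step relies on.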

2.4: Multiresolution Blending

By employing the Laplacian and Gaussian stacks we generated above, we can synthesize some really interesting multiresolution blending images. Below, we'll display the two original images, as well as the result of applying multiresolution blending with standard masks.

Orapple

Apple

Orange

Orapple: Vertical Mask
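
A sketch of blending with a vertical mask, under the same assumptions as before (grayscale float images of equal size, illustrative levels and sigma, hypothetical function names): each Laplacian band of the two images is mixed using the matching level of the mask's Gaussian stack, and the mixed bands are summed.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(image, levels=5, sigma=2.0):
    """Level 0 is the image itself; each later level blurs the previous one again."""
    stack = [image]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(image, levels=5, sigma=2.0):
    """Differences of consecutive Gaussian levels, plus the blurriest level as a residual."""
    g = gaussian_stack(image, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

def multires_blend(image_a, image_b, mask, levels=5, sigma=2.0):
    """Blend band by band: the mask is 1 where image_a should show, 0 for image_b."""
    bands_a = laplacian_stack(image_a, levels, sigma)
    bands_b = laplacian_stack(image_b, levels, sigma)
    masks = gaussian_stack(mask, levels, sigma)
    blended = np.zeros_like(image_a)
    for band_a, band_b, m in zip(bands_a, bands_b, masks):
        blended += m * band_a + (1.0 - m) * band_b
    return blended

# Stand-in images and a vertical mask: left half from a, right half from b.
rng = np.random.default_rng(0)
a = rng.random((48, 48))
b = rng.random((48, 48))
vertical_mask = np.zeros((48, 48))
vertical_mask[:, :24] = 1.0
result = multires_blend(a, b, vertical_mask)
```

Blurring the mask more at the coarser levels is what hides the seam: low frequencies transition over a wide region, while fine detail switches over within a few pixels.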

Puffguin