Images are data. They can be interpreted to become information. Interpretation requires Difference (a distinction, a detail, something that stands out). Visually, Difference can take on many qualities: color, texture, consistency. To be employed in Computer Vision, though, it must be somehow quantified. Hence enters the concept of change, and the mathematical study of change, differential calculus.

The edges of an image (its contours, those elements which delineate forms and, ultimately, make the symbols that provide us with meaning) are primarily sharp changes in color. In differential calculus, sharp changes show up as peaks in the magnitude of the gradient. Developing methods to accurately and efficiently compute the gradient of an image is, therefore, essential to deriving or communicating meaning through vision.

That said, images, unlike other signals, are two-dimensional, which signifies that computing their gradient is a bit more complicated than taking a derivative. Namely, the gradient (total derivative) must be computed as a function of the partial derivatives in each dimension. Now, on the (very, very) bright side, since images (at least within computer screens) are represented as discrete signals, no obscure calculus rules need to be memorized. The dot product says hi (and winks)!

Practically: the partial derivative of a line of pixels along a dimension (for example, the x- or y-axis) is the extent to which each of its constituent pixels is distinguishable from its neighbors along that line. The total derivative, therefore, is the extent to which every pixel in the image is different from its neighbors, along both dimensions.

Algorithm-wise, the image gradient can be implemented by first constructing two arrays of finite difference operators, one for each dimension. A finite difference operator is nothing more than a vector whose dot product with any other vector is the result of subtracting the elements of the other vector, in order. For instance, the dot product of [1, -1] with two neighboring pixels [a, b] is a - b. When convolved with an image, a finite difference operator yields the partial derivative of the image in the direction along its dimension. Joined together, two finite difference operators form a kernel. When convolved with the image, such a kernel yields the gradient.
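The steps above can be sketched as follows. This is a minimal sketch, not the project's actual code: it assumes a grayscale image stored as a 2-D float array, assumes the operators take the form [1, -1], and combines the two partial derivatives into a gradient magnitude instead of building a single joined kernel. The names `D_X`, `D_Y`, and `image_gradient` are hypothetical.

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference operators, one per dimension (the [1, -1] form is assumed).
D_X = np.array([[1.0, -1.0]])      # differences between horizontal neighbors
D_Y = np.array([[1.0], [-1.0]])    # differences between vertical neighbors

def image_gradient(image):
    """Return (dx, dy, magnitude) for a 2-D grayscale float image."""
    # Symmetric padding avoids inventing an edge at the image border.
    dx = convolve2d(image, D_X, mode="same", boundary="symm")
    dy = convolve2d(image, D_Y, mode="same", boundary="symm")
    # The magnitude measures how much each pixel differs from its
    # neighbors along both dimensions at once.
    magnitude = np.hypot(dx, dy)
    return dx, dy, magnitude

# A vertical step edge: dark on the left, bright on the right.
step = np.zeros((4, 6))
step[:, 3:] = 1.0
dx, dy, magnitude = image_gradient(step)
```

On the step image, only the column where the intensity jumps responds in `dx`, while `dy` stays at zero because every row is identical.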

Having thus examined the role of the image gradient as morphogenic Difference, only one more concept needs to be introduced: the threshold. Thresholds are intensive points used to (subjectively?) separate, and therefore identify, that which is from that which is not. For example, the last three of the following Cameraman pictures are the result of taking the image gradient and applying a threshold. Any pixel intensity above the threshold is rounded up to 1 (is considered an edge). Any pixel intensity below the threshold is rounded down to 0 (is not considered an edge). Voilà.
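In code, the threshold is a single comparison. The sketch below assumes the gradient magnitude is already available as a float array; `binarize_edges` is a hypothetical name, and treating a value exactly equal to the threshold as "not an edge" is an assumption, since the text does not say.

```python
import numpy as np

def binarize_edges(magnitude, threshold):
    """Map gradient magnitudes to 1 (edge) above the threshold, else 0."""
    # Values equal to the threshold are treated as non-edges (assumption).
    return (magnitude > threshold).astype(np.uint8)

magnitude = np.array([[0.0005, 0.0030],
                      [0.0012, 0.0001]])
edges = binarize_edges(magnitude, threshold=0.0010)
```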

2. Derivative of Gaussian

A shortcoming of taking the gradient of the image directly is that the latter, due to its sampled nature, is subject to noise. This results in detecting edges and artifacts that do not truly exist. Enter filtering. By selectively choosing which frequencies to keep from the image, the output will constitute (almost) only pertinent information.

The filter implemented for this project is known as Gaussian, as it involves a kernel made of a Gaussian (a.k.a. Normal) distribution. The image in question is convolved with the kernel. As such, the intensity of the pixel in the middle of an area of the image having the size of the kernel will become the weighted average of the intensities in that area (with intensities closer to the center being assigned a higher weight).
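A minimal sketch of such a filter follows. It assumes a grayscale float image and builds the 2-D kernel as the outer product of a sampled 1-D Gaussian, normalized so the weights sum to one; the function names and the symmetric border handling are assumptions, not the project's actual code.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size, sigma):
    """Sample a 2-D Gaussian on a size x size grid and normalize it."""
    ax = np.arange(size) - (size - 1) / 2.0      # offsets from the center
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))  # 1-D Gaussian samples
    kernel = np.outer(g, g)                      # separable 2-D kernel
    return kernel / kernel.sum()                 # weights sum to 1

def gaussian_blur(image, size=3, sigma=1.6):
    """Replace each pixel by the Gaussian-weighted average of its area."""
    kernel = gaussian_kernel(size, sigma)
    return convolve2d(image, kernel, mode="same", boundary="symm")

kernel = gaussian_kernel(3, 1.6)
flat = gaussian_blur(np.ones((5, 5)))
```

Because the weights sum to one, a constant image passes through unchanged, and the center of the kernel always carries the largest weight.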

Notice the difference that filtering with a Gaussian before performing the gradient computation makes. The following images of the Cameraman were first passed through a Gaussian filter with a kernel size of 3 and a standard deviation of 1.6. The image gradient, unlike in the previous part, now produces images that are gray rather than black and white. This occurs because blurring attenuates the high frequencies, so intensity changes between neighboring pixels become smaller and more gradual. Edges that used to be white are now a little darker. Parts that used to be black are now a little lighter. Everything is a bit more homogeneous.

3. Image Straightening

Images can be straightened programmatically by testing their orientation along a set of rotations, and keeping the rotation that performs best in terms of maximizing the number of horizontal and vertical edges in the scene. The intuition behind such an algorithm lies in the fact that, due to gravity, humans have a statistical preference for lines parallel to, and orthogonal to, the ground.

The metric to count horizontal and vertical edges is straightforward: first compute the gradient orientation at each pixel, which is simply the arctan of the ratio between the partial derivatives. Then, take the cos and sin of this orientation, and multiply them together. The more the gradient orientation approaches 0, 90, 180, or 270 degrees, the closer the product will be to zero, and vice versa.
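The metric can be sketched per pixel as below. Two choices here are assumptions and not taken from the project: `np.arctan2` stands in for the plain arctan of the ratio (it avoids dividing by zero when the horizontal derivative vanishes), and the absolute value is taken so that opposite signs do not cancel when the scores are later added up. The name `axis_alignment_score` is hypothetical.

```python
import numpy as np

def axis_alignment_score(dx, dy):
    """Per-pixel |cos(theta) * sin(theta)| of the gradient orientation.

    The score is 0 for orientations of 0, 90, 180, or 270 degrees and
    peaks at 0.5 for diagonal orientations.
    """
    theta = np.arctan2(dy, dx)
    return np.abs(np.cos(theta) * np.sin(theta))

# Gradients pointing right, up, and along the diagonal.
dx = np.array([1.0, 0.0, 1.0])
dy = np.array([0.0, 1.0, 1.0])
scores = axis_alignment_score(dx, dy)
```

Note that a pixel with no gradient at all also scores 0 under this sketch, so flat regions would count as "aligned" unless they are masked out.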

That said, a difficulty may arise in that rotated images may contain remnant black borders from the rotation. As such, when rotating to test for orientation, the candidate image must also be cropped to only contain image information. This signifies that every candidate image will have a different number of pixels, which will skew the metric. The solution is to weight the metric by the size of each image. The following images show the algorithm in action for a facade.
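The rotate, crop, and normalize loop described above might look like the following sketch. Everything in it is an assumption made for illustration: the helper names, the candidate `angles`, the magnitude cutoff used to ignore flat pixels, and the choice to normalize by taking the mean over the contributing pixels, which is one way to weight the metric by size.

```python
import numpy as np
from scipy import ndimage

def edge_misalignment(image, min_magnitude=1e-3):
    """Mean |cos * sin| of the gradient orientation over edge pixels."""
    dy, dx = np.gradient(image.astype(float))
    edges = np.hypot(dx, dy) > min_magnitude   # ignore flat regions
    if not edges.any():
        return 0.0
    theta = np.arctan2(dy[edges], dx[edges])
    # Taking the mean divides by the number of contributing pixels,
    # which keeps crops of different sizes comparable.
    return float(np.mean(np.abs(np.cos(theta) * np.sin(theta))))

def crop_valid_region(rotated, angle_degrees):
    """Centered crop (same aspect ratio) that excludes the black corners."""
    h, w = rotated.shape
    c = abs(np.cos(np.radians(angle_degrees)))
    s = abs(np.sin(np.radians(angle_degrees)))
    # Largest scale at which the crop's corners stay inside the rotated frame.
    f = min(w / (w * c + h * s), h / (w * s + h * c))
    ch, cw = int(h * f), int(w * f)
    top, left = (h - ch) // 2, (w - cw) // 2
    return rotated[top:top + ch, left:left + cw]

def straighten(image, angles):
    """Return the candidate angle with the lowest misalignment, plus all scores."""
    scores = []
    for angle in angles:
        rotated = ndimage.rotate(image, angle, reshape=False)
        scores.append(edge_misalignment(crop_valid_region(rotated, angle)))
    return angles[int(np.argmin(scores))], scores

# Vertical bars, softened and tilted by 15 degrees, then straightened.
_, xx = np.mgrid[0:128, 0:128]
bars = ndimage.gaussian_filter(((xx // 16) % 2).astype(float), 2.0)
tilted = ndimage.rotate(bars, 15, reshape=False)
best_angle, scores = straighten(tilted, angles=[-15, 0, 15])
```

The crop shrinks as the rotation angle grows, so each candidate contributes a different number of pixels, which is why the score is an average and not a raw sum.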

The algorithm is also able to straighten the legendary Tower of Pisa, and a badly rotated picture of Mona Lisa, with minimal error. As can be seen in the histograms, the rotated images contain a greater number of pixels where the gradient is either 0, π, or -π radians, compared to their original versions.

The algorithm is nevertheless far from perfect. Images that naturally contain curves, such as the following picture of a bunny rendered with platinum skin, result in failures.

Speaking of failures, the algorithm cannot distinguish between a rotation where the image is oriented according to how a human expects it, and one where it has been rotated by 90 or -90 degrees. Potential improvements to the algorithm (which, however, were outside the scope of this project) include weighting the metric, or at least verifying the edge cases, using feature detection.

Finally, it is worth mentioning that certain cases exist where the algorithm, and even a potential one enhanced with feature detection, stands absolutely no chance of success. That said, it is doubtful whether a human stands much chance either. Certain things simply cannot be measured. Kandinsky's awe-provoking masterpiece, Composition VII, is compelling proof.

4. Unsharp Mask

The Gaussian filter developed previously can find further use in image sharpening. A blurry image, in essence, is one that retained its low frequencies, yet had its high frequencies removed. When subtracted from the original image, the blurry (low frequencies) image yields a very crisp (high frequencies) image. The original image can be sharpened by simply adding the crisp image to it, scaled to taste.
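As a minimal sketch of this recipe, assuming a float image with intensities in [0, 1]: the name `unsharp_mask`, the default `sigma`, and the `amount` parameter (the "to taste" factor) are hypothetical, and SciPy's built-in Gaussian filter stands in for the project's own.

```python
import numpy as np
from scipy import ndimage

def unsharp_mask(image, sigma=1.6, amount=1.0):
    """Sharpen by adding back the detail that a Gaussian blur removes."""
    blurred = ndimage.gaussian_filter(image, sigma)  # low frequencies
    detail = image - blurred                         # high frequencies
    # Clipping keeps the result in the assumed [0, 1] intensity range.
    return np.clip(image + amount * detail, 0.0, 1.0)

# A soft step edge: sharpening pushes both sides further apart.
step = np.full((5, 6), 0.25)
step[:, 3:] = 0.75
sharpened = unsharp_mask(step, sigma=1.6, amount=1.0)
```

Next to the edge, the bright side overshoots and the dark side undershoots, which is what makes the edge look crisper.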

Notice the unsharp mask can make a relatively low resolution image appear to be higher resolution, and even enhance perceived color. However, it is by no means magical. Images that are naturally blurry (or became blurry) will remain so.

5. Perception?

Two advanced applications of the Gaussian Blur are Hybrid Images (Oliva, Torralba, Schyns, SIGGRAPH 2006) and Multiresolution Blending (Burt and Adelson, ACM Transactions on Graphics 1983). Hybrid Images are images containing two scenes within one, which appear selectively based on the viewer's distance. Multiresolution Blending refers to a technique of taking a patch from one image, and seamlessly making it appear as an organic part of another.
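The hybrid idea can be sketched in a few lines: keep the low frequencies of the image meant to be seen from afar, and add the high frequencies of the image meant to be seen up close. The sketch assumes two aligned grayscale float images of equal shape in [0, 1]; the function name and both cutoff values are hypothetical and would need tuning per image pair.

```python
import numpy as np
from scipy import ndimage

def hybrid_image(far_image, near_image, sigma_low=6.0, sigma_high=3.0):
    """Combine the low band of one image with the high band of another."""
    low = ndimage.gaussian_filter(far_image, sigma_low)
    high = near_image - ndimage.gaussian_filter(near_image, sigma_high)
    return np.clip(low + high, 0.0, 1.0)

rng = np.random.default_rng(0)
far = rng.random((32, 32))
near = np.full((32, 32), 0.5)   # no detail at all, so nothing to add
hybrid = hybrid_image(far, near)
```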

The algorithms of this part were implemented exactly as outlined by their authors. As a sole exception, in multiresolution blending, stacks (which keep every level at the same resolution) were used instead of pyramids to preserve as much information as possible. Furthermore, to speed up the filtering process, every image was first converted to its frequency domain with a Fast Fourier Transform, and then, once the desired filter had been applied, inverted back to its spatial domain. The following images demonstrate the results.
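A rough sketch of stack-based blending with frequency-domain filtering is given below. It is an illustration under stated assumptions, not the project's code: the function names, the number of levels, and `sigma` are hypothetical, images are 2-D float arrays of equal shape, and filtering through the FFT implies circular (wrap-around) borders.

```python
import numpy as np

def gaussian_transfer(shape, sigma):
    """Frequency response of a Gaussian blur with spatial std `sigma`."""
    fy = np.fft.fftfreq(shape[0])[:, None]   # cycles per pixel, vertical
    fx = np.fft.fftfreq(shape[1])[None, :]   # cycles per pixel, horizontal
    return np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))

def fft_blur(image, sigma):
    """Blur by multiplying the spectrum, then inverting back."""
    spectrum = np.fft.fft2(image)
    return np.real(np.fft.ifft2(spectrum * gaussian_transfer(image.shape, sigma)))

def gaussian_stack(image, levels, sigma=2.0):
    """Repeatedly blurred copies, all at the original resolution."""
    stack = [image]
    for _ in range(levels):
        stack.append(fft_blur(stack[-1], sigma))
    return stack

def laplacian_stack(image, levels, sigma=2.0):
    """Band-pass layers plus the final low-pass residual."""
    g = gaussian_stack(image, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels)] + [g[-1]]

def blend(a, b, mask, levels=5, sigma=2.0):
    """Blend each frequency band with a progressively softer mask."""
    la = laplacian_stack(a, levels, sigma)
    lb = laplacian_stack(b, levels, sigma)
    gm = gaussian_stack(mask, levels, sigma)
    return sum(m * x + (1.0 - m) * y for m, x, y in zip(gm, la, lb))

rng = np.random.default_rng(0)
image = rng.random((16, 16))
mask = np.zeros((16, 16))
mask[:, :8] = 1.0
same = blend(image, image, mask)
```

Since the Laplacian layers telescope back to the original image, blending an image with itself returns it unchanged, which makes for a convenient sanity check.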

the three mustacheers of human civilization

analytically...

low frequencies form content, high frequencies form shape

Figure: Friedrich Nietzsche (low frequencies), Albert Einstein (high frequencies), and Friedrich Einstein (hybrid).

Figure: Friedrich Einstein, gaussian, levels 1 to 5.

filtering is equivalent to changing the viewer's spatial position

Figure: Salvador Dali (low frequencies), Friedrich Nietzsche (high frequencies), and Salvador Nietzsche (hybrid).

Figure: Salvador Nietzsche, laplacian, levels 1 to 5.

visualizing stacks

Figure: *Lincoln and Gala*, gaussian, laplacian, bandpass, and sharpen stacks, levels 0 to 9.

Figure: *Mona Lisa*, gaussian, laplacian, bandpass, and sharpen stacks, levels 0 to 9.

re^Naissance

Figure: Circles in a Circle, Wassily Kandinsky (1923); Vitruvian Man, Leonardo da Vinci (c. 1490); Circles Upon Man: Mechanics.

Figure: *Vitruvian Man*, laplacian, mask, and applied mask stacks, levels 0 to 9.

Figure: Circles Upon Man (low frequencies), Vitruvian Man (high frequencies), and Homo Universalis.

Figure: Vitruvian Man (low frequencies), Circles Upon Man (high frequencies), and Zarathustra (Διόνυσος).

"The practical teaching of Nietzsche is that difference is a happy accident, that
the diverse, the becoming, the random, are sufficient as objects of happiness.
That only happiness returns."

— Gilles Deleuze

Figure: Orange (top), Apple (bottom), and Orapple.

Figure: *In The Making*, laplacian, mask, and applied mask stacks, levels 0 to 9.

Figure: Campbell's, John Leffmann (2015), and In The Making.

Figure: Orapple (frequency domain), Campbell's (frequency domain), Orapple (gaussian), Campbell's (laplacian), and Canned Orapple (frequency domain).

Figure: Mask, levels 0, 5, and 9.

Figure: *Cameraman*, dx, dy, and dxy, followed by thresholds of 0.0010, 0.0015, and 0.0020.

Figure: *Cameraman*, dx, dy, and dxy, followed by thresholds of 0.0002, 0.0005, and 0.0010.

Filters and Frequencies

dream and make the new reality with us

xr.berkeley.edu

Overview

—

Table of Contents

—

1. Finite Difference Operator

2. Derivative of Gaussian

3. Image Straightening

4. Unsharp Mask

5. Perception?


CS 194-26: Computer Vision and Computational Photography (Fall 2020)

Apollo
