CS 194-26 Project 2

Abe Jellinek

Introduction

This project gave us a chance to experiment with filtering and combining images using compositions and creative applications of simple filters. We blurred images, sharpened them, combined them at different frequencies to create hybrids, and built Gaussian and Laplacian stacks to blend together disparate images with irregular masks.

Part 1.1

A gradient magnitude image is essentially a large matrix representing an image, in which each entry (pixel) is the Euclidean norm of the 2D gradient of the image at that point. In other words, for some image f(x, y), we take two partial derivatives using finite difference filters, one with respect to x and one with respect to y. The gradient magnitude is then the square root of the sum of the two squared partial derivatives: sqrt((∂f/∂x)² + (∂f/∂y)²). Edges and noisy areas of the image (areas in which the frequency of the signal is high) will have high gradient magnitude values, while areas with little change (hence a low-frequency signal) will have low gradient magnitude values.
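As a rough sketch of this computation (using NumPy and SciPy; this is illustrative, not necessarily the exact code behind the results below):

```python
import numpy as np
from scipy.signal import convolve2d

def gradient_magnitude(im):
    """Euclidean norm of the 2D image gradient, computed with
    finite difference filters. `im` is a 2D float array."""
    D_x = np.array([[1.0, -1.0]])    # finite difference in x
    D_y = np.array([[1.0], [-1.0]])  # finite difference in y
    df_dx = convolve2d(im, D_x, mode='same')
    df_dy = convolve2d(im, D_y, mode='same')
    return np.sqrt(df_dx ** 2 + df_dy ** 2)
```

Thresholding the resulting magnitude image then gives a binary edge image like the one shown below.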

Cameraman (original)
Cameraman (x derivative)
Cameraman (y derivative)
Cameraman (gradient magnitude)
Cameraman (edges; threshold = 0.2)

Part 1.2

Cameraman (blurred, sigma = 1.0)
Cameraman (blurred + x derivative)
Cameraman (blurred + y derivative)
Cameraman (blurred + gradient magnitude)
Cameraman (blurred + edges; threshold = 0.14)

The differences between the blurred and unblurred partial derivative images are hard to see in scaled-down form, but the differences in the edge-detection results are obvious: the edge-detected blurred image features significantly less aliasing and fewer false positives, even with a lower threshold, and its thicker edges are solid instead of appearing "hollow" as in the unblurred image (see particularly the cameraman's arm on the left side of the image). Edges detected from the blurred image would be much more useful in a computer vision application and are more visually pleasing.
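A sketch of the smooth-then-differentiate pipeline (again NumPy/SciPy; the kernel construction here is one reasonable choice, since any normalized 2D Gaussian works):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_2d(sigma):
    """Separable 2D Gaussian kernel via an outer product of 1D Gaussians."""
    size = int(6 * sigma) + 1
    xs = np.arange(size) - size // 2
    g = np.exp(-xs ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

def blurred_edges(im, sigma=1.0, threshold=0.14):
    # smooth first so the finite differences respond to real edges, not noise
    smoothed = convolve2d(im, gaussian_2d(sigma), mode='same')
    dx = convolve2d(smoothed, np.array([[1.0, -1.0]]), mode='same')
    dy = convolve2d(smoothed, np.array([[1.0], [-1.0]]), mode='same')
    mag = np.sqrt(dx ** 2 + dy ** 2)
    return (mag > threshold).astype(float)
```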

Cameraman (convolved with DoG filter + edges, sigma = 1.0, threshold = 0.14)
2D Gaussian convolved with finite difference filter D_x
2D Gaussian convolved with finite difference filter D_y

The edge-detected output that we produce with the single DoG (derivative of Gaussian) filter is pixel-for-pixel identical to the output produced with the two-step convolution.
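Because convolution is associative, this equivalence can be checked numerically. A small sketch ('full' mode is used here so that boundary cropping cannot introduce differences between the two orderings):

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
im = rng.random((32, 32))

# build a normalized 2D Gaussian
size, sigma = 7, 1.0
xs = np.arange(size) - size // 2
g = np.exp(-xs ** 2 / (2 * sigma ** 2))
g /= g.sum()
G = np.outer(g, g)
D_x = np.array([[1.0, -1.0]])  # finite difference in x

# two-step: blur, then differentiate
two_step = convolve2d(convolve2d(im, G, mode='full'), D_x, mode='full')

# one-step: convolve with the derivative-of-Gaussian filter directly
dog = convolve2d(G, D_x, mode='full')
one_step = convolve2d(im, dog, mode='full')

assert np.allclose(two_step, one_step)
```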

Part 2.1

Taj (original)
Taj (high frequencies only, calculated as original - gaussian_blur(original, sigma = 1.0))
Taj (unsharp masked by adding 30% of the high frequencies on top of the original)

Below are two examples using my own images.

Odessa church (original)
Odessa church (high frequencies only - view at full resolution to see clearly)
Odessa church (unsharp masked by adding 60% of the high frequencies on top of the original - view at full resolution to see differences)
Artsakh (original)
Artsakh (Gaussian blurred, sigma = 1.0)
Artsakh (Gaussian blurred and then re-sharpened with unsharp mask filter)

The re-sharpened image still looks blurry. Lines appear distinct and strong, but they are surrounded by slight blurry halos. Surfaces (low-frequency areas of the image) appear out of focus. It resembles a less pleasant version of the "Clarity" effect in Adobe Photoshop Lightroom when the amount is set to a negative value.
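The unsharp mask operation used throughout this part can be sketched like this (NumPy/SciPy; parameter names are illustrative):

```python
import numpy as np
from scipy.signal import convolve2d

def unsharp_mask(im, sigma=1.0, amount=0.3):
    """Sharpen by adding back `amount` of the high frequencies, where
    high frequencies = original - Gaussian blur of the original."""
    size = int(6 * sigma) + 1
    xs = np.arange(size) - size // 2
    g = np.exp(-xs ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    low = convolve2d(im, np.outer(g, g), mode='same', boundary='symm')
    high = im - low
    return np.clip(im + amount * high, 0.0, 1.0)
```

The residual blurriness described above follows from how this filter works: it can only amplify high frequencies that survived the blur, and frequencies the blur removed entirely cannot be recovered.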

Part 2.2

Me
My dad, a while ago
Half me, half my dad (sigma 1 = 16, sigma 2 = 8)
Me (log magnitude of Fourier transformation)
My dad (log magnitude of Fourier transformation)
Half me, half my dad, now in color (sigma 1 = 16, sigma 2 = 8)
Me (filtered + log magnitude of Fourier transformation)
My dad (filtered + log magnitude of Fourier transformation)
Half me, half my dad (log magnitude of Fourier transformation)
An owl (Shutterstock sample)
A cat (Jonathan Fife via Pixels.com)
A... cowl? (sigma 1 = 4, sigma 2 = 8)
My friends and me in the desert
The desert that my friends and I were in
A failure. It looks cool, I guess, but it doesn't really show anything identifiable. This effect really needs photos with similar geometries and subjects, or else the two signals visually clash and become nothing more than noise. (sigma 1 = 4, sigma 2 = 8)

I decided to go with color from the beginning. It looks best on most photos when both components are in color, although the high-frequency component's color data is essentially lost - borders are often gray or black, even in colorful images. For my hybrid between me and my dad, I converted the picture of him (the high-frequency component) to grayscale even when I had my image in color because it looked significantly better that way.
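A sketch of the hybrid construction, applied to grayscale images or to one color channel at a time (sigma values as captioned above; helper names are illustrative):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_2d(sigma):
    size = int(6 * sigma) + 1
    xs = np.arange(size) - size // 2
    g = np.exp(-xs ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

def hybrid(im_low, im_high, sigma1, sigma2):
    """Low frequencies of im_low plus high frequencies of im_high.
    Up close the high-frequency image dominates; from a distance
    only the low frequencies survive."""
    low = convolve2d(im_low, gaussian_2d(sigma1),
                     mode='same', boundary='symm')
    high = im_high - convolve2d(im_high, gaussian_2d(sigma2),
                                mode='same', boundary='symm')
    return np.clip(low + high, 0.0, 1.0)
```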

Part 2.3

My version of figure 3.42 in Szeliski:





All images above were normalized so that the minimum color value corresponded to zero.
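A minimal sketch of the Gaussian and Laplacian stacks behind these figures (stacks, not pyramids, so there is no downsampling; the level count and sigma here are illustrative):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_2d(sigma):
    size = int(6 * sigma) + 1
    xs = np.arange(size) - size // 2
    g = np.exp(-xs ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

def stacks(im, levels=5, sigma=2.0):
    """Gaussian stack: repeatedly blur, keeping full resolution.
    Laplacian stack: differences of consecutive Gaussian levels,
    plus the final blurred level so the stack sums back to `im`."""
    gauss = [im]
    for _ in range(levels - 1):
        gauss.append(convolve2d(gauss[-1], gaussian_2d(sigma),
                                mode='same', boundary='symm'))
    lap = [a - b for a, b in zip(gauss[:-1], gauss[1:])]
    lap.append(gauss[-1])
    return gauss, lap
```

Each Laplacian level isolates one band of frequencies, which is what the rows of the figure display.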

Part 2.4

Another example, using two works by Jiří Balcar (33rd Street Subway Station and A Man with Glass of Vodka):





This display follows the same format as Szeliski's, so the bottom-right image above is the final output.
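The blend itself can be sketched as follows (one channel at a time; it combines the two images' Laplacian stacks, weighted level by level with a Gaussian stack of the mask; level count and sigma are illustrative):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_2d(sigma):
    size = int(6 * sigma) + 1
    xs = np.arange(size) - size // 2
    g = np.exp(-xs ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

def blur(im, sigma):
    return convolve2d(im, gaussian_2d(sigma), mode='same', boundary='symm')

def blend(im1, im2, mask, levels=5, sigma=2.0):
    """Multiresolution blend: combine the Laplacian stacks of im1 and
    im2, weighted at each level by a progressively blurred mask, so
    fine detail is cut sharply while coarse detail is blended softly."""
    g1, g2, gm = [im1], [im2], [mask]
    for _ in range(levels - 1):
        g1.append(blur(g1[-1], sigma))
        g2.append(blur(g2[-1], sigma))
        gm.append(blur(gm[-1], sigma))
    out = np.zeros_like(im1)
    for i in range(levels - 1):
        l1 = g1[i] - g1[i + 1]  # Laplacian level of image 1
        l2 = g2[i] - g2[i + 1]  # Laplacian level of image 2
        out += gm[i] * l1 + (1 - gm[i]) * l2
    # low-frequency residual, blended with the blurriest mask
    out += gm[-1] * g1[-1] + (1 - gm[-1]) * g2[-1]
    return np.clip(out, 0.0, 1.0)
```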

One last example (same format, my photos):





Bells & Whistles

I really loved experimenting on my own images and on pieces of art that I like! It's so cool to see these abstract concepts come together into something tangible. And I feel that the most important thing I learned from working on this project was exactly what a Fourier transform does visually - it was helpful to visualize it on a few of my own images and get an understanding of what information we can pull from looking at it.