CS 194: Project 2
Theodora Worledge
Part 1: Fun with Filters
Part 1.1: Finite Difference Operator
To take a discrete derivative of an image, we can convolve the image with the humble finite difference filters, one to take the partial derivative with respect to x, and the other to take the partial derivative with respect to y. These filters are D_x = [1, -1] and D_y = [1, -1]^T.
One can observe that these filters represent the discrete derivative by taking the difference between two pixels.
After calculating the partial derivatives with respect to x and y, I calculated the magnitude of the gradient by taking the L2 norm: ||∇f|| = sqrt((∂f/∂x)² + (∂f/∂y)²).
To obtain the edge image, I used a threshold of 0.1 to binarize the gradient magnitude image into values of 1 (> 0.1) and 0 (<= 0.1).
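The pipeline above can be sketched as follows. This is a minimal illustration, not the exact notebook code; the array `im` stands in for the grayscale cameraman image scaled to [0, 1].

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference filters
D_x = np.array([[1, -1]])
D_y = np.array([[1], [-1]])

im = np.random.rand(64, 64)  # stand-in for the cameraman image

# Partial derivatives via convolution
dx = convolve2d(im, D_x, mode='same', boundary='symm')
dy = convolve2d(im, D_y, mode='same', boundary='symm')

# Gradient magnitude (L2 norm), binarized at the 0.1 threshold
grad_mag = np.sqrt(dx**2 + dy**2)
edges = (grad_mag > 0.1).astype(float)
```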
Original cameraman
Partial derivative in x
Partial derivative in y
Gradient magnitude
Edge image (thresholded > 0.1 gradient magnitude)
Part 1.2: Derivative of Gaussian (DoG) Filter
To reduce noise in the magnitude of the gradient, we can blur the image (convolve with a low pass, Gaussian filter) before taking the partial derivatives and taking the L2 norm. Due to the associativity of the convolution operation, we can either convolve the image with the Gaussian filter and then convolve with the finite difference filters, or we can convolve the Gaussian filter with the finite difference filters and then convolve the image with those derivatives of the Gaussian (DoG). I tried both approaches below. For my edge image creation, I used the same threshold of 0.1 as in part 1.1 of the project.
- Question asked in 1.2: There is less noise in the thresholded gradient magnitude image computed from the blurred image than in the one computed from the unblurred image. Additionally, the lines in the edge image are greatly improved (filled in) by blurring the original image. The partial derivatives in x and y of the blurred image are also slightly smoother and less noisy than those of the original image.
- Note that the edge images generated from (1) convolving the cameraman image first with the Gaussian filter and then with the finite difference filters and (2) convolving the Gaussian filter first with the finite difference filters, then convolving the image with the resulting DoG filters, are identical. I confirmed this in my Jupyter notebook by checking that the difference between the two images is the zero matrix.
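The associativity check described above can be sketched as follows. The Gaussian construction and kernel parameters here are hypothetical choices for illustration; full-mode convolution is used so the two orders of operations agree exactly, without boundary-cropping mismatches.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_2d(ksize, sigma):
    # Separable Gaussian kernel built as an outer product of 1D Gaussians
    ax = np.arange(ksize) - (ksize - 1) / 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

G = gaussian_2d(9, 2.0)
D_x = np.array([[1, -1]])
im = np.random.rand(64, 64)  # stand-in for the cameraman image

# (1) blur the image first, then differentiate
dx_1 = convolve2d(convolve2d(im, G), D_x)

# (2) build the DoG filter first, then convolve the image once
DoG_x = convolve2d(G, D_x)
dx_2 = convolve2d(im, DoG_x)

# By associativity of convolution, dx_1 and dx_2 are identical
```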
Blurred cameraman
Partial derivative in x for blurred cameraman
Partial derivative in y for blurred cameraman
Gradient magnitude for blurred cameraman
Edge image (thresholded > 0.1 gradient magnitude) for blurred cameraman
Edge image (thresholded > 0.1 gradient magnitude) for blurred cameraman, computed with the DoG filters
DoG partial x filter
DoG partial y filter
Part 2: Fun with Frequencies!
Part 2.1: Image "Sharpening"
Images that were blurred and then sharpened do not recover the sharpness of the original because information (specifically, the high frequency information) was lost during the blurring step, and therefore cannot be restored by the subsequent sharpening step. As can be seen in the examples, sharpening a blurred image results in an image that is sharper than the blurred image, but still blurrier than the original.
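Sharpening here is unsharp masking: subtract a blurred copy to isolate the high frequencies, then add a scaled amount of them back. A minimal sketch follows; the `alpha` and `sigma` values are hypothetical, not the ones used for the results shown.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpen(im, sigma=2.0, alpha=1.0):
    # Unsharp masking: im + alpha * (im - blurred)
    low = gaussian_filter(im, sigma)        # low frequencies
    high = im - low                         # high frequencies
    return np.clip(im + alpha * high, 0, 1) # boost detail, keep range valid

im = np.random.rand(64, 64)  # stand-in for a grayscale photo in [0, 1]
sharp = sharpen(im)
```

Larger `alpha` gives the "extra sharp" and "extra, extra sharp" effects, at the cost of exaggerated noise and halos around edges.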
Original Taj
Sharpened Taj
Extra sharp Taj
Extra, extra sharp Taj
Original frogs
Sharpened frogs
Original lunch
Sharpened lunch
Sharpened blurred frogs
Blurred frogs
Sharpened blurred lunch
Blurred lunch
Part 2.2: Hybrid Images
- My first hybrid photo is a self-portrait with and without my face mask - in line with proper mask etiquette, I'm "wearing" my mask when you're up close, but not necessarily when you are far away.
- The second row of images has the spatial domain representations and the third row contains the corresponding frequency domain representations.
- The high pass filter applied to the photo with my face mask reduced the log magnitude of the Fourier transform for low frequency values. In other words, the brightness in the center of the Fourier transform representation is diminished.
- The low pass filter applied to the photo without my face mask reduced the log magnitude of the Fourier transform for high frequency values. In other words, the brightness in all parts other than the center of the Fourier transform representation is generally diminished. The bright vertical and horizontal lines in the Fourier visualization are expected, as they indicate the strong presence of horizontal and vertical lines in the original photo. Also, there is a pattern of brighter spots, which I believe to be some sort of interference pattern from the filter I used. Nonetheless, most of the high frequencies are filtered out overall.
- The Fourier transform visualization for the mask/no mask hybrid photo is similar to the visualizations for the two original images. This is expected because I combined low and high frequencies to create the hybrid image.
- My second hybrid photo combines a crazy sneaker I once found in TJ Maxx with my slippers. This hybrid photo worked well because I posed my slipper photo in the same way as my sneaker photo. Also, there are a lot of fun high frequency details on the sneakers and not as much detail on the slippers, making the sneaker photo a good fit for the high pass filter and the slipper photo a good fit for the low pass filter.
- My third hybrid photo is an example of a hybrid picture that does not work well. The sunflowers in the low frequency image are much darker than the eggs in the high frequency image, so even when zoomed in on the eggs, the darkness of the sunflowers catches the viewer's eye more than the eggs benedict details.
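The hybrid construction and the Fourier visualization used throughout this part can be sketched as below. This is a minimal illustration under assumed parameter values; the cutoff sigmas are hypothetical and in practice are tuned per image pair.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(im_low, im_high, sigma_low=6.0, sigma_high=3.0):
    # Low frequencies of one image plus high frequencies of the other
    low = gaussian_filter(im_low, sigma_low)
    high = im_high - gaussian_filter(im_high, sigma_high)
    return low + high

def log_fft(im):
    # Log magnitude of the centered 2D Fourier transform
    return np.log(np.abs(np.fft.fftshift(np.fft.fft2(im))) + 1e-8)

a = np.random.rand(64, 64)  # stand-ins for aligned grayscale images
b = np.random.rand(64, 64)
h = hybrid(a, b)
```

For the color versions, the same filtering is applied to each channel independently.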
Hybrid photo with Derek and Nutmeg
Original photo with mask
High passed photo with mask
Original photo without mask
Low passed photo without mask
Original mask: log magnitude of Fourier transform
High passed with mask: log magnitude of Fourier transform
Original without mask: log magnitude of Fourier transform
Low passed without mask: log magnitude of Fourier transform
Hybrid photo with and without mask
Hybrid photo with and without mask: log magnitude of Fourier transform
Original sneaker
Original slipper
Hybrid photo with sneaker and slipper
Original eggs benedict
Original flower eyes
Hybrid photo with eggs benedict and flower eyes
Doesn't work very well :(
Color hybrid photo with and without mask
Color hybrid photo with sneaker and slipper
I tried introducing color to the hybrid images. On the face mask hybrid image, the high frequency image took on the colors of the low frequency image, leaving me with a face mask the same color as my skin tone. My lips are also a lot more noticeable through the face mask due to their color. Adding color to the sneaker/slipper hybrid image works nicely in that the high frequency sneaker image now has distinct color to it. Once again, however, the purple color of my socks in the low frequency image is bright and distracts from the hybrid quality of the image.
Part 2.3: Gaussian and Laplacian Stacks
Here, I multiplied the outputs of images run through a Laplacian stack by the output of a mask run through a Gaussian stack. Each row corresponds to a different frequency band of the original images, from highest frequency to lowest frequency. The left-most column is the mask applied to the apple image, the middle column is the mask applied to the orange image, and the right-most column is the average of the first two columns.
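The stacks themselves can be sketched as follows. Unlike pyramids, stacks keep every level at full resolution; the level count and sigma schedule below are hypothetical choices. Appending the most-blurred Gaussian level to the Laplacian stack makes the stack sum back to the original image.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, levels=4, sigma=2.0):
    # Repeatedly blur without downsampling
    stack = [im]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(im, levels=4, sigma=2.0):
    # Differences of consecutive Gaussian levels, plus the final
    # Gaussian level so the stack sums back to the original image
    g = gaussian_stack(im, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

im = np.random.rand(64, 64)  # stand-in for the apple or orange image
lap = laplacian_stack(im)
recon = sum(lap)  # reconstructs the original image exactly
```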
Apple level 3 (highest frequency)
Orange level 3 (highest frequency)
Oraple level 3 (highest frequency)
Apple level 2
Orange level 2
Oraple level 2
Apple level 1
Orange level 1
Oraple level 1
Apple level 0 (lowest frequency)
Orange level 0 (lowest frequency)
Oraple level 0 (lowest frequency)
Part 2.4: Multiresolution Blending (a.k.a. the oraple!)
Oraple in gray scale!
Oraple in color!
Original earlier sunset photo
Original later sunset photo
Blended earlier and later sunset
I'm very happy with the result of this earlier/later merged sunset photo! My friend took these photos in Spring 2020 on the Big C hike. I always thought it would be cool to merge them like this.
Original lake photo
Original otter photo
Blended lake and gigantic otter
I used an irregular mask to create this image of a gigantic otter in a lake. This result is alright - it doesn't have visible seams along the water, but you can still see me swimming through the otter's head, even though I was trying to hide myself behind the otter. It was very difficult to get the water to blend to this extent; although the colors were similar, the texture of the water in the original otter photo is a lot smoother than the texture of the water in the original lake photo.
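The full blending procedure, combining the Laplacian stacks of the two images under a Gaussian stack of the mask and summing the blended bands, can be sketched as follows. All parameter values are hypothetical; the mask here is a simple half-and-half split rather than an irregular one.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend(im1, im2, mask, levels=4, sigma=2.0):
    def g_stack(x):
        # Gaussian stack: repeated blurring at full resolution
        s = [x]
        for _ in range(levels - 1):
            s.append(gaussian_filter(s[-1], sigma))
        return s

    g1, g2, gm = g_stack(im1), g_stack(im2), g_stack(mask.astype(float))
    # Laplacian stacks, with the last Gaussian level appended
    l1 = [g1[i] - g1[i + 1] for i in range(levels - 1)] + [g1[-1]]
    l2 = [g2[i] - g2[i + 1] for i in range(levels - 1)] + [g2[-1]]

    # Blend each frequency band with the matching blurred mask, then sum
    out = np.zeros_like(im1)
    for a, b, m in zip(l1, l2, gm):
        out += m * a + (1 - m) * b
    return out

im1 = np.ones((32, 32))   # stand-ins for the two photos
im2 = np.zeros((32, 32))
mask = np.zeros((32, 32))
mask[:, :16] = 1          # left half from im1, right half from im2
res = blend(im1, im2, mask)
```

Because the mask is blurred more at the low frequency levels, the seam is wide for coarse structure and narrow for fine detail, which is what hides the transition.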
Here are the masked images from different frequency bands I used to construct the blended sunset photo, step-by-step:
Earlier sunset level 3 (highest frequency)
Later sunset level 3 (highest frequency)
Earlier & later sunset level 3 (highest frequency)
Earlier sunset level 2
Later sunset level 2
Earlier & later sunset level 2
Earlier sunset level 1
Later sunset level 1
Earlier & later sunset level 1
Earlier sunset level 0 (lowest frequency)
Later sunset level 0 (lowest frequency)
Earlier & later sunset level 0 (lowest frequency)
The most interesting thing I learned through this project is how crucial (and difficult) it is to choose good images for hybrid blending, multiresolution blending, and other similar techniques. The best image pairs for hybrid blending share similar colors and brightness in corresponding parts of the image. The best image pairs for multiresolution blending are those with similar backgrounds and spatial compatibility, i.e., whether the mask singles out a particular object in one image and empty space in the other image. If images are not chosen well, then the incompatible features of the original images can detract from the hybrid or blended effect in the result image.