Introduction
This project explores different ways of using frequencies to process and even combine images in interesting ways. For instance, an image can be sharpened by filtering and emphasizing its highest frequencies. Edges can be extracted using finite difference kernels. Hybrid images can be produced by combining the high frequencies from one image with the low frequencies of another. Lastly, images can be blended together at various frequencies using Gaussian and Laplacian stacks.
Finite Difference Operator
Approach
For each of the two partial derivatives, a finite difference kernel was created as a NumPy array: dx_kernel = np.array([[1, -1]]) and dy_kernel = np.array([[1], [-1]]). The original image was convolved with each kernel using scipy.signal.convolve2d with mode='same' to produce the two partial-derivative images. These were combined into a single edge image by computing the pixel-wise magnitude of the gradient, np.sqrt(dx_deriv ** 2 + dy_deriv ** 2). This treats the corresponding pixel values of the two partial-derivative images as the components of the gradient vector and takes its L2 norm as the final pixel value.
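The pipeline above can be sketched as follows, using a small synthetic image in place of the cameraman photo (the square image and its size are illustrative assumptions, not from the original):

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference kernels for the two partial derivatives.
dx_kernel = np.array([[1, -1]])
dy_kernel = np.array([[1], [-1]])

def gradient_magnitude(img):
    # Convolve with each finite difference kernel, then take the
    # pixel-wise L2 norm of the two partial-derivative images.
    dx_deriv = convolve2d(img, dx_kernel, mode='same')
    dy_deriv = convolve2d(img, dy_kernel, mode='same')
    return np.sqrt(dx_deriv ** 2 + dy_deriv ** 2)

# A white square on a black background: gradients appear only at its border.
img = np.zeros((8, 8))
img[2:6, 2:6] = 1.0
edges = gradient_magnitude(img)
```

Inside the square the neighboring pixels are equal, so both partial derivatives vanish there; only the border produces a nonzero gradient magnitude.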
Results
[Images: dx and dy derivative images, their binarized versions, the combined gradient magnitude, and the combined binarized edge image]
Derivative of Gaussian (DoG) Filter
Blurred Finite Difference
First, the image is blurred with a Gaussian kernel of size 10 created using cv2.getGaussianKernel. To ensure the convolution produces minimal artifacts, at least 6 standard deviations must fit inside the kernel, so I chose sigma = kernel_size / 6. The blurred image is then passed through the same finite difference function used above.
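A minimal sketch of this blur-then-difference step. The writeup builds the kernel with cv2.getGaussianKernel; here an equivalent 1D Gaussian is constructed directly in NumPy so the example is self-contained, and the test image is an assumed stand-in:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel_2d(kernel_size):
    # sigma chosen so ~6 standard deviations fit inside the kernel.
    sigma = kernel_size / 6
    x = np.arange(kernel_size) - (kernel_size - 1) / 2
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)  # separable: 2D kernel is the outer product

def blurred_gradient_magnitude(img, kernel_size=10):
    # Blur first, then apply the same finite difference kernels as before.
    blurred = convolve2d(img, gaussian_kernel_2d(kernel_size), mode='same')
    dx = convolve2d(blurred, np.array([[1, -1]]), mode='same')
    dy = convolve2d(blurred, np.array([[1], [-1]]), mode='same')
    return np.sqrt(dx ** 2 + dy ** 2)
```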
There are some noticeable differences in the final result using this approach. For one, the binarized edges are thicker and rounder than before. Also, the small bits of noise at the bottom of the image as well as the fine details inside the camera are completely gone.
[Images: dx and dy derivative images, their binarized versions, the combined gradient magnitude, and the combined binarized edge image]
Derivative of Gaussian
In this approach, the Gaussian kernels used to blur the image are instead convolved beforehand with the dx and dy finite difference kernels to produce dx_gaussian and dy_gaussian. These kernels are the partial derivatives of the Gaussian kernel with respect to x and y. The image is then convolved with dx_gaussian and dy_gaussian to produce the partial-derivative images, which are combined into a single edge image using the same approach as above.
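This can be sketched as below; by associativity of convolution, differentiating the kernel first gives the same result as blurring the image and then differencing it. The kernel construction mirrors the sketch above and is an assumed NumPy equivalent of cv2.getGaussianKernel:

```python
import numpy as np
from scipy.signal import convolve2d

def dog_kernels(kernel_size=10):
    # Build the 2D Gaussian, then differentiate the kernel itself.
    sigma = kernel_size / 6
    x = np.arange(kernel_size) - (kernel_size - 1) / 2
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    gauss_2d = np.outer(g, g)
    dx_gaussian = convolve2d(gauss_2d, np.array([[1, -1]]))
    dy_gaussian = convolve2d(gauss_2d, np.array([[1], [-1]]))
    return dx_gaussian, dy_gaussian

def dog_edges(img, kernel_size=10):
    # One convolution per direction instead of blur + difference.
    dx_g, dy_g = dog_kernels(kernel_size)
    dx = convolve2d(img, dx_g, mode='same')
    dy = convolve2d(img, dy_g, mode='same')
    return np.sqrt(dx ** 2 + dy ** 2)
```

A useful sanity check: each DoG kernel's entries sum to zero, since it is the derivative of a smoothing kernel and must give zero response on constant regions.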
[Images: dx and dy derivative images, their binarized versions, the combined gradient magnitude, and the combined binarized edge image]
Comparison
Apart from very slight differences in the length and shape of short edges caused by noise, the results from the two approaches are essentially identical.
[Images: combined binarized edges from the blurred finite difference and derivative of Gaussian approaches]
Image Sharpening
Approach
To sharpen an image target, the image is convolved with a Gaussian kernel to filter out its higher frequencies, producing a blurred image blurred. The high-frequency details are then computed as details = target - blurred, which removes all lower-frequency features from the original image. These details are emphasized in the final image via result = target + alpha * details, where alpha is a constant sharpening factor.
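This unsharp-masking step can be sketched in a few lines. scipy.ndimage.gaussian_filter stands in for the explicit Gaussian convolution, and the default sigma is an illustrative assumption:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpen(target, alpha, sigma=2.0):
    blurred = gaussian_filter(target, sigma)  # keep only low frequencies
    details = target - blurred                # isolate high frequencies
    return target + alpha * details           # re-emphasize the details
```

On a step edge, the sharpened result overshoots the original range slightly on either side of the edge, which is exactly the halo that makes edges look crisper.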
Taj Mahal
The image of the Taj Mahal was sharpened using alpha = 0.75. As shown below, the edges of the arches and tiles were emphasized, as well as the silhouettes of the trees.
[Images: original and sharpened Taj Mahal]
Natural Landscape
The natural landscape image was sharpened using alpha = 1.0. As shown below, the silhouettes of the trees as well as the ripples in the water are emphasized by the sharpening procedure.
[Images: original and sharpened landscape]
Re-sharpening a Blurred Image
The image of the house was first blurred using kernel_size = 10 and then sharpened using alpha = 1.0. As shown below, most of the edges in the house's features are emphasized, including those in its reflection in the water. The sharpening does a great job on the edges of the house and removes the "smudged" quality of the blurred image. However, because the initial blur smoothed out the finer details in the ceiling and inside the house, the sharpening was unable to recover that lost information.
[Images: blurred original and re-sharpened house]
Hybrid Images
Approach
Two images are taken as input: lo_img and hi_img. A Gaussian blur is applied to lo_img using kernel_size = 6 * lo_sigma to produce the image lo. For the higher frequencies, a Gaussian blur is applied to hi_img using kernel_size = 6 * hi_sigma to produce hi_blurred; the high frequencies are then extracted as hi = hi_img - hi_blurred. Finally, lo and hi are averaged together pixel-wise to produce the hybrid image.
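The whole pipeline fits in a few lines. As a sketch, gaussian_filter (parameterized directly by sigma) stands in for the explicit kernel of size 6 * sigma described above:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(lo_img, hi_img, lo_sigma, hi_sigma):
    lo = gaussian_filter(lo_img, lo_sigma)           # keep low frequencies
    hi = hi_img - gaussian_filter(hi_img, hi_sigma)  # keep high frequencies
    return (lo + hi) / 2                             # pixel-wise average
```

Note that for a constant hi_img the high-frequency component is zero, so the hybrid reduces to half the blurred lo_img.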
Tobey's Glasses
Blurring was performed using lo_sigma = 5 and hi_sigma = 3.
[Images: low-frequency image, high-frequency image, and the resulting hybrid image]
Mr. Incredible
Blurring was performed using lo_sigma = 6 and hi_sigma = 6.
[Images: low-frequency image, high-frequency image, and the resulting hybrid image]
For the Mr. Incredible hybrid image, Fourier transforms were applied to the original input images, the filtered images lo and hi, and the final hybrid image, producing the following graphs:
[Images: FFTs of the low-frequency input and its filtered version, the high-frequency input and its filtered version, and the hybrid image]
Love and War (Failure)
Blurring was performed using lo_sigma = 15 and hi_sigma = 3. With hybrid images composed of text, I found it difficult to balance blurring against readability. These were the best results I could achieve after a lot of fine-tuning, but even from a distance it is rather hard to decipher the blurred text. If I instead decreased the blurring on the text, it became too readable at short viewing distances and distracted from the text in the high-frequency image.
[Images: low-frequency image, high-frequency image, and the resulting hybrid image]
Gaussian and Laplacian Stacks
Approach
At every level of the Gaussian stack, instead of downsampling, the previous level is blurred with a Gaussian kernel to produce the next level, so the images are the same size across all levels. The Laplacian stack l_stack is computed from the Gaussian stack g_stack such that l_stack[:, :, i] = g_stack[:, :, i] - g_stack[:, :, i+1], accomplished with the vectorized expression l_stack = g_stack[:, :, :-1] - g_stack[:, :, 1:]. The last level of the Laplacian stack is taken directly from the last level of the Gaussian stack and stacked with the rest, so the Laplacian and Gaussian stacks end up with the same number of levels.
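The construction can be sketched as below, stacking levels along the last axis to match the indexing above (gaussian_filter and the per-level sigma are assumed stand-ins for the writeup's kernels):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(img, levels, sigma=2.0):
    # Blur repeatedly without downsampling: all levels keep the same size.
    stack = [img]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return np.stack(stack, axis=-1)

def laplacian_stack(g_stack):
    # Difference of adjacent Gaussian levels, vectorized across the stack.
    l_stack = g_stack[:, :, :-1] - g_stack[:, :, 1:]
    # Append the last Gaussian level so both stacks have equal depth.
    return np.concatenate([l_stack, g_stack[:, :, -1:]], axis=-1)
```

A nice property of this construction: the differences telescope, so summing all Laplacian levels recovers the original image exactly.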
Results
From top to bottom, the Laplacian stack images for the apple (left) and orange (right) at levels 0, 2, and 4 are shown below:
Multiresolution Blending
Approach
The input images left and right are used to generate Laplacian stacks left_l_stack and right_l_stack using the method above. A Gaussian stack mask_g_stack is generated from the mask input image. To blend the images, for each level i of the three stacks, blended[:, :, i] = (1 - mask_g_stack[:, :, i]) * left_l_stack[:, :, i] + mask_g_stack[:, :, i] * right_l_stack[:, :, i]. This is accomplished with the vectorized code blended = (1 - mask_g_stack) * left_l_stack + mask_g_stack * right_l_stack.
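Putting the pieces together, the blend can be sketched as below. The stack helpers mirror the construction described earlier (with an assumed sigma), and the final collapse sums the blended stack across levels, which is implied by the telescoping structure of the Laplacian stack:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(img, levels, sigma=2.0):
    stack = [img]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return np.stack(stack, axis=-1)

def laplacian_stack(g_stack):
    l = g_stack[:, :, :-1] - g_stack[:, :, 1:]
    return np.concatenate([l, g_stack[:, :, -1:]], axis=-1)

def blend(left, right, mask, levels=5):
    # Laplacian stacks for the two images, Gaussian stack for the mask.
    left_l = laplacian_stack(gaussian_stack(left, levels))
    right_l = laplacian_stack(gaussian_stack(right, levels))
    mask_g = gaussian_stack(mask, levels)
    # Blend each level with the progressively smoother mask,
    # then collapse the stack back into a single image.
    blended = (1 - mask_g) * left_l + mask_g * right_l
    return blended.sum(axis=-1)
```

An all-zero mask returns the left image unchanged and an all-one mask returns the right image, which is a quick correctness check on the blend.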
Watermelon Pizza
Is this better than pineapple on pizza?
[Images: the two input images, the mask, the blended result, and its Laplacian stack]
Smiling Cat
They don't always show it, but this is how they really feel when they're around you.
[Images: the two input images, the mask, and the blended result]
Roypple
My friend Roy has always dreamed of becoming an apple, so I made his wish come true.
[Images: the two input images, the mask, and the blended result]
Side-eye Mona Lisa
She's definitely judging you, peripheral vision or not.
[Images: the two input images, the mask, and the blended result]