CS 194-26 Proj 2

Kaushal Partani

1.1 Fun With Filters

In this first section, we show the partial derivatives in x and y of the cameraman image by convolving it with the finite difference operators Dx = [1, -1] and Dy = [1, -1]ᵀ (the same filter as a vertical stack). The following images show the derivatives in x and y.

[Figure: Partial derivative in X]

[Figure: Partial derivative in Y]
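Concretely, a minimal sketch of these convolutions (the helper name and the symmetric boundary handling are my own choices):

```python
import numpy as np
from scipy.signal import convolve2d

def partials(im):
    # Finite-difference partial derivatives of a grayscale image in [0, 1].
    Dx = np.array([[1, -1]])    # horizontal finite difference operator
    Dy = np.array([[1], [-1]])  # the same operator as a vertical stack
    dx = convolve2d(im, Dx, mode='same', boundary='symm')
    dy = convolve2d(im, Dy, mode='same', boundary='symm')
    return dx, dy
```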

From here, we compute the gradient magnitude image using the formula magnitude = (dx**2 + dy**2) ** 0.5. I also binarized the result to remove some noise, using a threshold of 0.25 for the gradient image.

[Figure: Binarized gradient image]

To summarize the gradient magnitude computation: we obtain dx and dy by convolving the image with the finite difference operators, which reveals the directional edges, i.e. the changes in the x and y directions. We then take the elementwise magnitude of the x and y edge images to get the gradient magnitude. Finally, we binarize with a threshold to remove noise, rounding each pixel up or down based on its value relative to the threshold.
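A sketch of the magnitude and binarize helpers described above (the function names are mine; 0.25 is the threshold quoted for the gradient image):

```python
import numpy as np

def gradient_magnitude(dx, dy):
    # Elementwise magnitude of the x and y edge images.
    return np.sqrt(dx ** 2 + dy ** 2)

def binarize(im, threshold=0.25):
    # Round each pixel up to 1 or down to 0 relative to the threshold.
    return (im >= threshold).astype(float)
```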

1.2 Derivative of Gaussian Filter

I first blurred the image by convolving it with a Gaussian, then took the partials in x and y.

[Figure: Blurred image]

[Figures: X and Y partials of the blurred image]

[Figure: Binarized blurred gradient]

Alternatively, we can create the derivative of Gaussian (DoG) filters before touching the image, by convolving the Gaussian with Dx and Dy. This gives the same result but is computationally cheaper: the extra convolution happens on the tiny kernel, and the image itself needs only one convolution per direction instead of a blur followed by a derivative.

[Figures: Binarized image via the cheaper DoG convolution]
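A sketch of building the DoG filters up front, assuming cv2.getGaussianKernel for the 1D Gaussian; the kernel size and sigma here are illustrative, not necessarily the values used for the figures:

```python
import numpy as np
import cv2
from scipy.signal import convolve2d

def dog_filters(ksize=9, sigma=1.5):
    g1d = cv2.getGaussianKernel(ksize, sigma)  # 1D Gaussian (column vector)
    G = np.outer(g1d, g1d)                     # 2D Gaussian kernel
    Dx = np.array([[1, -1]])
    Dy = np.array([[1], [-1]])
    # Differentiating the tiny kernel is cheap; the image then needs only
    # one convolution per direction instead of a blur plus a derivative.
    return convolve2d(G, Dx), convolve2d(G, Dy)
```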

1.3 Image Straightening

To straighten images, we build on our use of edges from the previous part. This time we rotate the image through a range of candidate angles and, at each angle, take the partials in x and y and count the edge orientations in both directions. The angle with the strongest count of horizontal and vertical edges is the "straightened" angle.
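A sketch of the per-angle scoring, with an illustrative crop, edge threshold, and angle range (facade_gray in the usage comment is a hypothetical name for the loaded grayscale image):

```python
import numpy as np
from scipy.ndimage import rotate
from scipy.signal import convolve2d

def edge_score(im, angle, tol=np.deg2rad(2)):
    # Rotate, crop away rotation artifacts at the border, and count
    # gradient orientations that are nearly horizontal or vertical.
    rot = rotate(im, angle, reshape=False)
    h, w = rot.shape
    rot = rot[h // 4: 3 * h // 4, w // 4: 3 * w // 4]
    dx = convolve2d(rot, np.array([[1, -1]]), mode='same', boundary='symm')
    dy = convolve2d(rot, np.array([[1], [-1]]), mode='same', boundary='symm')
    strong = np.sqrt(dx ** 2 + dy ** 2) > 0.1   # keep only real edges
    theta = np.arctan2(dy, dx)[strong]
    targets = np.array([-np.pi, -np.pi / 2, 0, np.pi / 2, np.pi])
    return np.sum(np.min(np.abs(theta[:, None] - targets), axis=1) < tol)

# e.g. best_angle = max(range(-10, 11), key=lambda a: edge_score(facade_gray, a))
```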

[Figure: Facade]

The following chart depicts the total number of "good" edges (horizontal or vertical, within a small error boundary) per angle. [Figure: edge counts per angle]

We see that the -3 degree angle gives us the best count of vertical and horizontal edges, so rotating the facade image by -3 degrees yields this result: a straightened image! [Figure: straightened facade]

Keyboard

I took a slightly crooked picture of my keyboard and tried to rotate it. The angle with the most vertical and horizontal edges should be 4 degrees, as seen in the bar chart. When we apply the rotation, we can see that the keys are straightened out!

[Figure: Crooked keyboard]

[Figure: Edge counts per angle]

[Figure: Straightened keyboard]

Failure case: crooked house

I tried the same method on the following image: [Figure: crooked house]. To the human eye, this brick pattern is only slightly crooked, so we hope that our straightening algorithm can fix it.

However, here is the angle chart: [Figure: edge counts per angle]

This results in a rotation of around -6 degrees. The corresponding image is the following: [Figure: incorrectly straightened house]

I believe this rotation fails because the diagonal patterns in the brick confuse the algorithm. Since we look specifically for horizontal and vertical edges, the diagonal edges are not picked up, and what the human eye sees as a fairly straight pattern is interpreted by the algorithm as an extremely crooked image.

2.1 Image Sharpening

In this part, we sharpen images by boosting their high frequencies, scaled by some alpha value.
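A sketch of this unsharp-mask step, assuming a 2D Gaussian kernel G as in part 1.2 and an image scaled to [0, 1]; the alpha default is a placeholder:

```python
import numpy as np
from scipy.signal import convolve2d

def sharpen(im, G, alpha=1.0):
    # High frequencies are whatever the low-pass (Gaussian) filter removes.
    low = convolve2d(im, G, mode='same', boundary='symm')
    high = im - low
    return np.clip(im + alpha * high, 0, 1)
```

For a color image, the same operation can be applied to each channel separately.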

Here is the provided image of the Taj Mahal: [Figure]

And here is the sharpened version: [Figure]

What happens if we blur and then resharpen an image? Can we get the original image back? Here is the blurred Taj: [Figure]

And here is the resharpened blurred Taj: [Figure]

Compared to the original above, this Taj is definitely blurrier than we would hope. This is because the high frequencies were already lost in the original blurring, since the Gaussian is a low-pass filter. We are therefore boosting the high frequencies of a low-passed image that no longer contains the original high frequencies. The result is less blurry than the blurred version, but not original quality.

Dog, original image: [Figure]

Dog, sharpened image: [Figure]

Bricks, original image: [Figure]

Bricks, sharpened image: [Figure]

2.2 Hybrid Images

Hybrid images look like one image up close and like another image from far away. We achieve this by taking the high frequencies of one image and the low frequencies of another and combining them, given that the two are aligned.
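A sketch, assuming aligned grayscale inputs in [0, 1] and two Gaussian kernels chosen for the low and high cutoff frequencies:

```python
import numpy as np
from scipy.signal import convolve2d

def hybrid(im_low, im_high, G_low, G_high):
    # Low frequencies of one image plus high frequencies of the other.
    low = convolve2d(im_low, G_low, mode='same', boundary='symm')
    high = im_high - convolve2d(im_high, G_high, mode='same', boundary='symm')
    return np.clip(low + high, 0, 1)
```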

Derek and Nutmeg: [Figures]

Jitendra Abbeel: [Figures]

Coke and pencil (failure case): [Figures]

This is a failure case due to the mismatch in shape sizes. It would be cool to see a Coke-sized pencil or a pencil-sized Coke bottle, but since the Coke bottle is so much larger than the pencil, it overwhelms the image and creates a very clear outline regardless of whether you're close or far away.

Fourier analysis: [Figures]

From left to right we see the following: the Fourier transform of the original Nutmeg, the Fourier transform of the low-passed Nutmeg, the Fourier transform of Derek, and the Fourier transform of the high-passed Derek.

Here is the Fourier transform of the combined image: [Figure]

In this combined Fourier transform, we see elements of both the high-passed and low-passed images. In particular, the final image retains the stark diagonal lines from the Nutmeg image, and from the Derek image, the stark verticals and horizontals.
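These spectra can be produced from the log magnitude of the centered 2D FFT; a minimal sketch (the small epsilon is my own guard against log of zero):

```python
import numpy as np
import matplotlib.pyplot as plt

def show_spectrum(im):
    # Log magnitude of the centered 2D Fourier transform of a grayscale image.
    spectrum = np.log(np.abs(np.fft.fftshift(np.fft.fft2(im))) + 1e-8)
    plt.imshow(spectrum, cmap='gray')
    plt.axis('off')
    plt.show()
```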

2.3 Gaussian and Laplacian Stacks

The Gaussian is a low-pass filter, and if we repeatedly apply it to the same image, we get a blurrier and blurrier image. The Laplacian stack captures the opposite: the high frequencies. When we observe the change from one Gaussian level to the next, we are essentially isolating the detail that the blur removed at each step. Thus, laplacian[i] = gaussian[i] - gaussian[i+1].
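A sketch of both stacks using this relationship (G is a 2D Gaussian kernel as before; the depth is arbitrary):

```python
from scipy.signal import convolve2d

def gaussian_stack(im, G, depth=5):
    stack = [im]
    for _ in range(depth):
        stack.append(convolve2d(stack[-1], G, mode='same', boundary='symm'))
    return stack

def laplacian_stack(im, G, depth=5):
    g = gaussian_stack(im, G, depth)
    # laplacian[i] = gaussian[i] - gaussian[i+1]; the final Gaussian level is
    # appended so that the whole stack sums back to the original image.
    return [g[i] - g[i + 1] for i in range(depth)] + [g[-1]]
```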

[Figure: Lincoln]

Lincoln's Laplacian and Gaussian stacks: [Figure]

Note that I have appended the last Gaussian level to the end of the Laplacian stack because it is needed for the final part: with it, the stack sums back to the original image.

[Figure: Mona Lisa]

Mona Lisa's Laplacian and Gaussian stacks: [Figure]

One cool application of these stacks is to our earlier hybrid images. Let's take Derek and Nutmeg, for example: [Figure]

In this form, the Laplacian and Gaussian stacks let us see either Nutmeg or Derek quite clearly at different levels. For example, in the bottom right we can see more of Derek than Nutmeg, while in the top left we see more of Nutmeg than Derek.

2.4 Blended Images (Bells and Whistles included)

We can now make blended images using these stacks! We create a mask that takes the value 1 in the areas where we want image 1 to show up, and the value 0 in the areas where we want image 2 to show up. The blending occurs via this formula:

I = sum(laplacian1[i]*maskGaussian[i] + laplacian2[i]*(1-maskGaussian[i]))
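A sketch of that formula, reusing the stack helpers from part 2.3 and assuming the mask is a float image in [0, 1]:

```python
import numpy as np

def blend(im1, im2, mask, G, depth=5):
    L1 = laplacian_stack(im1, G, depth)
    L2 = laplacian_stack(im2, G, depth)
    GM = gaussian_stack(mask, G, depth)  # same length as the Laplacian stacks
    # Blend each frequency band under the progressively blurred mask, then sum.
    return np.clip(sum(l1 * gm + l2 * (1 - gm)
                       for l1, l2, gm in zip(L1, L2, GM)), 0, 1)
```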

Orapple: [Figures]

King and Queen of Hearts: [Figures]

Irregular mask, Mountain and Night Sky: [Figures]

Deep dive for the irregular mask: [Figure]

This is the Laplacian stack for the night sky. I've scaled the values to make them a bit clearer, since the images are small.

[Figure]

This is the Laplacian stack for the mountains. I've scaled the values to make them a bit clearer, since the images are small.

[Figure]

This is the Gaussian stack for the mask.

The mask changes a lot about the images, which is particularly visible in the night sky image.

[Figure]

This is the Laplacian stack for the mountain image with the mask applied.

[Figure]

This is the Laplacian stack for the night sky image with the mask applied.

We can clearly see that no data from the mountain image bleeds into the top of the picture, and no data from the night sky image bleeds into the bottom!

[Figure]

Finally, this is the set of images that we sum to get our final combined result. In these images, we see both the night sky and the mountains together.

What's the coolest thing I learned from this assignment?

This assignment really helped me understand frequencies in images. I didn't know that the human eye can't differentiate frequencies well at different viewing distances. The hybrid images were really cool, and learning that one can combine similarly shaped images to create these double images was very interesting!
