Project 2 CS294

Emaad Khwaja

Part 1: Fun with Filters

All figures were output in high resolution. Zoom in to see details.

Part 1.1: Finite Difference Operator

The results of the finite difference operator applied to the cameraman image are displayed below.

The horizontal and vertical edge maps are calculated by convolving the cameraman image with $D_y$ and $D_x$, respectively. The magnitude image is calculated by $\sqrt{Image_{D_x}^2 + Image_{D_y}^2}$. A threshold of $\mu + \frac{\sigma}{2}$ is set to generate the final image: values below the threshold are set to 0 and those above are set to 1.
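The pipeline above can be sketched in a few lines of scipy. This is a minimal illustration, not the exact notebook code: the kernels $D_x = [1, -1]$, $D_y = [1, -1]^T$ and the $\mu + \sigma/2$ threshold follow the description, but the function name and boundary handling are assumptions.

```python
import numpy as np
from scipy.signal import convolve2d

def edge_map(image, thresh_scale=0.5):
    """Binarized gradient magnitude via finite differences (threshold mu + 0.5*sigma)."""
    Dx = np.array([[1, -1]])    # horizontal finite difference
    Dy = np.array([[1], [-1]])  # vertical finite difference
    gx = convolve2d(image, Dx, mode="same", boundary="symm")
    gy = convolve2d(image, Dy, mode="same", boundary="symm")
    mag = np.sqrt(gx**2 + gy**2)
    thresh = mag.mean() + thresh_scale * mag.std()
    return (mag > thresh).astype(float)
```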

Part 1.2: Derivative of Gaussian (DoG) Filter

The Gaussian filter was constructed with $ksize = 10$ pixels and $\sigma = 2$. This acts as a low-pass filter, removing the high-frequency components of the image. Since noise typically differs sharply from its surrounding pixels, it shows up as high-frequency content in the frequency domain. The threshold was set slightly higher here, at $\mu + 1.25\sigma$.

The edges appear thicker than in the unblurred example. This is because blurring spreads each edge across its neighboring pixels before the difference operator is applied, widening the detected edge. The larger the filter, the thicker the edges appear.

We observe the same result when we apply the finite difference filter to the image convolved with the Gaussian filter as when we convolve the original image with the Derivative of Gaussian filter directly. This demonstrates the associative property of convolution: $(I * G) * D = I * (G * D)$.
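This equivalence is easy to verify numerically. A sketch, assuming the same parameters as above ($ksize = 10$, $\sigma = 2$) on a random test image; the kernel-builder helper is illustrative:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    """Separable 2D Gaussian, normalized to sum to 1."""
    ax = np.arange(ksize) - (ksize - 1) / 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

rng = np.random.default_rng(0)
img = rng.random((64, 64))
G = gaussian_kernel(10, 2)
Dx = np.array([[1.0, -1.0]])

# Route 1: blur first, then take the finite difference.
a = convolve2d(convolve2d(img, G, mode="full"), Dx, mode="full")
# Route 2: build the DoG filter first, then convolve once.
DoG = convolve2d(G, Dx, mode="full")
b = convolve2d(img, DoG, mode="full")

print(np.allclose(a, b))  # prints True: convolution is associative
```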

Part 1.3: Image Straightening

To straighten the image, it is first downscaled until its minimum dimension is 254 pixels. A Gaussian filter is then applied with $ksize = 15$ and $\sigma = 3$; this removes some noise while preserving the main edges. Next, the image is rotated through angles from -67.5 to 67.5 degrees (restricting the range avoids orientation ambiguities where the image ends up on its side). At each rotation, the gradient direction at each pixel is calculated via $\theta = \tan^{-1}\!\left(\frac{\partial f}{\partial y} / \frac{\partial f}{\partial x}\right)$. The angle (in radians) is rounded to 2 decimal places (a resolution of less than $1^{\circ}$), and the proportion of angles corresponding to $-180^{\circ}, -90^{\circ}, 0^{\circ}, 90^{\circ}, 180^{\circ}$ relative to the total number of pixels in the rotated image is calculated. This accounts for every situation where $\frac{\partial f}{\partial y}$, $\frac{\partial f}{\partial x}$, or both equal 0. These angles correspond to horizontal or vertical edges. Since our world is largely dictated by gravity, it makes sense to align an image against these axes.
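The scoring step can be sketched as follows. This is a simplified reconstruction, assuming scipy's `gaussian_filter` and `rotate`; the $\sigma = 3$ blur and the 2-decimal rounding follow the description, while the function names, the 0.01 rad tolerance, and the coarse 2.5-degree search step are assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, rotate

def straightness_score(image, tol=0.01):
    # Fraction of gradient directions (rounded to 2 decimals, as above)
    # that land on -180, -90, 0, or 90/180 degrees.
    gy, gx = np.gradient(gaussian_filter(image, sigma=3))
    theta = np.round(np.arctan2(gy, gx), 2)
    targets = np.array([-np.pi, -np.pi / 2, 0.0, np.pi / 2, np.pi])
    hits = np.any(np.abs(theta[..., None] - targets) < tol, axis=-1)
    return hits.mean()

def straighten(image, angles=np.arange(-67.5, 68.0, 2.5)):
    # Keep the rotation whose gradient directions are most axis-aligned.
    best = max(angles, key=lambda a: straightness_score(
        rotate(image, a, reshape=False, mode="nearest")))
    return rotate(image, best, reshape=False, mode="nearest")
```

Axis-aligned structure scores near 1, while a tilted copy of the same structure scores much lower, which is what drives the search.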

Below are some demonstrations of the rotation method in action. On the bottom row histograms, the red bars correspond to the key angles mentioned above.

The hotel performs well, as does a rotated confirmation dialog, which was expected to be an easy case. The next photo, a pre-rotated photo from my wedding, was surprisingly also aligned correctly despite the lack of background structure; the clothing must have provided enough orientation information. The photo of the Painted Ladies was also rotated such that the houses, and not the sloped ground, are upright.

The last photo is an example of where this method has no chance. The Harry Styles album cover is taken with a fisheye lens, and therefore there are no straight lines.

It is interesting to note that the histogram does not simply shift after rotation. There are major peaks lost and some curves become more jagged. This is likely due to artifacts from cropping out regions of the image and interpolation.


Part 2: Fun with Frequencies!

Part 2.1: Image "Sharpening"

The first row shows the original Taj Mahal image, the Gaussian filter applied to it, the extracted edges, and the original image added to these edges (scaled by a coefficient $\alpha$). The second row shows identical results, but this time using a single filter, dubbed the "unsharp mask filter," obtained by subtracting the same Gaussian filter used to blur in the first row from the unit impulse. The bottom row shows this being applied to a pre-blurred image.

As you can see, although the image quality subjectively improves with the addition of the edges, in no way does it actually approach the original high quality image.
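A minimal sketch of the sharpening operation, assuming scipy's `gaussian_filter`; the function name and the default $\sigma$ and $\alpha$ values are illustrative, not the report's exact parameters:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(image, sigma=2.0, alpha=1.0):
    """Sharpen by adding back alpha times the high frequencies."""
    low = gaussian_filter(image, sigma)
    high = image - low           # extracted edges / detail
    return image + alpha * high  # equivalent to (1 + alpha)*I - alpha*(G * I)
```

The characteristic over- and undershoot around edges is what makes the result *look* sharper, even though no lost high-frequency information is actually recovered.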

Part 2.2: Hybrid Images

Below are some hybrid photos, blended by taking the high-frequency components of one image and combining them with the low-frequency components of another. High-frequency components are obtained via the LoG filter, while low-frequency components are obtained via the Gaussian filter.
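The blend reduces to one low-pass and one high-pass. A sketch under assumed parameters (the two cutoff sigmas are tuned per image pair in practice; the function name and values here are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(im_low, im_high, sigma_low=6.0, sigma_high=3.0):
    """Low frequencies of im_low plus high frequencies of im_high."""
    low = gaussian_filter(im_low, sigma_low)
    high = im_high - gaussian_filter(im_high, sigma_high)  # impulse minus Gaussian
    return np.clip(low + high, 0.0, 1.0)
```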

The CatMan image shows a cat up close and a man far away. This one worked reasonably well.

The Jobs photo shows a young Steve from up close, and an older one from far away. Again, this also worked well, mostly because of the same pose and overlapping facial features.

Muskla does not blend well, although the Tesla is only visible from afar. This is a failure in the sense that it is not a convincing blend, purely because of how radically different the two subjects are.


The Deceptive Oreo image is another failure. From afar, this image appears to be a regular Oreo; up close, however, the title reveals that it is in fact a mini Oreo! The Oreo itself does not look much different at either viewing distance. This is because the compositional match is so close that the low-frequency components of the large Oreo fit extremely well with the high-frequency components of the small one. The actual features of the large Oreo, however, are completely gone: that portion of the image contained ONLY high-frequency components, which were removed by the Gaussian filter.

Extended Images with Accompanying Fourier Domain Plots

Part 2.3: Gaussian and Laplacian Stacks

Gaussian and Laplacian stacks are shown. Stacks were generated via the methods discussed in the supplied paper. A $\sigma = 5$ was used for the Gaussian filters. At the bottom is the Jobs result from 2.2.
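A sketch of stack construction, assuming the $\sigma = 5$ repeated blur described above (function name and level count are illustrative). Unlike a pyramid, no downsampling occurs, and summing the Laplacian stack reconstructs the original image exactly:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def stacks(image, levels=5, sigma=5.0):
    """Gaussian stack (repeated blurs, no downsampling) and its Laplacian stack."""
    g = [image]
    for _ in range(levels - 1):
        g.append(gaussian_filter(g[-1], sigma))
    # Each Laplacian level is the detail lost between adjacent Gaussian levels;
    # the final level keeps the residual low frequencies.
    l = [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]
    return g, l
```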

You can observe that as the blurring increases, the minor high-frequency details start to disappear, such as the extreme strokes in the first images and the cracking of the Mona Lisa. Only the low-frequency components remain at the end, namely the figures of the bodies. In the Jobs example, since the old photo was selected as the low-frequency component, only this image remains visible.

Part 2.4: Multiresolution Blending (a.k.a. the oraple!)

Below is the multi-resolution blend. At each level, one image's Laplacian stack is multiplied by a blurred mask and the other's by the inverse mask $(1 - mask)$, and the products are summed across levels. The results here are generally good. Slight artifacting can be seen on the soccer ball, due to the LoG operator acting on black pixels.
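The level-wise blend can be sketched as follows; a simplified reconstruction, assuming the same $\sigma = 5$ stacks as in 2.3, with the mask softened by its own Gaussian stack (function names and level count are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend(im1, im2, mask, levels=5, sigma=5.0):
    """Blend Laplacian stacks of im1/im2 weighted by a Gaussian stack of the mask."""
    def gstack(x):
        out = [x]
        for _ in range(levels - 1):
            out.append(gaussian_filter(out[-1], sigma))
        return out

    def lstack(x):
        g = gstack(x)
        return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

    m = gstack(mask.astype(float))
    # Per level: mask * im1 detail + (1 - mask) * im2 detail, summed over levels.
    return sum(mk * a + (1 - mk) * b
               for mk, a, b in zip(m, lstack(im1), lstack(im2)))
```

Blurring the mask more at coarser levels is what hides the seam: low frequencies transition gradually while fine detail switches over sharply.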
