First, show the partial derivatives in x and y of the cameraman image by convolving the image with the finite difference operators D_x and D_y.
|
|
Now compute and show the gradient magnitude image.
To turn this into an edge image, let's binarize the gradient magnitude image by picking the appropriate threshold.
The threshold that worked best for me was 0.2.
Include a brief description of gradient magnitude computation.
The gradient magnitude at a point (x, y) can be computed by taking the partial derivative in each direction, giving the gradient vector (grad_x, grad_y), and then taking the magnitude of this vector. We can apply this operation to the entire image at once by treating the image as a matrix, computing grad_x and grad_y as matrices by convolving with the finite difference operators, and calculating the magnitude sqrt(grad_x^2 + grad_y^2) pointwise. This yields the gradient magnitude at each point in the image.
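A minimal sketch of this computation (assuming a grayscale image with values in [0, 1]; the kernel orientation, boundary handling, and threshold default here are illustrative choices):

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference operators, as in this section.
D_x = np.array([[1.0, -1.0]])
D_y = np.array([[1.0], [-1.0]])

def gradient_magnitude(im):
    """Convolve with D_x and D_y, then take the pointwise L2 norm."""
    gx = convolve2d(im, D_x, mode="same", boundary="symm")
    gy = convolve2d(im, D_y, mode="same", boundary="symm")
    return np.sqrt(gx**2 + gy**2)

def edge_image(im, threshold=0.2):
    """Binarize the gradient magnitude (0.2 worked best for this image)."""
    return (gradient_magnitude(im) > threshold).astype(float)
```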
Create a blurred version of the original image by convolving with a Gaussian, and repeat the procedure from the previous part.
|
|
What differences do you see?
The Gaussian blur smooths the image and reduces fine-grained noise, weakening insignificant edges to produce a clearer edge detection. Since the blur spreads intensity across the entire image, another side effect is that the edges in the binarized image appear thicker and more solid.
Convolve the Gaussian with D_x and D_y and display the resulting DoG filters as images. Verify that you get the same result as before.
|
|
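The verification rests on the associativity of convolution: blurring and then differentiating equals convolving once with a derivative-of-Gaussian filter. A sketch of that check (kernel size and sigma are illustrative values):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize=9, sigma=1.5):
    """2-D Gaussian built as the outer product of a normalized 1-D Gaussian."""
    x = np.arange(ksize) - ksize // 2
    g1 = np.exp(-x**2 / (2 * sigma**2))
    g1 /= g1.sum()
    return np.outer(g1, g1)

D_x = np.array([[1.0, -1.0]])
G = gaussian_kernel()
DoG_x = convolve2d(G, D_x)  # derivative-of-Gaussian filter

# Associativity: (im * G) * D_x == im * (G * D_x), so one DoG pass
# matches blur-then-difference (exactly, up to float error, in 'full' mode).
im = np.random.rand(32, 32)
two_step = convolve2d(convolve2d(im, G), D_x)
one_step = convolve2d(im, DoG_x)
```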
Show the orientation histogram and straightening result for the facade image.
|
|
|
|
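One plausible way to build the orientation histogram and score straightness is sketched below; the angle range, threshold, and "mass near 0/90/180 degrees" metric are assumptions for illustration, not necessarily the exact scoring I used:

```python
import numpy as np
from scipy.signal import convolve2d
from scipy.ndimage import rotate

def orientation_histogram(im, bins=90, mag_thresh=0.1):
    """Histogram of gradient angles over pixels with a strong gradient."""
    gx = convolve2d(im, np.array([[1.0, -1.0]]), mode="same")
    gy = convolve2d(im, np.array([[1.0], [-1.0]]), mode="same")
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180  # fold angles into [0, 180)
    hist, _ = np.histogram(ang[mag > mag_thresh], bins=bins, range=(0, 180))
    return hist

def straightness(im, tol_bins=2):
    """Fraction of edge mass near 0/90/180 degrees (axis-aligned edges)."""
    h = orientation_histogram(im)
    n = len(h)
    idx = (list(range(tol_bins))
           + list(range(n // 2 - tol_bins, n // 2 + tol_bins))
           + list(range(n - tol_bins, n)))
    return h[idx].sum() / max(h.sum(), 1)

def straighten(im, angles=np.arange(-10, 10.5, 0.5)):
    """Try candidate rotations; keep the one with the most axis-aligned edges."""
    best = max(angles, key=lambda a: straightness(rotate(im, a, reshape=False)))
    return rotate(im, best, reshape=False)
```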
Show the original image, the straightened image, and the two edge orientation histograms for at least 3 images, of which at least one should be a failure case.
Vegas:
|
|
|
|
Street (manually rotated 2.5 degrees, and confirmed that the algorithm could rotate it back):
|
|
|
|
|
|
Train (fails as the original image is already aligned, but has a noisy texture and not many straight lines):
|
|
|
|
|
Show the progression of the original image to the sharpened image for the Taj image and an image of your choice.
Taj:
|
|
Snail:
|
|
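The sharpening step is the standard unsharp mask: add back a scaled copy of the high frequencies. A minimal sketch (the sigma and alpha defaults are illustrative, not the values used for these images):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(im, sigma=2.0, alpha=1.0):
    """Sharpen as im + alpha * (im - blur(im)), clipped back to [0, 1]."""
    low = gaussian_filter(im, sigma)   # low-pass version
    high = im - low                    # high-frequency residual
    return np.clip(im + alpha * high, 0.0, 1.0)
```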
Pick a sharp image, blur it and then try to sharpen it again. Compare the original and the sharpened image and report your observation.
|
|
|
Not all of the details were recovered, since the highest frequencies were erased by the low-pass filter, so the sharpened image appears lower-resolution. However, it no longer looks clearly blurry like the intermediate blurred image, and it has a slightly higher color contrast.
Try creating 2-3 hybrid images (including at least one failure). Show the input image and hybrid result per example.
Derek + Nutmeg:
|
|
Happy + unsure:
|
|
Happy + sad (fails to look convincing from either perspective):
|
|
I think this looks bad because the two images have similar overall composition and positioning, but every individual feature differs. Since the low- and high-frequency components neither overlap completely nor separate cleanly, they interfere with each other visually instead of combining well.
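The hybrid construction itself is a low-pass of one image plus a high-pass of the other. A sketch, assuming grayscale inputs in [0, 1] (the cutoff sigmas here are illustrative, not the ones tuned per pair):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(im_far, im_near, sigma_low=6.0, sigma_high=3.0):
    """Low frequencies of im_far (seen from far away) plus
    high frequencies of im_near (seen up close)."""
    low = gaussian_filter(im_far, sigma_low)
    high = im_near - gaussian_filter(im_near, sigma_high)
    return np.clip(low + high, 0.0, 1.0)
```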
For your favorite result, show the log magnitude of the Fourier transform of the two input images, the filtered images, and the hybrid image.
Elephant + cheetah (example from paper):
|
|
|
|
|
|
Try using color to enhance the effect. Does it work better to use color for the high-frequency component, the low-frequency component, or both?
|
|
|
|
I applied different combinations of grayscale and color to the happy + unsure expression hybrid, and found that the completely grayscale version and the color low frequency + grayscale high frequency options display the effect best. Colorizing the low-frequency component restores large solid color patches that are helpful in recognizing that input image from far away. On the other hand, there isn't much use in colorizing the high frequency component since it only shows up at image edges; it may also produce slivers of color that interfere with the appearance of the image from far away (like the pink bottom lip in the high-frequency component above). If color is to be added to enhance the effect, it would be most effective to colorize the low frequency component only.
Apply your Gaussian and Laplacian stacks to one interesting image that contains structure in multiple resolutions.
Illustrate the process you took to create your hybrid images in part 2 by applying your Gaussian and Laplacian stacks to one example.
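A sketch of the stacks (no downsampling, unlike a pyramid); doubling sigma per level is one common schedule and an illustrative choice here. The Laplacian stack keeps the final Gaussian level as its last entry so the levels sum back to the input:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, levels=4, sigma=2.0):
    """Repeatedly blur without downsampling, doubling sigma each level."""
    stack = [im.astype(float)]
    for i in range(levels):
        stack.append(gaussian_filter(stack[-1], sigma * 2 ** i))
    return stack

def laplacian_stack(im, levels=4, sigma=2.0):
    """Band-pass differences plus the low-pass residual; sums back to im."""
    g = gaussian_stack(im, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels)] + [g[-1]]
```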
Pick two pairs of images to blend together with an irregular mask, as is demonstrated in figure 8 in the paper.
Apple and orange:
|
|
|
|
Hand and eye (example from paper):
|
|
|
|
Guy 1 and guy 2:
|
|
|
|
Illustrate the process by applying your Laplacian stack and displaying it for your favorite result and the masked input images that created it.
I used a 4-level stack for each blend. For the bottom two sequences, images 1-4 are Laplacian levels, which together capture all frequencies above a certain cutoff. To make the blend visually consistent, it was necessary to add a 5th component, the final level of the Gaussian stack (in other words, all frequencies below that cutoff), to fill in the missing low-frequency content from each input. This is what the 5th image represents.
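Putting the pieces together, the blend combines each Laplacian level under the correspondingly blurred mask level, with the low-frequency Gaussian residual handled as the last entry. A self-contained sketch (grayscale inputs in [0, 1]; the sigma schedule is illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, levels=4, sigma=2.0):
    """Repeatedly blur without downsampling, doubling sigma each level."""
    stack = [im.astype(float)]
    for i in range(levels):
        stack.append(gaussian_filter(stack[-1], sigma * 2 ** i))
    return stack

def laplacian_stack(im, levels=4, sigma=2.0):
    """Band-pass differences plus the low-pass residual; sums back to im."""
    g = gaussian_stack(im, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels)] + [g[-1]]

def blend(im1, im2, mask, levels=4):
    """Multiresolution blend: mix each level under the blurred mask,
    including the final low-frequency Gaussian residual as the last level."""
    la, lb = laplacian_stack(im1, levels), laplacian_stack(im2, levels)
    gm = gaussian_stack(mask, levels)
    out = sum(gm[i] * la[i] + (1.0 - gm[i]) * lb[i] for i in range(levels + 1))
    return np.clip(out, 0.0, 1.0)
```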
Try using color to enhance the effect.
I implemented the method in color, which involved extending the mask's Gaussian stack to 3D (one copy for each RGB channel). While color can enhance the effect, it can also make the blend look more unnatural if the hues of the two images along the edges of the blend mask are dissimilar. A grayscale blend, on the other hand, looks unnatural only if the brightness values, rather than the hues, are dissimilar.
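Rather than storing three explicit copies of the mask stack, the same effect can be sketched with numpy broadcasting, applying a 2-D mask level across the RGB channels of one stack level (the function name is mine, for illustration):

```python
import numpy as np

def blend_level_color(la_i, lb_i, gm_i):
    """Blend one stack level of two (H, W, 3) color images with a
    2-D mask level by broadcasting the mask over the RGB channels."""
    m = gm_i[..., None]  # shape (H, W, 1), broadcasts against (H, W, 3)
    return m * la_i + (1.0 - m) * lb_i
```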
This project was fun! My favorite part was writing efficient code for counting the number of straight lines in an image, and the task/algorithm I thought was coolest was the one for multiresolution blending. I think my most important takeaways were the advantages of frequency-domain visualization, and the commutative, associative, and distributive properties of convolutions.