The goal for part one was to get us familiar with frequency analysis in the context of computational photography. Along the way we learned how to apply frequency analysis in useful ways (image sharpening, multi-resolution blending, etc.), as well as how to implement it effectively in code (Laplacian/Gaussian stacks, etc.).
|
|
Image sharpening works by high-pass filtering an image and then adding the scaled high-pass result back to the original image. This has the effect of making edges more prominent, which contributes to the result appearing 'sharper' to the human eye.
To generate the high-pass image I low-passed the image using a Gaussian filter and then subtracted that low-pass result from the original image. This operation is equivalent to convolving the original image with a Laplacian to get the high-pass output. It was a good warm-up for the rest of the project, as it helped me solve datatype-related bugs before they came up in the later (more difficult) parts of the assignment.
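The unsharp-masking step above can be sketched as follows. This is a minimal version assuming float images in [0, 1], with scipy's `gaussian_filter` standing in for whatever blur kernel was actually used; the `sigma` and `alpha` defaults are illustrative, not the project's values:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpen(img, sigma=2.0, alpha=1.0):
    """Unsharp masking: add scaled high-frequency detail back to the image.

    img: float image in [0, 1], shape (H, W) or (H, W, 3).
    sigma controls the blur; alpha scales how much detail is added back.
    """
    img = img.astype(np.float64)            # avoid uint8 overflow bugs
    sig = (sigma, sigma, 0)[:img.ndim]      # don't blur across color channels
    low = gaussian_filter(img, sigma=sig)   # low-pass
    high = img - low                        # high-pass (Laplacian-style response)
    return np.clip(img + alpha * high, 0.0, 1.0)
```

With `alpha=0` the image passes through unchanged, which makes a handy sanity check for the datatype bugs mentioned above.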
Hybrid images are generated by combining low frequency content from one image with high frequency content from another. When zoomed out the image looks different than it does up close, because zooming out acts as a low pass filter. The further away you are the more you can see the low passed image's content, and vice versa. Here is an example of me doing my best Mona Lisa impression. Close up I'm half-smiling, but far away I'm frowning.
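The hybrid construction can be sketched in the same vein; grayscale float inputs of the same shape are assumed, and the two sigmas (cutoffs for each image) are illustrative placeholders rather than the values used for the results below:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(im_low, im_high, sigma_low=8.0, sigma_high=4.0):
    """Combine the low frequencies of im_low with the high frequencies of im_high.

    Grayscale float images in [0, 1]; sigma_low/sigma_high set the two cutoffs.
    """
    low = gaussian_filter(im_low.astype(np.float64), sigma_low)
    im_high = im_high.astype(np.float64)
    high = im_high - gaussian_filter(im_high, sigma_high)  # high-pass residual
    return np.clip(low + high, 0.0, 1.0)
```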
|
|
|
|
|
|
|
|
As you can see in the FFT plots, the higher frequencies in the frown picture are largely thrown away before it is added back to make the final result. On the other hand, the smile image keeps its higher frequencies but loses some low frequencies. I found I had to blur the frown heavily and leave in most of the smile's frequencies to get this effect, which is visible in the frequency-domain plots.
Here is another example of hybrid blending. Again I tried to combine a smile and frown picture!
|
|
|
I tried to get ambitious and make everyone's favorite Trump/Cheeto meme. Unfortunately, a high-passed Cheeto doesn't look like much of anything (the rough texture reads as noise), so the operation fails!
|
|
|
The next part of the project requires implementation of Laplacian and Gaussian stacks: a Gaussian stack is a set of progressively blurred images (unlike a pyramid, no downsampling happens between levels), and a Laplacian stack is a stack of band-pass images that shows the original image's content at various frequencies.
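Both stacks can be sketched in a few lines. This is a minimal version assuming grayscale float images, with a fixed per-level sigma (the level count and sigma are illustrative choices, not necessarily the ones used here):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(img, levels=5, sigma=2.0):
    """Progressively blurred copies of img, all kept at full resolution
    (a stack, unlike a pyramid, never downsamples)."""
    stack = [img.astype(np.float64)]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(img, levels=5, sigma=2.0):
    """Band-pass layers: differences of adjacent Gaussian levels, with the
    final (most blurred) level appended so the stack sums back to img."""
    g = gaussian_stack(img, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]
```

A useful property: summing all levels of the Laplacian stack telescopes back to the original image, which is exactly what makes the blending in the next section work.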
I applied my Laplacian stack to analyze my hybrid picture from the last section. Notice how the higher frequency bands show me smiling, but I'm frowning in the lowest band image?
|
|
|
After analyzing my hybrid image, I used the Laplacian stack to analyze a Salvador Dali painting. Note how the lower frequencies mostly contain the woman and the man (in the background), while the other figures are more obvious in the higher frequency bands. In particular, the bottle clearly transforms from a bottle in the high frequency content into an earring in the low frequency content.
|
|
|
The goal of multi-resolution blending is to fuse two images together by blending them gradually across scales, instead of just sticking them together. The gradual transition is necessary because without it the seam is quite harsh, which detracts from the believability of the final results.
The actual blending requires two source images and a mask. The mask is smoothed using a Gaussian stack, and each of the images is decomposed into its own Laplacian stack. A new Laplacian stack is generated by multiplying the Gaussian stack with the Laplacian stack of the first image, and adding the result to (one minus the Gaussian stack) multiplied with the Laplacian stack of the second. Once this stack is created, the blended image is reconstructed by summing up its levels.
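The procedure above can be sketched as one self-contained function. This is a minimal version assuming grayscale float images in [0, 1] with a binary mask of the same shape; the stack construction is inlined, and the level count and sigma are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multires_blend(im1, im2, mask, levels=5, sigma=2.0):
    """Multi-resolution blend: im1 where mask==1, im2 where mask==0,
    combined level by level so the seam softens at every scale."""
    def gstack(x):
        s = [x.astype(np.float64)]
        for _ in range(levels - 1):
            s.append(gaussian_filter(s[-1], sigma))
        return s

    def lstack(x):
        g = gstack(x)
        return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

    gm = gstack(mask.astype(np.float64))    # progressively softer mask
    l1, l2 = lstack(im1), lstack(im2)
    # Combine each band, then sum the new Laplacian stack back into an image.
    out = sum(g * a + (1 - g) * b for g, a, b in zip(gm, l1, l2))
    return np.clip(out, 0.0, 1.0)
```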
Here is the example result of blending an apple with an orange to make an orapple!
|
|
|
Another example of multi-resolution blending. I used an irregular mask to blend a lion in the eucalyptus groves by VLSB! While the transition is much gentler, the change in colors from original grass to new grass still looks off-putting. I investigate improving this blend in part 2 of the project!
|
|
|
|
In this next blend I combine the skyline of San Francisco with a picture of Sather Tower. The smooth transition between the towers is a particularly cool effect!
|
|
|
The next part of the project seeks to give us some practice with gradient domain processing. The human eye is perceptually more sensitive to gradients (changes in color) than to the actual colors themselves; in a sense, it's more important to get the edges/changes right in an image than to present the exact color to the eye. Gradient domain fusion exploits this property to create seamless blends between two images. Instead of copying pixels from source to target, gradient domain blending sets up a system of equations that codifies our dual desires: match the gradients of the source image, and match the colors of the target image. The intuition is that a good seamless blend avoids creating an edge where source and target meet by keeping the colors roughly similar, while preserving the gradients of the source image so its content carries over into the final image. The system is solved in least-squares fashion (minimizing squared error) on a per-channel basis, and its solution replaces the pixels in the target image where the source is to be copied.
Here we demonstrate gradient fusion by solving a (relatively) simple set of equations. The constraints codify that the new image's top-left pixel should match the original image's top-left pixel, and that the new image should have the same gradients (in either direction) as the old image. The algorithm successfully reconstructs the given grayscale toy image, which is good news for the next part of the project!
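The toy solve can be sketched like this, assuming a small grayscale float image; the looped sparse construction is written for clarity rather than speed, and is not the project's actual implementation:

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def reconstruct(img):
    """Rebuild a grayscale image from its x/y gradients plus one pixel anchor.

    One equation per horizontal gradient, one per vertical gradient, and one
    pinning the top-left pixel; solved in least squares with lsqr.
    """
    h, w = img.shape
    n_eq = h * (w - 1) + (h - 1) * w + 1
    A = lil_matrix((n_eq, h * w))
    b = np.zeros(n_eq)
    idx = lambda y, x: y * w + x            # flatten (row, col) -> variable index
    e = 0
    for y in range(h):                      # horizontal gradient constraints
        for x in range(w - 1):
            A[e, idx(y, x + 1)] = 1
            A[e, idx(y, x)] = -1
            b[e] = img[y, x + 1] - img[y, x]
            e += 1
    for y in range(h - 1):                  # vertical gradient constraints
        for x in range(w):
            A[e, idx(y + 1, x)] = 1
            A[e, idx(y, x)] = -1
            b[e] = img[y + 1, x] - img[y, x]
            e += 1
    A[e, idx(0, 0)] = 1                     # pin the top-left pixel
    b[e] = img[0, 0]
    v = lsqr(A.tocsr(), b)[0]
    return v.reshape(h, w)
```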
|
|
The algorithm works as follows: the user provides a source picture, a target picture, and a binary mask denoting the region of interest in the source image. A GUI then prompts the user to click where in the target image the masked source should be blended. Finally, the optimization is set up and solved, and the Poisson-blended result is output.
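The optimization in the last step can be sketched like this. This is a minimal per-channel version assuming the source has already been aligned with the target (same shape); it uses a slow looped sparse construction and scipy's least-squares solver, not the actual implementation:

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def poisson_blend(source, target, mask):
    """Poisson-blend `source` into `target` where mask==True (one channel).

    Each masked pixel asks its 4-neighbor gradients to match the source's;
    neighbors outside the mask pin the solution to the target's colors.
    """
    h, w = target.shape
    pix = np.flatnonzero(mask.ravel())          # masked pixel indices
    col = {p: i for i, p in enumerate(pix)}     # pixel index -> unknown index
    A = lil_matrix((4 * len(pix), len(pix)))
    b = np.zeros(4 * len(pix))
    s, t = source.ravel(), target.ravel()
    e = 0
    for p in pix:
        y, x = divmod(p, w)
        for q in (p - 1, p + 1, p - w, p + w):  # 4-neighbors
            qy, qx = divmod(q, w)
            if abs(qy - y) + abs(qx - x) != 1 or not (0 <= qy < h):
                continue                         # off the image (or row wrap)
            A[e, col[p]] = 1
            b[e] = s[p] - s[q]                   # match the source gradient
            if q in col:
                A[e, col[q]] = -1                # neighbor is also unknown
            else:
                b[e] += t[q]                     # boundary: use target color
            e += 1
    v = lsqr(A[:e].tocsr(), b[:e])[0]
    out = t.copy()
    out[pix] = v
    return out.reshape(h, w)
```

If source and target agree everywhere, the solution inside the mask is just the target itself, which makes a convenient correctness check.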
To show off the algorithm, I present a picture of an error quarter (found on eBay) blended onto a picture of my desk. For comparison, I also copied the source pixels directly onto the target using just the mask.
|
|
|
|
Here is the same lion blend that I tried earlier with multi-resolution blending, except now done with the Poisson blending algorithm. In this case Poisson blending performs noticeably better, because the color of the background grass in the source contrasts far less in the Poisson blending result than in the multi-resolution blending result.
|
|
In general, Poisson blending will work better than multi-resolution blending when the gradient content matters but the exact pixel values are (relatively) unimportant. Multi-resolution blending will work better when a gradual change can successfully mask the seam, and when you don't want the pixel values of the source image to change much.
Here is another cool blending result! Did you know wampas live in random caves in Wyoming? I didn't either until Poisson blending...
Finally, here is an example of a Poisson blending failure case. I tried to blend Sather Tower into the San Francisco skyline, but the sky in San Francisco was very bright when I took the picture. Poisson blending tried to match the target image's pixel values in this bright area, which naturally produced a very bright Sather Tower blend. One can still make out the gradients, but the colors are extremely washed out.
Intuitively, buildings should form a seam against the sky, so Poisson blending's drive to match the background color really backfires in this case. I would've been better off using multi-resolution blending for this one!
|
|
|