CS 194-26 Project 3: Fun with Frequencies and Gradients!
This is the submission of Myron Liu (cs194-26-afp) for CS 194-26 Project 3.
1.1 Unsharp Masking
Unsharp masking sharpens an image by subtracting a blurred ("unsharp") copy from the original to obtain a mask of its fine detail, then adding that mask back to the original. The result is less blurry (has better-defined edges) than the original. In general, the unsharp mask is a linear or nonlinear filter that amplifies the high-frequency components of the signal (edges). For my pictures, I used a Gaussian filter to produce the blurred image and sharpened the image using the following equation:
sharpened image = original image + α × (original image - blurred image), where α is the sharpening factor
Note that increasing the sharpening factor α brightens the edges but also amplifies the noise in the image.
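The equation above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the project's actual code: the helper names `gaussian_blur` and `unsharp_mask`, the edge-replicated border handling, and the final clipping to [0, 1] are all assumptions, and the image is assumed to be a 2D greyscale float array.

```python
import numpy as np

def gaussian_blur(img, sigma, size):
    """Separable Gaussian blur of a 2D float image; `size` must be odd."""
    x = np.arange(size) - size // 2
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    k /= k.sum()  # normalize so flat regions keep their brightness
    # Replicate the border (an assumption) so the output keeps the input size.
    padded = np.pad(img, size // 2, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")

def unsharp_mask(img, sigma=5.0, size=13, alpha=5.0):
    """sharpened = original + alpha * (original - blurred), clipped to [0, 1]."""
    blurred = gaussian_blur(img, sigma, size)
    detail = img - blurred  # high-frequency content (edges and noise)
    return np.clip(img + alpha * detail, 0.0, 1.0)

# Usage on a synthetic vertical step edge.
step = np.full((20, 20), 0.25)
step[:, 10:] = 0.75
sharp = unsharp_mask(step)
```

For a color image, the same function could be applied to each channel separately. Because `detail` contains noise as well as edges, a larger `alpha` amplifies both.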
Original Mr. Krabs Before
Sharpened Mr. Krabs After (σ = 5, kernel size = 13, sharpening factor = 5)
Sharpened Mr. Krabs After (σ = 5, kernel size = 13, sharpening factor = 10)
Original Friends Before
Sharpened Friends After (σ = 5, kernel size = 13, sharpening factor = 5)
Sharpened Friends After (σ = 5, kernel size = 13, sharpening factor = 10)
1.2 Hybrid Images
To create a hybrid image, we first align the two images on some common features. We then apply a Gaussian filter to the image chosen as the low-frequency component, and obtain the high-frequency component by subtracting a Gaussian-filtered copy of the other image from itself. Adding the two results together gives the hybrid image; the frequency cutoffs are controlled by the sigma and kernel size parameters of the two Gaussian filters. This technique was described in this paper. Note that failure cases usually occur when the images have vastly different colors or very different shapes.
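The procedure described above can be sketched as follows, assuming the two images are already aligned, have the same shape, and are 2D greyscale float arrays in [0, 1]. The function names, the border handling, and the clipping are illustrative assumptions, not the author's implementation.

```python
import numpy as np

def gaussian_blur(img, sigma, size):
    """Separable Gaussian blur of a 2D float image; `size` must be odd."""
    x = np.arange(size) - size // 2
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    k /= k.sum()
    padded = np.pad(img, size // 2, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")

def hybrid_image(low_src, high_src, sigma_low, size_low, sigma_high, size_high):
    """Low frequencies of `low_src` plus high frequencies of `high_src`."""
    low = gaussian_blur(low_src, sigma_low, size_low)                 # low-pass
    high = high_src - gaussian_blur(high_src, sigma_high, size_high)  # high-pass
    return np.clip(low + high, 0.0, 1.0)

# Usage with random stand-ins for two aligned greyscale images.
rng = np.random.default_rng(0)
img_low, img_high = rng.random((32, 32)), rng.random((32, 32))
hybrid = hybrid_image(img_low, img_high, sigma_low=5, size_low=11,
                      sigma_high=30, size_high=21)
```

The two sigma values set the cutoffs independently: a larger `sigma_low` leaves less of the first image, and a larger `sigma_high` keeps more of the second.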
Fox
Panda
Greyscale Foxy Panda (High Frequency Fox - σ = 30, kernel size = 21, Low Frequency Panda, σ = 5, kernel size = 11)
Greyscale Pandy Fox (High Frequency Panda - σ = 30, kernel size = 21, Low Frequency Fox, σ = 5, kernel size = 11)
Frequency Analysis
The low-pass filter removes the high frequencies, while the high-pass filter suppresses the low frequencies so that the high frequencies dominate (see FFT graphs). The hybrid image is the sum of the two filtered images. As shown by the graphs below, it contains the low frequencies of one chosen image and the high frequencies of the other.
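Log-magnitude plots like the ones below are commonly computed from the 2D FFT. The following is a small sketch of that computation, not necessarily how the author produced the figures; the function name and the small epsilon added to avoid taking the log of zero are assumptions.

```python
import numpy as np

def log_magnitude_spectrum(gray, eps=1e-8):
    """Log magnitude of the centered 2D Fourier transform of a greyscale image."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))  # move the DC term to the center
    return np.log(np.abs(spectrum) + eps)

# Usage: a constant image has all of its energy in the DC term at the center.
flat = np.ones((8, 8))
spec = log_magnitude_spectrum(flat)
peak = np.unravel_index(np.argmax(spec), spec.shape)
```

In such a plot, low frequencies sit near the center and high frequencies toward the borders, so a low-pass filtered image shows a bright center with dark borders.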
Log Magnitude of Greyscale Fox Image
Log Magnitude of Greyscale Panda Image
Log Magnitude of Greyscale Low Frequency Filtered Fox Image
Log Magnitude of Greyscale High Frequency Filtered Panda Image
Log Magnitude of Greyscale Hybrid Pandy Fox Image
Greyscale Kermit
Greyscale Donald Trump
Greyscale Kerump (High Frequency Kermit - σ = 2, kernel size = 21, Low Frequency Trump, σ = 5, kernel size = 11)
Color Don Draper
Color Eagle
Don "Eagle" Draper (High Frequency Eagle - σ = 20, kernel size = 21, Low Frequency Don Draper, σ = 5, kernel size = 11)
Color Bob Ross
Color Fitzroy (Patagonia)
Bob Fitzroy (High Frequency Fitzroy - σ = 20, kernel size = 21, Low Frequency Bob, σ = 5, kernel size = 11)
Failures
Color Happy
Color Sad
Sad-Happy (What a monstrosity!) (High Frequency Sad - σ = 20, kernel size = 21, Low Frequency Happy, σ = 5, kernel size = 11)
Greyscale Edison (DC)
Greyscale Tesla (AC)
Tesla Edison (AC DC) (images are too similar and not enough distinguishing edges) (High Frequency Edison - σ = 20, kernel size = 21, Low Frequency Tesla, σ = 5, kernel size = 11)
Bells and Whistles
As shown below, color works well for images with similar colors. Using color for both the high- and low-frequency components gives the best image. Coloring only the low-frequency component (with the high-frequency component in greyscale) works better than coloring only the high-frequency component (with the low-frequency component in greyscale). Note that when the low-frequency component has a very different color from the high-frequency component, the effect can be difficult to see.
Color Steve Rogers
Greyscale Tony Stark
Steve Grey Stark (High Frequency Tony - σ = 5, kernel size = 11, Low Frequency Steve, σ = 2, kernel size = 21)
Grey Steve Rogers
Color Tony Stark
Grey Steve Stark (High Frequency Tony - σ = 5, kernel size = 11, Low Frequency Steve, σ = 2, kernel size = 21)
Color Steve Rogers
Color Tony Stark
Grey Steve Stark (High Frequency Tony - σ = 5, kernel size = 11, Low Frequency Steve, σ = 2, kernel size = 21)
1.3 Gaussian and Laplacian Stacks
I built a Gaussian stack and a Laplacian stack. A Gaussian filter is applied at each level of the stack, so the image becomes coarser at each level. I used this technique to display the stacks of the images below and to deconstruct my hybrid images. This technique was described in this paper. The Gaussian stack is calculated by applying a Gaussian blur to an image repeatedly. To keep the implementation clear, I instead blurred the original image directly with σ increasing in powers of 2. Because convolving a Gaussian with a Gaussian gives another, wider Gaussian, a single blur with a larger σ is equivalent to several successive blurs with smaller ones. The Laplacian stack is calculated from the differences between consecutive Gaussian levels, which isolate the high-frequency components removed at each level. I amplified the contrast of each Laplacian level to make it easier to view.
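A minimal sketch of the two stacks, under a few assumptions: greyscale float input, a kernel radius of about 3σ (the writeup does not state the kernel sizes used for the stacks), and the original image treated as level 0 when taking differences. All function names are illustrative.

```python
import numpy as np

def gaussian_blur(img, sigma, size):
    """Separable Gaussian blur of a 2D float image; `size` must be odd."""
    x = np.arange(size) - size // 2
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    k /= k.sum()
    padded = np.pad(img, size // 2, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")

def gaussian_stack(img, sigmas=(2, 4, 8, 16, 32)):
    """Blur the original image once per sigma; no downsampling."""
    # Kernel radius of about 3 sigma is an assumption.
    return [gaussian_blur(img, s, 2 * int(3 * s) + 1) for s in sigmas]

def laplacian_stack(img, sigmas=(2, 4, 8, 16, 32)):
    """Differences of consecutive Gaussian levels, plus the coarsest level."""
    levels = [img] + gaussian_stack(img, sigmas)
    laplacians = [levels[i] - levels[i + 1] for i in range(len(sigmas))]
    return laplacians, levels[-1]

# Usage: the Laplacian levels plus the residual add back up to the input.
rng = np.random.default_rng(0)
image = rng.random((16, 16))
laps, residual = laplacian_stack(image, sigmas=(1, 2, 4))
reconstructed = sum(laps) + residual
```

The differences telescope, so summing every Laplacian level and the coarsest Gaussian level recovers the original image. That property is what the blending in the next part relies on.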
Leonardo Da Vinci's Mona Lisa
Top: Gaussian Stack of Mona Lisa (σ = 2, 4, 8, 16, 32), Bottom: Laplacian Stack of Mona Lisa
Salvador Dali's Lincoln in Dalivision
Top: Gaussian Stack of Lincoln in Dalivision (σ = 2, 4, 8, 16, 32), Bottom: Laplacian Stack of Lincoln in Dalivision
Leonardo Da Vinci's Annunciation
Top: Gaussian Stack of Annunciation (σ = 2, 4, 8, 16, 32), Bottom: Laplacian Stack of Annunciation
Salvador Dali's Old Age
Top: Gaussian Stack of Old Age (σ = 2, 4, 8, 16, 32), Bottom: Laplacian Stack of Old Age
Myron's Pandy Fox
Top: Gaussian Stack of Pandy Fox (σ = 2, 4, 8, 16, 32), Bottom: Laplacian Stack of Pandy Fox
Myron's Bob-Fitzroy
Top: Gaussian Stack of Bob-Fitzroy (σ = 2, 4, 8, 16, 32), Bottom: Laplacian Stack of Bob-Fitzroy
1.4 Multiresolution Blending
Building on part 1.3, I now use the Laplacian stacks produced by my code to do multiresolution blending. The trick is to apply a mask to the source image and then blur both the mask and the image to get their stacks. For the target image onto which the source is pasted, we use the inverted mask, and we blend the two images by adding up their Laplacians. Adding these Laplacians together with the final level of the Gaussian-blurred images gives a well-blended fusion of the two images. This technique was described in this paper. My code converts RGB to grey if necessary and can blend in either case. Masks were made in Photoshop, and the images were aligned beforehand. The following algorithm is used for the multiresolution blending:
Step 1a. Build Laplacian pyramids LA and LB for images A and B respectively.
Step 1b. Build a Gaussian pyramid GR for the region image R.
Step 2. Form a combined pyramid LS from LA and LB using nodes of GR as weights. That is, for each l, i and j: LS_l(i, j) = GR_l(i, j) LA_l(i, j) + (1 - GR_l(i, j)) LB_l(i, j).
Step 3. Obtain the splined image S by expanding and summing the levels of LS.
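The steps above can be sketched with stacks instead of pyramids, so no expansion is needed in Step 3. This is an illustrative sketch and not the project's code: it assumes greyscale float images of equal shape, a mask with values in [0, 1], a kernel radius of about 3σ, and that the coarsest Gaussian levels are blended with the most blurred mask.

```python
import numpy as np

def gaussian_blur(img, sigma, size):
    """Separable Gaussian blur of a 2D float image; `size` must be odd."""
    x = np.arange(size) - size // 2
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    k /= k.sum()
    padded = np.pad(img, size // 2, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")

def gaussian_stack(img, sigmas):
    return [gaussian_blur(img, s, 2 * int(3 * s) + 1) for s in sigmas]

def laplacian_stack(img, sigmas):
    levels = [img] + gaussian_stack(img, sigmas)
    return [levels[i] - levels[i + 1] for i in range(len(sigmas))], levels[-1]

def multires_blend(a, b, mask, sigmas=(2, 4, 8, 16, 32)):
    """Blend A (where mask is 1) into B (where mask is 0), level by level."""
    la, res_a = laplacian_stack(a, sigmas)   # Step 1a
    lb, res_b = laplacian_stack(b, sigmas)
    gr = gaussian_stack(mask, sigmas)        # Step 1b
    # Coarsest level: blend the two residuals with the most blurred mask.
    out = gr[-1] * res_a + (1.0 - gr[-1]) * res_b
    for la_l, lb_l, gr_l in zip(la, lb, gr):
        out = out + gr_l * la_l + (1.0 - gr_l) * lb_l  # Step 2, summed as in Step 3
    return np.clip(out, 0.0, 1.0)

# Usage: left half from A, right half from B.
rng = np.random.default_rng(1)
img_a, img_b = rng.random((24, 24)), rng.random((24, 24))
half_mask = np.zeros((24, 24))
half_mask[:, :12] = 1.0
blended = multires_blend(img_a, img_b, half_mask, sigmas=(1, 2))
```

Because the mask is blurred more at coarser levels, low frequencies transition over a wide band while fine detail transitions sharply, which hides the seam. Color images could be handled by blending each channel separately.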
Apple
Orange
Mask for Apple and Orange
Gaussian and Laplacian for Apple
Gaussian and Laplacian for Orange
Gaussian and Laplacian for Mask
Top: Laplacians of Apple, Middle: Laplacians of Orange, Bottom: Multiresolution Fused Laplacians of Apple and Orange
Orapple (Made from combining most blurred end of Gaussian stack and the calculated sum of Laplacians for each picture)
Campanile
Mt. Ruapehu and Me
Mask for Campanile (Irregular)
Gaussian and Laplacian Stacks for Campanile
Gaussian and Laplacian Stacks for Mt. Ruapehu and Me
Gaussian and Laplacian Stacks for Mask (Irregular)
Top: Laplacians of Masked Campanile, Middle: Laplacians of Masked Mt. Ruapehu and Me, Bottom: Multiresolution Fused Laplacians of Masked Campanile and Masked Mt. Ruapehu and Me
Mt Ruapehu, Me and the Campanile
Bells and Whistles: Color and Other Images
Trevor
Jace
Tr-ace
Orapple
Ruapehu, Me and the Campanile
Polar Bear
Desert
Laplacian Pyramid Blending Polar Bears in the Desert
Failures
Jeff Hiking Through Bush
Blizzard
Jeff Walking Through Blizzard
Part 2: Gradient Domain Fusion
Brief Description
In this section, I explore gradient-domain processing, using least squares to blend images. In particular, we look at a technique called Poisson blending. It solves a least squares problem for the new pixel values inside the pasted region so that their gradients match the gradients of the source image, both between neighboring pixels inside the region and across its boundary, where the neighboring pixel comes from the target image. We do so by constructing a sparse matrix and solving the following least squares problem (written here in its standard form):
$$v = \arg\min_{v} \sum_{i \in S,\ j \in N_i \cap S} \big((v_i - v_j) - (s_i - s_j)\big)^2 + \sum_{i \in S,\ j \in N_i,\ j \notin S} \big((v_i - t_j) - (s_i - s_j)\big)^2$$
Here, S is the source region and s refers to the source pixel values, v are the pixels you wish to determine via least squares, N_i is the set of neighbors of pixel i, and t refers to pixels in the target image. This method is further elaborated on in this paper. As the results below show, this method works well but also has drawbacks: it can modify the source image's interior pixels too much, it fails on very differently colored images, and certain edges fail to blend smoothly. However, it is a powerful technique which, in the right context, can make blending very smooth.
Toy Problem
See code for implementation.
Toy Image
Output Image
2.2 Poisson Blending
Poisson blending uses the equation described in the brief description. I minimize it using least squares by building a sparse matrix of linear gradient equations over the pixels both in the interior of the source region and on the boundary between the source and target images. This pulls the interior of the result toward the values at the boundary. The results therefore depend largely on the boundary chosen (the shape of the mask) and on the color differences between the source and target images at that boundary. Below are a few images. Notice with the paraglider that changing the boundary of the mask changes the blending of the image, so the choice of domain to manipulate affects the overall result.
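To make the system of gradient equations concrete, here is a small sketch for a single greyscale channel. It is not the author's implementation: it builds a dense matrix and calls `np.linalg.lstsq` for brevity, where the writeup uses a sparse matrix, so it is only practical for tiny regions. The function name, the 4-neighborhood, and the requirement that the mask not touch the image border are assumptions.

```python
import numpy as np

def poisson_blend(source, target, mask):
    """Solve for pixels inside `mask` whose gradients match the source.

    source, target: 2D float arrays of equal shape (already aligned).
    mask: boolean array, True inside the region; must not touch the border.
    """
    ys, xs = np.nonzero(mask)
    n = len(ys)
    index = -np.ones(mask.shape, dtype=int)
    index[ys, xs] = np.arange(n)  # unknown number for each masked pixel
    rows, rhs = [], []
    for y, x in zip(ys, xs):
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            row = np.zeros(n)
            row[index[y, x]] = 1.0
            grad = source[y, x] - source[ny, nx]  # source gradient s_i - s_j
            if mask[ny, nx]:
                # Interior pair: v_i - v_j = s_i - s_j
                row[index[ny, nx]] = -1.0
                rhs.append(grad)
            else:
                # Boundary pair: v_i - t_j = s_i - s_j, with t_j known
                rhs.append(grad + target[ny, nx])
            rows.append(row)
    v, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    out = target.copy()
    out[ys, xs] = v
    return out

# Usage: a source that differs from the target only by a brightness offset.
rng = np.random.default_rng(2)
tgt = rng.random((8, 8))
src = tgt + 0.3
region = np.zeros((8, 8), dtype=bool)
region[2:6, 2:6] = True
result = poisson_blend(src, tgt, region)
```

In this usage example the source has the same gradients as the target, so the solve removes the brightness offset entirely and returns the target. The same mechanism explains the color shifts seen in the results: only gradients are preserved, and absolute colors are set by the boundary.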
Source Image: Paraglider In Blue Sky
Target Image: Volcano
Photoshopped Onto Target Volcano Picture
Mask
Poisson Blending For Paragliding and Volcano
New mask to incorporate more of the target image
Poisson Blending For Paragliding and Volcano With New Mask
More Images
Source Image: Bear in forest
Target Image: Telegraph Avenue
Photoshopped Onto Target Telegraph Avenue
Poisson Blended Bear on Telegraph Avenue
Source Image: Penguin
Target Image: Hikers on Mountain
Penguins and Hikers on Mountain
Poisson Blended Penguins and Hikers on Mountain
Failures
Blending does not work as well when the chosen mask is either too big or too small, so adjustments need to be made to achieve good blending. Below is an example where the mask was a bit too big and the image becomes marred. The more successful cases used more carefully detailed masks and images that wouldn't result in too much mixing between the source and target images (see the yellow polar bears below for an example of such mixing).
Source Image: Stingray
Target Image: Campanile Sky
Photoshopped Onto Target Campanile Sky
Mask
Poisson Blending For Stingrays and Campanile Sky
Laplacian Pyramid Blending vs. Poisson Image Blending
Personally, I like Poisson image blending better than Laplacian pyramid blending, as it gives most images a smoother overall look. I think Laplacian blending is better when the images have very similar color environments, while Poisson blending is superior when the images have noticeably different color schemes that change gradually.
Poisson Blending and Laplacian Pyramid Blending Fail
The images I chose here were a failure case for Laplacian pyramid blending. I chose them specifically because they have very different colors: the blizzard is white and the scene of Jeff hiking is green. You can see that the Laplacian result is not very convincing, as the sharp change in color is unrealistic. Poisson image blending gives a slightly better result in that it shifts the image of Jeff toward white to match the whiteness of the blizzard at the boundary. But both cases can be seen as failures.
Jeff Hiking Through Bush
Blizzard
Laplacian Pyramid Blending Jeff Walking Through Blizzard
Poisson Image Blending Source Pixel Copied Onto Target
Mask Used For Poisson Blending
Poisson Blending Jeff Walking Through Blizzard
Laplacian beats Poisson
Looking at the desert example, we see that Poisson blending gives the polar bear fur an unrealistic color, although the result is more smoothly blended than with Laplacian blending. Laplacian blending is better at keeping the realistic color of the polar bears, so in this particular case it beats Poisson blending.
Polar Bear
Desert
Laplacian Pyramid Blending Polar Bears in the Desert
Poisson Image Blending Source Pixel Copied Onto Target
Mask Used For Poisson Blending
Poisson Blending "Yellow" Polar Bears in the Desert
Conclusion
What I learnt: Gradient domain processing is super cool! Understanding how Poisson blending compares with other kinds of blending, like multiresolution blending, was a really cool takeaway from this project. There are a lot of mathy tricks to making images look cool, and I love how it combines the best of both science and art.