CS194-26: Image Manipulation, Computer Vision and Computational Photography Final Project

By Alex Kassil (cs194-26-aca)

Precanned Project 1: Gradient Domain Fusion

Part 2.1: Toy Story

Buzz and Woody need your help! They are trapped as pixels, and not just any pixels, but pixels that only show the x and y direction gradients.

Here you can see the original picture (and final goal!) plus the edges, found by taking the dx and dy gradients and then the square root of the sum of their squares.
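The edge picture described above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a grayscale image and simple forward differences; the function name `gradient_magnitude` is made up for this example, and the exact derivative filter used for the figure is not stated in the writeup.

```python
import numpy as np

def gradient_magnitude(img):
    """Edge strength: sqrt(dx^2 + dy^2) using forward differences (an assumption)."""
    img = img.astype(np.float64)
    dx = np.zeros_like(img)
    dy = np.zeros_like(img)
    # Difference between each pixel and its right neighbor (last column stays 0).
    dx[:, :-1] = img[:, 1:] - img[:, :-1]
    # Difference between each pixel and the neighbor below it (last row stays 0).
    dy[:-1, :] = img[1:, :] - img[:-1, :]
    return np.sqrt(dx ** 2 + dy ** 2)
```

Calling `gradient_magnitude(gray)` on a 2D array returns an array of the same shape that is bright wherever the image changes quickly.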

Sadly we do not have the full picture, only the two gradients and one pixel: the top left pixel of the original image. Is that enough to recover the whole image? YES! Using linear algebra, we set up three objectives.

1. The y gradients of the final picture (that is, of the column of values that will become our new pixels once reshaped into a picture). These gradients make up part of our b vector. Each entry has a corresponding row in the A matrix, with a 1 in the column for the pixel location and a -1 in the column for the derivative pixel location.
2. The x gradients, which behave the same way as the y gradients in our matrix.
3. ONE pixel, the last entry of our b vector, that isn't a gradient but the actual value. Its row of A has just one nonzero entry: a 1 in the column for that pixel.

Just one pixel intensity is enough to propagate and find all the correct values. Now we store the matrix in sparse form so it fits into memory and crank it through a linear algebra least squares solver, and then we get our result! Below on the left is the result, and on the right is the original image we were trying to recover.
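The setup above can be sketched with SciPy's sparse tools. This is a rough illustration rather than the project's actual code: the function name `reconstruct_from_gradients`, the forward-difference sign convention, and the choice of `lsqr` as the least squares solver are all assumptions. For brevity it computes the gradients from the image it is given and then uses only those gradients plus the top left pixel.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def reconstruct_from_gradients(img):
    """Rebuild a grayscale image from its x/y gradients and its top left pixel."""
    img = img.astype(np.float64)
    h, w = img.shape
    idx = np.arange(h * w).reshape(h, w)      # pixel (y, x) -> column of A
    n_eq = h * (w - 1) + (h - 1) * w + 1      # x gradients + y gradients + 1 pixel
    A = lil_matrix((n_eq, h * w))
    b = np.zeros(n_eq)
    row = 0
    # Objective: x gradients of the result match the known x gradients.
    for y in range(h):
        for x in range(w - 1):
            A[row, idx[y, x + 1]] = 1
            A[row, idx[y, x]] = -1
            b[row] = img[y, x + 1] - img[y, x]
            row += 1
    # Objective: y gradients of the result match the known y gradients.
    for y in range(h - 1):
        for x in range(w):
            A[row, idx[y + 1, x]] = 1
            A[row, idx[y, x]] = -1
            b[row] = img[y + 1, x] - img[y, x]
            row += 1
    # Objective: the top left pixel keeps its actual value.
    A[row, idx[0, 0]] = 1
    b[row] = img[0, 0]
    # Generous iteration cap (an assumption) so tiny examples also converge.
    v = lsqr(A.tocsr(), b, atol=1e-10, btol=1e-10,
             iter_lim=max(1000, 2 * h * w))[0]
    return v.reshape(h, w)
```

The Python loops keep the row-by-row construction easy to follow; for a full-size image, building the row, column, and value arrays in vectorized form would likely be much faster.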

MISSION SUCCESS! Blastoff! To Infinity and BEYOND!!!!!!!!

Part 2.2: Poisson Blending, Bezerkeley Efros, and a Brown Eyed Boy

Just like with the toy example, we want to reconstruct an image from the gradients. But now we want to have an irregular mask that we can use to crop and place an image from a source into a target and have the gradients match up!

Here is a cute penguin and a picture of some hikers, we want to make it seem as if the hikers stumbled upon a penguin.

First thing is to create a mask of the penguin region

Now let us copy the pixels over!

Well, sadly there are some issues. Clearly a bad crop job, seems like some shoddy photoshop work. NO ONE WILL BE FOOLED :(
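The naive copy amounts to a single masked assignment. A minimal sketch, assuming the source has already been aligned to the target so both have the same shape, and that `mask` is a boolean array over the pixel grid (the name `naive_copy` is hypothetical):

```python
import numpy as np

def naive_copy(source, target, mask):
    """Paste source pixels into target wherever mask is True."""
    out = target.copy()
    # A 2D boolean mask selects whole pixels, so this works for gray or RGB.
    out[mask] = source[mask]
    return out
```

Because nothing is done about the color and brightness mismatch at the mask edge, the seam stays visible.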

So we have the source and the target, and we can copy pixels over. Let's try to solve this like we did in the previous part! Using linear algebra, we again set up three objectives, for each of the three color channels.

1. The y gradients of the final picture (that is, of the column of values that will become our new pixels once reshaped into a picture). These gradients make up part of our b vector. Each entry has a corresponding row in the A matrix, with a 1 in the column for the pixel location and a -1 in the column for the derivative pixel location.
2. The x gradients, which behave the same way as the y gradients in our matrix.
3. MANY pixels in our b vector that represent actual values at the boundary of our mask!

Now we again store the matrix in sparse form so it fits into memory and crank it through a linear algebra least squares solver, and then we combine the three color channels to get our result!
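One common way to write these objectives down for a single channel is sketched here. It is an illustration under stated assumptions, not necessarily the exact system used for the results: the unknowns are only the pixels inside the mask, source gradients are matched against all four neighbors, target values are used wherever a neighbor falls outside the mask, and the mask must not touch the image border. The name `poisson_blend_channel` is made up.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def poisson_blend_channel(source, target, mask):
    """Blend one channel of an aligned source into target inside a boolean mask."""
    source = source.astype(np.float64)
    target = target.astype(np.float64)
    ys, xs = np.nonzero(mask)
    n = len(ys)
    idx = -np.ones(mask.shape, dtype=int)
    idx[ys, xs] = np.arange(n)                # masked pixel -> column of A
    A = lil_matrix((4 * n, n))
    b = np.zeros(4 * n)
    row = 0
    for y, x in zip(ys, xs):
        for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            ny, nx = y + dy, x + dx
            A[row, idx[y, x]] = 1
            # The new gradient should match the source gradient.
            b[row] = source[y, x] - source[ny, nx]
            if mask[ny, nx]:
                A[row, idx[ny, nx]] = -1      # neighbor is also unknown
            else:
                b[row] += target[ny, nx]      # neighbor is a known boundary value
            row += 1
    # Generous iteration cap (an assumption) so tiny examples also converge.
    v = lsqr(A.tocsr(), b, atol=1e-10, btol=1e-10,
             iter_lim=max(1000, 2 * n))[0]
    out = target.copy()
    out[ys, xs] = v
    return out
```

For a color image, the three channels can be blended independently and recombined, for example with `np.dstack([poisson_blend_channel(s[..., c], t[..., c], mask) for c in range(3)])`, followed by clipping to the valid intensity range.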

PERFECT! Now one can say the Alaskan trip was a success! With some tiny, barely noticeable blur around the penguin, it seems to fit quite nicely into the picture. Anyone glancing at the photo would think "Oh Woah! A penguin in this photo is super close to those hikers!"

Here is Efros with a super cool tattoo, compared on the left to the primitive pixel copy.

Clearly you can see how even the varied illumination around the tattoo, coming from someone's cheek, is no match for LINEAR ALGEBRA!

Finally, I always wondered how I would look with a brown eye. Pretty creepy if you ask me.

What I learned

I learned how powerful gradients are, and how one can really shift their thinking away from pure pixels to gradients to do lots of cool stuff. I learned how to use sparse matrix solvers and how to formulate problems as a matrix to allow the use of heavy linear algebra machinery. I learned how cool Efros is with a face tattoo.

Precanned Project 2: Seam Carving

Part 2.1: Toy Story

"Boy I wish my image was smaller. No crop gets rid of the edges and details I want, and scaling messes with the aspect ratio" - one sad customer. NO MORE NEED TO FEAR! SEAM CARVING IS HERE. And it is ready to support all your resizing dreams!

So how does this fancy-shmancy yet deviously simple algorithm work? Well, you take an image

And you get its E N E R G Y using an energy function. The one I initially used was the sum of the absolute values of the x and y derivatives.
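As a rough sketch, this L1 energy can be computed as follows. It assumes a grayscale input and uses `np.gradient` (central differences) for the derivatives; the derivative filter actually used in the project is not stated, and the name `energy_l1` is made up.

```python
import numpy as np

def energy_l1(gray):
    """L1 energy: |dI/dx| + |dI/dy| at every pixel."""
    # np.gradient returns the derivative along axis 0 (y) first, then axis 1 (x).
    dy, dx = np.gradient(gray.astype(np.float64))
    return np.abs(dx) + np.abs(dy)
```

Flat regions get energy near zero, while edges and texture get high energy, which is what steers seams away from them.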

Next, we compute the minimum energy path ending at each pixel of the bottom row, using the recurrence relation where any coordinate's minimum path energy is its own energy plus the smallest minimum path energy among the pixel directly above it, the one above and to the right, and the one above and to the left. Simple dynamic programming lets us compute these values real fast. Here is what the minimum energy values look like for the house above using the L1 energy function
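The recurrence can be sketched as a row-by-row sweep. This is a minimal version for vertical seams; the name `cumulative_energy` is hypothetical, and out-of-bounds neighbors are treated as infinitely expensive.

```python
import numpy as np

def cumulative_energy(energy):
    """M[y, x] = energy[y, x] + min(M[y-1, x-1], M[y-1, x], M[y-1, x+1])."""
    M = energy.astype(np.float64).copy()
    for y in range(1, M.shape[0]):
        up = M[y - 1]
        # Shift the previous row right/left; pad with inf at the image edges.
        up_left = np.concatenate(([np.inf], up[:-1]))
        up_right = np.concatenate((up[1:], [np.inf]))
        M[y] += np.minimum(np.minimum(up_left, up), up_right)
    return M
```

The bottom row of the returned array holds the total energy of the cheapest seam ending at each pixel, so the whole map costs one pass over the image.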

Somewhere around 2/3 of the way from left to right seems to be the darkest pixel of the bottom row, and the darkest path seems to kind of snake up in an S shape. Let's see that path!
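Tracing that darkest path is a backtracking step over the minimum energy map. A small sketch, assuming the map is a 2D array `M` of minimum path energies (the name `find_vertical_seam` is made up):

```python
import numpy as np

def find_vertical_seam(M):
    """Return, for each row, the column of the minimum energy vertical seam."""
    h, w = M.shape
    seam = np.zeros(h, dtype=int)
    # Start at the darkest (cheapest) pixel of the bottom row.
    seam[-1] = int(np.argmin(M[-1]))
    for y in range(h - 2, -1, -1):
        x = seam[y + 1]
        # Only the three pixels above (clipped at the borders) are candidates.
        lo, hi = max(x - 1, 0), min(x + 2, w)
        seam[y] = lo + int(np.argmin(M[y, lo:hi]))
    return seam
```

The result has one column index per row, so adjacent entries never differ by more than one and the seam stays connected.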

Now let's remove that seam!
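Removing a seam means deleting one pixel from every row and closing the gap. A possible sketch, assuming `seam` holds one column index per row (the name `remove_vertical_seam` is hypothetical):

```python
import numpy as np

def remove_vertical_seam(img, seam):
    """Delete one pixel per row; works for grayscale (H, W) or color (H, W, C)."""
    h, w = img.shape[:2]
    keep = np.ones((h, w), dtype=bool)
    keep[np.arange(h), seam] = False
    # Boolean indexing keeps row-major order, so a reshape restores the rows.
    return img[keep].reshape((h, w - 1) + img.shape[2:])
```

Each call shrinks the width by exactly one pixel; carving further means recomputing the energy on the smaller image and repeating.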

Wow! Looks ... pretty much the same. That's because that seam was deemed least interesting by our energy function, and by removing just a seam we keep spatial locality. Now the seam carving algorithm just repeats this seam removal!

Below will be pictures of the format: Original, my algorithm removing seams, website examples

I would say I did better than the example for this house resizing, since my walls of the house still seem vertical as opposed to the slanted walls of the example seam carve. Also my right tree looks more realistic. Sadly both of our left front trees look ridiculous.

Here my result has a better couch, but a much more curvy street.

Here we get some hilarious artifacts around removing seams from a person's face. Though my result seems to preserve more face.

Now let's apply it to some more pictures!

Kinda meta, seam carving the seam carved example to 1/4 of the size seems to preserve a lot of the original relationship between the three colored lines.

Let's take seam carving TO THE EXTREME by taking a 40x40 original image and carving it down to 35x35, 30x30, 25x25, 20x20, 15x15, 10x10, and finally 5x5!

The first row looks mostly fine, but the last row looks like a nightmare.

Bells and Whistles

Finally, for the bells and whistles, I compared the L2 distance to the L1 distance and devised a rule of thumb for deciding which energy function to use. Below left is the L1 distance, right is the L2 distance (square root of the sum of squared gradients in the x and y directions).
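For comparison, the L2 energy described here can be sketched like this, again assuming a grayscale input and `np.gradient` for the derivatives (the name `energy_l2` is made up):

```python
import numpy as np

def energy_l2(gray):
    """L2 energy: sqrt((dI/dx)^2 + (dI/dy)^2) at every pixel."""
    # np.gradient returns the derivative along axis 0 (y) first, then axis 1 (x).
    dy, dx = np.gradient(gray.astype(np.float64))
    return np.sqrt(dx ** 2 + dy ** 2)
```

At any pixel the L2 energy is never larger than the L1 energy, and the two agree exactly when one of the two derivatives is zero, so they differ most along diagonal edges.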

Here is my rule of thumb: if you want to preserve fine-grained details, use L1; if not, use L2.

What I learned

How simple stuff can produce really amazing outcomes! It's a really elegant algorithm that allows for quite a lot of use. And how algorithms taken past their intended use lead to some hilarious outcomes!