International student participating in the Brazilian Scientific Mobility Program (BSMP) at University of California, Berkeley, for the 2015-2016 academic year. Enrolled as a Computer Science Extension student. Born and raised in Rio de Janeiro, Brazil. Undergrad student at Pontifical Catholic University of Rio de Janeiro, studying Computer Engineering and Mathematics. Traditional and Digital art lover and currently interested in Computer Graphics and Artificial Intelligence - but who knows what new field of study I might fall in love with! Trying to learn the most I can during this one year here in Berkeley.
One of the papers at SIGGRAPH that made news when it first came out was Seam Carving for Content-Aware Image Resizing by Shai Avidan and Ariel Shamir. In this assignment, you'll be implementing the basic algorithm presented therein. To wit, you'll be designing a program which can shrink an image (either horizontally or vertically) to a given dimension.
The first step in seam carving is defining an energy function for each pixel of the input image. Consider the following image as an example:
To compute the energy function, we first convolve the image with a Gaussian filter. We then subtract the original image from the filtered image to get an image containing the high frequencies of the input. The energy function is the absolute value of this high-frequency image. The idea is that edges are the features we want to preserve, and the high frequencies carry that information. Taking the absolute value is necessary because the high-frequency image may contain negative values. Below, the difference between the first and second images gives the third image. The energy function is created by taking the absolute values of the third image.
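As a rough illustration of the energy computation described above, here is a minimal sketch in plain Python on a grayscale image stored as nested lists. The function names (`gaussian_kernel`, `blur`, `energy`), the 3x3 kernel size, the sigma value, and the border clamping are all assumptions made for this example, not necessarily what the project used. A real implementation would likely rely on an image library instead.

```python
import math

def gaussian_kernel(radius=1, sigma=1.0):
    """Return a normalized (2*radius+1) x (2*radius+1) Gaussian kernel."""
    kernel = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
               for dx in range(-radius, radius + 1)]
              for dy in range(-radius, radius + 1)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

def blur(image, kernel):
    """Convolve a grayscale image with the kernel, clamping at the borders."""
    h, w = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    acc += image[yy][xx] * kernel[dy + r][dx + r]
            out[y][x] = acc
    return out

def energy(image, radius=1, sigma=1.0):
    """Energy = |blurred - original|, i.e. the magnitude of the high frequencies."""
    blurred = blur(image, gaussian_kernel(radius, sigma))
    return [[abs(b - o) for b, o in zip(brow, orow)]
            for brow, orow in zip(blurred, image)]
```

On a flat image the energy is zero everywhere, while pixels next to an edge get a high energy, which is the behavior the seam search relies on.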
The next step is to calculate the path with the lowest total energy. For that, we use dynamic programming. We sweep through the image one row (or column) at a time, computing the best path cost for each pixel given the best costs already found for the previous row. The minimum path cost through a pixel is the energy of that pixel plus the smallest minimum path cost among the neighboring pixels above it. The result of the dynamic programming for each of the two directions is displayed below. You can see that the energy propagates at a speed of one pixel per row/column, which produces those cool pictures of the process. The dark pixels represent low-cost paths and the white pixels high-cost ones.
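The recurrence above can be sketched as follows for vertical seams (top to bottom). This is a minimal example, assuming each pixel may continue from the three nearest pixels in the row above, as in the Avidan and Shamir paper. The function name `cumulative_cost` is made up for this illustration.

```python
def cumulative_cost(energy):
    """Return the minimum path cost ending at each pixel, scanning top to bottom.

    cost[y][x] = energy[y][x] + min(cost[y-1][x-1], cost[y-1][x], cost[y-1][x+1])
    """
    h, w = len(energy), len(energy[0])
    cost = [list(energy[0])]  # the first row has no predecessors
    for y in range(1, h):
        prev = cost[-1]
        row = []
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)  # stay inside the image
            row.append(energy[y][x] + min(prev[lo:hi + 1]))
        cost.append(row)
    return cost

example = [[1, 2, 3],
           [4, 1, 6],
           [7, 8, 1]]
print(cumulative_cost(example))  # [[1, 2, 3], [5, 2, 8], [9, 10, 3]]
```

The smallest value in the last row (3 here) is the total cost of the cheapest seam. Horizontal seams can be handled the same way on the transposed image.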
Once we backtrack the path through the dynamic programming image, we have the pixels that need to be removed from the image. Removing those pixels gives a smaller version of the image. By repeating this process, we can define a removal order for the pixels: the ordering value of a pixel is the step of the process at which it was removed. By keeping track of this information, it is possible to recover the full sequence of pixel removals for an image.
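Here is a minimal sketch of the backtracking and of building the pixel ordering, for vertical seams only. It assumes the energy is computed once on the original image and that removed entries are simply deleted from it at each step (the project may have recomputed the energy instead). The names `find_seam` and `removal_order` are hypothetical.

```python
def cumulative_cost(energy):
    """Minimum path cost ending at each pixel, scanning top to bottom."""
    w = len(energy[0])
    cost = [list(energy[0])]
    for erow in energy[1:]:
        prev = cost[-1]
        cost.append([erow[x] + min(prev[max(x - 1, 0):min(x + 1, w - 1) + 1])
                     for x in range(w)])
    return cost

def find_seam(energy):
    """Backtrack from the cheapest bottom pixel; return one column per row."""
    cost = cumulative_cost(energy)
    h, w = len(cost), len(cost[0])
    x = min(range(w), key=lambda i: cost[-1][i])
    seam = [x]
    for y in range(h - 2, -1, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y][i])
        seam.append(x)
    seam.reverse()
    return seam

def removal_order(energy):
    """Return order[y][x] = step at which original pixel (y, x) is removed."""
    h, w = len(energy), len(energy[0])
    current = [list(row) for row in energy]      # shrinking energy map
    columns = [list(range(w)) for _ in range(h)]  # original column of each survivor
    order = [[0] * w for _ in range(h)]
    for step in range(w):
        for y, x in enumerate(find_seam(current)):
            order[y][columns[y][x]] = step
            del current[y][x]
            del columns[y][x]
    return order

example = [[1, 2, 3],
           [4, 1, 6],
           [7, 8, 1]]
print(find_seam(example))      # [0, 1, 2]
print(removal_order(example))  # [[0, 1, 2], [1, 0, 2], [1, 2, 0]]
```

Every row of the result is a permutation of the steps 0 to width - 1, since each seam removes exactly one pixel per row.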
Using this pixel ordering image, it is easy to resize the image with seam carving. To reduce the image size by X pixels, you just need to remove the pixels that have a value lower than X. This can be done quickly with a single pass over the image. Below, you can see the horizontal and vertical pixel ordering functions of the original image. Darker pixels are removed earlier while brighter pixels are removed later. You can see that the main features of the image (the boats and the islands) show up as white clusters, meaning they will be removed late in the resizing process.
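The resizing rule above (drop every pixel whose ordering value is lower than X) can be sketched in a few lines. This example assumes the ordering starts at step 0 and describes vertical seams, so each row loses exactly X pixels. The function name `shrink_width` is made up for this illustration.

```python
def shrink_width(image, order, n):
    """Remove n vertical seams by dropping pixels whose removal step is below n."""
    return [[pixel for pixel, step in zip(row, order_row) if step >= n]
            for row, order_row in zip(image, order)]

image = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
order = [[0, 1, 2],
         [1, 0, 2],
         [1, 2, 0]]
print(shrink_width(image, order, 1))  # [[20, 30], [40, 60], [70, 80]]
```

Because the expensive seam search is already encoded in `order`, this step only needs one pass over the pixels.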
With the ordering function, we can resize the image very easily by removing the lower-valued pixels. The images below are the results of seam carving on the original image. The horizontal carving was pretty much a success. You can see that, compared to the cropped and scaled versions of the image, the resized image preserves more of the features that allow us to recognize the original composition.
Of course, the more pixels you remove, the more artifacts appear in the image. As you can see with the removal of 600 pixels, all images have distortions or lack of information.
For the same picture, the removal of pixels in the vertical direction was trickier. The seam carving presented an artifact on the left island as soon as the sky had been removed from the picture. The horizon does not appear to be a line anymore. The structure of this image does not allow very good vertical removals.
Below are some results that worked. They were mostly images with large regions that carry little information (such as skies and lakes) along one of the directions. The seam carving quickly found paths through those zones, resulting in very good resized images.
Below are some results that did not work. Most of them have complex structures throughout the image, so there are no low-energy areas for the seams to pass through. A better energy function could improve those images by assigning lower energies to features that are not as important as the main ones.
This example was a total failure. Even small resizings did not work. The fact that the image is not aligned vertically or horizontally may have contributed to this problem. The algorithm cannot find a good horizontal or vertical path because many of the low-energy paths in the scene are rotated. At the same time, the floor has a lot of details that are perceived as important features to preserve.
Instead of deleting seams, we insert new pixels along them by taking the average of the neighboring pixels. The results are larger images, such as the ones shown below.
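A minimal sketch of inserting one vertical seam is shown below. It assumes the new pixel is placed to the right of the seam pixel and is the average of the seam pixel and its right neighbor, with the seam pixel duplicated at the right border. The exact neighbors averaged in the project are not stated, so treat this as one possible choice. The function name `insert_seam` is hypothetical.

```python
def insert_seam(image, seam):
    """Widen a grayscale image by one pixel per row along the given seam.

    seam[y] is the column of the seam pixel in row y.
    """
    out = []
    for row, x in zip(image, seam):
        right = row[x + 1] if x + 1 < len(row) else row[x]  # duplicate at border
        new_pixel = (row[x] + right) / 2.0
        out.append(row[:x + 1] + [new_pixel] + row[x + 1:])
    return out

image = [[10, 20, 30],
         [40, 50, 60]]
print(insert_seam(image, [0, 2]))  # [[10, 15.0, 20, 30], [40, 50, 60, 60.0]]
```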
We may assign a mask that acts as a weight on the energy function. By giving important features a high mask value compared to the rest of the image, we keep the algorithm from finding paths that pass through those objects.
The opposite can also be done. We may assign low mask values to objects that we want to delete. The algorithm will then be more likely to find paths that remove pixels from those masked features.
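The masking idea can be sketched as a per-pixel weight on the energy. This example assumes the mask is multiplied into the energy (values above 1 protect a region, values between 0 and 1 mark it for removal). How the weight was actually combined in the project is not specified, and `apply_mask` is a made-up name.

```python
def apply_mask(energy, mask):
    """Scale each energy value by the corresponding mask weight."""
    return [[e * m for e, m in zip(energy_row, mask_row)]
            for energy_row, mask_row in zip(energy, mask)]

energy = [[2.0, 2.0, 2.0],
          [2.0, 2.0, 2.0]]
mask = [[1.0, 10.0, 0.1],   # middle column protected, right column to delete
        [1.0, 10.0, 0.1]]
weighted = apply_mask(energy, mask)
```

One caveat of a purely multiplicative weight is that pixels with zero energy stay at zero regardless of the mask, so an additive term may be needed to protect flat regions.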
The features that the mask altered were already very small. This made it hard for the algorithm to find paths through those places.
The whole project was divided into a precomputing module, where the pixel orders are computed, and an online module that uses this information to do the pixel deletions. Precomputing takes a lot of time, but the online module is fast. The pixel order information is saved in a ".ord" file, which basically stores each entry of the pixel order matrix.
I got a better understanding of how to design energy functions that relate pixels to our human perception. It may be a hard task, but we can gather a lot of information from images that have good energy functions.
Technically, what I learned the most about was working with large numbers of pixels, backtracking, and dynamic programming on images.
The coolest feature of this assignment was deleting objects from the scenes. If I had more time, I would improve that and work on a way to put the "online" part of the project on the web.
Website built using Bootstrap.