This project introduces us to seam carving, which is part of a procedure called Content Aware Resizing. To resize images while keeping their contents in mind, we find "seams" of pixels through the image that have low "energy" and remove them, rather than naively removing entire columns.
To determine which pixels to remove, we first calculate the "energy" of each pixel, which roughly corresponds to its importance. The higher the energy, the less likely the pixel is to be included in any "seam" that we choose to remove.
I chose to implement the dual gradient energy function. In this case, a high-energy pixel is one where there is a sudden change in color, such as at the boundary of an object. To do this, we define the energy of each pixel as the sum of squared differences in the R, G, B values of its neighboring pixels (pixels (y-1, x) and (y+1, x) for vertical seams).
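As a sketch, the standard dual-gradient formulation sums squared color differences across both the vertical and the horizontal neighbors of each pixel; the NumPy representation and function name here are illustrative, not the exact implementation:

```python
import numpy as np

def dual_gradient_energy(img):
    """Dual-gradient energy: for each pixel, sum the squared R, G, B
    differences between its vertical neighbors and between its
    horizontal neighbors.

    img: H x W x 3 array of RGB values.
    Returns an H x W array of energies.
    """
    # Cast to int so squared uint8 differences don't overflow.
    img = img.astype(int)
    # np.roll makes border pixels wrap around (one common convention).
    dy = np.roll(img, -1, axis=0) - np.roll(img, 1, axis=0)
    dx = np.roll(img, -1, axis=1) - np.roll(img, 1, axis=1)
    # Sum the squared differences over the three color channels.
    return (dy ** 2).sum(axis=2) + (dx ** 2).sum(axis=2)
```

On a perfectly uniform image every difference is zero, so the energy map is zero everywhere; edges and texture show up as high values.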
To find the set of pixels that we want to remove from the image, we determine a "seam": a path of pixels with a minimum total sum of energies, with the restriction that these pixels have to form a connected "path" from the top of the image to the bottom.
This is a dynamic programming exercise: the minimum energy of a seam ending at a pixel in row n is that pixel's own energy plus the minimum over the seam energies ending at its three adjacent pixels in row n-1.
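That recurrence can be sketched in plain Python; the list-of-lists energy grid and the function name are illustrative assumptions:

```python
def find_vertical_seam(energy):
    """Return the column index of the minimum-energy vertical seam
    in each row, via dynamic programming.

    energy: list of rows, each a list of numbers.
    """
    h, w = len(energy), len(energy[0])
    # cost[y][x] = minimum total energy of any seam ending at (y, x)
    cost = [row[:] for row in energy]
    for y in range(1, h):
        for x in range(w):
            # A seam may arrive from the three pixels above:
            # (y-1, x-1), (y-1, x), (y-1, x+1), clipped at the edges.
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack upward from the cheapest pixel in the bottom row.
    seam = [min(range(w), key=lambda x: cost[h - 1][x])]
    for y in range(h - 2, -1, -1):
        x = seam[-1]
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        seam.append(min(range(lo, hi + 1), key=lambda c: cost[y][c]))
    seam.reverse()
    return seam
```

Removing the seam is then just deleting `seam[y]` from each row of the image, shrinking the width by one.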
Once we implemented how to find the minimum vertical seam, the minimum horizontal seam is much simpler. We simply transpose the pixels in the original image and run the vertical seam finder algorithm.
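The transpose step itself is tiny; a minimal sketch, assuming the image is a grid of rows:

```python
def transpose(grid):
    """Swap rows and columns, so a horizontal seam in the original
    grid becomes a vertical seam in the transposed grid."""
    return [list(col) for col in zip(*grid)]
```

Running the vertical seam finder on the transposed grid yields a horizontal seam of the original; transposing back after removal restores the orientation.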
These failure cases came about because the photos had many subjects. The issue is exacerbated when there are people or recognizable objects (the Empire State Building's spire), because any change in their size is immediately noticed.
This project really blew my mind because I expected a "Content Aware Image Resizer" would use complicated machine learning under the hood, but really it's just basic concepts! That makes things very approachable for me, especially since I've yet to take a machine learning course at Berkeley.
This project introduces us to the dolly zoom, a camera effect made famous by Alfred Hitchcock's film "Vertigo". To create it, I took successive photographs of a subject, increasing the distance with each shot while also zooming in with my camera's lens. I used a Nikon camera with a zoom lens, photographing various things lying around my friend's apartment.
It was actually surprisingly difficult to keep the subject in the same relative position in the frame while taking the pictures, and the photos reflect that, but overall the effect is still there. It creates a sensation of suddenly plunging forward, which is why it is so associated with "Vertigo".
I am somewhat of a film nerd, so it was extremely cool to have this experience recreating such an iconic shot.