CS 194-26 Spring 2020 Final Pre-Canned Projects

By: Annie Nguyen

Seam Carving Project
Light Field Camera Project

Seam Carving Project

Vertical Seam Carving Algorithm

For this project, I reimplemented the algorithm presented in the paper Seam Carving for Content-Aware Resizing by Shai Avidan and Ariel Shamir. It uses dynamic programming to find the connected path of pixels with the minimum cumulative energy from one edge of the image to the opposite edge. The paper defines the energy of a pixel as the sum of the absolute values of the partial derivatives in the x and y directions.
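The energy definition above can be sketched as follows. This is a minimal illustration, assuming NumPy and finite differences from np.gradient; the function name energy_map is hypothetical, and the original implementation may have used a different derivative filter.

```python
import numpy as np

def energy_map(image):
    """Per-pixel energy |dI/dx| + |dI/dy|, summed over color channels.

    `image` is an (H, W) grayscale or (H, W, C) color array.
    Hypothetical helper for illustration only.
    """
    img = np.asarray(image, dtype=np.float64)
    if img.ndim == 2:
        img = img[:, :, None]  # treat grayscale as a single channel
    # Derivatives along axis 0 (rows, y) and axis 1 (columns, x).
    dy, dx = np.gradient(img, axis=(0, 1))
    return (np.abs(dx) + np.abs(dy)).sum(axis=2)
```

Flat regions get an energy near zero, while edges and textured regions get high energy, which is what steers seams away from them.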

First, I tackled finding the optimal vertical seam. To find the connected pixel path with the minimum cumulative energy, that is, the optimal seam, I first computed the energy of each pixel in the image, producing an energy matrix in which entry (r, c) is the energy of the image's pixel at (r, c). Then, I computed the cumulative energy matrix M using dynamic programming and the recurrence relation from the paper, where e(i, j) is the energy at row i and column j:

M(i, j) = e(i, j) + min(M(i-1, j-1), M(i-1, j), M(i-1, j+1))
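As a rough sketch of the dynamic-programming step, the cumulative matrix M can be filled in one row at a time, with each entry adding its own energy to the smallest of its three upper neighbors. The function name cumulative_energy is hypothetical, and treating out-of-bounds neighbors as infinite is an assumption about edge handling.

```python
import numpy as np

def cumulative_energy(energy):
    """Fill M row by row: M[r, c] = energy[r, c] + min of the three
    neighbors in the row above. Hypothetical helper for illustration."""
    energy = np.asarray(energy, dtype=np.float64)
    rows, _ = energy.shape
    M = energy.copy()  # the first row of M is just the energy itself
    for r in range(1, rows):
        prev = M[r - 1]
        # Shifted copies of the previous row; inf marks a missing neighbor.
        left = np.concatenate(([np.inf], prev[:-1]))   # M[r-1, c-1]
        right = np.concatenate((prev[1:], [np.inf]))   # M[r-1, c+1]
        M[r] += np.minimum(np.minimum(left, prev), right)
    return M
```

For the 3 by 3 energy matrix [[1, 4, 3], [5, 2, 6], [7, 8, 1]], this gives M = [[1, 4, 3], [6, 3, 9], [10, 11, 4]].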

Once M is computed, its last row holds the minimum cumulative energies of full paths that go from top to bottom. The minimum of the last row tells us where the optimal seam ends. Once I know its end, I backtrack up the image, at each step looking at the M values of the top-left, top-middle, and top-right neighbors of the current seam pixel and moving to the smallest one, which traces out the optimal seam.
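The backtracking step, together with removing the resulting seam, might look like the sketch below. It assumes M has already been computed as a 2-D array; the names backtrack_seam and remove_vertical_seam are hypothetical.

```python
import numpy as np

def backtrack_seam(M):
    """Return seam[r] = column of the optimal vertical seam in row r."""
    M = np.asarray(M, dtype=np.float64)
    rows, cols = M.shape
    seam = np.empty(rows, dtype=int)
    seam[-1] = int(np.argmin(M[-1]))  # the seam ends at the smallest total
    for r in range(rows - 2, -1, -1):
        c = seam[r + 1]
        lo, hi = max(c - 1, 0), min(c + 2, cols)  # clamp at image borders
        seam[r] = lo + int(np.argmin(M[r, lo:hi]))
    return seam

def remove_vertical_seam(image, seam):
    """Delete one pixel per row, shrinking the width by one."""
    img = np.asarray(image)
    rows, cols = img.shape[:2]
    keep = np.ones((rows, cols), dtype=bool)
    keep[np.arange(rows), seam] = False
    return img[keep].reshape((rows, cols - 1) + img.shape[2:])
```

Repeating the energy, M, backtrack, and removal steps once per pixel of reduction gives the resized image.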

Mural from the project page

Optimal Vertical Seam

Original Mural Image

Mural Reduced Vertically by 70 pixels

Horizontal Seam Carving Algorithm

Finding the horizontal seam was very simple. I transposed the image and energy matrix and used the vertical seam-finding algorithm. Then, I transposed the image and energy matrix back to their original orientation.
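A minimal sketch of the transpose trick, assuming a vertical seam finder like the one described earlier (re-declared here so the example is self-contained). Both function names are hypothetical.

```python
import numpy as np

def find_vertical_seam(energy):
    """Column index of the minimum-energy vertical seam for each row."""
    energy = np.asarray(energy, dtype=np.float64)
    rows, cols = energy.shape
    M = energy.copy()
    for r in range(1, rows):
        # Pad the previous row with inf so border pixels have 3 "neighbors".
        p = np.pad(M[r - 1], 1, constant_values=np.inf)
        M[r] += np.minimum(np.minimum(p[:-2], p[1:-1]), p[2:])
    seam = np.empty(rows, dtype=int)
    seam[-1] = int(np.argmin(M[-1]))
    for r in range(rows - 2, -1, -1):
        c = seam[r + 1]
        lo, hi = max(c - 1, 0), min(c + 2, cols)
        seam[r] = lo + int(np.argmin(M[r, lo:hi]))
    return seam

def find_horizontal_seam(energy):
    """Row index of the minimum-energy horizontal seam for each column.

    Transposing swaps rows and columns, so a vertical seam in the
    transposed energy is a horizontal seam in the original.
    """
    return find_vertical_seam(np.asarray(energy).T)
```

For a color image, the equivalent transpose would be np.transpose(image, (1, 0, 2)) so that the channel axis stays last.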

Original Mural Image

Optimal Horizontal Seam

Successful Results

Mural Image

Reduced Vertically by 70 pixels

Lake Image

Reduced Vertically by 80 pixels

Golden Gate Bridge

Reduced Horizontally by 50 pixels

Joshua Tree

Reduced Vertically by 50 pixels

Kauai

Reduced Horizontally by 75 pixels

Steve Lacey

Reduced Vertically by 30 pixels

Sunset

Reduced Vertically by 30 pixels

Pacific Coast Highway

Reduced Vertically by 75 pixels

Failed Results

Annie

Reduced Vertically by 58 pixels

Ferris Wheel

Reduced Horizontally by 75 pixels

garage

Reduced Horizontally by 75 pixels

harvey

Reduced Horizontally by 75 pixels

mouse

Reduced Horizontally by 75 pixels

Bells and Whistles: Seam Insertion for Enlarging Images

For the bells and whistles, I decided to implement seam insertion to enlarge images. To enlarge the image to a certain width, first calculate how many pixels need to be added to the width dimension; I call this value k. Then, repeating k times, find the optimal vertical seam, save it in a list of seams, and remove it. Finally, using the original image, duplicate the saved seams in order: for each seam, average the pixels to its left and right and insert the averaged seam next to it.

Enlarging the image vertically means duplicating vertical seams and enlarging the width dimension.

Enlarging the image horizontally means duplicating horizontal seams and enlarging the height dimension.
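The seam insertion steps described above can be sketched for a grayscale image as follows. This is an illustration under stated assumptions, not the original code: it tracks each removed pixel's original column with an index map (col_index) so that saved seams can be replayed on the original image, and the names energy_map, find_vertical_seam, and enlarge_width are hypothetical.

```python
import numpy as np

def energy_map(gray):
    dy, dx = np.gradient(gray)
    return np.abs(dx) + np.abs(dy)

def find_vertical_seam(energy):
    rows, cols = energy.shape
    M = energy.astype(np.float64)
    for r in range(1, rows):
        p = np.pad(M[r - 1], 1, constant_values=np.inf)
        M[r] += np.minimum(np.minimum(p[:-2], p[1:-1]), p[2:])
    seam = np.empty(rows, dtype=int)
    seam[-1] = int(np.argmin(M[-1]))
    for r in range(rows - 2, -1, -1):
        c = seam[r + 1]
        lo, hi = max(c - 1, 0), min(c + 2, cols)
        seam[r] = lo + int(np.argmin(M[r, lo:hi]))
    return seam

def enlarge_width(gray, k):
    """Widen a 2-D grayscale image by k columns via seam insertion."""
    gray = np.asarray(gray, dtype=np.float64)
    rows, cols = gray.shape
    if not 0 <= k < cols:
        raise ValueError("k must satisfy 0 <= k < image width")
    work = gray.copy()
    # col_index[r, c]: original column of the pixel now at (r, c) in `work`.
    col_index = np.tile(np.arange(cols), (rows, 1))
    duplicate = np.zeros((rows, cols), dtype=bool)
    r_idx = np.arange(rows)
    for _ in range(k):
        seam = find_vertical_seam(energy_map(work))
        duplicate[r_idx, col_index[r_idx, seam]] = True  # save the seam
        keep = np.ones(work.shape, dtype=bool)
        keep[r_idx, seam] = False                        # then remove it
        work = work[keep].reshape(rows, -1)
        col_index = col_index[keep].reshape(rows, -1)
    out = np.empty((rows, cols + k))
    for r in range(rows):
        new_row = []
        for c in range(cols):
            new_row.append(gray[r, c])
            if duplicate[r, c]:
                # Inserted pixel: average of the seam pixel's neighbors.
                left = gray[r, max(c - 1, 0)]
                right = gray[r, min(c + 1, cols - 1)]
                new_row.append((left + right) / 2.0)
        out[r] = new_row
    return out
```

Finding all k seams before inserting any of them matters: inserting one seam at a time would tend to pick the same low-energy seam repeatedly and produce a visible stretch.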

Donner Lake

Enlarged Vertically by 100 pixels

Trail

Enlarged Vertically by 50 pixels

Trail

Enlarged Horizontally by 100 pixels

Berkeley

Enlarged Vertically by 100 pixels

As you can see, seam insertion does pretty well and does not leave any stretching artifacts, because we first find the k optimal vertical seams before duplicating them.

However, where the optimal seams are close to highly structured objects or patterns, seam insertion does not do well because it distorts them. This can be seen with the tower in Park Guell in Barcelona, Spain, and with the ripples and rocks at Donner Lake. The vertically enlarged image of the lake was more successful than the horizontally enlarged image of the lake.

Park Guell

Enlarged Vertically by 100 pixels

Donner Lake

Enlarged Horizontally by 100 pixels

What have you learned?

I learned that content-aware resizing can produce better results than simply rescaling a photo. However, it has drawbacks and is definitely not a replacement for choosing the desired aspect ratio when the image is taken. This is because the optimal seams are not guaranteed to avoid highly structured objects or patterns in the image. This can be seen in my failed results, and it took me some time to find images on which seam carving would be successful.

Light Field Camera Project

In this project, I reimplemented depth refocusing and aperture adjustment from the paper Light Field Photography with a Hand-held Plenoptic Camera by Ren Ng et al. I used a rectified dataset from the Stanford Light Field Archive containing 289 views of a chess board on a 17 by 17 grid.

Depth Refocusing

Depth refocusing relies on the fact that objects closer to the camera vary in position across images more than objects farther from the camera do. Therefore, when you average all of the images together without shifting them, the objects closer to the camera will appear out of focus and the objects farther from the camera will appear in focus.

In order to focus on different parts of the chess board, I shifted each of the 289 view images based on the view's location in the 17 by 17 grid. Then, I chose an amount to scale each shift in order to focus on a specific point. Here are examples of different amounts used to scale the shift. I found these values experimentally.
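The shift-and-average idea can be sketched as below. This is a simplified illustration: it assumes the views are stacked in an array indexed by grid position, uses whole-pixel shifts with np.roll (which wraps around at the borders), and measures each view's offset from the grid center. The function name refocus, the array layout, and the sign convention are all assumptions; a real dataset may need different axes or sub-pixel shifts.

```python
import numpy as np

def refocus(views, amount):
    """Average all views after shifting each one by `amount` times its
    offset from the center of the grid.

    `views` has shape (rows, cols, H, W) or (rows, cols, H, W, C).
    """
    views = np.asarray(views, dtype=np.float64)
    rows, cols = views.shape[:2]
    center_r, center_c = (rows - 1) / 2.0, (cols - 1) / 2.0
    total = np.zeros(views.shape[2:])
    for r in range(rows):
        for c in range(cols):
            # Whole-pixel shift proportional to the view's grid offset.
            dy = int(round(amount * (r - center_r)))
            dx = int(round(amount * (c - center_c)))
            total += np.roll(views[r, c], shift=(dy, dx), axis=(0, 1))
    return total / (rows * cols)
```

With amount = 0 this reduces to a plain average, which keeps only the far-away content sharp; changing amount moves the plane that lines up across views and therefore appears in focus.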

amount = -1.2, focused in the center

amount = -2.4, focused in the front

amount = range(0, -3, -.3)

Aperture Adjustment

In a camera, the aperture is the opening through which light enters, and it can be made larger or smaller. Therefore, averaging all the view images within a large radius of the center of the 17 by 17 grid imitates a camera with a large aperture, and averaging within a small radius imitates a small aperture. This is how I implemented aperture adjustment.

I wanted to focus on the center piece of the chessboard in order to see the aperture adjustment more clearly. With the center in focus, an image with a large aperture appears blurry in the back and in the front. I shifted all the images scaled by amount = -1.5 because that is what I found to be the best amount when depth refocusing. Then I included an image in the average only if it was inside a specified radius.
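A sketch of the radius-limited average follows. It assumes the same stacked array layout and whole-pixel np.roll shifts as a simple refocusing routine, and it assumes the radius is a Euclidean distance in grid units from the center view; the write-up does not state which distance was used. The function name aperture_average is hypothetical.

```python
import numpy as np

def aperture_average(views, radius, amount=0.0):
    """Average only the views whose grid position lies within `radius`
    of the grid center, after the usual refocusing shift.

    `views` has shape (rows, cols, H, W) or (rows, cols, H, W, C).
    """
    views = np.asarray(views, dtype=np.float64)
    rows, cols = views.shape[:2]
    center_r, center_c = (rows - 1) / 2.0, (cols - 1) / 2.0
    total = np.zeros(views.shape[2:])
    count = 0
    for r in range(rows):
        for c in range(cols):
            # Assumed Euclidean distance in grid units.
            if np.hypot(r - center_r, c - center_c) > radius:
                continue  # outside the simulated aperture
            dy = int(round(amount * (r - center_r)))
            dx = int(round(amount * (c - center_c)))
            total += np.roll(views[r, c], shift=(dy, dx), axis=(0, 1))
            count += 1
    if count == 0:
        raise ValueError("radius too small: no views selected")
    return total / count
```

On an odd-sized grid, radius = 0 keeps only the center view (everything sharp, like a pinhole), and growing the radius blends in more views, which blurs whatever is not on the focal plane.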

radius = 0

radius = 5

radius = 10

radius = range(0, 12, 1)

Bells and Whistles

I collected my own data using a 5 by 5 grid. To minimize noise, I tried to make sure the top of the banana was in the same location in each picture. I still expected there to be imperfections due to inaccurate locations, rotations in my image-taking, distracting background, and non-correspondence between the 25 images.

Here are my images in the 5 x 5 grid.

For depth refocusing, the amounts I used to scale each shift were range(0, -10, -1). For aperture adjustment, the amount I used to scale the shift for all images was 0, and the radii were range(0, 4, .4).

Depth Refocusing

Aperture Adjustment

As you can see, it did not work as well as the chess dataset from the Stanford Light Field Archive, due to the inaccurate positioning of my camera while taking the pictures and the non-correspondence between the images.

In the depth refocusing, the bananas, which are the objects farther away, do get more and more unfocused. However, they never started out in focus because of the inaccuracies of the image-taking. In the aperture adjustment, you can see that the wall behind the bananas and the fruit in front do become blurrier as the aperture gets larger. However, due to the inaccuracies from image-taking, averages using a small radius are very noisy and contain a lot of ghosting.

Summary

I gained a deeper understanding of depth and aperture while working on this project. I learned how multiple images taken on a regular grid can recreate camera effects such as refocusing and aperture changes after the fact.