Yiwen Chen

Final Project 1: Reimplement: A Neural Algorithm of Artistic Style

Overview

In this project, I used the pretrained VGG19 network suggested in the paper, which has 16 convolutional layers and 5 pooling layers. In addition, I used the L-BFGS optimizer to minimize the overall loss. At each iteration, I project all pixel values in the training image back to the range [0, 1].
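The training loop looks roughly like the following PyTorch sketch (here load_image, content_loss, style_loss, the weights, and num_iterations are hypothetical placeholders rather than my exact code):

import torch
from torchvision.models import vgg19

features = vgg19(pretrained=True).features.eval()   # pretrained VGG19 feature extractor
for p in features.parameters():
    p.requires_grad_(False)

content_image = load_image("content.jpg")           # hypothetical loader, (1, 3, H, W) tensor in [0, 1]
x = content_image.clone().requires_grad_(True)      # or torch.rand_like(content_image) for white noise
optimizer = torch.optim.LBFGS([x])

def closure():
    optimizer.zero_grad()
    # content_loss / style_loss would compare features of x against
    # precomputed targets (placeholders here)
    loss = content_weight * content_loss(x) + style_weight * style_loss(x)
    loss.backward()
    return loss

for _ in range(num_iterations):
    optimizer.step(closure)
    with torch.no_grad():
        x.clamp_(0, 1)    # project pixels back to [0, 1] after each iteration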

Effect of initialization

The paper starts with a white noise image. However, we found that starting with the content target leads to faster convergence and generally better results, since it gives the optimizer a decent starting point. Although both the style loss and the content loss are convex with respect to the feature responses F, they are non-convex with respect to the pixel values of the training image x, due to the multiple layers of convolution and max pooling. Thus, by starting with the content target image, we are less likely to get trapped in a bad local optimum. Another advantage of initializing with the content target is that one can adjust the extent of the style transfer simply by changing the number of iterations, while still preserving the content. Here is a comparison of initializing with the content target and with white noise.

Start with content target image    Start with white noise image

Style loss weight to content loss weight ratio

When the weight of the style loss is relatively low, we prioritize content over style, and as a result we preserve the original structure of the content. Conversely, if we set the weight of the style loss to a large value, we may lose that structure in order to prioritize style. By choosing the weights we try to strike a balance: preserve the structure we want while capturing as much style information as possible.

When we initialize with the content target image, the optimization mainly focuses on minimizing the style loss, since the content loss is already very small. Thus, in this case, the weight ratio does not make much difference. However, when we start with a white noise image, although the paper suggests a style-to-content loss weight ratio of 1000, we found that a smaller value like 10 works better in our example.
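For reference, here is a minimal sketch of how the ratio enters the objective. The style loss is built from Gram matrices of the feature responses at the chosen layers; gram_matrix and total_loss below are illustrative helpers, not my exact code:

import torch

def gram_matrix(feat):
    # feat: a (1, C, H, W) feature response from one VGG19 layer
    _, c, h, w = feat.shape
    f = feat.view(c, h * w)
    return (f @ f.t()) / (c * h * w)

def total_loss(content_loss, style_loss, ratio=10.0):
    # Only the style-to-content ratio matters, not the absolute scale;
    # the paper suggests 1000, but 10 worked better in our white noise runs.
    return content_loss + ratio * style_loss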

Style-to-content loss weight ratio of 5    Style-to-content loss weight ratio of 10    Style-to-content loss weight ratio of 100

Results

Transferring the three styles in the paper:
I got the following results by initializing with the content target image and selecting layer conv4-2 for the content representation and layers conv1-1, conv2-1, conv3-1, conv4-1, and conv5-1 for the style representation.
Successful case 1
Successful case 2
Failure case: unlike in the paper, starting with white noise and selecting layer conv4-2 for content matching does not produce a nice result; the content information is not preserved as well as with lower layers like conv2-2.

Using conv4-2 for content matching    Using conv2-2 for content matching

Final Project 2: Image Quilting

Randomly Sampled Texture

Overlapping Patches

Compared with randomly sampled patches, neighbouring patches in the overlapping-patch results are more similar to each other, and there are almost no direct junctions between two patches of contrasting colors. However, the seams connecting patches are still obvious.
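For context, patch selection with an overlap constraint can be sketched as follows. pick_patch is a hypothetical helper (assuming grayscale images), not my exact code: it scores every candidate patch by its SSD against the already-synthesized overlap region and samples randomly among the near-best ones.

import numpy as np

def pick_patch(texture, template, mask, patch_size, tol=0.1):
    # template: patch-sized window of the output so far; mask is 1 on the
    # already-filled overlap region and 0 elsewhere.
    h, w = texture.shape
    costs = np.zeros((h - patch_size, w - patch_size))
    for i in range(h - patch_size):
        for j in range(w - patch_size):
            patch = texture[i:i + patch_size, j:j + patch_size]
            costs[i, j] = np.sum(mask * (patch - template) ** 2)
    ys, xs = np.where(costs <= costs.min() * (1 + tol))   # near-optimal candidates
    k = np.random.randint(len(ys))
    return texture[ys[k]:ys[k] + patch_size, xs[k]:xs[k] + patch_size]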

Seam Finding Texture Synthesis

For seam finding, I calculated the min-cost path through the overlap error surface using dynamic programming. Then I created a mask, setting the left part of the cut to 1 and the right part to 0. The mask is later used to stitch in the next patch.

The min-cost path between the two patches (neighbouring patch, the patch we select). The overlap width is 10.

Compared with simply overlapping patches, the seams between patches are less obvious.

More examples of seam finding:

Texture Transfer

Transfer guide:
Results:

Although the person's face does not show clearly, we can see that the layout of dark and light pixels roughly matches the guide image.
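The only change from plain quilting is the patch cost: a correspondence term that matches the patch to the guide image is added to the usual overlap term. A minimal sketch, assuming grayscale arrays (transfer_cost and the alpha value are illustrative, not my exact code):

import numpy as np

def transfer_cost(patch, template, mask, guide_patch, alpha=0.5):
    overlap_err = np.sum(mask * (patch - template) ** 2)   # agree with what is already placed
    guide_err = np.sum((patch - guide_patch) ** 2)         # agree with the guide image
    return alpha * overlap_err + (1 - alpha) * guide_err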

Bells & Whistles: 1. Cut Function

My cut function in Python, which uses dynamic programming:

import numpy as np

def cut(overlap):
    """Find the min-cost seam through an overlap error surface by dynamic
    programming and return a binary mask split along that seam."""
    H, W = overlap.shape
    # Pad with huge-cost sentinel rows so the path stays inside the overlap.
    overlap = np.vstack([np.full((1, W), 1e10), overlap, np.full((1, W), 1e10)])
    path = np.zeros(overlap.shape, dtype=int)   # backpointer: predecessor row
    cost = np.zeros(overlap.shape)              # accumulated min cost
    cost[:, 0] = overlap[:, 0]
    cost[0, :] = 1e10
    cost[-1, :] = 1e10
    for col in range(1, W):
        # Each interior row may continue from the row above, the same row,
        # or the row below in the previous column.
        candidates = np.stack([cost[:-2, col - 1],
                               cost[1:-1, col - 1],
                               cost[2:, col - 1]], axis=1)
        previous = np.argmin(candidates, axis=1)
        path[1:-1, col] = np.arange(1, H + 1) + previous - 1
        cost[1:-1, col] = np.min(candidates, axis=1) + overlap[1:-1, col]
    # Drop the sentinel rows and convert backpointers to unpadded indices.
    path = path[1:-1, :] - 1
    cost = cost[1:-1, :]
    # Backtrack from the cheapest endpoint in the last column, marking one
    # side of the seam with 1 in each column.
    mask = np.zeros(cost.shape)
    optimal_path = np.zeros(W, dtype=int)
    optimal_path[-1] = np.argmin(cost[:, -1])
    mask[optimal_path[-1]:, -1] = 1
    for i in range(W - 2, -1, -1):
        optimal_path[i] = path[optimal_path[i + 1], i + 1]
        mask[optimal_path[i]:, i] = 1
    return mask
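A hypothetical usage sketch: to stitch a newly selected patch next to an already-placed one along a vertical overlap, compute the squared error over the overlap strip, transpose it so the seam runs top to bottom, and blend with the returned mask. Here left, right, and overlap_width are placeholders, and in this transposed orientation the mask ends up 1 on the new patch's side of the seam:

overlap_width = 10
err = (left[:, -overlap_width:] - right[:, :overlap_width]) ** 2
mask = cut(err.T).T            # transpose so the seam runs vertically
blended = (1 - mask) * left[:, -overlap_width:] + mask * right[:, :overlap_width]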

Bells & Whistles: 2. Iterative Texture Transfer

First iteration (same as the vanilla texture transfer above)

Second iteration    Third iteration

Another example:

First iteration (same as the vanilla texture transfer above)

Second iteration    Third iteration

Compared with simple texture transfer, which can only roughly lay out the dark and light colors, iterative texture transfer lets us reduce the patch size every iteration to render the details of the picture. However, iterative texture transfer still fails when there is not much color contrast in the texture image.
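A minimal sketch of the schedule, following Efros & Freeman's suggestion of raising alpha while shrinking the patch size on each pass (synthesize, texture, and guide are hypothetical placeholders for one quilting pass and its inputs):

n_iters = 3
patch_size = 36
result = None
for i in range(n_iters):
    alpha = 0.8 * i / (n_iters - 1) + 0.1      # weight on matching the previous result
    result = synthesize(texture, guide, result, patch_size, alpha)
    patch_size = max(6, patch_size * 2 // 3)   # reduce the patch size by about a third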
