Yiwen Chen
In this project, I used the pretrained VGG19 network suggested in the paper, which has 16 convolutional layers and 5 pooling layers. In addition, I used the L-BFGS optimizer to minimize the overall loss. At each iteration, I clamp all pixel values in the training image to the range [0, 1].
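A minimal sketch of this setup, assuming PyTorch and torchvision; content_img, content_loss, style_loss, style_weight and num_iterations are placeholders rather than my exact code:

import torch
import torchvision.models as models

device = "cuda" if torch.cuda.is_available() else "cpu"

# Pretrained VGG19; only the convolutional feature extractor is needed.
vgg = models.vgg19(pretrained=True).features.to(device).eval()
for p in vgg.parameters():
    p.requires_grad_(False)

# The pixels of the training image are the parameters being optimized.
# Initialize from the content image (white noise is the alternative used in the paper).
x = content_img.clone().requires_grad_(True)
optimizer = torch.optim.LBFGS([x])

def closure():
    optimizer.zero_grad()
    loss = content_loss(x) + style_weight * style_loss(x)
    loss.backward()
    return loss

for _ in range(num_iterations):
    optimizer.step(closure)
    # Project pixel values back into [0, 1] after each step.
    with torch.no_grad():
        x.clamp_(0, 1)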
The paper starts with a white noise image. However, we found that starting with the content target leads to faster convergence and generally better results, since it gives the optimizer a decent starting point. Although the content loss is convex with respect to the feature responses F (and the style loss with respect to the Gram matrices), both losses are non-convex with respect to the pixel values of the training image x because of the multiple layers of convolution and max pooling. Thus, by starting with the content target image, we are less likely to be trapped in a poor local optimum. Another advantage of initializing with the content target is that one can adjust the extent of style transfer simply by changing the number of iterations while still preserving the content. Here is a comparison of initializing with white noise and with the content target.
When the weight of the style loss is relatively low, we prioritize content over style, so we are able to preserve the original structure of the content. On the contrary, if we set the weight of the style loss to a large value, we may lose that structure in order to prioritize style. By choosing the weights we try to strike a balance: preserve the structure we want while capturing as much style information as possible.
When we initialize with the content target image, the optimization mainly focuses on minimizing the style loss, since the content loss is already very small. Thus, in this case, the weight ratio does not make much difference. However, when we start with a white noise image, although the paper suggests a style-to-content loss weight ratio of 1000, we found that a smaller value such as 10 works better in our example.
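In code, this ratio is simply the relative weight of the two terms in the combined loss; a sketch with illustrative values:

# Relative weighting of the two losses; the ratio style_weight / content_weight
# is what matters (1000 as suggested in the paper, ~10 in our white-noise example).
content_weight = 1.0
style_weight = 10.0
total_loss = content_weight * content_loss(x) + style_weight * style_loss(x)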
Compared with randomly sampled patches, neighbouring patches in the results produced with overlapping patches are more similar to each other, and two patches of strongly contrasting color are almost never placed directly next to each other. However, the connecting seams are still obvious.
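For context, a candidate patch in the overlapping version is typically scored by the SSD error over the strip where it overlaps the pixels already placed; a rough sketch, assuming grayscale numpy arrays and an illustrative helper name:

import numpy as np

def overlap_cost(existing_strip, candidate_strip):
    # SSD between the already-placed pixels and the candidate patch,
    # computed only over the overlapping region; low-cost candidates
    # are preferred when choosing the next patch.
    diff = existing_strip.astype(float) - candidate_strip.astype(float)
    return np.sum(diff ** 2)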
For seam finding, I calculated the minimum-cost path through the overlap error using dynamic programming. I then created a mask, setting the part to the left of the cut to 1 and the part to the right to 0. The mask is later used to composite the next patch with the existing output.
Compared with simple overlapping patches, the seams between patches are much less obvious.
More examples of seam finding:
Although the person's face does not show clearly, we can see that the layout of dark and light pixels roughly matches the guide image.
My cut function in Python, which uses dynamic programming:
import numpy as np

def cut(overlap):
    # overlap: 2D array of per-pixel overlap errors between the existing
    # output and the new patch. Returns a binary mask that is 1 on one side
    # of the minimum-cost seam and 0 on the other.
    h, w = overlap.shape
    big = 1e10

    # Pad with a row of huge costs above and below so the seam stays inside
    # the valid region.
    overlap = np.vstack([np.full((1, w), big), overlap, np.full((1, w), big)])

    path = np.zeros_like(overlap)
    cost = np.zeros_like(overlap)
    cost[:, 0] = overlap[:, 0]
    cost[0, :] = big
    cost[-1, :] = big

    # Dynamic programming: each pixel can be reached from the three
    # neighbouring rows of the previous column.
    for col in range(1, w):
        candidates = np.zeros((h, 3))
        candidates[:, 0] = cost[0:h, col - 1]      # row above
        candidates[:, 1] = cost[1:h + 1, col - 1]  # same row
        candidates[:, 2] = cost[2:h + 2, col - 1]  # row below
        previous = np.argmin(candidates, axis=1)
        path[1:h + 1, col] = np.arange(1, h + 1) + previous - 1
        cost[1:h + 1, col] = np.min(candidates, axis=1) + overlap[1:h + 1, col]

    # Drop the padding rows and shift the stored row indices back.
    path = path[1:h + 1, :] - 1
    cost = cost[1:h + 1, :]

    # Backtrack from the cheapest endpoint in the last column and fill the mask.
    mask = np.zeros_like(cost)
    optimal_path = np.zeros(w, dtype=int)
    optimal_path[-1] = np.argmin(cost[:, -1])
    mask[optimal_path[-1]:, -1] = 1
    for i in range(w - 2, -1, -1):
        optimal_path[i] = path[optimal_path[i + 1], i + 1]
        mask[optimal_path[i]:, i] = 1
    return mask
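A sketch of how the returned mask can be used when placing a patch; existing_strip and patch_strip are hypothetical grayscale arrays holding the two versions of the overlap region:

# Hypothetical usage: build the overlap error, cut along the cheapest seam,
# and composite the two versions on either side of the seam.
overlap_error = (existing_strip - patch_strip) ** 2
mask = cut(overlap_error)
# Take one image where mask == 1 and the other where mask == 0
# (swap the two strips if the overlap is oriented the other way).
blended = mask * existing_strip + (1 - mask) * patch_strip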
First iteration, same as the vanilla texture transfer above
Another example:
First iteration, same as the vanilla texture transfer above
Compared with simple texture transfer, which can only roughly lay out the dark and light regions, iterative texture transfer lets us reduce the patch size every iteration to render the details of the picture. However, iterative texture transfer still fails if there is not much color contrast in the texture image.
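A rough sketch of the iterative loop, assuming a transfer(texture_img, guide_img, patch_size, prior) helper that performs one patch-based texture transfer pass; the helper name, shrink factor, and number of passes are illustrative:

# Each pass reuses the previous result and a smaller patch size,
# so finer details of the guide image emerge over the iterations.
result = None
patch_size = 36
for i in range(3):
    result = transfer(texture_img, guide_img, patch_size, prior=result)
    patch_size = max(6, int(patch_size * 2 / 3))  # shrink patches each pass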