Poor Man’s Augmented Reality and A Neural Algorithm of Artistic Style
Poor Man’s Augmented Reality
Keypoints with known 3D world coordinates
The true 3D world coordinates of the points are:
[[6,6,0],[3,6,0],[0,6,0],[6,3,0],[3,3,0],[0,3,0],[6,0,0],[3,0,0],[0,0,0],[6,-3,-2.3],[3,-3,-2.3],[0,-3,-2.3],[6,-3,-4.6],[3,-3,-4.6],[0,-3,-4.6],[6,-3,-6.9],[3,-3,-6.9],[0,-3,-6.9]]
Tracking
A Hacky Corner Detector
First, I load the manually selected points of the first frame. Then I use the provided get_harris_corners(), but replace peak_local_max with corner_peaks. After detecting corners in each frame, I compute the distance between each tracked point and every detected corner. For each tracked point, I take the nearest detected corner; if its distance is d < 30, it is accepted as the corresponding point. I keep a list to track and update every point across frames.
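The nearest-neighbor gating step can be sketched like this (a minimal NumPy version; `match_points` and `max_dist` are my own illustrative names, not from the project code):

```python
import numpy as np

def match_points(prev_pts, detected_pts, max_dist=30.0):
    """For each previously tracked point, find the nearest newly
    detected corner; accept it only if it is closer than max_dist.
    Returns the updated points and a boolean mask of successes."""
    updated = prev_pts.astype(float).copy()
    ok = np.zeros(len(prev_pts), dtype=bool)
    for i, p in enumerate(prev_pts):
        d = np.linalg.norm(detected_pts - p, axis=1)  # distance to every corner
        j = np.argmin(d)
        if d[j] < max_dist:          # the d < 30 threshold from the text
            updated[i] = detected_pts[j]
            ok[i] = True
    return updated, ok

# toy example: one point drifts slightly, the other has no corner nearby
prev = np.array([[10.0, 10.0], [100.0, 100.0]])
detected = np.array([[12.0, 11.0], [400.0, 400.0]])
new, ok = match_points(prev, detected)
```

Points whose mask entry is False simply keep their last position, so a point lost in one frame can be picked up again later.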
Click the picture to see the video~
Off the Shelf Tracker
First, I initialize a list of independent MedianFlow trackers, one per manually selected point, each on an 8×8 patch around the point. To ensure the points are tracked correctly, I keep a list recording every point's last successful position and compute the distance between the current detection and that last successful position. If d < 30, the detection is accepted as a successful point.
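The per-point gating against the last successful position can be sketched as follows (NumPy only; the per-point trackers themselves would come from OpenCV, e.g. a MedianFlow tracker initialized on an 8×8 box around each point, with the exact constructor name depending on the OpenCV version):

```python
import numpy as np

def gate_update(last_ok_pts, current_pts, max_dist=30.0):
    """Accept each tracker's current position only if it is within
    max_dist of that point's last successful position; otherwise keep
    the old position and flag the point as lost for this frame."""
    last_ok_pts = np.asarray(last_ok_pts, dtype=float)
    current_pts = np.asarray(current_pts, dtype=float)
    d = np.linalg.norm(current_pts - last_ok_pts, axis=1)
    ok = d < max_dist
    # keep the last successful position wherever the jump is too large
    accepted = np.where(ok[:, None], current_pts, last_ok_pts)
    return accepted, ok

# toy example: first tracker moves a little, second one jumps away
last = np.array([[0.0, 0.0], [50.0, 50.0]])
cur = np.array([[5.0, 5.0], [200.0, 200.0]])
accepted, ok = gate_update(last, cur)
```

Only the rows flagged True are written back into the "last successful position" list, so a single bad tracker update cannot corrupt the reference for later frames.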
Calibrating the camera
I use the mask to decide which points were successfully detected, then use the corresponding 3D world coordinates and detected 2D image coordinates to compute the projection matrix by least squares.
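One common way to set up this least-squares problem is the direct linear transform (DLT): stack two equations per correspondence into a homogeneous system and take the smallest right singular vector. This is a sketch of that formulation, not necessarily the exact solver used in the project:

```python
import numpy as np

def compute_projection(world_pts, img_pts):
    """Estimate the 3x4 camera projection matrix P from 3D-2D
    correspondences via homogeneous least squares (DLT): build the
    2n x 12 system A p = 0 and take the right singular vector of A
    with the smallest singular value."""
    A = []
    for (X, Y, Z), (u, v) in zip(world_pts, img_pts):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    P = Vt[-1].reshape(3, 4)
    return P / np.linalg.norm(P)  # P is only defined up to scale

def project(P, pts3d):
    """Apply P to homogeneous 3D points and divide by w."""
    pts = np.hstack([pts3d, np.ones((len(pts3d), 1))])
    proj = pts @ P.T
    return proj[:, :2] / proj[:, 2:3]
```

With at least six well-spread, non-coplanar correspondences (the grid points above qualify), the recovered P reprojects the calibration points essentially exactly.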
Projecting a cube in the Scene
After computing the projection matrix, I compute the 2D coordinates of my cube. The world coordinates of the cube's corners are [3,6,3], [0,6,3], [3,3,3], [0,3,3], [3,6,0], [0,6,0], [0,3,0], [3,3,0]. I then draw lines between these points to render a cube in the frames.
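Projecting the corners and collecting the edge endpoints can be sketched like this (the edge index pairs are my own bookkeeping for these eight corners; the actual line drawing would use something like cv2.line on each segment):

```python
import numpy as np

# cube corners in world coordinates (top face first, then bottom face)
CUBE = np.array([
    [3, 6, 3], [0, 6, 3], [3, 3, 3], [0, 3, 3],   # top
    [3, 6, 0], [0, 6, 0], [0, 3, 0], [3, 3, 0],   # bottom
], dtype=float)

# the 12 edges of the cube as index pairs into CUBE
EDGES = [(0, 1), (0, 2), (1, 3), (2, 3),          # top face
         (4, 5), (5, 6), (6, 7), (7, 4),          # bottom face
         (0, 4), (1, 5), (2, 7), (3, 6)]          # vertical edges

def cube_segments(P):
    """Project the cube corners with the 3x4 matrix P and return the
    2D endpoints of each edge, ready to be drawn on the frame."""
    homo = np.hstack([CUBE, np.ones((8, 1))])
    proj = homo @ P.T
    pts2d = proj[:, :2] / proj[:, 2:3]      # divide out w
    return [(pts2d[i], pts2d[j]) for i, j in EDGES]
```

Each frame's own projection matrix is used, so the cube stays glued to the box as the camera moves.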
Output
A Neural Algorithm of Artistic Style
Visualization of the model
I used L-BFGS as the optimizer; the ratio α/β I used is 10^(-3), and the iteration count is 1500.

In the paper, the content representation is matched on layer 'conv4_2'. In my implementation, I found that matching on layer 'conv5_2' gives better output.

Also, in the paper the style representation is matched on layers 'conv1_1', 'conv2_1', 'conv3_1', 'conv4_1' and 'conv5_1', but I matched it on 'conv1_2', 'conv2_2', 'conv3_2', 'conv4_2' and 'conv5_2'. In principle this should make little difference, but I found it produces better output than the original layers.

At first, I found that Adam runs faster, but L-BFGS indeed performs better.
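The heart of the style matching is the Gram matrix of each layer's feature map. This is a minimal NumPy sketch of the Gram matrix and the per-layer style loss from Gatys et al.; the actual implementation would compute it on VGG features in a deep-learning framework:

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a feature map of shape (C, H, W): the C x C
    matrix of inner products between flattened channels. The style
    loss compares these between the style and generated images."""
    C, H, W = features.shape
    F = features.reshape(C, H * W)
    return F @ F.T

def style_layer_loss(gen_feat, style_feat):
    """Squared error between Gram matrices with the 1/(4 N^2 M^2)
    normalization from the paper (N = channels, M = spatial size)."""
    C, H, W = gen_feat.shape
    G, A = gram_matrix(gen_feat), gram_matrix(style_feat)
    return np.sum((G - A) ** 2) / (4.0 * C**2 * (H * W) ** 2)
```

The total objective is α times the content loss plus β times the sum of these per-layer style losses, which is where the α/β = 10^(-3) ratio above enters.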
Output of Neckarfront
my output | paper’s output |
---|---|
Output of my choice
style images and content images
style img | content img |
---|---|
output
Failure case: The input content image:
The input style image:
What I expect is something like this:
But I got this output:
I think the main reason is this: unlike other paintings or photographs, this artwork consists of flat color blocks with little detail or brush stroke in it. With so many identical pixels, it is hard for the method to generate the texture. So I think this style-transfer method works well for detailed pictures, but not for this kind of animation drawing.