Infinite GANoramas

Daniel Geng

We present a generative method to create “infinite” panoramas from large datasets of images. Our work builds upon previous work in which images from a large database are transformed and then matched based on an error along the edges of the images. Two matched images are then combined by cutting along a minimum energy cut and blended using Poisson blending. Our work differs in its use of a Generative Adversarial Network (GAN) and a trained “realism” scorer. The GAN lets us use not just images from the dataset, but also images drawn from the distribution of the dataset. This flexibility allows more convincing photos. Instead of minimizing the squared sum of errors along a minimum energy cut to pair photos, we use a learned “similarity network” to find semantically close images to put in our panorama. This further increases the realism of our panoramas. Finally, we can use this similarity network to reduce the problem of creating a panorama to finding the shortest path between two points in a graph.

[ Code ]

Background

Previous methods for generating infinite panoramas from large datasets sampled images solely from the data. Images were matched by the lowest error between their GIST descriptors, then aligned using horizon line estimation and blended along a minimum energy cut. The panoramas produced were good but suffered from inconsistencies in global scale and perspective. In addition, the best matching image might not always work well in a panorama, especially when working with a smaller dataset.

Method

1. GAN

We implement an infinite panorama generator, but instead of just sampling from the dataset, we train a Generative Adversarial Network to model the distribution of the dataset and draw images from that distribution. Having just a GAN generate panoramas is a ridiculously underconstrained problem, so we condition the GAN on real images, enforcing that the left side and the right side of a generated image should look exactly like images from the dataset. To build a panorama we start with an image and repeatedly choose a new image. The GAN then “inpaints” or “connects” the first image to the second image. In the end, our panorama is effectively made up of alternating stripes of real and hallucinated images. The GAN we use is an adapted version of pix2pix.

We take two images from our dataset and concatenate their left and right thirds together. We then ask a GAN to complete, or inpaint, the middle third.
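As a rough illustration of how such a conditioning input could be assembled, here is a minimal NumPy sketch. The function name `make_conditioning_input`, the zero-filled middle third, and the returned mask are assumptions made for illustration; the text above does not specify which image contributes which third, so this sketch takes the left third from the first image and the right third from the second.

```python
import numpy as np

def make_conditioning_input(left_img, right_img):
    """Build a generator input from two equally sized H x W x C images.

    Assumption: the left third comes from `left_img`, the right third from
    `right_img`, and the middle third is zeroed out for the GAN to fill in.
    """
    if left_img.shape != right_img.shape:
        raise ValueError("images must have the same shape")
    h, w, _ = left_img.shape
    third = w // 3

    canvas = np.zeros_like(left_img)
    canvas[:, :third] = left_img[:, :third]           # known left stripe
    canvas[:, w - third:] = right_img[:, w - third:]  # known right stripe

    # Boolean mask marking the columns the generator is asked to inpaint.
    mask = np.zeros((h, w), dtype=bool)
    mask[:, third:w - third] = True
    return canvas, mask
```

The mask is not required by the description above; it is included only because inpainting setups often pass it to the generator or use it to restrict the reconstruction loss to the known stripes.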

2. Realism Scorer

In addition to a generator, we would also like a system that automatically chooses photos to place in the panorama. Essentially, we would like a discriminator that gives us the probability that a generated photo is realistic. Note that in training a GAN we have already trained a discriminator to tell fake images from real ones. However, our experiments show that just using the discriminator from the GAN training is not enough. The discriminator’s predictions are intimately tied to the generator’s output. That is, the discriminator is more of a generated image detector than a real image detector.

To mitigate this effect to some extent, we fine-tune the discriminator on a set of generators saved throughout the GAN training process. This effectively creates an image pool and forces the discriminator to use features of realistic photos, rather than features of a particular generator’s output, to classify images as real or fake.
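The sketch below shows one way a fine-tuning batch could be drawn from several generator snapshots. It is not our training code: `build_finetune_batch`, its arguments, and the 1/0 labeling are hypothetical, and generators are modeled as plain callables so the example stays self-contained.

```python
import random

def build_finetune_batch(real_images, conditioning_inputs,
                         generator_snapshots, rng=random):
    """Return shuffled (image, label) pairs: 1 for real, 0 for generated.

    Each fake is produced by a randomly chosen snapshot, so the artifacts
    of any single generator do not dominate the fake class.
    """
    batch = [(img, 1) for img in real_images]
    for cond in conditioning_inputs:
        generator = rng.choice(generator_snapshots)
        batch.append((generator(cond), 0))
    rng.shuffle(batch)
    return batch
```

In practice the snapshots would be generator checkpoints saved at different points in training, and the resulting batches would feed a standard binary classification loss for the discriminator.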

3. Image Taxi (tentative…)

Borrowing from Biliana et al., we can also construct a graph with images as nodes. Each pair of nodes $x_i$ and $x_j$ should be connected by an edge of weight $-\log p_{ij}$, where $p_{ij}$ is the probability the realism scorer $D$ assigns to the generator’s output $G(x_i, x_j)$ given $x_i$ and $x_j$, so $p_{ij} = D(G(x_i, x_j))$. Because the weights are negative log probabilities, the length of a path is the negative log of the product of the probabilities along it. By finding a shortest path between two specified points, we therefore find a panorama that has the maximum probability of being a real image according to the realism scorer. This is of course under the assumption that images in a panorama are independent of each other, which, to be quite honest, is not that great an assumption.
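The shortest-path idea can be sketched with Dijkstra's algorithm, assuming edge weights of negative log probability so that the shortest path maximizes the product of realism scores. The function name `most_realistic_path` and the nested-dictionary layout of `prob` are hypothetical, chosen only for this illustration; the scores are assumed to lie in (0, 1] so all weights are non-negative.

```python
import heapq
import math

def most_realistic_path(prob, source, target):
    """Dijkstra over edge weights -log(p).

    `prob[i][j]` is the realism score in (0, 1] of the image generated
    between images i and j; a missing entry means there is no edge.
    Returns (path, path probability), or (None, 0.0) if unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, p in prob.get(u, {}).items():
            if p <= 0.0:
                continue  # zero probability: treat as no edge
            nd = d - math.log(p)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, 0.0
    # Walk the predecessor links back from the target.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, math.exp(-dist[target])
```

For example, with `prob = {"a": {"b": 0.9, "c": 0.5}, "b": {"c": 0.9}, "c": {}}`, the route from `"a"` to `"c"` goes through `"b"`, since 0.9 × 0.9 = 0.81 beats the direct edge's 0.5. Note that every extra edge can only lower the product, so this formulation favors short panoramas unless a minimum length is enforced separately.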

Results

Below are infinite panoramas generated by our method. The panoramas are best viewed through a (literal) sliding window, but hopefully these images do them some justice:

These panoramas were created with a human in the loop, cherry-picking the best generator results to add to the panorama.

These panoramas were created using the generated images that received the lowest scores from the realism scorer.

These panoramas were created using randomly chosen generated images and serve as a baseline.

These panoramas were created using the generated images that received the highest scores from the realism scorer.