(NIPS 2015) Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks
Paper: http://arxiv.org/abs/1506.05751
Code: https://github.com/facebook/eyescream
In this paper we introduce a generative parametric model capable of producing high quality samples of natural images.
Our approach uses a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion.
Introduction
Building a good generative model of natural images has been a fundamental problem within computer vision.
However, images are complex and high dimensional, making them hard to model well, despite extensive efforts.
We exploit the multiscale structure of natural images, building a series of generative models, each of which captures image structure at a particular scale of a Laplacian pyramid [1].
At each scale we train a convolutional network-based generative model using the Generative Adversarial Networks (GAN) approach of Goodfellow et al. [11]. Samples are drawn in a coarse-to-fine fashion, commencing with a low-frequency residual image.
The second stage samples the band-pass structure at the next level, conditioned on the sampled residual.
Related Work
Approach
Generative Adversarial Networks
Laplacian Pyramid
The Laplacian pyramid [1] is a linear invertible image representation consisting of a set of band-pass images, spaced an octave apart, plus a low-frequency residual.
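To make the representation concrete, here is a minimal NumPy/SciPy sketch of pyramid construction and its exact inverse. The Gaussian blur and bilinear zoom are stand-ins for the paper's downsampling d(.) and upsampling u(.) operators, and power-of-two image sizes are assumed.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def downsample(img):
    # blur then subsample by 2: the pyramid's d(.) operator
    return gaussian_filter(img, sigma=1.0)[::2, ::2]

def upsample(img):
    # 2x bilinear zoom with smoothing: the pyramid's u(.) operator
    return gaussian_filter(zoom(img, 2, order=1), sigma=1.0)

def build_laplacian_pyramid(img, levels):
    """Return band-pass images h_0..h_{K-1} plus the low-frequency residual."""
    pyramid = []
    current = img
    for _ in range(levels):
        smaller = downsample(current)
        pyramid.append(current - upsample(smaller))  # band-pass coefficients h_k
        current = smaller
    pyramid.append(current)  # low-frequency residual
    return pyramid

def reconstruct(pyramid):
    """Invert the pyramid exactly: I_k = u(I_{k+1}) + h_k."""
    img = pyramid[-1]
    for h in reversed(pyramid[:-1]):
        img = upsample(img) + h
    return img
```

Reconstruction is exact because each band-pass image stores precisely the detail the blur-and-subsample step discards, which is what makes the representation linear and invertible.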
Laplacian Generative Adversarial Networks (LAPGAN)
Our proposed approach combines the conditional GAN model with a Laplacian pyramid representation.
The generative models {G_0, ..., G_K} each capture the distribution of band-pass coefficients h_k for natural images at a different level of the pyramid. Sampling proceeds coarse-to-fine, starting from Ĩ_K = G_K(z_K) and then repeatedly applying Ĩ_k = u(Ĩ_{k+1}) + G_k(z_k, u(Ĩ_{k+1})).
Specifically, we construct a Laplacian pyramid from each training image I. At each level we make a stochastic choice (with equal probability) to either (i) construct the coefficients h_k using the standard pyramid procedure, or (ii) generate them using G_k: h̃_k = G_k(z_k, u(I_{k+1})).
Figure 1: The sampling procedure for our LAPGAN model.
We start with a noise sample z_3 (right side) and use a generative model G_3 to generate Ĩ_3. This is upsampled (green arrow) and then used as the conditioning variable (orange arrow) l_2 for the generative model at the next level, G_2. Together with another noise sample z_2, G_2 generates a difference image h̃_2 which is added to l_2 to create Ĩ_2. This process repeats across two subsequent levels to yield a final full-resolution sample I_0.
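The coarse-to-fine sampling chain described above can be sketched as follows. The generator callables `g_coarse` and `g_fine` are hypothetical stand-ins for the trained convnets G_K and G_k, and nearest-neighbour upsampling stands in for u(.).

```python
import numpy as np

def upsample(img):
    # nearest-neighbour 2x upsampling stands in for the paper's u(.) operator
    return np.kron(img, np.ones((2, 2)))

def lapgan_sample(g_coarse, g_fine, base_size=8, rng=None):
    """Coarse-to-fine LAPGAN sampling:
        I~_K = G_K(z_K)
        I~_k = u(I~_{k+1}) + G_k(z_k, u(I~_{k+1}))   for k = K-1 .. 0
    g_coarse(z) and each g_fine[k](z, l) are hypothetical generator callables."""
    if rng is None:
        rng = np.random.default_rng()
    z = rng.standard_normal((base_size, base_size))
    img = g_coarse(z)                 # low-frequency residual sample I~_K
    for g_k in g_fine:                # finer levels, coarse to fine
        l = upsample(img)             # conditioning variable l_k
        z = rng.standard_normal(l.shape)
        img = l + g_k(z, l)           # add generated band-pass detail h~_k
    return img
```

With a base size of 8x8 and three finer levels, the loop doubles the resolution three times, yielding a 64x64 sample as in the paper's pipeline.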
Figure 2: The training procedure for our LAPGAN model.
Starting with a 64x64 input image I from our training set (top left): (i) we take I_0 = I and blur and downsample it by a factor of two (red arrow) to produce I_1; (ii) we upsample I_1 by a factor of two (green arrow), giving a low-pass version l_0 of I_0; (iii) with equal probability we use l_0 to create either a real or a generated example for the discriminative model D_0. In the real case (blue arrows), we compute the high-pass h_0 = I_0 - l_0, which is input to D_0, which computes the probability of it being real vs generated. In the generated case (magenta arrows), the generative network G_0 receives as input a random noise vector z_0 and l_0. It outputs a generated high-pass image h̃_0 = G_0(z_0, l_0), which is input to D_0. In both the real/generated cases, D_0 also receives l_0 (orange arrow). Optimizing Eqn. 2, G_0 thus learns to generate realistic high-frequency structure h̃_0 consistent with the low-pass image l_0. The same procedure is repeated at scales 1 and 2, using I_1 and I_2. Note that the models at each level are trained independently. At level 3, I_3 is an 8x8 image, simple enough to be modeled directly with a standard GAN G_3 & D_3.
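The per-level stochastic real/generated choice in this training procedure can be sketched as follows. `G_k` is a hypothetical generator callable, and average-pooling plus nearest-neighbour resizing stand in for the paper's blur-and-subsample and upsample operators.

```python
import numpy as np

def downsample(img):
    # 2x average-pool stands in for blur-and-subsample d(.)
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    # nearest-neighbour 2x upsampling stands in for u(.)
    return np.kron(img, np.ones((2, 2)))

def make_disc_example(I_k, G_k, rng):
    """Build one training example for D_k at a single pyramid level (Fig. 2):
    with equal probability a real high-pass h_k = I_k - l_k or a generated
    h~_k = G_k(z_k, l_k), each paired with the low-pass conditioning l_k.
    Returns (h, l, label) with label 1 for real, 0 for generated."""
    l = upsample(downsample(I_k))        # low-pass conditioning image l_k
    if rng.random() < 0.5:
        h = I_k - l                      # real band-pass coefficients
        label = 1
    else:
        z = rng.standard_normal(I_k.shape)
        h = G_k(z, l)                    # generated coefficients h~_k
        label = 0
    return h, l, label
```

Because each level builds its examples only from I_k and its own generator, the models at the different scales can be trained independently, exactly as the caption notes.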