Generative Neural Networks Explained
Deep generative models are a powerful approach to unsupervised and semi-supervised learning, where the goal is to discover the hidden structure of data without relying on external labels. Traditional machine learning (ML) is mostly discriminative: the goal is to learn a map from inputs to outputs, for example from image pixels to the names of the objects in the image.
Discriminative models, however, have several limitations: i) they require vast amounts of annotated data; ii) they can fail drastically when presented with inputs very different from the training set; and iii) they can be intentionally subverted by humans or by other ML algorithms into misclassifying data, a common concern in security and privacy protection, an area that relies heavily on ML.
Generative ML models, on the other hand, learn in a completely different way. Rather than discriminating among inputs, they try to replicate the (hidden) statistical process behind the observed data. They start by generating “hallucinations” that become more realistic and plausible as the learning process evolves. These models therefore have more expressive power: they explore variations in the data and reason about the structure and possibilities of a world consistent with the given observations.
Deep generative models have widespread applications, including density estimation, image denoising (recovering high-quality images from low-resolution or noisy ones), inpainting (reconstructing a whole image after partial occlusion), data compression, scene understanding, representation learning, 3D scene construction, semi-supervised classification, and even hierarchical control.
Generative models are more powerful than discriminative ones because they:
- Move beyond associations between inputs and outputs.
- Recognize and represent hidden structure and invariants in the data: for instance, the concepts of rotation, light intensity, brightness, or shape for three-dimensional objects.
- Imagine the world as “it could be” rather than as “it is presented”.
- Establish concepts useful for decision making and reasoning.
- Detect surprising events in the world.
- Can “plan the future” by generating plausible scenarios.
There are several types of generative models, but the most common are latent variable models: they use hidden, or latent, stochastic variables to generate the data. The most common generative models based on deep neural networks are variational autoencoders (VAEs) and generative adversarial networks (GANs). To train these models we rely on Bayesian deep learning, variational approximations and Markov chain Monte Carlo (MCMC) estimation, or the old faithful stochastic gradient descent (SGD).
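The latent variable idea can be sketched in a few lines: sample a latent vector from a simple prior and push it through a decoder to produce data. Everything below (the dimensions, and the random affine “decoder” standing in for a trained network) is an illustrative assumption, not a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes, not from the text.
latent_dim, data_dim = 2, 8

# A latent variable model defines p(x) = ∫ p(x|z) p(z) dz.
# Here the "decoder" is a fixed random affine map plus tanh,
# playing the role a trained neural network would play.
W = rng.normal(size=(data_dim, latent_dim))
b = rng.normal(size=data_dim)

def sample(n):
    """Draw n samples: first z ~ N(0, I) from the latent prior,
    then map each z through the decoder to data space."""
    z = rng.normal(size=(n, latent_dim))   # hidden stochastic variables
    x = np.tanh(z @ W.T + b)               # deterministic decode (noise omitted)
    return z, x

z, x = sample(5)
print(z.shape, x.shape)
```

In a VAE this decoder is a neural network trained jointly with an encoder; in a GAN it is the generator. The generative process itself, latent sample in, data sample out, is the same.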
Generative adversarial networks were proposed by Ian Goodfellow in 2014 and are trained as a competitive game between a generative network trying to fool a discriminative network with faked inputs. As training evolves, the generative network eventually produces samples that are indistinguishable from real images, or from whatever other data it has been trained on. Yann LeCun, Director of AI Research at Facebook, in a recent Quora discussion considered GANs one of the most important new developments in AI. GANs are, nevertheless, very hard to train. This very interesting paper from OpenAI proposes several tricks to overcome these difficulties.
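The competition game can be seen end to end on a one-dimensional toy problem. Everything here is an illustrative assumption (a Gaussian data distribution, a linear generator, a logistic discriminator, and hand-derived gradients), chosen so the adversarial alternation is visible without a deep-learning library:

```python
import numpy as np

rng = np.random.default_rng(42)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# Toy setup (illustrative, not from the text):
#   real data  ~ N(3, 0.5)
#   generator  G(z) = a*z + b,        z ~ N(0, 1)
#   critic     D(x) = sigmoid(w*x + c)
a, b = 1.0, 0.0          # generator parameters
w, c = 0.0, 0.0          # discriminator parameters
lr, batch = 0.05, 128

for step in range(2000):
    real = rng.normal(3.0, 0.5, size=batch)
    z = rng.normal(size=batch)
    fake = a * z + b

    # Discriminator step: ascend  log D(real) + log(1 - D(fake)).
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    w += lr * (np.mean((1 - d_real) * real) - np.mean(d_fake * fake))
    c += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator step: ascend  log D(fake)  (non-saturating loss),
    # i.e. push samples toward regions the critic labels "real".
    d_fake = sigmoid(w * fake + c)
    a += lr * np.mean((1 - d_fake) * w * z)
    b += lr * np.mean((1 - d_fake) * w)

samples = a * rng.normal(size=1000) + b
print(round(float(np.mean(samples)), 2))
```

Even this tiny game shows a known failure mode: a linear critic can only compare means, so the generator tends to match the data mean while its spread drifts, a one-dimensional cousin of mode collapse, one of the training difficulties the OpenAI paper addresses.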
Generative neural networks have applications far beyond image or video processing, especially in very dynamic and non-stationary scenarios such as fraud detection, intrusion detection, and other security-related problems. Some example applications include the following:
- Pixel Recurrent Neural Networks (https://arxiv.org/pdf/1601.06759v2.pdf): the DeepMind group used generative networks at the pixel level for image inpainting.
- Face hallucination, i.e. reconstruction from low-resolution images (http://arxiv.org/abs/1607.05046).
- Semantic Image Inpainting with Perceptual and Contextual Losses (http://arxiv.org/pdf/1607.07539v1.pdf).
- My favorite application of GANs is this work, where the authors used a combination of recurrent neural networks (RNNs) and convolutional neural networks (CNNs) to generate images from text descriptions. The results are super cool!