Deep Generative Models

To Improve Learning Process in Geospatial and Earth Science Applications

Overview:

Generative models learn the distribution of the training data and generate new data points by sampling from the learned distribution. In most cases, the data are assumed to follow a Gaussian distribution. A deep generative model is an unsupervised learning technique that learns this distribution while optimizing the loss function of the network. There are several variants of deep generative models, and most of them serve a dual function, viz. abstraction and generation. These models are also classified according to whether the network learns an explicit or an implicit probability distribution.

Here, we consider two such deep generative models: the Variational AutoEncoder (VAE) and Generative Adversarial Nets (GANs). A VAE consists of an encoder and a decoder; it can perform both abstraction and generation and assumes a Gaussian distribution for the data. Most commonly, a trained VAE is used to generate new data samples from the learnt distribution space. A GAN consists of a generator and a discriminator and is trained adversarially, with the generator and the discriminator competing against each other. During training, the generator improves and tries to produce more realistic samples that the discriminator cannot distinguish from the real data.
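
As a rough illustration of the Gaussian assumption in a VAE, the sketch below shows two ingredients of the standard VAE formulation: drawing a latent sample from the encoder's Gaussian output, and the closed-form KL term that pulls that Gaussian towards a standard normal prior. It is a minimal sketch, not this project's implementation; the function names, the batch size, and the 2-D latent space are assumptions made for illustration, and the encoder outputs are placeholder values rather than the result of a trained network.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Draw z ~ N(mu, diag(exp(log_var))) as mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, diag(sigma^2)) || N(0, I)), one value per batch row."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)

rng = np.random.default_rng(0)

# Placeholder encoder outputs (assumption): batch of 4 inputs, 2-D latent space.
mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))

z = reparameterize(mu, log_var, rng)      # latent samples that would be fed to a decoder
kl = kl_to_standard_normal(mu, log_var)   # zero here, because the posterior equals the prior
```

In a full VAE, this KL term would be added to a reconstruction loss computed from the decoder output; new samples are then generated by decoding draws from the standard normal prior.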

The aim of this work is to propose novel methods that use trained deep generative models, namely VAEs and GANs, to improve deep learning models on diverse geospatial datasets for Earth science applications. We propose to address:
(i) Problem of imbalanced data classification – Imbalanced data refers to a classification problem in which the number of observations is not equally distributed across classes; this is common in Earth science applications. We propose to use VAEs and GANs to learn the data distribution, followed by structure-preserving oversampling techniques to generate new data samples.
(ii) Dynamical systems – A dynamical system describes the time dependence of a point in a geometrical space. At any given time, it has a state given by a tuple of real numbers (a vector) that can be represented by a point in an appropriate state space (a geometrical manifold). From observation data, the model learns a set of lower-dimensional representations that are used to predict the dynamics.
(iii) Domain adaptation – The lack of reliable labelled geospatial data is a major challenge for machine learning techniques. Here, VAEs and GANs will be used for unsupervised domain adaptation.
(iv) Adversarial attacks – Adversarial attacks come in the form of adversarial examples: carefully crafted perturbations added to a legitimate input sample. In the context of classification, these perturbations cause the legitimate sample to be misclassified at inference time. Such perturbations are often small in magnitude and do not affect human recognition, but they can drastically change the output of the classifier. Non-deterministic generated samples from VAEs and GANs can be used, together with the classifier's dataset, for adversarial training in an unsupervised manner.
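
The oversampling idea in item (i) can be sketched as follows. This is only a minimal illustration under stated assumptions: a Gaussian fitted to each under-represented class stands in for a trained VAE or GAN, the function name `oversample_minority` is hypothetical, and the toy data are synthetic. It is not the structure-preserving technique proposed in this work.

```python
import numpy as np

def oversample_minority(X, y, rng):
    """Balance classes by sampling synthetic points from a per-class Gaussian.

    Assumption: the fitted Gaussian is a simple stand-in for a trained
    generative model (VAE or GAN) that has learned the class distribution.
    """
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for cls, count in zip(classes, counts):
        n_new = int(target - count)
        if n_new == 0:
            continue  # class already has as many samples as the largest class
        X_cls = X[y == cls]
        mean = X_cls.mean(axis=0)
        # Full covariance keeps feature correlations; the small ridge keeps it well conditioned.
        cov = np.cov(X_cls, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        X_parts.append(rng.multivariate_normal(mean, cov, size=n_new))
        y_parts.append(np.full(n_new, cls))
    return np.vstack(X_parts), np.concatenate(y_parts)

rng = np.random.default_rng(0)

# Synthetic toy data (assumption): 90 majority samples and 10 minority samples, 3 features.
X = np.vstack([rng.normal(0.0, 1.0, (90, 3)), rng.normal(3.0, 1.0, (10, 3))])
y = np.array([0] * 90 + [1] * 10)

X_bal, y_bal = oversample_minority(X, y, rng)
```

After the call, both classes contain 90 samples. Replacing the Gaussian with samples drawn from a trained generative model would leave the rest of the procedure unchanged.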

TEAM MEMBERS:

Prof. Uttam Kumar
Ms. Indu Solomon (PH2019501), Research Scholar, IIIT Bangalore.

GitHub Link