Learning Sparsity of Representations with Discrete Latent Variables

by Zhao Xu et al.

Deep latent generative models have attracted increasing attention because they combine the strengths of deep learning and probabilistic models in an elegant way. The data representations learned with these models are often continuous and dense. In many applications, however, sparse representations are expected, such as learning a sparse high-dimensional embedding of data in an unsupervised setting, or learning multiple labels from thousands of candidate tags in a supervised setting. In some scenarios there is a further restriction on the degree of sparsity: the number of non-zero features in a representation may not exceed a pre-defined threshold L_0. In this paper we propose a sparse deep latent generative model, SDLGM, which explicitly models the degree of sparsity and thus makes it possible to learn the sparse structure of the data under the quantified sparsity constraint. The resulting sparsity of a representation is not fixed, but adapts to the observation itself under the pre-defined restriction. In particular, we introduce for each observation i an auxiliary random variable L_i, which models the sparsity of its representation. Sparse representations are then generated with a two-step sampling process via two Gumbel-Softmax distributions. For inference and learning, we develop an amortized variational method based on an MC gradient estimator; the resulting sparse representations remain differentiable under backpropagation. Experimental evaluation on multiple datasets for unsupervised and supervised learning problems shows the benefits of the proposed method.
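To make the two-step sampling idea concrete, the following is a minimal NumPy sketch, not the authors' implementation: it first draws a relaxed sample for the sparsity degree L_i (capped at L_0) from one Gumbel-Softmax distribution, then draws L_i relaxed one-hot feature selections from a second Gumbel-Softmax and combines them into a soft sparse mask. All names (`degree_logits`, `feature_logits`, the temperature `tau`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, tau=0.5):
    """Draw one relaxed (soft one-hot) sample from a Gumbel-Softmax distribution."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0, 1) noise
    y = (logits + g) / tau
    y = np.exp(y - y.max())  # numerically stable softmax
    return y / y.sum()

def sample_sparse_representation(degree_logits, feature_logits, L0, tau=0.5):
    """Two-step sampling: first the sparsity degree L_i <= L0, then L_i features."""
    # Step 1: relaxed sample over candidate degrees {1, ..., L0};
    # here we take the argmax as a hard degree purely for illustration.
    L_i = int(np.argmax(gumbel_softmax(degree_logits[:L0], tau))) + 1
    # Step 2: draw L_i relaxed one-hot feature selections and accumulate
    # them into a soft sparse mask over the feature dimensions.
    mask = sum(gumbel_softmax(feature_logits, tau) for _ in range(L_i))
    return L_i, np.minimum(mask, 1.0)

degree_logits = rng.normal(size=5)    # hypothetical scores for degrees 1..5
feature_logits = rng.normal(size=20)  # hypothetical scores for 20 feature dims
L_i, mask = sample_sparse_representation(degree_logits, feature_logits, L0=3)
print(L_i, mask.shape)
```

In the paper's setting both sampling steps stay relaxed so that gradients flow through the representation; the hard argmax in step 1 above is a simplification for readability.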




