Exploring Bayesian Models for Multi-level Clustering of Hierarchically Grouped Sequential Data

04/19/2015
by Adway Mitra, et al.

A wide range of Bayesian models have been proposed for data that is divided hierarchically into groups. These models aim to cluster the data at different levels of grouping by assigning a mixture component to each datapoint and a mixture distribution to each group. Multi-level clustering is facilitated by sharing these components and distributions across the groups. In this paper, we introduce the concept of Degree of Sharing (DoS) for the mixture components and distributions, with the aim of analyzing and classifying various existing models. Next, we introduce a generalized hierarchical Bayesian model, of which the existing models can be shown to be special cases. Unlike most of these models, our model takes into account the sequential nature of the data, and various other temporal structures at different levels, while assigning mixture components and distributions. We show one specialization of this model aimed at hierarchical segmentation of news transcripts, and present a Gibbs sampling-based inference algorithm for it. We also show experimentally that the proposed model outperforms existing models for the same task.
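To make the two levels of assignment concrete, here is a minimal generative sketch in Python. It is not the paper's model: it assumes a fixed, finite number of components and distributions, and it ignores the sequential and temporal structure that the abstract emphasizes. The function name `sample_two_level` and all of its parameters are hypothetical, chosen only for illustration.

```python
import random


def sample_two_level(group_sizes, num_distributions=2, num_components=3, seed=0):
    """Toy sketch (an assumption, not the paper's model): each group picks one
    shared mixture distribution, and each datapoint in the group picks one
    shared mixture component from that distribution."""
    rng = random.Random(seed)

    # Each shared mixture distribution is a normalized weight vector
    # over the same pool of shared components.
    distributions = []
    for _ in range(num_distributions):
        raw = [rng.random() for _ in range(num_components)]
        total = sum(raw)
        distributions.append([w / total for w in raw])

    group_labels = []  # group-level cluster: index of the chosen distribution
    point_labels = []  # datapoint-level cluster: index of the chosen component
    for size in group_sizes:
        d = rng.randrange(num_distributions)
        group_labels.append(d)
        # Datapoints in a group draw components from the group's distribution.
        comps = rng.choices(range(num_components), weights=distributions[d], k=size)
        point_labels.append(comps)

    return distributions, group_labels, point_labels


if __name__ == "__main__":
    dists, groups, points = sample_two_level([4, 3, 5])
    for g, (d, comps) in enumerate(zip(groups, points)):
        print(f"group {g}: distribution {d}, components {comps}")
```

In this sketch, groups that pick the same distribution index form a group-level cluster, and datapoints that pick the same component index form a datapoint-level cluster. Both pools are shared across all groups, which is the kind of sharing the abstract refers to.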
