Bayesian Learning of Random Graphs & Correlation Structure of Multivariate Data, with Distance between Graphs
We present a method for the simultaneous Bayesian learning of the correlation matrix and graphical model of a multivariate dataset, using Metropolis-within-Gibbs inference. Here, the data comprise measurements of a vector-valued observable that we model with a high-dimensional Gaussian Process (GP), such that the likelihood of the GP parameters given the data is Matrix-Normal, defined by a mean matrix and by between-rows and between-columns covariance matrices. We marginalise over the between-rows covariance matrices to obtain a closed-form likelihood of the between-columns correlation matrix given the data. Within each iteration, this correlation matrix is updated in the first block, given the data, and the (generalised Binomial) graph is updated in the second block, at the partial correlation matrix computed from the updated correlation matrix. We also learn the 95% Highest Probability Density credible regions of the correlation matrix, as well as of the graphical model of the data. The effect of acknowledging measurement errors on the learnt graphical model is demonstrated on a small simulated dataset, while the large human disease-symptom network, with more than 8,000 nodes, is learnt from real data. Data on the vino-chemical attributes of Portuguese red and white wine samples are employed to learn the correlation structure and graphical model of each dataset, and to then compute the distance between the learnt graphical models.
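The two-block scheme described above can be summarised in a minimal sketch, under stated assumptions: `log_lik_corr` stands in for the closed-form likelihood of the between-columns correlation matrix obtained after marginalisation, `log_prob_graph` for the (generalised Binomial) graph term evaluated at the partial correlation matrix, and `propose_R`/`propose_G` for user-supplied proposal mechanisms. These names are hypothetical placeholders, not the authors' implementation; only the alternation of the two Metropolis blocks and the correlation-to-partial-correlation step are taken from the abstract.

```python
import numpy as np

def partial_correlation(R):
    """Partial correlation matrix implied by a correlation matrix R."""
    Omega = np.linalg.inv(R)                 # precision matrix
    d = np.sqrt(np.diag(Omega))
    P = -Omega / np.outer(d, d)              # rho_ij|rest = -Omega_ij / sqrt(Omega_ii * Omega_jj)
    np.fill_diagonal(P, 1.0)
    return P

def metropolis_within_gibbs(data, R0, G0, n_iter,
                            propose_R, propose_G,
                            log_lik_corr, log_prob_graph, seed=None):
    """Sketch of a two-block Metropolis-within-Gibbs sampler (assumed interface)."""
    rng = np.random.default_rng(seed)
    R, G = R0.copy(), G0.copy()
    samples = []
    for _ in range(n_iter):
        # Block 1: Metropolis update of the correlation matrix R, given the data.
        R_prop = propose_R(R, rng)
        if np.log(rng.uniform()) < log_lik_corr(R_prop, data) - log_lik_corr(R, data):
            R = R_prop

        # Block 2: Metropolis update of the graph G, evaluated at the partial
        # correlation matrix computed from the freshly updated R.
        P = partial_correlation(R)
        G_prop = propose_G(G, rng)
        if np.log(rng.uniform()) < log_prob_graph(G_prop, P) - log_prob_graph(G, P):
            G = G_prop

        samples.append((R.copy(), G.copy()))
    return samples
```

The retained samples of R and G could then be post-processed to report, for example, 95% Highest Probability Density credible regions of the correlation entries and of the edge indicators.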