Copula graphical models for heterogeneous mixed data
This article proposes a graphical model that can handle mixed-type, multi-group data. The motivation comes from real-world observational data, which often contain groups of samples obtained under heterogeneous conditions in space and time, so that network structure may differ among groups. The i.i.d. assumption is therefore unrealistic, and fitting a single graphical model to all the data yields a network that does not accurately represent the between-group differences. In addition, real-world observational data are typically of mixed type, which violates the Gaussian assumption common to graphical models and leaves such models unable to adequately recover the underlying graph structure. The proposed model accounts for both properties by treating the observed data as transformed latent Gaussian data, which makes it possible to exploit the attractive properties of the Gaussian distribution, such as partial correlations obtained from the inverse covariance matrix. In an extensive simulation study, the proposed model is evaluated against alternative models and is better able to recover the true underlying graph structure for the different groups. Finally, the proposed model is applied to real production-ecological data on on-farm maize yield to showcase the added value of the method in generating new hypotheses for production ecologists.
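The latent Gaussian idea in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator. It assumes a simple rank-based normal-scores transform in place of the paper's copula treatment of mixed variables, uses an unpenalised inverse covariance matrix, and fits each group separately without sharing information between groups. The function names `to_latent_normal_scores`, `partial_correlations`, and `groupwise_partial_correlations` are hypothetical and introduced only for this illustration.

```python
from statistics import NormalDist

import numpy as np


def to_latent_normal_scores(x):
    """Map one observed column to approximate latent Gaussian scores.

    Assumption: a rank-based normal-scores transform with average ranks
    for ties. This is a crude stand-in for a proper copula treatment of
    discrete variables.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    order = np.argsort(x, kind="stable")
    ranks = np.empty(n)
    ranks[order] = np.arange(1, n + 1)
    # Replace the ranks of tied values by their average rank.
    for value in np.unique(x):
        mask = x == value
        ranks[mask] = ranks[mask].mean()
    # Rescale to (0, 1) so the inverse normal CDF is always finite.
    u = ranks / (n + 1)
    standard_normal = NormalDist()
    return np.array([standard_normal.inv_cdf(p) for p in u])


def partial_correlations(z):
    """Partial correlations from the inverse covariance (precision) matrix."""
    precision = np.linalg.inv(np.cov(z, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def groupwise_partial_correlations(data, groups):
    """Estimate one partial-correlation matrix per group, independently."""
    data = np.asarray(data, dtype=float)
    groups = np.asarray(groups)
    result = {}
    for g in np.unique(groups):
        rows = data[groups == g]
        z = np.column_stack(
            [to_latent_normal_scores(rows[:, j]) for j in range(rows.shape[1])]
        )
        result[g] = partial_correlations(z)
    return result
```

Calling `groupwise_partial_correlations(data, groups)` with an `n x p` array and a length-`n` vector of group labels returns one `p x p` matrix per group. Entries of large magnitude suggest candidate edges, and comparing the matrices across groups gives a rough view of the between-group differences that motivate the model. A sparse graph would additionally require a penalty or thresholding step, which this sketch omits.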