Exponential canonical correlation analysis with orthogonal variation
Canonical correlation analysis (CCA) is a standard tool for studying associations between two data sources; however, it is not designed for data with count or proportion measurement types. In addition, while CCA uncovers common signals, it does not elucidate which signals are unique to each data source. To address these challenges, we propose a new framework for CCA based on exponential families with explicit modeling of both common and source-specific signals. Unlike previous methods based on exponential families, the common signals from our model coincide with canonical variables in Gaussian CCA, and the unique signals are exactly orthogonal. These modeling differences lead to a non-trivial estimation via optimization with orthogonality constraints, for which we develop an iterative algorithm based on a splitting method. Simulations show on par or superior performance of the proposed method compared to the available alternatives. We apply the method to analyze associations between gene expressions and lipids concentrations in nutrigenomic study, and to analyze associations between two distinct cell-type deconvolution methods in prostate cancer tumor heterogeneity study.
READ FULL TEXT