Regularization Parameter Selection for a Bayesian Multi-Level Group Lasso Regression Model with Application to Imaging Genomics
We investigate the choice of tuning parameters for a Bayesian multi-level group lasso model developed for the joint analysis of neuroimaging and genetic data. The regression model we consider relates multivariate phenotypes consisting of brain summary measures (volumetric and cortical thickness values) to single nucleotide polymorphism (SNP) data and imposes penalization at two nested levels, the first corresponding to genes and the second corresponding to SNPs. Associated with each level in the penalty is a tuning parameter, which corresponds to a hyperparameter in the hierarchical Bayesian formulation. Following previous work on Bayesian lassos, we consider the estimation of tuning parameters either through hierarchical Bayes based on hyperpriors and Gibbs sampling or through empirical Bayes based on maximizing the marginal likelihood using a Monte Carlo EM algorithm. For the specific model under consideration, we find that these approaches can lead to severe overshrinkage of the regression parameter estimates in the high-dimensional setting or when the genetic effects are weak. We demonstrate these problems through simulation examples and study an approximation to the marginal likelihood which sheds light on their cause. We then suggest an alternative approach based on the widely applicable information criterion (WAIC), an asymptotic approximation to leave-one-out cross-validation that can be computed conveniently within an MCMC framework.
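To illustrate why WAIC is convenient within MCMC, the following is a minimal sketch of the standard WAIC computation from pointwise log-likelihood values evaluated at posterior draws. It is a generic illustration rather than the authors' implementation: the function name `waic`, the `(S, n)` array layout, and the use of the posterior-variance form of the effective number of parameters are assumptions made here.

```python
import numpy as np

def waic(log_lik):
    """Generic WAIC from pointwise log-likelihoods (hypothetical helper).

    log_lik: array of shape (S, n) holding log p(y_i | theta_s) for
             S posterior draws and n observations (assumed layout).
    Returns (waic, lppd, p_waic) on the deviance scale, so lower WAIC
    indicates better estimated out-of-sample predictive fit.
    """
    log_lik = np.asarray(log_lik, dtype=float)
    n_draws = log_lik.shape[0]

    # Log pointwise predictive density: log of the posterior-mean
    # likelihood for each observation, computed stably by subtracting
    # the columnwise maximum before exponentiating.
    col_max = log_lik.max(axis=0)
    lppd_i = col_max + np.log(np.exp(log_lik - col_max).sum(axis=0)) - np.log(n_draws)

    # Effective number of parameters: posterior sample variance of the
    # log-likelihood, summed over observations.
    p_waic_i = log_lik.var(axis=0, ddof=1)

    lppd = lppd_i.sum()
    p_waic = p_waic_i.sum()
    return -2.0 * (lppd - p_waic), lppd, p_waic
```

Under these assumptions, one could run the sampler at each candidate pair of tuning parameter values, store the pointwise log-likelihoods from the retained draws, and select the pair with the smallest WAIC.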