Linear shrinkage for predicting responses in large-scale multivariate linear regression
We propose a new prediction method for multivariate linear regression problems where the number of features is less than the sample size but the number of outcomes is extremely large. Many popular procedures, such as penalized regression, require parameter tuning that is computationally untenable in such large-scale problems. We take a different approach, motivated by ideas from simultaneous estimation problems, that performs linear shrinkage on ordinary least squares parameter estimates. Our approach is extremely computationally efficient and tuning-free. We show that it can asymptotically outperform ordinary least squares without any structural assumptions on the true regression coefficients, and we illustrate its good performance in simulations and in an analysis of single-cell RNA-seq data.
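To make the idea concrete, here is a minimal sketch of linearly shrinking OLS fits for prediction. The abstract does not specify the shrinkage target or how the shrinkage factor is estimated, so the global Stein-type plug-in factor below is an illustrative assumption, not the estimator proposed in the paper; the function and variable names are likewise hypothetical.

```python
import numpy as np

def ols_fit(X, Y):
    """Ordinary least squares coefficients for all outcomes at once.
    X: (n, p) features with p < n; Y: (n, q) outcomes, q possibly huge."""
    # One least-squares solve handles every outcome column simultaneously.
    B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B_hat  # shape (p, q)

def linear_shrinkage_predict(X_train, Y_train, X_new):
    """Predict outcomes at X_new by linearly shrinking the OLS fit toward zero.
    The closed-form shrinkage factor here is an assumed, illustrative choice."""
    n, p = X_train.shape
    q = Y_train.shape[1]
    B_hat = ols_fit(X_train, Y_train)

    # Residual-based estimate of the average noise level across outcomes.
    resid = Y_train - X_train @ B_hat
    sigma2_hat = np.sum(resid ** 2) / (q * (n - p))

    # Unbiased-risk-style global shrinkage factor (hypothetical form;
    # the paper's estimator may differ).
    fitted = X_train @ B_hat
    c = max(0.0, 1.0 - p * q * sigma2_hat / np.sum(fitted ** 2))

    return c * (X_new @ B_hat)
```

Note that such a scheme is tuning-free in the sense the abstract describes: the shrinkage amount comes from a single closed-form, data-driven quantity, so no cross-validation over a penalty grid is needed even when the number of outcomes is extremely large.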