False Discovery Rate Control via Debiased Lasso
We consider the problem of variable selection in high-dimensional statistical models, where the goal is to report a set of variables, out of many predictors X_1, …, X_p, that are relevant to a response of interest. For the high-dimensional linear model, where the number of parameters exceeds the number of samples (p > n), we propose a procedure for variable selection and prove that it controls the directional false discovery rate (FDR) below a pre-assigned significance level q ∈ [0,1]. We further analyze the statistical power of our framework and show that for designs with subgaussian rows and a common precision matrix Ω ∈ R^{p×p}, if the minimum nonzero parameter θ_min satisfies √n θ_min − σ√(2(max_{i∈[p]} Ω_ii) log(2p/(q s_0))) → ∞, then this procedure achieves asymptotic power one. Our framework is built upon the debiasing approach and assumes the standard condition s_0 = o(√n/(log p)^2), where s_0 indicates the number of true positives among the p features. Notably, this framework achieves exact directional FDR control without any assumption on the amplitude of the unknown regression parameters, and does not require any knowledge of the distribution of the covariates or the noise level. We test our method on synthetic and real data experiments to assess its performance and to corroborate our theoretical results.
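As an illustrative sketch only (not the authors' implementation), the pipeline described above can be mimicked on synthetic data: fit a lasso, debias it, convert the debiased coordinates to z-scores, and select signed discoveries by Benjamini–Hochberg thresholding at level q. All simplifications here are assumptions: an identity design covariance (so the precision matrix Ω = I and the debiasing matrix M = I), a known noise level σ, and a hand-rolled coordinate-descent lasso.

```python
# Illustrative sketch: debiased lasso + BH selection on synthetic data.
# Assumptions (not from the paper): identity design covariance, so the
# debiasing matrix M = I; the noise level sigma is treated as known.
import math
import numpy as np

rng = np.random.default_rng(0)
n, p, s0, q, sigma = 100, 200, 5, 0.1, 1.0

# Synthetic data: s0 strong signals of alternating sign.
theta_true = np.zeros(p)
theta_true[:s0] = [1.0 if k % 2 == 0 else -1.0 for k in range(s0)]
X = rng.standard_normal((n, p))
y = X @ theta_true + sigma * rng.standard_normal(n)

def lasso_cd(X, y, lam, sweeps=200):
    """Coordinate descent for (1/(2n))||y - X theta||^2 + lam * ||theta||_1."""
    n, p = X.shape
    theta, r = np.zeros(p), y.copy()
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(sweeps):
        for j in range(p):
            r += X[:, j] * theta[j]              # remove coordinate j's contribution
            rho = X[:, j] @ r
            theta[j] = np.sign(rho) * max(abs(rho) - lam * n, 0.0) / col_sq[j]
            r -= X[:, j] * theta[j]              # add back the updated contribution
    return theta

# Lasso at the usual universal penalty level.
lam = sigma * math.sqrt(2 * math.log(p) / n)
theta_hat = lasso_cd(X, y, lam)

# Debiasing step with M = I: theta_d = theta_hat + (1/n) X^T (y - X theta_hat).
theta_d = theta_hat + X.T @ (y - X @ theta_hat) / n

# Two-sided p-values from the asymptotically Gaussian debiased z-scores.
Sigma_diag = (X ** 2).mean(axis=0)
z = math.sqrt(n) * theta_d / (sigma * np.sqrt(Sigma_diag))
pvals = np.array([math.erfc(abs(zi) / math.sqrt(2)) for zi in z])

# Benjamini-Hochberg step-up at level q; report signed discoveries.
order = np.argsort(pvals)
passed = pvals[order] <= q * (np.arange(p) + 1) / p
k = int(np.max(np.nonzero(passed)[0]) + 1) if passed.any() else 0
selected = np.sort(order[:k])
signs = np.sign(theta_d[selected])
print("selected:", selected.tolist())
print("signs:   ", signs.tolist())
```

In this regime √n θ_min = 10 comfortably exceeds σ√(2 log(2p/(q s_0))) ≈ 3.7, so the power condition quoted in the abstract predicts that all s_0 signals are recovered with the correct signs.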