Optimal Sparsity Testing in Linear Regression Model
We consider the problem of sparsity testing in the high-dimensional linear regression model. The problem is to test whether the number of non-zero components (i.e., the sparsity) of the regression parameter θ^* is at most k_0. We pinpoint the minimax separation distances for this problem, which amounts to quantifying how far a k_1-sparse vector θ^* has to be from the set of k_0-sparse vectors for a test to reject the null hypothesis with high probability. Two scenarios are considered. In the independent scenario, the covariates are i.i.d. normally distributed and the noise level is known. In the general scenario, both the covariance matrix of the covariates and the noise level are unknown. Although the minimax separation distances differ between these two scenarios, both depend on k_0 and k_1, illustrating that, for this composite-composite testing problem, the sizes of both the null and the alternative hypotheses play a key role. Along the way, we introduce a new variable selection procedure, which may be of independent interest.
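To make the notion of "distance to the set of k_0-sparse vectors" concrete, here is a minimal Python sketch. It assumes the distance is Euclidean, which the abstract does not state; the function names `sparsity` and `distance_to_sparse` are hypothetical and chosen only for illustration. Under that assumption, the closest k_0-sparse vector keeps the k_0 largest entries of θ^* in absolute value, so the distance is the l2 norm of the remaining entries.

```python
import math


def sparsity(theta):
    """Number of non-zero components of theta."""
    return sum(1 for t in theta if t != 0)


def distance_to_sparse(theta, k0):
    """Euclidean distance from theta to the set of k0-sparse vectors.

    Assumption: Euclidean norm. The nearest k0-sparse vector keeps the
    k0 largest entries in absolute value, so the distance is the l2 norm
    of the entries that are dropped.
    """
    magnitudes = sorted((abs(t) for t in theta), reverse=True)
    return math.sqrt(sum(m * m for m in magnitudes[k0:]))


# Illustrative vector with three non-zero components (k_1 = 3).
theta = [3.0, 0.0, -4.0, 1.0, 0.0]
print(sparsity(theta))
print(distance_to_sparse(theta, 2))
```

A vector that is already k_0-sparse sits at distance zero from the null hypothesis. The separation distance studied in the abstract asks how large this quantity must be before a test can reliably detect that θ^* is not k_0-sparse.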