Selective inference after variable selection via multiscale bootstrap

05/25/2019
by Yoshikazu Terada, et al.

A general resampling approach is considered for the selective inference problem after variable selection in regression analysis. Even after variable selection, it is important to know whether the selected variables are actually useful by reporting p-values and confidence intervals for the regression coefficients. In the classical approach, significance levels for the selected variables are usually computed by t-tests, but these are subject to selection bias. To adjust for this bias in post-selection inference, most existing studies of selective inference consider a specific variable selection algorithm, such as Lasso, for which the selection event can be explicitly represented as a simple region in the space of the response variable. Consequently, such approaches cannot handle more complicated algorithms such as MCP (minimax concave penalty). Moreover, most existing approaches take the event that a specific model is selected as the selection event. This selection event is too restrictive and may reduce statistical power, because the selection of the hypothesis for a specific variable depends only on whether that variable is selected. In this study, we consider a more appropriate selection event, namely that the variable is selected, and propose a new bootstrap method to compute an approximately unbiased selective p-value for the selected variable. Our method is applicable to a wide class of variable selection algorithms. In addition, the computational cost of our method is of the same order as that of the classical bootstrap method. Through numerical experiments, we show the usefulness of our selective inference approach.
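The selection bias described above can be illustrated with a toy one-dimensional example. The sketch below is not the multiscale bootstrap method proposed in the paper; it assumes a single standard normal test statistic and a hypothetical selection rule |z| > c (with c = 1 chosen arbitrarily), for which the selective p-value has a closed form. The function names are illustrative only. Under the null hypothesis, the naive p-value rejects far more often than the nominal level among selected cases, while the p-value conditioned on the selection event does not.

```python
import math
import random


def normal_sf(x):
    """Upper tail probability P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def naive_p(z):
    """Two-sided p-value that ignores the selection step."""
    return 2.0 * normal_sf(abs(z))


def selective_p(z, c):
    """Two-sided p-value conditional on the selection event |Z| > c.

    P(|Z| > |z| given |Z| > c) = sf(|z|) / sf(c) whenever |z| > c.
    """
    return normal_sf(abs(z)) / normal_sf(c)


def rejection_rates(c=1.0, alpha=0.05, n_sim=200_000, seed=0):
    """Simulate under the null and return (naive, selective) rejection
    rates among the draws that pass the assumed selection rule |z| > c."""
    rng = random.Random(seed)
    selected = naive_rej = selective_rej = 0
    for _ in range(n_sim):
        z = rng.gauss(0.0, 1.0)
        if abs(z) <= c:
            continue  # not selected, so no test would be reported
        selected += 1
        naive_rej += naive_p(z) < alpha
        selective_rej += selective_p(z, c) < alpha
    return naive_rej / selected, selective_rej / selected


naive_rate, selective_rate = rejection_rates()
print(f"naive: {naive_rate:.3f}, selective: {selective_rate:.3f}")
```

In this toy setting the selection event is a simple interval, so the conditional distribution is available exactly. The abstract's point is that for algorithms such as MCP no such explicit region is available, which is what motivates a resampling-based approximation.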
