Grouped feature screening for ultrahigh-dimensional classification via Gini distance correlation
Gini distance correlation (GDC) was recently proposed to measure the dependence between a categorical variable, Y, and a numerical random vector, X. It mutually characterizes independence between X and Y. In this article, we utilize the GDC to establish a feature screening for ultrahigh-dimensional discriminant analysis where the response variable is categorical. It can be used for screening individual features as well as grouped features. The proposed procedure possesses several appealing properties. It is model-free. No model specification is needed. It holds the sure independence screening property and the ranking consistency property. The proposed screening method can also deal with the case that the response has divergent number of categories. We conduct several Monte Carlo simulation studies to examine the finite sample performance of the proposed screening procedure. Real data analysis for two real life datasets are illustrated.
READ FULL TEXT