High-Dimensional Feature Selection for Genomic Datasets

02/27/2020
by   Majid Afshar, et al.
0

In the presence of large dimensional datasets that contain many irrelevant features (variables), dimensionality reduction algorithms have proven to be useful in removing features with low variance and combine features with high correlation. In this paper, we propose a new feature selection method which uses singular value decomposition of a matrix and the method of least squares to remove the irrelevant features and detect correlations between the remaining features. The effectiveness of our method has been verified by performing a series of comparisons with state-of-the-art feature selection methods over ten genetic datasets ranging up from 9,117 to 267,604 features. The results show that our method is favorable in various aspects compared to state-of-the-art feature selection methods.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset