A Comparative Performance Analysis of Explainable Machine Learning Models With And Without RFECV Feature Selection Technique Towards Ransomware Classification
Ransomware has emerged as one of the major global threats in recent days. The alarming increasing rate of ransomware attacks and new ransomware variants intrigue the researchers in this domain to constantly examine the distinguishing traits of ransomware and refine their detection or classification strategies. Among the broad range of different behavioral characteristics, the trait of Application Programming Interface (API) calls and network behaviors have been widely utilized as differentiating factors for ransomware detection, or classification. Although many of the prior approaches have shown promising results in detecting and classifying ransomware families utilizing these features without applying any feature selection techniques, feature selection, however, is one of the potential steps toward an efficient detection or classification Machine Learning model because it reduces the probability of overfitting by removing redundant data, improves the model's accuracy by eliminating irrelevant features, and therefore reduces training time. There have been a good number of feature selection techniques to date that are being used in different security scenarios to optimize the performance of the Machine Learning models. Hence, the aim of this study is to present the comparative performance analysis of widely utilized Supervised Machine Learning models with and without RFECV feature selection technique towards ransomware classification utilizing the API call and network traffic features. Thereby, this study provides insight into the efficiency of the RFECV feature selection technique in the case of ransomware classification which can be used by peers as a reference for future work in choosing the feature selection technique in this domain.
READ FULL TEXT