Towards a Standard Feature Set of NIDS Datasets
Network Intrusion Detection System (NIDS) datasets are essential tools used by researchers to train and evaluate Machine Learning (ML)-based NIDS models. There are currently five datasets, known as NF-UNSW-NB15, NF-BoT-IoT, NF-ToN-IoT, NF-CSE-CIC-IDS2018 and NF-UQ-NIDS, which share a common feature set. However, their performance in classifying network traffic, particularly in multi-class classification, is often unreliable. Therefore, this paper proposes a standard NetFlow feature set to be used in future NIDS datasets, given the substantial benefits of a common feature set. NetFlow has been widely utilised in the networking industry for its practical scaling properties. The proposed feature set is evaluated by extracting and labelling its features from four well-known datasets. The newly generated datasets are known as NF-UNSW-NB15-v2, NF-BoT-IoT-v2, NF-ToN-IoT-v2, NF-CSE-CIC-IDS2018-v2 and NF-UQ-NIDS-v2. Their performance has been compared to that of their respective original datasets using an Extra Trees classifier, showing a marked improvement in attack detection accuracy. The datasets have been made publicly available for research purposes.
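The comparison described above, in which the same Extra Trees classifier is trained on two versions of a dataset that differ only in their feature set, can be sketched as follows. This is a minimal illustration, not the paper's pipeline: it assumes scikit-learn is available, uses synthetic data in place of the NF datasets, and the feature counts (10 and 40), the four classes, the hyperparameters, and the helper name `evaluate_feature_set` are all hypothetical choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split


def evaluate_feature_set(X, y, seed=0):
    """Train an Extra Trees classifier on (X, y) and score it on a held-out split.

    Hypothetical helper: the split ratio and hyperparameters are assumptions,
    not values taken from the paper.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    model = ExtraTreesClassifier(n_estimators=50, random_state=seed)
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, predictions),
        "macro_f1": f1_score(y_test, predictions, average="macro"),
    }


# Synthetic stand-in for a labelled flow dataset with four traffic classes.
X_full, y = make_classification(
    n_samples=2000,
    n_features=40,
    n_informative=10,
    n_classes=4,
    random_state=0,
)

# Stand-in for a smaller feature set: keep only the first 10 columns.
X_small = X_full[:, :10]

results = {
    "small_feature_set": evaluate_feature_set(X_small, y),
    "full_feature_set": evaluate_feature_set(X_full, y),
}

for name, scores in results.items():
    print(name, scores)
```

Because the classifier, split, and random seed are held fixed, any difference between the two sets of scores comes from the feature set alone, which is the kind of like-for-like comparison the abstract describes. Macro-averaged F1 is reported alongside accuracy here because multi-class traffic data is typically imbalanced; that choice of metric is also an assumption of this sketch.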