The feature selection work reviewed above is summarized by approach: filter methods in Table 18, wrapper methods in Table 19 and hybrid methods in Table 20. Each table lists the literature reference, the name of the proposed method, the number of features selected in the paper, the feature numbers according to Table 1, the classifier used to evaluate the proposed method, the evaluation criteria and the results obtained.
Intrusion detection systems (IDS) have become a vital component of almost every computer and network security infrastructure. As network speeds increase, there is an emerging need for IDS to be lightweight, efficient and accurate, with high detection rates (DR) and low false alarm rates (FAR). IDS also face the curse of feature dimensionality and growing data complexity, which is why feature selection has become a very important part of intrusion detection. Feature selection chooses a subset of relevant features and removes irrelevant and redundant features from the dataset, so that a robust, efficient, accurate and lightweight IDS can be built that meets real-time requirements.
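As a small illustration of the two rates mentioned above, the following sketch computes DR and FAR from binary predictions. It assumes the commonly used definitions DR = TP / (TP + FN) and FAR = FP / (FP + TN); the function name `detection_metrics` and the toy labels are hypothetical and are not taken from any surveyed paper.

```python
def detection_metrics(y_true, y_pred, attack_label=1):
    """Return (DR, FAR) for binary labels.

    Assumed definitions (not stated explicitly in the survey):
      DR  = TP / (TP + FN)   share of attacks that are detected
      FAR = FP / (FP + TN)   share of normal records flagged as attacks
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == attack_label:
            if pred == attack_label:
                tp += 1  # attack correctly detected
            else:
                fn += 1  # attack missed
        else:
            if pred == attack_label:
                fp += 1  # normal record raised a false alarm
            else:
                tn += 1  # normal record correctly passed
    dr = tp / (tp + fn) if (tp + fn) else 0.0
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return dr, far


# Toy example: 4 attack records (label 1) and 6 normal records (label 0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
dr, far = detection_metrics(y_true, y_pred)
print(f"DR = {dr:.3f}, FAR = {far:.3f}")
```

In this toy case three of the four attacks are detected and one of the six normal records is flagged, so a feature subset that raises DR while keeping FAR low would be preferred under these criteria.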
Researchers have proposed a large number of feature selection methods for intrusion detection to deal with these problems. This paper surveys this fast-developing field and identifies the main contributions of feature selection research for intrusion detection. We showed why feature selection is vital in IDS and surveyed the existing feature selection methods for IDS, categorised as filter, wrapper and hybrid. In Sections 5 and 6 we also presented the performance of these methods under different metrics on the KDD Cup’99 dataset, together with the extracted feature sets, the classifiers used to evaluate them, and the strengths, limitations and future work of each proposed method. The following are useful issues for future research:
A single classifier may no longer be a good solution for evaluating the extracted feature set and building a robust IDS. Designing more sophisticated classifiers, by combining multiple classifiers or by combining ensemble and hybrid classifiers, may therefore enhance the robustness and performance of IDS.
After comparing the existing feature selection methods for intrusion detection, we found that identifying an optimal feature set remains an open research problem.
Feature selection algorithms still need better search strategies and evaluation criteria in order to build efficient and lightweight intrusion detection systems.
The robustness of the extracted features can be enhanced by using an ensemble of feature selection methods combined with appropriate evaluation criteria.
Having surveyed these feature selection methods, we cannot say, to the best of our knowledge, which method performs best under which classifier for intrusion detection.
To the best of our knowledge, most of the proposed methods address two-class classification (normal versus attack). Very little work has been done on multi-class classification (five classes: four attack classes and one normal class). The research in many of the surveyed papers can therefore be extended to multi-class classification.
The classes in KDD Cup’99 are unbalanced in both the training and test sets, as can be seen in Table 1. The Normal and DoS classes have enough instances, whereas Probe and R2L, and particularly U2R, have few. These three classes (Probe, R2L, U2R) have poor classification rates because of the small number of instances in the training set. Developing methods, combined with appropriate evaluation criteria, that alleviate the shortage of instances for these classes is therefore a direction for future research.
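One simple way to approach the class imbalance issue listed above is to weight each class inversely to its frequency, so that errors on rare classes cost more during training. The sketch below illustrates this common heuristic, w_c = N / (K · n_c); it is not a method proposed in the surveyed papers. The function name `inverse_frequency_weights` is hypothetical, and the label counts are illustrative toy values, not the actual KDD Cup’99 class counts.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by N / (K * n_c).

    N   = total number of records
    K   = number of distinct classes
    n_c = number of records in class c
    Rare classes receive proportionally larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Illustrative toy distribution only (NOT the real KDD Cup'99 counts).
toy_labels = (
    ["normal"] * 50 + ["dos"] * 40 + ["probe"] * 6 + ["r2l"] * 3 + ["u2r"] * 1
)
weights = inverse_frequency_weights(toy_labels)
for cls, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{cls:>6}: {weight:.2f}")
```

With these weights every class contributes the same total weight (N / K) to a weighted loss, which is one reason the heuristic is often used as a baseline before resampling or cost-sensitive methods are tried.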
We can conclude that, as reported in the literature, some features are genuinely significant for distinguishing the normal class from the attack types. This survey also shows that no single generic classifier classifies all attack types best, and that different researchers use different classifiers to evaluate their feature sets. This paper systematically summarized the contributions of each researcher and identified a number of significant research problems in this field. We hope that this survey gives readers useful insights, a broad overview and new research directions in this field.