Research on Phishing Attack Recognition Mechanism Based on Improved Decision Tree Algorithm

Haoran Yang1, Yi Li2, Chang Liu3, Yichuan Zhou4
1 Beijing Troy Cloud Data Technology Co., Ltd., Beijing, 100071, China
2 School of Computer Science and Technology, Jilin University, Beijing, 100010, China
3 Department of Hospitality and Business Management, The Technological and Higher Education Institute of Hong Kong, 999077, Hong Kong
4 Shanghai Shiyun Information Technology Co., Ltd., Shanghai, 200120, China

Abstract

Phishing is a growing threat on online networks as Web, mobile-device, and social-networking technologies evolve, so effective methods for detecting and preventing phishing attacks are urgently needed. In this paper, a phishing detection model based on a decision tree and optimal feature selection is proposed. An optimal feature selection algorithm, built on a newly defined feature evaluation metric (f_Value), a decision tree, and local search, is designed to prune negative and useless features. This mitigates the overfitting problem that arises when training neural network classifiers. The optimal set of sensitive features and the optimal structure for training the neural network classifier are obtained by tuning the parameters. Experiments on a CART-based phishing detection system, together with comparative experiments against other phishing detection models, are also conducted. The experimental results show that the precision, accuracy, and recall of the proposed improved decision tree algorithm are 92.7%, 96.5%, and 88.3%, respectively, on the PhishTank dataset, and 98.3%, 99.1%, and 99.5%, respectively, on the Vrbančič-small dataset, indicating that the proposed CART-based model outperforms many existing models.

Keywords: decision tree, optimal feature selection, phishing detection model, neural network
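
As a rough illustration of the two ingredients named in the abstract, the sketch below shows the Gini-impurity split search that underlies a standard CART node, and how precision, accuracy, and recall are computed from predictions. It is not the improved algorithm or the f_Value metric proposed in this paper; the feature names (URL length, IP-address flag), the toy samples, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Search every feature/threshold pair for the binary split
    x[f] <= t with the lowest weighted Gini impurity (standard CART step).
    Returns (feature_index, threshold, weighted_gini)."""
    best = (None, None, gini(labels))
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send samples to both children
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (f, t, score)
    return best


def precision_accuracy_recall(y_true, y_pred, positive=1):
    """Compute (precision, accuracy, recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, accuracy, recall


# Hypothetical toy data: [url_length, has_ip_address]; label 1 = phishing.
toy_rows = [[20, 0], [25, 0], [80, 1], [95, 1], [30, 0], [70, 0]]
toy_labels = [0, 0, 1, 1, 0, 1]

feature, threshold, impurity = best_split(toy_rows, toy_labels)
print(feature, threshold, impurity)
print(precision_accuracy_recall([1, 1, 0, 0], [1, 0, 0, 0]))
```

On this toy data the split search selects the URL-length feature, since a single threshold on it separates the two classes. A feature selection stage of the kind described in the abstract would run before such a tree is trained, removing features that do not help the classifier.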