Deep Neural Network with Stacked Denoise Auto Encoder for Phishing Detection

Authors

Abstract

Phishing is the theft of sensitive information, such as credit card details, usernames, passwords, and social security numbers, by means of a fake page that imitates a trusted website. The attacker builds a look-alike webpage, either by copying the legitimate page or by making small manipulations to it, so that online users cannot distinguish the legitimate website from the fake one. A Deep Neural Network (DNN) was introduced to detect phishing Uniform Resource Locators (URLs). Initially, a 30-dimensional feature vector was constructed from URL-based, Hypertext Markup Language (HTML)-based, and domain-based features, and these features were processed by the DNN to detect phishing URLs. However, irrelevant, redundant, and noisy features in the dataset increase the complexity of the DNN classifier, so feature selection is required for efficient phishing detection. Feature selection is time-consuming, though, when it is carried out as an independent process. In this paper, therefore, the feature vector is generated by the DNN itself using a Stacked Denoise Auto Encoder (SDAE). Moreover, noisy data such as missing features reduce the efficiency of phishing detection, so the SDAE is trained to reconstruct a clean input feature vector. The initial input feature vector is corrupted by setting some of its features to zero. The corrupted feature vector is then mapped by a basic auto encoder to a hidden representation, from which the input feature vector is reconstructed. The reconstructed features are given as input to the DNN, which selects the most relevant features and predicts whether a URL is phishing. The sparse feature representation of the SDAE thereby increases the classification accuracy of the DNN. Experiments are conducted on the Ham, Phishing Corpus, and Phishload datasets, and accuracy, precision, recall, and F-measure are reported to demonstrate the effectiveness of DNN-SDAE.
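The corrupt-then-reconstruct step described in the abstract can be illustrated with a minimal sketch of a single denoising auto encoder layer in NumPy. This is not the authors' implementation: the function names (`corrupt`, `train_dae`, `encode`), the sigmoid activations, the squared-error loss, and all hyperparameters (hidden size, corruption probability, learning rate) are assumptions chosen for illustration. A stacked version would presumably train further layers of the same kind on the hidden representation of the previous one.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def corrupt(x, drop_prob, rng):
    """Masking noise: set each feature to zero with probability drop_prob."""
    mask = rng.random(x.shape) >= drop_prob
    return x * mask

def train_dae(x, n_hidden=10, drop_prob=0.3, lr=0.5, epochs=200, seed=0):
    """Train one denoising auto encoder layer with plain gradient descent.

    All hyperparameter defaults are illustrative assumptions.
    Returns the learned parameters and the per-epoch reconstruction loss.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        x_tilde = corrupt(x, drop_prob, rng)      # corrupted input
        h = sigmoid(x_tilde @ w1 + b1)            # hidden representation
        x_hat = sigmoid(h @ w2 + b2)              # reconstruction
        err = x_hat - x                           # compared with the CLEAN input
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the squared error, averaged over the samples.
        g_out = 2.0 * err * x_hat * (1.0 - x_hat) / n
        g_hid = (g_out @ w2.T) * h * (1.0 - h)
        w2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        w1 -= lr * (x_tilde.T @ g_hid)
        b1 -= lr * g_hid.sum(axis=0)
    return (w1, b1, w2, b2), losses

def encode(x, params):
    """Map (uncorrupted) inputs to the learned hidden representation."""
    w1, b1, _, _ = params
    return sigmoid(x @ w1 + b1)

if __name__ == "__main__":
    # Hypothetical stand-in data: 200 samples of 30 binary features.
    rng = np.random.default_rng(1)
    x = (rng.random((200, 30)) < rng.random(30)).astype(float)
    params, losses = train_dae(x)
    print("first/last reconstruction loss:", losses[0], losses[-1])
    print("hidden representation shape:", encode(x, params).shape)
```

In this sketch the loss compares the reconstruction with the clean input rather than the corrupted one, which is what makes the layer denoising; the output of `encode` is what would be passed on to a downstream classifier.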

Author Biographies

Sumathi Kothandan, CMS College of Science and Commerce, Coimbatore, Tamilnadu, India

Kothandan Sumathi, Ph.D. Research Scholar, CMS College of Computer Applications, CMS College of Science and Commerce, Chinnavedampatty, Coimbatore, Tamilnadu, India. She is a student of CMS College of Science and Commerce, affiliated to Bharathiar University, Coimbatore, Tamilnadu, India. She is pursuing a Ph.D. in Computer Science and is doing research in the area of information security.

Vijayan Sujatha, CMS College of Science and Commerce, Coimbatore, Tamilnadu, India

Vijayan Sujatha, Dean-Administration, CMS College of Computer Applications, CMS College of Science and Commerce, Chinnavedampatty, Coimbatore, Tamilnadu, India. She has 16 years of teaching experience and 2 years of IT industry experience. Her areas of specialization are web mining, IoT, and Big Data analysis. She has published 24 research articles in national and international journals and has presented papers at several national conferences, seminars, and workshops. She is currently guiding M.Phil. and Ph.D. scholars. She has a sound knowledge of programming languages and DOT NET frameworks and has developed two live projects using Visual programming. She also sets question papers for universities in Tamilnadu.

Published

2019-07-10

How to Cite

Kothandan, S., & Sujatha, V. (2019). Deep Neural Network with Stacked Denoise Auto Encoder for Phishing Detection. International Journal of Machine Learning and Networked Collaborative Engineering, 3(02), 114–124. Retrieved from https://mlnce.net/index.php/Home/article/view/87