Preprint
Article

This version is not peer-reviewed.

Deep Learning-Based Speech and Vision Synthesis to Improve Phishing Attack Detection through a Multi-layer Adaptive Framework

Submitted:

27 February 2024

Posted:

27 February 2024


Abstract
Attackers continually refine their phishing techniques to bypass state-of-the-art detection methods, which poses significant challenges to researchers in both industry and academia because current approaches cannot detect complex phishing attacks. Current anti-phishing methods therefore remain vulnerable to complex phishing, owing to the increasingly sophisticated tactics adopted by attackers and the rate at which new tactics are developed to evade detection. In this research, we propose an adaptable framework that combines deep learning and Random Forest to read images, synthesize speech from deep-fake videos, and apply natural language processing at various prediction layers to significantly increase the performance of machine learning models for phishing attack detection.

1. Introduction

Traditional phishing detection methods such as user education [1] and rule-based methods [2] are insufficient against sophisticated phishing techniques, which has led researchers to explore AI-based solutions. Several machine learning-based models have been proposed, but attackers use advanced methods that change continuously, which renders previously proposed models ineffective against sophisticated attacks [3]. Tools such as PhishTank and OpenPhish were created to detect malicious Uniform Resource Locators (URLs), but the rate at which malicious websites are created, coupled with the sophistication of the deception methods, easily overwhelms such systems: according to the dataprot 2023 phishing statistics report, a phishing site is created every 11 seconds.
A common problem with current ML models is the quality of the training dataset, which has a significant impact on both the accuracy and the overall performance of the model [4]. The available data do not reflect the ever-changing strategies that attackers use to fool existing machine learning-based models and evade detection. In addition, the difficulty of balancing human factors against model accuracy causes newly registered legitimate websites to be wrongly flagged because of their weak domain authority. Phishing websites have a short life span: they are taken down quickly, often before detection, and replaced by new ones. It is this cycle of taking down a phishing website after a campaign and immediately creating a new one [5] to begin another campaign, coupled with ever-changing and sophisticated techniques, that makes the problem so significant.
Although several models and machine learning-based frameworks have been proposed, attackers keep increasing the sophistication of phishing attacks to bypass state-of-the-art anti-phishing detection and prevention systems, which leaves previously proposed models relatively ineffective against more complex attacks. Current detection methods remain vulnerable to complex or more sophisticated forms of phishing because of their reliance on [6,7,8], blacklists/whitelists [9], natural language processing [10], visual similarity [10], and rules [11,12], for the following reasons:
  • Having understood how machine learning-based models work, attackers increasingly rely on asymmetrical methods, uploading images and videos under various pretexts to evade detection. None of the proposed models can handle such attacks on its own.
  • A very small change to a blacklisted URL makes the blacklist/whitelist detection method fail. There is also no worldwide centralized database of whitelisted or blacklisted URLs, which makes this method even more vulnerable: if company X blacklists a phishing URL on its internal server, the attacker can still use it successfully against company Y.
  • Machine learning detection methods that rely on features such as the URL, webpage content, website traffic, search engine data, WHOIS records, and PageRank have their own vulnerabilities. First, such a classifier will misclassify a phishing URL hosted on a hacked or compromised server as benign, leading to false negatives. Second, using domain age as a training feature leads to more false positives, because the URL of a newly registered legitimate company website will be misclassified: the domain name was recently registered, the page rank is zero, and traffic is low. Third, the values of those features are obtained from third-party websites, so the method fails whenever a third-party website is down.
  • The visual similarity-based heuristic method compares pre-stored signatures of a known website (images, font styles, page layout, screenshots, and so on) with those of a new website, so it has general difficulty detecting anomalies in a newly hosted phishing site.
  • The majority of existing machine learning models are trained on textual features of the URL such as "#", ".", the presence of an Internet Protocol address, URL length, domain levels, and so on. This does not help, because any attacker with a little knowledge of web technologies can build what we call a "friendly URL" that avoids all those features, whatever the programming language or framework adopted (Java, C#, Python, PHP). With a friendly URL, such models are bound to misclassify, which increases the false negative rate.
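The two URL-related weaknesses listed above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the blacklist contents, the example URLs (on reserved `.test` domains), and the helper names `is_blacklisted` and `lexical_features` are hypothetical and do not come from any of the cited systems.

```python
import re
from urllib.parse import urlparse

# Hypothetical blacklist holding one previously reported phishing URL.
BLACKLIST = {"http://login.example-bank.test/verify.php?id=123"}

def is_blacklisted(url: str) -> bool:
    """Exact-match lookup, as a simple list-based filter might do."""
    return url in BLACKLIST

def lexical_features(url: str) -> dict:
    """A few URL-text features of the kind listed above (illustrative only)."""
    host = urlparse(url).hostname or ""
    return {
        "length": len(url),
        "dots": url.count("."),
        "hashes": url.count("#"),
        "has_ip": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "domain_levels": host.count(".") + 1 if host else 0,
    }

reported = "http://login.example-bank.test/verify.php?id=123"
variant = "http://login.example-bank.test/verify.php?id=124"  # one character changed
friendly = "https://example.test/account/verify"              # "friendly" URL

print(is_blacklisted(reported), is_blacklisted(variant))
print(lexical_features(friendly))
```

The one-character variant is no longer found by the exact-match lookup, and the friendly URL produces feature values that carry none of the usual warning signs, which is why a detector that sees only the URL text has little to work with in these cases.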
For any machine learning-based phishing detection method to be effective in real time, it must address each of the reasons stated above for which existing state-of-the-art anti-phishing methods continue to be vulnerable to increasingly sophisticated attacks. Past reviews of phishing attack detection have largely been organized around approaches and classifications. Zieni et al. [13] focus their review on list-based, similarity-based, and machine learning-based approaches to phishing detection in order to identify open research gaps. Angad et al. [14] focus on the advantages and limitations of existing approaches, and use a discussion of related application scenarios as guidance to propose a new anti-phishing method. Yifei Wang [15] groups widely used phishing detection methods into seven categories and summarizes them. All previously proposed models, approaches, and frameworks share a common limitation: they are either text-based or URL-based, which makes it difficult for them to detect complex phishing attacks in which the attacker uses deep-fake videos, deep-fake images, text embedded in images, or a combination of any of these with traditional textual content.
In this research, we first reviewed some of the most recent works on phishing detection and state-of-the-art algorithms from the past 5 years to investigate the performance of machine learning and deep learning classifiers on phishing detection tasks. We then propose a multi-layered adaptive framework that uses computer vision to read images on a phishing webpage, and converts videos from the webpage to audio before synthesizing the speech into text, to increase detection of phishing attacks. We use a combination of the random forest algorithm and Long Short-Term Memory (LSTM) at different layers of the framework. The contribution of our research is an adaptive multi-layered framework that uses computer vision to read graphic images, synthesizes speech from uploaded videos, and applies natural language processing at various prediction layers to significantly increase the performance of machine learning-based models for phishing detection. Our artifacts, which consist of source code, dataset, images, videos, and audio files, have been uploaded to a public GitHub repository for reproducibility. The artifacts can be found on GitHub at: Deep Learning-Based Speech and Vision Synthesis to Improve Phishing Attack Detection through a Multi-layer Adaptive Framework, and also on the Code Ocean computational research platform, with the exception of the internally generated deep-fake video and audio data files, which are withheld for privacy.

3. Experimental Setup

3.1. Dataset

The complexity of the research means we cannot rely on a single dataset, so we use two publicly available datasets. There is no publicly available dataset for video-based, audio-based, or image-based phishing, so we generated these internally by simulation.
We use the "B" version of the Mendeley phishing dataset, which was designed as a benchmark for training machine learning models for phishing detection. It includes 11430 URLs and 87 extracted features. The features fall into three (3) categories: 56 are extracted from the structure and syntax of the URL, 24 are extracted from the content of the corresponding pages, and the remaining 7, which have the greatest impact on the prediction outcome, are extracted from external services. 50% of the URLs in the dataset are "phishing" and the remaining 50% are "legitimate". This balance ensures that the prediction result is not unfairly tilted toward or against a particular category.
The second public dataset is the Spam Message Classification dataset from Kaggle, containing 5157 unique records. The remaining data, in the form of deep-fake videos and images, were simulated and internally generated because no such datasets are available in public repositories.

3.2. Settings

The proposed framework has 4 predictive layers, each suited to a specialized category of data to ensure adaptability. We show the results in several settings.
  • Layer 1 (URL-Based Training): We performed traditional machine learning training on the first layer using the Mendeley phishing dataset. Out of the 87 available features, we use chi2 from the sklearn feature selection library to select the best 19 features, having set the hyper-parameter k-value to 7 for optimal result, which gave us a combination of the best 19 features. The dataset was split so that 80% was used for training and the remaining 20% for validation. We chose random forest because of its suitability for URL-based phishing detection relative to other classifiers [26,27,28,29,30,31]. During tuning, we varied both the maximum depth and the random state and observed only a negligible change in accuracy up to a depth of 39. For depths greater than 39, accuracy remained constant until we changed the random state to 1, at which point we observed a small change. We finally fixed the random state at 0 so that the same trees are generated on every run.
  • Layer 2 (Image Processing): At this layer, the Hypertext Markup Language (HTML) of the actual phishing webpage is scraped in the background, without any actual navigation, for security purposes. Scraping the HTML content in the background protects the server and the network from potential drive-by attacks that might originate from the phishing site. All HTML markup was removed from the extracted HTML with a regular expression (regex), as we needed only the content between the opening and closing body tags, which is the section served by the web server to potential victims on a phishing website. This step securely brings whatever content (text, videos, or images) would be served to a potential victim into the framework for further processing, which ensures that it cannot evade detection.
Table 1. Negligible impact of max_depth and random_state on accuracy.
max_depth   random_state   accuracy (%)
30          0              90.1
20          0              90.0
10          0              88.1
40          1              88.2
50          1              89.8
60          1              88.1
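The chi-squared feature selection used at Layer 1 can be sketched in a few lines. This is a dependency-free illustration of the kind of score that such a selector ranks features by, not the authors' code: the function names and the toy data are made up, and features are assumed to be non-negative. With scikit-learn installed, `SelectKBest(chi2, k=...)` followed by `RandomForestClassifier(max_depth=..., random_state=0)` would be the usual library route.

```python
from typing import List, Sequence

def chi2_scores(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Chi-squared score per feature for non-negative features and class labels."""
    n_samples, n_features = len(X), len(X[0])
    classes = sorted(set(y))
    feature_totals = [sum(row[j] for row in X) for j in range(n_features)]
    scores = [0.0] * n_features
    for c in classes:
        class_prob = sum(1 for label in y if label == c) / n_samples
        for j in range(n_features):
            # Observed feature mass inside class c versus what independence predicts.
            observed = sum(row[j] for row, label in zip(X, y) if label == c)
            expected = class_prob * feature_totals[j]
            if expected > 0:
                scores[j] += (observed - expected) ** 2 / expected
    return scores

def select_k_best(X, y, k: int) -> List[int]:
    """Indices of the k highest-scoring features, best first."""
    scores = chi2_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

# Toy data (made up): feature 0 tracks the label, feature 1 is constant.
X = [[1, 1], [1, 1], [0, 1], [0, 1]]
y = [1, 1, 0, 0]
print(chi2_scores(X, y))         # feature 0 scores higher than feature 1
print(select_k_best(X, y, k=1))
```

A feature whose total is spread across classes exactly in proportion to class frequency scores zero, so it would be among the first dropped when only the top-ranked features are kept.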
Next, we wrote an algorithm that iterates through every filtered word in the content and returns only the words that end in an image extension. Because the webpage was scraped, our program automatically returns the full paths of those images on the web server where they are hosted. The returned list is then iterated, and each image is passed through an Optical Character Recognition (OCR) library, which uses computer vision to read the content of the image into raw text and forwards it to the next layer for further processing.
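A minimal sketch of the extraction steps described for Layer 2 follows. It assumes the HTML has already been fetched as a string; the extension list, the function names, and the sample page are illustrative assumptions, and the OCR call itself is left out because the text does not name the OCR library used.

```python
import re
from urllib.parse import urljoin

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".bmp", ".webp")  # assumed list

def extract_body(html: str) -> str:
    """Return the content between <body ...> and </body>, or the whole input."""
    match = re.search(r"<body[^>]*>(.*?)</body>", html, flags=re.IGNORECASE | re.DOTALL)
    return match.group(1) if match else html

def find_media_urls(body: str, base_url: str, extensions=IMAGE_EXTENSIONS) -> list:
    """Collect src/href values ending in one of the extensions, as absolute URLs."""
    urls = []
    pattern = r"""(?:src|href)\s*=\s*["']([^"']+)["']"""
    for value in re.findall(pattern, body, flags=re.IGNORECASE):
        path = value.split("?", 1)[0].lower()  # ignore any query string
        if path.endswith(extensions):
            urls.append(urljoin(base_url, value))
    return urls

def strip_tags(body: str) -> str:
    """Remove markup and collapse whitespace, leaving the visible text."""
    text = re.sub(r"<[^>]+>", " ", body)
    return re.sub(r"\s+", " ", text).strip()

html = """<html><head><title>x</title></head>
<body><h1>Verify your account</h1><img src="/img/notice.png"><a href="info.html">more</a></body></html>"""
body = extract_body(html)
print(strip_tags(body))
print(find_media_urls(body, "https://example.test/login/"))
```

Each URL returned by `find_media_urls` would then be downloaded and handed to the OCR step. Regex-based HTML handling is kept here because the text describes it; an HTML parser would be more robust on malformed pages.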
Algorithm 1. LSTM model training for the natural language processing (NLP) task
  • train_LSTM($f_i$, $w_i$, $o_j$)
  • for epochs = 1 to $N$ do
  •    while ($j \le m$) do
  •      Randomly initialize $w_i = \{w_1, w_2, \ldots, w_n\}$
  •      Input $o_j = \{o_1, o_2, \ldots, o_m\}$ in the input layer
  •      Forward propagate $(f_i \cdot w_i)$ through the layers until getting the predicted result $y$
  •      Compute $e = (y - \hat{y})^2$
  •      Back propagate $e$ from right to left through the layers
  •      Update $w_i$
  •    end while
  • end for
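The loop structure of Algorithm 1 (forward pass, squared error, backward pass, weight update) can be shown on a much smaller model. The sketch below trains a single sigmoid unit by gradient descent, not an LSTM; the learning rate, epoch count, seed, and toy data are assumptions chosen only to make the example run, and the weights are initialized once before the loop.

```python
import math
import random

def predict(weights, bias, x):
    """Forward pass of one sigmoid unit."""
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train_unit(samples, labels, epochs=200, lr=0.5, seed=0):
    """Gradient descent on the squared error e = (y - target)^2."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in samples[0]]  # random initial weights
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            y = predict(weights, bias, x)                 # forward propagate
            grad_z = 2.0 * (y - target) * y * (1.0 - y)   # de/dz via the chain rule
            weights = [w - lr * grad_z * v for w, v in zip(weights, x)]  # update weights
            bias -= lr * grad_z
    return weights, bias

# Toy, linearly separable data: input 0 -> class 0, input 1 -> class 1.
weights, bias = train_unit([[0.0], [1.0]], [0, 1])
p0, p1 = predict(weights, bias, [0.0]), predict(weights, bias, [1.0])
print(round(p0, 3), round(p1, 3))
```

An LSTM follows the same outer loop, with the backward pass carried through time steps and gates; deep learning libraries handle that part automatically.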
  • Layer 3 (Speech Synthesis): At the previous layer, the HTML of the potential phishing site was scraped in the background, without any actual navigation, for security purposes. The content returned from Layer 2 is iterated with a "for" loop, which visits every filtered word and returns only those that end in a video extension. The returned list contains the full paths of those videos on the server where they are hosted.
Figure 1. Step-wise speech synthesis of each audio file during execution of the "for" loop in Layer 3 to produce text, which is then passed on to Layer 4. Text from the phishing sites was processed at Layer 1, images were processed at Layer 2, and videos were processed at Layer 3. All text was finally output to Layer 4 for the final prediction using Long Short-Term Memory, a variant of the Recurrent Neural Network.
Next, we iterated through each of the returned video files. On each iteration, we used a combination of gtts, pydub, and moviepy to convert the video file to an audio file in ".wav" format, after which the speech in each file was synthesized to text using natural language processing speech recognition. The final output of this layer is a raw text file obtained from synthesizing the speech. At this stage, we have the images read to text at Layer 2 and the speech in the videos synthesized to text. We then combined the text from Layer 1, Layer 2, and Layer 3 and forwarded it to Layer 4 for the final prediction.
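The control flow of Layer 3 can be sketched as below. The audio extraction and speech recognition steps are passed in as callables so that the example runs without any audio library; in practice they would wrap the libraries named above. The extension list, all function names, and the stand-in callables are assumptions made for illustration.

```python
from typing import Callable, Iterable, List

VIDEO_EXTENSIONS = (".mp4", ".mov", ".avi", ".webm", ".mkv")  # assumed list

def filter_videos(paths: Iterable[str]) -> List[str]:
    """Keep only paths whose extension marks them as video files."""
    return [p for p in paths if p.lower().split("?", 1)[0].endswith(VIDEO_EXTENSIONS)]

def videos_to_text(paths: Iterable[str],
                   to_wav: Callable[[str], str],
                   recognize: Callable[[str], str]) -> List[str]:
    """Convert each video to a .wav file, then transcribe the audio to text."""
    transcripts = []
    for video_path in filter_videos(paths):
        wav_path = to_wav(video_path)            # audio-extraction step
        transcripts.append(recognize(wav_path))  # speech-recognition step
    return transcripts

def combine_for_layer4(page_text: str, image_texts: List[str], video_texts: List[str]) -> str:
    """Merge the text gathered so far into one string for the final classifier."""
    parts = [page_text, *image_texts, *video_texts]
    return " ".join(part.strip() for part in parts if part and part.strip())

# Stand-in callables so the sketch runs without audio libraries.
def fake_to_wav(path: str) -> str:
    return path.rsplit(".", 1)[0] + ".wav"

def fake_recognize(wav_path: str) -> str:
    return f"transcript of {wav_path}"

found = ["https://example.test/a.mp4", "https://example.test/logo.png"]
spoken = videos_to_text(found, fake_to_wav, fake_recognize)
print(combine_for_layer4("Verify your account", ["Enter your PIN"], spoken))
```

Keeping the conversion and recognition steps behind plain callables also makes it easy to swap libraries or to test the loop in isolation.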
  • Layer 4 (Final Prediction with LSTM): We chose the LSTM network because it offers an effective solution to vanishing and exploding gradients, which cause the long-term dependency problem in Recurrent Neural Networks. The cell state in the LSTM network serves as a memory, giving the network the ability to remember the past. At Layer 4, we have all the processed text output from the previous layers, and we need to capture the long-term dependencies, short-term dependencies, and sequences in it, which the cell state of the LSTM network provides, to ensure a more accurate prediction.
We built an LSTM deep learning-based model in which 80% of 5572 samples were used for training and the remaining 20% for validation. The vocabulary was capped at a maximum of 10,000 features, of which the dataset contains 9004 unique words. During training and validation, we initially saw a wide validation loss leading to low prediction ability, and continued to adjust the number of layers, features, epochs, batch size, and activation. We obtained the best result with the following parameters: dense layer = 1; activation = sigmoid; epochs = 10; batch size = 60; feature size = 32.
Figure 2. LSTM network resulting in 0.98 accuracy at optimal parameter.
Figure 3. LSTM network resulting in 0.08 loss at optimal parameter.
The LSTM model sits at the 4th layer of the framework, where the processed outputs of all previous layers are merged and passed to it to make the final prediction.
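A possible shape for the Layer 4 model, using the hyper-parameters reported above, is sketched below. The text-encoding helpers are plain Python; the model builder assumes TensorFlow/Keras is installed. The number of LSTM units, the optimizer, the loss, and the reading of "feature size = 32" as the embedding width are assumptions, since the text does not state them.

```python
import re
from collections import Counter

MAX_FEATURES = 10_000  # vocabulary cap stated in the text
EMBED_DIM = 32         # "feature size = 32", read here as embedding width (assumption)
EPOCHS, BATCH_SIZE = 10, 60

def build_vocab(texts, max_features=MAX_FEATURES):
    """Map the most frequent words to integer ids; id 0 is reserved for padding."""
    counts = Counter(w for t in texts for w in re.findall(r"[a-z0-9']+", t.lower()))
    return {w: i + 1 for i, (w, _) in enumerate(counts.most_common(max_features - 1))}

def encode(text, vocab, length):
    """Fixed-length id sequence: unknown words dropped, zero-padded, truncated."""
    ids = [vocab[w] for w in re.findall(r"[a-z0-9']+", text.lower()) if w in vocab]
    return (ids + [0] * length)[:length]

def build_lstm_classifier(lstm_units=32):
    """Embedding -> LSTM -> one sigmoid unit. lstm_units is an assumed value."""
    from tensorflow import keras  # imported lazily so the helpers above run without it
    model = keras.Sequential([
        keras.layers.Embedding(input_dim=MAX_FEATURES, output_dim=EMBED_DIM),
        keras.layers.LSTM(lstm_units),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

vocab = build_vocab(["win a free prize now", "see you at lunch", "free prize inside"])
print(encode("free prize today", vocab, length=5))
```

Training would then be a call such as `model.fit(x_train, y_train, epochs=EPOCHS, batch_size=BATCH_SIZE, validation_split=0.2)`, where `x_train` holds the encoded sequences and `y_train` the binary labels.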

4. Framework Adaptability and Performance Evaluation

We stress-tested and validated our multilayered adaptive framework for its effectiveness in detecting phishing sites that contain text, videos, or images, or any combination of the three. It is worth remembering that existing AI or machine learning-based phishing detection techniques and frameworks can only detect text-based [2,28,32,33] or URL-based [27,32,34,35] phishing sites, which leaves them vulnerable to:
  • phishing sites with a friendly URL
  • phishing sites on a hacked legitimate domain name server (DNS)
  • image-only phishing sites
  • video-only phishing sites
  • a combination of any of these, in any order
To validate both the effectiveness and the adaptability of our proposed framework in overcoming these limitations, we created 4 categories of phishing sites and uploaded them to a secure server with a compromised DNS on a friendly URL: a text-only phishing site, an image-only phishing site, a video-only phishing site, and a phishing site combining all of these features. We used friendly URLs and hacked legitimate DNS on the text-only phishing site so that it would evade detection at Layer 1 and only be detected at the 4th layer, and we created phishing sites containing image-only content, text-only content, and a combination of both to test the adaptability of the framework to different phishing scenarios.
In each scenario, we obtained 100% accuracy, as the framework successfully adapted to the scenario and detected the site accordingly, thereby overcoming the limitations associated with current approaches to phishing detection.

5. Conclusions

In this research, we proposed a multi-layer adaptive framework that uses the computer vision capability of Optical Character Recognition (OCR) to read images on live phishing sites into text and synthesizes speech from uploaded deep-fake videos, while using Random Forest and an LSTM network, along with web-scraped text, at various prediction layers of the framework to significantly improve the detection rate and performance of AI-based models for phishing detection. Existing AI-based phishing detection techniques, frameworks, and approaches can only detect text-based [2,28,32,33] or URL-based [27,32,34,35] phishing sites, which leaves them unable to detect image-based or video-based phishing sites. The proposed framework overcomes these limitations, significantly improves phishing attack detection, and successfully detects complex phishing webpages containing multi-dimensional deep-fake videos, images, and text.

6. Limitation and Future Research direction

We used the Mendeley and Kaggle datasets, which are URL-based and text-based respectively. Image-based and video-based phishing datasets are not publicly available because these are newly adopted forms of phishing website designed to evade detection. We therefore simulated them and generated the data internally for this research, especially the deep-fake videos. Publicly available image-based or video-based phishing datasets would significantly help the research community in this area.
The other research direction we point to is the computational cost of training. The proposed framework uses Random Forest and an LSTM network at Layer 1 and Layer 4 respectively. Random Forest builds multiple trees and combines the individual tree decisions for a more accurate prediction, which increases computation time; we had to fix the random state at zero while varying the maximum depth to find the optimal hyper-parameters. Apart from the training time, there is also the server response time, as the framework scrapes the content in the background to protect the server against potential drive-by attacks. Reducing the server response time and computation time to a fraction of a second is therefore an area open to future research in this domain.
It is also worth noting that our artifacts, which consist of source code, dataset, images, videos, and audio files, have been uploaded to a public GitHub repository for reproducibility. The artifacts can be found on GitHub at: Deep Learning-Based Speech and Vision Synthesis to Improve Phishing Attack Detection through a Multi-layer Adaptive Framework, and also on the Code Ocean computational research platform, with the exception of the internally generated deep-fake video and audio data files, which are withheld for privacy.

References

  1. Sarker, O.; Jayatilaka, A.; Haggag, S.; Liu, C.; Babar, M.A. A Multi-vocal Literature Review on challenges and critical success factors of phishing education, training and awareness. Journal of Systems and Software 2024, 208, 111899.
  2. Jain, A.K.; Gupta, B. A survey of phishing attack techniques, defence mechanisms and open research challenges. Enterprise Information Systems 2022, 16, 527–565.
  3. Ige, T.; Marfo, W.; Tonkinson, J.; Adewale, S.; Matti, B.H. Adversarial Sampling for Fairness Testing in Deep Neural Network. arXiv 2023, arXiv:2303.02874.
  4. Uçar, M.K.; Nour, M.; Sindi, H.; Polat, K.; others. The effect of training and testing process on machine learning in biomedical datasets. Mathematical Problems in Engineering 2020, 2020.
  5. Alkhalil, Z.; Hewage, C.; Nawaf, L.; Khan, I. Phishing attacks: A recent comprehensive study and a new anatomy. Frontiers in Computer Science 2021, 3, 563060.
  6. Abdulrahman, L.M.; Ahmed, S.H.; Rashid, Z.N.; Jghef, Y.S.; Ghazi, T.M.; Jader, U.H. Web Phishing Detection Using Web Crawling, Cloud Infrastructure and Deep Learning Framework. Journal of Applied Science and Technology Trends 2023, 4, 54–71.
  7. Aljofey, A.; Jiang, Q.; Rasool, A.; Chen, H.; Liu, W.; Qu, Q.; Wang, Y. An effective detection approach for phishing websites using URL and HTML features. Scientific Reports 2022, 12, 8842.
  8. Anitha, J.; Kalaiarasu, M. A new hybrid deep learning-based phishing detection system using MCS-DNN classifier. Neural Computing and Applications 2022, 1–16.
  9. Ghaleb Al-Mekhlafi, Z.; Abdulkarem Mohammed, B.; Al-Sarem, M.; Saeed, F.; Al-Hadhrami, T.; Alshammari, M.T.; Alreshidi, A.; Sarheed Alshammari, T. Phishing websites detection by using optimized stacking ensemble model. Computer Systems Science and Engineering 2022, 41, 109–125.
  10. Jain, A.K.; Gupta, B.B. A machine learning based approach for phishing detection using hyperlinks information. Journal of Ambient Intelligence and Humanized Computing 2019, 10, 2015–2028.
  11. Jain, A.K.; Gupta, B.B. A novel approach to protect against phishing attacks at client side using auto-updated white-list. EURASIP Journal on Information Security 2016, 2016, 1–11.
  12. Okomayin, A.; Ige, T.; Kolade, A. Data Mining in the Context of Legality, Privacy, and Ethics. 2023.
  13. Zieni, R.; Massari, L.; Calzarossa, M.C. Phishing or not phishing? A survey on the detection of phishing websites. IEEE Access 2023, 11, 18499–18519.
  14. Muneer, A.; Ali, R.F.; Al-Sharai, A.A.; Fati, S.M. A Survey on Phishing Emails Detection Techniques. 2021 International Conference on Innovative Computing (ICIC); IEEE, 2021; pp. 1–6.
  15. Wang, Y. A survey of phishing detection: from an intelligent countermeasures view. 2022 IEEE Conference on Telecommunications, Optics and Computer Science (TOCS); IEEE, 2022; pp. 761–769.
  16. Adewale, S.; Ige, T.; Matti, B.H. Encoder-Decoder Based Long Short-Term Memory (LSTM) Model for Video Captioning. arXiv 2023, arXiv:2401.02052.
  17. Yaswanth, P.; Nagaraju, V. Prediction of Phishing Sites in Network using Naive Bayes compared over Random Forest with improved Accuracy. 2023 Eighth International Conference on Science Technology Engineering and Mathematics (ICONSTEM); IEEE, 2023; pp. 1–5.
  18. Ige, T.; Kiekintveld, C. Performance Comparison and Implementation of Bayesian Variants for Network Intrusion Detection. arXiv 2023, arXiv:2308.11834.
  19. Karim, A.; Shahroz, M.; Mustofa, K.; Belhaouari, S.B.; Joga, S.R.K. Phishing Detection System Through Hybrid Machine Learning Based on URL. IEEE Access 2023, 11, 36805–36822.
  20. Ishwarya, R.; Muthumani, S.; PG, S.S.K.; Suriya, S. Seperation of Phishing Emails Using Probabilistic Classifiers. 2023 9th International Conference on Advanced Computing and Communication Systems (ICACCS); IEEE, 2023; Vol. 1, pp. 1676–1679.
  21. Omari, K. Comparative Study of Machine Learning Algorithms for Phishing Website Detection. International Journal of Advanced Computer Science and Applications 2023, 14.
  22. Magdacy Jerjes, A.Z.A.; Dawod, A.Y.; Abdulqader, M.F. Detect Malicious Web Pages Using Naive Bayesian Algorithm to Detect Cyber Threats. Wireless Personal Communications 2023, 1–13.
  23. Nishitha, U.; Kandimalla, R.; Vardhan, R.M.M.; Kumaran, U. Phishing Detection Using Machine Learning Techniques. 2023 3rd Asian Conference on Innovation in Technology (ASIANCON); IEEE, 2023; pp. 1–6.
  24. Mustafa, T.; Karabatak, M. Feature Selection for Phishing Website by Using Naive Bayes Classifier. 2023 11th International Symposium on Digital Forensics and Security (ISDFS); IEEE, 2023; pp. 1–4.
  25. Jaya, T.; Kanyaharini, R.; Navaneesh, B. Appropriate Detection of HAM and Spam Emails Using Machine Learning Algorithm. 2023 International Conference on Advances in Computing, Communication and Applied Informatics (ACCAI); IEEE, 2023; pp. 1–5.
  26. Alazaidah, R.; Al-Shaikh, A.; AL-Mousa, M.; Khafajah, H.; Samara, G.; Alzyoud, M.; Al-Shanableh, N.; Almatarneh, S. Website Phishing Detection Using Machine Learning Techniques. Journal of Statistics Applications & Probability 2024, 13, 119–129.
  27. Shukla, S.; Misra, M.; Varshney, G. HTTP header based phishing attack detection using machine learning. Transactions on Emerging Telecommunications Technologies 2024, e4872.
  28. van Geest, R.; Cascavilla, G.; Hulstijn, J.; Zannone, N. The applicability of a hybrid framework for automated phishing detection. Computers & Security 2024, 103736.
  29. Olukoya, D. Heterogeneous Ensemble Feature Selection and Multilevel Ensemble Approach to Machine Learning Phishing Attack Detection.
  30. Shaukat, M.W.; Amin, R.; Muslam, M.M.A.; Alshehri, A.H.; Xie, J. A hybrid approach for alluring ads phishing attack detection using machine learning. Sensors 2023, 23, 8070.
  31. Joshi, K.; Bhatt, C.; Shah, K.; Parmar, D.; Corchado, J.M.; Bruno, A.; Mazzeo, P.L. Machine-learning techniques for predicting phishing attacks in blockchain networks: A comparative study. Algorithms 2023, 16, 366.
  32. Alsubaei, F.S.; Almazroi, A.A.; Ayub, N. Enhancing Phishing Detection: A Novel Hybrid Deep Learning Framework for Cybercrime Forensics. IEEE Access 2024.
  33. Jayaraj, R.; Pushpalatha, A.; Sangeetha, K.; Kamaleshwar, T.; Shree, S.U.; Damodaran, D. Intrusion detection based on phishing detection with machine learning. Measurement: Sensors 2024, 31, 101003.
  34. Zhu, E.; Cheng, K.; Zhang, Z.; Wang, H. PDHF: Effective phishing detection model combining optimal artificial and automatic deep features. Computers & Security 2024, 136, 103561.
  35. Adebowale, M.A.; Lwin, K.T.; Hossain, M.A. Intelligent phishing detection scheme using deep learning algorithms. Journal of Enterprise Information Management 2023, 36, 747–766.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.