Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

Examine Website Defacement Dataset by Exploiting Some Classifiers’ Capabilities

Version 1 : Received: 26 October 2023 / Approved: 26 October 2023 / Online: 27 October 2023 (05:27:14 CEST)

How to cite: Zayid, E.I.M.; Isah, I.; Adam, Y.A.; Farah, N.A.A.; Alshehri, O.A. Examine Website Defacement Dataset by Exploiting Some Classifiers’ Capabilities. Preprints 2023, 2023101743. https://doi.org/10.20944/preprints202310.1743.v1

Abstract

Website defacement is the illegal electronic alteration of a website. In this paper, the capabilities of robust machine learning classifiers were exploited to select the best input feature set for evaluating a website’s defacement risk. Zone-H, a private organization, provided a defacement mining dataset. A sample of 93,644 data points was pre-processed and used for modelling. With multidimensional features as input, the reason and hackmode variables were chosen as outputs. Many machine learning models were examined; however, decision tree (DT), k-nearest neighbour (k-NN), and random forest (RF) were the most powerful algorithms for predicting the targets. The input variables 'domain', 'system', 'web_server', 'redefacement', 'type', 'def_grade', and 'reason/hackmode' were tested and used to shape the final model. Using the cross-validation (CV) technique, the model’s key performance factors were calculated and reported. After calculating the average scores over the hyperparameter settings (i.e., max-depth, min-sample-leaf, weight, max-features, and CV), both targets were evaluated, and the learning algorithms were ranked in the order RF, DT, and k-NN. The reason and hackmode variables were thoroughly analysed. The average accuracy scores for the reason and hackmode targets were 0.85 and 0.585, respectively. The results show a significant advance in modelling and optimizing website defacement risk. The study also addresses main cybersecurity concerns, in particular website defacement.
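The cross-validation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses only the Python standard library, a simple k-NN classifier with Hamming distance over categorical features, and a small synthetic dataset standing in for the Zone-H data. The feature values, the labelling rule, and all function names (`hamming`, `knn_predict`, `cross_val_accuracy`, `make_synthetic`) are assumptions made for illustration.

```python
import random
from collections import Counter

def hamming(a, b):
    """Number of positions where two categorical feature tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train_X, train_y, row, k=5):
    """Majority label among the k training rows closest to `row`."""
    order = sorted(range(len(train_X)), key=lambda i: hamming(train_X[i], row))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]

def cross_val_accuracy(X, y, n_folds=5, k=5, seed=0):
    """Mean k-NN accuracy over n_folds folds of a shuffled index list."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)

def make_synthetic(n=300, seed=1):
    """Hypothetical stand-in data; NOT the Zone-H dataset."""
    rng = random.Random(seed)
    systems = ["Linux", "Win 2008", "FreeBSD"]
    servers = ["Apache", "IIS", "nginx"]
    X, y = [], []
    for _ in range(n):
        system = rng.choice(systems)
        server = rng.choice(servers)
        redefacement = rng.choice([0, 1])
        # Invented rule so the label depends on a feature, with about 10% noise.
        label = "SQL injection" if server == "IIS" else "file inclusion"
        if rng.random() < 0.1:
            label = "other"
        X.append((system, server, redefacement))
        y.append(label)
    return X, y

X, y = make_synthetic()
mean_accuracy = cross_val_accuracy(X, y, n_folds=5, k=5)
print(f"mean 5-fold accuracy: {mean_accuracy:.3f}")
```

In practice a library implementation (for example, scikit-learn's decision tree, random forest, and k-NN estimators with its cross-validation utilities) would likely be used to compare the three algorithms; the sketch only shows how per-fold accuracies are averaged into a single score.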

Keywords

prediction capabilities for website defacement; website defacement assessment; classification metrics for website hacktivism; website defacement risks

Subject

Computer Science and Mathematics, Computer Science
