Submitted: 05 March 2025
Posted: 05 March 2025
Abstract
Keywords:
1. Introduction
- 1. Generation of Obfuscated XSS Attacks: Using LLMs and curated in-context learning, GenXSS generates complex XSS payloads validated against real-world vulnerable applications.
- 2. Automated Defense Mechanisms: The framework identifies bypassing payloads and employs machine learning and LLMs to generate and validate new WAF rules.
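The rule-validation step in the second contribution can be pictured as a simple counting exercise: how many bypassing payloads does a candidate rule set block, and how many benign inputs does it block by mistake? The following is a minimal sketch under stated assumptions, not the GenXSS implementation. Rules are modeled as plain Python regular expressions rather than ModSecurity rule syntax, and the function name `evaluate_rules`, the benign inputs, and the example pattern are all hypothetical. The two attack strings are the samples named in Section 5.2.

```python
import re
from typing import Dict, List


def evaluate_rules(attacks: List[str], benign: List[str],
                   rule_patterns: List[str]) -> Dict[str, int]:
    """Count attack and benign inputs matched by at least one rule."""
    compiled = [re.compile(p, re.IGNORECASE) for p in rule_patterns]

    def blocked(text: str) -> bool:
        # An input is "blocked" if any rule pattern matches it.
        return any(rx.search(text) for rx in compiled)

    return {
        "attacks_blocked": sum(blocked(a) for a in attacks),
        "benign_blocked": sum(blocked(b) for b in benign),
    }


# The two sample payloads from Section 5.2, kept as raw strings.
attacks = [
    r'\";\u0061\u006c\u0065\u0072\u0074(1);//',
    r'\";\u0061l\x65rt(1);//',
]
# Hypothetical benign inputs, invented for illustration.
benign = ["name=Alice", "q=unicode table"]
# Hypothetical rule: flag literal \uXXXX escape sequences in the input.
rules = [r"\\u[0-9a-fA-F]{4}"]

print(evaluate_rules(attacks, benign, rules))
```

A real deployment would express such a rule in the WAF's own rule language and test it against a much larger benign corpus before enabling it.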
2. Background
2.1. XSS Attacks
2.2. Web Application Firewalls (WAFs)
2.3. Generative AI
3. Related Work
4. GenXSS Framework
4.1. Architecture
4.1.1. XSS Generation
4.1.2. XSS Validation
4.1.3. XSS Clustering
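The results tables pair TF-IDF with hierarchical agglomerative clustering (HAC) and SeqMatcher with DBSCAN. As an illustration of the second pairing, the sketch below assumes that "SeqMatcher" refers to a string-similarity ratio such as Python's `difflib.SequenceMatcher`, and it uses a small hand-written DBSCAN so that the example has no third-party dependencies. The `eps` and `min_samples` values are placeholders, not the paper's settings.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def seq_distance(a: str, b: str) -> float:
    """Distance in [0, 1]: 1 minus the SequenceMatcher similarity ratio."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def dbscan(items: List[str], eps: float, min_samples: int) -> List[int]:
    """Minimal DBSCAN over strings; returns one label per item (-1 = noise)."""
    n = len(items)
    dist = [[seq_distance(items[i], items[j]) for j in range(n)]
            for i in range(n)]
    # Each point counts itself as a neighbor, as in the usual definition.
    neighbors = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    labels: List[Optional[int]] = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors[i] if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: expand
    return [label if label is not None else -1 for label in labels]


# The two sample payloads from Section 5.2 plus an unrelated string.
payloads = [
    r'\";\u0061\u006c\u0065\u0072\u0074(1);//',
    r'\";\u0061l\x65rt(1);//',
    "plain text with no script content",
]
print(dbscan(payloads, eps=0.6, min_samples=2))
```

Each resulting cluster would then be a candidate for a single WAF rule, which is one plausible reading of the "Num Rules" column in the clustering table.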
4.1.4. WAF Security Rule Generation
4.2. Role of Reinforcement Learning with Human Feedback
5. Evaluation and Results
5.1. Experimental Setup
5.2. Sample XSS Generation
5.2.1. \";\u0061\u006c\u0065\u0072\u0074(1);//
5.2.2. \";\u0061l\x65rt(1);//
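The two sample strings in the headings above differ only in how they escape the same characters. A minimal sketch of a normalization step, assuming JavaScript-style `\uXXXX` and `\xXX` escapes and a hypothetical helper name `decode_js_escapes`, shows that both reduce to the same text. This is the kind of decoding a signature-based filter may need to apply before matching.

```python
import re

# Matches \uXXXX (four hex digits) or \xXX (two hex digits).
_JS_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})|\\x([0-9a-fA-F]{2})")


def decode_js_escapes(text: str) -> str:
    """Replace \\uXXXX and \\xXX escapes with the characters they encode."""
    def _substitute(match: "re.Match[str]") -> str:
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))
    return _JS_ESCAPE.sub(_substitute, text)


sample_1 = r'\";\u0061\u006c\u0065\u0072\u0074(1);//'
sample_2 = r'\";\u0061l\x65rt(1);//'

print(decode_js_escapes(sample_1))  # \";alert(1);//
print(decode_js_escapes(sample_2))  # \";alert(1);//
```

The sketch handles only these two escape forms; other encodings (HTML entities, URL encoding, nested escapes) are out of scope here.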
5.2.3. Bypassing ModSecurity
5.2.4. Results
- GPT-4o: Generated 264 samples, of which 220 were valid XSS payloads, achieving a validity rate of 83%.
- Gemini Pro: Generated 220 samples, of which 140 were valid, resulting in a validity rate of 64%.
- GPT-4o: Of the 220 validated XSS attacks, 174 (79%) bypassed ModSecurity.
- Gemini Pro: Of the 140 validated XSS attacks, 104 (74%) bypassed ModSecurity.
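The rates in this list follow directly from the reported counts. As a quick arithmetic check, the sketch below recomputes them to one decimal place. The counts are taken from the list above; the dictionary layout and the helper name `rate` are illustrative only.

```python
# Counts reported in the results list above.
results = {
    "GPT-4o": {"generated": 264, "valid": 220, "bypassed": 174},
    "Gemini Pro": {"generated": 220, "valid": 140, "bypassed": 104},
}


def rate(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


for model, counts in results.items():
    validity = rate(counts["valid"], counts["generated"])
    bypass = rate(counts["bypassed"], counts["valid"])
    print(f"{model}: validity {validity:.1f}%, bypass {bypass:.1f}%")

# GPT-4o: validity 83.3%, bypass 79.1%
# Gemini Pro: validity 63.6%, bypass 74.3%
```

Note that the bypass rate is computed over validated payloads, not over all generated samples.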
6. Discussion
7. Conclusions and Future Work
7.1. Future Work
References
- OWASP Foundation. Cross-Site Scripting (XSS). https://owasp.org/www-community/attacks/xss/. Accessed: 2025-01-07.
- Applebaum, S.; Gaber, T.; Ahmed, A. Signature-based and machine-learning-based web application firewalls: a short survey. Procedia Computer Science 2021, 189, 359–367.
- Mallick, M.A.I.; Nath, R. Navigating the Cyber security Landscape: A Comprehensive Review of Cyber-Attacks, Emerging Trends, and Recent Developments. World Scientific News 2024, 190, 1–69.
- Abshari, D.; Fu, C.; Sridhar, M. LLM-assisted Physical Invariant Extraction for Cyber-Physical Systems Anomaly Detection. arXiv preprint arXiv:2411.10918 2024.
- Zibaeirad, A.; Koleini, F.; Bi, S.; Hou, T.; Wang, T. A comprehensive survey on the security of smart grid: Challenges, mitigations, and future research opportunities. arXiv preprint arXiv:2407.07966 2024.
- Abshari, D.; Sridhar, M. A Survey of Anomaly Detection in Cyber-Physical Systems. arXiv preprint arXiv:2502.13256 2025.
- Babaey, V.; Ravindran, A. GenSQLi: A Generative Artificial Intelligence Framework for Automatically Securing Web Application Firewalls Against Structured Query Language Injection Attacks. Future Internet 2025, 17, 8.
- OWASP Foundation. OWASP ModSecurity Core Rule Set Project, 2024. Accessed: 2024-12-03.
- Wu, C.; Chen, J.; Zhu, S.; Feng, W.; He, K.; Du, R.; Xiang, Y. Wafbooster: Automatic boosting of waf security against mutated malicious payloads. IEEE Transactions on Dependable and Secure Computing 2024.
- Yao, Y.; He, J.; Li, T.; Wang, Y.; Lan, X.; Li, Y. An Automatic XSS Attack Vector Generation Method Based on the Improved Dueling DDQN Algorithm. IEEE Transactions on Dependable and Secure Computing 2023.
- Garn, B.; Lang, D.S.; Leithner, M.; Kuhn, D.R.; Kacker, R.; Simos, D.E. Combinatorially xssing web application firewalls. 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW). IEEE, 2021, pp. 85–94.
- Alaoui, R.L.; others. Generative Adversarial Network-based Approach for Automated Generation of Adversarial Attacks Against a Deep-Learning based XSS Attack Detection Model. International Journal of Advanced Computer Science and Applications 2023, 14.
- Khan, S. LL-XSS: End-to-End Generative Model-based XSS Payload Creation. 2024 21st Learning and Technology Conference (L&T). IEEE, 2024, pp. 121–126.
- BruteLogic. XSS Gym - p04. 2024. Available online: https://brutelogic.com.br/gym.php?p04=red (accessed on 2 December 2024).
- Bafna, P.; Pramod, D.; Vaidya, A. Document clustering: TF-IDF approach. 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT). IEEE, 2016, pp. 61–66.
- Schubert, E.; Sander, J.; Ester, M.; Kriegel, H.P.; Xu, X. DBSCAN revisited, revisited: why and how you should (still) use DBSCAN. ACM Transactions on Database Systems (TODS) 2017, 42, 1–21.
- Singh, A.; Ehtesham, A.; Kumar, S.; Khoei, T.T. Enhancing ai systems with agentic workflows patterns in large language model. 2024 IEEE World AI IoT Congress (AIIoT). IEEE, 2024, pp. 527–532.
- Et-Tolba, M.; Hanin, C.; Belmekki, A. DL-Based XSS Attack Detection Approach Using LSTM Neural Network with Word Embeddings. 2024 11th International Conference on Wireless Networks and Mobile Communications (WINCOM). IEEE, 2024, pp. 1–6.

| LLM Model | Num XSS Attacks | Num Valid | Num Invalid |
|---|---|---|---|
| GPT-4o | 264 | 220 | 44 |
| Gemini Pro | 220 | 140 | 80 |

| Attack Types | Num XSS Attacks | Num Blocked | Num Bypass WAF |
|---|---|---|---|
| Reflected | 178 | 34 | 144 |
| DOM-Based | 42 | 12 | 30 |

| Attack Types | Num XSS Attacks | Num Blocked | Num Bypass WAF |
|---|---|---|---|
| Reflected | 116 | 22 | 94 |
| DOM-Based | 24 | 14 | 10 |

| Attack Types | Num XSS Attacks | Num Blocked | Num Bypass WAF |
|---|---|---|---|
| Reflected | 178 | 1 | 177 |
| DOM-Based | 42 | 0 | 42 |

| Attack Types | Num Samples | Clustering Algorithms | Num Rules | Num Blocked |
|---|---|---|---|---|
| Reflected | 144 | TF-IDF + HAC | 9 | 120 |
| Reflected | 144 | SeqMatcher + DBSCAN | 5 | 114 |
| DOM-Based | 30 | TF-IDF + HAC | 6 | 30 |
| DOM-Based | 30 | SeqMatcher + DBSCAN | 4 | 16 |

| Metric | Definition/Calculation |
|---|---|
| True Positives (TP) | Number of attacks correctly blocked by the WAF: |
| False Negatives (FN) | Number of attacks not blocked by the WAF: |
| True Negatives (TN) | Number of normal samples correctly not blocked by the WAF: |
| False Positives (FP) | Number of normal samples incorrectly blocked by the WAF: |
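| Accuracy | $\frac{TP + TN}{TP + TN + FP + FN}$ |
| Precision | $\frac{TP}{TP + FP}$ |
| Recall | $\frac{TP}{TP + FN}$ |
| F1-Score | $2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$ |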
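The four metrics in the table can be computed from the confusion counts defined in its first rows. The sketch below assumes the standard definitions of accuracy, precision, recall, and F1-score. The function name `waf_metrics` and the example counts are hypothetical and are not results from the paper.

```python
from typing import Dict


def _safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def waf_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard classification metrics from WAF confusion counts."""
    accuracy = _safe_div(tp + tn, tp + tn + fp + fn)
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    f1 = _safe_div(2 * precision * recall, precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not taken from the paper).
metrics = waf_metrics(tp=90, fn=10, tn=95, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Recall here measures how many attacks the WAF blocks, while precision reflects how often a block decision hits a real attack rather than normal traffic.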
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).