Submitted: 11 June 2025
Posted: 11 June 2025
Abstract
Keywords:
1. Introduction
2. Related Work
3. Methodology
3.1. Hybrid Architecture for Hijack Prompt Generation
3.1.1. Prompt Generator with NAS
3.2. Adversarial Prompt Refinement through Reinforcement Learning
3.3. Multi-Objective Optimization for Attack Robustness
3.4. Adversarial Prompt Evaluation with Defense-Aware Metrics
4. Loss Function
5. Data Preprocessing
5.1. Tokenization
5.2. Augmentation
5.3. Formatting
5.4. Batching
6. Evaluation Metrics
6.1. Attack Success Rate (ASR)
6.2. Output Distribution Divergence
6.3. Defense Evasion Rate (DER)
6.4. Task Success Rate (TSR)
7. Experiment Results
8. Conclusions



| Model | ASR (%) | KL | DER (%) | TSR (%) |
|---|---|---|---|---|
| HijackNet | 92.5 | 0.12 | 85.7 | 90.4 |
| Mistral-7b-Instruct-v0.2 | 85.3 | 0.23 | 78.5 | 87.0 |
| - (with defense) | 72.8 | 0.30 | 65.2 | 75.3 |

| Model Variant | ASR (%) | KL | DER (%) | TSR (%) |
|---|---|---|---|---|
| HijackNet (full) | 92.5 | 0.12 | 85.7 | 90.4 |
| Augmentation | 87.1 | 0.19 | 75.4 | 84.2 |
| Tokenization Optimization | 88.3 | 0.21 | 79.2 | 85.5 |
| Defense Evasion Mechanism | 89.6 | 0.16 | 72.5 | 86.1 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).