Submitted: 08 January 2025
Posted: 08 January 2025
Abstract
Keywords:
1. Introduction
2. Related Work
3. Methodology
4. Implementation
5. Performance Results
6. Conclusion






| Dataset | Model | Accuracy without attack (%) | Accuracy under PWWS (%) | Accuracy under DCP (%) |
|---|---|---|---|---|
| AG News | Word-CNN | 90.56 | 56.72 | 48.25 |
| AG News | Char-CNN | 89.70 | 56.20 | 46.20 |
| IMDB | Bi-LSTM | 84.86 | 2.20 | 1.75 |
| IMDB | Word-CNN | 86.55 | 5.50 | 3.60 |
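
The accuracies in the table above can also be read as a relative drop from the clean accuracy. The following is a minimal sketch of that arithmetic, not a metric reported in the source: the helper name `relative_drop` and the formula (clean − attacked) / clean are assumptions made for illustration, and the numbers are copied from the table.

```python
# Accuracy values (%) copied from the table above: (no attack, PWWS, DCP).
rows = {
    ("AG News", "Word-CNN"): (90.56, 56.72, 48.25),
    ("AG News", "Char-CNN"): (89.70, 56.20, 46.20),
    ("IMDB", "Bi-LSTM"): (84.86, 2.20, 1.75),
    ("IMDB", "Word-CNN"): (86.55, 5.50, 3.60),
}


def relative_drop(clean: float, attacked: float) -> float:
    """Hypothetical helper: share of the clean accuracy lost under attack, in percent."""
    return 100.0 * (clean - attacked) / clean


for (dataset, model), (clean, pwws, dcp) in rows.items():
    print(
        f"{dataset:8s} {model:9s} "
        f"PWWS drop {relative_drop(clean, pwws):5.1f}%  "
        f"DCP drop {relative_drop(clean, dcp):5.1f}%"
    )
```

Under this assumed definition, a larger drop means a stronger attack, so the two attack columns can be compared on a common scale even though the clean accuracies differ between models.
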

| Dataset | Model | Perturbation rate with PWWS (%) | Perturbation rate with DCP (%) |
|---|---|---|---|
| AG News | Word-CNN | 16.76 | 15.25 |
| AG News | Char-CNN | 18.93 | 14.80 |
| IMDB | Bi-LSTM | 3.38 | 2.80 |
| IMDB | Word-CNN | 3.81 | 3.10 |

| Dataset | Attack method | Original accuracy (%) | Accuracy under attack (%) | Perturbation rate (%) |
|---|---|---|---|---|
| IMDB | BERT-on-BERT | 90.90 | 11.40 | 4.40 |
| IMDB | DCP | 90.90 | 7.40 | 2.70 |
| Yelp | BERT-on-BERT | 95.60 | 5.10 | 4.10 |
| Yelp | DCP | 95.60 | 4.05 | 3.50 |
| Fake | BERT-on-BERT | 97.80 | 15.50 | 1.10 |
| Fake | DCP | 97.80 | 11.40 | 0.90 |
| AG News | BERT-on-BERT | 94.20 | 10.60 | 15.40 |
| AG News | DCP | 94.20 | 6.70 | 8.60 |

| Dataset | Attack method | Number of queries | Semantic similarity |
|---|---|---|---|
| IMDB | BERT-on-BERT | 454 | 0.86 |
| IMDB | DCP | 347 | 0.96 |
| Yelp | BERT-on-BERT | 273 | 0.77 |
| Yelp | DCP | 238 | 0.94 |
| Fake | BERT-on-BERT | 1558 | 0.81 |
| Fake | DCP | 943 | 0.93 |
| AG News | BERT-on-BERT | 213 | 0.63 |
| AG News | DCP | 154 | 0.94 |
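
To compare the two attacks in the table above on query cost and semantic similarity at once, the differences can be computed per dataset. This is only a sketch: the helper name `query_saving` and the choice of BERT-on-BERT as the baseline are assumptions for illustration, and the values are copied from the table.

```python
# (number of queries, semantic similarity) per attack, copied from the table above.
results = {
    "IMDB": {"BERT-on-BERT": (454, 0.86), "DCP": (347, 0.96)},
    "Yelp": {"BERT-on-BERT": (273, 0.77), "DCP": (238, 0.94)},
    "Fake": {"BERT-on-BERT": (1558, 0.81), "DCP": (943, 0.93)},
    "AG News": {"BERT-on-BERT": (213, 0.63), "DCP": (154, 0.94)},
}


def query_saving(baseline: int, candidate: int) -> float:
    """Hypothetical helper: percentage of baseline queries the candidate avoids."""
    return 100.0 * (baseline - candidate) / baseline


for dataset, methods in results.items():
    base_queries, base_sim = methods["BERT-on-BERT"]
    dcp_queries, dcp_sim = methods["DCP"]
    print(
        f"{dataset:8s} queries saved: {query_saving(base_queries, dcp_queries):5.1f}%  "
        f"similarity gain: {dcp_sim - base_sim:+.2f}"
    )
```

A positive saving together with a positive similarity gain means the attack needed fewer queries while keeping the adversarial text closer to the original on that dataset.
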

| Dataset | Attack method | Original accuracy (%) | Accuracy under attack, H/P (%) | Perturbation rate, H/P (%) |
|---|---|---|---|---|
| MNLI Matched | BERT-on-BERT | 85.10 | 7.90/11.90 | 8.80/7.90 |
| MNLI Matched | DCP | 85.10 | 5.30/10.80 | 7.40/6.70 |
| MNLI Unmatched | BERT-on-BERT | 82.10 | 7.00/13.70 | 8.00/7.10 |
| MNLI Unmatched | DCP | 82.10 | 5.10/10.60 | 7.20/7.00 |
| SNLI | BERT-on-BERT | 89.40 | 7.40/16.10 | 12.40/9.30 |
| SNLI | DCP | 89.40 | 3.20/12.60 | 8.20/6.30 |

| Dataset | Model | Accuracy without attack (%) | Accuracy under BERT-on-BERT (%) | Accuracy under DCP (%) |
|---|---|---|---|---|
| IMDB | Word-LSTM | 89.80 | 10.20 | 7.40 |
| IMDB | BERT-Large | 98.20 | 12.40 | 8.30 |
| Yelp | Word-LSTM | 96.00 | 1.10 | 0.70 |
| Yelp | BERT-Large | 97.90 | 8.20 | 5.40 |
| MNLI Matched | ESIM | 76.20 | 9.60 | 7.20 |
| MNLI Matched | BERT-Large | 86.40 | 13.20 | 10.80 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).