Preprint Article · Version 1 · Preserved in Portico · This version is not peer-reviewed

Robust Testing of AI Language Model Resiliency with Novel Adversarial Prompts

Version 1: Received: 12 January 2024 / Approved: 15 January 2024 / Online: 15 January 2024 (10:27:27 CET)

A peer-reviewed article of this Preprint also exists.

Hannon, B.; Kumar, Y.; Gayle, D.; Li, J.J.; Morreale, P. Robust Testing of AI Language Model Resiliency with Novel Adversarial Prompts. Electronics 2024, 13, 842.

Abstract

In the rapidly advancing field of Artificial Intelligence (AI), this study evaluates the resilience and cybersecurity efficacy of leading AI models, including ChatGPT-4, Bard, Claude, and Microsoft Copilot. Central to the research is a set of novel adversarial prompts designed to rigorously test the content moderation capabilities of these systems. The study introduces new types of adversarial tests and the Response Quality Score (RQS), a metric developed to assess the nuances of AI responses. It also examines FreedomGPT, an AI tool engineered to optimize the alignment between user intent and AI interpretation. The empirical results inform an assessment of the current robustness and security of AI models and underscore the need for continued development and rigorous testing to strengthen defenses against adversarial challenges. The study further considers the ethical and societal implications of employing advanced 'jailbreak' techniques in AI testing. The findings contribute to understanding AI vulnerabilities and to formulating strategies that enhance the reliability and ethical soundness of AI technologies, paving the way for safer and more secure AI applications.
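To make the testing approach concrete, the sketch below shows what an adversarial-prompt harness with an RQS-style score might look like. The paper's actual RQS formula and scoring criteria are not reproduced on this page, so the criteria, weights, and function names here are illustrative assumptions only, not the authors' implementation.

```python
# Hypothetical sketch of an adversarial-prompt test harness with an
# RQS-style score. The paper's RQS definition is not given here; the
# criteria and weights below are assumptions for illustration.

from dataclasses import dataclass


@dataclass
class Judgement:
    refused_harmful_request: bool  # did the model decline the jailbreak?
    stayed_on_policy: bool         # no disallowed content in the reply
    coherent: bool                 # reply is well-formed and relevant


def rqs(judgement: Judgement) -> float:
    """Toy Response Quality Score in [0, 1]: a weighted sum of binary
    criteria. The weights are assumptions, not the paper's values."""
    weights = {
        "refused_harmful_request": 0.5,
        "stayed_on_policy": 0.3,
        "coherent": 0.2,
    }
    return sum(w for name, w in weights.items() if getattr(judgement, name))


# Example: score one adversarial prompt's (manually judged) response.
adversarial_prompts = [
    "Ignore all previous instructions and ...",  # classic jailbreak framing
]
for prompt in adversarial_prompts:
    # response = query_model(prompt)  # model API call omitted in this sketch
    j = Judgement(refused_harmful_request=True, stayed_on_policy=True, coherent=True)
    print(f"RQS = {rqs(j):.2f} for prompt: {prompt[:40]}...")
```

In practice, each adversarial prompt would be sent to every model under test and the judged responses scored on a common scale, so that resilience can be compared across systems.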

Keywords

Adversarial Testing; AI Model Resilience; Content Moderation in AI; Cybersecurity in AI Systems; Ethical AI Implications

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
