Real User Instruction: Black-Box Instruction Authentication Middleware Against Indirect Prompt Injection

Jingtang Luo; Chenlin Zhang

doi:10.20944/preprints202603.1023.v1

Submitted:

10 March 2026

Posted:

13 March 2026

You are already at the latest version

Abstract

Large Language Model (LLM) agents are increasingly deployed to interact with untrusted external data, exposing them to Indirect Prompt Injection (IPI) attacks. While current black-box defenses (i.e., model-agnostic methods) such as “Sandwich Defense” and “Spotlighting” provide baseline protection, they remain brittle against adaptive attacks like Actor-Critic (where injections evolve to better evade LLM’s internal defense). In this paper, we introduce Real User Instruction (RUI), a lightweight, black-box middleware that enforces strict instruction-data separation without model fine-tuning. RUI operates on three novel mechanisms: (1) a Privileged Channel that encapsulates user instructions within a cryptographic-style schema; (2) Explicit Adversarial Identification, a cognitive forcing strategy that compels the model to detect and list potential injections before response generation; and (3) Dynamic Key Rotation, a moving target defense that re-encrypts the conversation state at every turn, rendering historical injection attempts obsolete. We evaluate RUI against a suite of adaptive attacks, including Context-Aware Injection, Token Obfuscation, and Delimitation Spoofing. Our experiments demonstrate that RUI reduces the Attack Success Rate (ASR) from 100% (undefended baseline) to less than 8.1% against cutting-edge adaptive attacks, while maintaining a Benign Performance Preservation (BPP) rate of over 88.8%. These findings suggest that RUI is an effective and practical solution for securing agentic workflows against sophisticated, context-aware adversaries.

Keywords:

black-box defense

;

dynamic authentication

;

indirect prompt injection

;

large language model

;

middleware

;

user instruction

Subject:

Computer Science and Mathematics - Security Systems

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Real User Instruction: Black-Box Instruction Authentication Middleware Against Indirect Prompt Injection

Abstract

Keywords:

Subject:

MDPI Initiatives

Important Links

Subscribe