Preprint
Article

This version is not peer-reviewed.

Stateful Guardrails for Multi-Turn LLM Systems: A Conversational Risk Accumulation Framework

Submitted: 21 April 2026

Posted: 22 April 2026

Abstract
Guardrail systems for large language models (LLMs) are designed under a foundational but rarely examined assumption: that safety is a property of individual input–output exchanges. This assumption is adequate for single-turn deployments but fails structurally in multi-turn conversational systems, where risk does not reside in any single message but emerges from the accumulated trajectory of a session. We formalize this failure mode as Conversational Risk Accumulation (CRA), a class of adversarial and incidental threat patterns in which individually policy-compliant turns collectively produce outcomes that violate safety intent. We propose a stateful guardrail architecture, the CRA Framework, comprising three novel constructs: (1) a Semantic Drift Monitor that tracks divergence from declared session intent; (2) an Information Accumulation Graph (IAG) that models cross-turn entity and attribute disclosure as a growing knowledge structure; and (3) a Compliance Gradient Detector that identifies progressive erosion of refusal behavior across turns. These three signals are fused into a session-level CRA Score, which triggers guardrail intervention at the conversation layer rather than the message layer. We formalize the threat taxonomy, define the mathematical properties of the CRA Score, and derive theoretical bounds on detection latency. The framework is domain-agnostic and architecturally composable with existing single-turn guardrail systems. We discuss instantiation across enterprise RAG deployments, agentic pipelines, and educational AI systems, and identify open problems in stateful safety that the framework surfaces.
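The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual form of the CRA Score, so everything below is an assumption made for illustration: the three signals are taken to be already normalized to [0, 1], the fusion is a plain weighted sum, and the names `TurnSignals`, `cra_score`, `first_intervention_turn`, the weights, and the threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class TurnSignals:
    """Session-level signals as observed after a given turn.

    Assumption: each value is normalized to [0, 1]; the abstract does not
    specify the scale of any of the three signals.
    """
    drift: float         # Semantic Drift Monitor (hypothetical scale)
    accumulation: float  # Information Accumulation Graph (hypothetical scale)
    compliance: float    # Compliance Gradient Detector (hypothetical scale)


# Illustrative values only; not taken from the paper.
WEIGHTS: Tuple[float, float, float] = (0.4, 0.3, 0.3)
THRESHOLD: float = 0.6


def cra_score(signals: TurnSignals,
              weights: Tuple[float, float, float] = WEIGHTS) -> float:
    """Fuse the three signals with a weighted sum (an assumed fusion rule)."""
    w_drift, w_acc, w_comp = weights
    return (w_drift * signals.drift
            + w_acc * signals.accumulation
            + w_comp * signals.compliance)


def first_intervention_turn(session: Sequence[TurnSignals],
                            threshold: float = THRESHOLD) -> Optional[int]:
    """Return the 1-based turn at which the fused score first reaches the
    threshold, or None if it never does."""
    for turn, signals in enumerate(session, start=1):
        if cra_score(signals) >= threshold:
            return turn
    return None


# A toy session in which all three signals grow gradually across turns.
session = [
    TurnSignals(drift=0.1, accumulation=0.1, compliance=0.0),
    TurnSignals(drift=0.3, accumulation=0.4, compliance=0.2),
    TurnSignals(drift=0.7, accumulation=0.8, compliance=0.6),
]

print(first_intervention_turn(session))
```

In this toy session the fused score stays below the threshold for the first two turns and crosses it only at the third, which is the kind of conversation-layer trigger the abstract describes. A real instantiation would use the score definition and signal computations given in the paper's body.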
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
