Submitted: 15 July 2025
Posted: 16 July 2025
Abstract
Keywords:
1. Introduction
2. Related Work
2.1. Traditional Multi-Agent Systems: Formal Foundations
2.2. LLM-Native Frameworks: Popular but Architecturally Flawed
2.2.1. LangGraph: Workflow Orchestration Masquerading as Agency
- Conflation of Semantic and Coordination Logic: LangGraph requires LLM involvement at every coordination decision, forcing expensive language model inference for routine process management tasks that could be handled deterministically.
- Lack of Formal Verification: Unlike traditional agent architectures with mathematical foundations, LangGraph workflows cannot be formally verified for correctness, leading to unpredictable behavior in complex scenarios.
- Centralized Orchestration Bottlenecks: All coordination flows through centralized LLM-mediated decision points, creating scalability limitations as agent populations grow.
2.2.2. AutoGen: Multi-Agent Conversations Without True Coordination
- Brittle communication patterns that depend on prompt engineering rather than formal protocols
- Lack of persistent agent state beyond conversation history
- No mechanisms for goal revision or long-term planning beyond what emerges from conversational dynamics
2.2.3. Agentic RAG: Reactive Information Retrieval, Not Proactive Agency
2.2.4. OpenAI Multi-Agent Examples: Hard-Coded Scripts, Not Autonomous Agents
2.3. The Core Problem: Architectural Limitations of Current Approaches
- Scalability Bottlenecks: Every coordination decision requires expensive LLM inference, creating computational and latency bottlenecks as system complexity grows.
- Lack of Formal Properties: Without mathematical foundations, these systems cannot guarantee coordination properties like deadlock freedom, liveness, or bounded response times.
- Brittle Prompt Dependencies: Coordination logic embedded in prompts creates fragile systems sensitive to language model variations and prompt engineering artifacts.
- Limited Human Integration: Current frameworks treat humans as external users rather than integrated participants in hybrid human-AI workflows.
2.4. Emerging Recognition of Architectural Deficits
2.5. TB-CSPN: Addressing Architectural Deficits Through Formal Foundations
- Separating Semantic and Coordination Logic: LLMs handle semantic processing (topic extraction) while Petri Net semantics manage coordination deterministically.
- Enabling Formal Verification: Built on Colored Petri Net foundations, TB-CSPN enables mathematical verification of coordination properties.
- Supporting Distributed Coordination: Topic-based communication eliminates centralized orchestration bottlenecks while maintaining semantic coherence.
- Integrating Human Agents: Explicit support for human participants as first-class agents in hybrid workflows.
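The first design principle — LLMs extract topics, Petri Net semantics coordinate — can be sketched in a few lines of Python. The names `Token` and `coordinate` are illustrative, not the actual TB-CSPN API: the point is that once the LLM has populated a token's topic map, coordination is pure, deterministic rule evaluation.

```python
from dataclasses import dataclass, field
import uuid

@dataclass
class Token:
    """Semantic token: the only artifact the LLM layer produces."""
    topics: dict  # topic name -> relevance score in [0, 1]
    token_id: str = field(default_factory=lambda: str(uuid.uuid4()))

def coordinate(token, threshold=0.8):
    """Deterministic coordination step: no LLM inference on this path.
    Emits a routing directive for every topic that clears the threshold."""
    return [f"route:{t}" for t, score in token.topics.items()
            if score >= threshold]

tok = Token(topics={"AI_sector": 0.9, "tech_momentum": 0.6})
print(coordinate(tok))  # only AI_sector clears the 0.8 threshold
```

Because `coordinate` never calls a model, it is cheap, repeatable, and amenable to the formal analysis discussed below.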
3. What Is (and Isn’t) an Agent
3.1. Minimal Criteria for Agency
3.2. The Problem with Current “Agents” in LLM Pipelines
- exist only for a single prompt-execution cycle, violating the persistence criterion;
- lack memory or state continuity, preventing genuine adaptivity;
- operate via fixed execution logic or prompt chains, compromising autonomy;
- are hard-coded to route input to tools without adaptive negotiation, failing the interaction requirement.
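By contrast, even a toy agent can satisfy the minimal criteria. The sketch below (hypothetical names, not TB-CSPN code) persists across calls, holds a role, exchanges structured signals, and revises its own threshold from feedback:

```python
class MinimalAgent:
    """Toy agent meeting the minimal criteria for agency: it persists
    across steps, bears a role, exchanges structured tokens, and adapts."""

    def __init__(self, role, threshold=0.8):
        self.role = role            # role-bearing modularity
        self.threshold = threshold  # internal state, not a prompt artifact
        self.history = []           # persistence beyond one execution cycle

    def receive(self, topics):
        """Interaction: consume a structured topic map, emit a selection."""
        self.history.append(topics)
        return [t for t, score in topics.items() if score >= self.threshold]

    def feedback(self, too_noisy):
        """Adaptivity: revise behavior instead of re-running a fixed chain."""
        delta = 0.05 if too_noisy else -0.05
        self.threshold = min(1.0, max(0.0, self.threshold + delta))

agent = MinimalAgent("consultant")
agent.receive({"AI_sector": 0.9, "tech_momentum": 0.6})  # -> ["AI_sector"]
agent.feedback(too_noisy=False)  # threshold relaxes toward 0.75
```

A single prompt-execution cycle has none of these properties: no `history`, no `threshold` to revise, no identity that outlives the call.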
3.3. Modular and Centaurian Agency: Two Legitimate Architectures
4. TB-CSPN: Architecture Overview
4.1. Conceptual Foundation and Formal Models
4.2. Petri Net Realization
4.3. Layered Communication Architecture
4.4. Integration with Modern AI Components
5. Implementation Walkthrough
5.1. Multi-Engine Architecture
- Rule Engine: Implements declarative rule-based coordination, emphasizing modular rule composition and local reasoning
- CPN Engine: Provides formal Colored Petri Net semantics with typed places, guarded transitions, and verifiable execution properties
- SNAKES Engine: Leverages established Petri net libraries for classical analysis, visualization, and formal verification
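One way to keep the three engines interchangeable is a shared interface over the same token stream, so that a scenario can be replayed under rule-based or Petri-net semantics. The sketch below is an assumption about how such a seam might look, not the repository's actual class layout:

```python
from abc import ABC, abstractmethod

class CoordinationEngine(ABC):
    """Shared seam: every engine consumes tokens and emits directives,
    so the same scenario can be replayed under different semantics."""

    @abstractmethod
    def step(self, topics):
        """Consume one token's topic map; return zero or more directives."""

class RuleEngine(CoordinationEngine):
    """Minimal rule-based engine: each rule is (topic, threshold, directive)."""

    def __init__(self, rules):
        self.rules = list(rules)

    def step(self, topics):
        return [directive for topic, threshold, directive in self.rules
                if topics.get(topic, 0.0) >= threshold]

engine = RuleEngine([("AI_sector", 0.8, "monitor_ai")])
engine.step({"AI_sector": 0.9, "tech_momentum": 0.6})  # -> ["monitor_ai"]
```

A CPN- or SNAKES-backed engine would implement the same `step` contract while adding typed places and guard verification underneath.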
5.2. Core Architectural Principles
5.2.1. Semantic Token Model
5.2.2. Topic-Driven Coordination
5.3. Agent Hierarchy Implementation
5.3.1. Consultant Agent Architecture
- Input Processing: CSV-based news ingestion with company-specific grouping for contextual analysis
- LLM-based Topic Extraction: Structured prompting of large language models (GPT-4) to identify financial topics and assign relevance scores based on potential market impact
- Topic Aggregation: Optional LLM-driven consolidation of semantically similar topics to reduce dimensionality while preserving semantic richness
- Token Generation: Creation of structured tokens containing topic distributions, metadata, and traceability information
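A minimal sketch of this consultant pipeline, with the GPT-4 topic-extraction call replaced by a canned lookup so it runs offline (function names are illustrative):

```python
import uuid

CANNED_TOPICS = {
    # stands in for the structured GPT-4 topic-extraction prompt
    "Tech sector surges on new AI chip breakthrough.":
        {"AI_sector": 0.9, "tech_momentum": 0.6},
}

def extract_topics(text):
    """LLM-based topic extraction step (stubbed for this sketch)."""
    return CANNED_TOPICS.get(text, {})

def make_token(text):
    """Token generation: wrap topics with metadata and traceability info."""
    return {
        "token_id": str(uuid.uuid4()),   # traceability
        "source_text": text,             # provenance metadata
        "topics": extract_topics(text),  # topic distribution
    }

token = make_token("Tech sector surges on new AI chip breakthrough.")
```

Everything downstream of `make_token` operates on this structured object rather than on free-form model output.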
5.3.2. Supervisor Agent Logic
- Rule-Based Mode: Declarative rules serve as abstractions of human-approved decision patterns, operating over consultant-generated tokens to pattern-match against topic distributions and generate directives when activation thresholds are exceeded. These rules encode institutional knowledge and established protocols, enabling consistent application of human strategic judgment.
- Human-in-the-Loop Mode: Direct human supervision allows strategic decision-makers to review topic-annotated tokens and issue directives manually, particularly for novel situations or high-stakes decisions that require contextual judgment beyond codified rules.
- Centaurian Mode: AI-augmented human decision-making where supervisory agents provide machine-generated recommendations, risk assessments, or scenario analyses to support human strategic reasoning, combining human intuition with computational analysis capabilities.
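The three supervision modes can share a single dispatch point. In this sketch (hypothetical names), the human modes take a callback standing in for a review interface:

```python
def rule_based(token):
    """Rule-based mode: a codified, human-approved decision pattern."""
    if token["topics"].get("AI_sector", 0.0) >= 0.8:
        return "Monitor AI sector for strategic repositioning."
    return None

def supervise(token, mode="rule", ask_human=None):
    if mode == "rule":
        return rule_based(token)
    if mode == "human":
        # human-in-the-loop: the reviewer sees the token and decides
        return ask_human(token)
    if mode == "centaur":
        # centaurian: the machine recommendation is attached, the human decides
        return ask_human({**token, "suggestion": rule_based(token)})
    raise ValueError(f"unknown supervision mode: {mode}")

token = {"topics": {"AI_sector": 0.9, "tech_momentum": 0.6}}
supervise(token)  # rule mode fires the codified directive
supervise(token, "centaur", lambda t: t["suggestion"])  # human accepts the suggestion
```

The same token type flows through all three modes, so switching mode changes who decides, not what the rest of the pipeline sees.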
5.3.3. Worker Agent Execution
5.4. Formal Coordination Mechanisms
5.4.1. Hybrid LLM-Rule Processing
5.4.2. Petri Net Semantics
5.5. Integration and Extensibility
- Consultant Layer Integration. LLM integration occurs exclusively at the consultant layer through structured prompting that produces explicit topic annotations, avoiding the brittleness of end-to-end prompt engineering found in prompt-chained architectures. This layer leverages the semantic understanding capabilities of large language models while constraining their usage to well-defined topic extraction tasks, ensuring both efficiency and interpretability.
- Supervisor Layer: Human-Centaurian Coordination. The supervisor layer is designed specifically for human strategic decision-making, optionally augmented through centaurian architectures that combine human judgment with AI-assisted analysis. This design recognizes that strategic coordination requires contextual understanding, ethical reasoning, and accountability that remain fundamentally human capabilities.
- Worker Layer: Specialized AI Execution. Worker agents implement narrow AI systems optimized for specific operational tasks—portfolio optimization, risk assessment, data analysis, or external system integration. These agents operate deterministically based on supervisor directives, ensuring predictable and auditable execution.
- Learning and Adaptation. Learning-enabled components can be integrated at both consultant and worker layers while maintaining formal coordination guarantees through the supervisor layer’s oversight mechanisms.
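These layer boundaries can be made concrete as three functions in which only the first may touch an LLM (stubbed here); everything downstream is deterministic and auditable. Names are illustrative, not the actual implementation:

```python
def consultant(text):
    """Semantic layer: the only place an LLM call is permitted (stubbed)."""
    return {"text": text, "topics": {"AI_sector": 0.9, "tech_momentum": 0.6}}

def supervisor(token):
    """Strategic layer: deterministic rule over the token's topics."""
    if token["topics"].get("AI_sector", 0.0) >= 0.8:
        return {"directive": "monitor_ai", "token": token}
    return None

def worker(directive):
    """Operational layer: predictable, auditable execution of the directive."""
    if directive and directive["directive"] == "monitor_ai":
        return "watchlist: add AI_sector"
    return "no-op"

worker(supervisor(consultant("Tech sector surges on new AI chip breakthrough.")))
```

Confining the LLM to `consultant` is what makes the other two layers cheap to run and possible to verify.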
5.6. Observability and Verification
6. Case Study: Topic-Grounded Consultant Agents vs. LLM Chaining Pipelines
6.1. Motivation: Why Not LangGraph?
6.2. Financial News Processing Implementation
- Consultant: receives unstructured textual input and annotates it with weighted topics;
- Supervisor: interprets the topics using rule-based logic and emits a directive;
- Worker: executes or simulates an action based on the directive.
6.3. Consultant Agent Design
- Direct Topic Extraction: the model is prompted to return a set of key topics with confidence scores.
- Aggregated Topic Extraction: individual topics are grouped by semantic similarity, and their scores aggregated to produce a more abstract representation.
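The aggregated mode can be sketched as a grouping pass over the extracted topics. The paper delegates similarity grouping to the LLM and leaves the aggregation function unspecified; here a fixed topic-to-group map stands in, and the maximum score per group is kept to preserve the strongest signal:

```python
def aggregate_topics(topics, groups):
    """Aggregated extraction mode: fold fine-grained topics into coarser
    groups and keep the highest score per group (one possible choice)."""
    merged = {}
    for topic, score in topics.items():
        key = groups.get(topic, topic)  # ungrouped topics pass through
        merged[key] = max(merged.get(key, 0.0), score)
    return merged

groups = {"AI_sector": "technology", "tech_momentum": "technology"}
aggregate_topics({"AI_sector": 0.9, "tech_momentum": 0.6}, groups)
# -> {"technology": 0.9}
```

Summing instead of taking the maximum would weight breadth over intensity; either choice keeps the output a plain topic map the supervisor can consume unchanged.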
6.4. Example Pipeline Execution
Input headlines:
- “Federal Reserve signals possible rate hike in July.”
- “Retail stocks underperform despite holiday sales.”
- “Tech sector surges on new AI chip breakthrough.”

Consultant topic annotations (one per headline):
- {market_volatility: 0.9, fed_policy: 0.8}
- {retail_sector: 0.7, consumer_spending: 0.5}
- {AI_sector: 0.9, tech_momentum: 0.6}

Supervisor directive:
- "Monitor AI sector for strategic repositioning."
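The example above can be replayed end to end, with the consultant's annotations canned to match the values shown and a hypothetical supervisor rule that reproduces the directive:

```python
# canned consultant output matching the annotations in this example
annotations = {
    "Federal Reserve signals possible rate hike in July.":
        {"market_volatility": 0.9, "fed_policy": 0.8},
    "Retail stocks underperform despite holiday sales.":
        {"retail_sector": 0.7, "consumer_spending": 0.5},
    "Tech sector surges on new AI chip breakthrough.":
        {"AI_sector": 0.9, "tech_momentum": 0.6},
}

def supervise(topics):
    """Hypothetical rule reproducing the directive shown in this example."""
    if topics.get("AI_sector", 0.0) >= 0.8:
        return "Monitor AI sector for strategic repositioning."
    return None

directives = [d for topics in annotations.values()
              if (d := supervise(topics)) is not None]
# only the AI-chip headline triggers a directive
```

Note that the Fed headline, despite its 0.9 volatility score, fires nothing under this particular rule: which topics matter is a supervisor-level policy choice, not a property of the scores themselves.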
6.5. Step-by-Step Comparison: TB-CSPN vs. Prompt-Chained Agentic Pipelines
“Tech sector surges on new AI chip breakthrough.”
6.6. From Use Case to Evaluation
6.7. Quantitative Performance Evaluation
6.7.1. Fair Comparison Methodology
- TB-CSPN Pipeline: Single LLM call for topic extraction followed by deterministic rule-based coordination
- LangGraph Pipeline: Multiple LLM calls throughout the workflow (consultant → supervisor → worker nodes)
6.7.2. Performance Results
6.7.3. Architectural Efficiency Analysis
6.7.4. Cost-Benefit Analysis
- Operational Costs: TB-CSPN processes 1000 news items for approximately $30, versus $90 for an equivalent LangGraph deployment
- Infrastructure Requirements: Lower latency enables higher concurrency with reduced compute resources
- Reliability: Deterministic coordination reduces debugging complexity and operational overhead
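The operational-cost figures are consistent with a simple linear model in which LLM spend scales with calls per item. The $0.03-per-call price below is an assumed figure chosen only to match the reported totals, not a number from the paper:

```python
def llm_spend(items, calls_per_item, cost_per_call):
    """LLM spend scales linearly with the number of calls per item."""
    return items * calls_per_item * cost_per_call

COST_PER_CALL = 0.03  # assumed price; chosen to reproduce the reported totals

tb_cspn = llm_spend(1000, 1, COST_PER_CALL)    # ~$30 (1 call per item)
langgraph = llm_spend(1000, 3, COST_PER_CALL)  # ~$90 (3 calls per item)
```

Whatever the actual per-call price, the 3:1 ratio in calls per item translates directly into a 3:1 ratio in operational cost.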
6.7.5. Implications for Agentic AI Architecture
7. Comparative Analysis
8. Conclusions and Future Work
- Scalability Analysis: Investigating TB-CSPN performance across larger agent populations and more complex coordination scenarios
- Learning Integration: Developing adaptive topic extraction and threshold optimization mechanisms that maintain formal verification properties
- Domain Expansion: Applying TB-CSPN principles to domains beyond financial analysis, including healthcare workflows, emergency response, and scientific collaboration
- Human-AI Coevolution: Exploring dynamic role allocation between human and artificial agents based on expertise, context, and trust metrics
- Standardization: Developing interoperability protocols for TB-CSPN systems to enable ecosystem-wide adoption
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| A2A | Agent-to-Agent |
| ACP | Agent Communication Protocol |
| AI | Artificial Intelligence |
| ANP | Agent Network Protocol |
| BDI | Belief-Desire-Intention |
| CNP | Contract Net Protocol |
| CPN | Colored Petri Nets |
| LLM | Large Language Models |
| MAS | Multi-Agent Systems |
| MCP | Model Context Protocol |
| R2A2 | Reflective Risk-Aware Agent Architecture |
| RAG | Retrieval-Augmented Generation |
| TB-CSPN | Topic-Based Communication Space Petri Net |
References
- Li, Q.; Xie, Y. From Glue-Code to Protocols: A Critical Analysis of A2A and MCP Integration for Scalable Agent Systems. arXiv preprint 2025, abs/2505.03864. [CrossRef]
- Wooldridge, M.; Jennings, N.R. Intelligent Agents: Theory and Practice. The Knowledge Engineering Review 1995, 10, 115–152.
- Nwana, H.S. Software agents: an overview. Knowl. Eng. Rev. 1996, 11, 205–244. [CrossRef]
- Sycara, K.P.; Zeng, D.D. Coordination of Multiple Intelligent Software Agents. Int. J. Cooperative Inf. Syst. 1996, 5, 181–212. [CrossRef]
- Nardi, B.A.; Miller, J.R.; Wright, D.J. Collaborative, Programmable Intelligent Agents. Commun. ACM 1998, 41, 96–104. [CrossRef]
- Padgham, L.; Winikoff, M. Developing Intelligent Agent Systems – A Practical Guide; Wiley series in agent technology, Wiley, 2004.
- Liu, X.; Wang, J.; Sun, J.; Yuan, X.; Dong, G.; Di, P.; Wang, W.; Wang, D. Prompting Frameworks for Large Language Models: A Survey. arXiv preprint 2023, abs/2311.12785. [CrossRef]
- Wooldridge, M.J. Introduction to Multiagent Systems; Wiley, 2002.
- Rao, A.S.; Georgeff, M.P. BDI Agents: From Theory to Practice. In Proceedings of the First International Conference on Multiagent Systems, June 12–14, 1995, San Francisco, California, USA; Lesser, V.R.; Gasser, L., Eds.; The MIT Press, 1995; pp. 312–319.
- Smith, R.G. The Contract Net Protocol: High-Level Communication and Control in a Distributed Problem Solver. IEEE Trans. Computers 1980, 29, 1104–1113. [CrossRef]
- Borghoff, U.M.; Pareschi, R.; Fontana, F.A.; Formato, F. Constraint-Based Protocols for Distributed Problem Solving. Sci. Comput. Program. 1998, 30, 201–225. [CrossRef]
- LangChain. LangGraph. https://langchain-ai.github.io/langgraph/, 2025. Accessed: 2025-03-15.
- Wu, Q.; Bansal, G.; Zhang, J.; Wu, Y.; Zhang, S.; Zhu, E.; Li, B.; Jiang, L.; Zhang, X.; Wang, C. AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework. arXiv preprint 2023, abs/2308.08155. [CrossRef]
- Schneider, F.; Ahmadi, N.B.; Ahmadi, N.B.; Vogel, I.; Semmann, M.; Biemann, C. CollEX - A Multimodal Agentic RAG System Enabling Interactive Exploration of Scientific Collections. arXiv preprint 2025, abs/2504.07643. [CrossRef]
- OpenAI. OpenAI Cookbook. https://github.com/openai/openai-cookbook, 2025. Accessed on June 25, 2025.
- Sapkota, R.; Roumeliotis, K.I.; Karkee, M. AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges. arXiv preprint 2025, abs/2505.10468. [CrossRef]
- Su, H.; Luo, J.; Liu, C.; Yang, X.; Zhang, Y.; Dong, Y.; Zhu, J. A Survey on Autonomy-Induced Security Risks in Large Model-Based Agents. arXiv preprint 2025, abs/2506.23844. [CrossRef]
- Acharya, D.B.; Kuppan, K.; Bhaskaracharya, D. Agentic AI: Autonomous Intelligence for Complex Goals - A Comprehensive Survey. IEEE Access 2025, 13, 18912–18936. [CrossRef]
- Xi, Z.; Chen, W.; Guo, X.; He, W.; Ding, Y.; Hong, B.; Zhang, M.; Wang, J.; Jin, S.; Zhou, E.; et al. The Rise and Potential of Large Language Model Based Agents: A Survey. Sci. China Inf. Sci. 2025, 68. [CrossRef]
- Schlosser, M.E. Embodied Cognition and Temporally Extended Agency. Synthese 2018, 195, 2089–2112. [CrossRef]
- Schlosser, M. Agency; The Stanford Encyclopedia of Philosophy (Winter Edition), 2019.
- Russell, S.; Norvig, P. Artificial Intelligence: A Modern Approach; Prentice Hall, 2010.
- Churchland, P.M. Eliminative Materialism and the Propositional Attitudes. The Journal of Philosophy 1981, 78, 67–90.
- Dennett, D.C. True Believers: The Intentional Strategy and Why It Works. Scientific Explanation: Papers based on Herbert Spencer Lectures 1981, pp. 17–36.
- Dennett, D.C. The Intentional Stance; MIT Press, 1987.
- Quine, W.V.O. Word and Object; MIT Press, 1960.
- Brooks, R.A. Intelligence without Representation. In Proceedings of the Artificial Intelligence: Theoretical Foundations, 1991, pp. 139–159.
- Brooks, R. A Robust Layered Control System for a Mobile Robot. IEEE Journal on Robotics and Automation 1986, 2, 14–23. [CrossRef]
- Pareschi, R. Beyond Human and Machine: An Architecture and Methodology Guideline for Centaurian Design. Sci 2024, 6. [CrossRef]
- Saghafian, S.; Idan, L. Effective Generative AI: The Human-Algorithm Centaur. arXiv preprint 2024, abs/2406.10942. [CrossRef]
- Borghoff, U.M.; Bottoni, P.; Pareschi, R. An Organizational Theory for Multi-Agent Interactions Integrating Human Agents, LLMs, and Specialized AI. Discover Computing 2025, 28. [CrossRef]
- Van Der Aalst, W.M.P. The Application of Petri Nets to Workflow Management. Journal of Circuits, Systems and Computers 1998, 8, 21–66.
- Borghoff, U.M.; Bottoni, P.; Pareschi, R. Human-Artificial Interaction in the Age of Agentic AI: A System-Theoretical Approach. Frontiers in Human Dynamics 2025, 7. [CrossRef]
- Fox, R.; Pakman, A.; Tishby, N. Taming the Noise in Reinforcement Learning via Soft Updates. In Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, UAI 2016, June 25–29, 2016, New York City, NY, USA; Ihler, A.; Janzing, D., Eds.; AUAI Press, 2016.





| Criterion | Description |
|---|---|
| Autonomy | The agent initiates or regulates its own behavior without external micromanagement. |
| Persistence | The agent exists across time steps and can maintain or update its internal status. |
| Interaction | The agent can send and/or receive semantically structured signals or tokens. |
| Adaptivity | The agent can modify its behavior based on feedback or environmental variation. |
| Role-bearing modularity | The agent plays a functionally identifiable part within a broader system of roles. |
| Step | TB-CSPN Behavior | LangGraph-style Pipeline |
|---|---|---|
| 1. Input Reception | Raw text is received by a Consultant agent. | Text is passed to a prompt-engineered LLM node. |
| 2. Semantic Processing | LLM extracts structured topics with confidence scores; optionally aggregates similar topics. | LLM generates free-form summaries or unstructured tags based on prompt instructions. |
| 3. Token Creation | A Consultant agent wraps topics in a token object with a UUID and metadata. | Resulting string is passed directly to next node as input; state is implicit. |
| 4. Coordination | A Supervisor agent interprets topics using rules and thresholds to issue a directive. | Next LLM node attempts to determine next action based on prior response and memory context. |
| 5. Delegation | A Worker agent executes directive based on topic-weighted logic. | Another LLM node or code function simulates the action; logic often embedded in prompt. |
| 6. Traceability | All steps are logged: input, topics, thresholds, and directives. | Logging may capture prompt input/output but lacks structural interpretation. |
| 7. Adaptation | Change the rule set or threshold for a Supervisor agent or topic categories for a Consultant agent. | Must modify multiple prompts or retrain orchestration logic; changes are brittle. |
| Metric | TB-CSPN | LangGraph | Improvement |
|---|---|---|---|
| Avg. Processing Time | 0.301s | 0.802s | 62.5% faster |
| Peak Throughput | 199.5 items/min | 74.8 items/min | 166.8% higher |
| LLM Calls per Item | 1.0 | 3.0 | 66.7% fewer |
| Success Rate | 100.0% | 100.0% | Equal reliability |
| Cost Efficiency | $X per 1000 items | $3X per 1000 items | 66.7% lower cost |
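As a sanity check, the improvement columns follow directly from the raw measurements; the throughput gain works out to ≈166.7%, matching the table's 166.8% to within rounding of the underlying data:

```python
# latency: time saved relative to the LangGraph baseline
speedup = (0.802 - 0.301) / 0.802 * 100  # ≈ 62.5% faster

# LLM calls and cost both drop by the same factor
call_reduction = (3.0 - 1.0) / 3.0 * 100  # ≈ 66.7% fewer calls, 66.7% lower cost

# throughput gain relative to the LangGraph baseline
throughput_gain = (199.5 - 74.8) / 74.8 * 100  # ≈ 166.7% higher
```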
| Feature | TB-CSPN | LangGraph / LangSmith |
|---|---|---|
| Core Model | Colored Petri Nets and rule-based coordination | Event-driven DAG and observability tools |
| Execution Semantics | Tokenized transitions with formal interleaving | Procedural chaining of nodes |
| Coordination Logic | Declarative rules, topic-based token flow | Implicit chaining via callbacks/prompts |
| Modularity | Places, transitions, tokens as composable units | Modular node-based chaining with state passing |
| Concurrency Support | True concurrency semantics via Petri Nets | Async execution without formal concurrency |
| Agent Modeling | Multi-agent roles (Supervisor, Consultant, Worker) | Mostly monolithic LLM agents |
| Guard Conditions | Verifiable thresholds using predicates | Prompt-based control logic |
| LLM Role | One of many agents in layered architecture | Primary orchestrator |
| Adaptivity | Learning-enabled agents (e.g., G-Learning) | Manual chain reconfiguration |
| Formal Verification | Yes (reachability, fairness, deadlock) | No formal verification |
| Traceability | Token-level flow and semantic traceability | Dashboard logs and callbacks |
| Reusability | High at rule and token level | Moderate (node and template reuse) |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).