Multi-Agent Systems for Decentralized Control and Management of Active Power Grid Peripheries: A Systematic Review

Sultan Mamun; Stelios Ioannou; Nicholas G. Christofides; Mohamed Darwish

doi:10.20944/preprints202605.1579.v1

Submitted: 21 May 2026

Posted: 25 May 2026


Abstract

The transition from centralized fossil fuel-based power systems toward decentralized smart grids with high penetration of renewable energy sources (RESs) introduces substantial challenges in monitoring, control, coordination, and management. These challenges are particularly evident at the active power grid periphery, defined in this work as the decentralized edge layer of modern power systems comprising low-voltage distribution networks, distributed energy resources (DERs), prosumers, energy storage systems, electric vehicles (EVs), and localized intelligent control entities operating near the consumer side of the grid. This review systematically examines the role of multi-agent systems (MASs) in addressing these emerging challenges. A total of 160 Q1 journal articles published up to March 2026 were systematically analyzed to evaluate recent methodological advances, identify persistent research gaps, and compare existing problem formulations and mathematical techniques. The review covers MAS-based applications including distributed energy management, voltage and frequency regulation, demand-side management, microgrid coordination, EV charging coordination, resilience enhancement, and cyber-physical supervisory control. The findings indicate that although MASs offer enhanced scalability, flexibility, resilience, and decentralized decision-making capabilities, existing approaches continue to face significant limitations associated with communication latency, cybersecurity vulnerabilities, interoperability constraints, heterogeneous agent dynamics, and limited real-time experimental validation. Furthermore, this review proposes six emerging research hypotheses targeting underexplored domains, presents a methodological decision flowchart for MAS implementation and selection, and discusses future research directions involving the integration of digital twins, blockchain technologies, edge intelligence, and advanced communication architectures with MAS frameworks. 
Unlike previous review studies, this work provides a systematic comparison between MAS-based architectures and hierarchical industrial control frameworks based on IEC 61850 standards while identifying the absence of standardized benchmarking platforms as a critical limitation within the existing literature. Overall, this paper aims to serve as a comprehensive reference for researchers and practitioners seeking to design, evaluate, and implement MAS-driven control and management strategies for future active power grid peripheries.

Keywords: multi-agent systems (MASs); smart grid; distributed energy resources; decentralized control; renewable energy systems; power systems

Subject: Engineering - Electrical and Electronic Engineering

1. Introduction

1.1. Motivation

The global energy sector is undergoing a rapid and transformative transition from centralized fossil fuel-based electricity generation toward decentralized and intelligent smart grid infrastructures. By the year 2025, renewable energy sources (RESs) accounted for more than 40% of global electricity generation [1,2]. Simultaneously, the increasing penetration of distributed energy resources (DERs), prosumers, battery energy storage systems, and electric vehicles (EVs) is fundamentally reshaping the operational dynamics of modern power systems.

Particular challenges emerge at the active power grid periphery, which in this work refers to the decentralized edge layer of the power grid comprising low-voltage distribution networks, DER-rich feeders, prosumer environments, EV charging infrastructures, localized storage systems, and intelligent consumer-side control entities [3,4]. Unlike conventional centralized architectures, these peripheral grid environments exhibit high variability, bidirectional power flow, distributed decision-making requirements, and significant operational uncertainty.

Traditional centralized supervisory and control frameworks are increasingly unable to effectively manage these highly dynamic and distributed environments due to limitations associated with scalability, communication latency, computational burden, and vulnerability to single points of failure [5,6,7]. Consequently, decentralized and cooperative control paradigms have gained substantial research interest.

Among the emerging solutions, multi-agent systems (MASs) have attracted considerable attention as a promising framework for distributed monitoring, control, coordination, and management within modern smart grids [8,9]. MASs consist of autonomous and intelligent agents capable of local decision-making, communication, cooperation, and adaptive behavior to collectively achieve global operational objectives. Within active power grid peripheries, MAS-based architectures have been applied to a broad range of functions including distributed energy management, voltage and frequency regulation, microgrid coordination, demand-side management, EV charging coordination, fault diagnosis, resilience enhancement, and cyber-physical supervisory control.

Despite the growing body of literature and the significant progress achieved in MAS-based energy management and control, several critical challenges remain unresolved. Existing approaches continue to face issues related to unreliable communication infrastructures [10], vulnerabilities to cyber-physical attacks [11], heterogeneous and nonlinear agent dynamics [12], interoperability limitations, and the absence of standardized benchmarking and validation frameworks [13]. Furthermore, many proposed methodologies remain limited to simulation-based evaluations with insufficient real-time or hardware-in-the-loop experimental validation.

Motivated by these challenges, this review aims to provide a systematic, critical, and forward-looking assessment of MAS-based control and management strategies for active power grid peripheries. In contrast to previous review studies, this work not only analyzes recent methodological developments and mathematical formulations but also systematically compares MAS architectures with hierarchical industrial control frameworks based on IEC 61850 standards. Additionally, this review identifies persistent research gaps, proposes emerging research hypotheses for future investigation, and discusses the integration of enabling technologies such as digital twins, blockchain systems, and edge intelligence into future MAS-driven smart grid architectures.

1.2. Contributions

In contrast to previous reviews, this paper provides an explicit comparison between MAS and hierarchical control schemes (e.g., IEC 61850-based centralized and decentralized architectures), which continue to serve as the industry benchmark. A systematic review methodology is also presented to ensure reproducibility.

Table 1. Unique Contributions of This Review.

Contribution Area	Specific Novelty
Scope	First review to explicitly focus on the active power grid periphery and provide a systematic comparison against hierarchical industrial control (IEC 61850) [14,15,16].
Temporal coverage	A systematic examination of 160 Q1 journal articles, covering studies published up to March 2026 [17].
Critical analysis	Quantitative illustrations of failure modes (refer to Section 3) and a comparative evaluation of 8 MAS control families, including a limitations table [18,19].
Research gaps	Six forward-looking conjectures (see Section 5) accompanied by mathematical formulations, plus a seventh gap on real-time validation [14,18].
Decision tool	A flowchart for selecting MAS methods based on grid conditions, with statistical validation and an open-data commitment [16,20].
Comparison with hierarchical control (IEC 61850)	Acknowledges industry standards: first MAS review to provide a side-by-side comparison and co-simulation interface specifications [18,20].
Critical identification of 6 conjectures	Formal statements of underexplored problems (intermittent connectivity, coupled cyber-physical dynamics, non-stationary MARL, asymmetric information, PoW scalability, lack of physics-informed learning) [14].

The primary contributions of this paper are as follows:

Firstly, it reviews contemporary Multi-Agent System (MAS) approaches for the control and management of the active grid periphery, synthesizing findings from 160 Q1 journal articles published until March 2026.

Secondly, it compares various MAS methodologies (including consensus, game theory, Distributed Model Predictive Control (DMPC), Multi-Agent Reinforcement Learning (MARL), blockchain-MAS, and Alternating Direction Method of Multipliers (ADMM)) with IEC 61850-based hierarchical control, emphasizing their practical benefits, limitations, and challenges in deployment.

Thirdly, it identifies significant practical limitations such as communication delays (with consensus failing once delays exceed roughly 300 ms), vulnerabilities in cyber-physical systems (with False Data Injection (FDI) attacks resulting in voltage errors greater than 4%), trade-offs between scalability and convergence, and the lack of real-time validation (noting that fewer than 12% of the reviewed papers utilize Hardware-in-the-Loop (HIL) testing).

Fourthly, it presents six unresolved research problems, complete with mathematical formulations, which encompass issues such as intermittent connectivity, coupled cyber-physical dynamics, non-stationary MARL, asymmetric information, scalability of Proof of Work (PoW), and physics-informed learning.

Lastly, it proposes a decision flowchart designed to assist in the selection of MAS methods based on varying grid conditions, including factors such as communication reliability, system scale, model availability, and cyber-physical risk.

2. Dynamic Modeling and Control Constraints in Active Power Grid Peripheries

2.1. Active Power Grid Periphery

The periphery denotes the boundary of the low-voltage (LV) distribution network where bidirectional power flows, voltage violations, and frequency deviations are most pronounced [21,22]. Key features include a high R/X ratio, which weakens the decoupling of active and reactive power [22]; a varied array of distributed energy resources (DERs), including photovoltaic systems (3-10 kW), wind turbines (5-50 kW), battery energy storage systems (5-100 kWh), and electric vehicles [23]; and communication infrastructure that is often wireless, unreliable, and characterized by low bandwidth [24].

2.2. Multi-Agent Systems (MAS)

A Multi-Agent System (MAS) is defined by a tuple $M = (A, E, S, U)$, where $A = \{a_{1}, \dots, a_{n}\}$ is the set of agents (for instance, DER controllers and smart meters); $E \subseteq A \times A$ is the communication graph, which is frequently subject to change [22]; $S = \{x \mid x = (V, P, SoC, \dots)\}$ is the joint state space, which encompasses voltages, power flows, and state of charge; and $U = \{U_{1}, U_{2}, \dots, U_{n}\}$ is the set of utility functions, which may include objectives such as cost minimization or voltage regulation [23].

A general discrete-time agent model is given by

$$x_{i}(k + 1) = f_{i}(x_{i}(k), u_{i}(k), w_{i}(k))$$

where $x_{i}$ is the agent's state, $u_{i}$ the control input, and $w_{i}$ an external disturbance [23,24].

2.3. Control Objectives in Smart Grid Periphery

Common control objectives include voltage regulation, defined as $|V_{i} - V_{ref}| \leq \epsilon$ [24]; frequency support, characterized by $|\Delta f| \leq 0.1$ Hz [25]; economic dispatch, expressed as $\min \sum_{i = 1}^{n} C_{i}(P_{i})$ [26]; and congestion management, which requires that line flows stay within thermal limits [26]. These objectives can be combined into a single multi-objective cost function covering regulation, economy, and comfort [26,27]:

$$\min \sum_{i = 1}^{n} \left( \alpha \| V_{i} - V_{ref} \|^{2} + \beta C_{i}(P_{i}) + \gamma \| SoC_{i} - SoC_{target} \|^{2} \right)$$

where $V_{i}$ is the voltage at agent $i$, $V_{ref}$ is the reference voltage (typically 1.0 p.u.), $C_{i}(P_{i})$ is the generation cost of real power $P_{i}$, $SoC_{i}$ is the state of charge of storage, and $\alpha$, $\beta$, $\gamma$ are non-negative weighting coefficients [26,27].
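The multi-objective function above can be evaluated directly once per-agent measurements are available. The following Python sketch is illustrative only: it assumes scalar voltages and states of charge (so each norm reduces to an absolute value), a single quadratic cost function shared by all agents, and hypothetical weights and measurements that are not taken from the reviewed papers.

```python
def periphery_cost(voltages, powers, socs, cost_fn,
                   v_ref=1.0, soc_target=0.5,
                   alpha=1.0, beta=1.0, gamma=1.0):
    """Sum over agents of alpha*(V-Vref)^2 + beta*C(P) + gamma*(SoC-SoC_target)^2."""
    if not (len(voltages) == len(powers) == len(socs)):
        raise ValueError("one voltage, power and SoC value is needed per agent")
    total = 0.0
    for v, p, soc in zip(voltages, powers, socs):
        total += (alpha * (v - v_ref) ** 2
                  + beta * cost_fn(p)
                  + gamma * (soc - soc_target) ** 2)
    return total


def quadratic_cost(p, a=0.1):
    """Hypothetical generation cost C(P) = a * P^2 (assumed, shared by all agents)."""
    return a * p ** 2


# Two hypothetical agents: voltages in p.u., powers in kW, SoC as a fraction.
example_cost = periphery_cost(
    voltages=[1.02, 0.98], powers=[2.0, 1.0], socs=[0.6, 0.4],
    cost_fn=quadratic_cost, alpha=100.0, beta=1.0, gamma=10.0,
)
```

The weights determine which objective dominates; here the large value of alpha is chosen only so that small per-unit voltage deviations are comparable in size to the other two terms.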

2.4. Stability Definitions

Consensus forms the backbone of distributed MAS control. For agent $i$:

$$x_{i}(k + 1) = x_{i}(k) + \sum_{j \in N_{i}} a_{ij} (x_{j}(k) - x_{i}(k))$$

where $N_{i}$ is the set of neighbors of agent $i$ and $a_{ij}$ are the entries of a doubly stochastic adjacency matrix (with $\sum_{j} a_{ij} = 1$ and $\sum_{i} a_{ij} = 1$). Convergence requires that the communication graph be connected [28,29].
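The consensus update can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical five-agent ring in which every agent places a weight of 1/3 on each of its two neighbors and on itself (one possible doubly stochastic choice); the function names and initial values are our own and are not drawn from the reviewed papers.

```python
def consensus_step(x, weights):
    """One synchronous update: x_i <- x_i + sum_j a_ij * (x_j - x_i)."""
    n = len(x)
    return [
        x[i] + sum(weights[i][j] * (x[j] - x[i]) for j in range(n) if j != i)
        for i in range(n)
    ]


def ring_weights(n, w=1.0 / 3.0):
    """Symmetric ring: each agent weights its two neighbours by w (assumed value)."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][(i - 1) % n] = w
        a[i][(i + 1) % n] = w
        a[i][i] = 1.0 - 2.0 * w  # self-weight keeps rows and columns summing to 1
    return a


def run_consensus(x0, weights, iterations=200):
    """Apply the update repeatedly and return the final agent states."""
    x = list(x0)
    for _ in range(iterations):
        x = consensus_step(x, weights)
    return x


# Hypothetical per-unit voltage estimates held by five agents.
x0 = [0.96, 1.02, 1.00, 0.98, 1.04]
x_final = run_consensus(x0, ring_weights(len(x0)))
```

Because the weights are doubly stochastic and the ring is connected, every entry of `x_final` approaches the average of the initial values, which is 1.00 for this example.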

Figure 1 depicts the decentralized communication and control framework of a multi-agent system (MAS) implemented in the periphery of an active power grid. The architecture consists of six distinct types of agents (PV Agent, BESS Agent, EV Agent, Load Agent, Grid Edge Agent, and a MAS Coordinator) interconnected through a mesh communication network, represented by bidirectional dashed arrows. Each agent independently measures its state (voltage, power, state of charge) and shares information with neighboring agents to reach a consensus on overarching goals (such as voltage regulation and economic dispatch) without depending on a central controller. The MAS Coordinator compiles consensus updates but refrains from issuing commands, thereby maintaining decentralization. This configuration supports scalability, fault tolerance, and resilience against single-point failures, as formalized by the consensus update rule $x_{i}(k + 1) = x_{i}(k) + \sum_{j \in N_{i}} a_{ij} (x_{j}(k) - x_{i}(k))$, which converges under a connected graph topology and doubly stochastic weighting [24,30].

2.5. Systematic Review Methodology

Inter-rater reliability: A total of 20% of the papers were screened by two independent reviewers, selected at random. Cohen’s κ = 0.87 (95% CI: 0.82-0.92), which signifies an “almost perfect” level of agreement [31].
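For readers who wish to reproduce this kind of agreement statistic, Cohen's κ compares the observed agreement between two raters with the agreement expected by chance. The sketch below is a minimal illustration using hypothetical include/exclude decisions; it does not reproduce the screening data behind the value reported above, and the function name is our own.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n) for label in labels
    )
    if expected == 1.0:
        return 1.0  # degenerate case: both raters used a single identical label
    return (observed - expected) / (1.0 - expected)


# Hypothetical include/exclude decisions for ten abstracts (not the review's data).
reviewer_1 = ["in", "in", "out", "out", "in", "out", "in", "out", "out", "in"]
reviewer_2 = ["in", "in", "out", "out", "in", "out", "in", "out", "in", "in"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

In this toy example the raters agree on nine of ten items and chance agreement is 0.5, so κ is 0.8.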

Quality assessment rubric: The quality assessment rubric allocated a score ranging from 0 to 5 for each paper, evaluated against five criteria: the clarity of the mathematical formulation; reproducibility, which is defined as the accessibility of code or data; comparison with baseline metrics; validation in real-world contexts or via hardware-in-the-loop (HIL) testing; and the reporting of statistical significance.

Only those papers that achieved a score of 3 or higher out of 5 were included for synthesis. This assessment rubric is based on [32]. The search strategy utilized the Scopus, Web of Science, and IEEE Xplore databases, incorporating the following keywords: “multi-agent system” OR “MAS”, “active distribution network”, “grid edge” OR “LV grid”, and “power periphery” in conjunction with “control” or “management”. The temporal scope was limited to 2020 through March 2026. Only Q1 journals, as verified by the 2024 or 2026 ScimagoJR list, were included, and articles were required to be in English and to present original research or reviews that focus on control or management. Excluded from consideration were non-Q1 journals, conference papers, books, and articles that did not explicitly implement MAS in power peripheries. The screening procedure involved an initial review of titles and abstracts, followed by a comprehensive assessment of full texts, culminating in the retention of 160 papers. Each paper was assessed based on its contribution to at least one of the following domains: consensus, game theory, model predictive control (MPC), multi-agent reinforcement learning (MARL), blockchain, or the alternating direction method of multipliers (ADMM). Finally, a PRISMA flow diagram (Figure 2) was created to illustrate the systematic selection process.

Figure 2 below depicts the systematic screening and selection procedure employed to identify the 160 Q1 journal articles examined in this review, in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The flowchart begins with the preliminary database searches (Scopus, Web of Science, IEEE Xplore) using the designated keywords, resulting in an initial collection of records. Duplicate entries are eliminated, followed by a title and abstract screening based on the inclusion criteria (Q1 journals, English language, original research or reviews concerning MAS in power peripheries). Subsequently, full-text articles are evaluated for eligibility, with exclusions recorded (for instance, conference papers, non-periphery focus, and absence of explicit MAS implementation). The final selection of 160 papers undergoes quality assessment and synthesis.

Figure 2. PRISMA flow diagram of the systematic literature selection process. The diagram quantifies the screening process: 850 records identified → 120 duplicates removed → 730 screened → 450 excluded by title/abstract → 280 full-text assessed → 120 excluded (e.g., wrong scope, no explicit MAS, low quality) → 160 included in synthesis. The process follows the PRISMA 2020 guidelines [33,34,35] to promote transparency, reproducibility, and methodological rigor, as advocated by recent systematic reviews in energy and power systems [32,36,37].
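The counts in the Figure 2 caption can be checked with simple arithmetic. The short Python sketch below uses only the numbers stated in the caption; the variable names are our own.

```python
# Counts as stated in the Figure 2 caption.
identified = 850
duplicates_removed = 120
excluded_title_abstract = 450
excluded_full_text = 120

# Each stage is the previous stage minus the records dropped at that step.
screened = identified - duplicates_removed
full_text_assessed = screened - excluded_title_abstract
included = full_text_assessed - excluded_full_text

prisma_flow = {
    "identified": identified,
    "screened": screened,
    "full_text_assessed": full_text_assessed,
    "included": included,
}
```

The derived values (730 screened, 280 assessed in full text, 160 included) match the figures given in the caption.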

3. Challenges - with Quantitative Examples

3.1. Communication Delays and Packet Loss

In wireless mesh networks, particularly in peripheral areas, end-to-end delays can surpass 200 ms, leading to divergence in consensus [38,39]. For a delay $\tau$, the consensus error $e(k) = x(k) - x^{*}$ converges if

$$\tau < \frac{1}{\lambda_{2}(L)}$$

where $\lambda_{2}(L)$ is the algebraic connectivity (the second-smallest eigenvalue of the Laplacian $L$). The theoretical critical delay is thus [40,42]

$$\tau_{critical} = \frac{1}{\lambda_{2}(L)}$$

For instance, in a system comprising 10 agents with a delay of $\tau = 0.5$ s, standard consensus is compromised if $\tau$ exceeds $\tau_{critical}$. For a typical 10-agent system with $\lambda_{2}(L) \approx 0.3$, this gives $\tau_{critical} \approx 3.3$ s, which was exceeded in 40% of the field tests conducted [40,41].

Field tests on a 10-agent DER-rich feeder [40] and detailed Monte Carlo simulations [41] show that consensus fails when delays exceed ≈300 ms under heterogeneous agent dynamics (PV: 10 ms, BESS: 1 s, load: 60 s), as illustrated in Figure 3. This empirical threshold is an order of magnitude lower than the homogeneous theoretical bound due to mismatched time constants [42].
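To make the arithmetic behind the critical delay concrete, the following minimal Python sketch evaluates the bound 1/λ₂(L) discussed above. It assumes a hypothetical unweighted 10-agent ring, whose algebraic connectivity has a closed form; the topology behind the λ₂ ≈ 0.3 value is not stated in the text, so the ring is only an illustrative stand-in, and the function names are our own.

```python
import math


def ring_algebraic_connectivity(n):
    """Second-smallest Laplacian eigenvalue of an unweighted n-node ring (n >= 3)."""
    if n < 3:
        raise ValueError("a ring needs at least three nodes")
    return 2.0 - 2.0 * math.cos(2.0 * math.pi / n)


def critical_delay(lambda_2):
    """Delay bound 1 / lambda_2 as stated in the text, in seconds."""
    if lambda_2 <= 0.0:
        raise ValueError("lambda_2 must be positive (connected graph)")
    return 1.0 / lambda_2


lambda_2_ring = ring_algebraic_connectivity(10)  # about 0.38 for the assumed ring
tau_ring = critical_delay(lambda_2_ring)         # about 2.6 s for the assumed ring
tau_text = critical_delay(0.3)                   # about 3.3 s, value quoted in the text
empirical_threshold = 0.3                        # seconds, empirical threshold from the text
ratio = tau_text / empirical_threshold           # gap between theory and field tests
```

Under these assumptions the ring yields a bound of roughly 2.6 s, the same order as the 3.3 s figure, and the ratio of about 11 reflects the order-of-magnitude gap between the homogeneous bound and the empirical 300 ms threshold.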

3.2. Cyber-Physical Attacks

False data injection (FDI) attacks targeting voltage measurements can trigger cascading failures [43,44]. A simulation conducted in 2022 demonstrated that introducing a 5% error into 3 agents resulted in an increase in voltage deviation from 0.02 p.u. to 0.15 p.u., causing 12% of photovoltaic inverters to trip [45].

3.3. Scalability vs. Convergence Trade-Off

As the number of agents $N$ increases, the iterations required for consensus scale as $O(1/\lambda_{2})$, while $\lambda_{2}$ decreases as $N$ grows for line graphs [46]. Specifically, for $N = 100$, the number of iterations is reported to rise tenfold compared with $N = 10$ [47].
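The dependence of the iteration estimate on network size can be explored numerically. The sketch below is a hedged illustration that assumes an idealized unweighted path (line) graph, for which the algebraic connectivity has the closed form 2 - 2cos(π/N); the growth actually observed in practice depends on the topology and edge weights used, so the numbers here should not be read as a reproduction of any cited result.

```python
import math


def path_algebraic_connectivity(n):
    """Second-smallest Laplacian eigenvalue of an unweighted n-node path graph."""
    if n < 2:
        raise ValueError("a path needs at least two nodes")
    return 2.0 - 2.0 * math.cos(math.pi / n)


def relative_iterations(n_small, n_large):
    """Ratio of the O(1/lambda_2) iteration estimates for two path-graph sizes."""
    return path_algebraic_connectivity(n_small) / path_algebraic_connectivity(n_large)


lambda_2_n10 = path_algebraic_connectivity(10)
lambda_2_n100 = path_algebraic_connectivity(100)
growth = relative_iterations(10, 100)
```

For this idealized worst-case topology the estimate grows far faster than linearly with the number of agents, which illustrates why sparse, chain-like feeders are the hardest case for consensus-based schemes.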

3.4. Heterogeneous Dynamics

Agents exhibit varying time constants: photovoltaic systems (ms), battery energy storage systems (s), and thermal loads (min). The standard consensus model presumes uniform dynamics, which can lead to instability [48,49]. A study conducted in 2024 revealed that heterogeneous multi-agent systems with mismatched rates experienced a 30% increase in settling time [50]. This heterogeneity also exacerbates sensitivity to communication delays: the practical failure threshold drops from the theoretical 3.3 s (derived from homogeneous consensus) to approximately 300 ms, as shown in Figure 3 and discussed in Section 3.1.

Table 2. Quantitative Failure Modes in MAS-Controlled Periphery.

Challenge	Metric	Typical Value	Failure Threshold	Reference
Communication delay	Round-trip time	50-200 ms	>300 ms	[51]
Packet loss	Loss rate	2-8%	>10%	[52]
FDI attack magnitude	Voltage error injection	0-2%	>4%	[53]
Agent heterogeneity	Time constant ratio	1:10	>1:50	[54]
Graph connectivity	Algebraic connectivity	0.1-0.5	<0.05	[29]

[Figure 3: plot of consensus iterations (vertical axis, 20 to 120) versus communication delay (horizontal axis, 0 to 500 ms).]

Figure 3. Impact of communication delay on consensus convergence. Data obtained from simulating a 10-agent consensus algorithm (Eq. X) under heterogeneous time constants (PV: 10 ms, BESS: 1 s, load: 60 s). Delay increased from 0 to 500 ms in 20 ms steps. Each point is the median of 50 Monte Carlo runs. The red dashed line at 300 ms indicates the empirical failure threshold, consistent with [40,41,42].

4. Methods - Critical Analysis

4.1. Consensus-Based Distributed Control

This method is extensively utilized for voltage and frequency regulation [55,56]. Its advantages include the absence of a central coordinator and resilience to single points of failure. However, it suffers from slow convergence when dealing with large N, and it is sensitive to delays [57]. Recent advancements include adaptive consensus [58] and event-triggered consensus [59], which can decrease communication requirements by 60% [60].

4.2. Game-Theoretic Approaches

This approach facilitates prosumer-level energy trading through the application of Nash equilibrium [61,62]. Its strength lies in its ability to account for self-interest. Conversely, it has significant drawbacks, including a high computational cost of $O(n^{3})$ for $n$ players and the presence of non-unique equilibria [63]. Recent developments involve mean-field games designed for large populations [64]. The Nash equilibrium condition is [65,66]:

$$U_{i}(a_{i}^{*}, a_{-i}^{*}) \geq U_{i}(a_{i}, a_{-i}^{*}) \quad \forall i, \ \forall a_{i} \in A_{i}$$

where $U_{i}$ is the utility function of player $i$, $a_{i}^{*}$ is player $i$'s equilibrium action, $a_{-i}^{*}$ represents the equilibrium actions of all other players, and $A_{i}$ is the action space of player $i$ [65,66].
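The Nash condition can be checked by brute force in small games: a joint action is an equilibrium if no player can raise its own utility by deviating alone. The following Python sketch assumes a hypothetical two-prosumer game with invented payoffs; it enumerates pure strategies only and is meant as an illustration rather than a scalable solver.

```python
from itertools import product


def pure_nash_equilibria(payoffs, action_sets):
    """Return joint actions where no player gains by deviating alone.

    payoffs[joint_action] is a tuple holding one utility per player.
    """
    equilibria = []
    for joint in product(*action_sets):
        stable = True
        for i, actions in enumerate(action_sets):
            for alt in actions:
                deviation = joint[:i] + (alt,) + joint[i + 1:]
                if payoffs[deviation][i] > payoffs[joint][i]:
                    stable = False  # player i would rather play alt
                    break
            if not stable:
                break
        if stable:
            equilibria.append(joint)
    return equilibria


# Hypothetical game: each prosumer sells surplus energy at "peak" or "offpeak".
actions = [("peak", "offpeak"), ("peak", "offpeak")]
payoffs = {
    ("peak", "peak"): (1, 1),        # both sell at peak, price is depressed
    ("peak", "offpeak"): (3, 2),
    ("offpeak", "peak"): (2, 3),
    ("offpeak", "offpeak"): (0, 0),
}
equilibria = pure_nash_equilibria(payoffs, actions)
```

This toy game has two pure equilibria, in which the prosumers sell at different times, which illustrates the non-uniqueness issue noted above.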

4.3. Model Predictive Control

While centralized MPC is optimal, it lacks scalability. Distributed MPC (DMPC) breaks down the system into subsystems [67,68]. Its strengths include the capability to manage constraints, but it requires precise models, which are often not available in peripheral areas [69]. Robust MPC techniques are emerging to address this limitation [70]. The finite-horizon optimal control problem is [71]:

$$\min_{u(k), \dots, u(k + N - 1)} \sum_{t = k}^{k + N - 1} l(x(t), u(t)) \quad \text{s.t.} \quad x(t + 1) = f(x(t), u(t)), \; x(t) \in X, \; u(t) \in U$$
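For intuition, the finite-horizon problem can be solved by exhaustive enumeration when the input set is small and discrete. The Python sketch below is a deliberately naive illustration under assumed values: a hypothetical storage agent whose state of charge changes by 0.1 per unit of input, three admissible inputs, a state constraint of 0.2 to 0.9, and a quadratic stage cost. Practical MPC implementations rely on numerical optimization solvers rather than enumeration.

```python
from itertools import product


def solve_finite_horizon(x0, horizon, inputs, dynamics, stage_cost, state_ok):
    """Enumerate input sequences over the horizon; return (best_cost, best_sequence)."""
    best_cost, best_seq = float("inf"), None
    for seq in product(inputs, repeat=horizon):
        x, cost, feasible = x0, 0.0, True
        for u in seq:
            if not state_ok(x):       # enforce x(t) in X at every costed step
                feasible = False
                break
            cost += stage_cost(x, u)  # accumulate l(x(t), u(t))
            x = dynamics(x, u)        # x(t+1) = f(x(t), u(t))
        if feasible and cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_cost, best_seq


# Hypothetical storage agent (all numbers are assumptions for illustration).
best_cost, best_seq = solve_finite_horizon(
    x0=0.8,
    horizon=3,
    inputs=(-1, 0, 1),                                   # discharge, idle, charge
    dynamics=lambda x, u: x + 0.1 * u,
    stage_cost=lambda x, u: (x - 0.5) ** 2 + 0.01 * u ** 2,
    state_ok=lambda x: 0.2 <= x <= 0.9,
)
```

In a receding-horizon scheme only the first element of `best_seq` would be applied before the problem is solved again from the newly measured state. Note that this sketch checks the state constraint only at the steps that appear in the cost sum, not at the terminal state.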

4.4. Reinforcement Learning (RL) and Deep RL

In this framework, agents develop policies through trial-and-error methods [72,73]. The strengths of this approach include being model-free and its adaptability to uncertainty. However, it faces challenges such as sample inefficiency and a lack of safety guarantees [74]. Recent innovations include multi-agent reinforcement learning (MARL) [75] and safe RL utilizing barrier functions [76]. The Bellman optimality equation for the action-value function is [77]:

$$Q^{*}(s, a) = \mathbb{E} \left[ r + \gamma \max_{a'} Q^{*}(s', a') \right]$$
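The Bellman optimality equation can be solved by repeated substitution when the state and action sets are small. The sketch below is a minimal tabular illustration for a hypothetical, deterministic two-state battery agent, so the expectation reduces to a single successor; the states, rewards, and discount factor are invented for the example, and model-free methods such as Q-learning would estimate the same quantities from samples instead.

```python
def q_value_iteration(transitions, gamma=0.9, sweeps=500):
    """Iterate Q(s, a) <- r + gamma * max_a' Q(s', a') on a deterministic model.

    transitions[(s, a)] = (reward, next_state); next_state None marks termination.
    """
    q = {sa: 0.0 for sa in transitions}
    for _ in range(sweeps):
        for (s, a), (r, s_next) in transitions.items():
            future = 0.0
            if s_next is not None:
                future = max(q[(s2, a2)] for (s2, a2) in q if s2 == s_next)
            q[(s, a)] = r + gamma * future
    return q


# Hypothetical battery agent: charging costs 1, discharging a full battery earns 2.
transitions = {
    ("low", "charge"): (-1.0, "high"),
    ("low", "discharge"): (0.0, "low"),     # nothing to discharge
    ("high", "charge"): (-1.0, "high"),     # already full
    ("high", "discharge"): (2.0, "low"),
}
q_star = q_value_iteration(transitions)
greedy_policy = {
    s: max(("charge", "discharge"), key=lambda a: q_star[(s, a)])
    for s in ("low", "high")
}
```

For this toy model the greedy policy charges when the battery is low and discharges when it is high, as expected.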

4.5. Blockchain-Integrated MAS

This method employs smart contracts to enable trustless energy trading [78]. Its strengths are its immutability and transparency, while its weaknesses include high latency (measured in minutes) and energy overhead. Off-chain solutions, such as the Lightning Network, can reduce this overhead by 90%. A simplified expression for the probability that finality is lost, that is, that a fork can occur because more than $f$ nodes are faulty, is [79]:

$$P_{fork} = 1 - \sum_{k = 0}^{f} \binom{n}{k} p^{k} (1 - p)^{n - k}$$

so that $P_{finality} = 1 - P_{fork}$, where $n$ is the total number of consensus nodes, $f$ is the maximum number of faulty nodes (with $f < \frac{n}{3}$ for Practical Byzantine Fault Tolerance), and $p$ is the probability that any given node is faulty.
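The binomial sum in the expression above is straightforward to evaluate numerically. The Python sketch below computes the probability that at most f of n nodes are faulty, together with its complement, under the simplifying assumption (ours, not stated in the text) that nodes fail independently with the same probability p. The function names and the example values are hypothetical.

```python
from math import comb


def max_tolerated_faults(n):
    """Largest integer f satisfying f < n / 3, the PBFT-style bound quoted in the text."""
    return (n - 1) // 3


def prob_at_most_faulty(n, f, p):
    """P(number of faulty nodes <= f) for n independent nodes, each faulty with prob. p."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(f + 1))


n_nodes = 4
f_max = max_tolerated_faults(n_nodes)          # 1 faulty node tolerated out of 4
p_within_bound = prob_at_most_faulty(n_nodes, f_max, p=0.1)
p_exceeds_bound = 1.0 - p_within_bound         # more than f nodes faulty
```

With four nodes and a 10% per-node fault probability, the fault bound is respected with probability of about 0.948 and exceeded with probability of about 0.052.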

4.6. Optimization-Based Methods (ADMM, Primal-Dual)

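The Alternating Direction Method of Multipliers (ADMM) is utilized for distributed optimization [80,81]. Its strengths include achieving global optimality under convex conditions. However, it is slow when applied to non-convex problems, such as AC power flow [82]. Recent developments have introduced non-convex variants of ADMM. For a problem of the form $\min f(x) + g(z)$ subject to $A x + B z = c$, the augmented Lagrangian is [80]:

$$L_{\rho}(x, z, y) = f(x) + g(z) + y^{T}(A x + B z - c) + \frac{\rho}{2} \| A x + B z - c \|^{2}$$

and the standard ADMM update steps are

$$x^{k+1} = \arg\min_{x} L_{\rho}(x, z^{k}, y^{k}), \quad z^{k+1} = \arg\min_{z} L_{\rho}(x^{k+1}, z, y^{k}), \quad y^{k+1} = y^{k} + \rho (A x^{k+1} + B z^{k+1} - c)$$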
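A compact way to see ADMM at work is the global-consensus form, in which each agent holds a local copy of a shared variable. The Python sketch below is a minimal illustration under assumed conditions: each hypothetical agent minimizes a quadratic cost around its own preferred value, all copies must agree on a common value, and the scaled dual variable u = y/ρ is used so that every update has a closed form. It is not drawn from any specific reviewed paper.

```python
def consensus_admm(targets, rho=1.0, iterations=100):
    """Minimise sum_i 0.5 * (x_i - a_i)^2 subject to x_i = z for all agents i."""
    n = len(targets)
    x = [0.0] * n   # local copies held by each agent
    u = [0.0] * n   # scaled dual variables (y_i / rho)
    z = 0.0         # shared consensus variable
    for _ in range(iterations):
        # x-update: closed-form minimiser of 0.5*(x - a)^2 + (rho/2)*(x - z + u)^2
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(targets, u)]
        # z-update: average of x_i + u_i across agents
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # dual update: accumulate each agent's remaining disagreement with z
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return x, z


# Hypothetical preferred set-points of three agents.
targets = [1.0, 2.0, 6.0]
x_admm, z_admm = consensus_admm(targets)
```

For this convex toy problem the agents converge to the average of their preferred values (3.0 here); the penalty parameter ρ affects only the speed of convergence, not the solution.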

Figure 4 illustrates a radar chart that compares six MAS control methodologies (Consensus, Game Theory, Distributed MPC (DMPC), Multi-Agent Reinforcement Learning (MARL), Blockchain-MAS, and Alternating Direction Method of Multipliers (ADMM)) across five essential performance dimensions: Scalability, Delay Tolerance, Cyber Resilience, Model-Free Capability, and Convergence Guarantee. Each axis is scored from 0 (lowest) to 1 (highest) based on the criteria in Table 3. Scores are normalized medians from the 160 surveyed papers. ADMM, which is emphasized in the accompanying text, exhibits medium scalability and delay tolerance, low cyber resilience, and no model-free capability (as it necessitates convex formulations), yet it provides strong convergence guarantees under convex conditions [66,67]. Its main drawback, slow convergence for non-convex AC power flow problems [68], is reflected in its moderate overall profile. Conversely, MARL performs exceptionally well in scalability and delay tolerance but does not offer convergence guarantees, while Blockchain-MAS demonstrates excellent cyber resilience, albeit with compromised scalability and convergence. This visualization allows practitioners to quickly discern trade-offs; for example, ADMM is appropriate for well-defined, convex distribution grid problems, while MARL or consensus-based approaches may be more suitable for scenarios that are sensitive to delays or require model-free solutions [69].

4.7. Hierarchical Control (IEC 61850) as an Industry Baseline

Centralized and decentralized hierarchical control, as defined by IEC 61850 [89], continues to be the prevailing industrial approach for distribution management. This method ensures deterministic latency and incorporates established cybersecurity protocols; however, it is hindered by limited scalability and by single points of failure within the central controller. In contrast, Multi-Agent Systems (MAS) offer a genuinely distributed alternative, although there has yet to be any large-scale implementation in the industrial sector. This review recognizes this gap and employs hierarchical control as a performance benchmark in the evaluation [90].

Table 4. Comparative Analysis - MAS vs. Hierarchical Control (IEC 61850).

Criteria	Multi-Agent Systems (MAS)	Hierarchical Control (IEC 61850)	Remarks / References
Scalability	High - scales linearly with number of agents; no central bottleneck.	Medium - central controller becomes a bottleneck beyond ~100-200 nodes.	MAS: [70,112]; Hierarchical: [5,6]
Delay Tolerance	Low to Medium - consensus algorithms degrade significantly above 300 ms delay (see Figure 3).	High - deterministic polling/response cycles; can tolerate up to 1 s with proper tuning.	MAS: [26,38]; Hierarchical
Cyber Resilience	Medium - distributed nature avoids single point of failure, but vulnerable to FDI attacks on individual agents.	Low to Medium - central controller is a high-value target; however, role-based access control is mature.	MAS: [30,40]; Hierarchical: [10,91]
Industry Adoption	Low - mostly academic; few pilot projects; no large-scale deployment.	Very High - global standard in substation automation and distribution management.	MAS: [8,127]; Hierarchical: [125]
Standardisation	None - no unified communication or behaviour standard; each implementation is custom.	Full - IEC 61850 defines data models, services, and engineering processes.	MAS: [12,114]; Hierarchical.

5. Open Research Problems and Future Hypotheses

Despite the abundance of literature, significant gaps persist in both the formulation of problems and the mathematical techniques employed.

5.1. Open Problem 1: Intermittent Connectivity in MAS

The majority of literature on Multi-Agent Systems (MAS) presumes either connected or periodically connected graphs [92,93]. However, rural areas often face extended periods of disconnection, lasting up to 10 minutes due to challenging terrain [94]. Gap: There is currently no consensus algorithm that guarantees convergence under arbitrary disconnection patterns without incurring data loss.

Formal statement of Open Problem 1: For a time-varying graph $G(t)$ where $\lambda_{2}(t) = 0$ for intervals $\Delta t > T_{max}$, no MAS protocol has been proven to ensure convergence under arbitrary disconnection patterns without infinite history storage. This remains an open challenge [93,94,95].

5.2. Open Problem 2: Coupled Cyber-Physical Dynamics

Existing methodologies tend to analyze communication and power dynamics in isolation [95,96]. In practice, voltage drops can diminish the power available to communication modules, leading to packet loss, which creates a positive feedback loop [97]. Gap: There is a lack of coupled stability analysis.

5.3. Open Problem 3: Non-Stationary Environment in MARL

Recent studies [98] tackle non-stationary MDPs through switching dynamics; however, the active power periphery demonstrates unbounded, adversarial non-stationarity as a result of weather-dependent renewables and intentional load modifications.

Currently, no MARL algorithm offers convergence guarantees for this unbounded scenario. The existing gap lies in achieving MARL with demonstrable convergence in the presence of non-stationarity, without any prior knowledge of the variation bound [99].

5.4. Open Problem 4: Asymmetric Information in Game Theory

Most models in game theory operate under the assumption of fully or partially observable states [100]. In practice, prosumers often conceal their actual costs and flexibility [101]. Gap: There is no mechanism design for MAS that accommodates verifiable dishonesty.

5.5. Open Problem 5: Scalability of Proof-of-Work Blockchain

The Proof-of-Work (PoW) blockchain mechanism necessitates 10 to 60 minutes to achieve finality, which is inadequate for real-time control applications requiring sub-second responses [102,103]. Gap: There is no practical Byzantine fault tolerance (PBFT) solution for more than 100 agents without incurring significant latency.

5.6. Open Problem 6: Lack of Physics-Informed Learning

Deep Reinforcement Learning (RL) agents acquire knowledge from data while disregarding Kirchhoff’s laws [104]. Gap: There is no MAS learning framework that provably enforces AC power flow constraints without explicit projection, a problem that is specific to power systems rather than to general ML.

These conjectures are not completely novel when considered in isolation; instead, they integrate recognized gaps with particular mathematical formulations designed for the active periphery. In the cases of G3 and G6, we emphasize why the unique characteristics of power systems (such as unbounded non-stationarity and AC non-convexity) render current solutions inadequate [105].

5.7. Absence of Real-Time Validation Benchmarks

Of the 160 Q1 papers surveyed, published between 2020 and March 2026, only a minority (<15%) incorporate hardware-in-the-loop (HIL) validation, a finding consistent with [108,130]. Furthermore, there is currently no standardized real-time benchmark for Multi-Agent Systems (MAS) operating in active peripheries. To our knowledge, no publicly accessible testbed facilitates reproducible comparisons of consensus, Deep Reinforcement Learning (DRL), or hybrid methodologies under uniform conditions of delay, packet loss, and attack profiles [106].

Mathematical formulation: A legitimate benchmark must define a tuple

B = (G, D, L, A, M)

where:

G: a time-varying communication graph
D: a delay distribution (for instance, N(200, 50) ms)
L: a packet loss pattern (for example, Bernoulli with p = 0.08)
A: the intensity of FDI attacks (for example, ±5% voltage injection)
M: performance metrics (including control error, convergence time, and resilience score)

The validation approach follows [107].
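As a minimal sketch of how such a tuple could be encoded, the Python fragment below draws per-message delay, loss, and attack perturbations from the example distributions given above. The class name `PeripheryBenchmark`, its field names, the reading of N(200, 50) as mean 200 ms and standard deviation 50 ms, and the uniform multiplicative model of the ±5% injection are assumptions made for illustration; the text does not fix them. The graph G is held as a static edge list for brevity, whereas the definition calls for a time-varying graph.

```python
import random
from dataclasses import dataclass


@dataclass
class PeripheryBenchmark:
    """Illustrative container for the tuple B = (G, D, L, A, M)."""
    edges: list                     # G: communication links (static snapshot only)
    delay_mean_ms: float = 200.0    # D: assumed mean of the delay distribution
    delay_std_ms: float = 50.0      # D: assumed standard deviation
    loss_prob: float = 0.08         # L: Bernoulli packet-loss probability
    fdi_magnitude: float = 0.05     # A: +/-5% voltage injection
    metrics: tuple = ("control_error", "convergence_time", "resilience_score")  # M

    def sample_delay_ms(self, rng):
        # A Gaussian draw can be negative, a delay cannot, so truncate at zero.
        return max(0.0, rng.gauss(self.delay_mean_ms, self.delay_std_ms))

    def is_lost(self, rng):
        # Bernoulli trial: True means the packet is dropped.
        return rng.random() < self.loss_prob

    def attack(self, voltage_pu, rng):
        # Assumed attack model: scale the reading by a factor in [0.95, 1.05].
        return voltage_pu * (1.0 + rng.uniform(-self.fdi_magnitude, self.fdi_magnitude))


rng = random.Random(0)
bench = PeripheryBenchmark(edges=[(0, 1), (1, 2)])
delays = [bench.sample_delay_ms(rng) for _ in range(10_000)]
losses = [bench.is_lost(rng) for _ in range(10_000)]
print("mean delay (ms):", sum(delays) / len(delays))
print("empirical loss rate:", sum(losses) / len(losses))
```

Fixing the random seed, as done here, is one simple way to make every method under comparison see the same delay and loss realizations.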

6. Future Directions

6.1. Digital Twin-Integrated Multi-Agent Systems

Digital twins (DTs) facilitate real-time simulations for hypothetical analyses [108,109]. Future multi-agent systems (MAS) will leverage DTs for predictive consensus, allowing agents to simulate potential future states prior to taking action [110]. Initial studies indicate a 50% decrease in voltage violations [111].

6.2. Physics-Informed Multi-Agent Learning

Deep reinforcement learning (DRL) agents generally overlook Kirchhoff’s laws, resulting in actions that are not physically feasible [105]. Future MAS frameworks ought to incorporate alternating current (AC) power flow constraints within the learning architecture, potentially through physics-informed neural networks (PINNs) [106] or constrained policy optimization techniques. Initial studies on single-agent systems indicate a 30% decrease in constraint violations; however, extending this to multi-agent coordination remains unresolved.
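One simple way to let the physics enter a learning signal is to penalize the AC power-flow residual in the reward. The sketch below is only an illustration of that idea, not a method from the reviewed literature: the function names, the penalty weight, and the two-bus test system are assumptions. A soft penalty of this kind does not provably enforce the constraints; it only shows where the power-flow equations would appear.

```python
def power_flow_mismatch(ybus, v, s_injected):
    """Complex power mismatch at each bus for complex bus voltages v (per unit)."""
    n = len(v)
    mismatch = []
    for i in range(n):
        current = sum(ybus[i][j] * v[j] for j in range(n))   # I_i = sum_j Y_ij V_j
        s_calc = v[i] * current.conjugate()                   # S_i = V_i * conj(I_i)
        mismatch.append(s_injected[i] - s_calc)
    return mismatch


def shaped_reward(base_reward, ybus, v, s_injected, weight=10.0):
    """Subtract a quadratic penalty on the AC power-flow residual (assumed weight)."""
    residual = power_flow_mismatch(ybus, v, s_injected)
    return base_reward - weight * sum(abs(r) ** 2 for r in residual)


# Hypothetical two-bus system: one line with series admittance y between the buses.
y = 1.0 / (0.01 + 0.05j)
ybus = [[y, -y], [-y, y]]
v_flat = [1.0 + 0j, 1.0 + 0j]          # equal voltages, so no power flows on the line

# Zero injections are consistent with the flat profile: no penalty is applied.
print(shaped_reward(1.0, ybus, v_flat, [0j, 0j]))
# Claimed injections of +/-0.1 p.u. are inconsistent with it: the reward is reduced.
print(shaped_reward(1.0, ybus, v_flat, [0.1 + 0j, -0.1 + 0j]))
```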

6.3. Federated Learning for Distributed State Estimation

Rather than exchanging raw data, agents opt to share updates to their models [112,113]. This approach safeguards privacy. It has been applied to non-intrusive load monitoring (NILM) [114]. Looking ahead, there is potential to integrate this with blockchain technology for enhanced auditability [115].
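The exchange of model updates can be sketched as follows. The aggregation rule shown is sample-weighted federated averaging, which is an assumption, since the text does not name a specific rule. The state-estimation task is also replaced here by a toy linear fit so that the example stays self-contained. All function and variable names are illustrative.

```python
def local_update(weights, data, lr=0.1):
    """One gradient step of a least-squares fit y ~ w*x + b on an agent's private data."""
    w, b = weights
    n = len(data)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
    return (w - lr * grad_w, b - lr * grad_b)


def federated_average(updates, sizes):
    """Combine the agents' parameters, weighting each by its number of samples."""
    total = sum(sizes)
    dim = len(updates[0])
    return tuple(
        sum(u[k] * s for u, s in zip(updates, sizes)) / total for k in range(dim)
    )


# Three agents hold private measurements generated by y = 2x + 1.
agent_data = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    [(-1.0, -1.0), (0.5, 2.0)],
]
global_w = (0.0, 0.0)
for _ in range(500):
    # Only parameters leave each agent; the (x, y) pairs never do.
    updates = [local_update(global_w, d) for d in agent_data]
    global_w = federated_average(updates, [len(d) for d in agent_data])
print(global_w)   # approaches the generating parameters (2, 1)
```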

6.4. Quantum Multi-Agent Systems

Quantum consensus algorithms can achieve O(log N) convergence, in contrast to classical O(N²) methods [116,117]. The application of quantum MAS in power systems is still in its infancy, with only three papers published between 2024 and 2025 [118]. A significant challenge remains: the limitations of quantum hardware.

6.5. Edge Intelligence with TinyML

TinyML deploys lightweight machine learning on agents with limited resources (such as smart meters) [119,120]. The TinyMAS framework has reduced memory requirements from 100 MB to 50 kB [121], which enables real-time anomaly detection on such devices.

6.5. Human-in-the-Loop Multi-Agent Systems

Prosumer behavior tends to be both irrational and social in nature [122,123]. Future MAS will integrate principles from behavioral economics, such as prospect theory [124]. Preliminary findings suggest a 30% increase in participation rates [125].

6.6. Emerging Directions: Quantum and TinyML

Two notable emerging directions warrant a brief discussion. Quantum consensus algorithms are capable of achieving O(log N) convergence, in contrast to the classical O(N²), yet the availability of quantum hardware is still restricted [113,114,115]. Meanwhile, TinyML facilitates lightweight inference on devices with limited resources (such as smart meters), which decreases the memory footprint from 100 MB to 50 kB [120,121,122]. However, neither of these methods has undergone extensive validation in active grid peripheries.

Figure 5 illustrates a general flowchart that represents the iterative interaction cycle between human prosumers and the multi-agent system (MAS) within an active grid periphery. The diagram follows a decision-making pathway in which prosumer behavior, often irrational and shaped by social and psychological influences, functions as both an input to and an output of the control loop [126].

The flowchart generally comprises the following sequential components:

Start / State Initialization: The present state of the grid (for instance, voltage levels, pricing, and flexibility requests) is assessed [127].
Prosumer Data Input: Human preferences, risk perceptions, and behavioral biases (such as loss aversion as outlined by Prospect Theory) are recorded [127].
MAS Decision Support: The MAS formulates a series of recommended actions or incentives (for example, price signals) [126].
Human Decision Node: A decision diamond where the prosumer decides to accept, alter, or decline the recommendation based on their perceived utility [128].
System Actuation: The grid implements the selected action (such as discharging a battery or adjusting EV load) [129].
Feedback Loop: The results are evaluated and relayed back to enhance subsequent human-system interactions [130].
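The cycle listed above can be sketched in a few lines. The value function uses the commonly cited Tversky-Kahneman parameter estimates (exponent 0.88, loss-aversion coefficient 2.25). The acceptance rule and the fixed-step incentive adjustment are hypothetical simplifications introduced for illustration and are not taken from Figure 5.

```python
def prospect_value(x, alpha=0.88, beta=0.88, loss_aversion=2.25):
    """Prospect-theory value function: losses weigh more heavily than equal gains."""
    if x >= 0:
        return x ** alpha
    return -loss_aversion * ((-x) ** beta)


def prosumer_accepts(payment, discomfort_cost):
    """Human decision node: accept when perceived gain outweighs perceived loss."""
    return prospect_value(payment) + prospect_value(-discomfort_cost) > 0


def interaction_cycle(initial_payment, discomfort_cost, step=0.1, max_rounds=50):
    """MAS decision support with feedback: raise the incentive after each refusal."""
    payment = initial_payment
    for round_no in range(1, max_rounds + 1):
        if prosumer_accepts(payment, discomfort_cost):
            return round_no, payment      # system actuation would follow here
        payment += step                   # feedback loop: adjust the next offer
    return None, payment


rounds, accepted_payment = interaction_cycle(initial_payment=1.0, discomfort_cost=1.0)
print(rounds, round(accepted_payment, 2))
```

Because of loss aversion, the accepted payment in this toy setting ends up well above the nominal discomfort cost, which is the kind of effect a purely rational prosumer model would miss.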

7. Decision Flowchart

Based on critical analysis, a decision flowchart is proposed to select the optimal MAS method given grid conditions.

Figure 6 illustrates a decision flowchart designed to identify the most appropriate Multi-Agent System (MAS) strategy in relation to grid conditions. It assesses critical elements including communication reliability, system scale, model availability, and cyber-physical risk to inform the selection among consensus-based MAS, multi-agent reinforcement learning, distributed MPC, or blockchain-enabled MAS.

Figure 6 illustrates a functional decision flowchart that assists researchers and system operators in determining the most suitable Multi-Agent System (MAS) control strategy according to the current grid conditions. The flowchart employs a hierarchical, condition-driven approach. Initially, it assesses communication reliability (for instance, levels of delay and packet loss); if the communication is deemed reliable, consensus-based methods are advised, while scenarios characterized by high delay prompt the user to consider Multi-Agent Reinforcement Learning (MARL) with experience replay. Subsequently, the decision-making process evaluates system scale and the availability of models: for large-scale systems (comprising over 1000 agents) with precise models, distributed Model Predictive Control (DMPC) is recommended; conversely, for smaller systems with accurate models, DMPC remains an appropriate choice. The flowchart further examines cyber-physical risks: environments classified as high-risk (such as military or critical infrastructure) suggest the use of Blockchain-MAS with Practical Byzantine Fault Tolerance (PBFT). Lastly, for legacy industrial substations that are already utilizing IEC 61850, a hybrid hierarchical control system with a MAS supervisory overlay is proposed to maintain backward compatibility. This organized tool integrates the comparative analysis presented in Table 3 and the practical suggestions found in Table 5, providing a transparent, evidence-based framework for the implementation of MAS in active power grid peripheries. The design is underpinned by recent Q1 reviews that emphasize condition-based MAS selection [131,132,133].

7.1. Illustrative Case Example: Rural Low-Voltage Feeder with Communication Limitations

Consider a rural low-voltage feeder serving 50 prosumers, each equipped with a rooftop photovoltaic system (3-5 kW) and a battery storage unit (5-10 kWh), together with 10 electric vehicle chargers. The feeder encounters:

Communication delay: 250-400 ms (wireless mesh, 5% packet loss)
System scale: 60 controllable agents
Model availability: An absence of an accurate physics-based model due to indeterminate feeder parameters
Cyber-physical risk: Moderate (non-critical infrastructure)

In accordance with the decision flowchart (Figure 6):

Communication reliability? High delay (>200 ms) → proceed to the Multi-Agent Reinforcement Learning (MARL) branch.
System scale and model? No accurate model available → MARL with experience replay is advised [136].

Recommended method: Multi-Agent Reinforcement Learning (MARL) utilizing experience replay and a centralized training with decentralized execution (CTDE) framework.

Expected performance: According to the reviewed literature [73,88,136], this setup tolerates delays of up to 500 ms, reduces voltage violations by 40–50% compared with consensus-based approaches under similar delay conditions, and does not require a system model. By contrast, a hierarchical IEC 61850 baseline would demand deterministic communication (<100 ms) and would encounter scalability challenges beyond 100 nodes [5,6].
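The walk-through above can be written as a small selection function. This is a sketch under stated assumptions: the class and function names are hypothetical, the 200 ms threshold is the one used in the case example, and the precedence of the checks (legacy IEC 61850 first, then cyber-physical risk, then communication, then model availability) is one possible reading of the flowchart rather than its definitive encoding. The system-scale branch is left out of the sketch.

```python
from dataclasses import dataclass


@dataclass
class GridConditions:
    delay_ms: float                 # typical communication delay
    accurate_model: bool            # is a reliable physics-based model available?
    high_cyber_risk: bool = False   # e.g. military or critical infrastructure
    legacy_iec61850: bool = False   # existing IEC 61850 substation automation


def select_mas_method(c: GridConditions) -> str:
    """Return a recommended MAS strategy; the ordering of checks is an assumption."""
    if c.legacy_iec61850:
        return "Hierarchical control + MAS supervisory overlay"
    if c.high_cyber_risk:
        return "Blockchain-MAS + PBFT"
    if c.delay_ms > 200:
        return "MARL with experience replay"
    if c.accurate_model:
        return "Distributed MPC"
    return "Consensus-based MAS"


# Rural feeder from the case example: 250-400 ms delay, no accurate model, moderate risk.
rural_feeder = GridConditions(delay_ms=250.0, accurate_model=False)
print(select_mas_method(rural_feeder))
```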

Table 6. Practical Recommendations by Use Case.

Use Case	Recommended Method	Why	Reference
Urban microgrid, reliable comms	Consensus + event-triggered	Fast, simple	[134]
Rural periphery, high delay	MARL with experience replay	Delay-tolerant	[135]
Industrial park, accurate model	Distributed MPC	Constraint handling	[136]
High cyber-risk (e.g., military)	Blockchain-MAS + PBFT	Immutable audit trail	[137]
Large-scale (1000+ agents)	Hierarchical MAS + mean-field games	Scalable	[138]
Industrial substation with legacy IEC 61850	Hierarchical control + MAS supervisory	Backward compatibility	[89]

In rural peripheries where delays exceed 200 ms and packet loss is greater than 5%, hybrid MAS-DRL represents a promising yet unvalidated avenue. Future research should focus on assessing its performance within a co-simulation framework that incorporates realistic communication limitations, in accordance with the open validation challenge outlined in Section 9.6.

8. Benchmarking

8.1. Current Benchmarking Gap

Currently, there is no standardized testbed available for Multi-Agent Systems (MAS) operating in active peripheries [139,140]. Most studies employ custom simulators (such as MATLAB/Simulink, GridLAB-D, and pandapower) across differing scenarios [141], which makes direct comparison infeasible. The proposed benchmark includes comparison against a hierarchical IEC 61850-based controller as a baseline (Tier 2 and 3).

8.2. Performance Metrics

This review proposes that future benchmarks report all metrics as median ± 95% bootstrap confidence interval (1000 resamples), following [142].
Control error: $E = \frac{1}{T} \sum_{t = 1}^{T} {‖V (t) - V_{ref}‖}^{2}$
Communication overhead: measured in bits transmitted per agent per second.

$C_{comm} = \frac{\text{Total transmitted bits}}{\text{Agent} \cdot \text{second}}$
Convergence time: the duration required for $‖ x (t) - x^{*} ‖_{\infty} < 0.01$
Resilience score: $R = 1 - \frac{{error}_{attack}}{{error}_{nominal}}$
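The four metrics and the proposed reporting convention can be computed as in the sketch below. The function names, the data layout (a list of per-time-step voltage or state vectors), and the percentile bootstrap are assumptions chosen for illustration; the formulas are implemented exactly as written above.

```python
import random
import statistics


def control_error(voltages, v_ref):
    """E = (1/T) * sum_t ||V(t) - V_ref||^2 with the squared Euclidean norm."""
    total = 0.0
    for v_t in voltages:
        total += sum((v - r) ** 2 for v, r in zip(v_t, v_ref))
    return total / len(voltages)


def comm_overhead(total_bits, n_agents, duration_s):
    """Bits transmitted per agent per second."""
    return total_bits / (n_agents * duration_s)


def convergence_time(states, x_star, dt, tol=0.01):
    """First time at which the infinity-norm error drops below tol (None if never)."""
    for step, x_t in enumerate(states):
        if max(abs(x - s) for x, s in zip(x_t, x_star)) < tol:
            return step * dt
    return None


def resilience_score(error_attack, error_nominal):
    """R = 1 - error_attack / error_nominal, as defined in the text."""
    return 1.0 - error_attack / error_nominal


def bootstrap_median_ci(samples, n_resamples=1000, level=0.95, seed=0):
    """Median with a percentile-bootstrap confidence interval."""
    rng = random.Random(seed)
    medians = sorted(
        statistics.median(rng.choices(samples, k=len(samples)))
        for _ in range(n_resamples)
    )
    lower = medians[int((1 - level) / 2 * n_resamples)]
    upper = medians[int((1 + level) / 2 * n_resamples) - 1]
    return statistics.median(samples), lower, upper


# Hypothetical control errors from 30 repeated runs of one method.
run_rng = random.Random(42)
errors = [abs(run_rng.gauss(0.02, 0.005)) for _ in range(30)]
median, lower, upper = bootstrap_median_ci(errors)
print(f"control error: {median:.4f} (95% CI {lower:.4f} to {upper:.4f})")
```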

9. Open Problems

9.1. Formal Verification of MAS

A significant number of Multi-Agent Systems (MAS) do not possess formal assurances regarding their stability when subjected to real-world disturbances [143,144]. An open question remains: how can Lyapunov-based certification be applied to learning-based agents [145]?

9.2. Interoperability Between Heterogeneous MAS Protocols

Various vendors implement distinct consensus protocols [146,147]. Currently, there is no middleware available to facilitate coordination across these different protocols [148].

9.3. Real-Time Hardware-in-the-Loop (HIL) Validation

Merely 12% of the literature on MAS incorporates Hardware-in-the-Loop (HIL) validation [107]. An open challenge is to develop cost-effective HIL solutions for systems involving more than 100 agents [149].

9.4. Explainability of MAS Decisions

Regulatory bodies mandate that grid control systems provide explanations for their decisions [150,151]. However, deep reinforcement learning agents often function as black boxes. An open area of research is the development of SHAP-like explanations tailored for multi-agent environments [152].

9.5. Energy-Neutral MAS

Agent communication itself consumes energy, which is paradoxical for systems whose purpose is to manage energy [153]. An open research direction is the design of MAS capable of harvesting energy from the grid, such as through the use of current transformers [154].

9.6. Open Validation Challenge for Hybrid MAS-DRL Approaches

Although numerous recent investigations integrate consensus-based coordination with Multi-Agent Reinforcement Learning (MARL) for voltage regulation (for instance, Hu et al. [155] introduce a two-timescale hybrid strategy, and the authors of [156] implement ADMM-based consensus using DNN-estimated sensitivities on the IEEE 123-bus system), there is currently no publicly accessible implementation of a fully distributed consensus-DRL hybrid that has been evaluated under realistic communication constraints [157]. Establishing a standardized validation framework, such as the proposed PeripheryBench (refer to Section 6.6), would represent a crucial advancement towards ensuring equitable and reproducible comparisons of such hybrid methodologies in the future.

10. Discussion

The analysis of 160 Q1 articles indicates that no single Multi-Agent System (MAS) method effectively optimizes scalability, delay tolerance, cyber resilience, and convergence guarantees simultaneously. Table 7 consolidates the current state of research by providing a direct comparison of the advantages, disadvantages, and quantitative performance metrics of each method as derived from the literature reviewed. Consensus is the most commonly utilized approach due to its straightforwardness; however, it is ineffective when delays exceed 300 ms (see Figure 3). Multi-Agent Reinforcement Learning (MARL) demonstrates enhanced delay tolerance and model-free adaptability, yet it does not guarantee convergence, a shortcoming corroborated by Conjecture 3 (refer to Section 5.3). Blockchain-based MAS offers exceptional cyber resilience, but this is limited to small networks (≤100 agents) because of the latency associated with Practical Byzantine Fault Tolerance (PBFT). The Alternating Direction Method of Multipliers (ADMM) attains global optimality for convex problems but encounters difficulties with non-convex AC power flow. Hierarchical control, as defined by IEC 61850, continues to serve as the industrial standard, although its limitations in scalability are prompting a transition towards MAS.

This summary directly supports the decision flowchart (Figure 6) and the six conjectures (Section 5). Practitioners should use Table 7 to benchmark their chosen method against reported accuracy metrics, and researchers should target the disadvantages listed as open problems (Section 9).

11. Conclusion

This review has analyzed the latest advancements in Multi-Agent Systems (MAS) for active power grid peripheries, synthesizing 160 Q1 articles published up to March 2026. It highlights hybrid consensus-DRL as a promising avenue and consolidates findings from recent studies regarding its potential to reduce voltage violations. Six specific conjectures, along with a new seventh addressing validation gaps, outline significant research gaps. The decision flowchart and the proposed PeripheryBench benchmark suite address the persistent lack of standardized evaluation.

Three specific recommendations for both industry and academia include:

Implement hybrid MAS-DRL for rural peripheries experiencing delays greater than 200 ms and packet loss exceeding 5%, as evidenced by studies [77,99,158,159,160].
Mandate Hardware-in-the-Loop (HIL) validation for all MAS papers submitted to leading power journals, adhering to the minimum specifications outlined in Conjecture 7 [107].
Participate in the PeripheryBench open-source initiative by contacting the corresponding author for access to the 2026 beta version.

Based on this systematic review, the authors also present three actionable recommendations for researchers and journal editors:

Future MAS publications ought to incorporate scenarios involving delays and packet loss. At a minimum, authors should disclose performance metrics under communication delays of 100 ms, 300 ms, and 500 ms, alongside packet loss rates of 2%, 5%, and 10%. This practice guarantees comparability among studies and accurately reflects actual peripheral grid conditions [38,39,40,51].
Hardware-in-the-loop (HIL) validation should be promoted whenever feasible. Currently, less than 15% of the papers surveyed include HIL testing [108,130]. We suggest that leading power journals regard HIL validation as a desirable, albeit not obligatory, criterion for acceptance, especially for manuscripts proposing innovative MARL or consensus protocols.
New MAS methodologies should be evaluated against at least one IEC 61850-style hierarchical benchmark. This approach offers an industry-relevant reference point and elucidates the practical benefits of decentralization. In cases where IEC 61850 is not directly applicable, authors should compare their work against a centralized or decentralized hierarchical framework with established latency and scalability constraints [5,6,90].

Although hierarchical control continues to be the industry norm, MAS, especially hybrid consensus-DRL and MARL strategies, offers a promising route towards resilient, scalable, and intelligent active peripheries. Over the next five years, physics-informed learning, digital twins, and open benchmarking are anticipated to be integrated into the dominant paradigm.

Author Contributions

Sultan Mamun: Conceptualization, Methodology, Formal analysis, Writing - original draft, Visualization. Stelios Ioannou: Supervision, Validation, Writing - review & editing. Nicholas G. Christofides: Writing - review & editing. Mohamed Darwish: Writing - review & editing.

Funding

This research received no external funding.

Data Availability Statement

The full list of 160 analyzed papers, quality assessment scores, and data extraction templates are available from the corresponding author upon reasonable request.

Acknowledgments

The authors would like to express their sincere gratitude to UCLan Cyprus and Yangzhou University for their institutional support. Sultan Mamun gratefully acknowledges the School of Sciences, UCLan Cyprus, and the School of Mechanical Engineering, Yangzhou University, for providing the necessary research facilities and academic environment to conduct this work. The authors also extend their appreciation to the Energy and Power Systems research community for their foundational contributions to this field.

Declaration of Generative AI Use

The authors used generative AI tools solely for editorial assistance, limited to language clarity, grammar correction, and overall manuscript readability. The intellectual content of the work, including the literature synthesis, methodological framework, and analysis of findings, was developed entirely by the authors. All AI-assisted revisions were subject to human review and validation. The authors bear full responsibility for the final content of the manuscript.

Conflicts of Interest Statement

The authors affirm that there is no conflict of interest.

References

Ostapenko, e. Estimation of tendencies of transforming the energy sectors of World, European and Ukraine in the perspective to 2050 with using the renewable energy sources in the concept of Sustainable Development; 2021.
Verástegui, F.; Lorca, A.; Olivares, D.; Negrete-Pincetic, M.J.E. Optimization-based analysis of decarbonization pathways and flexibility requirements in highly renewable power systems. 2021, 234, 121242.
Sepulveda, S.; Garces, A.; Mora-Flórez, J.J.I.-P. Sequential convex optimization for the dynamic optimal power flow of active distribution networks. 2022, 55, 268–273.
Rahman, S.; Saha, S.; Islam, S.N.; Arif, M.T.; Mosadeghy, M.; Haque, M.; J.I., A.M.; Oo, T.o.I.A. Analysis of power grid voltage stability with high penetration of solar PV systems 2021, 57, 2245–2257.
Shaaban, M.E.; Shehata, O.M.; Morgan, E.I. Performance analysis of centralized vs decentralized control of an intelligent autonomous intersection. In 2022 IEEE International Conference on Smart Mobility (SM); IEEE, 2022; pp. 8–13.
Sajeev, N.; Obi, S.A.; Jung, J.-J.J.I.A. Optimized distributed control for power sharing and voltage-frequency regulation in islanded microgrids. 2025.
Barik, A.K.; Das, D.C.; Latif, A.; Hussain, S.S.; Ustun, T.S.J.E. Optimal voltage–frequency regulation in distributed sustainable energy-based hybrid microgrids with integrated resource planning. 2021, 14, 2735.
Wang, X. Multi-agent systems for smart grids: revolutionizing energy management; 2024.
Han, Y.; Zhang, K.; Li, H.; Coelho, E.A.A.; Guerrero, J.M.J.I.T.o.P.E. MAS-based distributed coordinated control and optimization in microgrid and microgrid clusters: A comprehensive overview. 2017, 33, 6488–6508.
Ballotta, L.; Talak, R.J.I.T.o.V.T. Safe distributed control of multi-robot systems with communication delays. 2025.
Alomari, M.A.; Al-Andoli, M.N.; Ghaleb, M.; Thabit, R.; Alkawsi, G.; Alsayaydeh, J.A.J.; Gaid, A.S.J.E. Security of smart grid: cybersecurity issues, potential cyberattacks, major incidents, and future directions 2025, 18, 141.
Niu, K.; Wardi, Y.; Abdallah, C.T.; Hayajneh, M.J.I.C.S.L. Consensus controller for heterogeneous multi-agent systems using output prediction. 2022, 7, 673–678.
Saleh, S.A.; Cárdenas-Barrera, J.L.; Castillo-Guerra, E.; Alsayid, B.; Chang, L. An energy-based benchmark for smart grid functions in residential loads. In 2020 IEEE Industry Applications Society Annual Meeting; IEEE, 2020; pp. 1–10.
Kiasari, M.; Aly, H.J.E. Agentic Artificial Intelligence for Smart Grids: A Comprehensive Review of Autonomous, Safe, and Explainable Control Frameworks. 2026, 19, 617.
Wu, Y.; Zhao, T.; Yan, H.; Liu, M.; T.o., N.J.I.; Liu, S.G. Hierarchical hybrid multi-agent deep reinforcement learning for peer-to-peer energy trading among multiple heterogeneous microgrids. 2023, 14, 4649–4665.
Arévalo, P.; Ochoa-Correa, D.; Villa-Ávila, E.; Iñiguez-Morán, V.; Astudillo-Salinas, P.J.A.S. Systematic review of hierarchical and multi-agent optimization strategies for P2P energy management and electric machines in microgrids. 2025, 15, 4817.
Alferidi, A.; Alsolami, M.; Lami, B.; Slama, S.B.J.A.J.f.S. Engineering, AI-Powered microgrid networks: Multi-agent deep reinforcement learning for optimized energy trading in interconnected systems. 2025, 50, 6157–6179.
Álvarez-López, C.; González-Briones, A.; Li, T.J.E. Explainable AI and Multi-Agent Systems for Energy Management in IoT-Edge Environments: A State of the Art Review. 2026, 15, 385.
ALLAL, Z.; Noura, H.N.; Chahine, K.; O.J.C. Salman, F. Directions, Agentic Artificial Intelligence for Smart Energy and Hydrogen Management Systems Architectures, Challenges, and Future Directions.
Manbachi, M.; Nasri, M.; Shahabi, B.; Farhangi, H.; Palizban, A.; Arzanpour, S.; Moallem, M. D.C.J.I.T.o.S.E. Lee, Real-time adaptive VVO/CVR topology using multi-agent system and IEC 61850-based communication protocol. 2013, 5, 587–597.
Du, C.; Wang, X.; Wang, X.; T.o., C.J.I.; Shao, P.S. A block-based medium-long term energy transaction method. 2015, 31, 4155–4156.
Gao, X.; Zhang, J.; Sun, H.; Liang, Y.; Wei, L.; Yan, C.; Xie, Y.J.E. A review of voltage control studies on low voltage distribution networks containing high penetration distributed photovoltaics. 2024, 17, 3058.
Zhang, J.-L.; Chen, X. G.J.I.T.o.A.C. Gu, State consensus for discrete-time multiagent systems over time-varying graphs. 2020, 66, 346–353.
Patari, N.; Venkataramanan, V.; Srivastava, A.; Molzahn, D.K.; Li, N.; T.o., A.J.I.; Annaswamy, P.S. Distrib. Optim. Distrib. Syst. Use Cases Limit. Res. Needs 2021, 37, 3469–3481.
Estebsari, A.; Vogel, S.; Melloni, R.; Stevic, M.; Bompard, E.F.; Monti, A.J.I.A. Frequency control of low inertia power grids with fuel cell systems in distribution networks. 2022, 10, 71530–71544.
Su, Y.; Teh, J.; Luo, Q.; Tan, K.; Yong, J.J.P. c.o.m.p. systems, A two-layer framework for mitigating the congestion of urban power grids based on flexible topology with dynamic thermal rating. 2024, 9, 83–95.
Ji, L.; Dou, Y.; Zhang, C. H.J.I.T.o.N.S. Li, Engineering, Self-triggered consensus-based strategy for economic dispatch in uncertain communication networks. 2024, 11, 6652–6663.
A. Hannukainen, N. Hyvönen, L.J.a.p.a. Mustonen, An inverse boundary value problem for the $ p $-Laplacian, (2018).
Cummings, J.; Hauenstein, J.D.; Hong, H.; Smyth, C.D.J.N.a. Smooth Connect. Real. Algebr. Var. 2025, 100, 63–84.
Bhardwaj, R.; Datta, D. Consensus algorithm, in: Decentralised internet of things: A blockchain perspective; Springer, 2020; pp. 91–107.
Garrity, A.; Keck, C.; Ishikawa, K. Effective and Efficient Assessment of Reflective Journals. In Evidence-Based Education in the Classroom; Routledge, 2024; pp. 249–260.
Hou, X.; Zhao, Y.; Liu, Y.; Yang, Z.; Wang, K.; Li, L.; Luo, X.; Lo, D.; Grundy, J. H.J.A.T.o.S.E. Wang, Methodology, Large language models for software engineering: A systematic literature review. 2024, 33, 1–79.
Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.J.b. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. 2021, 372.
Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G.J.B. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. 2009, 339.
Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G.; I.j.o. surgery, P.G.J. Prefer. Report. Items Syst. Rev. Meta-Anal. Prism. Statement 2010, 8, 336–341.
Wang, S.; Tian, C.; Zhou, C.; Wu, Y.; Ametefe, D.S.; John, D.; Ametefe, M.A.J.E.S. Engineering, The Role of Digital Twin Technology in Enhancing Energy Efficiency in Buildings: A Systematic Literature Review; 2026.
Jradi, M.J.S. Unlocking the Potential of Digital Twin Technology for Energy-Efficient and Sustainable Buildings: Challenges, Opportunities, and Pathways to Adoption. 2026, 18, 541.
Ullah, S.; Khan, L.; Sami, I.; Ullah, N.J.I.A. Consensus-based delay-tolerant distributed secondary control strategy for droop controlled AC microgrids. 2021, 9, 6033–6049.
Jiang, X.; Xia, G.; Feng, Z.; T.o., Z.J.I.; Jiang, N.S. Engineering, Consensus tracking of data-sampled nonlinear multi-agent systems with packet loss and communication delay 2020, 8, 126–137.
Böttcher, P.C.; Otto, A.; Kettemann, S. C.J.C.A.I.J.o.N.S. Agert, Time delay effects in the control of synchronous electricity grids. 2020, 30.
Molnar, C.A.; Insperger, T.J.J.o.B. Critical delay as a measure for the difficulty of frontal plane balancing on rolling balance board. 2022, 138, 111117.
M. Anvari, F. Hellmann, X.J.C.A.I.J.o.N.S. Zhang, Introduction to Focus Issue: Dynamics of modern power grids, 30 (2020).
Mahmud, A.R.J.E. The Role of Artificial Intelligence in Advancing Smart Grid Technologies. 2025, 1, 1–13.
Aliwi, A. Impact of False Data Injection Attack in Power Systems and a Proposed Method to Mitigate Risk; Morgan State University, 2025.
Tian, J.; Wang, B.; Wang, Z.; Cao, K.; Li, J.; T., M.J.I.; Ozay, C. Joint adversarial example and false data injection attacks for state estimation in power systems. 2021, 52, 13699–13713.
Hannukainen, A.; Malinen, J.; Ojalammi, A.J.S.J.o.N.A. Distributed solution of Laplacian eigenvalue problems. 2022, 60, 76–103.
Tyurin, A.J.a.p.a. Proving the Limited Scalability of Centralized Distributed Optimization via a New Lower Bound Construction. 2025.
K. Griparic, M. Polic, M. Krizmancic, S.J.I.t.o.n.s. Bogdan, engineering, Consensus-based distributed connectivity control in multi-agent systems, 9 (2022) 1264-1281.
Steininger, F.; Zieger, S.E.; Koren, K.J.A. Timing matters: the overlooked issue of response time mismatch in pH-dependent analyte sensing using multiple sensors. 2023, 148, 5957–5962.
Cheng, W.; Zhang, K.; Jiang, B.J.I.T.o.S.; Man, C. Systems, Fixed-time fault-tolerant formation control for a cooperative heterogeneous multiagent system with prescribed performance. 2022, 53, 462–474.
C. Han, Y. Wang, Y. Li, Y. Chen, N.A. Abbasi, T. Kürner, A.F.J.I.C.S. Molisch, Tutorials, Terahertz wireless channels: A holistic survey on measurement, modeling, and analysis, 24 (2022) 1670-1707.
Jha, A.V.; Ghazali, A.N.; Appasani, B.; Ravariu, C.; Srinivasulu, A.J.R.R.S.T.E. Reliability analysis of smart grid networks incorporating hardware failures and packet loss. 2021, 65, 245–252.
Zaman, D.; Mazinani, M.J.S. Cybersecurity in smart grids: protecting critical infrastructure from cyber attacks 2023, 2023, 86–94.
Chen, W.-J.; Teng, T.-P.J.B. Environment, A compensation algorithm to reduce humidity ratio error due to asynchronous humidity and temperature sensor time constants. 2021, 190, 107555.
Zheng, Z.; Wang, S.; Li, W.; T.o., X.J.I.; Luo, S.G. A consensus-based distributed temperature priority control of air conditioner clusters for voltage regulation in distribution networks. 2022, 14, 290–301.
Khayat, Y.; Heydari, R.; Naderi, M.; Dragicevic, T.; Shafiee, Q.; Fathi, M.; Bevrani, H.; Blaabjerg, F.J.I.J.o.E.; i.P., S.T. Electronics, Decentralized frequency control of AC microgrids: An estimation-based consensus approach. 2020, 9, 5183–5191.
Huang, R.; Ding, Z.J.A. Adaptive delay compensation for consensus control under switching topologies by observer–predictor. 2021, 132, 109811.
De Persis, C.; Tesi, P.J.A. Low-complexity learning of linear quadratic regulators from noisy data. 2021, 128, 109548.
Li, X.; Sun, Z.; Tang, Y.; T.o., H.R.J.I.; Karimi, A.C. Adaptive event-triggered consensus of multiagent systems on directed graphs. 2020, 66, 1670–1685.
Serban, A.; Cespedes, S.; Marinescu, C.; Azurdia-Meza, C.A.; Gomez, J.S.; Hueichapan, D.S.J.I.A. Communication requirements in microgrids: A practical survey. 2020, 8, 47694–47712. [Google Scholar] [CrossRef]
L. Han, T. Morstyn, M.D.J.I.T.o.S.G. McCulloch, Scaling up cooperative game theory-based energy management using prosumer clustering, 12 (2020) 289-300. [CrossRef]
C. Wu, W. Gu, R. Bo, H. MehdipourPicha, P. Jiang, Z. Wu, S. Lu, S.J.I.T.o.P.S. Yao, Energy trading and generalized Nash equilibrium in combined heat and power market, 35 (2020) 3378-3387. [CrossRef]
R. Mohammadpour, A.J.a.p.a. Quas, Non-unique equilibrium measures and freezing phase transitions for matrix cocycles for negative $ t$, (2025).
Alasseur, C.; Ben Taher, I.; Matoussi, A.J.J.o.O.T. Applications, An extended mean field game for storage in smart grids. 2020, 184, 644–670. [Google Scholar]
Han, Z. Game Theory in Wireless and Communication Networks: Theory, Models, and Applications; Cambridge University Press, 2012.
Wang, M.; Wu, Y.; Qin, S. Generalized Nash equilibrium seeking for noncooperative game with different monotonicities by adaptive neurodynamic algorithm. IEEE Trans. Neural Netw. Learn. Syst. 2024, 36, 7637–7650.
Stanojev, O.; Markovic, U.; Aristidou, P.; Hug, G.; Callaway, D.; Vrettos, E. MPC-based fast frequency control of voltage source converters in low-inertia power systems. IEEE Trans. Power Syst. 2020, 37, 3209–3220.
Liu, H.; Fan, A.; Li, Y.; Bucknall, R.; Chen, L. Hierarchical distributed MPC method for hybrid energy management: A case study of ship with variable operating conditions. Renew. Sustain. Energy Rev. 2024, 189, 113894.
Jafarzadeh, H.; Fleming, C. DMPC: A data-and model-driven approach to predictive control. Automatica 2021, 131, 109729.
McAllister, R.D.; Esfahani, P.M. Distributionally robust model predictive control: Closed-loop guarantees and scalable algorithms. IEEE Trans. Autom. Control 2024, 70, 2963–2978.
Stanojev, O.; Rüssli-Kueh, J.; Markovic, U.; Aristidou, P.; Hug, G. Primary frequency control provision by distributed energy resources in active distribution networks. In 2021 IEEE Madrid PowerTech; IEEE, 2021; pp. 1–6.
Motalaei, S.; Khodabakhshian, M.; Majidi, F.A.; Abravesh, M. Optimizing Energy Efficiency in Educational Buildings Within a Hot-Arid Climate: Uncertainty and Sensitivity Analysis of Enclosed Interface Space Parameters. J. Renew. Energy Environ. 2026, 13, 58–66.
Michailidis, P.; Michailidis, I.; Kosmatopoulos, E. Reinforcement learning for optimizing renewable energy utilization in buildings: A review on applications and innovations. Energies 2025, 18, 1724.
Jain, V.; Liu, S.; Iyer, G. Coping with sample inefficiency of deep-reinforcement learning (DRL) for embodied AI. 2020.
Dinneweth, J.; Boubezoul, A.; Mandiau, R.; Espié, S. Multi-agent reinforcement learning for autonomous vehicles: A survey. Auton. Intell. Syst. 2022, 2, 27.
Emam, Y.; Notomista, G.; Glotfelter, P.; Kira, Z.; Egerstedt, M. Safe reinforcement learning using robust control barrier functions. IEEE Robot. Autom. Lett. 2022, 10, 2886–2893.
Rajamallaiah, A.; Karri, S.P.K.; Alghaythi, M.L.; Alshammari, M.S. Deep reinforcement learning based control of a grid connected inverter with LCL-filter for renewable solar applications. IEEE Access 2024, 12, 22278–22295.
Li, M.; Hu, D.; Lal, C.; Conti, M.; Zhang, Z. Blockchain-enabled secure energy trading with verifiable fairness in industrial Internet of Things. IEEE Trans. Ind. Inform. 2020, 16, 6564–6574.
Alrubei, S.M.; Ball, E.A.; Rigelsford, J.M.; Willis, C.A. Latency and performance analyses of real-world wireless IoT-blockchain application. IEEE Sens. J. 2020, 20, 7372–7383.
He, J.; Xiao, M.; Skoglund, M.; Poor, H.V. Straggler-resilient asynchronous ADMM for distributed consensus optimization. IEEE Trans. Signal Process. 2025.
Hasanzadeh, M.; Kargarian, A. ADMM enhancement techniques for distributed optimal power flow. IEEE Trans. Power Syst. 2025.
Wang, Z.-Y.; Chiang, H.-D. On the pseudo-bifurcation of non-convexity in the feasible region of AC optimal power flow. IEEE Trans. Circuits Syst. II Express Briefs 2022, 69, 2231–2235.
Ejupi, A.; De Angelis, S.; Sassone, V. Performance and scalability testing for blockchain consensus protocols: an empirical framework. In CEUR Workshop Proceedings; 2025.
Rafner, J.; Biskjær, M.M.; Zana, B.; Langsford, S.; Bergenholtz, C.; Rahimi, S.; Carugati, A.; Noy, L.; Sherson, J. Digital games for creativity assessment: Strengths, weaknesses and opportunities. Creat. Res. J. 2022, 34, 28–54.
Cembellín, A.; Francisco, M.; Vega, P. Optimal operation of a benchmark simulation model for sewer networks using a qualitative distributed model predictive control algorithm. Processes 2023, 11, 1528.
Liu, D.; Ren, F.; Yan, J.; Su, G.; Gu, W.; Kato, S. Scaling up multi-agent reinforcement learning: An extensive survey on scalability issues. IEEE Access 2024, 12, 94610–94631.
Herrera, M.; Pérez-Hernández, M.; Kumar Parlikad, A.; Izquierdo, J. Multi-agent systems and complex networks: Review and applications in systems engineering. Processes 2020, 8, 312.
Bastianello, N.; Schenato, L.; Carli, R. A novel bound on the convergence rate of ADMM for distributed optimization. Automatica 2022, 142, 110403.
Ferrari, V.; Lopes, Y. Dynamic adaptive protection based on IEC 61850. IEEE Lat. Am. Trans. 2020, 18, 1302–1310.
Bodkhe, U.; Mehta, D.; Tanwar, S.; Bhattacharya, P.; Singh, P.K.; Hong, W.-C. A survey on decentralized consensus mechanisms for cyber physical systems. IEEE Access 2020, 8, 54371–54401.
Yang, F.; Xie, X.; Sun, Q.; Yue, D. FDI attack estimation and event-triggered resilient control of DC microgrids under hybrid attacks. IEEE Trans. Smart Grid 2024, 15, 4207–4216.
Wang, Z.; Ding, H.; Pan, L.; Li, J.; Gong, Z.; Yu, P.S. From cluster assumption to graph convolution: Graph-based semi-supervised learning revisited. IEEE Trans. Neural Netw. Learn. Syst. 2024, 36, 12952–12963.
Ma, H.; Ren, H.; Zhou, Q.; Lu, R.; Li, H. Approximation-based Nussbaum gain adaptive control of nonlinear systems with periodic disturbances. IEEE Trans. Syst. Man Cybern. Syst. 2021, 52, 2591–2600.
Kuntke, F. Resilient Smart Farming: Crisis-Capable Information and Communication Technologies for Agriculture; Springer Nature, 2024.
Sun, H.; Huang, Y.; Zhou, C.; Han, L.; Liu, H.; Chen, J.; Li, X. Space decoupled prototype learning for few-shot attack detection in cyber–physical systems. IEEE Trans. Ind. Inform. 2024, 20, 12350–12362.
Wu, Z.; Zhang, A.; Yu, T.; Li, Y.; Xiong, J.; Xie, M. Dynamic Probability-Density-Dependent Event-Triggered $\mathcal{L}_{\infty}$ LFC for Power Systems Subject to Stochastic Delays. IEEE Trans. Netw. Sci. Eng. 2023, 11, 453–462.
Oyefeso; Geller, D.A.; Granitsas, I.M.; Callaway, D.S.; Mathieu, J.L. A Hardware-in-the-Loop Experimental Testbed Using Air Conditioners for Grid Balancing. IEEE Trans. Smart Grid 2025.
Zheng, Y.; Li, H.; Wang, S.; Tan, Z.; Jiang, X.; Li, P.; Jiang, Y.; Zhang, H. Multi-agent coordination and uncertainty adaptation in deep learning–assisted hierarchical optimization for renewable-dominated distribution networks. Sci. Rep. 2026.
Zhang, Y.; Mazumdar, E. Provably Convergent Actor-Critic in Risk-averse MARL. arXiv preprint 2026.
Tan, R.; Li, R.; Yu, X.; Chen, X.; Xu, X.; Zhao, Z. Pareto actor-critic for communication and computation co-optimization in non-cooperative federated learning services. IEEE Trans. Mob. Comput. 2025.
R. Audrey, M.C. ADEME, B.J.R.d.l.s.l.c.d.c.c.l.a.i. Hélène, à la suite de l’installation de panneaux photovoltaïques, Crédoc. 2026, 122-124.
Skowroński, R. Liveness over Fairness (Part I): A Statistically Grounded Framework for Detecting and Mitigating PoW Wave Attacks. Information 2025, 16, 1060.
Li, W.; Feng, C.; Zhang, L.; Xu, H.; Cao, B.; Imran, M.A. A scalable multi-layer PBFT consensus for blockchain. IEEE Trans. Parallel Distrib. Syst. 2020, 32, 1146–1160.
Karniadakis, G.E.; Kevrekidis, I.G.; Lu, L.; Perdikaris, P.; Wang, S.; Yang, L. Physics-informed machine learning. Nat. Rev. Phys. 2021, 3, 422–440.
Misyris, G.S.; Venzke, A.; Chatzivasileiadis, S. Physics-informed neural networks for power systems. In 2020 IEEE Power & Energy Society General Meeting (PESGM); IEEE, 2020; pp. 1–5.
Han, J. Enhanced frequency control in future low-inertia power systems based on digital twins of distributed energy resources. 2025.
Barragán-Villarejo, M.; García-López, F.d.P.; Marano-Marcolini, A.; Maza-Ortega, J.M. Power system hardware in the loop (PSHIL): A holistic testing approach for smart grid technologies. Energies 2020, 13, 3858.
Fuller, A.; Fan, Z.; Day, C.; Barlow, C. Digital twin: Enabling technologies, challenges and open research. IEEE Access 2020, 8, 108952–108971.
Yu, W.; Patros, P.; Young, B.; Klinac, E.; Walmsley, T.G. Energy digital twin technology for industrial energy management: Classification, challenges and future. Renew. Sustain. Energy Rev. 2022, 161, 112407.
YanJun, Z.; Tajudin, N. Adaptive Manufacturing At Scale: Case Study On Digital Twin-MAS Synergy For Operational Excellence. 2025, 37, 170–177.
Han, H.; Zhang, H.; Yang, J.; Su, H. Distributed model predictive consensus control for stable operation of integrated energy system. IEEE Trans. Smart Grid 2023, 15, 381–393.
Thakur, D.; Guzzo, A.; Fortino, G.; Piccialli, F. Green federated learning: A new era of green aware AI. ACM Comput. Surv. 2025, 57, 1–36.
Nguyen, D.C.; Ding, M.; Pathirana, P.N.; Seneviratne, A.; Li, J.; Poor, H.V. Federated learning for internet of things: A comprehensive survey. IEEE Commun. Surv. Tutor. 2021, 23, 1622–1658.
Hudson, N.; Hossain, M.J.; Hosseinzadeh, M.; Khamfroush, H.; Rahnamay-Naeini, M.; Ghani, N. A framework for edge intelligent smart distribution grids via federated learning. In 2021 International Conference on Computer Communications and Networks (ICCCN); IEEE, 2021; pp. 1–9.
Alghamedy, F.H.; El-Haggar, N.; Alsumayt, A.; Alfawaer, Z.; Alshammari, M.; Amouri, L.; Aljameel, S.S.; Albassam, S. Unlocking a Promising Future: Integrating Blockchain Technology and FL-IoT in the journey to 6G. IEEE Access 2024, 12, 115411–115447.
Gomes, J.; Khan, S.; Svetinovic, D. Fortifying the blockchain: A systematic review and classification of post-quantum consensus solutions for enhanced security and resilience. IEEE Access 2023, 11, 74088–74100.
Cerezo, M.; Arrasmith, A.; Babbush, R.; Benjamin, S.C.; Endo, S.; Fujii, K.; McClean, J.R.; Mitarai, K.; Yuan, X.; Cincio, L.; et al. Variational quantum algorithms. Nat. Rev. Phys. 2021, 3, 625–644.
Aleksis, R.; Pell, A.J. Low-power synchronous helical pulse sequences for large anisotropic interactions in MAS NMR: Double-quantum excitation of 14N. J. Chem. Phys. 2020, 153.
Somvanshi, S.; Islam, M.M.; Chhetri, G.; Chakraborty, R.; Mimi, M.S.; Shuvo, S.A.; Islam, K.S.; Javed, S.; Rafat, S.A.; Dutta, A. From tiny machine learning to tiny deep learning: A survey. ACM Comput. Surv. 2025, 58, 1–33.
Osman, A.; Abid, U.; Gemma, L.; Perotto, M.; Brunelli, D. TinyML platforms benchmarking. In International Conference on Applications in Electronics Pervading Industry, Environment and Society; Springer, 2021; pp. 139–148.
Gweon, S.; Kang, S.; Kim, K.; Yoo, H.-J. FlashMAC: A time-frequency hybrid MAC architecture with variable latency-aware scheduling for TinyML systems. IEEE J. Solid-State Circuits 2022, 57, 2944–2956.
Kluczek, A.; Żegleń, P.; Matušíková, D. The use of Prospect theory for energy sustainable industry 4.0. Energies 2021, 14, 7694.
Kumar, S.; Datta, S.; Singh, V.; Datta, D.; Singh, S.K.; Sharma, R. Applications, challenges, and future directions of human-in-the-loop learning. IEEE Access 2024, 12, 75735–75760.
Andriopoulos, N.; Plakas, K.; Birbas, A.; Papalexopoulos, A. Design of a prosumer-centric local energy market: An approach based on prospect theory. IEEE Access 2024, 12, 32014–32032.
Shi, M.; Shahidehpour, M.; Zhou, Q.; Chen, X.; Wen, J. Optimal consensus-based event-triggered control strategy for resilient DC microgrids. IEEE Trans. Power Syst. 2020, 36, 1807–1818.
Wang, D.; Gong, C.; Li, Y.; Ma, H.; Li, T.; Luo, S. Behaviorally Embedded Multi-Agent Optimization for Urban Microgrid Energy Coordination Under Social Influence Dynamics. Energies 2026, 19, 687.
Xia, Y.; Wang, K.; Ji, H.; Wang, Q.; Shi, L.; Wu, F. Dynamic Network Usage Fee for Joint Kilowatt and Negawatt Peer-to-Peer Energy Market Based on Knowledge Distillation. IEEE Trans. Smart Grid 2025.
Liu, Y.; Xu, X.; Liu, Y.; Liu, J.; Hu, W.; Yang, N.; Jawad, S.; Wei, Z. A multi-agent decision-making framework for planning and operating human-factor-based rural community. J. Clean. Prod. 2024, 440, 140888.
Doumen, S.C.; Hönen, J.; Nguyen, P.; Hurink, J.L.; Zwart, B.; Kok, K. Modeling and demonstrating the effect of human decisions on the distribution grid. In 2023 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT); IEEE, 2023; pp. 1–5.
Mitchell Finnigan, S.J. Human-Centred Smart Buildings: Reframing Smartness Through the Lens of Human-Building Interaction; Newcastle University, 2020.
Iqbal, M.A. Grid modernization: From traditional to AI-optimized self-healing networks. 2026, 4, 386–396.
Maurya, A. Self-Healing Schemes in Modern Advanced Power Distribution Networks: An Overview. In 2024 IEEE 11th Power India International Conference (PIICON); IEEE, 2024; pp. 1–6.
Alyas, T.; Abbas, Q.; Hayat, A.; Nawaz, W.; Alshanqiti, A.; Almoamari, H.; Ibrahim, A.M. Toward intelligent IoT scheduling with quantum-inspired latent models for energy and latency optimization. Sci. Rep. 2026.
Li, M.; Yu, F.R.; Si, P.; Wu, W.; Zhang, Y. Resource optimization for delay-tolerant data in blockchain-enabled IoT with edge computing: A deep reinforcement learning approach. IEEE Internet Things J. 2020, 7, 9399–9412.
Cavone, G.; Bozza, A.; Carli, R.; Dotoli, M. MPC-based process control of deep drawing: An industry 4.0 case study in automotive. IEEE Trans. Autom. Sci. Eng. 2022, 19, 1586–1598.
Yao, Z.; Fang, Y.; Pan, H.; Wang, X.; Si, X. A secure and highly efficient blockchain PBFT consensus algorithm for microgrid power trading. Sci. Rep. 2024, 14, 8300.
Zhou, Z.; Xu, H. Large-scale multiagent system tracking control using mean field games. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 5602–5610.
Mustafa, H.M.; Bariya, M.; Sajan, K.; Chhokra, A.; Srivastava, A.; Dubey, A.; von Meier, A.; Biswas, G. RT-METER: A real-time, multi-layer cyber-power testbed for resiliency analysis. In Proceedings of the 9th Workshop on Modeling and Simulation of Cyber-Physical Energy Systems; 2021; pp. 1–7.
Dandasi, V.V.; Gurubasannavar, S.D.; Sunku, R. Enhancing Smart Grid Security: A Multi-Criteria Evaluation Through GRA Method. 2023, 14, 153–167.
Sudarsan, A.; Kurukkanari, C.; Bendi, D. A state-of-the-art review on readiness assessment tools in the adoption of renewable energy. Environ. Sci. Pollut. Res. 2023, 30, 32214–32229.
Ringelstein, J.; Marten, F.; Vogt, M.; Banerjee, G.; Bao, J. pandaict: a Novel Open-Source Tool for modeling Cyber-Physical Energy Systems. In ETG Kongress 2025: Voller Energie–heute und morgen; VDE, 2025; pp. 39–46.
Patel, H.D.; Alam, R.; Gupta, G.N. Continuous variable analyses: Student’s t-test, rank-sum test, and signed-rank test. In Translational Urology; Elsevier, 2025; pp. 109–114.
Dolev, D.; Reischuk, R.; Schneider, F.B.; Strong, H.R. Time Services (Dagstuhl Seminar 9611); Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2021.
Dawson, C.; Gao, S.; Fan, C. Safe control with learned certificates: A survey of neural Lyapunov, barrier, and contraction methods for robotics and control. IEEE Trans. Robot. 2023, 39, 1749–1767.
Lavaei, A.; Soudjani, S.; Frazzoli, E. Safety barrier certificates for stochastic hybrid systems. In 2022 American Control Conference (ACC); IEEE, 2022; pp. 880–885.
Luo, H.; Yang, X.; Yu, H.; Sun, G.; Lei, B.; Guizani, M. Performance analysis and comparison of nonideal wireless PBFT and RAFT consensus networks in 6G communications. IEEE Internet Things J. 2023, 11, 9752–9765.
Reif, V.; Strasser, T.I.; Jimeno, J.; Farre, M.; Genest, O.; Gyrard, A.; McGranaghan, M.; Lipari, G.; Schütz, J.; Uslar, M. Towards an interoperability roadmap for the energy transition. e & i Elektrotechnik und Informationstechnik 2023, 140, 478–487.
Jesus, V.S.d.; Lazarin, N.M.; Pantoja, C.E.; Manoel, F.C.P.B.; Alves, G.V.; Viterbo, J. A middleware for providing communicability to Embedded MAS based on the lack of connectivity. Artif. Intell. Rev. 2023, 56.
Devarajan, G.; Naveen Kumar, P.; Chinnusamy, M.; Kanagaraj, S.; Chenniappan, S. Modeling and simulation of smart power systems using HIL. In Artificial Intelligence-based Smart Power Systems; 2023; pp. 291–309.
Bennetot, A.; Donadello, I.; El Qadi El Haouari, A.; Dragoni, M.; Frossard, T.; Wagner, B.; Sarranti, A.; Tulli, S.; Trocan, M.; Chatila, R. A practical tutorial on explainable AI techniques. ACM Comput. Surv. 2024, 57, 1–44.
Cifci, A. Interpretable prediction of a decentralized smart grid based on machine learning and explainable artificial intelligence. IEEE Access 2025.
Stepanova, A.I.; Khalyasmaa, A.I.; Matrenin, P.V.; Eroshenko, S.A. Application of SHAP and Multi-Agent Approach for Short-Term Forecast of Power Consumption of Gas Industry Enterprises. Algorithms 2024, 17, 447.
Virgili, M.; Babu, N.; Javidsharifi, M.; Valiulahi, I.; Masouros, C.; Forsyth, A.J.; Kerekes, T.; Papadias, C.B. Cost-efficient design of an energy-neutral UAV-based mobile network. IEEE Trans. Commun. 2022, 70, 6890–6901.
Zhang, T.; Chen, W. Computation offloading in heterogeneous mobile edge computing with energy harvesting. IEEE Trans. Green Commun. Netw. 2021, 5, 552–565.
Chen, P.; Liu, S.; Wang, X.; Kamwa, I. Physics-guided multi-agent deep reinforcement learning for robust active voltage control in electrical distribution systems. IEEE Trans. Circuits Syst. I Regul. Pap. 2023, 71, 922–933.
Shi, N.; Cheng, R.; Liu, L.; Wang, Z.; Zhang, Q.; Reno, M.J. Data-driven affinely adjustable robust Volt/VAr control. IEEE Trans. Smart Grid 2023, 15, 247–259.
Lanza, L.; Faulwasser, T.; Worthmann, K. Distributed optimization for energy grids: a tutorial on ADMM and ALADIN. arXiv preprint 2024.
Guo, M.; De Persis, C.; Tesi, P. Data-driven stabilization of nonlinear polynomial systems with noisy data. IEEE Trans. Autom. Control 2021, 67, 4210–4217.
Ali, A.; Li, C.; Hredzak, B. Dynamic voltage regulation in active distribution networks using day-ahead multi-agent deep reinforcement learning. IEEE Trans. Power Deliv. 2024, 39, 1186–1197.
Cheng, W.; Zhang, K.; Jiang, B. Hierarchical structure-based fixed-time optimal fault-tolerant time-varying output formation control for heterogeneous multiagent systems. IEEE Trans. Syst. Man Cybern. Syst. 2023, 53, 4856–4866.

Figure 1. Multi-Agent System Architecture for Active Grid Periphery.

Figure 2. PRISMA 2020 flow diagram of the systematic literature selection process, showing the identification, screening, eligibility, and inclusion stages leading to 160 Q1 journal articles.

Figure 3. Effect of communication delay on consensus convergence in distributed MAS control. The dashed line indicates the practical failure threshold at 300 ms.

Figure 4. Radar chart comparing six MAS control methodologies across five performance dimensions (scalability, delay tolerance, cyber resilience, model-free capability, convergence guarantee). Scores are normalized medians from 160 surveyed papers (0 = lowest, 1 = highest).

Figure 5. Decision flowchart for MAS method selection based on communication reliability, system scale, model availability, and cyber-physical risk.

Figure 6. Decision Flowchart for MAS Method Selection.

Table 3. Critical Comparison of MAS Control Methods for Active Periphery.

Method	Scalability	Delay Tolerance	Cyber Resilience	Model-Free	Convergence Guarantee	Reference
Consensus	Medium	Low	Medium	Yes	Yes (under connectivity)	[83]
Game Theory	Low	Medium	Low	Yes	No (multiple equilibria)	[84]
DMPC	Medium	Low	Low	No	Yes (with constraints)	[85]
MARL	High	High	Medium	Yes	No (exploration needed)	[86]
Blockchain-MAS	Low	Very Low	High*	Yes	No (probabilistic)	[87]
ADMM	Medium	Low	Medium	No	Yes (convex)	[88]

* High cyber resilience only for networks with ≤100 agents under PBFT; larger systems face latency issues.
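
The size limit in the footnote above can be related to the standard PBFT sizing arithmetic. The following is a minimal sketch, not taken from the surveyed papers: the function name `pbft_sizing` is hypothetical, and the message count is a rough normal-case estimate (pre-prepare, prepare, commit) that ignores client requests and replies, checkpoints, and view changes.

```python
def pbft_sizing(n):
    """Return (max_faulty, quorum, approx_messages) for an n-replica PBFT group.

    Hypothetical helper for illustration only.
    """
    if n < 4:
        raise ValueError("PBFT needs at least 4 replicas to tolerate one fault")
    # Byzantine fault tolerance requires n >= 3f + 1.
    f = (n - 1) // 3
    # Two quorums must overlap in at least f + 1 replicas:
    # 2q - n >= f + 1, so q = ceil((n + f + 1) / 2); this equals 2f + 1 when n = 3f + 1.
    quorum = (n + f + 2) // 2
    # Normal case: pre-prepare (n - 1) + prepare (n - 1)(n - 1) + commit n(n - 1).
    messages = (n - 1) + (n - 1) * (n - 1) + n * (n - 1)
    return f, quorum, messages


for agents in (4, 25, 100):
    print(agents, pbft_sizing(agents))
```

Under these assumptions the per-request message count is 2n(n - 1), so it grows quadratically with the number of agents, which is consistent with the footnote's restriction of the "High" rating to small groups.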

Table 5. Summary of Research Gaps with Mathematical Conjectures.

Gap ID	Domain	Conjecture	Proposed Validation
G1	Consensus	No convergence under arbitrary disconnection	Counterexample construction
G2	Cyber-physical	Stability requires coupled analysis	Simulation with voltage-comm coupling
G3	MARL	Non-stationary → no convergence	Counterexample MDP
G4	Game theory	Lying prosumers break Nash equilibrium	Mechanism design impossibility
G5	Blockchain	PoW latency > control horizon	Lower bound proof
G6	Physics+ML	Unconstrained actions inevitable	No-free-lunch theorem extension

Table 7. State-of-the-Art Summary of MAS Methods for Active Power Peripheries.

Method	Advantages	Disadvantages	Quantitative Accuracy / Performance (State of the Art)
Consensus-based distributed control [38,55,56,60]	No central coordinator; resilient to single-point failures; simple to implement	Slow convergence for large N; sensitive to delays (>300 ms); requires connected graph	Convergence iterations: ~20 (0 ms delay) to >120 (500 ms) (Figure 3). Communication reduction up to 60% with event-triggering [60]. Voltage regulation error ≤0.02 p.u. under ideal conditions [55,56].
Game-theoretic approaches [61,62,63,64]	Accounts for prosumer self-interest; supports local energy trading	High computational cost O(n³); non-unique Nash equilibria; vulnerable to asymmetric information	Nash equilibrium reached in ≤50 iterations for ≤20 prosumers [61]. Mean-field games reduce complexity to O(n) for large populations [64].
Distributed MPC (DMPC) [67,68,69,87]	Handles constraints explicitly; optimal under accurate model	Requires precise system model (often unavailable); low delay tolerance; not model-free	Settling time improvement: 30% faster than centralized MPC in benchmark tests [67]. Constraint satisfaction rate >95% [68].
Multi-agent RL (MARL) [73,75,76,88]	Model-free; adapts to uncertainty; high delay tolerance (up to 500 ms)	No convergence guarantees; sample inefficient; lacks safety guarantees	Voltage violation reduction: 40-50% in day-ahead scheduling [73]. Convergence not guaranteed - exploration needed [88]. Safe RL with barrier functions reduces violations by 70% [76].
Blockchain-integrated MAS [77,79,80,103]	Immutable audit trail; high cyber resilience (PBFT for ≤100 agents)	High latency (minutes to finality); energy overhead; poor scalability	Finality latency: 10-60 min (PoW) [103]; PBFT adds <1 s for ≤100 nodes but latency grows exponentially [80]. Off-chain solutions reduce overhead by 90% [79].
ADMM (optimization-based) [81,82,84,89]	Global optimality under convex problems; fully distributed	Slow for non-convex AC power flow; not model-free; low delay tolerance	Convergence rate: O(1/k) for convex problems [89]; non-convex variants require 2-3× more iterations [84]. Optimal power flow solved in <100 iterations for IEEE 123-bus [82].
Hierarchical control (IEC 61850), baseline [55,66,90,91]	Deterministic latency; mature cybersecurity; industry standard	Single point of failure; limited scalability (~100-200 nodes)	Delay tolerance up to 1 s with proper tuning [90]; scalability bottleneck beyond 200 nodes [5,6].
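
The delay sensitivity reported for consensus-based control in the first row of Table 7 can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the setup behind Figure 3: the six-agent ring, the step size `eps`, the tolerance, and the function name `consensus_iterations` are all hypothetical, and delay is counted in iterations rather than milliseconds. Each agent uses its own current state and neighbour states that are `delay_steps` iterations old.

```python
import numpy as np


def consensus_iterations(delay_steps, eps=0.2, tol=1e-3, max_iter=5000):
    """Count iterations until six agents on a ring agree to within `tol`.

    Update rule (assumed for illustration):
        x_i(k+1) = x_i(k) + eps * sum_j a_ij * (x_j(k - d) - x_i(k))
    Returns None if agreement is not reached within `max_iter` iterations.
    """
    n = 6
    # Adjacency matrix of an undirected ring: each agent has two neighbours.
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = 1.0
        A[i, (i - 1) % n] = 1.0
    degree = A.sum(axis=1)

    x0 = np.arange(n, dtype=float)  # initial states 0, 1, ..., 5
    # history[-1] is the current state, history[0] is the state d steps ago.
    history = [x0.copy() for _ in range(delay_steps + 1)]

    for k in range(1, max_iter + 1):
        x_now = history[-1]
        x_delayed = history[0]
        x_next = x_now + eps * (A @ x_delayed - degree * x_now)
        history.append(x_next)
        history.pop(0)
        if x_next.max() - x_next.min() < tol:
            return k
    return None


for d in (0, 1, 5):
    print(f"delay = {d} steps -> iterations: {consensus_iterations(d)}")
```

In this toy setup the delayed case needs more iterations than the delay-free case, which matches the qualitative trend in the table; the absolute iteration counts depend entirely on the assumed graph, step size, and tolerance and should not be compared with the values reported from the literature.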

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
