This section formulates a discrete-time multi-agent model that abstracts a theatre workshop. A distinctive feature of the model is that Hill-function nonlinearity and inter-variable coupling are built into the predictive model itself: agents hold these dynamics as an internal model and reflect them in action selection through the Risk term of Expected Free Energy (EFE).
All simulations were implemented in Python using standard numerical libraries.
4.2. State Variables
This subsection introduces the set of latent and internal variables required to express the workshop dynamics as a tractable multi-agent POMDP on a network. We distinguish between a shared, partially observable field state (collective trust), dyadic relational states (interpersonal trust), and agent-specific internal states (empowerment and stamina). Inter-agent coupling is structured by a sparse network, where agents perceive a local trust-weighted average of others’ behavior. This design choice makes explicit what agents must infer (collective and interpersonal trust) versus what they can access directly (their own empowerment and stamina), which is central for the epistemic term in Expected Free Energy.
i: Agent (participant) index
t: Discrete time step
: Relative time within planning horizon
S: Collective trust (Gaussian latent variable)
$u_i$: Agent i’s empowerment (Gaussian latent variable)
$H_i$: Agent i’s stamina
$W_{ij}$: Interpersonal trust between agents i and j (network edge weight)
Agents are embedded in a sparse undirected network with adjacency matrix , where . The network is generated as an Erdős–Rényi random graph with connection probability , where is the target average degree (default: 4; see Table 3). Each edge carries a dynamic interpersonal trust weight that evolves based on action synchrony (see Section 4.3.0.21). The neighborhood of agent i is denoted .
This network structure serves two purposes:
- 1. Avoiding complete mean-field approximation: Rather than assuming all agents observe the same global expression rate , each agent perceives a local expression rate computed over its neighborhood.
- 2. Maintaining computational tractability: The sparse network topology () and local averaging keep inference tractable while introducing heterogeneous social perception.
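The network construction described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: it assumes the usual Erdős–Rényi relation in which the connection probability equals the target average degree divided by N − 1, and the names `erdos_renyi_adjacency` and `neighborhood` are hypothetical.

```python
import random


def erdos_renyi_adjacency(n_agents, target_degree=4, seed=None):
    """Sample a symmetric 0/1 adjacency matrix with no self-loops.

    Assumption: connection probability p = target_degree / (n_agents - 1),
    so that the expected degree of every node equals target_degree.
    """
    rng = random.Random(seed)
    p = min(1.0, target_degree / (n_agents - 1))
    adj = [[0] * n_agents for _ in range(n_agents)]
    for i in range(n_agents):
        for j in range(i + 1, n_agents):
            if rng.random() < p:
                # Undirected edge: set both entries.
                adj[i][j] = adj[j][i] = 1
    return adj


def neighborhood(adj, i):
    """Return the list of neighbors of agent i."""
    return [j for j, connected in enumerate(adj[i]) if connected]


if __name__ == "__main__":
    adj = erdos_renyi_adjacency(20, target_degree=4, seed=0)
    degrees = [len(neighborhood(adj, i)) for i in range(20)]
    print("mean degree:", sum(degrees) / len(degrees))
```

Each edge sampled this way would then carry its own interpersonal trust weight; isolated agents (empty neighborhoods) are possible and have to be handled by whatever consumes the neighborhood lists.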
A key design principle is that the primary state variables S and u are modeled as Gaussian latent variables without artificial clipping to $[0, 1]$. This ensures theoretical consistency with the Gaussian assumptions underlying belief updating (Extended Kalman Filter) and information gain computation. Physical effects that require bounded values are obtained through sigmoid transformations:
Interpretation of Unbounded States:
Negative S (distrust): Values represent collective distrust or psychological unsafety. The sigmoid transformation ensures that negative trust does not produce paradoxical amplification effects.
Negative u (disempowerment): Values represent a sense of powerlessness or learned helplessness.
Large positive values: Values significantly above 1 represent strong trust or high empowerment, with diminishing marginal physical effects due to sigmoid saturation.
This design choice keeps the model’s state dynamics and belief updating aligned with the Gaussian assumptions used in inference (Extended Kalman Filter) and in the information gain computation. The information gain expression is exact under linear–Gaussian assumptions. Accordingly, we keep S and u unconstrained and apply sigmoid transformations only when mapping latent variables to bounded interaction terms, maintaining internal consistency of the computations.
$a_i$: Agent i’s action (0: Rest, 1: Chat & Exercise, 2: Express)
Expression indicator: 1 if agent i’s action is Express (2), and 0 otherwise
The three-valued action space distinguishes between:
Rest (a = 0): Passive observation with stamina recovery.
Chat & Exercise (a = 1): Low-cost social interaction that provides information gain about interpersonal trust (via ) and also updates $W_{ij}$ through action synchrony.
Express (a = 2): High-commitment creative expression that consumes stamina, builds collective trust S via the Hill-function mechanism, and updates interpersonal trust through action synchrony, but does not provide information gain (see Supplementary Material (Sec. S-X) for sensitivity analysis of this design choice).
The model supports two modes of collective coupling:
(a) Global statistics (used for collective trust S dynamics):
(b) Local statistics (used for empowerment and stamina dynamics):
where $\mathcal{N}_i$ is agent i’s neighborhood, $\sigma(\cdot)$ is the sigmoid function, and $\epsilon$ is a small constant for numerical stability.
The local expression rate is a trust-weighted average over the agent’s neighborhood: neighbors with higher interpersonal trust contribute more to the perceived social context. This replaces the mean-field statistic used in earlier formulations, enabling heterogeneous social perception while maintaining computational tractability.
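The two coupling statistics can be illustrated with a short sketch. It assumes that the local rate is a weighted average of neighbors' expression indicators, with weights given by sigmoid-transformed interpersonal trust and a small constant added to the denominator; the function names and the nested-list layout of `trust` are hypothetical choices made for this example.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def global_expression_rate(actions):
    """Fraction of all agents currently choosing Express (action 2)."""
    return sum(1 for a in actions if a == 2) / len(actions)


def local_expression_rate(i, actions, neighbors, trust, eps=1e-8):
    """Trust-weighted expression rate perceived by agent i.

    Assumption: weights are sigmoid(trust[i][j]); eps guards against
    division by zero for agents with no neighbors.
    """
    num = 0.0
    den = 0.0
    for j in neighbors[i]:
        w = sigmoid(trust[i][j])
        num += w * (1.0 if actions[j] == 2 else 0.0)
        den += w
    return num / (den + eps)


if __name__ == "__main__":
    actions = [0, 2, 0, 1]
    neighbors = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
    trust = [[0.0] * 4 for _ in range(4)]
    trust[0][1] = trust[1][0] = 2.0  # agent 0 trusts the expresser more
    print(global_expression_rate(actions))
    print(local_expression_rate(0, actions, neighbors, trust))
```

Because the weights pass through a sigmoid, a strongly distrusted neighbor contributes almost nothing to the perceived rate rather than a negative amount.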
The model employs an intentional asymmetry in information scope that reflects the physical structure of theatre workshop settings:
Global coupling for collective trust S: Expressive acts (Express) are publicly visible performances that all participants can observe simultaneously, akin to stage performances visible to the entire room. Accordingly, the collective trust S is updated based on the global expression rate (Eq. 11).
Local coupling for empowerment u and stamina H: In contrast, the psychological impact of others’ expression on one’s own sense of empowerment and fatigue depends on whom one is attending to, typically nearby participants with whom one has established interpersonal trust. Hence, empowerment (Eq. 12) and stamina dynamics use the local trust-weighted expression rate.
Local coupling for interpersonal trust $W_{ij}$: Dyadic trust evolves based on pairwise action synchrony between connected agents (Eq. 15).
This two-layer structure—global media (publicly visible expression affecting shared trust climate) combined with local communication (interpersonal relationships affecting individual empowerment)—generalizes beyond theatre workshops. Analogous patterns appear in organizational settings (company-wide announcements vs. team-level interactions), social media (viral content vs. direct messaging), and community dynamics (public gatherings vs. neighborhood conversations).
In Active Inference, agents minimize variational free energy to approximate the posterior distribution over hidden states given observations [3, 4]. Since collective trust S and interpersonal trust $W_{ij}$ are not directly observable, agent i maintains a probabilistic belief over them. We employ a Gaussian approximation for these beliefs, parameterized by their sufficient statistics (mean and variance). This belief state constitutes the agent’s “perception” of the social environment and serves as the basis for calculating the epistemic value (information gain) of future actions.
: Mean and variance of belief about collective trust S
: Mean and variance of belief about interpersonal trust
Active Inference unifies perception and action under the single imperative of minimizing free energy. Goal-directed behavior arises from the agent’s prior preferences over states. Action selection minimizes Expected Free Energy (EFE), which includes a “Risk” term defined as the KL divergence between the predicted state distribution and the preferred distribution [4]. In this model, agents possess preferred setpoints for trust, empowerment, and stamina. Crucially, unlike fixed reward functions in standard RL, these preference parameters (specifically the empowerment setpoint) can themselves be updated through learning (see Section 4.7).
: Preferred collective trust level
: Preferred empowerment level (adaptively learned)
: Preferred stamina level
4.3. State Transition Dynamics
We specify a discrete-time generative process in which (i) collective trust evolves as a leaky (AR(1)) process with a nonlinear, threshold-like social amplification term, (ii) empowerment accumulates through self- and other-driven gains modulated by trust, and (iii) stamina implements a simple resource constraint with recovery and cooperative cost reduction.
where is process noise, is the global expression rate (Eq. 10), and is the global chat rate.
Here, captures gradual decay toward a baseline, the Hill term implements a sharp increase in trust once expression becomes sufficiently common (critical threshold K), and the linear term provides a modest contribution from Chat & Exercise activity.
Parameters:
: Decay coefficient ()
: Baseline constant
: Hill function effect strength (Express)
: Linear effect strength (Chat & Exercise)
The Hill function term captures the nonlinear amplification of trust when group expression exceeds the critical threshold K. Conceptually, this corresponds to the idea that a psychologically safe climate can emerge nonlinearly once sufficient interpersonal risk-taking (here, expressions) becomes common [26, 27].
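A rough sketch of the collective trust update may help make the threshold behavior concrete. It assumes an additive form (leaky decay, baseline, Hill term on the global expression rate, linear term on the global chat rate, Gaussian noise) and the standard Hill function x^n / (K^n + x^n); all parameter names and default values below are illustrative placeholders, not the values used in the paper.

```python
import random


def hill(x, K, n):
    """Standard Hill function x^n / (K^n + x^n) for x >= 0."""
    xn = max(x, 0.0) ** n
    return xn / (K ** n + xn)


def step_collective_trust(S, expr_rate, chat_rate,
                          alpha=0.9, beta=0.0,
                          gamma_expr=0.5, gamma_chat=0.05,
                          K=0.4, n=4, noise_sd=0.0, rng=None):
    """One step of the assumed AR(1)-plus-Hill update for collective trust.

    All defaults are hypothetical; S is left unclipped, as in the text.
    """
    rng = rng or random
    noise = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
    return (alpha * S + beta
            + gamma_expr * hill(expr_rate, K, n)
            + gamma_chat * chat_rate
            + noise)


if __name__ == "__main__":
    # Trust gain is small below the threshold K and rises sharply above it.
    for rate in (0.1, 0.3, 0.4, 0.5, 0.8):
        print(rate, round(step_collective_trust(0.0, rate, 0.0), 4))
```

With a Hill coefficient of 4, moving the expression rate from 0.3 to 0.5 changes the trust increment far more than moving it from 0.5 to 0.8, which is the threshold-like behavior the text attributes to K.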
where is process noise, is the sigmoid function (Eq. 7), and is the local expression rate (Eq. 10).
The cooperative factor uses sigmoid-transformed empowerment to prevent runaway positive feedback. The trust–empowerment coupling term makes vicarious gains systematically larger when inferred trust is high. Both sigmoid transformations ensure bounded contributions regardless of the magnitude of the underlying Gaussian latent variables.
Parameters:
: Decay coefficient ()
: Empowerment increase from self-expression
: Empowerment increase from others’ expression
: Cooperative amplification strength
: Trust–empowerment coupling coefficient
Theoretical Significance of Trust–Empowerment Coupling:
The underlined term captures the following effect: when the field trust S is high, the positive effect of neighbors’ expression on one’s own empowerment is amplified.
Treatment of Negative Trust: The trust variable S is modeled as a Gaussian latent variable (Eq. 11) and can take any real value. Physical effects that depend on trust use sigmoid-transformed values, ensuring bounded contributions:
This design has several advantages:
- 1. Theoretical consistency: The Kalman filter belief updating and information gain formulas remain exact for the underlying Gaussian process.
- 2. Asymmetric interpretation: Negative trust (distrust) produces near-zero effective coupling, while positive trust produces positive coupling. This captures the psychological observation that distrust does not symmetrically reverse trust effects; rather, it attenuates or nullifies them.
- 3. Smooth dynamics: The sigmoid transformation is differentiable everywhere, avoiding discontinuities that could destabilize the dynamics.
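The asymmetry described in point 2 can be seen in a small numerical sketch. The functional form below (a coupling coefficient times sigmoid-transformed trust times the local expression rate) is an assumption made for illustration, and `trust_coupled_gain` and `lam` are hypothetical names; the full empowerment update contains further terms that are not reproduced here.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def trust_coupled_gain(S, local_expr_rate, lam=0.3):
    """Assumed trust-gated vicarious gain: lam * sigmoid(S) * local rate.

    The contribution is bounded in [0, lam] for any real-valued S.
    """
    return lam * sigmoid(S) * local_expr_rate


if __name__ == "__main__":
    # Distrust (negative S) attenuates the gain toward zero instead of
    # reversing its sign; strong trust saturates it near lam.
    for S in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(S, round(trust_coupled_gain(S, 1.0), 4))
```

The gain never becomes negative, so distrust weakens the effect of neighbors' expression without turning it into a penalty.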
This coupling term models several interrelated psychological phenomena. In trusted environments, others’ expressions more readily empower the observer through empathic resonance (witnessing a breakthrough can feel as if it were one’s own) and through imitation and contagion of expressive content [26, 27]. Social learning is facilitated: others’ successful experiences transfer more easily to one’s own self-efficacy in safe contexts [24, 28], and supportive environments promote internalization and autonomous motivation [29]. The resulting interaction between S and u creates positive feedback that supports multistability (here, bistability).
This piecewise update implements recovery during rest, mild depletion during chat & exercise, and significant depletion during expression, with costs reduced when neighbors are also expressing (cooperative cost reduction via the local expression rate). The min/max operators keep stamina between zero and the maximum stamina, representing a hard resource constraint.
Parameters:
: Maximum stamina
: Stamina recovery during rest
: Chat & Exercise cost (small)
: Basic expression cost
: Cooperative cost reduction coefficient
: Hill coefficient (stamina)
: Half-saturation constant (stamina)
The factor implements cooperative cost reduction: expression costs less when neighbors are also expressing. This positive feedback mechanism is key to enabling multistability. This stylized assumption is consistent with classic findings that the presence of others can systematically modulate effort and performance (social facilitation) [30], and with psychophysical accounts that treat effort as a subjective, reportable quantity (ratings of perceived exertion) [31].
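The piecewise stamina update with cooperative cost reduction can be sketched as below. The specific form of the reduction factor, 1 − δ · Hill(local expression rate), is an assumption consistent with the parameter list above, and every name and default value in the code is a hypothetical placeholder.

```python
def hill(x, K, n):
    """Standard Hill function x^n / (K^n + x^n) for x >= 0."""
    xn = max(x, 0.0) ** n
    return xn / (K ** n + xn)


def step_stamina(H, action, local_expr_rate,
                 H_max=1.0, recover=0.1, cost_chat=0.02,
                 cost_expr=0.2, delta=0.5, K_H=0.4, n_H=4):
    """One stamina step: 0 = Rest, 1 = Chat & Exercise, 2 = Express.

    Assumed cooperative factor: 1 - delta * hill(local_expr_rate).
    """
    if action == 0:
        H_next = H + recover
    elif action == 1:
        H_next = H - cost_chat
    elif action == 2:
        reduction = 1.0 - delta * hill(local_expr_rate, K_H, n_H)
        H_next = H - cost_expr * reduction
    else:
        raise ValueError("action must be 0, 1, or 2")
    # Hard resource constraint: clamp to [0, H_max].
    return min(H_max, max(0.0, H_next))


if __name__ == "__main__":
    print(step_stamina(0.5, 2, 0.0))  # expressing alone: full cost
    print(step_stamina(0.5, 2, 0.9))  # expressing with neighbors: reduced cost
    print(step_stamina(0.95, 0, 0.0))  # rest: recovery capped at H_max
```

Unlike trust and empowerment, stamina is clamped here, matching the text's description of a hard resource constraint.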
Sociological Justification for the Hill Function: While the Hill function is originally derived from biochemistry (cooperative oxygen binding to hemoglobin), we employ it here as a phenomenological ansatz for threshold-dependent social reinforcement. This aligns with sociological theories of “critical mass” and “complex contagion” [32], which posit that adoption of costly or risky behaviors (like expressive performance) requires social reinforcement from multiple sources exceeding a specific threshold, unlike simple information spreading. The Hill coefficient n controls the steepness of this threshold, effectively modeling how strictly the group enforces this critical mass requirement.
From an ecological perspective, the Hill function formalizes an affordance structure: the collective environment either affords or constrains sustained expression depending on whether participation exceeds the critical threshold. This framing clarifies that we are specifying what the environment makes possible—a physical and psychosocial constraint—rather than what agents should prefer. The collective patterns that emerge are not built into preferences but arise from agents navigating this affordance landscape under EFE minimization.
We use a Hill-type nonlinearity as a compact way to represent cooperative, threshold-like changes in endurance costs. The Hill function is classically associated with cooperative oxygen binding and the resulting sigmoidal saturation curve [22], and endurance physiology highlights oxygen transport as a key determinant of fatigue resistance [33]. By analogy, when group expression exceeds a critical threshold, a “collective oxygen supply” effect emerges: mutual entrainment and synchronization among participants (well documented in sports science and rhythmic coordination [30]) reduces the subjective effort required for sustained expression. Empirically, perceived effort and fatigue exhibit nonlinear dependence on physiological and contextual factors [31]. In this model the Hill term is not a literal haemoglobin model; it is an abstract mechanism by which supportive collective conditions can reduce the effective cost of expression once participation exceeds a critical level.
In addition to the collective trust S, agents maintain dyadic interpersonal trust weights $W_{ij}$ on network edges. These weights evolve based on action synchrony, that is, whether paired agents take coordinated actions:
where is process noise and is the synchrony effect function:
Interpretation:
Express–Express synchrony (): Joint creative expression strongly builds interpersonal trust.
Chat & Exercise synchrony (): Mutual conversation and exercise moderately builds trust.
Asynchrony (): Mismatched actions (one expresses while the other rests) erode trust.
Table 2. Synchrony effect function.
| | Rest (0) | Chat & Ex. (1) | Express (2) |
|---|---|---|---|
| Rest (0) | | | |
| Chat & Ex. (1) | | | |
| Express (2) | | | |
The trust modulation ensures that interpersonal trust builds more readily in high collective trust environments. Since $W_{ij} = W_{ji}$ (undirected network), both agents experience the same trust update.
Key distinction: All actions (Rest, Chat & Exercise, Express) affect the true dynamics of $W_{ij}$ through the synchrony matrix. However, only Chat & Exercise provides information gain about $W_{ij}$ (see Eq. 54). This separation means that Express can build interpersonal trust through joint creative activity, but agents cannot directly observe this trust-building; they can only learn about interpersonal trust through Chat & Exercise.
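The dyadic update can be sketched as a leaky process driven by a synchrony lookup table. Two things in this example are assumptions rather than the authors' specification: the entries of `PHI` are hypothetical placeholders chosen only to match the qualitative description (strong gain for joint Express, moderate gain for joint Chat & Exercise, erosion for mismatched actions), and the modulation by collective trust is taken to be a multiplicative sigmoid factor. The decay, bias, strength, and noise defaults follow the network parameter table.

```python
import math
import random

# Hypothetical synchrony effects, indexed by [a_i][a_j]; symmetric by design.
PHI = (
    (0.0, -0.2, -0.5),   # Rest vs. Rest / Chat / Express
    (-0.2, 0.5, -0.2),   # Chat & Exercise vs. ...
    (-0.5, -0.2, 1.0),   # Express vs. ...
)


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def step_interpersonal_trust(W, a_i, a_j, S,
                             decay=0.90, bias=0.00, strength=0.40,
                             noise_sd=0.10, rng=None):
    """One update of an edge weight shared by agents i and j.

    Assumption: synchrony effect is scaled by sigmoid(S). Pass an
    rng (random.Random) to add process noise; omit it for a
    deterministic step.
    """
    noise = rng.gauss(0.0, noise_sd) if rng is not None else 0.0
    return decay * W + bias + strength * sigmoid(S) * PHI[a_i][a_j] + noise


if __name__ == "__main__":
    print(step_interpersonal_trust(0.0, 2, 2, S=1.0))  # joint expression
    print(step_interpersonal_trust(0.0, 2, 0, S=1.0))  # one expresses, one rests
    print(step_interpersonal_trust(0.0, 1, 1, S=1.0, rng=random.Random(0)))
```

Because the table is symmetric and one noise draw is used per edge, applying the update once per edge keeps the weight identical for both endpoints.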
Network Parameters:
Table 3. Network parameters and default values.
| Symbol | Description | Default |
|---|---|---|
| | Average network degree | 4 |
| | Interpersonal trust decay | 0.90 |
| | Interpersonal trust bias | 0.00 |
| | Synchrony effect strength | 0.40 |
| | Interpersonal trust noise | 0.10 |
For belief updating over $W_{ij}$, agents use variational message passing (VMP) with Jaakkola bounds to handle the sigmoid nonlinearity in Eq. 10. Details are provided in Supplementary Material (Sec. S-XIII).
4.7. Precision-Gated Preference Learning
A distinctive feature of our model is that preferences can update dynamically based on experience. This implements and extends Preferential Inference [4] to include adaptive preference parameter learning.
4.7.1. Theoretical Foundation: Hierarchical Bayesian Model
We formulate preference learning as variational inference over a hierarchical generative model. Rather than treating the preference parameter (the empowerment setpoint) as updated by ad hoc threshold rules, we hierarchically embed it as a latent hyperparameter subject to precision-weighted Bayesian updating.
Hierarchical Generative Model:
where is a “learning signal” derived from experienced empowerment, and is a trust-dependent observation variance (inverse precision).
4.7.2. Precision Modulation
The key mechanism is precision modulation: the precision (inverse variance) of the learning signal depends on the current trust level.
where is the logistic function, and controls the sharpness of the transition.
Interpretation: When trust is high (well above the trust threshold), the precision approaches its maximum, meaning observed empowerment is treated as a reliable signal for updating preferences. When trust is low, precision approaches its minimum, and the same observation has minimal influence on preference learning. This implements the psychological intuition that high empowerment experiences in untrusted environments are likely to be attributed to external factors (“just happened to work out”) rather than internalized as genuine capability.
Theatre workshop interpretation: A participant who spontaneously leads a scene during an early, awkward session may dismiss the experience as a fluke. The same experience in a cohesive, supportive group is more likely to be internalized as “I can do this”—updating the agent’s preference setpoint upward.
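Precision modulation can be sketched as a logistic interpolation between the minimum and maximum learning precision. The interpolation form is an assumption consistent with the description above; the default values are those listed in the preference learning parameter table (minimum 0.01, maximum 0.5, sharpness 4.0, threshold 0.5), and the function and argument names are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def learning_precision(S, pi_min=0.01, pi_max=0.5, sharpness=4.0, S_th=0.5):
    """Trust-dependent precision of the preference learning signal.

    Assumed form: pi_min + (pi_max - pi_min) * sigmoid(sharpness * (S - S_th)).
    """
    return pi_min + (pi_max - pi_min) * sigmoid(sharpness * (S - S_th))


if __name__ == "__main__":
    for S in (-1.0, 0.0, 0.5, 1.0, 2.0):
        print(S, round(learning_precision(S), 4))
```

At the threshold the precision sits exactly halfway between its bounds, and the sharpness parameter controls how quickly it moves toward either bound as trust departs from the threshold.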
4.7.3. Variational Update (Kalman Form)
Given the hierarchical model, the variational (or Kalman) update for the preference mean is:
where the Kalman gain is determined by the precision ratio:
and is the prior variance of the preference hyperparameter.
4.7.4. Empowerment Overshoots Latent Variable
To implement the empirical observation that comfort zones “expand but do not easily contract,” we introduce a latent empowerment overshoots indicator:
When , the empowerment experience is treated as a “mastery” signal that updates preferences. When , the experience is attributed to external factors and does not update preferences.
Theatre workshop interpretation: The empowerment overshoots correspond to Grotowski’s “removal of habitual defenses” [7]: a moment when the participant breaks through their usual shell and acts beyond their prior comfort zone. Only such threshold-crossing experiences expand the preference setpoint. Incremental improvements within the current comfort zone do not trigger this update, reflecting the experiential distinction between routine practice and transformative breakthrough.
Unidirectional expansion: By setting when , we ensure that preferences only update upward, never downward. This implements the “ratchet effect” whereby comfort zones expand irreversibly.
Combined Update: The effective update weight is , combining:
Precision gating (): How much to trust the observation (trust-dependent)
Mastery gating (): Whether the experience qualifies as genuine growth (gap-dependent)
Table 4. Preference learning parameters.
| Symbol | Description | Value |
|---|---|---|
| | Minimum learning precision (low trust) | 0.01 |
| | Maximum learning precision (high trust) | 0.5 |
| | Precision modulation sharpness | 4.0 |
| | Trust threshold for precision center | 0.5 |
| | Hyperstate drift noise | 0.005 |
| | Prior variance of preference | 0.1 |
| | Mastery detection sharpness | 5.0 |
| | Minimum gap for mastery detection | 0.3 |
4.7.5. Relationship to Prior Work
This formulation connects to several strands of Active Inference literature:
Preferential Inference: Da Costa et al. [4] define preferential inference as approximating with , where the preference model depends on history. Our extension treats preference parameters (not just the conditional distribution) as subject to inference.
Precision as confidence: The interpretation of precision as “confidence” or “reliability” is standard in Active Inference [34]. Here, we apply this principle to preference learning rather than just observation.
Empirical priors: Friston’s formulation of empirical priors [3] allows priors to depend on random variables learned from experience. Our preference hyperparameter functions as such an empirical prior.
4.8. EFE Computation and Action Selection
4.8.1. State Variable Prediction
Each state variable is predicted multiple steps ahead over the planning horizon. Note that the agent’s predictive model is a simplification of the true environmental dynamics. In particular, the trust–empowerment coupling term that appears in the environmental state transitions (Eq. 12) is omitted from the agent’s internal predictive model. This design choice reflects three considerations:
- 1. Partial observability of trust: Since S is only partially observable, the agent cannot condition predictions on its true value. Substituting the belief mean would add uncertainty and bookkeeping to multi-step predictions.
- 2. Complexity of social interaction: Predicting co-evolving others’ behavior and collective trust is cognitively demanding. Omitting this interaction term is consistent with bounded rationality: agents deploy a simpler internal model that respects realistic limits on prospective social prediction [35].
- 3. Conservative prediction: Without the trust-mediated bonus, predicted empowerment gains are conservative; any such gains realized in the environment appear as positive surprise.
This model mismatch is consistent with bounded rationality [35] and does not prevent effective action selection.
(a) Collective Trust S (Partially Observable → Expressed as Belief)
where mean and variance are computed recursively:
where is the expression indicator for action a.
(b) Empowerment u (Fully Observable → Point Estimate)
The agent’s predictive model for empowerment omits the trust–empowerment coupling:
where indicates that only Express actions contribute self-expression gains.
(c) Stamina H (Fully Observable → Point Estimate)
4.8.2. Risk Computation (All Four State Variables)
Risk is the KL divergence between predicted states and preferred states. Assuming independence of each state variable:
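(a) Risk for Trust (KL divergence between Gaussian distributions)
We quantify pragmatic deviation from preferences by the Kullback–Leibler divergence between the agent’s predicted belief over S and its preferred distribution. Let
$$q(S) = \mathcal{N}(\mu_S, \sigma_S^2), \qquad \tilde{p}(S) = \mathcal{N}(S^{*}, \sigma_{*}^{2}).$$
Then $D_{\mathrm{KL}}[q \,\|\, \tilde{p}]$ admits a closed form. Starting from
$$D_{\mathrm{KL}}[q \,\|\, \tilde{p}] = \mathbb{E}_q[\ln q(S)] - \mathbb{E}_q[\ln \tilde{p}(S)]$$
and using the standard Gaussian identities for quadratic expectations,
$$\mathbb{E}_q[(S - \mu_S)^2] = \sigma_S^2, \qquad \mathbb{E}_q[(S - S^{*})^2] = \sigma_S^2 + (\mu_S - S^{*})^2,$$
we obtain
$$D_{\mathrm{KL}}[q \,\|\, \tilde{p}] = \ln\frac{\sigma_{*}}{\sigma_S} + \frac{\sigma_S^2 + (\mu_S - S^{*})^2}{2\sigma_{*}^{2}} - \frac{1}{2}.$$
Implementation note: For computational efficiency, the implementation uses a quadratic approximation that omits the constant and logarithmic terms:
where is a weighting coefficient. This approximation is motivated by retaining only the terms that vary with action-dependent predictions: the mean-deviation penalty $(\mu_S - S^{*})^2 / (2\sigma_{*}^{2})$ and the uncertainty penalty $\sigma_S^2 / (2\sigma_{*}^{2})$. The remaining terms ($-\tfrac{1}{2}$ and the logarithmic ratio) are either constant under fixed preference variance or can introduce numerical instability when $\sigma_S^2$ becomes very small.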
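The relationship between the exact Gaussian KL divergence and a quadratic approximation of this kind can be checked numerically. The exact formula below is the standard closed form for two univariate Gaussians; the placement of the weighting coefficient in `quadratic_risk` is an assumption, and all names are hypothetical.

```python
import math


def gaussian_kl(mu, var, mu_pref, var_pref):
    """Exact KL[N(mu, var) || N(mu_pref, var_pref)] for scalar Gaussians."""
    return (0.5 * math.log(var_pref / var)
            + (var + (mu - mu_pref) ** 2) / (2.0 * var_pref)
            - 0.5)


def quadratic_risk(mu, var, mu_pref, var_pref, weight=1.0):
    """Quadratic approximation: mean-deviation plus uncertainty penalty.

    Drops the constant -1/2 and the logarithmic variance ratio.
    """
    return weight * ((mu - mu_pref) ** 2 + var) / (2.0 * var_pref)


if __name__ == "__main__":
    mu, var, mu_pref, var_pref = 0.2, 0.05, 1.0, 0.25
    exact = gaussian_kl(mu, var, mu_pref, var_pref)
    approx = quadratic_risk(mu, var, mu_pref, var_pref)
    # The gap equals the omitted terms: 0.5 * ln(var_pref / var) - 0.5.
    print(exact, approx, exact - approx)
```

The omitted logarithmic term grows without bound as the predicted variance shrinks toward zero, which illustrates the numerical concern mentioned in the implementation note.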
(b) Risk for Empowerment
Empowerment u is treated as (effectively) fully observable to the agent, so the pragmatic term does not require a belief distribution with substantial uncertainty. Formally, one can recover the squared-deviation form as a small-variance limit of a KL divergence. For example, represent the predicted state by a narrow Gaussian with , and the preference by . Then
so, up to constants and a rescaling, Risk reduces to a squared deviation from the preferred empowerment level. Accordingly, we use:
Implementation note: The implementation includes a predictive variance term to account for future uncertainty in u:
where accumulates process noise over the prediction horizon.
(c) Risk for Stamina
This form corresponds to the “fully observed” or “zero-variance” limit: if one were to represent stamina with a narrow Gaussian belief and take , the KL-based risk reduces (up to constants) to a squared deviation from the preferred level .
Implementation note: Similarly to empowerment, the implementation can include a predictive variance term for H to penalize uncertainty accumulation over the horizon:
where summarizes accumulated process noise in the stamina prediction.
(d) Risk for Interpersonal Trust (aggregated over neighborhood)
The Risk for interpersonal trust is computed using the aggregated interpersonal trust and its uncertainty:
where and are the mean and variance of agent i’s belief about $W_{ij}$.
The variance expression follows from a standard propagation-of-uncertainty argument. If the dyadic beliefs are treated as conditionally independent Gaussians given , then their average is also Gaussian with
which yields .
The Risk penalizes deviation from a neutral interpersonal trust preference :
This quadratic form can be viewed as the same approximation used for collective trust (Eq. 41), applied to the aggregated latent variable. Intuitively, it encourages neighborhood-averaged trust to stay near a “neutral” target while discouraging policies that leave large uncertainty in dyadic relations.
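The propagation-of-uncertainty step can be made concrete with a small sketch. It uses the standard result that the average of k independent Gaussians has mean equal to the average of the means and variance equal to the sum of the variances divided by k squared; the function name and the convention of returning zeros for an isolated agent are assumptions for this example.

```python
def aggregate_trust_belief(means, variances):
    """Mean and variance of the average of independent Gaussian beliefs.

    For k neighbors: mean = sum(means) / k, var = sum(variances) / k**2.
    Returns (0.0, 0.0) for an agent with no neighbors (a convention
    chosen for this sketch).
    """
    k = len(means)
    if k == 0:
        return 0.0, 0.0
    agg_mean = sum(means) / k
    agg_var = sum(variances) / (k * k)
    return agg_mean, agg_var


if __name__ == "__main__":
    # Four neighbors with equal uncertainty: averaging divides variance by 4.
    mean, var = aggregate_trust_belief([0.1, 0.3, -0.2, 0.6], [0.4] * 4)
    print(mean, var)
```

Under the independence assumption, the aggregated variance shrinks with the number of neighbors, so well-connected agents carry less uncertainty in the aggregated quantity even when each dyadic belief is equally uncertain.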
Parameters:
4.8.3. Information Gain Computation (Computed for latent S and W)
In our formulation, the information-gain (epistemic) term is computed for partially observable latent variables: the collective trust S and the interpersonal trust $W_{ij}$. Because u and H are directly observed at each time step in the model, they yield zero observation-based information gain under the assumed observation model. Importantly, this does not mean that epistemic value is irrelevant for predicting future internal dynamics: since the transitions of u and H depend (directly or indirectly) on S and W, reducing uncertainty about these latent variables also reduces uncertainty about future u and H trajectories through the predictive model.
Under the linear Gaussian model, information gain for S can be computed exactly:
where H is the observation matrix (linearized gradient of the observation function).
Implementation note: The implementation assumes a direct observation model with H = 1, yielding:
This simplification is appropriate when the observation function is approximately linear near the current belief mean.
Important Design: Since (expressing reduces observation noise), expression action increases information gain for S.
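For a scalar linear-Gaussian observation, the information gain equals half the log ratio of prior to posterior variance, which can be verified with a short sketch. The action-dependent noise values used in the demonstration are hypothetical placeholders, chosen only so that Express has the lower observation noise; the function names are likewise illustrative.

```python
import math


def posterior_variance(prior_var, obs_noise_var, h=1.0):
    """Posterior variance after one scalar Kalman update."""
    gain = prior_var * h / (h * h * prior_var + obs_noise_var)
    return (1.0 - gain * h) * prior_var


def information_gain(prior_var, obs_noise_var, h=1.0):
    """Expected information gain 0.5 * ln(1 + h^2 * prior_var / noise_var)."""
    return 0.5 * math.log(1.0 + h * h * prior_var / obs_noise_var)


if __name__ == "__main__":
    prior_var = 0.5
    # Hypothetical observation noise per action: Express observes S best.
    noise_by_action = {"Rest": 2.0, "Chat & Exercise": 1.0, "Express": 0.25}
    for name, r in noise_by_action.items():
        ig = information_gain(prior_var, r)
        post = posterior_variance(prior_var, r)
        # Both expressions for the information gain agree.
        print(name, round(ig, 4), round(0.5 * math.log(prior_var / post), 4))
```

With a direct observation model (h = 1), the gain depends only on the ratio of belief variance to observation noise, so an action that lowers the noise raises the epistemic value.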
A key design feature of the network model is that information gain about interpersonal trust is obtained exclusively through Chat & Exercise actions. This separation implements the distinction between:
Curiosity-driven social exploration (Chat & Exercise): Low-commitment interaction that reveals information about interpersonal relationships.
Expression-driven coordination (Express): High-commitment action focused on building collective trust and empowerment.
The information gain for interpersonal trust is:
where is the average variance of beliefs about neighbors’ trust. Note that the term is scaled by the neighborhood size, reflecting that Chat & Exercise provides information about all neighbors simultaneously. While this implies that agents with higher degrees (more connections) can potentially gain more total information, this bias is consistent with the social reality that well-connected individuals have more to learn from social interaction. In our simulations using Erdős–Rényi graphs, the degree distribution is relatively homogeneous, minimizing the impact of this potential bias.
Interpretation: Chat & Exercise provides observations that reduce uncertainty about neighbors’ trustworthiness (positive information gain), while Rest and Express do not provide such information (zero information gain). However, both Chat & Exercise and Express update the true value of $W_{ij}$ through action synchrony (Eq. 15): Express–Express synchrony strongly builds interpersonal trust, while Chat & Exercise synchrony moderately builds trust.
This creates a three-way tradeoff in action selection:
Rest: Stamina recovery, minimal information gain, weak synchrony effect on .
Chat & Exercise: Moderate stamina cost, (curiosity-driven exploration), moderate synchrony effect on .
Express: High stamina cost, (collective trust observation), empowerment gain, strong synchrony effect on but .
Theatre workshop interpretation: Joint improvisation (Express) builds solidarity through shared creative activity—interpersonal trust genuinely increases—but participants are absorbed in the performance itself, not attending to how others perceive them. Only through conversation and group exercises (Chat & Exercise) can an agent recognize, “Ah, that person trusts me.” This asymmetry between building trust and knowing about trust captures a realistic feature of experiential learning.
4.8.4. Differing Roles of Fully and Partially Observable Variables
This structure implies:
For S: Both information gain (via Express) and risk contribute.
For $W_{ij}$: Information gain (via Chat & Exercise) and risk contribute; Chat & Exercise enables curiosity-driven social exploration.
For u and H: Risk contributes directly; epistemic drive enters indirectly via reduced uncertainty in S and W, which improves multi-step prediction of u and H.
Table 5. Observability and EFE contributions of state variables.
| Variable | Observability | Contribution to Risk | Information Gain |
|---|---|---|---|
| S (Collective Trust) | Partial | Belief vs. preference | Yes (Express) |
| $W_{ij}$ (Interpersonal Trust) | Partial | Aggregated belief vs. preference | Yes (Chat & Ex. only) |
| u (Empowerment) | Full | Prediction vs. preference | No |
| H (Stamina) | Full | Prediction vs. preference | No |
4.8.5. N-Step Expected Free Energy
The N-step EFE is defined as:
Implementation note: When computing the N-step EFE, the variance at each step depends on expected observations at earlier steps. The exact update requires a Kalman update at each step; for computational efficiency, the implementation uses an exponential approximation:
This approximation exploits the relationship between information gain and variance reduction in Gaussian models.
4.8.6. Action Selection Rule
Action selection is based purely on EFE with softmax policy over the three-valued action space:
This is softmax action selection, where the inverse temperature parameter controls the exploration-exploitation tradeoff. A higher inverse temperature leads to more deterministic selection of the action with the lowest EFE.
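Softmax selection over EFE values can be sketched as follows. The sketch assumes that action probabilities are proportional to the exponential of the negative EFE scaled by the inverse temperature; subtracting the maximum score before exponentiating is a standard numerical-stability step added here, and the function names are hypothetical.

```python
import math
import random


def action_probabilities(efe, beta=1.0):
    """Softmax over negative EFE: lower EFE gives higher probability."""
    scores = [-beta * g for g in efe]
    top = max(scores)
    # Shift by the maximum so the largest exponent is exp(0) = 1.
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(efe, beta=1.0, rng=None):
    """Sample an action index (0: Rest, 1: Chat & Exercise, 2: Express)."""
    rng = rng or random
    probs = action_probabilities(efe, beta)
    return rng.choices(range(len(efe)), weights=probs, k=1)[0]


if __name__ == "__main__":
    efe = [2.0, 1.0, 3.0]  # illustrative EFE values for the three actions
    print(action_probabilities(efe, beta=1.0))
    print(action_probabilities(efe, beta=10.0))
    print(select_action(efe, beta=1.0, rng=random.Random(0)))
```

Raising the inverse temperature concentrates probability on the lowest-EFE action, while values near zero make the three actions almost equally likely.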
4.9. Propagation of Nonlinear Effects to Risk
Although the nonlinearities in this model are implemented in the state transitions (Hill-type collective effects and trust-gated learning), their behavioral consequence is expressed through the Risk component of EFE because Risk scores the mismatch between predicted trajectories and preferred setpoints. In particular, the empowerment Risk (Section 4.8.2) depends on the predicted future empowerment under each candidate action. When collective trust S is high in the environment, the true dynamics provide a stronger vicarious gain channel through the trust–empowerment coupling term in Eq. 12. As a result, policies that include Express (directly or indirectly by inducing others’ expression) tend to yield higher realized empowerment than in low-trust regimes. Even though the agent’s internal predictive model is intentionally simplified (Section 4.8), this regime dependence still feeds back into planning via the accumulated prediction error and the subsequent updates of belief and preference parameters.
Comfort-zone expansion then changes the geometry of the same Risk term. Preference learning updates the empowerment setpoint upward when a high empowerment experience is encountered under sufficient inferred trust (Section 4.7). Because the empowerment Risk penalizes deviations of predicted empowerment from the setpoint, an increase in the setpoint immediately reweights the EFE landscape: trajectories that would previously be “too high” in u become less risky, making sustained high-expression/high-empowerment regimes more self-consistent under EFE minimization. Conversely, in low-trust phases the precision-gating mechanism suppresses updates of the setpoint, so the agent retains a lower setpoint and high-u trajectories remain costly.
The interaction of these two mechanisms strengthens a key signature of chaotic itinerancy (CI): prolonged residence near multiple quasi-stable (metastable) regimes and irregular transitions among them. High-trust episodes make it easier for expression to generate large empowerment excursions; if such excursions are consolidated into a higher empowerment setpoint, future expressive policies become less penalized by Risk and therefore more likely to be selected, stabilizing a high-activity mode. If the system remains in a low-trust region long enough, the same consolidation does not occur, and the low-activity mode remains comparatively stable. Thus, nonlinear transition structure and trust-gated preference plasticity jointly shape the Risk term so that the group can dwell near quasi-stable regimes for extended periods and then transition to other regimes as conditions shift, consistent with the defining “residence-and-switching” pattern of CI.