Preprint
Article

This version is not peer-reviewed.

Entropic Geometry and Symmetry Breaking in Lie-Group Free-Energy Minimization

Submitted: 04 November 2025

Posted: 06 November 2025


Abstract
We present a geometric formulation of entropic free-energy minimization as Riemannian gradient descent on Lie-group orbits endowed with the Fisher information metric. This approach reveals how symmetry structures constrain the dynamics of information and entropy reduction, linking variational inference to geometric thermodynamics. We establish well-posedness, Lyapunov monotonicity, and convergence theorems, and derive a second-variation criterion explaining entropic symmetry breaking and bifurcations. Examples on Gaussian families under translations and rotations illustrate the interplay between group invariance and adaptive stability. The results provide a unified view connecting information geometry, thermodynamics, and the Free Energy Principle through a group-theoretic lens.

1. Introduction

The concept of free energy unifies thermodynamic, informational, and biological principles of organization. Within the Free Energy Principle (FEP), adaptive systems minimize an internal free-energy functional to resist disorder and maintain self-organization [1,2]. Yet, the geometric structure underlying this process—how symmetry and invariance constrain information flow—remains underexplored. Here we formulate free-energy minimization as an entropic gradient flow on Lie-group orbits, allowing a unified view of symmetry, stability, and self-organization. Recent expositions provide simplified overviews and technical clarifications that we leverage here [3,4].

Contributions

We provide: (i) an orbit-reduced formulation of variational free energy; (ii) first- and second-variation formulas on orbits; (iii) local well-posedness and Lyapunov monotonicity; (iv) convergence on compact groups and under the Kurdyka–Łojasiewicz (KŁ) property; (v) stability criteria via the orbit-restricted Hessian; (vi) a pitchfork scenario for $\mathrm{SO}(2)$; and (vii) worked Gaussian examples with explicit derivatives.

2. Preliminaries

2.1. Standing Assumptions and Notation

(Q1) $\mathcal{Q} = \{ q_\theta : \theta \in \Theta \subseteq \mathbb{R}^n \}$ is a $C^2$ statistical manifold with Fisher metric $g_F$. (Q2) $G$ is a finite-dimensional Lie group acting smoothly on $\mathcal{Q}$ by $\rho$, with stabilizer $H$ at $q_0$; the orbit $\mathcal{O}_{q_0} \cong G/H$ is an embedded submanifold. (Q3) The free-energy functional $F : \mathcal{Q} \to \mathbb{R}$ is $C^2$ and bounded below on $\mathcal{O}_{q_0}$. (Q4) $G$ carries a right-invariant $C^1$ Riemannian metric $\gamma$. Throughout, $F(g) := F[\rho(g, q_0)]$ denotes the orbit-reduced free energy, $\nabla_G$ denotes the gradient with respect to $\gamma$, and $L_X$ the infinitesimal action for $X \in \mathfrak{g}$.

2.2. Variational Free Energy

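Given a generative model $p(s, u, \theta)$ and a variational posterior $q(u, \theta)$,
$$F[q] = \mathbb{E}_q[\log q(u, \theta)] - \mathbb{E}_q[\log p(s, u, \theta)] = \mathrm{KL}\big( q(u, \theta) \,\|\, p(u, \theta \mid s) \big) - \log p(s),$$
so $-F[q]$ is the evidence lower bound, $-F[q] \le \log p(s)$, and minimizing $F$ is equivalent to maximizing that bound.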
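The decomposition above can be checked numerically on a toy model. The following is a minimal sketch, assuming a single discrete latent variable $u$ (the parameter $\theta$ is dropped) and made-up probabilities; the function names `free_energy` and `kl` are illustrative, not from the paper.

```python
import numpy as np

def free_energy(q, joint):
    """F[q] = E_q[log q] - E_q[log p(s, u)] for a discrete latent u and a fixed s."""
    q = np.asarray(q, dtype=float)
    joint = np.asarray(joint, dtype=float)
    return float(np.sum(q * (np.log(q) - np.log(joint))))

def kl(q, p):
    """Kullback-Leibler divergence KL(q || p) between discrete distributions."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.sum(q * (np.log(q) - np.log(p))))

# Hypothetical toy model: values of the joint p(s, u) at the observed s, for u = 0, 1, 2.
joint = np.array([0.10, 0.25, 0.05])
evidence = joint.sum()            # p(s)
posterior = joint / evidence      # p(u | s)
q = np.array([0.2, 0.5, 0.3])     # an arbitrary variational posterior

F_q = free_energy(q, joint)
decomposition = kl(q, posterior) - np.log(evidence)  # KL(q || p(u|s)) - log p(s)
F_post = free_energy(posterior, joint)               # equals -log p(s) at the exact posterior
```

In this sketch `F_q` and `decomposition` agree to rounding error, and the free energy attains its lower bound $-\log p(s)$ when $q$ is the exact posterior.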

2.3. Statistical Manifolds

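Let $\mathcal{Q} = \{ q_\theta : \theta \in \Theta \}$ with log-density $\ell(x, \theta) = \log q_\theta(x)$. The Fisher information metric is
$$g_{ij}(\theta) = \mathbb{E}_{q_\theta}\big[ \partial_i \ell \, \partial_j \ell \big] = -\mathbb{E}_{q_\theta}\big[ \partial_i \partial_j \ell \big],$$
with natural gradient $\nabla^{\mathrm{nat}} F(\theta) = G(\theta)^{-1} \nabla F(\theta)$ for smooth $F : \mathcal{Q} \to \mathbb{R}$, where $G(\theta) = (g_{ij}(\theta))$.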
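As an illustration of the two equivalent forms of the Fisher metric, the sketch below evaluates both for a univariate Gaussian in coordinates $(\mu, \sigma)$ by a Riemann sum on a wide grid. The choice of family, the grid parameters, and the function names are assumptions made for the example; the known closed form in these coordinates is $\mathrm{diag}(1/\sigma^2, 2/\sigma^2)$.

```python
import numpy as np

def gaussian_fisher_numeric(mu, sigma, half_width=12.0, n=200001):
    """Fisher matrix of N(mu, sigma^2) in (mu, sigma), computed in two ways."""
    x = np.linspace(mu - half_width * sigma, mu + half_width * sigma, n)
    dx = x[1] - x[0]
    z = (x - mu) / sigma
    density = np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))
    # Score: first derivatives of log q with respect to mu and sigma.
    score = np.stack([z / sigma, (z**2 - 1.0) / sigma])
    # Second derivatives of log q with respect to (mu, sigma).
    hess = np.array([
        [-np.ones_like(z) / sigma**2, -2.0 * z / sigma**2],
        [-2.0 * z / sigma**2, (1.0 - 3.0 * z**2) / sigma**2],
    ])
    outer_form = np.einsum("ix,jx,x->ij", score, score, density) * dx
    hessian_form = -np.einsum("ijx,x->ij", hess, density) * dx
    return outer_form, hessian_form

def natural_gradient(G, grad):
    """Natural gradient G^{-1} grad, computed with a linear solve."""
    return np.linalg.solve(G, grad)

outer_form, hessian_form = gaussian_fisher_numeric(mu=0.5, sigma=2.0)
nat = natural_gradient(hessian_form, np.array([1.0, 1.0]))
```

Both estimates should agree with $\mathrm{diag}(1/\sigma^2, 2/\sigma^2)$ up to quadrature error, and the natural gradient rescales each coordinate by the inverse metric.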

2.4. Group Actions and Orbits

Optimization restricted to the orbit $\mathcal{O}_{q_0} = \{ \rho(g, q_0) : g \in G \} \cong G/H$ replaces $\min_{q \in \mathcal{Q}} F[q]$ by $\min_{g \in G} F(g)$ with $F(g) = F[\rho(g, q_0)]$.

3. Orbit Gradients

We make precise the identification between the natural gradient on Q and the Riemannian gradient on G.
Theorem 1 (Equivalence of natural and group gradients on the orbit). Under (Q1)–(Q4), let $\xi \mapsto g(\xi) = \exp\big( \sum_i \xi^i X_i \big)$ parametrize $G$ near $e$, and set $q(\xi) = \rho(g(\xi), q_0)$. Assume the Jacobian $J(\xi) = \partial q(\xi) / \partial \xi$ has full column rank along $\mathcal{O}_{q_0}$. Equip $G$ with the induced metric $G_G(\xi) = J(\xi)^\top G_Q(\xi) J(\xi)$ from the Fisher metric $G_Q$. Then for any smooth $F(g) = F[\rho(g, q_0)]$, the natural gradient of $F$ restricted to $\mathcal{O}_{q_0}$ corresponds exactly to the Riemannian gradient on $(G, G_G)$; i.e.
$$\mathrm{Proj}_{T\mathcal{O}_{q_0}} \nabla^{\mathrm{nat}} F(q(\xi)) \;\longleftrightarrow\; \nabla_G F(g(\xi)).$$
Proof. Let $\delta\xi \in \mathbb{R}^{\dim G}$ and consider the variation $g(\xi + t\,\delta\xi)$. By the chain rule, $dF|_{g(\xi)} = dF|_{q(\xi)} \, J$, where $J = \partial q / \partial \xi$. Denote by $G_Q$ the Fisher metric on $T\mathcal{Q}$ and define an induced inner product on parameter increments by $\langle \delta\xi_1, \delta\xi_2 \rangle_{G_G} := \langle J\,\delta\xi_1, J\,\delta\xi_2 \rangle_{G_Q}$. This is positive definite by full column rank of $J$. The Riesz representation of $dF$ with respect to $\langle \cdot, \cdot \rangle_{G_G}$ yields the unique $\eta$ such that $dF[\delta\xi] = \langle \eta, \delta\xi \rangle_{G_G}$ for all $\delta\xi$. But $dF[\delta\xi] = \langle \nabla^{\mathrm{nat}} F, J\,\delta\xi \rangle_{G_Q} = \langle J^\top G_Q \nabla^{\mathrm{nat}} F, \delta\xi \rangle_{\mathrm{Eucl}}$. Thus $\eta$ corresponds to $G_G^{-1} J^\top G_Q \nabla^{\mathrm{nat}} F$, which is precisely the coordinate representation of the Riemannian gradient on $(G, G_G)$. Projecting to $T\mathcal{O}_{q_0}$ accounts for any null directions associated with the stabilizer $H$. □
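Remark 1 (Coordinate formula). In local coordinates, $\nabla_G F = G_G^{-1} J^\top \nabla_Q F$, with $G_G = J^\top G_Q J$, where $\nabla_Q F = G_Q \nabla^{\mathrm{nat}} F$ is the ordinary coordinate gradient on $\mathcal{Q}$. This realizes a Gauss–Newton structure typical of natural-gradient methods.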

4. Gradient Flows and Convergence

Define the gradient flow on $G$ by
$$\dot g(t) = -\nabla_G F\big( g(t) \big), \qquad g(0) = g_0 \in G. \tag{3}$$
Theorem 2 (Local well-posedness). If $F \in C^1(G)$ and $\nabla_G F$ is locally Lipschitz with respect to $\gamma$, then (3) admits a unique maximal solution from any initial point.
Proof. Right-translate the velocity to the Lie algebra: write $\dot g = dR_g\, v$ with $v(t) \in \mathfrak{g}$, where $v(g) := -dR_{g^{-1}} \nabla_G F(g)$. The map $g \mapsto v(g)$ is locally Lipschitz in charts, so in local coordinates (3) is an ODE with locally Lipschitz right-hand side. The Picard–Lindelöf theorem applied in charts yields a unique local solution, and the usual continuation argument gives a unique maximal one. □
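Theorem 3 (Lyapunov monotonicity). Along any solution of (3),
$$\frac{d}{dt} F\big( g(t) \big) = \langle dF, \dot g \rangle = \gamma\big( \nabla_G F, \dot g \big) = -\| \nabla_G F \|_\gamma^2 \le 0. \tag{4}$$
If $F$ is bounded below, the limit $\lim_{t \to \infty} F\big( g(t) \big)$ exists.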
Proof. The identity $dF(\xi) = \gamma(\nabla_G F, \xi)$ holds by definition of the gradient. Substitute $\dot g = -\nabla_G F$. □
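Monotone descent can be observed on a discretized flow. The sketch below uses explicit Euler steps on $\mathrm{SO}(2)$ with the standard metric on the angle $\theta$, and takes as cost the rotation-dependent part $\tfrac{1}{2}\mathrm{tr}(\Sigma_0^{-1} R(\theta) \Sigma R(\theta)^\top)$ of the Gaussian example in Section 7.2. The specific matrices, the step size, and the function names are assumptions for illustration; an explicit Euler scheme only inherits monotonicity for sufficiently small steps.

```python
import numpy as np

OMEGA = np.array([[0.0, -1.0], [1.0, 0.0]])  # generator of planar rotations

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def orbit_cost(theta, Sigma, Sigma0_inv):
    """Rotation-dependent part of the Gaussian KL cost along the SO(2) orbit."""
    Sigma_theta = rotation(theta) @ Sigma @ rotation(theta).T
    return 0.5 * np.trace(Sigma0_inv @ Sigma_theta)

def orbit_grad(theta, Sigma, Sigma0_inv):
    """dF/dtheta = (1/2) tr(Sigma0^{-1} [Omega, Sigma(theta)])."""
    Sigma_theta = rotation(theta) @ Sigma @ rotation(theta).T
    return 0.5 * np.trace(Sigma0_inv @ (OMEGA @ Sigma_theta - Sigma_theta @ OMEGA))

def euler_flow(theta0, Sigma, Sigma0_inv, step=0.05, n_steps=400):
    """Explicit Euler discretization of theta' = -dF/dtheta."""
    thetas = [theta0]
    for _ in range(n_steps):
        thetas.append(thetas[-1] - step * orbit_grad(thetas[-1], Sigma, Sigma0_inv))
    return np.array(thetas)

Sigma = np.diag([3.0, 1.0])                   # illustrative anisotropic covariance
Sigma0_inv = np.linalg.inv(np.diag([2.0, 0.5]))
thetas = euler_flow(1.2, Sigma, Sigma0_inv)
costs = np.array([orbit_cost(t, Sigma, Sigma0_inv) for t in thetas])
```

With these values the cost sequence is nonincreasing and the iterates settle at a critical angle where the gradient vanishes.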
Theorem 4 (Asymptotics on compact groups). If $G$ is compact and $F \in C^2(G)$ has only nondegenerate critical points, then the critical set is finite and the $\omega$-limit set of every trajectory of (3) is contained in it. Since $\omega$-limit sets are connected, every trajectory converges to a single critical point. Without the nondegeneracy assumption, every trajectory still approaches the critical set, and it converges to a single critical point if $F$ satisfies the KŁ property at its critical points (e.g. $F$ is real-analytic).
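Proof. Since $G$ is compact, every trajectory exists for all $t \ge 0$ and has a nonempty, compact, connected $\omega$-limit set. By Theorem 3, $F(g(t))$ is nonincreasing and bounded below, so $\int_0^\infty \| \nabla_G F(g(t)) \|_\gamma^2 \, dt < \infty$. Because $\nabla_G F$ is $C^1$ on a compact manifold, $t \mapsto \nabla_G F(g(t))$ is uniformly continuous, hence $\nabla_G F(g(t)) \to 0$ and every accumulation point is critical. Nondegenerate critical points are isolated, so on compact $G$ there are finitely many; a connected subset of a finite set is a single point, which gives convergence. □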
Theorem 5 (Convergence under KŁ property). Assume $F \in C^1(G)$ is bounded below and satisfies the Kurdyka–Łojasiewicz property at every critical point (e.g., $F$ is real-analytic on an analytic $G$). Then every bounded trajectory of $\dot g = -\nabla_G F(g)$ has finite length and converges to a single critical point as $t \to \infty$.
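Proof. Let $g(t)$ be bounded with $F(g(t)) \downarrow F_\infty$. The KŁ inequality provides $\varphi'(F - F_\infty) \, \| \nabla_G F \|_\gamma \ge 1$ near the limit set for some desingularizing function $\varphi$. Along the flow, $-\frac{d}{dt} \varphi\big( F(g(t)) - F_\infty \big) = \varphi'(F - F_\infty) \, \| \nabla_G F \|_\gamma^2 \ge \| \nabla_G F \|_\gamma = \| \dot g \|_\gamma$, and integrating over time shows $\int_0^\infty \| \dot g \|_\gamma \, dt < \infty$. Hence $g(t)$ has finite length and is Cauchy; completeness of $G$ yields convergence to a critical point. □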

5. Second Variation and Stability

We derive the orbit-restricted Hessian and a stability test.
Lemma 1 (First and second variations along one-parameter subgroups). For $X \in \mathfrak{g}$ and $g(t) = g \exp(tX)$,
$$\frac{d}{dt} F(g(t)) \Big|_{t=0} = \big\langle dF|_{\rho(g, q_0)}, \, L_X \rho(g, q_0) \big\rangle. \tag{5}$$
If $F \in C^2$, then along $X, Y \in \mathfrak{g}$ at a critical $g_*$ with $q_* = \rho(g_*, q_0)$,
$$\mathrm{Hess}_G F(g_*)[X, Y] = D^2 F|_{q_*}\big[ L_X q_*, \, L_Y q_* \big] + \tfrac{1}{2} \, DF|_{q_*}\big[ L_{[X, Y]} q_* \big]. \tag{6}$$
Proof. For (5), differentiate $F(g \exp(tX)) = F[\rho(g \exp(tX), q_0)]$ and use $\frac{d}{dt} \rho(g \exp(tX), q_0) \big|_{t=0} = L_X \rho(g, q_0)$. For (6), differentiate (5) again in the direction $Y$ and invoke the symmetry of $D^2 F$ plus the Lie-bracket identity $\frac{d}{ds} L_X \rho(g \exp(sY), q_0) \big|_{s=0} = L_{[Y, X]} \rho(g, q_0)$. □
Proposition 1 (Stability on the orbit). Let $\mathfrak{h}$ be the Lie algebra of the stabilizer at $q_*$. If the quadratic form $X \mapsto \mathrm{Hess}_G F(g_*)[X, X]$ is positive definite on $\mathfrak{g} / \mathfrak{h}$, then $g_*$ is a strict local minimum of $F$ on the orbit (modulo the stabilizer) and asymptotically stable for (3). Negative definite and indefinite signatures yield maxima and saddles, respectively.
Proof. Positive definiteness implies that $F(g) \ge F(g_*) + c \, \mathrm{dist}(g, g_*)^2$ in a neighborhood (in exponential coordinates), hence $F$ is a Lyapunov function with a strict minimum. The linearization of (3) at $g_*$ has spectrum in $(-\infty, 0)$ on $\mathfrak{g} / \mathfrak{h}$, yielding asymptotic stability by the Hartman–Grobman theorem on manifolds. □
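In coordinates, the stability test amounts to an eigenvalue check after discarding stabilizer directions. The following sketch assumes the Hessian is given as a symmetric matrix on $\mathfrak{g}$ that vanishes on $\mathfrak{h}$, and represents $\mathfrak{g}/\mathfrak{h}$ by the Euclidean orthogonal complement of $\mathfrak{h}$; both the representation and the name `classify_critical_point` are choices made for this example.

```python
import numpy as np

def classify_critical_point(hessian, stabilizer_basis=None, tol=1e-10):
    """Classify a critical point from the Hessian restricted to a complement of h."""
    H = 0.5 * (hessian + hessian.T)          # symmetrize against rounding error
    dim = H.shape[0]
    if stabilizer_basis is None or np.size(stabilizer_basis) == 0:
        complement = np.eye(dim)
    else:
        B = np.atleast_2d(np.asarray(stabilizer_basis, dtype=float))  # rows span h
        # Rows of vt beyond the rank of B span the orthogonal complement of h.
        _, s, vt = np.linalg.svd(B, full_matrices=True)
        rank = int(np.sum(s > tol))
        complement = vt[rank:].T
    eigvals = np.linalg.eigvalsh(complement.T @ H @ complement)
    if np.all(eigvals > tol):
        return "minimum", eigvals
    if np.all(eigvals < -tol):
        return "maximum", eigvals
    if np.any(eigvals > tol) and np.any(eigvals < -tol):
        return "saddle", eigvals
    return "degenerate", eigvals

# Illustrative Hessians on a 3-dimensional algebra whose second axis spans h.
label_min, _ = classify_critical_point(np.diag([2.0, 0.0, 3.0]), [[0.0, 1.0, 0.0]])
label_saddle, _ = classify_critical_point(np.diag([2.0, 0.0, -1.0]), [[0.0, 1.0, 0.0]])
label_max, _ = classify_critical_point(np.diag([-1.0, -2.0]))
```

The zero eigenvalue along the stabilizer direction is ignored, so the first example is reported as a minimum rather than as degenerate.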

6. Symmetry and Bifurcation

Definition 1 (Group invariance). The functional $F$ is $G$-invariant if $F[\rho(g, q)] = F[q]$ for all $g \in G$, $q \in \mathcal{Q}$. In that case the orbit function $g \mapsto F[\rho(g, q_0)]$ is constant on $G$, and every $g$ is critical. Partial invariance or data terms induce nontrivial landscapes.
Theorem 6 (Pitchfork for $\mathrm{SO}(2)$). Let $\mathrm{SO}(2)$ act on planar Gaussians by covariance conjugation, and consider a one-parameter family $F_\lambda$. If at $\lambda = \lambda_c$ the smallest nonzero orbit-restricted eigenvalue of the Hessian crosses zero while symmetry suppresses the cubic term, then two nontrivial minima bifurcate from the symmetric one as $\lambda$ passes through $\lambda_c$.
Sketch. Normal-form reduction on the one-dimensional orbit coordinate $\theta$ gives $F_\lambda(\theta) = a(\lambda) \theta^2 + b \theta^4 + o(\theta^4)$ with $a(\lambda_c) = 0$, $b > 0$, and the odd cubic suppressed by symmetry. The change of sign of $a$ yields a supercritical pitchfork. □
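The truncated normal form can be solved in closed form: setting $2a\theta + 4b\theta^3 = 0$ gives $\theta = 0$ or $\theta^2 = -a/(2b)$. The sketch below returns the local minima of the truncated polynomial only (the $o(\theta^4)$ remainder is ignored), with $a$ passed directly as a number since the dependence $a(\lambda)$ is not specified; the function name is hypothetical.

```python
import numpy as np

def normal_form_minima(a, b):
    """Local minima of theta -> a*theta**2 + b*theta**4, assuming b > 0."""
    if b <= 0:
        raise ValueError("the supercritical normal form assumes b > 0")
    if a >= 0:
        # Before the bifurcation the symmetric state theta = 0 is the only minimum.
        return np.array([0.0])
    # After the sign change theta = 0 becomes a maximum and two minima appear.
    root = np.sqrt(-a / (2.0 * b))
    return np.array([-root, root])

before = normal_form_minima(0.5, 1.0)
after = normal_form_minima(-0.5, 1.0)
```

The two branches separate from zero like $\sqrt{-a}$, which is the usual square-root scaling of a supercritical pitchfork.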

7. Examples

7.1. Translations of Means (Abelian Case)

Let $\mathcal{Q}$ be $d$-dimensional Gaussians with mean $\mu$ and fixed covariance $\Sigma$. The additive group $(\mathbb{R}^d, +)$ acts by $(\rho(\epsilon) q)(x) = q(x - \epsilon)$. Consider
$$F[q] = \int \| x - \mu \|^2 \, q(x) \, dx + \lambda \, \mathrm{KL}(q \,\|\, p).$$
Proposition 2. The orbit cost $\epsilon \mapsto F(\epsilon) = F[\rho(\epsilon, q)]$ is strictly convex and admits a unique minimizer $\epsilon_*$ with $\nabla F(\epsilon_*) = 0$.
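Proof. Convexity follows from convexity of the squared norm and convexity of $\epsilon \mapsto \mathrm{KL}(\rho(\epsilon, q) \,\|\, p)$, which holds for instance when $p$ is log-concave: the entropy of $q$ is translation invariant, and $\epsilon \mapsto -\int q(y) \log p(y + \epsilon) \, dy$ is convex because $-\log p$ is. Strictness holds unless $p$ is itself translation invariant. Differentiating under the integral sign gives the first-order condition. □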
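When $p$ is Gaussian the minimizer is available in closed form. The sketch below rests on two assumptions that the text does not fix: the reference point $\mu$ in the quadratic term is held at the initial mean of $q$, so that term equals $\mathrm{tr}\,\Sigma + \|\epsilon\|^2$ after translation, and $p = \mathcal{N}(m, \Sigma_p)$. Under these assumptions $\nabla F(\epsilon) = 2\epsilon + \lambda \Sigma_p^{-1}(\mu + \epsilon - m)$. All numerical values and function names are illustrative.

```python
import numpy as np

def orbit_cost(eps, mu, Sigma, m, Sigma_p, lam):
    """F(eps) for q = N(mu + eps, Sigma) and an assumed Gaussian p = N(m, Sigma_p)."""
    d = mu.size
    Sp_inv = np.linalg.inv(Sigma_p)
    diff = mu + eps - m
    kl = 0.5 * (np.trace(Sp_inv @ Sigma) + diff @ Sp_inv @ diff - d
                + np.log(np.linalg.det(Sigma_p) / np.linalg.det(Sigma)))
    # E ||x - mu||^2 under N(mu + eps, Sigma) equals tr(Sigma) + ||eps||^2.
    return np.trace(Sigma) + eps @ eps + lam * kl

def orbit_minimizer(mu, m, Sigma_p, lam):
    """Solve (2 I + lam Sigma_p^{-1}) eps = lam Sigma_p^{-1} (m - mu)."""
    d = mu.size
    Sp_inv = np.linalg.inv(Sigma_p)
    return np.linalg.solve(2.0 * np.eye(d) + lam * Sp_inv, lam * Sp_inv @ (m - mu))

mu = np.array([1.0, -1.0])
Sigma = np.array([[1.0, 0.2], [0.2, 0.5]])
m = np.array([0.0, 2.0])
Sigma_p = np.array([[2.0, 0.5], [0.5, 1.0]])
lam = 0.7

eps_star = orbit_minimizer(mu, m, Sigma_p, lam)

# Central finite differences of the cost at eps_star, one coordinate at a time.
h = 1e-5
fd_grad = np.array([
    (orbit_cost(eps_star + h * e, mu, Sigma, m, Sigma_p, lam)
     - orbit_cost(eps_star - h * e, mu, Sigma, m, Sigma_p, lam)) / (2.0 * h)
    for e in np.eye(2)
])
```

The Hessian $2I + \lambda \Sigma_p^{-1}$ is positive definite in this setting, so the stationary point found by the linear solve is the unique minimizer.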

7.2. Planar Rotations of Covariances

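For zero-mean Gaussians,
$$\mathrm{KL}\big( \mathcal{N}(0, \Sigma) \,\|\, \mathcal{N}(0, \Sigma_0) \big) = \tfrac{1}{2} \Big[ \mathrm{tr}(\Sigma_0^{-1} \Sigma) - \log \frac{\det \Sigma}{\det \Sigma_0} - d \Big].$$
Let $\Sigma(\theta) = R(\theta) \Sigma R(\theta)^\top$ with $R(\theta)$ the $2 \times 2$ rotation. Then
$$\frac{d \Sigma(\theta)}{d \theta} = [\Omega, \Sigma(\theta)], \qquad \Omega = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$
Hence, since the log-determinant term is rotation invariant, the orbit cost $F(\theta)$ given by this KL divergence satisfies
$$\frac{d}{d\theta} F(\theta) = \tfrac{1}{2} \, \mathrm{tr}\big( \Sigma_0^{-1} [\Omega, \Sigma(\theta)] \big) = \tfrac{1}{2} \, \mathrm{tr}\big( [\Sigma_0^{-1}, \Omega] \, \Sigma(\theta) \big).$$
Critical points satisfy $\mathrm{tr}\big( [\Sigma_0^{-1}, \Omega] \Sigma(\theta) \big) = 0$, which in the plane is equivalent to $\Sigma(\theta)$ and $\Sigma_0$ commuting, i.e. being simultaneously diagonalizable. When both $\Sigma$ and $\Sigma_0$ are anisotropic, $F$ is $\pi$-periodic in $\theta$ with one minimum (principal axes aligned) and one maximum per period, so the full circle carries two minima, $\theta_*$ and $\theta_* + \pi$, that represent the same covariance.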
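The derivative formula and the commutation criterion can be verified numerically. In this sketch the covariances are arbitrary illustrative values, and the aligned angle is computed from the principal-axis angles of the two matrices (a closed form specific to the $2 \times 2$ case); the helper names are not from the paper.

```python
import numpy as np

OMEGA = np.array([[0.0, -1.0], [1.0, 0.0]])

def rotate_cov(theta, Sigma):
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return R @ Sigma @ R.T

def kl_zero_mean(Sigma, Sigma0):
    """KL(N(0, Sigma) || N(0, Sigma0))."""
    d = Sigma.shape[0]
    return 0.5 * (np.trace(np.linalg.solve(Sigma0, Sigma))
                  - np.log(np.linalg.det(Sigma) / np.linalg.det(Sigma0)) - d)

def dF_commutator(theta, Sigma, Sigma0):
    """(1/2) tr([Sigma0^{-1}, Omega] Sigma(theta))."""
    S0_inv = np.linalg.inv(Sigma0)
    return 0.5 * np.trace((S0_inv @ OMEGA - OMEGA @ S0_inv) @ rotate_cov(theta, Sigma))

def major_axis_angle(M):
    """Angle of the leading eigenvector of a symmetric 2x2 matrix."""
    return 0.5 * np.arctan2(2.0 * M[0, 1], M[0, 0] - M[1, 1])

Sigma = np.array([[3.0, 0.4], [0.4, 1.0]])
Sigma0 = np.array([[2.0, -0.3], [-0.3, 0.5]])

def F(theta):
    return kl_zero_mean(rotate_cov(theta, Sigma), Sigma0)

# Finite-difference check of the derivative formula at an arbitrary angle.
h = 1e-5
fd = (F(0.7 + h) - F(0.7 - h)) / (2.0 * h)
exact = dF_commutator(0.7, Sigma, Sigma0)

# Angle that aligns the principal axes of Sigma(theta) with those of Sigma0.
theta_star = major_axis_angle(Sigma0) - major_axis_angle(Sigma)
S_star = rotate_cov(theta_star, Sigma)
commutator_norm = np.linalg.norm(S_star @ Sigma0 - Sigma0 @ S_star)
```

At `theta_star` the derivative vanishes and the two covariances commute; shifting by $\pi/2$ gives the maximum and shifting by $\pi$ returns the same cost.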

7.3. Rigid Motions $\mathrm{SE}(2)$ and Outlook to $\mathrm{SE}(3)$

The special Euclidean group couples rotations and translations; the induced metric on $G$ blends mean and covariance directions. The same formalism extends to $\mathrm{SE}(3)$ for 3D pose models.
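One way to make the coupling concrete is to write the action on Gaussian parameters explicitly. The sketch below assumes the natural pushforward action $(R, t) \cdot (\mu, \Sigma) = (R\mu + t, \, R \Sigma R^\top)$, which the text does not spell out, and checks that it is compatible with the semidirect-product composition $(R_1, t_1)(R_2, t_2) = (R_1 R_2, \, R_1 t_2 + t_1)$. Function names and numerical values are illustrative.

```python
import numpy as np

def se2(theta, t):
    """Element of SE(2) as a (rotation matrix, translation vector) pair."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]]), np.asarray(t, dtype=float)

def compose(g1, g2):
    """Group product (R1, t1)(R2, t2) = (R1 R2, R1 t2 + t1)."""
    R1, t1 = g1
    R2, t2 = g2
    return R1 @ R2, R1 @ t2 + t1

def act(g, gaussian):
    """Assumed pushforward action on Gaussian parameters (mu, Sigma)."""
    R, t = g
    mu, Sigma = gaussian
    return R @ mu + t, R @ Sigma @ R.T

g1 = se2(0.3, [1.0, -2.0])
g2 = se2(-1.1, [0.5, 0.25])
q0 = (np.array([0.2, 0.7]), np.array([[2.0, 0.3], [0.3, 1.0]]))

stepwise = act(g1, act(g2, q0))          # apply g2, then g1
combined = act(compose(g1, g2), q0)      # apply the product g1 g2 at once
swapped = act(compose(g2, g1), q0)       # product in the opposite order
```

The stepwise and combined results coincide, while the swapped product moves the mean elsewhere, reflecting that rotations and translations do not commute.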
Figure 1. (Top) Free-energy landscape along a Lie-group orbit. (Bottom) Pitchfork bifurcation under $\mathrm{SO}(2)$ symmetry breaking.

8. Entropy Reduction Under Lie-Group Symmetry

This section interprets free-energy descent as a physical process of entropy reduction constrained by Lie-group symmetry.

Thermodynamic entropy and informational stability.

On Lie-group orbits, entropy reduction corresponds to the dissipation of uncertainty along symmetry-constrained manifolds. The Fisher metric quantifies the local curvature of information, and the associated natural-gradient flow describes the most efficient direction of entropy decrease under free-energy descent. Thus, stability of a group orbit can be interpreted as a steady state with (possibly nonzero) steady entropic production balanced by symmetry-constrained fluxes.

Information-theoretic analogy.

Write the free energy as
$$F[q] = \mathbb{E}_q[\log q] - \mathbb{E}_q[\log p] = -S(q) - \mathbb{E}_q[\log p].$$
Along any smooth evolution of $q$, we have
$$\dot F = -\dot S - \frac{d}{dt} \mathbb{E}_q[\log p].$$
Hence entropy reduction and evidence gain are coupled but not identical in general. In our orbit-restricted dynamics, Lyapunov monotonicity yields $\dot F = -\| \nabla_G F \|_\gamma^2 \le 0$, and we interpret $-\dot F = \| \nabla_G F \|_\gamma^2 \ge 0$ as a nonnegative “entropic production rate” on the group manifold. Lie-group symmetries then restrict admissible directions of this production, shaping the accessible information flows.

Physical interpretation.

In this light, the orbit flow $\dot g = -\nabla_G F(g)$ describes a dissipative process that drives the system toward minimal free energy under constraints imposed by $G$. Equilibria on orbits correspond to stationary nonequilibrium states, and bifurcations of $F$ signal transitions between entropic basins, an informational analogue of phase transitions.

9. Related Work

Classical information geometry [7,8] endows models with the Fisher metric and dual connections; natural gradient methods exploit this geometry. In contrast, we constrain inference to Lie-group orbits and move optimization to the group manifold. This orbit-centric view enables stability and bifurcation analyses less transparent in parameter space. We also draw on recent discussions of the FEP and Bayesian mechanics [3,4,5,6].

10. Discussion

We presented a geometric reduction of the FEP to optimization on Lie groups, with explicit stability and bifurcation analyses. Future work includes thermodynamic formulations of entropy reduction on noncommutative groups ($\mathrm{SE}(3)$, matrix groups), and numerical exploration of entropic symmetry breaking. This framework bridges information geometry, nonequilibrium thermodynamics, and adaptive inference in a single mathematical language.

Acknowledgments

The author thanks colleagues for discussions on information geometry, Lie theory, and variational principles.

References

  1. K. Friston, A theory of cortical responses, Philosophical Transactions of the Royal Society B 360, 815–836 (2005).
  2. K. Friston, The free-energy principle: a unified brain theory?, Nature Reviews Neuroscience 11, 127–138 (2010).
  3. K. Friston, The free energy principle made simpler but not too simple, Physics Reports 1024, 1–43 (2023).
  4. K. J. Friston, L. Da Costa, T. Parr, Some Interesting Observations on the Free Energy Principle, Entropy 23(8), 1076 (2021).
  5. L. Da Costa, K. Friston, C. Heins, G. A. Pavliotis, Bayesian mechanics for stationary processes, Proceedings of the Royal Society A 477(2256), 20210518 (2021).
  6. P. Ao, Emerging of Stochastic Dynamical Equalities and Steady State Thermodynamics from Darwinian Dynamics, Communications in Theoretical Physics 49(5), 1073–1090 (2008).
  7. S.-I. Amari and H. Nagaoka, Methods of Information Geometry, AMS/OUP (2000).
  8. S.-I. Amari, Information Geometry and Its Applications, Springer (2016).
  9. F. Otto, The geometry of dissipative evolution equations: the porous medium equation, Communications in Partial Differential Equations 26(1–2), 101–174 (2001).
  10. C. Villani, Optimal Transport: Old and New, Springer, Berlin (2009).
  11. J. M. Lee, Introduction to Smooth Manifolds, 2nd ed., Springer (2012).
  12. F. W. Warner, Foundations of Differentiable Manifolds and Lie Groups, Springer (1983).
  13. S. Helgason, Differential Geometry, Lie Groups, and Symmetric Spaces, Academic Press (1978).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
