1. Introduction
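Statistical mechanics (SM), in the formulation developed by E.T. Jaynes [1, 2], is founded on an entropy optimization principle. Specifically, the Boltzmann entropy is maximized under the constraint of a fixed average energy $\overline{E}$:
$$\overline{E} = \sum_i \rho_i E_i$$
The Lagrange multiplier equation defining the optimization problem is (in units where $k_B = 1$):
$$\mathcal{L} = -\sum_i \rho_i \ln \rho_i + \lambda \left(1 - \sum_i \rho_i\right) + \beta \left(\overline{E} - \sum_i \rho_i E_i\right)$$
where $\lambda$ and $\beta$ are Lagrange multipliers enforcing the normalization and average energy constraints. Solving this optimization problem yields the Gibbs measure:
$$\rho_i = \frac{1}{Z} \exp(-\beta E_i)$$
where $Z = \sum_j \exp(-\beta E_j)$ is the partition function.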
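The maximization described above can be checked numerically. The following is a minimal sketch, assuming a hypothetical three-level system and a hypothetical inverse temperature (neither is taken from the text): it builds the Gibbs measure and compares its entropy with that of a nearby distribution having the same normalization and the same average energy.

```python
import math

def gibbs(energies, beta):
    """Gibbs probabilities exp(-beta * E_i) / Z for discrete energy levels."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)  # partition function
    return [w / z for w in weights]

def entropy(probs):
    """Entropy -sum p ln p, in units where k_B = 1."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def mean_energy(probs, energies):
    """Average energy sum p_i E_i."""
    return sum(p * e for p, e in zip(probs, energies))

energies = [0.0, 1.0, 2.0]  # hypothetical energy levels
beta = 1.0                  # hypothetical inverse temperature
rho = gibbs(energies, beta)

# For these levels the direction (1, -2, 1) changes neither the total
# probability nor the average energy, so both constraints stay satisfied.
direction = [1.0, -2.0, 1.0]
perturbed = [p + 0.01 * d for p, d in zip(rho, direction)]

print(sum(rho), mean_energy(rho, energies), entropy(rho))
print(sum(perturbed), mean_energy(perturbed, energies), entropy(perturbed))
```

The perturbed distribution satisfies the same two constraints but has a lower entropy, which is what one expects if the Gibbs measure is the constrained maximum.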
For comparison, quantum mechanics (QM) is not formulated as the solution to an optimization problem, but rather consists of a collection of axioms [3, 4]:
- QM Axiom 1 of 5
State Space: Every physical system is associated with a complex Hilbert space, and its state is represented by a ray (an equivalence class of vectors differing by a non-zero scalar multiple) in this space.
- QM Axiom 2 of 5
Observables: Physical observables correspond to Hermitian (self-adjoint) operators acting on the Hilbert space.
- QM Axiom 3 of 5
Dynamics: The time evolution of a quantum system is governed by the Schrödinger equation, where the Hamiltonian operator represents the system’s total energy.
- QM Axiom 4 of 5
Measurement: Measuring an observable projects the system into an eigenstate of the corresponding operator, yielding one of its eigenvalues as the measurement result.
- QM Axiom 5 of 5
Probability Interpretation: The probability of obtaining a specific measurement outcome is given by the squared magnitude of the projection of the state vector onto the relevant eigenstate (Born rule).
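As a concrete illustration of Axioms 1, 2 and 5, the sketch below applies the Born rule to a two-level system. The state and the choice of observable (Pauli-X, with its two real eigenvectors) are hypothetical and chosen only for illustration.

```python
import math

def inner(u, v):
    """Complex inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def normalize(v):
    """Pick the unit-norm representative of the ray containing v."""
    norm = math.sqrt(inner(v, v).real)
    return [a / norm for a in v]

# Eigenvectors of the Pauli-X observable, keyed by eigenvalue (+1 and -1).
s = 1 / math.sqrt(2)
eigenvectors = {+1: [s, s], -1: [s, -s]}

# Hypothetical state; any non-zero scalar multiple lies on the same ray.
psi = normalize([1 + 0j, 1j])
psi_rescaled = normalize([3j * a for a in [1 + 0j, 1j]])

def born_probabilities(state):
    """Born rule: probability = |<eigenvector|state>|^2 for each eigenvalue."""
    return {val: abs(inner(vec, state)) ** 2 for val, vec in eigenvectors.items()}

probs = born_probabilities(psi)
probs_rescaled = born_probabilities(psi_rescaled)
expectation = sum(val * p for val, p in probs.items())
print(probs, probs_rescaled, expectation)
```

The two probability tables agree, reflecting that the state is a ray rather than a particular vector.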
Physical theories have traditionally been constructed in two distinct ways. Some, like QM, are defined through a set of mathematical axioms that are first postulated and then verified against experiments. Others, like SM, emerge as solutions to optimization problems with experimentally-verified constraints.
We propose to generalize the optimization methodology of E.T. Jaynes to encompass all of physics, aiming to derive a unified theory from a single optimization problem.
To that end, we introduce the following constraint:
Axiom 1 (Nature).
where
are operators, and
is the average of their traces. This constraint, as it replaces the scalar
with the operators
, extends E.T. Jaynes’ optimization method to encompass non-commutative observables and symmetry group generators required for fundamental physics.
We then construct an optimization problem:
Definition 1 (Physics).
Physics is the solution to:
where:
t is the Lagrange multiplier enforcing the natural constraint
p is the number of spatial dimensions
w is an information density. As such, it adheres to all but one axiom of probability theory: it is non-negative, but it is not normalized to unity
is a matrix or operator (its specific form depends on the dimension and will be given in the results section)
the constraint is the continuum version of Axiom 1.
In the results section, we will see that probability conservation will emerge from Noether’s theorem as a specific charge conservation, instead of as a constraint in the optimization problem.
This definition constitutes our complete proposal for reformulating fundamental physics—no additional principles will be introduced. By replacing the Boltzmann entropy with the relative Shannon entropy, the optimization problem extends beyond thermodynamic variables to encompass any type of experiment. This generalization occurs because relative entropy captures the essence of any experiment: the relationship between a final measurement and its initial preparation.
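To make the role of relative entropy concrete, here is a minimal discrete sketch. It assumes normalized distributions and takes the relative Shannon entropy to be minus the Kullback-Leibler divergence of the final measurement from the initial preparation; both the sign convention and the two example distributions are assumptions, and the continuum, non-normalized density w of Definition 1 is not modeled.

```python
import math

def kl_divergence(final, initial):
    """Kullback-Leibler divergence sum_i q_i ln(q_i / p_i), in nats."""
    return sum(q * math.log(q / p) for q, p in zip(final, initial) if q > 0)

def relative_entropy(final, initial):
    """Relative Shannon entropy, taken here as minus the KL divergence."""
    return -kl_divergence(final, initial)

preparation = [0.25, 0.25, 0.25, 0.25]  # hypothetical initial preparation
measurement = [0.70, 0.10, 0.10, 0.10]  # hypothetical final measurement

print(relative_entropy(preparation, preparation))
print(relative_entropy(measurement, preparation))
```

With this convention the relative entropy is zero when the measurement reproduces the preparation and negative otherwise, so maximizing it keeps the final measure as close to the preparation as the constraints allow.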
Two key constraints shape our framework. The normalization constraint ensures we are working with a proper predictive theory, while the natural constraint spawns the domain of applicability of the theory. The crucial insight is that because our formulation maintains complete generality in the structure of experiments while optimizing over all possible predictive theories, the resulting solution holds true, by construction, for all realizable experiments within its domain.
This approach reduces our reliance on postulating axioms through trial and error, and simplifies the foundations of physics. Specifically, when we employ the natural constraint —the most permissive constraint for this problem (see Discussion for proof)— the solution spawns its largest domain, pointing towards a unified physics where fundamental theories emerge naturally—e.g. general relativity (acting on spacetime) + Yang-Mills (acting on internal spaces) when .
As we will see in the results section, Definition 1 yields a valid solution only in the specific case of 3+1 dimensions. In other dimensional configurations, various obstructions arise violating the axioms of probability theory. The following table summarizes the geometric cases and their obstructions:
where GA(p, q) means the geometric algebra of p + q dimensions, where p is the number of positive signature dimensions and q the number of negative signature dimensions. We will investigate the unobstructed case in Section 2.1 and then demonstrate the obstructions in Section 2.2.
2. Results
Theorem 1.
The general solution of the optimization problem is:
Proof. We solve Definition 1 by taking the variation of the Lagrange multiplier equation with respect to
w. (To improve the legibility, we will drop the explicit parametrization in
and
in the proof.)
Finally, restoring the explicit parametrization, we end with:
□
2.1. Spinors + Dirac Equation
We will now investigate this solution in the context of GA(3,1). We begin with a definition of the determinant for a multivector.
2.1.1. The Multivector Determinant
Our goal here will be to express the determinant of a real $4 \times 4$ matrix as a multivector self-product. To achieve that, we begin by defining a general multivector of GA(3,1):
where
a is a scalar,
a vector,
a bivector,
a pseudo-vector and
a pseudo-scalar. Explicitly,
Definition 2 (Real-Majorana Algebra Isomorphism).
The map defined by:
extends linearly and multiplicatively to an isomorphism between GA(3,1) and the algebra of real $4 \times 4$ matrices.
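The defining relations behind such a map can be verified directly. The sketch below uses one possible set of real 4 × 4 matrices built from Kronecker products; this particular choice, and the convention that the first generator squares to −1 while the other three square to +1, are assumptions and need not coincide with the map of Definition 2.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scaled_identity(c, n=4):
    """The matrix c * I of size n."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]

# Real 2x2 building blocks: identity, two symmetric matrices, one antisymmetric.
I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S3 = [[1, 0], [0, -1]]
EPS = [[0, 1], [-1, 0]]  # squares to -I2

# Hypothetical real representation of the four generators.
GAMMA = [kron(EPS, I2), kron(S1, I2), kron(S3, S1), kron(S3, S3)]
ETA = [-1, 1, 1, 1]  # assumed signature: one negative, three positive squares

def clifford_relations_hold():
    """Check gamma_mu gamma_nu + gamma_nu gamma_mu = 2 eta_mu_nu I."""
    for mu in range(4):
        for nu in range(4):
            anti = add(matmul(GAMMA[mu], GAMMA[nu]), matmul(GAMMA[nu], GAMMA[mu]))
            expected = scaled_identity(2 * ETA[mu] if mu == nu else 0)
            if anti != expected:
                return False
    return True

print(clifford_relations_hold())
```

The check only confirms the anticommutation relations in exact integer arithmetic; it does not by itself show that the sixteen basis products span all real 4 × 4 matrices.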
To manipulate and analyze multivectors in GA(3,1), we introduce several important operations, such as the multivector conjugate, the pseudo-blade conjugate, and the multivector determinant.
Definition 4 (Multivector Conjugate in GA(3,1)).
Definition 5 (Pseudo-Blade Conjugate in GA(3,1)).
The pseudo-blade conjugate of is
Lundholm [5] proposes a number of multivector norms, and shows that they are the unique forms which carry the properties of the determinants, such as multiplicativity, $N(uv) = N(u)N(v)$, to the domain of multivectors:
Definition 6.
The self-products associated with low-dimensional geometric algebras are:
where is a conjugate that reverses the sign of pseudo-scalar blade (i.e. the highest degree blade of the algebra).
We can now express the determinant of the matrix representation of a multivector via a self-product. This choice is unique:
Theorem 2 (The Multivector Determinant in GA(3,1)).
Proof. As the proof requires
multiplication steps, please find a computer-assisted proof of this equality in
Annex B. □
As can be seen from this theorem, the relationship between determinants and multivector products in GA(3,1) is a quartic form that cannot be reduced to a simpler bilinear form.
2.1.2. The Optimization Problem
The relative Shannon entropy requires measures that are everywhere non-negative. Consequently, we will first identify the largest sub-algebra of GA(3,1) whose determinant is non-negative.
Theorem 3 (Non-Negativity of Even Multivectors). Let be an even multivector of . Then its multivector determinant is non-negative.
which is non-negative—the sum of two squares of real numbers in
. □
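The non-negativity claim can be probed numerically. The following sketch assumes a hypothetical real 4 × 4 representation of the generators (not necessarily the one used in the paper), builds random even multivectors from the scalar, the six bivectors and the pseudoscalar, and records the determinants of their matrix representations.

```python
import random

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in m]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-14:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

I2, S1, S3, EPS = [[1, 0], [0, 1]], [[0, 1], [1, 0]], [[1, 0], [0, -1]], [[0, 1], [-1, 0]]
# Hypothetical real generators: the first squares to -I, the others to +I.
GAMMA = [kron(EPS, I2), kron(S1, I2), kron(S3, S1), kron(S3, S3)]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# Even basis: scalar, six bivectors, pseudoscalar (eight elements in total).
EVEN_BASIS = [IDENTITY]
EVEN_BASIS += [matmul(GAMMA[m], GAMMA[n]) for m in range(4) for n in range(m + 1, 4)]
EVEN_BASIS.append(matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3])))

def random_even_matrix(rng):
    """Matrix of an even multivector with random real coefficients."""
    coeffs = [rng.uniform(-1.0, 1.0) for _ in EVEN_BASIS]
    return [[sum(c * b[i][j] for c, b in zip(coeffs, EVEN_BASIS)) for j in range(4)]
            for i in range(4)]

rng = random.Random(0)  # fixed seed so the sample is reproducible
dets = [det(random_even_matrix(rng)) for _ in range(1000)]
print(min(dets), max(dets))
```

A sampled minimum that stays at or above zero (up to rounding) is consistent with Theorem 3, though a random sample is of course not a proof.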
To define the optimization problem in GA(3,1), we note the following:
In 3+1D, we are interested in the case where the states are elements of the even sub-algebra of GA(3,1), whose determinant is non-negative:
In the continuum such elements are transformed by a connection which is valued in
:
We also consider translations
and
. Hence, we define the covariant derivative
as:
The term will be added to contract with , leaving no free indices. But since it produces an odd-multivector in the process, the term is also added converting the result back into an even-multivector. This selects a preferred frame—the laboratory frame.
The covariant derivative represents the set of all transformations (including translations) that can be done on the even element of .
Flat Spacetime:
The optimization problem will be as follows:
The base field is identified from the multivector determinant, and is written as:
where
As such, Equation (63) is formally equivalent to Equation (62).
This expression, obtained by removing the determinant
1, satisfies the solution:
where
is defined as:
This base field leads to a variant of the Schrödinger equation obtained by taking its derivative with respect to t:
Definition 8 (Spinor-valued Schrödinger Equation).
The above expression is simply the massless Dirac equation in Hamiltonian form. Specifically, the Dirac equation is obtained as follows:
where
is the covariant derivative over all 4 spacetime coordinates.
From Noether’s theorem, it is known that the Dirac equation contains a conserved charge current , which is the Dirac current.
Theorem 4 (Positive-Definite Inner Product). The inner product, defined as is positive definite.
Proof. Let
. The calculation is quite involved. Consequently, we provide a computer-assisted proof in
Annex C. The product
is :
which is positive-definite. □
Consequently, the quantity can be understood as a probability density iff normalized.
Theorem 5 (Equivalence to David Hestenes’ [6] formulation).
where R is a rotor
Proof. Let
. Then
which is a complex number. In polar form,
, which implies iff
, that
. □
We also note that the definition of the Dirac equation recovers David Hestenes’ formulation of the same in the massless case . Posing .
Curved Spacetime:
In curved spacetime, we consider the ADM formalism. We foliate spacetime into hypersurfaces
of constant
t:
The optimization problem Lagrangian remains similar, but the constraint now acquires lapse and shift functions:
where
is the normal gamma
is the trace of the extrinsic curvature
is the 3D spatial covariant derivative on the slice .
The problem is solved in a manner similar to the flat case and leads to the Schrödinger equation:
This is the Hamiltonian form of the massless Dirac equation with covariant derivative expressed with lapse and shift functions and containing a spin and pseudoscalar connection.
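For orientation, the standard ADM line element is reproduced below in one common convention. The symbols $N$ (lapse), $N^i$ (shift) and $h_{ij}$ (induced metric on the slice) are assumptions here and may not match the notation of the paper's own expressions.

```latex
ds^2 = -N^2\,dt^2 + h_{ij}\,\bigl(dx^i + N^i\,dt\bigr)\bigl(dx^j + N^j\,dt\bigr)
```

Setting $N = 1$ and $N^i = 0$ with a flat $h_{ij}$ recovers the flat-spacetime case treated earlier.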
2.2. Dimensional Obstructions
In this section, we explore the dimensional obstructions that arise when attempting to solve the entropy maximization problem for other dimensional configurations. We found that all geometric configurations except the 3+1D case are obstructed. By obstructed, we mean that the solution to the entropy maximization problem,
, does not satisfy the non-negativity requirement of its interpretation as an information density (i.e. this would entail negative probabilities).
Let us now demonstrate the obstructions mentioned above.
Theorem 6 (Ill-defined probabilities). These geometric algebras are isomorphic to direct sums of matrix algebras, rather than a single matrix algebra. Consequently, the determinant operation, as required by the solution form , is ill-defined or inapplicable in this context, making these algebras unsuitable.
Proof. These geometric algebras are classified as follows:
The notion of determinant is ill-defined as we are dealing with a direct sum of matrices instead of a single matrix. □
Theorem 7 (Non-real probabilities). The quantity resulting from the optimization procedure, when evaluated for the matrix representations of the geometric algebras in this category, is either complex-valued or quaternion-valued, making them unsuitable.
Proof. These geometric algebras are classified as follows:
Evaluating the function
derived from the entropy maximization procedure for operators
D associated with these algebras yields values in
(for GAs isomorphic to
) or
(for GAs isomorphic to
or
). Since
w must be real and non-negative, these are obstructed. □
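The classifications used in Theorems 6 and 7 can be looked up from the standard eightfold periodicity of real Clifford algebras. The sketch below assumes the convention stated in the Introduction (p generators squaring to +1, q squaring to −1); the function name and return format are hypothetical, and the function reports only the matrix-algebra type, not which theorem of this section applies.

```python
def classify(p, q):
    """Return (field, matrix_size, is_direct_sum) for GA(p, q).

    field is "R", "C" or "H"; is_direct_sum is True when the algebra is a
    direct sum of two copies of the matrix algebra over that field.
    """
    n = p + q
    # Keyed by (p - q) mod 8: (field, exponent offset, direct sum?).
    table = {
        0: ("R", 0, False),
        1: ("R", 1, True),
        2: ("R", 0, False),
        3: ("C", 1, False),
        4: ("H", 2, False),
        5: ("H", 3, True),
        6: ("H", 2, False),
        7: ("C", 1, False),
    }
    field, offset, is_sum = table[(p - q) % 8]
    size = 2 ** ((n - offset) // 2)
    return field, size, is_sum

for p, q in [(3, 1), (1, 3), (3, 0), (1, 0), (0, 3)]:
    print((p, q), classify(p, q))
```

Under this convention GA(3,1) comes out as a single algebra of real 4 × 4 matrices, while GA(1,3) comes out as 2 × 2 quaternionic matrices, in line with the remarks on these two signatures elsewhere in the text.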
Theorem 8 (Negative probabilities). The even sub-algebra of this dimensional configuration allows for negative probabilities, making it unsuitable.
Proof. This category contains one dimensional configuration:
-
:
Let
, then:
which is valued in
.
In this case the probability can be negative. □
Theorem 9 (Non-definability). The optimization problem is not definable for these dimensional configurations.
Proof. This category contains five dimensional configurations:
- GA(0):
Definition 1 requires 1 time parameter for the Lagrange multiplier, and at least 1 space parameter for the integration measure. This configuration has neither.
- GA(0, 1):
Definition 1 requires 1 time parameter for the Lagrange multiplier, and at least 1 space parameter for the integration measure. This configuration has the time parameter, but lacks a space parameter.
- GA(1, 0):
Definition 1 requires 1 time parameter for the Lagrange multiplier, and at least 1 space parameter for the integration measure. This configuration has the space parameter, but lacks a time parameter.
- GA(2, 0):
Definition 1 requires 1 time parameter for the Lagrange multiplier, and at least 1 space parameter for the integration measure. This configuration has two space parameters, but lacks the time parameter.
- GA(2, 2):
Definition 1 requires 1 time parameter for the Lagrange multiplier, and at least 1 space parameter for the integration measure. This configuration has two space parameters, but has more time parameters than Lagrange multipliers.
□
Conjecture (No observables (6D)). The multivector representation of the norm in 6D restricts observables to the identity.
Argument. In six dimensions and above, the self-product patterns found in Definition 6 collapse. The research by Acus et al. [7] in 6D geometric algebra concludes that the determinant, so far defined through self-products of the multivector, fails to extend into 6D. The crux of the difficulty is evident in the reduced case of a 6D multivector containing only scalar and grade-4 elements:
This equation is not a multivector self-product but a linear sum of two multivector self-products [7].
The full expression is given in the form of a system of 4 equations, which is too long to list in its entirety. A small characteristic part is shown:
From Equation (106), it is possible to see that no observable
can satisfy this equation because the linear combination does not allow one to factor it out of the equation.
Any equality of the above type between
and
is frustrated by the factors
and
, forcing
as the only satisfying observable. Since the obstruction occurs within grade-4, which is part of the even sub-algebra, it is questionable whether a satisfactory theory (with non-trivial observables) is constructible in 6D using our method. □
This conjecture proposes that the multivector representation of the determinant in 6D does not allow for the construction of non-trivial observables, which is a crucial requirement for a relevant quantum formalism. The linear combination of multivector self-products in the 6D expression prevents the factorization of observables, limiting their role to the identity operator.
Conjecture (No observables (above 6D)). The norms beyond 6D are progressively more complex than the 6D case, which is already obstructed.
These theorems and conjectures provide additional insights into the unique role of the unobstructed 3+1D signature in our proposal.
We also note that it is interesting that our proposal is able to rule out GA(1,3) even though, in relativity, the signature of the metric, (+,−,−,−) versus (−,+,+,+), does not influence the physics. However, in geometric algebra, GA(1,3) represents 1 space dimension and 3 time dimensions. Therefore, it is not the signature itself that is ruled out but rather the specific arrangement of 3 time and 1 space dimensions, as this configuration yields quaternion-valued "probabilities" (i.e. and ).
2.3. Gravity + Yang-Mills
So far we have recovered standard results in the form of spinors and the Dirac equation, yielding the Dirac current as the conserved current, and shown that it works (only) in 3+1D. In this section we will investigate extra structures that are made available by the quartic multivector determinant form that are not available with the Dirac theory alone. As these are new structures, they unavoidably contain an element of speculation.
2.3.1. Rationale
The nature of our hypothesis rests on the fact that the optimization problem replaces the Born rule with a quartic form acting on
, thus linking spinors to non-negative scalars. Indeed, the measure
requires a quartic application (
and
) of the evolution operator to produce a time evolution, when compared to the Born rule (which only requires a double application):
The energy density associated with the quartic form is quadratic in
D compared to the Born rule in which it is linear in
D:
Consequently, the equations of motion of the measure
are non-linear (quadratic in D) compared to the equations of motion of the wavefunction or field, which are linear in
D (i.e., the Dirac equation).
We intend to show that this quartic form automatically leads to general relativity (acting on spacetime) and Yang-Mills (acting on internal spaces).
2.3.2. Notation
We will utilize the David Hestenes form of the wavefunction
, however since our form is a non-normalized field, we will change the symbols to:
We recall that the quartic form is as follows:
and that
.
2.3.3. Geometry
Theorem 10 (Metric Tensor Measurement).
Here we utilize the quartic form to define a measurement of the metric tensor:
Proof.
We now note that
anti-commutes with
, which implies that
:
□
2.3.4. Action
Definition 9 (Action Density Measurement). We utilize the quartic form to define a measurement of the action density:
where we used the overline as a notational replacement for the blade 3,4 conjugate, where C and are dimensional constants, where f is a smooth function and where is used to nullify the dimensions of the input to f.
This action density quantifies the dynamics of the self-interaction of due to the quartic form. In this section, we will use the special case, leaving the general case for the next section on Yang-Mills.
Theorem 11 (Dirac Equation). Varying the action yields the Dirac equation as a sufficient (but not necessary) equation of motion.
Proof.
By inspection we can see that the right most term of the numerator is the massless Dirac equation
, which if equal to 0 will satisfy the equation of motion. □
This theorem implies that the action is a generalization of the massless Dirac equation. We will now investigate this generalization in more detail.
Theorem 12 (Quantum Action).
Let us investigate a special case where . Due to its non-linearity, the kinetic energy produces a quantum potential in addition to a kinetic energy term:
The quantum potential herein described is the relativistic version of the quantum potential found in the de Broglie-Bohm reformulation of QM, whereas the quantum kinetics can be understood as the kinetic term of a relativistic diffusion process. When integrated, they define a quantity that we refer to as the quantum action:
Theorem 13 (Equation of Motion).
Varying the quantum action:
as the equation of motion.
Proof.
To proceed further, we are required to integrate the last term by parts:
Then, a second integration by parts yields:
□
To interpret this equation of motion, let us now introduce the surprisal field and associated definitions.
Definition 10 (Surprisal Field).
We define a change of variable:
We call φ the surprisal field.
Definition 11 (Surprisal Equation of Motion).
We note that the change of variable , changes the equation of motion as follows:
which is the Klein-Gordon equation in curved spacetime, applied to the surprisal field.
Definition 12 (Surprisal Conservation).
The following current:
identifies the surprisal current as the conserved current of this action.
Definition 13 (Surprisal Expectation Value).
The surprisal expectation value is merely the entropy H of a region of the manifold:
Interpretation:
In information theory, the surprisal of an event x with probability density is defined as , and the entropy represents its expectation value. As the unit of surprisal is the bit, it represents the quantity of information associated to the event—here, it is conserved by . In contrast, also in information theory, the units of entropy are the bits per symbol—here, it is not conserved. The type of units allows us to intuit why the former is conserved and the latter isn’t.
In our framework, the field replaces —it has most of its properties, but differs critically as follows:
is not a probability density—it lacks a conserved current () and is not normalized—but it is non-negative.
Instead, is interpreted as an information density, encoding spacetime’s local information content.
The surprisal is defined as , which in this theory satisfies the Klein-Gordon equation . This ensures:
Conservation: The current is conserved (), making a conserved charge.
Causal Propagation: Surprisal propagates at light speed, enforcing that bits of information cannot spread superluminally—a core tenet of relativity.
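The information-theoretic definitions invoked in this interpretation can be illustrated in the discrete case. The sketch below assumes base-2 logarithms (so that surprisal is measured in bits) and uses hypothetical probability tables; it does not model the continuum surprisal field itself.

```python
import math

def surprisal_bits(p):
    """Surprisal -log2(p) of an outcome with probability p, in bits."""
    return -math.log2(p)

def entropy_bits(probs):
    """Entropy as the expectation value of the surprisal."""
    return sum(p * surprisal_bits(p) for p in probs if p > 0)

fair_coin = [0.5, 0.5]    # hypothetical distribution
biased_coin = [0.9, 0.1]  # hypothetical distribution

print(surprisal_bits(0.25))
print(entropy_bits(fair_coin), entropy_bits(biased_coin))
```

Rarer outcomes carry more surprisal, while the entropy weights each surprisal by how often the outcome occurs, which is why the two quantities behave differently.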
Before we continue with the interpretation, let us introduce another theorem, where we now assume .
Theorem 14 (Einstein-Hilbert Action Density).
where .
Proof.
Since
, we note that
. Finally,
.
□
We note some of the equations of motion:
Varying with respect to yields the EFE with the Einstein tensor from , and is sourced by the quantum action variation yielding the stress-energy tensor.
Varying with respect to gives equations of motion that define the flow of information quantity in spacetime.
Interpretation (cont’d):
Thus, while quantum mechanics relies on probabilistic amplitudes , our formulation recasts general relativity as a deterministic theory of information dynamics, where spacetime geometry and surprisal flux are dual aspects of and . The distribution of surprisal in spacetime dictates its geometric structure, which in turn dictates how it propagates. General relativity is to information, what quantum mechanics is to probability.
Revisiting General Relativity with this perspective shows that the natural constraint is sufficient to entail the theory through the principle of entropy maximization—in this formulation, the speed of light as a limit on the propagation of the quantity of information (via the surprisal obeying the Klein-Gordon equation), and even the Einstein field equations are not fundamental, but emerge as the solution to an optimization problem on entropy.
2.3.5. Yang-Mills
In QFT, the standard method to identify a local gauge symmetry is to start with a global symmetry of the action or probability measure and then localize it by introducing gauge fields. For example, the U(1) gauge symmetry arises naturally in electromagnetism as the group preserving the probability density (Born rule) under local phase transformations. However, the non-Abelian SU(2) and SU(3) gauge symmetries of the Standard Model are not derived from first principles in this way; rather, their inclusion is empirically motivated by particle physics experiments.
Improvement via Multivector Determinant Formulation: Our framework demonstrates that Yang-Mills theories emerge naturally from constraints on the wavefunction’s probability measure and Dirac current. Specifically:
-
Probability Measure: The quartic form
enforces rotor invariance
, restricting transformations to those satisfying
, for some rotor
R of a geometric algebra of
n dimensions:
Solutions to are rotor transformations generated by bivectors in the Clifford algebra. For a -dimensional algebra, these generate , whose subgroups include .
Internal Space: For the gauge transformation to represent a purely internal symmetry that does not mix spacetime components defined by the basis (specifically preserving the time direction ), the generators must commute with , i.e., . This ensures the transformation acts orthogonally to the spacetime structure.
Spacetime: The origin of the multivector determinant in STA parametrizes the resulting internal space in spacetime.
These constraints limit the allowable symmetry to groups generated by bivector exponentials (which are compact Lie groups), and acting on the internal spaces of spacetime. Since , this framework inherently includes the Standard Model within its landscape but also generalizes to larger symmetries such as those found in condensed matter systems with emergent symmetries.
Total wavefunction:
The total wavefunction is a tensor product of spacetime (STA) and internal space components:
For the Standard Model
:
Covariant Derivative (Ex. SU(n)): The covariant derivative represents the set of all transformations that can be performed on the wavefunction. For SU(n) it is:
Covariant Derivative (Ex. Standard Model):
Now taking the Standard Model as an example, the covariant derivative in the language of non-commutative geometry (see A. H. Chamseddine and Alain Connes [8]) incorporates spacetime curvature, gauge fields and Higgs fields:
where:
: Generators of .
: , , and gauge fields.
: Higgs field (SU(2) doublet).
It acts on the left/right split of the wavefunction .
Action:
We recall that the general form of the action is:
Expanding f via a power series expansion yields the Standard Model and gravity field strengths, from the crucial term
of Equation (158). The invariants recovered are:
-
Leading Terms:
- (a)
Cosmological constant: .
- (b)
Einstein-Hilbert term: .
-
Yang-Mills and Higgs:
- (a)
Gauge kinetic terms: .
- (b)
Higgs kinetic and potential terms:
Yukawa Couplings (from matter fields):
Key Notes:
Higher-Order Terms: Higher order field strength terms appear but are suppressed by , making them negligible at low energies.
Uniqueness: The Standard Model is not uniquely selected by the optimization problem but resides within the landscape of allowed Yang-Mills theories.
To show this explicitly, let us investigate the crucial term (Equation (158)). When the covariant derivative has additional gauge connections, the term is:
The first power of the series expansion would reduce to
as before. But the second order series expansion would contain a term of this type:
which includes
and
.
For the Yukawa couplings, their presence is found from the Dirac bilinear
which is sufficient (but not necessary) to satisfy the equation of motion. Specifically, the cross-terms of
D combine to produce the terms:
where
The Higgs field comes from second order expansion:
and third order expansion:
Finally, the Higgs kinetic term is found in the cross terms of with , as .
2.3.6. Yang-Mills Axioms as Theorems
In this section, we show that all 5 axioms of Yang-Mills theory can be obtained as theorems. First, let us list the axioms:
Compact Gauge Group: The symmetry group is a compact Lie group G.
Local Gauge Invariance: Fields transform under spacetime-dependent (local) group elements .
Gauge Connections: Gauge fields are introduced as connections in the covariant derivative .
Field Strength: The curvature defines the dynamics.
Yang-Mills Action: The action depends on , e.g., .
Now for the theorems.
Theorem 15 (Compact Gauge Group). The allowed symmetries form a compact Lie group .
Proof.
Constraint: implies invariance of arbitrary n-dimensional rotors: .
Structure of Solutions: Rotor transformations in finite-dimensional Clifford algebras are generated by bivectors. These generate Spin() and its subgroups, which are compact Lie groups.
Thus, the gauge group G is inherently compact and derived from the algebra structure. □
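The statement that rotors are generated by bivectors can be illustrated with a single example. The sketch below assumes a hypothetical real 4 × 4 representation of four generators (the first squaring to −1, the rest to +1) and one bivector built from two positive-signature generators; it checks that the rotor times its reverse is the identity and that the rotor rotates one generator into the other. It does not address the general n-dimensional internal space.

```python
import math

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(terms):
    """Sum of (coefficient, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)] for i in range(n)]

def max_abs_diff(a, b):
    """Largest entry-wise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2, S1, S3, EPS = [[1, 0], [0, 1]], [[0, 1], [1, 0]], [[1, 0], [0, -1]], [[0, 1], [-1, 0]]
GAMMA = [kron(EPS, I2), kron(S1, I2), kron(S3, S1), kron(S3, S3)]  # hypothetical choice
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
B12 = matmul(GAMMA[1], GAMMA[2])  # bivector that squares to -1

def rotor(theta):
    """R = exp(-theta/2 * B12) = cos(theta/2) - sin(theta/2) * B12."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return lincomb([(c, IDENTITY), (-s, B12)])

def reversed_rotor(theta):
    """Reverse of the rotor: cos(theta/2) + sin(theta/2) * B12."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return lincomb([(c, IDENTITY), (s, B12)])

theta = 0.7  # hypothetical rotation angle
R, R_rev = rotor(theta), reversed_rotor(theta)
unit_error = max_abs_diff(matmul(R, R_rev), IDENTITY)
rotated = matmul(matmul(R, GAMMA[1]), R_rev)
expected = lincomb([(math.cos(theta), GAMMA[1]), (math.sin(theta), GAMMA[2])])
rotation_error = max_abs_diff(rotated, expected)
print(unit_error, rotation_error)
```

Both errors should be at the level of floating-point rounding, showing a one-parameter family of unit rotors generated by a single bivector.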
Theorem 16 (Local Gauge Invariance). The theory is invariant under spacetime-dependent .
Proof.
Wavefunction Transformation: , where (exponentials of spacetime-dependent bivectors).
Probability Measure: .
Dirac Current: , since .
□
Theorem 17 (Gauge Connections). The covariant derivative emerges to maintain invariance under local .
Proof.
Minimal Coupling: To preserve , the derivative must transform as , where .
Gauge Field Definition: Let , then:
Clifford Algebra Embedding: The are bivector fields in , ensuring (the Lie algebra of G).
□
Theorem 18 (Field Strength). The commutator defines the field strength.
Proof. Kinetic Energy: As we have shown in Equation (158) and Equation (172), the action density expands to include the field strength tensor:
where
is the field strength. □
Theorem 19 (Yang-Mills Action). The action density includes the kinetic term .
Proof. Heat Kernel Expansion: As shown in Equation (172), the action expands into the field strength:
□
Revisiting Yang-Mills with this perspective shows that the natural constraint is sufficient to entail the theory through the principle of entropy maximization—in this formulation, Yang-Mills axioms 1, 2, 3, 4, and 5 are not fundamental, but the solution to the optimization problem.
3. Discussion
When asked to define what a physical theory is, an informal answer might be that it is a set of equations that applies to all experiments realizable within a domain, with nature as a whole being the most general domain. While physicists have expressed these theories through sets of axioms, we propose a more direct approach—mathematically realizing the fundamental definition itself. This definition is realized as a constrained optimization problem (Axiom 1 and Definition 1) that can be solved directly (Theorem 1). The solution to this optimization problem yields precisely those structures that realize the physical theory over said domain. Succinctly, physics is the solution to:
The relative Shannon entropy represents the basic structure of any experiment, quantifying the informational difference between its initial preparation and its final measurement.
The natural constraint is chosen to be the most general structure that admits a solution to this optimization problem. This generality follows from key mathematical requirements. The constraint must involve quantities that form an algebra, as the solution requires taking exponentials:
$$\exp X = 1 + X + \frac{1}{2!}X^2 + \frac{1}{3!}X^3 + \dots$$
which involves addition, powers, and scalar multiplication of X. The use of the trace operation further necessitates that X must be represented by square $n \times n$ matrices. Thus Axiom 1 involves $n \times n$ matrices (and operators representable as matrices):
The trace is utilized because the constraint must be a scalar for use in the Lagrange multiplier equation. Finally, the operator is selected to be the set of all transformations that can be performed on the base field within the specific geometric configuration.
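The interplay between exponentials, traces and determinants described here rests on the standard identity det(exp X) = exp(tr X), which is presumably how a determinant comes to accompany a trace constraint in the solution. The sketch below checks the identity for a hypothetical 2 × 2 real matrix, using a truncated power series for the exponential; the matrix and the truncation length are assumptions.

```python
import math

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(x, terms=30):
    """Matrix exponential via the truncated series sum_k X^k / k!."""
    n = len(x)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes X^k / k! by multiplying the previous term by X / k.
        term = [[v / k for v in row] for row in matmul(term, x)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))

X = [[0.3, -1.2], [0.7, 0.1]]  # hypothetical matrix
lhs = det2(mat_exp(X))
rhs = math.exp(trace(X))
print(lhs, rhs)
```

The series uses only addition, powers and scalar multiplication of X, which is the algebraic structure the text identifies as necessary.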
These mathematical requirements demonstrate that the natural constraint, as it admits the most permissive mathematical structure required to solve an arbitrary entropy maximization problem, can be understood as the most general extension to the standard entropy maximization problem of statistical mechanics.
Thus, having established both the mathematical structure and its generality, we can understand how this minimal ontology operates. Since our formulation keeps the structure of experiments completely general, our optimization considers all possible theories for that structure, and the constraint is the most general constraint possible for that structure, the resulting optimal physical theory applies, by construction, to all realizable experiments within its domain.
This ontology is both operational, being grounded in the basic structure of experiments rather than abstract entities, and constructive, showing how physical laws emerge from optimization over all possible predictive theories subject to the natural constraint. This represents a significant philosophical shift from traditional physical ontologies where laws are typically taken as primitive.
The next step in our derivation is to represent the determinant of the matrices through a self-product of multivectors involving various conjugate structures. By examining the various dimensional configurations of geometric algebras, we find that GA(3,1), representing real $4 \times 4$ matrices, admits a sub-algebra whose determinant is non-negative for its invertible members. All other dimensional configurations fail to admit such a non-negative structure.
The solution reveals that the 3+1D case harbors a new type of field amplitude structure analogous to complex amplitudes, one that exhibits the characteristic elements of a quantum mechanical theory. Instead of complex-valued amplitudes, we have amplitudes valued, as spinors, in the invertible subset of the even sub-algebra of GA(3,1). When normalized, this amplitude is identical to David Hestenes’ wavefunction, but comes with an extended Born rule represented by the determinant. The quartic structure of this rule automatically incorporates gravity via the connection and local gauge theories as Yang-Mills theories. Specifically, the powers of the Dirac operator, automatically generated by the Lagrangian, contain the invariants of gravity and of the Yang-Mills theory, which are made explicit via a heat kernel or power series expansion, along with the matter fields quantifying the system’s information density via surprisal and limiting its propagation speed.
3.1. Proposed Interpretation of QM
An experiment begins with a known initial preparation , evolves under a constraint (Axiom 1) and ends with a final measurement . By treating the experiment as the fundamental ontic entity, we resolve a redundancy inherent in traditional physical theories: Specifically, physics is not a set of laws that are simultaneously axiomatic and validated by experiments (i.e., a redundancy—that which is validated by something else is not axiomatic) but an optimal interpolation device connecting to under the constraint of nature. The experiment is fundamental, but the physical laws that are derived from it are not.
3.1.1. Demystifying the Measurement Problem
If we accept that our derivation demonstrates that QM is the optimal interpolation device that connects to , under the constraint of nature, and we recognize that a measure-to-measure interpolation ( to ) is different than a measure-to-element interpolation ( to some )—the latter would be required for a ’collapse’ to occur—then, we must conclude that the final sampling (from to ) exists outside of QM (defined only from to ).
Thus, if QM cannot account for the collapse, what can? Foundational to our framework is the notion of the experiment. This notion supersedes QM (the latter being its derived product) and is sufficient to demystify the collapse. In the introduction, we have stated that Definition 1 represents the set of all experiments realizable within a domain. In practice, however, we must perform each experiment atomically—the set of all realizable experiments is derived from many such experiments.
An atomic experiment will be defined as a pair of elements of an ensemble, where the first element of the pair is the initial measurement outcome, and the second element is the final measurement outcome. As an example, let us consider an experimental run comprising n atomic experiments over a two-state ensemble:
Assuming the law of large numbers, one can construct a representative probability measure and . Specifically, is obtained by counting the total occurrence of in the first element of the pairs and dividing by n, and by counting the total occurrence in the second element of the pairs and also dividing by n. This gives us the starting and ending points to define the set of all realizable experiments using the probability measure representation and .
We can show that the map from experimental runs to probability measure representation is many-to-one, making it non-invertible. Indeed, consider two experimental runs:
Since both of these runs, although different, produce the same initial measure and the same final measure, the map must in general be non-invertible.
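The counting argument above can be illustrated with a minimal sketch. The outcome labels `"a"` and `"b"`, the two runs, and the helper `empirical_measures` are hypothetical choices made for illustration; they are not the runs used in the text.

```python
from collections import Counter
from fractions import Fraction


def empirical_measures(run):
    """Map a run of (initial, final) outcome pairs to its two empirical measures."""
    n = len(run)
    initial_counts = Counter(first for first, _ in run)
    final_counts = Counter(second for _, second in run)
    initial = {k: Fraction(v, n) for k, v in initial_counts.items()}
    final = {k: Fraction(v, n) for k, v in final_counts.items()}
    return initial, final


# Two hypothetical runs of n = 4 atomic experiments over a two-state ensemble.
run_1 = [("a", "a"), ("a", "b"), ("b", "a"), ("b", "b")]
run_2 = [("a", "b"), ("a", "b"), ("b", "a"), ("b", "a")]

# The runs differ even as unordered collections of pairs...
print(Counter(run_1) != Counter(run_2))
# ...yet they are sent to the same pair of measures, so the map is many-to-one.
print(empirical_measures(run_1) == empirical_measures(run_2))
```

Because distinct runs collapse onto one pair of measures, no function can recover the run from the measures alone, which is the sense of non-invertibility used here.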
From this, we can deduce that the measurement problem is an artifact of idealized statistical inference. Specifically, claiming a probability measure representation from the law of large numbers allows us to discard the notion of atomic experiments, yielding a tractable but imperfect representation of reality. The measurement collapse problem is then an attempt to make this representation perfect again by inverting the map (i.e., to express reality in terms of atomic experiments rather than probability measures), but failing to do so because the map is non-invertible.
3.1.2. Dissolving the Measurement Problem
To dissolve the measurement problem, it is important to understand that our approach reframes the preparation of quantum states as an initial measurement—that is, the initial preparation is , not . Then fundamental physical evolution is understood to be in terms of atomic experiments mapping initial measurement outcome to final measurement outcome. At this fundamental level, the measurement problem is entirely dissolved. This operational perspective aligns with laboratory practice but challenges the standard formulation, which takes as its initial preparation instead of .
Core Argument:
We propose that a well-defined experiment begin with a measurement outcome , not an abstract quantum state .
- Example: Preparing $|+\rangle$ requires:
- (a) Measure systems to collapse to $|0\rangle$ or $|1\rangle$.
- (b) Discard all systems in state $|1\rangle$.
- (c) Apply a Hadamard gate H to $|0\rangle$.
- (d) The preparation is complete.
Neglecting the initial measurement (a) implies that systems of unknown states are sent into the Hadamard gate; the resulting experiment is ill-defined.
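The preparation steps (a) to (c) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the author's procedure: the retained outcome is assumed to be $|0\rangle$, amplitudes are taken to be real for simplicity, and the names `measure`, `prepare_plus`, and the randomly generated incoming states are hypothetical.

```python
import math
import random

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]  # Hadamard gate in the computational basis


def apply(gate, state):
    """Apply a 2x2 gate to a state given as amplitudes [<0|psi>, <1|psi>]."""
    return [sum(gate[i][j] * state[j] for j in range(2)) for i in range(2)]


def measure(state, rng):
    """Sample a computational-basis outcome, 0 or 1, with Born-rule weights."""
    return 0 if rng.random() < abs(state[0]) ** 2 else 1


def prepare_plus(incoming, rng):
    """Steps (a)-(c): measure, keep only outcome 0, then apply H to |0>."""
    prepared = []
    for state in incoming:
        if measure(state, rng) == 0:               # (a) measure, (b) discard outcome 1
            prepared.append(apply(H, [1.0, 0.0]))  # (c) H acts on the verified |0>
    return prepared


rng = random.Random(0)
# Hypothetical incoming systems whose states are unknown to the experimenter.
incoming = []
for _ in range(1000):
    theta = rng.uniform(0.0, math.pi)
    incoming.append([math.cos(theta), math.sin(theta)])

plus_states = prepare_plus(incoming, rng)
print(len(plus_states), plus_states[0])
```

Every retained system ends in the same state regardless of what entered, because the gate only ever acts on a verified outcome. Removing the `measure` call would feed the unknown states directly into `H`, which is the ill-defined case described above.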
Challenges and Solutions:
- Objection 1: Preparation Without Collapse
- (a) Issue: Traditional QM superficially appears to allow preparing a state without collapsing it (e.g., via unitary gates, cooling, etc.).
- (b) Response: In practice, all preparations are validated by measurement (or an equivalent).
- (c) Example:
- i. Cooling various qubits to $|0\rangle$ is non-invertible (one cannot return to the initial state because of dissipative effects). The end result is mathematically equivalent to a measurement of $|0\rangle$ or $|1\rangle$ followed by a discard of $|1\rangle$.
- ii. Creating $|+\rangle$ requires assuming the initial $|0\rangle$, validated by prior conditions.
- Objection 2: Loss of Quantum Coherence
- (a) Issue: If preparation starts with a measurement, how do we account for coherence (e.g., interference)?
- (b) Response: Coherence emerges operationally.
- (c) Example:
- i. Measure systems to collapse to $|0\rangle$ or $|1\rangle$.
- ii. Discard all systems in state $|1\rangle$.
- iii. Apply H to many initial $|0\rangle$-verified states.
- iv. Aggregate final measurements show interference patterns, even though individual experiments start with collapsed states.
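- Objection 3: Entanglement and Nonlocality
- (a) Issue: Entangled states require joint preparation of superpositions.
- (b) Response: Entanglement is preparable from an initial measurement like any other state.
- (c) Example:
- i. Measure systems to collapse to $|00\rangle$, $|01\rangle$, $|10\rangle$, or $|11\rangle$.
- ii. Discard all systems in state $|01\rangle$, $|10\rangle$, and $|11\rangle$.
- iii. Apply a Hadamard gate to the first qubit: $(H \otimes I)\,|00\rangle = \frac{1}{\sqrt{2}}\left(|00\rangle + |10\rangle\right)$.
- iv. Apply a CNOT gate (with first qubit as control, second as target): $\mathrm{CNOT}\,\frac{1}{\sqrt{2}}\left(|00\rangle + |10\rangle\right) = \frac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right)$.
The final state is an entangled state; specifically, it is one of the Bell states (sometimes denoted as $|\Phi^+\rangle$).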
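The gate sequences in the examples for Objections 2 and 3 can be checked numerically with a short, dependency-free sketch. It assumes the retained outcomes are $|0\rangle$ and $|00\rangle$, orders the two-qubit basis as $|00\rangle, |01\rangle, |10\rangle, |11\rangle$, and uses hypothetical helper names (`kron`, `matvec`).

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]   # Hadamard gate
I2 = [[1, 0], [0, 1]]   # single-qubit identity
CNOT = [[1, 0, 0, 0],   # first qubit is the control, second the target
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matvec(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


# Objection 3: start from the measurement-verified |00>.
ket00 = [1.0, 0.0, 0.0, 0.0]
after_h = matvec(kron(H, I2), ket00)  # Hadamard on the first qubit
bell = matvec(CNOT, after_h)          # entangling step
probabilities = [amp ** 2 for amp in bell]
print(bell)
print(probabilities)

# Objection 2: a second Hadamard recombines the two branches of H|0>,
# so the amplitude for outcome 1 cancels (interference).
ket0 = [1.0, 0.0]
back_to_zero = matvec(H, matvec(H, ket0))
print(back_to_zero)
```

The outcomes 01 and 10 carry zero probability in the final two-qubit state, which is the correlation characteristic of this Bell state, while the single-qubit check shows interference arising from a state that began as a collapsed outcome.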
In all cases, neglecting the initial measurement results in systems of unknown state entering the experiment and making it ill-defined. An ill-defined experiment is still potentially insightful but not sufficient to uniquely entail QM from entropy optimization; we may call an ill-defined experiment an observation (see footnotes 2, 3, and 4).
The complete picture is that QM is an optimal interpolation device derived from a limiting case of atomic experiments mapping initial measurements to final measurements. The measurement problem is entirely dissolved at the level of atomic experiments, but emerges in QM proper due to the non-invertibility of the limiting process.
4. Conclusions
E.T. Jaynes fundamentally reoriented statistical mechanics by recasting it as a problem of inference rather than mechanics. His approach revealed that the equations of thermodynamics are not arbitrary physical laws but necessary consequences of maximizing entropy subject to constraints. This work extends Jaynes’ inferential paradigm to address a more fundamental question: what is a physical theory itself?
A physical theory, at its essence, is a set of equations that applies to all experiments realizable within a domain. While this definition is informal, our contribution lies in making this concept mathematical. By formulating it as an optimization problem—minimizing the relative entropy of measurement outcomes subject to the natural constraint—we transform an abstract definition into a precise, solvable mathematical problem.
This approach represents a profound methodological shift. Rather than constructing physical theories through trial-and-error enumerations of axioms, we derive them as necessary solutions to a well-defined optimization problem. Physics thus emerges not as a collection of independently discovered laws but as the unique optimal interpolation device between arbitrary experimental preparation and measurement under the constraint of nature.
The power of this formulation lies in its generality: by varying only the algebraic structure of the constraint, we recover established physical theories as special cases of the same optimization principle. Jaynes showed that statistical inference with minimal assumptions yields thermodynamics; we suggest that this same principle, properly generalized, may yield the foundation of all of physics.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Data Availability Statement
No datasets were generated or analyzed during the current study. During the preparation of this manuscript, we utilized a Large Language Model (LLM) for assistance with spelling and grammar corrections, as well as for minor improvements to the text to enhance clarity and readability. This AI tool did not contribute to the conceptual development of the work, data analysis, interpretation of results, or the decision-making process in the research. Its use was limited to language editing and minor textual enhancements to ensure the manuscript met the required linguistic standards.
Conflicts of Interest
The author declares that he has no competing financial or non-financial interests that are directly or indirectly related to the work submitted for publication.
Appendix A. SM
Here, we solve the Lagrange multiplier equation of SM.
We solve the maximization problem as follows:
The partition function is obtained as follows:
Finally, the probability measure is:
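For reference, the three steps named in this appendix can be sketched as follows. The notation is assumed rather than taken from the main text: $\rho(q)$ is the probability measure, $E(q)$ the energy of state $q$, $\overline{E}$ the fixed average energy, $\lambda$ and $\beta$ the two Lagrange multipliers, and units are chosen so that $k_B = 1$.

```latex
% Lagrange multiplier equation (assumed notation, k_B = 1)
\mathcal{L} = -\sum_q \rho(q)\ln\rho(q)
  + \lambda\Big(1 - \sum_q \rho(q)\Big)
  + \beta\Big(\overline{E} - \sum_q \rho(q)E(q)\Big)

% Maximization: set the derivative with respect to each \rho(q) to zero
\frac{\partial\mathcal{L}}{\partial\rho(q)}
  = -\ln\rho(q) - 1 - \lambda - \beta E(q) = 0
\quad\Longrightarrow\quad
\rho(q) = e^{-1-\lambda}\, e^{-\beta E(q)}

% Partition function: impose the normalization constraint
\sum_q \rho(q) = 1
\quad\Longrightarrow\quad
e^{1+\lambda} = \sum_q e^{-\beta E(q)} =: Z(\beta)

% Probability measure (Gibbs measure)
\rho(q) = \frac{1}{Z(\beta)}\, e^{-\beta E(q)}
```

The multiplier $\lambda$ is thus eliminated by normalization, while $\beta$ remains fixed implicitly by the average energy constraint.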
Appendix B. SageMath Program Showing $\lfloor u^\ddagger u \rfloor_{3,4}\, u^\ddagger u = \det \phi(u)$
from sage.algebras.clifford_algebra import CliffordAlgebra
from sage.quadratic_forms.quadratic_form import QuadraticForm
from sage.symbolic.ring import SR
from sage.matrix.constructor import Matrix
from sage.all import var  # only needed outside an interactive Sage session
# Define the quadratic form for GA(3,1) over the Symbolic Ring
# (upper-triangular coefficients: e0^2 = -1, e1^2 = e2^2 = e3^2 = +1)
Q = QuadraticForm(SR, 4, [-1, 0, 0, 0, 1, 0, 0, 1, 0, 1])
# Initialize the GA(3,1) algebra over the Symbolic Ring
algebra = CliffordAlgebra(Q)
# Define the basis vectors
e0, e1, e2, e3 = algebra.gens()
# Define the scalar variables for each basis element
a = var('a')
t, x, y, z = var('t x y z')
f01, f02, f03, f12, f23, f13 = var('f01 f02 f03 f12 f23 f13')
v, w, q, p = var('v w q p')
b = var('b')
# Create a general multivector
udegree0=a
udegree1=t*e0+x*e1+y*e2+z*e3
udegree2=f01*e0*e1+f02*e0*e2+f03*e0*e3+f12*e1*e2+f13*e1*e3+f23*e2*e3
udegree3=v*e0*e1*e2+w*e0*e1*e3+q*e0*e2*e3+p*e1*e2*e3
udegree4=b*e0*e1*e2*e3
u=udegree0+udegree1+udegree2+udegree3+udegree4
# Self-product of u with its Clifford conjugate
u2 = u.clifford_conjugate()*u
# Split the product into its grade-0 to grade-4 parts
u2degree0 = sum(x for x in u2.terms() if x.degree() == 0)
u2degree1 = sum(x for x in u2.terms() if x.degree() == 1)
u2degree2 = sum(x for x in u2.terms() if x.degree() == 2)
u2degree3 = sum(x for x in u2.terms() if x.degree() == 3)
u2degree4 = sum(x for x in u2.terms() if x.degree() == 4)
# Blade conjugate: flip the sign of the grade-3 and grade-4 parts
u2conj34 = u2degree0+u2degree1+u2degree2-u2degree3-u2degree4
# 4x4 identity matrix
I = Matrix(SR, [[1, 0, 0, 0],
                [0, 1, 0, 0],
                [0, 0, 1, 0],
                [0, 0, 0, 1]])
# MAJORANA MATRICES (real representation: y0^2 = -I, y1^2 = y2^2 = y3^2 = +I)
y0 = Matrix(SR, [[0, 0, 0, 1],
                 [0, 0, -1, 0],
                 [0, 1, 0, 0],
                 [-1, 0, 0, 0]])
y1 = Matrix(SR, [[0, -1, 0, 0],
                 [-1, 0, 0, 0],
                 [0, 0, 0, -1],
                 [0, 0, -1, 0]])
y2 = Matrix(SR, [[0, 0, 0, 1],
                 [0, 0, -1, 0],
                 [0, -1, 0, 0],
                 [1, 0, 0, 0]])
y3 = Matrix(SR, [[-1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, -1, 0],
                 [0, 0, 0, 1]])
# Matrix representation of the same general multivector, grade by grade
mdegree0 = a*I
mdegree1 = t*y0+x*y1+y*y2+z*y3
mdegree2 = f01*y0*y1+f02*y0*y2+f03*y0*y3+f12*y1*y2+f13*y1*y3+f23*y2*y3
mdegree3 = v*y0*y1*y2+w*y0*y1*y3+q*y0*y2*y3+p*y1*y2*y3
mdegree4 = b*y0*y1*y2*y3
m=mdegree0+mdegree1+mdegree2+mdegree3+mdegree4
print(u2conj34*u2 == m.det())
The program outputs True, showing, by computer-assisted symbolic manipulation, that the determinant of the real Majorana representation of a multivector u is equal to the double-product $\lfloor u^\ddagger u \rfloor_{3,4}\, u^\ddagger u$.
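As an independent sanity check, which is not part of the original program, the plain-Python sketch below verifies that the four Majorana matrices listed above satisfy the defining relations of GA(3,1), namely $y_i y_j + y_j y_i = 2\eta_{ij} I$ with $\eta = \mathrm{diag}(-1, 1, 1, 1)$. The helper names (`matmul`, `add`, `scale`, `clifford_relations_hold`) are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    """Scalar multiple of a matrix."""
    return [[c * x for x in row] for row in a]


I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
y0 = [[0, 0, 0, 1], [0, 0, -1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]
y1 = [[0, -1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, -1], [0, 0, -1, 0]]
y2 = [[0, 0, 0, 1], [0, 0, -1, 0], [0, -1, 0, 0], [1, 0, 0, 0]]
y3 = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]
GAMMAS = [y0, y1, y2, y3]
METRIC = [-1, 1, 1, 1]  # squares of e0, e1, e2, e3 in the quadratic form Q


def clifford_relations_hold():
    """Check y_i y_j + y_j y_i == 2 * eta_ij * I for all index pairs."""
    for i in range(4):
        for j in range(4):
            anticommutator = add(matmul(GAMMAS[i], GAMMAS[j]),
                                 matmul(GAMMAS[j], GAMMAS[i]))
            expected = scale(2 * METRIC[i] if i == j else 0, I4)
            if anticommutator != expected:
                return False
    return True


print(clifford_relations_hold())
```

Since the relations hold, the matrices generate a representation of the algebra with the same signature as the quadratic form `Q`, which is what makes the comparison between the multivector product and `m.det()` meaningful.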
Appendix C. SageMath Program Showing $\det \phi(u)$ Is Positive-Definite for Even Multivectors of GA(3,1)
reset()
from sage.algebras.clifford_algebra import CliffordAlgebra
from sage.quadratic_forms.quadratic_form import QuadraticForm
from sage.symbolic.ring import SR
from sage.matrix.constructor import Matrix
# Define the quadratic form for GA(3,1) over the Symbolic Ring
Q = QuadraticForm(SR, 4, [-1, 0, 0, 0, 1, 0, 0, 1, 0, 1])
# Initialize the GA(3,1) algebra over the Symbolic Ring
algebra = CliffordAlgebra(Q)
# Define the basis vectors
e0, e1, e2, e3 = algebra.gens()
# Define the scalar variables for each basis element
a = var('a')
f01, f02, f03, f12, f23, f13 = var('f01 f02 f03 f12 f23 f13')
b = var('b')
# Create a general even multivector (grades 0, 2, and 4 only)
udegree0=a
udegree2=f01*e0*e1+f02*e0*e2+f03*e0*e3+f12*e1*e2+f13*e1*e3+f23*e2*e3
udegree4=b*e0*e1*e2*e3
M=udegree0+udegree2+udegree4
# Sandwich e0 between the Clifford conjugate of M and M
print(M.clifford_conjugate()*e0*M)
The program outputs:
(a^2 + b^2 + f01^2 + f02^2 + f03^2 + f12^2 + f13^2 + f23^2)*e0
+ (-2*a*f01 + 2*f02*f12 + 2*f03*f13 - 2*b*f23)*e1
+ (-2*a*f02 - 2*f01*f12 + 2*b*f13 + 2*f03*f23)*e2
+ (-2*a*f03 - 2*b*f12 - 2*f01*f13 - 2*f02*f23)*e3
Whose inner product with yields .
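A complementary numerical check, offered as a sketch and not as part of the original program, samples even multivectors with random integer coefficients, builds their Majorana matrices as in Appendix B, and confirms that the determinant never comes out negative. The helper names, the sampling range, and the seed are hypothetical choices.

```python
import itertools
import random


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(m):
    """Exact determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry != 0:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
Y = [
    [[0, 0, 0, 1], [0, 0, -1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]],    # y0
    [[0, -1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, -1], [0, 0, -1, 0]],  # y1
    [[0, 0, 0, 1], [0, 0, -1, 0], [0, -1, 0, 0], [1, 0, 0, 0]],    # y2
    [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]],    # y3
]
PSEUDOSCALAR = matmul(matmul(Y[0], Y[1]), matmul(Y[2], Y[3]))


def even_multivector_matrix(a, f, b):
    """Matrix of a + sum_{i<j} f[(i, j)] y_i y_j + b y0 y1 y2 y3."""
    m = [[a * I4[r][c] + b * PSEUDOSCALAR[r][c] for c in range(4)]
         for r in range(4)]
    for (i, j), coeff in f.items():
        bivector = matmul(Y[i], Y[j])
        m = [[m[r][c] + coeff * bivector[r][c] for c in range(4)]
             for r in range(4)]
    return m


rng = random.Random(1)
pairs = list(itertools.combinations(range(4), 2))
dets = []
for _ in range(200):
    a, b = rng.randint(-5, 5), rng.randint(-5, 5)
    f = {pair: rng.randint(-5, 5) for pair in pairs}
    dets.append(det(even_multivector_matrix(a, f, b)))

print(min(dets) >= 0)
```

Integer coefficients keep the arithmetic exact, so a negative value could not be attributed to rounding. A zero determinant can still occur, for example for $1 + y_0 y_1$, which corresponds to a non-invertible member of the even sub-algebra.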
References
- Jaynes, E.T. Information theory and statistical mechanics. Physical Review 1957, 106, 620.
- Jaynes, E.T. Information theory and statistical mechanics. II. Physical Review 1957, 108, 171.
- Dirac, P.A.M. The Principles of Quantum Mechanics; Number 27, Oxford University Press, 1981.
- Von Neumann, J. Mathematical Foundations of Quantum Mechanics: New Edition; Vol. 53, Princeton University Press, 2018.
- Lundholm, D. Geometric (Clifford) algebra and its applications. arXiv preprint math/0605280.
- Hestenes, D. Spacetime physics with geometric algebra (Page 6). American Journal of Physics 2003, 71, 691–714.
- Acus, A.; Dargys, A. Inverse of multivector: Beyond p+q=5 threshold. arXiv preprint arXiv:1712.05204, 2017.
- Chamseddine, A.H.; Connes, A. The spectral action principle. Communications in Mathematical Physics 1997, 186, 731–750.
1. The removal of the determinant implies an additional term , where . Furthermore, since implies , it is simply a gauge choice. We thus choose .
2. The author suggests that observations, so defined, may constitute a broader conceptual category that could entail a richer landscape of effective theories beyond what experiments alone feasibly entail. Observations allow us to study parts of the universe whose complexity far exceeds our ability to precisely connect an initial preparation to a final measurement via unitary transformations in the laboratory. Accounting for this observed complexity suggests the development of effective theories across various domains, including biology, chemistry, complex systems theory, emergent phenomena, and cosmology. This extension of the optimization problem to observations, however, falls outside the scope of the current paper.
3. As statistical mechanics’ optimization problem does not reference an initial preparation, it could be argued, from these definitions, that it is based on observations and not on experiments.
4. This definition should not be taken as pejorative of observations.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).