2.1. Quantum Mechanics
In statistical mechanics (SM), the central observation is that energy measurements of a thermally equilibrated system tend to cluster around a fixed average value (Equation 1). In contrast, quantum mechanics (QM) is characterized by the presence of interference effects in measurement outcomes. To capture these features within an entropy maximization framework, we introduce the following special case of the universal physical constraint:
Definition 2 (U(1) Generating Constraint). We reduce the universal physical constraint to the generator of the U(1) group. Specifically, we replace
Here, are scalar values (e.g., energy levels), are the probabilities of outcomes, and the matrices generate the U(1) group.
The general solution of the maximization problem reduces as follows
Though initially unfamiliar, this form effectively establishes a comprehensive formulation of quantum mechanics, as we will demonstrate.
To align our results with conventional quantum mechanical notation, we translate the matrices to complex numbers. Specifically, we consider that:
Then, we note the following equivalence with the complex norm:
Finally, substituting
analogously to
, and applying the complex-norm representation to both the numerator and the denominator, consolidates the Born rule, normalization, and initial preparation into :
The wavefunction emerges by decomposing the complex norm into a complex number and its conjugate. It is then visualized as a vector within a complex n-dimensional Hilbert space. The partition function acts as the inner product. This relationship is articulated as follows:
where
We clarify that represents the probability associated with the initial preparation of the wavefunction, where .
We also note that Z is invariant under unitary transformations.
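As a hedged numerical illustration of the steps above (not part of the original derivation), the following sketch checks two facts for arbitrary sample values: the determinant of the exponential of a 2×2 matrix of the form [[a, -b], [b, a]] equals the squared complex norm |exp(a + ib)|², and a partition function built as a sum of squared amplitudes is unchanged by a unitary transformation. The matrix form, the amplitude convention psi_i = sqrt(p_i) exp(-i tau E_i), the sample values, and all function names are assumptions made for illustration.

```python
import cmath
import math

def matmul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(M, terms=40):
    """Matrix exponential via a truncated Taylor series (adequate for small matrices)."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Assumed sample values: a is a log-amplitude, b is a phase angle.
a, b = 0.3, 1.1
matrix_side = det2(expm([[a, -b], [b, a]]))
complex_side = abs(cmath.exp(complex(a, b))) ** 2

# Assumed amplitudes psi_i = sqrt(p_i) * exp(-i * tau * E_i).
p = [0.2, 0.8]
E = [1.0, 2.5]
tau = 0.7
psi = [math.sqrt(p_i) * cmath.exp(-1j * tau * E_i) for p_i, E_i in zip(p, E)]
Z = sum(abs(c) ** 2 for c in psi)

# A sample unitary (here a real rotation) applied to psi leaves Z unchanged.
t = 0.4
U = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
psi_rot = [U[0][0] * psi[0] + U[0][1] * psi[1],
           U[1][0] * psi[0] + U[1][1] * psi[1]]
Z_rot = sum(abs(c) ** 2 for c in psi_rot)

print(matrix_side, complex_side, Z, Z_rot)
```

The two determinant-side and complex-side values agree to numerical precision, as do Z and Z_rot.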
Let us now investigate how the axioms of quantum mechanics are recovered from this result:
The entropy maximization procedure inherently normalizes the vectors with . This normalization links to a unit vector in Hilbert space. Furthermore, since physical states are associated with the probability measure, and the probability is defined only up to a phase, we conclude that physical states map to rays within Hilbert space. This demonstrates i.
In Z, an observable must satisfy:
Since , any self-adjoint operator satisfying the condition will satisfy the above equation, simply because . This demonstrates ii.
Upon transforming Equation 46 out of its eigenbasis through unitary operations, we find that the energy typically transforms in the manner of a Hamiltonian operator:
The system’s dynamics emerge from differentiating the solution with respect to the Lagrange multiplier. This is manifested as:
which is the Schrödinger equation. This demonstrates iii.
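The differentiation step can be checked numerically in the eigenbasis. The sketch below assumes amplitudes of the form psi_i(tau) = sqrt(p_i) exp(-i tau E_i); the sign convention, the sample energies and probabilities, and the function names are illustrative assumptions rather than quantities taken from the text. It compares a central finite-difference derivative with respect to tau against -i E_i psi_i(tau).

```python
import cmath
import math

# Assumed prior probabilities and energy levels (illustrative values only).
p = [0.5, 0.3, 0.2]
E = [0.0, 1.0, 2.5]

def psi(tau):
    """Amplitudes in the eigenbasis under the assumed convention."""
    return [math.sqrt(p_i) * cmath.exp(-1j * tau * E_i) for p_i, E_i in zip(p, E)]

def d_psi(tau, h=1e-6):
    """Central finite-difference derivative of psi with respect to tau."""
    plus, minus = psi(tau + h), psi(tau - h)
    return [(u - v) / (2 * h) for u, v in zip(plus, minus)]

tau = 0.9
lhs = d_psi(tau)                                      # d psi / d tau
rhs = [-1j * E_i * c for E_i, c in zip(E, psi(tau))]  # -i H psi with H diagonal
max_error = max(abs(l - r) for l, r in zip(lhs, rhs))
print(max_error)
```

Outside the eigenbasis the diagonal list E would be replaced by a Hermitian matrix, but the same comparison applies.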
From Equation 46, it follows that the possible microstates of the system correspond to specific eigenvalues of . An observation can thus be conceptualized as sampling from , with the measured state being the occupied microstate i. Consequently, when a measurement occurs, the system invariably emerges in one of these microstates, which directly corresponds to an eigenstate of . Measured in the eigenbasis, the probability measure is:
In scenarios where the probability measure is expressed in a basis other than its eigenbasis, the probability of obtaining the eigenvalue is given as a projection onto an eigenstate:
Here, signifies the squared magnitude of the amplitude of the state when projected onto the eigenstate . As this argument holds for any observable, this demonstrates iv.
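A minimal sketch of the projection rule, assuming a two-level system and the observable [[0, 1], [1, 0]], chosen only for illustration. Its normalized eigenvectors are (1, 1)/√2 and (1, -1)/√2, with eigenvalues +1 and -1. The state and all names are hypothetical.

```python
import math

def inner(u, v):
    """Complex inner product <u|v>, conjugating the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

s = 1 / math.sqrt(2)
eigenstates = {+1: [s, s], -1: [s, -s]}  # eigenvalue -> normalized eigenvector

theta = 0.3
state = [math.cos(theta), math.sin(theta)]  # assumed normalized state

# Probability of each eigenvalue: squared magnitude of the projection.
probabilities = {val: abs(inner(vec, state)) ** 2
                 for val, vec in eigenstates.items()}
total = sum(probabilities.values())
print(probabilities, total)
```

For this state the two probabilities are (1 ± sin 2θ)/2, and they sum to one.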
Finally, since the probability measure (Equation 44) replicates the Born rule, v is also demonstrated.
Revisiting quantum mechanics with this perspective offers a coherent and unified narrative. Specifically, the U(1) generating constraint is sufficient to entail the foundations of quantum mechanics (Axioms 1, 2, 3, 4, and 5) through the principle of entropy maximization. The following Lagrange multiplier equation
becomes the formulation’s new singular foundation, and QM Axioms 1, 2, 3, 4, and 5 are now promoted to theorems.
2.2. RQM in 2D
In this section, we investigate a model in 2D that is isomorphic to quantum mechanics and that provides a valuable starting point before addressing the more complex 3+1D case. In RQM 2D, the fundamental Lagrange multiplier equation is:
where and are the Lagrange multipliers, and where is the matrix representation of the multivectors of .
In general, a multivector of , where a is a scalar, is a vector, and is a pseudo-scalar, is represented as follows:
This holds for any matrix and any multivectors of .
The basis elements are defined as:
To investigate this case in more detail, we introduce the multivector conjugate, also known as the Clifford conjugate, which generalizes the concept of complex conjugation to multivectors.
Definition 3 (Multivector conjugate).
Let be a multivector of the geometric algebra over the reals in two dimensions . The multivector conjugate is defined as:
The determinant of the matrix representation of a multivector can be expressed as a self-product:
Theorem 2 (Determinant as a Multivector Self-Product).
Proof. Let , and let be its matrix representation . Then:
□
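The identity in Theorem 2 can be checked numerically. The sketch below assumes one particular 2×2 real representation of the 2D geometric algebra, with e1 = [[1, 0], [0, -1]], e2 = [[0, 1], [1, 0]], and e12 = e1 e2; the representation used in the text may differ by a sign convention or change of basis. It also assumes that the multivector conjugate negates the vector and pseudo-scalar parts, which is the usual Clifford conjugate. All function names are hypothetical.

```python
def mat(a, x, y, b):
    """Matrix of a + x*e1 + y*e2 + b*e12 in the assumed representation."""
    return [[a + x, y + b], [y - b, a - x]]

def clifford_conjugate(a, x, y, b):
    """Keep the scalar; negate the vector and pseudo-scalar parts."""
    return (a, -x, -y, -b)

def matmul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

u = (1.5, 0.4, -0.7, 2.0)  # sample components (a, x, y, b)
U = mat(*u)
U_conj = mat(*clifford_conjugate(*u))

self_product = matmul(U_conj, U)  # expected to equal det(U) times the identity
det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]
a, x, y, b = u
closed_form = a * a - x * x - y * y + b * b

print(self_product, det_U, closed_form)
```

In this representation the conjugated matrix is the adjugate of the original, which is why the product is a scalar multiple of the identity.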
Building upon the concept of the multivector conjugate, we introduce the multivector conjugate transpose, which serves as an extension of the Hermitian conjugate to the domain of multivectors.
Definition 4 (Multivector Conjugate Transpose).
Let :
The multivector conjugate transpose of is defined as first taking the transpose and then the element-wise multivector conjugate:
Definition 5 (Bilinear Form).
Let and be two vectors valued in . We introduce the following bilinear form:
Theorem 3 (Inner Product). Restricted to the even sub-algebra of , the bilinear form is an inner product.
Proof.
This is isomorphic to the inner product of a complex Hilbert space, with the identification . □
Let us now solve the optimization problem for the even multivectors of , whose inner product is positive-definite.
We take ; then reduces as follows:
The Lagrange multiplier equation can be solved as follows:
The partition function , serving as a normalization constant, is determined as follows:
Consequently, the least biased probability measure that connects an initial preparation to a final measurement , under the 2D universal measurement constraint, is:
Definition 6 (Spin(2)-valued Wavefunction).
where represents the square root of the probability and represents a rotor in 2D (or a boost in 1+1D).
The partition function of the probability measure can be expressed using the bilinear form applied to the Spin(2)-valued Wavefunction:
Theorem 4 (Partition Function).
Definition 7 (Spin(2)-valued Evolution Operator).
Theorem 5. The partition function is invariant with respect to the Spin(2)-valued evolution operator.
Proof. We note that:
then, since , the relation is satisfied. □
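The invariance stated in Theorem 5 can be illustrated with explicit even-multivector arithmetic. The sketch below represents an even multivector a + b e12 as the pair (a, b), uses e12 e12 = -1, and assumes the rotor convention exp(theta/2 · e12) with an independent angle for each state; these conventions, the sample values, and the function names are assumptions made for illustration.

```python
import math

def mul(u, v):
    """Product of even multivectors a + b*e12, using e12*e12 = -1."""
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def conj(u):
    """Multivector conjugate on the even sub-algebra: flips the sign of e12."""
    return (u[0], -u[1])

def rotor(theta):
    """exp(theta/2 * e12) = cos(theta/2) + sin(theta/2) * e12 (assumed convention)."""
    return (math.cos(theta / 2), math.sin(theta / 2))

def partition(psi):
    """Sum over states of the scalar part of conj(psi_i) * psi_i."""
    return sum(mul(conj(c), c)[0] for c in psi)

psi = [(0.6, 0.2), (-0.3, 0.9), (1.1, 0.0)]  # sample even-multivector amplitudes
angles = [0.4, -1.3, 2.2]                    # one rotation angle per state
psi_rot = [mul(rotor(t), c) for t, c in zip(angles, psi)]

Z_before = partition(psi)
Z_after = partition(psi_rot)
print(Z_before, Z_after)
```

The two values agree because the conjugate of a rotor times the rotor is 1, so each term of the sum is unchanged.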
We note that the even sub-algebra of , being closed under addition and multiplication and admitting an inner product through its bilinear form, allows for the construction of a Hilbert space. In this context, the Hilbert space is Spin(2)-valued. The primary distinction between a wavefunction in a complex Hilbert space and one in a Spin(2)-valued Hilbert space lies in the subject matter of the theory. Specifically, in the latter, the construction governs the change in orientation experienced by an observer (rather than a change in time), which in turn dictates the measurement basis used in the experiment, consistent with the rotational symmetry and freedom of the system.
The dynamics of observer orientation transformations are described by a variant of the Schrödinger equation, which is derived by taking the derivative of the wavefunction with respect to the Lagrange multiplier, :
Definition 8 (Spin(2)-valued Schrödinger Equation).
Here, represents a global one-parameter evolution variable akin to time, which transforms the wavefunction under Spin(2), locally across the states of the Hilbert space. This very general equation captures all transformations that are consistent with the symmetries of the wavefunction for the Spin(2) group.
Definition 9 (David Hestenes’ Formulation).
In 3+1D, David Hestenes' formulation [5] of the wavefunction is , where is a Lorentz boost or rotation and where is a phase. In 2D, as the algebra admits only one bivector, his formulation would reduce to , which is the form we have recovered.
The definition of the Dirac current applicable to our wavefunction follows the formulation of David Hestenes:
Definition 10 (Dirac Current).
Given the basis and , the Dirac current for the 2D theory is defined as:
where and are SO(2)-rotated basis vectors.
2.2.1. 1+1D Obstruction
As stated in the introduction, of the dimensional cases, only 2D and 3+1D are free of obstructions. For instance, the 1+1D theory results in a split-complex quantum theory due to the bilinear form , which yields negative probabilities: for certain wavefunction states, in contrast to the non-negative probabilities obtained in the Euclidean 2D case. This is why we had to use 2D instead of 1+1D in this two-dimensional introduction. In the following section, we will investigate the 3+1D case, then we will show why all other dimensional cases are obstructed.
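The sign problem described above can be seen in a few lines. The sketch below models the even part of the 1+1D algebra as split-complex numbers a + b j with j j = +1, an assumption consistent with the split-complex description in the text. The function names are hypothetical.

```python
def split_mul(u, v):
    """Product of split-complex numbers a + b*j, using j*j = +1."""
    return (u[0] * v[0] + u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def split_conj(u):
    """Conjugate: flips the sign of the j component."""
    return (u[0], -u[1])

def self_product(u):
    """conj(u) * u, which evaluates to (a*a - b*b, 0)."""
    return split_mul(split_conj(u), u)

print(self_product((2.0, 1.0)))  # (3.0, 0.0)
print(self_product((1.0, 2.0)))  # (-3.0, 0.0): a negative "probability" weight
```

In the Euclidean 2D case the corresponding self-product is a² + b², which cannot be negative.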
2.3. RQM in 3+1D
In this section, we extend the concepts and techniques developed for multivector amplitudes in 2D to the more physically relevant 3+1D case. The Lagrange multiplier equation is as follows:
where
Here, , and b correspond to the generators of the Spinc(3,1) group, which includes both Lorentz transformations and U(1) phase rotations.
The solution (proof in Appendix B) is obtained using the same step-by-step process as in the 2D case, and yields:
where is a "twisted-phase" rapidity. (If the invariance group were Spin(3,1) instead of Spinc(3,1), obtainable by setting , then it would simply be the rapidity.) As we will show in Section 2.4, due to obstructions, this probability measure is the most sophisticated solution to the optimization problem that satisfies the axioms of probability theory.
2.3.1. Preliminaries
As in the 2D case, our initial goal is to express the partition function as a self-product of elements of the vector space. We therefore begin by defining a general multivector in the geometric algebra .
Definition 11 (Multivector).
Let be a multivector of . Its general form is:
where are the basis vectors in the real Majorana representation.
A more compact notation for is
where a is a scalar, is a vector, is a bivector, is a pseudo-vector, and is a pseudo-scalar.
This general multivector can be represented by a real matrix using the real Majorana representation:
Definition 12 (Matrix Representation of
).
To manipulate and analyze multivectors in , we introduce several important operations, such as the multivector conjugate, the 3,4 blade conjugate, and the multivector self-product.
Definition 13 (Multivector Conjugate (in 4D)).
Definition 14 (3,4 Blade Conjugate).
The 3,4 blade conjugate of is
Lundholm [6] proposes a number of multivector norms and shows that they are the unique forms that carry the properties of the determinant, such as , to the domain of multivectors:
Definition 15.
The self-products associated with low-dimensional geometric algebras are:
We can now express the determinant of the matrix representation of a multivector via the self-product . Again, this choice is not arbitrary; it is the unique choice that allows us to represent the determinant of the matrix representation of a multivector within :
Theorem 6 (Determinant as a Multivector Self-Product).
Proof. A computer-assisted proof of this equality is given in Appendix C. □
Definition 16 (
-valued Vector).
These constructions allow us to express the partition function in terms of the multivector self-product:
Definition 17 (Double-Copy Product).
Instead of an inner product, we obtain what we call a double-copy product:
Theorem 7 (Partition Function).
Desirable properties for the double-copy product are introduced by addressing the issue of non-positivity. First, we establish non-negativity:
Theorem 8 (Non-negativity). The double-copy product, applied to the even subalgebra of , is always non-negative.
Proof. Let
. Then,
We note 1)
and 2)
We note that the terms are now complex numbers, which we rewrite as
and
which is always non-negative. □
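Assuming, per Theorem 6, that the self-product equals the determinant of the real 4×4 matrix representation, the non-negativity claim can be spot-checked numerically. The sketch below builds one real representation with g0 squaring to -1 and g1, g2, g3 squaring to +1 from Kronecker products. This construction is an assumption chosen for convenience and may differ from the Majorana representation used in the text by a change of basis. It then evaluates the determinant of randomly sampled even multivectors. All names are hypothetical.

```python
import random

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det(M):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

I2 = [[1, 0], [0, 1]]
s1 = [[0, 1], [1, 0]]
s3 = [[1, 0], [0, -1]]
e = [[0, 1], [-1, 0]]

g0 = kron(e, s1)   # squares to -1 (time-like)
g1 = kron(s1, I2)  # squares to +1
g2 = kron(s3, I2)  # squares to +1
g3 = kron(e, e)    # squares to +1
gammas = [g0, g1, g2, g3]

identity = kron(I2, I2)
bivectors = [matmul(gammas[i], gammas[j]) for i in range(4) for j in range(i + 1, 4)]
pseudoscalar = matmul(matmul(g0, g1), matmul(g2, g3))
even_basis = [identity] + bivectors + [pseudoscalar]  # 1 + 6 + 1 = 8 elements

def combine(coeffs, mats):
    """Linear combination of 4x4 matrices."""
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(4)]
            for i in range(4)]

random.seed(0)
dets = []
for _ in range(200):
    coeffs = [random.uniform(-1, 1) for _ in even_basis]
    dets.append(det(combine(coeffs, even_basis)))
min_det = min(dets)
print(min_det)
```

The smallest sampled determinant is non-negative up to rounding, which is consistent with the even sub-algebra acting as complex 2×2 matrices whose real determinant is a squared modulus. A random check of this kind is only a sanity test, not a proof.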
Finally, positive-definiteness is automatically achieved because solving the optimization problem exponentiates the multivector, yielding a wavefunction:
Definition 18 (
-Valued Wavefunction).
where:
is a positive scalar factor ensuring non-negativity.
is a rotor representing Lorentz transformations (rotations and boosts in spacetime).
is a complex phase factor, as and .
In this representation:
The exponential map maps elements of the algebra to the connected component of the identity in the spin group , except at the zero vector, where the map is not injective.
The wavefunction captures both the amplitude (through ) and the phase (through and ) of the quantum state.
Thus, the double-copy product over wavefunctions is positive-definite.
Now, let us turn our attention to the evolution operator, which leaves the partition function invariant:
Definition 19 (
Evolution Operator).
In turn, this leads to a variant of the Schrödinger equation obtained by taking the derivative of the wavefunction with respect to the Lagrange multiplier :
Definition 20 (
-valued Schrödinger equation).
In this case, represents a one-parameter evolution variable akin to time, which transforms the measurement basis under the action of the group. This very general equation captures all transformations that are consistent with the symmetries of the wavefunction.
Theorem 9 (Spinc(3,1) invariance).
Let be a general element of Spinc(3,1). Then, the equality:
is always satisfied.
2.3.2. RQM
Definition 21 (David Hestenes’ Wavefunction).
The -valued wavefunction we have recovered is formulated identically to David Hestenes’[5] formulation of the wavefunction within GA(3,1).
where , and .
Before we continue the RQM investigation, let us note that the double-copy product contains two copies of a bilinear form
:
In the present and upcoming section, we will investigate the properties of each copy individually, leaving the properties specific to the double-copy for the section on quantum gravity.
Taking a single copy, the Dirac current is obtained directly from the gamma matrices, as follows:
Definition 22 (Dirac Current).
The definition of the Dirac current is the same as Hestenes’:
where is an SO(3,1)-rotated basis vector.
2.3.3. Standard Model Gauge Symmetries
We will now demonstrate that the double-copy product is automatically invariant under transformations corresponding to the U(1), SU(2), and SU(3) symmetries, as well as under unitary transformations satisfying , all of which play fundamental roles in the Standard Model of particle physics. These symmetries constitute the set of transformations that leave the Dirac current invariant, i.e., with T valued in .
Theorem 10 (U(1) Invariance). Let be a general element of U(1). Then, the equality
is satisfied, yielding a U(1) symmetry for each copied bilinear form.
Proof. Equation 138 is invariant if this expression is satisfied:
This is always satisfied simply because □
Theorem 11 (SU(2) Invariance). Let be a general element of Spin(3,1). Then, the equality:
is satisfied for if (which generates SU(2)), yielding an SU(2) symmetry for each copied bilinear form.
Proof. Equation 140 is invariant if this expression is satisfied [7]:
We now note that moving the left-most term to the right of the gamma matrix yields:
Therefore, the product reduces to if and only if , leaving :
Finally, we note that generates . □
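The commutation step in this proof rests on a standard fact: the time-like basis vector commutes with the three purely spatial bivectors and anticommutes with the three boost bivectors. Only the spatial bivectors therefore drop out when moved across it, and they generate Spin(3), which is isomorphic to SU(2). The sketch below verifies the commutation pattern with exact integer matrices, assuming a real 4×4 representation built from Kronecker products. The representation and all names are illustrative assumptions.

```python
def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

I2 = [[1, 0], [0, 1]]
s1 = [[0, 1], [1, 0]]
s3 = [[1, 0], [0, -1]]
e = [[0, 1], [-1, 0]]

# Index 0 is time-like (squares to -1); indices 1..3 are space-like.
g = {0: kron(e, s1), 1: kron(s1, I2), 2: kron(s3, I2), 3: kron(e, e)}

def commutes(A, B):
    """Exact test of AB == BA for integer matrices."""
    return matmul(A, B) == matmul(B, A)

spatial = {(i, j): commutes(matmul(g[i], g[j]), g[0])
           for (i, j) in [(1, 2), (1, 3), (2, 3)]}
boosts = {(0, i): commutes(matmul(g[0], g[i]), g[0]) for i in (1, 2, 3)}

print(spatial)  # every value is True
print(boosts)   # every value is False
```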
Theorem 12 (SU(3)).
The generators of SU(3) in GA(3,1) are given by Anthony Lasenby in [8] and are as follows:
This defines the 9 generators of U(3).
With the additional restriction on
the number of generators is reduced to 8, consistent with SU(3).
We now must show that the following equation is satisfied for all 8 generators:
Proof. First, we note the following action:
which we can rewrite as follows:
The first three terms anticommute with , while the last three commute with :
This can be written as:
where
and
.
Thus, for , we require: 1) and 2) . The first requirement expands as follows:
which are the defining conditions for the SU(3) symmetry group.
Finally, as the SU(3) norm is a consequence of preserving the Dirac current, it follows that the SU(3) generators provided by Lasenby, acting on , cannot change the SU(3) norm, hence must also preserve the Dirac current. □
Theorem 13 (Unitary invariance).
Let U be a unitary matrix. Then unitary invariance:
is individually satisfied for each copied bilinear form.
Proof. Equation 157 is satisfied if . Since U is valued in the complex numbers, , and since , it follows that:
which is satisfied when
. □
The invariances SU(3), SU(2) and U(1) discussed above can be promoted to local symmetries using standard gauge theory construction techniques.
In conventional QM, the Born rule naturally leads to a U(1)-valued gauge theory due to the following symmetry:
However, the SU(2) and SU(3) symmetries do not emerge from the probability measure in the same straightforward manner and are typically introduced by hand, justified by experimental observations. This raises the question: why these specific symmetries and not others? In contrast, within our framework, all three symmetry groups (U(1), SU(2), and SU(3)), as well as the Spinc(3,1) and unitary symmetries, follow naturally from the invariance of the probability measure, in the same way that the U(1) symmetry follows from the Born rule. This suggests a deeper underlying principle governing the symmetries in fundamental physics.
2.3.4. A Starting Point for a Theory of Quantum Gravity
In the previous section, we developed a quantum theory valued in Spinc(3,1), which served as the arena for RQM. We then demonstrated how a single copy of this theory leads to the gauge symmetries of the standard model, the Dirac current and other features of RQM. The goal of this section is to extend this methodology to basis vectors, in which the metric tensor emerges as an observable. To achieve this, we will utilize both copies of the double-copy product.
We recall the definition of the metric tensor in terms of basis vectors of geometric algebra, as follows:
Then, we note that the double-copy product acts on a pair of basis elements and , as follows:
where and are SO(3,1)-rotated basis vectors, and where is a probability measure.
As one can swap and and obtain the same metric tensor, the double-copy product guarantees that is symmetric.
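The symmetry under swapping the two basis elements can be made concrete for the flat, unrotated basis. The sketch below assumes the standard geometric-algebra definition of the metric as the symmetrized product of basis vectors, g_mn = (e_m e_n + e_n e_m)/2, and a real 4×4 representation built from Kronecker products. Both the representation and the function names are illustrative assumptions, and the probability weighting and rotated bases of the double-copy product are not modeled here.

```python
def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

I2 = [[1, 0], [0, 1]]
s1 = [[0, 1], [1, 0]]
s3 = [[1, 0], [0, -1]]
e = [[0, 1], [-1, 0]]

# Index 0 is time-like (squares to -1); indices 1..3 are space-like.
g = [kron(e, s1), kron(s1, I2), kron(s3, I2), kron(e, e)]

def sym_product(A, B):
    """Symmetrized product (AB + BA) / 2."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[(AB[i][j] + BA[i][j]) / 2 for j in range(4)] for i in range(4)]

def scalar_part(M):
    """Return c when M equals c times the identity; fail otherwise."""
    c = M[0][0]
    assert all(M[i][j] == (c if i == j else 0) for i in range(4) for j in range(4))
    return c

metric = [[scalar_part(sym_product(g[m], g[n])) for n in range(4)]
          for m in range(4)]
for row in metric:
    print(row)
```

The result is the diagonal matrix with entries -1, 1, 1, 1, and it is symmetric by construction because the symmetrized product does not depend on the order of its arguments.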
Furthermore, since
, we get:
which allows us to conclude that and are self-adjoint within the double-copy product, entailing the interpretation of as an observable.
In the double-copy product, the metric tensor emerges as a double copy of Dirac currents. This formulation suggests that the metric tensor encodes the probabilistic structure of a quantum theory of gravity in the form of a symmetric rank-2 tensor, analogous to how the Dirac current encodes the probabilistic structure of a special relativistic quantum theory in the form of a 4-vector.
Let us now investigate the dynamics. We recall that the evolution operator (Definition 19) is:
Acting on the wavefunction, the effect of this operator cascades down to the basis vectors via the double-copy product:
which realizes an SO(3,1) transformation of the metric tensor via the action of the exponential of a bivector, and a double-copy unitary-invariant transformation via the action of the exponential of a pseudo-scalar:
In summary, this initial investigation has identified a scenario in which the metric tensor is measured using basis vectors. The evolution operator, governed by the Schrödinger equation, dynamically realizes SO(3,1) transformations on the metric tensor. Furthermore, the amplitudes associated with possible metric tensors are derived from a double-copy of unitary quantum theories acting on the basis vectors. This formulation simultaneously preserves the SO(3,1) symmetry, essential for describing spacetime structure, and the unitary symmetry, fundamental to quantum mechanics. It describes all changes of basis transformations that an observer in 3+1D spacetime can perform prior to measuring (in the quantum sense) a basis system in spacetime, and attributes a probability to the outcome (the outcome being the metric tensor).
2.3.5. The Einstein Field Equation
In the previous section, we established that the metric tensor emerges as an observable through the double-copy mechanism acting on basis vectors. We also determined that this probability measure transforms covariantly under SO(3,1) Lorentz transformations.
To derive the dynamical equations governing this metric tensor, we seek the simplest possible Lagrangian whose equations of motion are a function of and that respects this SO(3,1) covariance. The Einstein-Hilbert action naturally emerges as this simplest choice:
where , with G being Newton's gravitational constant, g the determinant of the metric tensor, and R the Ricci scalar. Varying this simplest possible covariant action yields the Einstein field equations , where is the Einstein tensor, which automatically satisfies the Bianchi identities ensuring conservation of energy-momentum.
2.4. Dimensional Obstructions
In this section, we explore the dimensional obstructions that arise when attempting to resolve the entropy maximization problem for other dimensional configurations. We found that all geometric configurations except those we have explored here (e.g. , , and ) are obstructed. By obstructed, we mean that the solution to the entropy maximization problem, , does not satisfy all axioms of probability theory.
Let us now demonstrate the obstructions mentioned above.
Theorem 14 (Not isomorphic to a real matrix algebra). The determinant of the matrix representation of the geometric algebras in this category is either complex-valued or quaternion-valued, making them unsuitable as a probability.
Proof. These geometric algebras are classified as follows:
The determinant of these objects is valued in or in , where are the complex numbers, and where are the quaternions. □
Theorem 15 (Negative Probabilities in the RQM). The even sub-algebra of these dimensional configurations, which is associated with the RQM part of the theory, allows for negative probabilities, making them unsuitable as an RQM.
Proof. This category contains three dimensional configurations:
-
:
Let
, then:
which is valued in
.
-
:
Let
, then:
which is valued in
.
-
:
Let
, where
, then:
We note that
, therefore:
which is valued in
.
In all of these cases the RQM probability can be negative. □
Conjecture 1 (No probability measures as a self-product (in 6D)). The multivector representation of the norm in 6D cannot satisfy any observables.
Proof (Argument). In six dimensions and above, the self-product patterns found in Definition 15 collapse. The research by Acus et al. [9] in 6D geometric algebra demonstrates that the determinant, so far defined through self-products of the multivector, fails to extend into 6D. The crux of the difficulty is evident in the reduced case of a 6D multivector containing only scalar and grade-4 elements:
This equation is not a multivector self-product but a linear sum of two multivector self-products [9].
The full expression is given in the form of a system of 4 equations, which is too long to list in its entirety. A small characteristic part is shown:
From Equation 206, it is possible to see that no observable can satisfy this equation because the linear combination does not allow one to factor it out of the equation.
Any equality of the above type between and is frustrated by the factors and , forcing as the only satisfying observable. Since the obstruction occurs within grade-4, which is part of the even sub-algebra, it is questionable whether a satisfactory theory (with non-trivial observables) can be constructed in 6D using our method. □
This conjecture proposes that the multivector representation of the determinant in 6D does not allow for the construction of non-trivial observables, which is a crucial requirement for a relevant quantum formalism. The linear combination of multivector self-products in the 6D expression prevents the factorization of observables, limiting their role to the identity operator.
Conjecture 2 (No probability measures as a self-product (above 6D)). The norms beyond 6D are progressively more complex than the 6D case, which is already obstructed.
These theorems and conjectures provide additional insights into the unique role of the unobstructed 3+1D signature in our proposal.
It is also interesting that our proposal is able to rule out even if in relativity, the signature of the metric versus does not influence the physics. However, in geometric algebra, represents 1 space dimension and 3 time dimensions. Therefore, it is not the signature itself that is ruled out but rather the specific arrangement of 3 time and 1 space dimensions, as this configuration yields quaternion-valued "probabilities" (i.e. and ).
Consequently, the most sophisticated dimensional configuration in which a least biased solution to the problem of maximizing the Shannon entropy of universal measurements relative to an initial preparation exists is 3+1D.