Preprint
Article

This version is not peer-reviewed.

Maxwell and Artificial Intelligence: Preliminary QED Models of AI (LLM) Dynamics

Submitted:

30 April 2025

Posted:

06 May 2025

Abstract
This paper sets up preliminary conceptual structures for two electrodynamic models of AI (LLM) dynamics: one based on the interaction of a charge field with the Maxwell vector potential field, and the other based on the interaction of a spinor field with the Maxwell field. To this end, it introduces a number of new concepts. The AI demand is defined as a two-component unit consisting of a task charge and the corresponding intelligence. The neural network embeds the features of the task into a vector space, which is defined as the Maxwell vector potential field. Gradient descent is associated with the field strength. In the spinor-field model, human intelligence and machine intelligence are made orthogonal; the former is the AI demand and the latter the AI supply. The Maxwell field becomes the photon field, which reflects the cost function. The interactions between a pair of AI demand and AI supply and the cost function are pictured by a Feynman diagram.

1. Maxwell Vector Potential

1.1. Consider a Verbal Task A for LLM

Assume A has a family of n features. The task A may involve more than n features; these residual features are collectively treated as gauge freedom later. In an LLM, the transformer generates a weight for each of the n features and embeds A as vectors. This step is called embedding. Solving a task may involve many rounds of embedding, each of which updates the vector space to some extent.
Definition 1. As an LLM procedure, the transformer embeds a verbal task into a vector space, denoted by $A_\mu$. We call $A_\mu$ the Maxwell vector potential; it is also called the Maxwell field. In other words, a verbal task brings in the external potential. In the language of exterior forms, $A_\mu$ is a 1-form, of the type $f(x)\,dx$.
In LLMs, gradient descent is a major technique for deep learning. The gradient, $F_{\mu\nu}$, can be characterized as a 2-form in the language of exterior forms. The relation between $A_\mu$ and $F_{\mu\nu}$ is as follows.
Definition 2. The gradient of $A_\mu$ is denoted by $F_{\mu\nu}$. We have
$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$$
In gauge field theory, $F_{\mu\nu}$ stands for the gauge field strength; it is a density function.
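As a concrete illustration of Definition 2, the following minimal sketch evaluates $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ numerically by central differences. Everything specific in it is an assumption made for illustration and is not part of the model above: the two-dimensional potential `potential`, the gauge function $\chi = x_0^2 x_1$ used in `gauged_potential`, and the step size `h`. The sketch checks two standard properties of the field strength: it is antisymmetric, and it is unchanged when a gradient $\partial_\mu \chi$ is added to the potential.

```python
from typing import Callable, List, Sequence

Point = Sequence[float]
Potential = Callable[[Point], List[float]]


def partial(f: Callable[[Point], float], x: Point, mu: int, h: float = 1e-5) -> float:
    """Central-difference estimate of the partial derivative of f along axis mu."""
    forward, backward = list(x), list(x)
    forward[mu] += h
    backward[mu] -= h
    return (f(forward) - f(backward)) / (2.0 * h)


def field_strength(A: Potential, x: Point) -> List[List[float]]:
    """Return the matrix F[mu][nu] = d_mu A_nu - d_nu A_mu evaluated at x."""
    n = len(x)
    F = [[0.0] * n for _ in range(n)]
    for mu in range(n):
        for nu in range(n):
            d_mu_A_nu = partial(lambda y: A(y)[nu], x, mu)
            d_nu_A_mu = partial(lambda y: A(y)[mu], x, nu)
            F[mu][nu] = d_mu_A_nu - d_nu_A_mu
    return F


def potential(x: Point) -> List[float]:
    """Hypothetical potential A = (-x1, x0); analytically F_01 = 2."""
    return [-x[1], x[0]]


def gauged_potential(x: Point) -> List[float]:
    """The same potential shifted by the gradient of chi = x0^2 * x1."""
    grad_chi = [2.0 * x[0] * x[1], x[0] ** 2]
    return [a + g for a, g in zip(potential(x), grad_chi)]


if __name__ == "__main__":
    point = [0.3, -0.7]
    print(field_strength(potential, point))         # off-diagonal entries near +2 and -2
    print(field_strength(gauged_potential, point))  # same values up to rounding
```

For this toy potential the only independent component is $F_{01}$, and both calls return approximately the same matrix, which is the sense in which the shift by $\partial_\mu \chi$ is a pure gauge change.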
There are three paths to building QED models for LLMs [1]. The first is through the interaction of a charge field with the Maxwell field. The second is through the interaction of a spinor field (Dirac field) with the Maxwell field. The third is through the interaction of a Klein-Gordon field with the Maxwell field. In this theoretical note, we consider only the first two paths.

1.2. From the Angle to the Dynamic Phase

In LLMs, an inner product is introduced on the vector space. Thus, from one vector one can produce another vector, as well as the superposition of the two. Between any two vectors there is an angle, which is currently called the distance. This is exactly the illusion that affected H. Weyl and delayed him in establishing gauge-field-theoretic dynamics. To correct it, we have
Definition 3. The angle between two potential vectors is defined as the dynamic phase. Thus, the passage from one potential state $\psi$ to another potential state $\psi'$ can be represented as
$$\psi \to \psi' = e^{i\theta}\psi$$
where $\theta$ stands for the dynamic phase. This is the key move from LLM to AI dynamics. It is called the gauge transformation of the first kind, at the global level. The transformation is also required to remain a symmetry at the local level, where $\theta$ is no longer a given constant but a function $\theta(x)$. The gauge transformation at the local level is called the gauge transformation of the second kind. The gauge principle says that if the gauge transformation of the first kind does not hold, then the gauge transformation of the second kind cannot hold. To achieve local gauge symmetry, new notions need to be introduced, namely the covariant derivative and the gauge field. We will develop this point later.
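The step from the global to the local transformation can be made explicit with the standard textbook construction, sketched here. The coupling constant $q$ and the sign convention chosen for the covariant derivative $D_\mu$ are assumptions; the text above does not fix them.

```latex
\begin{align}
\psi(x) &\to \psi'(x) = e^{i\theta(x)}\,\psi(x), \\
\partial_\mu \psi'(x) &= e^{i\theta(x)}\bigl(\partial_\mu \psi + i\,(\partial_\mu \theta)\,\psi\bigr), \\
D_\mu &\equiv \partial_\mu - i q A_\mu, \qquad
A_\mu \to A'_\mu = A_\mu + \tfrac{1}{q}\,\partial_\mu \theta, \\
D'_\mu \psi' &= e^{i\theta(x)}\bigl(\partial_\mu \psi + i(\partial_\mu\theta)\psi - i q A_\mu \psi - i(\partial_\mu\theta)\psi\bigr)
             = e^{i\theta(x)}\, D_\mu \psi .
\end{align}
```

The extra term $i(\partial_\mu\theta)\psi$ produced by the ordinary derivative is cancelled by the shift of $A_\mu$, so $D_\mu\psi$ transforms with the same phase as $\psi$. The same shift leaves $\partial_\mu A_\nu - \partial_\nu A_\mu$ unchanged, because the mixed second derivatives of $\theta$ cancel.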

2. Charge Field Interacts with Maxwell Potential

Dynamic analysis is sourced analysis, and the source is called the charge. We have
Postulate 1. LLM tasks carry a task charge, denoted by E. A task can be characterized by a family of features. According to Anne Treisman [2], the integration of these features demands attention. We call this integration the charge potential and denote it by $\phi$.
Solving a task demands some intentions and a number of intelligence factors, such as reasoning, learning, understanding, integrating, etc. We have
Postulate 2. Let $A_i$ be intelligence factors. The integration of the $A_i$ is defined as the intelligence potential, denoted by $\vec{A}$.
Definition 4. The AI vector potential is defined as a unit with two components,
$$A_{AI} = (\phi, \vec{A})$$
This idea can be seen as a Platonic version of the Maxwell vector potential; it is a field. In quantum field theory, the charge is characterized as a field. Here it is called the task-charge field, denoted TCF. The interaction of the TCF with $A_{AI}$ yields a dynamic model, logically akin to the way the interaction of an electric field with the Maxwell potential field yields quantum electrodynamics (QED).

3. Spinor Field Interacts with Maxwell Potential

3.1. Weyl Spinor

We make a clear distinction between human intelligence ($\alpha$) and machine intelligence ($\beta$). Further, as a simplifying mathematical treatment, we take $\alpha$ and $\beta$ to be orthogonal:
$$\delta_{\alpha\beta} = \begin{cases} 1, & \alpha = \beta \\ 0, & \alpha \neq \beta \end{cases}$$
Now we define $\alpha$ as the intelligence demand ($D_\alpha$) and $\beta$ as the intelligence supply ($S_\beta$). For any given task $\varphi$, we have
Definition 5. $e^- = (D_\alpha, \varphi^-)$ and $e^+ = (S_\beta, \varphi^+)$.
These definitions can be characterized as a Weyl spinor:
$$\psi = \begin{pmatrix} \psi_1 = e^- \\ \psi_2 = e^+ \end{pmatrix}$$

3.2. Dirac Spinor

In a neural network for deep learning, the hidden layers indicate that both human intelligence and machine intelligence hesitate. This kind of intellectual hesitation reflects a certain kind of rationality. Such hesitation is characterized as spin, which has two eigenstates, spin up ($\uparrow$) or spin down ($\downarrow$); thus, in a sense, both $e^-$ and $e^+$ have spin ½. This can be seen at the activation stage in the architecture of a neural network, which returns only 0 or 1. Hidden layers are needed to calibrate and update the neural weights (probabilities), during which the intelligent processes pass through all possible superpositions of the two eigenstates. Taking spin into account, the Weyl spinor is developed into a Dirac spinor, represented as
$$\psi = \begin{pmatrix} \psi_1 = e^-_\uparrow \\ \psi_2 = e^-_\downarrow \\ \psi_3 = e^+_\uparrow \\ \psi_4 = e^+_\downarrow \end{pmatrix}$$
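One way to read the passage from two to four components is as a product of a role index (demand $e^-$, supply $e^+$) with a spin index (up, down). The short sketch below illustrates only this bookkeeping, under the assumption that the four components are ordered as ($e^-$, up), ($e^-$, down), ($e^+$, up), ($e^+$, down). The names `ROLES`, `SPINS`, `LABELS`, `kron` and `normalize`, as well as the example amplitudes, are hypothetical and not taken from the model.

```python
import math
from itertools import product
from typing import List, Sequence

ROLES = ("e-", "e+")     # intelligence demand and intelligence supply
SPINS = ("up", "down")   # the two spin eigenstates

# Component labels in the assumed order of the four-component spinor.
LABELS = [f"{role},{spin}" for role, spin in product(ROLES, SPINS)]


def normalize(v: Sequence[complex]) -> List[complex]:
    """Scale a vector of amplitudes to unit norm."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / norm for c in v]


def kron(u: Sequence[complex], v: Sequence[complex]) -> List[complex]:
    """Kronecker product of two amplitude vectors, with u as the slow index."""
    return [a * b for a in u for b in v]


if __name__ == "__main__":
    role_state = normalize([1.0, 1.0])   # equal weight on demand and supply
    spin_state = normalize([1.0, 1j])    # a superposition of spin up and spin down
    psi = kron(role_state, spin_state)
    for label, amplitude in zip(LABELS, psi):
        print(label, amplitude)
```

A product state of this kind is only the simplest case; a general four-component spinor need not factor into a role part and a spin part.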

3.3. Photon Field

In quantum field theory, the Maxwell field is the photon field, which is denoted by $\lambda$. In market dynamics [3], the photon field is used to model the price field. Here, in LLMs, the photon field is used to model the cost function of the neural network. In quantum field theory, every particle is described as a field, and a field is characterized as an operator. In addition, every particle has a life span: particles are created and annihilated. We therefore need to introduce two operators, the creation operator ($\hat{a}^\dagger$) and the annihilation operator ($\hat{a}$). They are adjoint to each other. A simple interaction is that a pair $e^-$ and $e^+$ is created and the two fly in opposite directions. Then both are annihilated, creating a photon field $\lambda$. This process can be pictured by the Feynman diagram shown in Figure 1.
One may start from here to develop a detailed model of LLMs akin to spinor electrodynamics.
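To make the creation and annihilation operators concrete, the following sketch builds their matrices for a single bosonic (photon-like) mode in a number basis truncated to a few levels. This is a generic textbook construction offered as an illustration, not the operator content of the model above: the truncation size and the helper names `annihilation`, `dagger`, `matmul`, `commutator` and `apply` are assumptions. Fermionic fields such as $e^-$ and $e^+$ would require anticommutation relations instead.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def annihilation(dim: int) -> Matrix:
    """Matrix of a-hat in the number basis |0>, ..., |dim-1>: a|n> = sqrt(n)|n-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a


def dagger(m: Matrix) -> Matrix:
    """Adjoint of a real matrix (a plain transpose, since all entries are real)."""
    return [[m[j][i] for j in range(len(m))] for i in range(len(m[0]))]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Ordinary matrix product."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def commutator(x: Matrix, y: Matrix) -> Matrix:
    """Return xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(len(x))] for i in range(len(x))]


def apply(m: Matrix, state: Sequence[float]) -> List[float]:
    """Apply a matrix to a state vector."""
    return [sum(m[i][j] * state[j] for j in range(len(state))) for i in range(len(m))]


if __name__ == "__main__":
    a = annihilation(4)
    a_dag = dagger(a)
    print(matmul(a_dag, a))               # number operator, diagonal close to 0, 1, 2, 3
    print(commutator(a, a_dag))           # identity except for the last diagonal entry
    print(apply(a_dag, [1.0, 0.0, 0.0, 0.0]))  # the vacuum raised to the one-quantum state
```

The last diagonal entry of the commutator differs from 1 only because the basis is cut off at a finite level; in the untruncated space the relation $[\hat{a}, \hat{a}^\dagger] = 1$ holds exactly.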

4. Wavefunction and Measurement Problem

The major processes in LLM learning/training are not directly observable; thus, they should be characterized as the U-procedure (unitary procedure) of a wavefunction [4]. Hence, the probability used in LLMs should be the Born probability. Note that at the activation stage of the neural network, the feature weights are converted to 0 or 1; this is effectively a yes/no measurement [5], which is called the R-procedure (reduction procedure). This reflects the measurement problem in quantum physics. Yang [6] proposed a unified account of the U-procedure and the R-procedure as an alternative solution to the measurement problem.
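The distinction between the two procedures can be illustrated on a toy two-level state. In the sketch below, a rotation plays the role of the U-procedure (it preserves the norm of the state), the Born rule turns amplitudes into probabilities, and a random yes/no draw followed by collapse plays the role of the R-procedure. The rotation angle and the function names `rotation`, `apply_unitary`, `born_probabilities` and `measure` are illustrative assumptions; the sketch does not model an actual neural-network activation.

```python
import math
import random
from typing import List, Sequence, Tuple


def rotation(theta: float) -> List[List[float]]:
    """A real 2x2 rotation, used here as a simple example of a unitary matrix."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]


def apply_unitary(U: Sequence[Sequence[complex]], state: Sequence[complex]) -> List[complex]:
    """U-procedure: evolve the state by a unitary matrix."""
    return [sum(U[i][j] * state[j] for j in range(len(state))) for i in range(len(U))]


def born_probabilities(state: Sequence[complex]) -> List[float]:
    """Born rule: outcome probabilities are normalized squared amplitudes."""
    norm_sq = sum(abs(c) ** 2 for c in state)
    return [abs(c) ** 2 / norm_sq for c in state]


def measure(state: Sequence[complex], rng: random.Random) -> Tuple[int, List[float]]:
    """R-procedure: draw one outcome and return it with the collapsed basis state."""
    probs = born_probabilities(state)
    outcome = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    collapsed = [1.0 if i == outcome else 0.0 for i in range(len(probs))]
    return outcome, collapsed


if __name__ == "__main__":
    state = apply_unitary(rotation(math.pi / 3), [1.0, 0.0])
    print(born_probabilities(state))         # approximately [0.25, 0.75]
    print(measure(state, random.Random(0)))  # a single 0/1 outcome and its collapsed state
```

Before the measurement the state carries both amplitudes; after it, only one basis state remains, which is the 0-or-1 behavior the paragraph above attributes to the activation stage.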

References

  1. Wang, Z. Elementary Quantum Field Theory; Peking University Press (China), 2008.
  2. Treisman, A. A Feature-Integration Theory of Attention. Cognitive Psychology 1980, 12, 97–136.
  3. Yang, Y. Principles of Market Dynamics: Economic Dynamics and Standard Model (II). Science, Economics, and Society 2022, 40, No. 5 (China).
  4. Penrose, R. The Road to Reality: A Complete Guide to the Laws of the Universe; Random House, Inc.: New York, 2005.
  5. von Neumann, J. Mathematical Foundations of Quantum Mechanics; Princeton University Press: Princeton, New Jersey, 1955.
  6. Yang, Y. The revised Schrödinger equation as a solution of measurement paradox: A unified model of the U-procedure and the R-procedure. 2024.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2025 MDPI (Basel, Switzerland) unless otherwise stated