Submitted: 06 July 2025
Posted: 09 July 2025
Abstract
Keywords:
1. Motivation and Formal Setup
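| Definition (Helical Embedding). Let $k \in \mathbb{N}$, and let $T_i > 0$ for $i = 1, \dots, k$. Define the helical embedding $h : \mathbb{R} \to \mathbb{R}^{2k+1}$ as the smooth mapping: $h(x) = \big(x, \cos(2\pi x/T_1), \sin(2\pi x/T_1), \dots, \cos(2\pi x/T_k), \sin(2\pi x/T_k)\big)$ |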
| Symbol | Description |
|---|---|
| $x$ | Scalar input value (e.g., integer token or real number) |
| $T$ | Period of the helical embedding |
| $h(x)$ | Helical embedding function |
| | Flat or linear embedding |
| | Periodic or coiled embedding |
| | Helical manifold defined by period $T$; image of the helical embedding |
| $\theta$ | Angular phase: $\theta = 2\pi x / T$ |
| | Number of completed coil turns |
| $\lceil \cdot \rceil$ | Ceiling function: smallest integer greater than or equal to its argument |
| $x \bmod T$ | Modulo projection of $x$ |
| | Aliasing error from multiple $x$ mapping to the same angle |
| | Inverse projection (if defined), maps back from helix to scalar $x$ |
| | Index of the turn number in the coil |
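As a minimal sketch of the setup above, the following Python code builds a helical vector from a scalar, assuming the embedding consists of one linear coordinate followed by a cosine/sine pair per period. The function names and the period list `PERIODS` are illustrative assumptions, not values stated in the text.

```python
import math

# Illustrative periods only (an assumption; the text does not fix them here).
PERIODS = [2, 5, 10, 100]


def helical_embedding(x, periods):
    """Return [x, cos(2*pi*x/T_1), sin(2*pi*x/T_1), ..., cos(2*pi*x/T_k), sin(2*pi*x/T_k)]."""
    coords = [float(x)]  # linear coordinate
    for T in periods:
        theta = 2.0 * math.pi * x / T  # angular phase for period T
        coords.extend([math.cos(theta), math.sin(theta)])
    return coords


def turns_completed(x, T):
    """Number of completed coil turns of x for period T."""
    return math.floor(x / T)


if __name__ == "__main__":
    print(len(helical_embedding(3, PERIODS)))  # 2k + 1 coordinates for k periods
    print(turns_completed(23, 10))
```

Two inputs that differ by a multiple of a period share the same cosine/sine pair for that period, and only the linear coordinate tells them apart.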
2. Coiling Arithmetic in Large Language Models
2.1. Unveiling Hidden Geometric Structure
| LLMs, without explicit programming, encode numbers through trigonometric structure. |
2.2. Reconstructing Arithmetic as Translation in Phase Space
2.3. Implications Across Disciplines
| Domain | Implication |
|---|---|
| AI interpretability | Structured embeddings allow direct probing of reasoning mechanisms |
| Neuroscience | Supports phase-based models (e.g., grid cells, oscillatory coding) |
| Cognitive science | Suggests arithmetic and logic may emerge from spatial mappings |
| AI safety | Enables geometric control and auditing of internal reasoning |
2.4. Related Representational Architectures
2.4.1. Neural ODEs and Helical Dynamics
Implication: Helical arithmetic can be embedded within ODE solvers, where the latent state evolves along sinusoidal manifolds.
2.4.2. Fourier Transformers and Helix Encodings
Implication: The helix serves as a multiscale positional embedding scheme and may enhance arithmetic generalization when incorporated into transformer attention blocks.
2.4.3. Group Equivariance in Helical Space
Implication: Helical arithmetic can be analyzed using group-equivariant frameworks, suggesting a bridge between neural arithmetic and representation theory.

3. Tegmark’s Procedure to Discover Helical Structure in LLMs
- Numbers were not encoded as symbols with inherent mathematical meaning.
- Yet models could still perform approximate addition and counting.
- Step 1: Extract Embeddings. They used pretrained LLMs (e.g., GPT-2) and extracted the final token embeddings of numerical tokens, such as "3", "5", and "100", from the model’s vocabulary.
- Step 2: Dimensionality Reduction
- Step 3: Fit Periodic Functions. They then modeled these embeddings as functions of the form: $h(x) = \big(x, \cos(2\pi x/T), \sin(2\pi x/T)\big)$. This fit remarkably well, especially for smaller numerical tokens (1–100). The curve followed a helical geometry, with periodic sinusoidal modulations.
- Step 4: Arithmetic Generalization. They demonstrated that:
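The periodic-fit idea in Step 3 can be illustrated on synthetic data. The sketch below does not reproduce the procedure of the cited work: it swaps in a plain Fourier projection scan, which measures how much of a single embedding coordinate is explained by each candidate period. The synthetic coordinate, the candidate list, and all function names are assumptions made for illustration.

```python
import math


def fourier_power(values, period):
    """Normalized power of values[a], a = 0..N-1, at the given period."""
    n = len(values)
    c = sum(v * math.cos(2 * math.pi * a / period) for a, v in enumerate(values))
    s = sum(v * math.sin(2 * math.pi * a / period) for a, v in enumerate(values))
    return (c * c + s * s) / (n * n)


def best_period(values, candidates):
    """Candidate period with the largest Fourier power."""
    return max(candidates, key=lambda T: fourier_power(values, T))


def synthetic_coordinate(n=100, period=10):
    """Hypothetical embedding coordinate: a period-10 sinusoid plus small deterministic jitter."""
    out = []
    for a in range(n):
        signal = 0.5 * math.cos(2 * math.pi * a / period) - 1.2 * math.sin(2 * math.pi * a / period)
        jitter = 0.01 * ((a * 7919) % 13 - 6)  # stands in for noise, bounded by 0.06
        out.append(signal + jitter)
    return out


if __name__ == "__main__":
    values = synthetic_coordinate()
    print(best_period(values, [2, 5, 10, 20, 50, 100]))
```

On real embeddings one would run such a scan on each principal component after dimensionality reduction. A sharp peak at a small set of periods is the kind of evidence that motivates a helical model.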
4. Helical Axioms for Number Representation
4.1. Euclidean Axioms (Helical Version)
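- Axiom H1 (Linearity). For every number $a \in \mathbb{R}$, there exists a unique projection onto the linear dimension: $h_0(a) = a$.
- Axiom H2 (Periodicity). For every $a \in \mathbb{R}$ and every period $T > 0$, the following components exist: $\cos(2\pi a/T)$ and $\sin(2\pi a/T)$, which preserve equality modulo $T$.
- Axiom H3 (Helicity). Every point representing a number $a$ in the model space lies on a curve of the form: $h(a) = \big(a, \cos(2\pi a/T_1), \sin(2\pi a/T_1), \dots, \cos(2\pi a/T_k), \sin(2\pi a/T_k)\big)$, with a defined set of periods $\{T_1, \dots, T_k\}$.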

4.2. Peano Axioms (Helical Version)
- Axiom H-P1 (Zero Point of the Helix). There exists a number 0 whose helical representation is: $h(0) = (0, 1, 0, \dots, 1, 0)$, corresponding to the values $\cos(0) = 1$ and $\sin(0) = 0$.
- Axiom H-P2 (Successor Function). There exists a function $S$ such that: $h(S(a)) = h(a + 1)$, representing a unit shift along the helix.
- Axiom H-P3 (No Looping for Naturals). For every $a$, we have $S(a) \neq 0$ and $h(S(a)) \neq h(a)$; the helix never loops back for natural numbers (no periodicity in the linear dimension).
- Axiom H-P4 (Helical Induction). If a set $A \subseteq \mathbb{N}$ contains 0, and for every $a \in A$ we also have $S(a) \in A$, then $A = \mathbb{N}$.
4.3. Zermelo–Fraenkel Axioms (Helical Version as Fourier Set Space)
- Axiom H-ZF1 (Existence of the Helical Empty Set). There exists an object such that:
- Axiom H-ZF2 (Set Composed of Helices). For every representation, one can construct a set that is a valid set in the representational space.
- Axiom H-ZF3 (Helical Function as Set Transformation). If $f$ is continuous, then, for additive and harmonic functions $f$:
4.4. General Interpretation
| Classical System | Helical Equivalent |
|---|---|
| Euclidean line | Linear axis of the helix |
| Point | Helical vector |
| Peano successor | Translation along the helix |
| ZF set | Collection of helical vectors |
| Element of a set | Encoded point in Fourier-representational space |
5. Helical Axioms in First-Order Logic
5.1. Euclidean Axioms (Helical Version)
- Linearity:
- Periodicity:
- Helicity:
5.2. Peano Axioms (Helical Version)
- Zero Point:
- Successor:
- Non-Looping:
- Induction:
5.3. Zermelo–Fraenkel Axioms (Helical Version)
- Helical Empty Set:
- Helical Set Existence:
- Function Application:
6. Group Structure of Helical Arithmetic
6.1. Addition as Vector Translation
6.2. Subtraction as Reverse Translation
6.3. Modular Equivalence via Phase Overlap
6.4. Interpretation
- Identity:
- Inverse:
- Closure (approximate):
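The group properties listed above can be checked numerically. The sketch below assumes the helical vector has the form (linear coordinate, then one cosine/sine pair per period), so that adding $b$ becomes a translation of the linear coordinate plus a rotation of each pair by the angle $2\pi b/T$. The function names and the period list are hypothetical.

```python
import math

PERIODS = [2, 5, 10, 100]  # illustrative assumption


def helix(x, periods=PERIODS):
    """Helical vector of x: linear coordinate plus one (cos, sin) pair per period."""
    out = [float(x)]
    for T in periods:
        theta = 2.0 * math.pi * x / T
        out.extend([math.cos(theta), math.sin(theta)])
    return out


def add_on_helix(h_a, b, periods=PERIODS):
    """Map the helical vector of a to that of a + b without decoding a."""
    out = [h_a[0] + b]  # translation along the linear axis
    for i, T in enumerate(periods):
        c, s = h_a[1 + 2 * i], h_a[2 + 2 * i]
        phi = 2.0 * math.pi * b / T  # rotation angle in the i-th phase plane
        out.append(c * math.cos(phi) - s * math.sin(phi))
        out.append(s * math.cos(phi) + c * math.sin(phi))
    return out


def close(u, v, tol=1e-9):
    """True if two vectors agree coordinate-wise within tol."""
    return len(u) == len(v) and all(abs(p - q) < tol for p, q in zip(u, v))


if __name__ == "__main__":
    print(close(add_on_helix(helix(17), 25), helix(42)))
```

Subtraction is the same operation with $-b$. The identity is $b = 0$, and applying $b$ followed by $-b$ returns the original vector up to floating-point error, which is why closure is described as approximate.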

7. Gödel-Style Encodings in Helical Arithmetic
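- Gödelization of Helical Terms: $G(\mathbf{v}) = \prod_{i} p_i^{v_i}$, where $p_i$ is the $i$-th prime and each exponent $v_i$ corresponds to a coefficient or frequency term in the helix vector.
- Decodability:
- Arithmetization of Helical Arithmetic: $G(\mathbf{u}) \cdot G(\mathbf{v}) = \prod_{i} p_i^{u_i + v_i}$, and thus $G(\mathbf{u}) \cdot G(\mathbf{v}) = G(\mathbf{u} + \mathbf{v})$, assuming multiplicative encoding of vector components.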
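A prime-power encoding of this kind can be sketched in a few lines. The code below assumes the helix coefficients have already been quantized to non-negative integers, which is a requirement of prime-exponent encodings and is not specified in the text. All function names are hypothetical.

```python
def first_primes(n):
    """Return the first n primes by trial division (fine for short vectors)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def godel_encode(coeffs):
    """Encode non-negative integer coefficients as prod_i p_i ** coeffs[i]."""
    code = 1
    for p, e in zip(first_primes(len(coeffs)), coeffs):
        if e < 0:
            raise ValueError("exponents must be non-negative integers")
        code *= p ** e
    return code


def godel_decode(code, length):
    """Recover the exponent vector by repeated division by each prime."""
    coeffs = []
    for p in first_primes(length):
        e = 0
        while code % p == 0:
            code //= p
            e += 1
        coeffs.append(e)
    return coeffs


if __name__ == "__main__":
    code = godel_encode([3, 1, 2])
    print(code, godel_decode(code, 3))
```

Multiplying two codes adds their exponent vectors component-wise, which is the sense in which vector addition is arithmetized under a multiplicative encoding.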
7.1. Gödel Incompleteness in the Helical Framework
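- (H-Incompleteness Theorem I): If $\mathcal{H}$ is consistent and sufficiently expressive, there exists a well-formed formula $G_{\mathcal{H}}$ such that: $\mathcal{H} \nvdash G_{\mathcal{H}}$ and $\mathcal{H} \nvdash \neg G_{\mathcal{H}}$.
- (H-Incompleteness Theorem II): The consistency of $\mathcal{H}$ cannot be proven within $\mathcal{H}$ itself: $\mathcal{H} \nvdash \mathrm{Con}(\mathcal{H})$.
- Helical Gödel Sentence: $G_{\mathcal{H}} \leftrightarrow \neg \mathrm{Prov}_{\mathcal{H}}(\ulcorner G_{\mathcal{H}} \urcorner)$.
7.2. Fixed-Point Construction of the Helical Gödel Sentence
7.3. Formal Proof Sketch of Undecidability
- Suppose $\mathcal{H} \vdash G_{\mathcal{H}}$. Then, by definition of $G_{\mathcal{H}}$, we have: $\mathcal{H} \vdash \neg \mathrm{Prov}_{\mathcal{H}}(\ulcorner G_{\mathcal{H}} \urcorner)$, which contradicts the fact that $\mathrm{Prov}_{\mathcal{H}}(\ulcorner G_{\mathcal{H}} \urcorner)$ holds under soundness.
- Suppose $\mathcal{H} \vdash \neg G_{\mathcal{H}}$. Then: $\mathcal{H} \vdash \mathrm{Prov}_{\mathcal{H}}(\ulcorner G_{\mathcal{H}} \urcorner)$, implying $\mathcal{H}$ proves a falsehood, violating consistency.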
8. Principles of Helical Differentiation
8.1. Formal Definition: Helical Embedding
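| Definition |
| Let $k \in \mathbb{N}$, and let $T_i > 0$ for each $i = 1, \dots, k$. Define the helical embedding $h : \mathbb{R} \to \mathbb{R}^{2k+1}$ as the smooth mapping: $h(x) = \big(x, \cos(2\pi x/T_1), \sin(2\pi x/T_1), \dots, \cos(2\pi x/T_k), \sin(2\pi x/T_k)\big)$ |
- The function is infinitely differentiable: $h \in C^{\infty}(\mathbb{R}, \mathbb{R}^{2k+1})$
- The first coordinate is linear: $h_0(x) = x$
- The remaining coordinates are trigonometric, encoding periodic features with independent frequencies $\omega_i = 2\pi / T_i$
- The trajectory of $h$ lies on a smooth, non-self-intersecting spiral in high-dimensional space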
8.2. Helical Function Space
8.3. First-Order Helical Derivative
9. Helical Gradient
9.1. Helical Chain Rule
9.2. Second-Order Helical Derivative (Helical Hessian)
- Linear term:
- Periodic terms:
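The split into a linear term and periodic terms can be made concrete. The sketch below assumes the embedding $h(x) = (x, \cos(\omega_i x), \sin(\omega_i x))_i$ with $\omega_i = 2\pi/T_i$, and checks the closed-form first derivative against a central finite difference. The function names are hypothetical.

```python
import math


def helix(x, periods):
    """h(x) = (x, cos(w_i x), sin(w_i x))_i with w_i = 2*pi/T_i."""
    out = [float(x)]
    for T in periods:
        w = 2.0 * math.pi / T
        out.extend([math.cos(w * x), math.sin(w * x)])
    return out


def helix_d1(x, periods):
    """First derivative: linear term 1, periodic terms (-w sin(w x), w cos(w x))."""
    out = [1.0]
    for T in periods:
        w = 2.0 * math.pi / T
        out.extend([-w * math.sin(w * x), w * math.cos(w * x)])
    return out


def helix_d2(x, periods):
    """Second derivative: linear term 0, periodic terms scaled by -w**2."""
    out = [0.0]
    for T in periods:
        w = 2.0 * math.pi / T
        out.extend([-w * w * math.cos(w * x), -w * w * math.sin(w * x)])
    return out


def central_difference(f, x, periods, eps=1e-5):
    """Numerical derivative of a vector-valued function f at x."""
    hi, lo = f(x + eps, periods), f(x - eps, periods)
    return [(p - q) / (2.0 * eps) for p, q in zip(hi, lo)]


if __name__ == "__main__":
    periods = [2, 5, 10]  # illustrative assumption
    print(helix_d1(1.3, periods))
    print(central_difference(helix, 1.3, periods))
```

Under this form the linear coordinate contributes nothing at second order. Each periodic pair satisfies $h'' = -\omega_i^2 h$ on its plane, the harmonic-oscillator relation.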
9.3. Helical Tangent Space and Directional Derivative
9.4. Applications
- Interpretation of neurons in LLMs as helical differential detectors
- Addition operations modeled via tangent-space translation
- Harmonic periods encode multi-scale structure, akin to Fourier analysis
10. Logic Operations on Helical Representations
10.1. Axiom H-¬ (Negation: Phase Inversion)
10.2. Axiom H-∧ (Conjunction: Harmonic Product)
10.3. Axiom H-∨ (Disjunction: SoftMax)
Axiom H-→ (Implication: Cosine Similarity)
10.4. Axiom H-∀ (Universal Quantifier)
10.5. Axiom H-∃ (Existential Quantifier)
11. Modal and Temporal Logic on Helical Embeddings
11.1. Axiom H-□ (Necessity: Phase Invariance)
11.1.0.1. Axiom H-F (Eventually): Future Activation
11.1.0.2. Axiom H-G (Always): Periodic Recurrence
11.2. Interpretation Table
| Operator | Helical Interpretation | Description |
|---|---|---|
| □ | Phase-stable across local domain | Necessity as local smooth invariance |
| ◇ | Match exists in helix space | Possibility via phase compatibility |
| F | Reachable activation point | Eventually true in the future |
| G | Periodically stable truth | Always true along time-like axis |
11.3. Torus Embeddings: Concept and Construction
- Multi-scale representation: Different periods $T_i$ isolate patterns at varying “length-scales” of x.
- Reduced aliasing: Incommensurate periods make large-x collisions unlikely.
- Parameter cost: Each extra period adds two dimensions; a $k$-period embedding has size $2k$.
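The aliasing and parameter-cost points above can be explored with a small experiment. The sketch below assumes a torus embedding made only of cosine/sine pairs (no linear coordinate) and searches for the first positive integer whose embedding coincides with that of 0. Integer periods are used so that collisions are exact: pairwise coprime periods stand in for incommensurate ones. Function names and period choices are hypothetical.

```python
import math


def torus_embedding(x, periods):
    """One (cos, sin) pair per period: 2k coordinates for k periods."""
    coords = []
    for T in periods:
        theta = 2.0 * math.pi * x / T
        coords.extend([math.cos(theta), math.sin(theta)])
    return coords


def first_collision(periods, limit, tol=1e-9):
    """Smallest integer x in 1..limit that aliases with 0, or None if there is none."""
    origin = torus_embedding(0, periods)
    for x in range(1, limit + 1):
        point = torus_embedding(x, periods)
        if all(abs(p - q) < tol for p, q in zip(point, origin)):
            return x
    return None


if __name__ == "__main__":
    print(first_collision([2, 5], 1000))
    print(first_collision([7, 11, 13], 2000))
```

For integer periods the first collision occurs at their least common multiple. Three coprime periods below 14 already separate every integer up to 1000 while costing only six dimensions.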
11.4. Helical Arithmetic Meets Quantum Phase Encoding
- Enhanced representational power: Quantum-inspired phases allow modeling interference effects, potentially capturing higher-order periodic patterns unattainable by real-valued helices alone.
- Unitary constraints: Imposing orthonormal (unitary) transformations on the phase channels can improve stability and invertibility of embeddings.
- Quantum simulator experiments: Prototype a small quantum circuit that encodes integers as phases, applies QFT, and measures output probabilities for addition-related tasks—then compare with a classical helix-based network.
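Before building a circuit, the phase-encoding idea can be prototyped classically. The sketch below is a plain simulation, not a quantum program and not a procedure taken from the text: it represents an integer $a$ by the phases $e^{2\pi i a k/N}$ for $k = 0, \dots, N-1$ (the amplitudes a quantum Fourier transform would assign to the basis state for $a$, up to normalization), adds two integers by multiplying phases, and decodes with an inverse discrete Fourier transform. All names are hypothetical.

```python
import cmath
import math


def phase_encode(a, n):
    """Phases exp(2*pi*i*a*k/n) for k = 0..n-1 (unnormalized Fourier-basis amplitudes)."""
    return [cmath.exp(2j * math.pi * a * k / n) for k in range(n)]


def phase_add(u, v):
    """Element-wise product: phases add, so the encoded integers add modulo n."""
    return [x * y for x, y in zip(u, v)]


def decode(phases):
    """Inverse DFT followed by picking the most probable outcome."""
    n = len(phases)
    best_m, best_prob = 0, -1.0
    for m in range(n):
        amp = sum(p * cmath.exp(-2j * math.pi * m * k / n) for k, p in enumerate(phases)) / n
        prob = abs(amp) ** 2  # measurement probability of outcome m
        if prob > best_prob:
            best_m, best_prob = m, prob
    return best_m


if __name__ == "__main__":
    n = 16
    print(decode(phase_add(phase_encode(5, n), phase_encode(13, n))))
```

The result is the sum modulo $N$, the same wrap-around behavior as a single-period helix. This makes the two representations directly comparable on addition tasks.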
12. Summary and Outlook
12.1. Latent Geometry in Neural Representations
12.2. Reconstructing Arithmetic from Geometry
12.3. Axiomatic Reformulation
- Successor as unit translation along the helix
- Sets as collections of helically embedded vectors
- Logical axioms reformulated through trigonometric transformations
12.4. Differentiation and Calculus on the Helix
- Helical Gradient
- Helical Hessian for second-order derivatives
- Chain rule and tangent spaces defined over latent manifolds
12.5. Latent-Space Logic and Modal Operators
- Negation = phase inversion ($\pi$-shift)
- Conjunction = harmonic product with ReLU
- Disjunction = softmax across embeddings
- Quantifiers = min/max operations over embedded domains
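One possible reading of these four operations is sketched below. Everything in it beyond the bullet list is an assumption made for illustration: truth is scored by the cosine of a phase angle (so scores lie in $[-1, 1]$), conjunction multiplies ReLU-gated scores, disjunction is a softmax-weighted average, and quantifiers take the minimum or maximum over a finite domain.

```python
import math


def truth_score(theta):
    """Hypothetical truth score of a phase angle: cos(theta) in [-1, 1]."""
    return math.cos(theta)


def negate(theta):
    """Negation as phase inversion: shift the angle by pi."""
    return theta + math.pi


def relu(v):
    """Rectified linear unit."""
    return max(0.0, v)


def conj(score_a, score_b):
    """Conjunction as a product of ReLU-gated scores (one possible reading)."""
    return relu(score_a) * relu(score_b)


def soft_or(scores):
    """Disjunction as a softmax-weighted average of the scores."""
    weights = [math.exp(s) for s in scores]
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total


def forall(scores):
    """Universal quantifier as the minimum over a finite embedded domain."""
    return min(scores)


def exists(scores):
    """Existential quantifier as the maximum over a finite embedded domain."""
    return max(scores)


if __name__ == "__main__":
    theta = 0.3
    print(truth_score(theta), truth_score(negate(theta)))
    print(conj(0.9, -0.5), soft_or([0.9, -0.5]))
```

Under this reading, negating twice returns the original score, and the soft disjunction always lies between the mean and the maximum of its inputs.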
12.6. Torus and Quantum Extensions
13. Future Work Commentary
- Empirical Embedding Analysis. Apply the coiling framework to newer models (e.g., GPT-4o, Claude 3) to verify whether their internal number embeddings follow a helical pattern, and if so, with what frequencies, dimensions, and precision.
- Architecture-Level Enhancements. Design helix-aware modules or positional encodings that explicitly model arithmetic as translation in phase space. Evaluate improvements on benchmarks requiring symbolic and numerical generalization.
- Latent-Space Logic Operations. Investigate the practical implementation of differentiable logic gates and quantifiers within neural architectures. Could a network be trained to perform logic directly in helix space, using geometric rules?
- Integration with QAL (Qualia Abstraction Language). Explore how QAL, a formal language for encoding introspective, cognitive, or modal states, could be layered atop helical embeddings to represent non-numeric but structured latent concepts. The compositional and gradient-based nature of QAL makes it an ideal testbed for mapping abstract qualia to geometric embeddings.
- Gödelian Numbering Over Helices. Although the paper sets aside incompleteness, future work could revisit Gödel numbering schemes specifically adapted to the helical setting. Encoding syntactic structures as multi-scale Fourier components could yield new results in formal verification, latent program tracing, and logic circuit emulation.
- Quantum-Inspired Representations. Explore complex phase-based embeddings (e.g., $e^{2\pi i x/T}$) as alternatives to real-valued helices. These may better simulate phenomena such as superposition, interference, and entanglement within classical systems.
- Cross-Disciplinary Connections. Coiling arithmetic aligns with:
- Grid cell encoding in neuroscience [16],
- Oscillatory cognition in psychology [17],
- Signal decomposition in physics [18].
Collaboration across fields could deepen our understanding of whether nature itself uses coiling-like encodings for abstract reasoning.
References
- Androsiuk, J., L. Kułak, and K. Sienicki. "Neural network solution of the Schrödinger equation for a two-dimensional harmonic oscillator." Chemical Physics 173, no. 3 (1993): 377-383.
- Kułak, L., K. Sienicki, and C. Bojarski. "Neural network support of the Monte Carlo method." Chemical Physics Letters 223, no. 1-2 (1994): 19-22.
- Chen, Ricky T. Q., Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. Advances in Neural Information Processing Systems (NeurIPS), 2018. https://arxiv.org/abs/1806.07366.
- Lee-Thorp, James, Joshua Ainslie, Ilya Eckstein, and Santiago Ontañón. FNet: Mixing Tokens with Fourier Transforms. arXiv preprint arXiv:2105.03824 (2021). https://arxiv.org/abs/2105.03824.
- Cohen, Taco S., and Max Welling. Group Equivariant Convolutional Networks. Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016. https://arxiv.org/abs/1602.07576.
- Kantamneni, Subhash, Ziming Liu, and Max Tegmark. "How Do Transformers 'Do' Physics? Investigating the Simple Harmonic Oscillator." arXiv preprint arXiv:2405.17209 (2024). https://arxiv.org/pdf/2405.17209.
- Kantamneni, A. and Tegmark, M. (2025). Emergent Helical Representations in Large Language Models. arXiv preprint arXiv:2310.02255. https://arxiv.org/abs/2310.02255.
- Chang, F. C., Lin, Y. C., & Wu, P. Y. (2024). Unraveling Arithmetic in Large Language Models: The Role of Algebraic Structures. arXiv preprint arXiv:2411.16260. https://arxiv.org/abs/2411.16260.
- Tancik, Matthew, Pratul Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan Barron, and Ren Ng. "Fourier features let networks learn high frequency functions in low dimensional domains." Advances in neural information processing systems 33 (2020): 7537-7547. https://proceedings.neurips.cc/paper_files/paper/2020/file/55053683268957697aa39fba6f231c68-Paper.pdf. Liu, Jiaheng, Dawei Zhu, Zhiqi Bai, Yancheng He, Huanxuan Liao, Haoran Que, Zekun Wang et al. "A comprehensive survey on long context language modeling." arXiv preprint arXiv:2503.17407 (2025). https://arxiv.org/pdf/2503.17407?. Abbasi, Jassem, Ameya D. Jagtap, Ben Moseley, Aksel Hiorth, and Pål Østebø Andersen. "Challenges and advancements in modeling shock fronts with physics-informed neural networks: A review and benchmarking study." arXiv preprint (2025). https://arxiv.org/pdf/2503.17379? arXiv:2503.17407.
- Yuan, Z., & Yuan, H. (2024). How Well Do Large Language Models Perform in Arithmetic Tasks? arXiv preprint arXiv:2401.01175. https://arxiv.org/abs/2401.01175. Forootani, Ali. "A survey on mathematical reasoning and optimization with large language models." arXiv preprint arXiv:2503.17726 (2025). https://arxiv.org/pdf/2503.17726. Heneka, Caroline, Florian Nieser, Ayodele Ore, Tilman Plehn, and Daniel Schiller. "Large Language Models–the Future of Fundamental Physics?." arXiv preprint arXiv:2506.14757 (2025). https://arxiv.org/pdf/2506.14757. Schorcht, Sebastian, Franziska Peters, and Julian Kriegel. "Communicative AI Agents in Mathematical Task Design: A Qualitative Study of GPT Network Acting as a Multi-professional Team." Digital Experiences in Mathematics Education 11, no. 1 (2025): 77-113. https://link.springer.com/content/pdf/10.1007/s40751-024-00161-w.pdf.
- Wang, C., Zheng, B., Niu, Y., & Zhang, Y. (2021). Exploring Generalization Ability of Pretrained Language Models on Arithmetic and Logical Reasoning. Journal of Chinese Information Processing. https://link.springer.com/article/10.1007/s11618-021-0645-8.
- Geva, M., Schuster, T., & Berant, J. (2021). Transformer Feed-Forward Layers Are Key-Value Memories. arXiv preprint arXiv:2106.05313. https://arxiv.org/abs/2106.05313. Ji, Ziwei, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko Ishii, Ye Jin Bang, Andrea Madotto, and Pascale Fung. "Survey of hallucination in natural language generation." ACM computing surveys 55, no. 12 (2023): 1-38. https://arxiv.org/pdf/2202.03629. Liu, Sijia, Yuanshun Yao, Jinghan Jia, Stephen Casper, Nathalie Baracaldo, Peter Hase, Yuguang Yao et al. "Rethinking machine unlearning for large language models." Nature Machine Intelligence (2025): 1-14. https://arxiv.org/pdf/2402.08787. Zhao, Haiyan, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, and Mengnan Du. "Explainability for large language models: A survey." ACM Transactions on Intelligent Systems and Technology 15, no. 2 (2024): 1-38. https://dl.acm.org/doi/pdf/10.1145/3639372.
- T. Hafting, M. Fyhn, S. Molden, M.-B. Moser, and E. I. Moser, “Microstructure of a spatial map in the entorhinal cortex,” Nature 2005, 436, 801–806.
- M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2010.
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin,“Attention is All You Need,” Advances in Neural Information Processing Systems (NeurIPS), vol. 30, 2017. https://arxiv.org/abs/1706.03762. Das, Badhan Chandra, M. Hadi Amini, and Yanzhao Wu. "Security and privacy challenges of large language models: A survey." ACM Computing Surveys 57, no. 6 (2025): 1-39. https://arxiv.org/pdf/2402.06196, Xi, Zhiheng, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang et al. "The rise and potential of large language model based agents: A survey." Science China Information Sciences 68, no. 2 (2025): 121101. https://arxiv.org/pdf/2309.07864. Hayes, Thomas, Roshan Rao, Halil Akin, Nicholas J. Sofroniew, Deniz Oktay, Zeming Lin, Robert Verkuil et al. "Simulating 500 million years of evolution with a language model." Science (2025): eads0018. https://www.biorxiv.org/content/biorxiv/early/2024/12/31/2024.07.01.600583.full.pdf.
- Dang, Suogui, Yining Wu, Rui Yan, and Huajin Tang. "Why grid cells function as a metric for space." Neural Networks 2021, 142, 128–137.
- Lundqvist, Mikael, and Andreas Wutz. "New methods for oscillation analyses push new theories of discrete cognition." Psychophysiology 59, no. 5 (2022): e13827. https://onlinelibrary.wiley.com/doi/pdfdirect/10.1111/psyp.13827.
- Eriksen, Thomas. "Data-driven Signal Decomposition Approaches: A Comparative Analysis." arXiv preprint arXiv:2208.10874 (2022). https://arxiv.org/pdf/2208.10874.
| 1 | Principal Component Analysis |
| 2 | t-distributed Stochastic Neighbor Embedding |
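| 3 | The Pauli-Z operator is defined as $Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. It flips the phase of $\lvert 1 \rangle$ but leaves $\lvert 0 \rangle$ unchanged: $Z\lvert 0 \rangle = \lvert 0 \rangle$, $Z\lvert 1 \rangle = -\lvert 1 \rangle$. It is Hermitian, unitary, and satisfies $Z^2 = I$. In quantum embeddings, it generates phase shifts as in $e^{-i\theta Z/2}$. |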
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).