Submitted:
23 January 2026
Posted:
26 January 2026
Abstract
Keywords:
1. The Methodological Principles
2. The Patching Fallacy: Mathematical Anatomy
3. Case Study: The Voynich Manuscript
3.1. The Evidence Vector (E)
1. (Extreme Uniqueness): Prior quantitative analyses (e.g., using EVA consensus transcriptions [10]) report that approximately of word types are singletons (hapax legomena), with uncertainty estimated via resampling over pages/folios. (Note: this percentage refers to the proportion of distinct word types that appear exactly once in the corpus, not the proportion of tokens that are hapax; the latter is typically much lower and less diagnostic of the anomaly.)
2. (Rigid Morphology): Published morphological characterizations (e.g., [10]) report that roughly of tokens conform to a constrained Prefix–Root–Suffix (or comparable) template under standard EVA-style segmentation and community-accepted tokenization conventions.
3. (Sectional Disjointness): Reported vocabulary overlap between major manuscript sections is low; typical summaries give an average Jaccard overlap , consistent with Currier’s observation of distinct “languages”/dialects [9].
4. (Positional Rigidity): Multiple analyses of EVA-style transcriptions (e.g., [10]) report glyph(s) with near-zero positional variance (e.g., restricted to line-initial position), indicating strong layout-conditioned constraints.
5. (Contextual Compression): Published comparisons of illustration labels vs. running text (e.g., [10]) report systematic reduction: labels preferentially use a strict subset of the morphological components found in body text.
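Two of the statistics listed above, the hapax proportion and the Jaccard overlap, are simple enough to sketch directly. The following is a minimal illustration, assuming tokens are already available as plain strings; the function names and the toy token lists are hypothetical and are not taken from any published transcription pipeline. It keeps the type-level and token-level hapax proportions separate, since the first item distinguishes them.

```python
from collections import Counter

def hapax_type_fraction(tokens):
    """Share of distinct word types that occur exactly once."""
    counts = Counter(tokens)
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c == 1) / len(counts)

def hapax_token_fraction(tokens):
    """Share of running tokens whose type occurs exactly once."""
    counts = Counter(tokens)
    if not tokens:
        return 0.0
    return sum(1 for c in counts.values() if c == 1) / len(tokens)

def jaccard_overlap(section_a, section_b):
    """Jaccard overlap of two sections' vocabularies (sets of types)."""
    a, b = set(section_a), set(section_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Toy tokens for illustration only, not a real transcription sample.
toy_section_1 = ["daiin", "chedy", "daiin", "qokeedy", "shol"]
toy_section_2 = ["shol", "okain", "otedy"]

print(hapax_type_fraction(toy_section_1))
print(hapax_token_fraction(toy_section_1))
print(jaccard_overlap(toy_section_1, toy_section_2))
```

On real data the type-level figure depends heavily on the tokenization and transcription conventions chosen, so any comparison against a reported value should use the same conventions as the source being compared.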
3.2. Rejection of Postulated (Patched) Models
- Natural Language (): Contradicted by and the lack of Zipfian function words [8]. Maintained only by patching with “Unknown Dialect,” “Extreme Abbreviation,” or “Polyglot” parameters. These unconstrained parameters incur massive Occam penalties. Verdict: Strongly disfavored.
- Cipher (): Contradicted by (simple substitution destroys morphological regularity; homophonic substitution contradicts ). Maintained only by patching with “Anomalous Plaintext” or “Stochastic Nulls.” Verdict: Strongly disfavored.
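The Occam penalties invoked above can be made concrete with one common approximation, the Bayesian Information Criterion of Schwarz [4]. Whether the penalty used in this paper takes exactly this form is an assumption; the symbols $k$ (free parameters), $n$ (observations), $\hat{L}$ (maximized likelihood), and $\Delta k$ (extra patch parameters) are introduced here only for illustration.

```latex
% BIC for a model with k free parameters fitted to n observations (lower is better)
\mathrm{BIC} = k \ln n - 2 \ln \hat{L}

% A patched model with \Delta k additional free parameters is preferred only if
2\left(\ln \hat{L}_{\mathrm{patched}} - \ln \hat{L}_{\mathrm{base}}\right) > \Delta k \, \ln n
```

Under this criterion, each unconstrained patch parameter has to buy a log-likelihood gain of more than $(\ln n)/2$ to pay for itself.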
3.3. The Deduced Prior: Structured Reference System ()
4. Formal Specification of the Deduced Model ()
4.1. Distinction between Calibration and Patching
- Calibration: Determining the value of a constant required by the deduced structure (e.g., measuring G in $F = G m_1 m_2 / r^2$). This fixes the specific realization of the model but does not alter its complexity class.
- Patching: Introducing new structural terms or auxiliary rules to force a fit (e.g., adding an extra term to the gravity equation because the data deviate). This increases model complexity to absorb error.
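The distinction can be illustrated with a toy least-squares fit. This is a minimal sketch, assuming a law of the form y = c·x (as with G in Newton's law, where x would stand for m1·m2/r² and y for the force); the function names and the data values are hypothetical. Calibration estimates the single constant c within the fixed structure, while the patched fit adds an offset term d that the structure never called for.

```python
def calibrate_constant(x, y):
    """Least-squares value of c in y = c * x (structure fixed, one constant)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def patch_with_offset(x, y):
    """Least-squares fit of y = c * x + d; d is an added structural term."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    c = sxy / sxx
    return c, mean_y - c * mean_x

def sum_squared_error(predictions, y):
    """Sum of squared residuals between predictions and observations."""
    return sum((p - yi) ** 2 for p, yi in zip(predictions, y))

# Toy measurements (made up for illustration).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

c_cal = calibrate_constant(x, y)
c_patch, d_patch = patch_with_offset(x, y)

sse_cal = sum_squared_error([c_cal * xi for xi in x], y)
sse_patch = sum_squared_error([c_patch * xi + d_patch for xi in x], y)

print(c_cal, sse_cal)
print(c_patch, d_patch, sse_patch)
```

Because the calibrated model is nested inside the patched one, the patched fit can never have a larger residual. A smaller residual alone is therefore not evidence for the extra term; the gain has to be weighed against the added parameter.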
4.2. Model Components and Topology
- P: Prefix sequences (Metadata/Classifiers).
- R: Root sequences (Primary Identifiers/Keys).
- S: Suffix sequences (Status/State markers).
- D: Delimiters (Record separators).
4.3. The Generative Template
4.4. Likelihood and Zero-Patch Constraint
4.5. Specific Falsifiers for (The Kill List)
1. Low Root Uniqueness: If the set R (Roots) is found to follow a standard natural-language Zipfian curve (small core vocabulary) rather than the reported high-hapax profile ( uniqueness), the “Primary Key” interpretation collapses.
2. High Intra-Token Entropy: If Mutual Information is high (implying grammatical agreement or vowel harmony), the orthogonality of ID vs. Metadata is disproven.
3. Delimiter Mobility: If glyphs identified as D are shown to have high positional variance (scattering randomly), the record-structure hypothesis fails.
4. Significant Sectional Overlap: If , the thematic partitioning hypothesis fails.
5. Failure of Label Compression: If labels do not show systematic stripping of P (Prefixes) relative to the body text, the “Contextual Compression” prediction fails.
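Two of these falsifiers reduce to standard statistics that can be sketched briefly: mutual information between token components (falsifier 2) and positional variance of candidate delimiters (falsifier 3). The sketch below assumes tokens have already been segmented into (prefix, root) pairs and that glyph positions are normalized to the range 0 to 1 within a line; both conventions, the function names, and the toy inputs are assumptions for illustration. No decision thresholds are supplied, since none survive in the text above.

```python
import math
from collections import Counter
from statistics import pvariance

def mutual_information(pairs):
    """Plug-in estimate of I(X;Y) in bits from observed (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    # p(x,y) / (p(x) p(y)) simplifies to count * n / (count_x * count_y).
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def positional_variance(positions):
    """Population variance of normalized in-line positions (0 = line start)."""
    return pvariance(positions)

# Toy inputs (made up): independent vs. fully dependent component pairs.
independent = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
dependent = [("a", "x"), ("b", "y"), ("a", "x"), ("b", "y")]

print(mutual_information(independent))
print(mutual_information(dependent))
print(positional_variance([0.0, 0.0, 0.0]))
```

The plug-in estimator is biased upward when many component values are rare, which is the expected situation for a high-hapax root set, so a raw estimate should be compared against a shuffled baseline (pairs with roots permuted) rather than against zero.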
5. Conclusions
Acknowledgments
Appendix A. Operational Falsification Protocol
References
1. Bayes, T. An Essay towards solving a Problem in the Doctrine of Chances. In Philosophical Transactions of the Royal Society of London; 1763.
2. Shannon, C. E. A Mathematical Theory of Communication. In Bell System Technical Journal; 1948.
3. Popper, K. The Logic of Scientific Discovery; Hutchinson & Co, 1959.
4. Schwarz, G. Estimating the dimension of a model. Annals of Statistics 1978, 6(2), 461–464.
5. Jaynes, E. T. Probability Theory: The Logic of Science; Cambridge University Press, 2003.
6. MacKay, D. J. C. Information Theory, Inference and Learning Algorithms; Cambridge University Press, 2003.
7. Friston, K. The free-energy principle: a unified brain theory? Nature Reviews Neuroscience 2010.
8. Zipf, G. K. Human Behavior and the Principle of Least Effort; Addison-Wesley, 1949.
9. Currier, P. Papers on the Voynich Manuscript. In New Research on the Voynich Manuscript; 1976.
10. Voynich.nu. The EVA Transcription Consensus. 2024. Available online: https://www.voynich.nu/analysis.html.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
