Preprint
Article

This version is not peer-reviewed.

Assembly Theory - Formalizing Assembly Spaces, Discovering Patterns and Bounds

Submitted:

27 December 2025

Posted:

29 December 2025

You are already at the latest version

Abstract
Assembly theory defines structural complexity as the minimum number of steps required to construct an object in an assembly space. We formalize the assembly space as an acyclic digraph of strings. Key results include analytical bounds on the minimum and maximum assembly indices as functions of string length and alphabet size, and relations between the assembly index (ASI), assembly depth, depth index, Shannon entropy, and expected waiting times for strings drawn from uniform distributions. We identify patterns in minimum- and maximum-ASI strings and provide construction methods for the latter. While computing ASI is NP-complete, we develop efficient implementations that enable ASI computation of long strings. We establish a counterintuitive, inverse relationship between a string ASI and its expected waiting time. Geometric visualizations reveal that ordered decimal representations of low ASI bitstrings of even length N naturally cluster on diagonals and oblique lines of the squares with sides equal to 2N/2. Comparison with grammar-based compression (Re-Pair) shows that ASI provides superior compression by exploiting global combinatorial patterns. These findings advance complexity measures with applications in computational biology (where DNA sequences must violate Chargaff's rules to achieve minimum ASI), graph theory, and data compression.
Keywords: 
;  ;  ;  ;  ;  ;  ;  
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Prerpints.org logo

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

Subscribe

Disclaimer

Terms of Use

Privacy Policy

Privacy Settings

© 2025 MDPI (Basel, Switzerland) unless otherwise stated