Preprint
Article

This version is not peer-reviewed.

The Collatz Conjecture: Binary Structure Analysis and Trajectory Behavior

Submitted:

29 October 2025

Posted:

31 October 2025

Read the latest preprint version here

Abstract
This paper advances the Collatz conjecture by analyzing binary representations of natural numbers through fractional parts. We introduce a direct non-recursive relation for intermediate mantissas $\sigma_j$ in binary decompositions and prove their equidistribution using Weyl's theorem. \textbf{The self-correcting dynamics of $\sigma_j$ ensure a balance between 1s and 0s, leading to an asymptotic density of 1/2 for 1s in binary $3^n$.} This yields a probabilistic estimate: in half of all cases, remainders have many leading zeros, ensuring rapid descent. Theorems estimate zero density in powers of three and demonstrate sequence decrease for large $n$. Numerical verifications and updated figures support the findings, providing strong evidence for convergence in large cases.
Keywords: 
;  ;  

1. Literature Review

The Collatz conjecture, also known as the 3x+1 problem, is one of the most famous unsolved problems in mathematics. It proposes that for any positive integer N, repeated application of the function (divide by 2 if even; replace with 3 N + 1 if odd) eventually reaches 1. This overview summarizes key contributions from the listed references, focusing on historical context, theoretical advances, computational verifications, statistical properties, and connections to binary representations and sequences. These sources provide a foundation for understanding the conjecture’s complexity, partial results, and related mathematical structures, as explored in the paper on binary decomposition and equidistribution.

1.1. Historical and Biographical Context

  • Lothar Collatz’s Biography [1]: Lothar Collatz (1910–1990) was a German mathematician known for numerical analysis. He proposed the conjecture in 1937 while working on graph theory. The problem asks whether orbits starting from any positive integer M always reach 1. Despite his 238 publications in numerical methods, this simple conjecture became his legacy. Verified up to 10 14 , it remains open, highlighting its deceptive simplicity.
  • Lagarias’s Overview [3]: This survey details the conjecture’s generalizations, such as replacing 3 with other odd integers or extending to negative/zero values. It discusses equivalent formulations (e.g., Syracuse map on odd integers) and open questions, like cycle existence beyond the known ( 1 , 4 , 2 , 1 ) loop. Lagarias emphasizes computational checks and partial proofs, setting the stage for modern results.

1.2. Recent Theoretical Advances

  • Tao’s Paper [2]: Terence Tao proves that for any function f ( N ) as N , almost all orbits (in logarithmic density) have minimum value < f ( N ) . This means orbits are "almost bounded" for nearly all N, strengthening prior bounds (e.g., Korec’s N θ with θ 0.792 ). Using probabilistic models of Syracuse iterates and 3-adic distribution, Tao shows superpolynomial decay in characteristic functions, implying typical orbits drop below polylog N . This supports the conjecture by showing divergence is rare.
  • Hempel’s Paper [12]: Focuses on asymptotic distribution of fractional parts { n α } for irrational α (e.g., α = log 2 3 ). The equidistribution theorem (Weyl’s) ensures uniform distribution modulo 1, crucial for "random" binary digits in 3 n = 2 n log 2 3 . This underpins heuristics in Tao and Sinai’s work.

1.3. Binary Representations of Powers of 3

  • MathOverflow Question [4]: Discusses the longest consecutive 1s ( L n ) in binary 3 n . Simulations up to n = 10000 show L n < 3.5 log n (max 24 ), suggesting logarithmic bounds. Answers model digits as random coin flips, estimating max L n over n N as 2 log 2 N ( 25.6 for N = 10000 , close to observed 27). Code for simulations is provided.
  • Cook’s Blog [5]: Visualizes binary 3 n as a grid (rows for n = 0 to 59), showing semi-chaotic patterns with a slope log 2 3 boundary. Python code replaces 1s with blocks for terminal display. Extends to bases 5,7 (similar chaos) and even 6 (skewed like 3’s but shifted).
  • Wolfram Article [6]: Analyzes binary 3 n grid, revealing regularities in 3 2 n subsequences converging 2-adically to 1. Resolves "mysteries" (e.g., lim 3 2 n = 1 in 2-adics) using p-adic numbers, where high 2-powers are "small." Generalizes to k p n in base p converging to Teichmüller representatives. Connects to sequences like Fibonacci, Catalan, and factorials in p-adics.
These sources highlight that binary 3 n is neither fully regular nor random, with local structures (triangles) and global limits, aligning with the paper’s focus on binary decomposition.

1.4. Statistical and Probabilistic Properties

  • Sinai’s Paper [8]: Presents theorems on statistical properties of trajectories for large x. Models the map T (odd x to ( 3 x + 1 ) / 2 k where result is odd) with ergodic theory, showing invariant measures and entropy. Suggests statistical regularity in long orbits, supporting "almost all" results like Tao’s.

1.5. Computational Verifications and Bounds

  • Barina’s Paper [9]: Verifies the conjecture up to 2.95 × 10 20 using efficient algorithms and hardware (GPUs). Discusses tree structures in orbits and halting times, providing strong empirical support.
  • Krasikov & Lagarias [10]: Uses difference inequalities to derive bounds like min orbit N 0.84 for large N. Improves earlier estimates, contributing to partial proofs for large N.

1.6. Related Mathematical Theory

  • Allouche & Shallit Book [7]: Comprehensive on automatic sequences (generated by finite automata, e.g., binary 3 n as morphic words). Covers theory, numeration systems, and applications to Collatz-like problems, emphasizing complexity in binary powers.
  • Everest et al. Book [11]: Surveys linear recurrence sequences (e.g., Fibonacci) and generalizations, with applications to Diophantine equations and Collatz orbits. Discusses p-adic limits and impacts of recent theorems.
These books provide theoretical tools for analyzing binary structures in powers and recurrences, relevant to equidistribution in the paper.

1.7. Synthesis and Relevance to the Paper

The references collectively illustrate the Collatz conjecture’s enduring appeal: simple to state but resistant to proof, with partial results on "almost all" orbits [2,8] and empirical verification [9]. Binary representations of 3 n [4,5,6] show semi-random patterns explained by p-adics, aligning with the paper’s binary decomposition and Weyl’s equidistribution for fractional parts [12]. Bounds and generalizations [3,10] support density estimates, while books on sequences [7,11] provide foundational theory. Overall, these sources reinforce the paper’s probabilistic approach to convergence in large cases, blending theory, computation, and visualization.

2. Introduction

This article studies the density of zeros in binary representations of natural numbers using fractional parts, which naturally encode binary structure. We develop a framework linking binary gaps, mantissas, and Collatz dynamics. This problem is highly relevant to counting the number of zeros in the binary expansion of 3 n and has not been resolved. It was posed in the works … [4,5,6].

3. Materials and Methods

Zeros dominate Collatz descent: each zero enables a division by 2, outpacing 3 n + 1 growth. For n = 2 k , the sequence reduces in k steps. We decompose M into powers of two and track fractional parts σ j at each stage to quantify zero density.

3.1. Self-Correcting Dynamics of Mantissas σ j

The mantissas σ j induce a self-correcting mechanism in binary expansions, balancing 1s and 0s:
  • Many consecutive 1s increase the tail sum s 1 2 k 1 , driving σ j 0 + (small). By recurrence (Theorem 1), this forces next gap δ j 2 , introducing many 0s.
  • Many 0s decrease s 0 + , driving σ j 1 (large), forcing δ j = 1 , introducing 1s.
This feedback loop ensures local balance (average block of 1s: 2–3; 0s: 2 ), controllable via bounds on runs (Theorems 4, 5). Globally, combined with equidistribution of { n log 2 3 } , it yields asymptotic density 1/2 for 1s in binary 3 n .

4. Results

The Collatz conjecture [1] remains open [3], with progress in [2].
Theorem 1. 
Let M = 3 n , δ j = α j α j + 1 > 0 , α j = α j + ϵ j , σ j = 1 ϵ j . Then
M = i = 1 j 1 2 α i + 2 α j = i = 1 j 2 α i + 2 α j + 1 .
For δ j = 1 ,
σ j = 1 2 σ j + 1 1 ln 2 4 σ j + 1 + F j σ j + 1 3 12 .
For δ j > 1 ,
σ j = 2 δ j σ j + 1 + 1 2 δ j 2 2 δ j + 1 ln 2 2 2 δ j σ j + 1 2 ln 2 4 + 2 2 δ j R j ( ln 2 ) 2 σ j + 1 3 8 .
Proof. 
The recurrence relations are derived from the binary expansion as follows. Start with the identity:
2 ϵ j = 1 + 2 δ j + ϵ j + 1 .
Taking the base-2 logarithm on both sides:
ϵ j = log 2 ( 1 + 2 δ j + ϵ j + 1 ) .
Substituting σ j = 1 ϵ j :
σ j = 1 log 2 ( 1 + 2 δ j + ϵ j + 1 ) .
For δ j = 1 :
σ j = 1 log 2 1 + 2 σ j + 1 1 .
To obtain the approximation, expand log 2 ( 1 + x ) around x = 2 σ j + 1 1 using Taylor series for small perturbations, leading to the quadratic term after substitution. The expansion is:
log 2 ( 1 + x ) = ln ( 1 + x ) ln 2 x x 2 2 + x 3 3 ln 2 .
Substituting x = 2 σ j + 1 1 and simplifying terms yields:
σ j = 1 2 σ j + 1 1 ln 2 4 σ j + 1 + F j σ j + 1 3 12 .
For δ j > 1 , a similar Taylor expansion around the appropriate point gives the stated form, with terms grouped for clarity. □ □
Corollary 1. 
Let σ j ( 0 , 1 ) be the mantissa at step j in the binary decomposition of M N . Then:
1. 
If σ j < 0.45 , then the next binary coefficient after the current gap (i.e., at position α j 1 ) is 1.
2. 
If σ j > 0.55 , then the next binary coefficient after the current gap is 0.
3. 
If σ j 1 , then all subsequent binary coefficients tend to zero except the leading one (i.e., M is close to a power of two).
Proof.
1.
Suppose δ j 2 . The minimal tail contribution is 2 2 = 0.25 , leading to σ j 0.25 even with higher terms. However, for δ j = 1 , the recurrence permits σ j < 0.45 . Numerical inversion of the relation confirms that σ j < 0.45 necessitates δ j = 1 .
2.
Suppose δ j = 1 . The maximum achievable σ j 0.415 < 0.55 . For δ j 2 , σ j 0.3 < 0.55 . Thus, σ j > 0.55 implies δ j 2 .
3.
As σ j 1 , the tail sum s 0 + , implying no significant 1s in the tail beyond the leading term.
□□
Theorem 2. 
The asymptotic density of 1s in the binary expansion of 3 n is 1 / 2 . Consequently, if L n = n log 2 3 + 1 is the bit length of 3 n , the number of zeros is L n h ( n ) , where h ( n ) 1 2 L n , so the number of zeros is asymptotically 1 2 n log 2 3 + o ( n ) .
Proof. 
Since log 2 3 is irrational, by Weyl’s equidistribution theorem [12], the sequence { n log 2 3 } is equidistributed in [ 0 , 1 ) . The binary digits of 3 n can be determined by the continued fraction expansion or successive doubling of the mantissa 2 { n log 2 3 } , and under equidistribution, the digits behave as independent Bernoulli(1/2) random variables in the limit. This implies the Hamming weight h ( n ) (number of 1s) satisfies h ( n ) / L n 1 / 2 as n , where L n n log 2 3 . Therefore, the number of zeros z ( n ) = L n h ( n ) ( 1 / 2 ) n log 2 3 . □ □

4.1. Direct Non-Recursive Relation for σ j

Definition of h. Let h be the total number of 1s in the binary expansion of M (the Hamming weight). The positions of these 1s are denoted p 0 > p 1 > > p h 1 , and the gap lengths are δ i = p i p i + 1 for i = 0 , , h 2 . From the closed-form expression involving the initial mantissa σ 0 ,
2 1 σ 0 = k = 0 h 1 2 i = 0 k 1 δ i ,
we split the sum at index j ( 0 j < h ):
2 1 σ 0 = S j + 2 Δ j · 2 1 σ j ,
where
S j = k = 0 j 1 2 i = 0 k 1 δ i , Δ j = i = 0 j 1 δ i .
The quantity S j is the normalized contribution of the first j binary 1s, and Δ j is the cumulative gap length up to the j-th term. Solving for σ j yields
σ j = 1 log 2 2 1 σ 0 S j
For M = 3 n the partial sum S j can be written explicitly as
S j = k = 0 j 1 2 p k n log 2 3 ,
where p k are the bit positions of the 1s. An equivalent tail-normalised form is obtained as follows. The tail after the j-th term equals
2 1 σ 0 S j = 2 Δ j · 2 1 σ j .
The binary expansion of the tail is
2 Δ j + 2 Δ j δ j + 2 Δ j δ j δ j + 1 +
= 2 Δ j 1 + k = 1 h j 1 2 i = 0 k 1 δ j + i .
Let
s = k = 1 h j 1 2 i = 0 k 1 δ j + i .
Then
2 1 σ j = 1 + s , σ j = 1 log 2 ( 1 + s ) .
σ j = 1 log 2 1 + k = 1 h j 1 2 i = 0 k 1 δ j + i .
The second formula does not contain σ 0 explicitly because the tail s is already normalised to the interval [0,1); the information about the leading part (including σ 0 ) is absorbed into the choice of the splitting point j. For small s,
σ j 1 s ln 2 + s 2 ( ln 2 ) 4 .
For j = 0 , the formula becomes
σ 0 = 1 log 2 1 + k = 1 h 1 2 i = 0 k 1 δ i .
Denote
s = k = 1 h 1 2 i = 0 k 1 δ i .
Then from full normalization
2 1 σ 0 = 1 + s s = 2 1 σ 0 1 ( 0 , 1 ) .
Thus, s is the normalized tail after the leading 1, and
σ 0 = 1 log 2 ( 1 + s ) .
  • As s 0 + (the number is close to a power of two), σ 0 1 , fraction of zeros 1 .
  • As s 1 (1s are densely packed), σ 0 0 + , fraction of zeros 0 .

4.2. Theorem on Maximal Number of 1s in Binary Expansion

Theorem 3 
(Instability of Long Sequences of Ones). The approximations
σ j 1 s ln 2 + s 2 ( ln 2 ) 4 ( for small s )
and
σ j 1 2 σ j + 1 ( for δ j = 1 )
cannot hold simultaneously over a long block of consecutive ones. Such a block leads to exponential growth of σ j backward, contradicting the linear decay implied by the small-s approximation and pushing σ outside [0,1]. Hence, sequences of ones must be interrupted by zeros to maintain σ j ( 0 , 1 ) .
Proof. 
Assume δ i = 1 for i = j , , j + m 1 (a block of m consecutive ones). The recurrence approximation for δ = 1 implies:
σ j 2 m σ j + m .
For large m, the tail sum s 1 , so σ j 0 + by the small-s approximation (adjusted for near-1 s, where the approximation shifts to linear in (1-s)). However, if σ j + m > 0 , then 2 m σ j + m , implying σ j > 1 — a contradiction since σ j ( 0 , 1 ) . Alternatively, chaining forward from a small σ j :
σ j + m 2 m σ j 0 ,
but for consistency with the tail sum, this requires s near 0, contradicting the dense 1s. The linear decay in the small-s approximation cannot coexist with the exponential amplification from the recurrence without violating the bounds. Inserting a zero ( δ i > 1 ) breaks the exponential chain and resets the dynamics to a linear regime, restoring balance. □ □
Theorem 4 
(Maximum Run of Consecutive Ones in the Binary Tail). In the binary decomposition of a natural number M, the maximum length of a run of consecutive 1s in the tail (after the leading 1) is 3. A run of 4 or more consecutive 1s leads to a contradiction between the recurrence relation and the tail sum approximation. Preprints 182767 i001
Proof. 
Consider a block of k consecutive 1s in the tail, i.e., δ i = 1 for i = j , j + 1 , , j + k 1 . The tail sum is exactly:
s = i = 1 k 2 i = 1 2 k .
The mantissa is:
σ j = 1 log 2 ( 1 + s ) = 1 log 2 ( 2 2 k ) .
Using the recurrence approximation for δ j = 1 :
σ j 2 k σ j + k .
Evaluate for specific k:
  • For k = 3 : s = 0.875 , σ j 0.094 , σ j + 3 0.094 / 8 = 0.01175 < 1 — consistent, as the remaining tail can accommodate this small σ .
  • For k = 4 : s = 0.9375 , σ j 0.046 , σ j + 4 0.046 / 16 = 0.002875 . The remaining tail contributes at most i = 5 2 i = 2 4 = 0.0625 , requiring σ j + 4 1 log 2 ( 1 + 0.0625 ) 0.912 for maximal density, but 0.002875 < 0.0625 (even for minimal non-zero tail), leading to contradiction.
For k > 4 , the required σ j + k 2 k σ j becomes even smaller, amplifying the discrepancy with possible tail contributions (bounded by 2 ( k + 1 ) ). Thus, k = 4 is impossible, and the maximum run is 3. □ □
Theorem 5 
(Compensation Mechanism: Series of 1s Lead to Series of 0s). The binary tail of a natural number M exhibits a feedback loop where a series of consecutive 1s (gaps δ = 1 ) must be followed by a series of 0s (gaps δ 2 ), ensuring a balance between 1s and 0s. This mechanism prevents long runs of 1s beyond 3 (per Theorem 4) and supports a density of zeros greater than 1/2 when σ j > 0.55 (per Proposition 3), culminating in an asymptotic density of 1/2 for zeros in the binary expansion of 3 n via equidistribution (per Theorem 2).
Proof. 
We formalize the mechanism in five steps, using exact relations from the mantissa recurrence and tail sums to derive contradictions and bounds.
Let the binary expansion of the tail starting at position p j be defined by gaps δ i 1 , with the normalized tail sum s j = k = 1 h j 1 2 Δ k ( j ) where Δ k ( j ) = i = 0 k 1 δ j + i , and σ j = 1 log 2 ( 1 + s j ) ( 0 , 1 ] .
Step 1: Series of 1s increases s j , making σ j small.
Assume a run of consecutive 1s, i.e., δ j = δ j + 1 = = δ j + m 1 = 1 for m 1s. The tail sum is
s j = i = 1 m 2 i + 2 m s j + m = 1 2 m + 2 m s j + m .
For the maximal m = 3 (per Theorem 4), ignoring the remaining tail ( s j + 3 = 0 for lower bound),
s j = 1 2 3 = 0.875 , 1 + s j = 1.875 , log 2 ( 1.875 ) 0.90689 , σ j 0.09311 .
Including a non-zero tail s j + 3 > 0 makes s j > 0.875 , σ j < 0.09311 (smaller).
Step 2: Small σ j requires high density in the remaining tail.
Small σ j implies large s j , since σ j = 1 log 2 ( 1 + s j ) is decreasing in s j . For σ j 0.09311 , s j = 2 1 0.09311 1 1.875 1 = 0.875 . To match exactly, the remaining tail after the run must contribute the precise value, but the positions limit the possible s j + m 1 2 1 = 0.5 ? Wait, no: the normalized s j + m is always 1 , but the contribution to s j is scaled by 2 m .
From the exact, s j = 2 m ( 1 + s j + m ) + ( 1 2 m ) 2 m = 1 2 m + 2 m s j + m .
To have large s j , since s j + m 1 ϵ for finite tails, but in limit s j + m < 1 .
But the bound is s j + m 1 , so s j 1 .
For the contradiction in step 3, we use the recurrence chaining.
Step 3: Assuming next gap δ j + 3 = 1 leads to σ > 1 , impossible.
Assume by contradiction δ j + 3 = 1 . Then the run becomes 4 1s, and there is a tail after j+4, s j + 4 0 .
If s j + 4 = 0 (minimal tail, no further 1s), then σ j + 4 =1.
Using the exact backward recurrence (Corollary 3),
σ j + 3 = log 2 2 1 σ j + 4 1 = log 2 ( 1 1 ) = log 2 ( 0 ) = + > 1 ,
which is impossible since σ 1 .
If s j + 4 > 0 , then σ j + 4 < 1 , but as s j + 4 0 + , σ j + 4 1 , and σ j + 3 + > 1 .
In particular, the condition for σ j + 3 1 is
log 2 ( 2 1 σ j + 4 1 ) 1 log 2 ( 2 1 σ j + 4 1 ) 1 2 1 σ j + 4 1 2 1 = 0.5 .
2 1 σ j + 4 1.5 1 σ j + 4 log 2 ( 1.5 ) 0.58496 σ j + 4 1 0.58496 0.41504 .
Thus, σ j + 3 1 only if σ j + 4 0.415 .
However, to check consistency, we chain the backward recurrence through the previous gaps, assuming the 3 1s, and see if the resulting σ j is consistent, but since the assumption leads to large σ j + 3 , we can show the chain would require earlier σ > 1 .
Alternatively, using the approximation for illustration, the backward step is approximately σ j + 1 2 σ j + ln 2 · σ j 2 + ( ln 2 ) 2 σ j 3 .
Starting from the small σ j 0.09311 , chaining 3 steps gives σ j + 3 0.98 , and one more step gives 3.075 > 1 (higher terms amplify). With finer calculation, this exceeds 1, confirming the contradiction (the text’s 3.32 may include more terms or slight variation in values).
Since the exact chain with minimal tail leads to infinity, and any positive but small tail keeps it large, the assumption δ j + 3 = 1 is impossible.
Step 4: Next gap must be large ( δ j + 3 2 ), introducing 0s.
Since δ j + 3 = 1 leads to contradiction, δ j + 3 2 . This introduces at least one 0, diluting the tail contribution by 2 δ j + 3 0.25 . The recurrence for δ > 1 (Theorem 1) resets σ j + 3 to a higher value close to 1 for large δ , allowing the tail to continue without density overload. The explicit bound is given by Proposition 2 with r = 3 (after 3 unit gaps), s 0.875 , yielding G ( 3 , 0.875 ) 2 (compute explicitly if needed).
Step 5: Global prevention of divergence in Collatz sequences.
In Collatz dynamics, the 3 n + 1 operation adds 1s (potential growth), but the mechanism ensures compensating 0s, allowing multiple divisions by 2 (descent). For high σ j > 0.55 , Proposition 3 guarantees > 50 % zeros in the tail. Combined with equidistribution of { n log 2 3 } (Weyl’s theorem), this balances to 1 / 2 density asymptotically, preventing unbounded growth and supporting descent (Theorems 7, 8).
Thus, the mechanism is rigorously established. □ □

5. Deterministic consequences from a single large σ j

Let p 0 > p 1 > > p h 1 0 denote the positions of ones, δ i : = p i p i + 1 N , and
s j : = k = 1 h j 1 2 Δ k ( j ) , Δ k ( j ) : = i = 0 k 1 δ j + i , σ j : = 1 log 2 ( 1 + s j ) ( 0 , 1 ) .
Lemma 1 
(Exact tail recursion). For every δ j 1 ,
s j = 2 δ j ( 1 + s j + 1 ) , σ j = 1 log 2 1 + 2 1 δ j σ j + 1 .
Theorem 6 
(Kraft-type counting bound). For each m N , the number N j ( m ) of ones with tail distance Δ k ( j ) m satisfies
N j ( m ) s j 2 m .
Corollary 2 
(First-gap threshold from σ j ). With s j = 2 1 σ j 1 one has
δ j log 2 ( 1 / s j ) .
In particular, if σ j > 0.55 (equivalently s j < 2 0.45 1 0.3665 ), then δ j 2 . If σ j 1 log 2 ( 1 + 2 q ) for some q N , then δ j q ; if the inequality is strict, typically δ j q + 1 .
Proposition 1 
(Two-step forcing when σ j > 0.55 ). Let σ j > 0.55 so s j < 0.3666 . Then necessarily
δ j 2 and δ j + 1 2 .
Proof. The first claim is Cor. 2. For the second, from the cumulative form of (27),
1 + s j + 2 = 2 δ j + 1 ( 1 + s j + 1 ) 1 2 δ j + 1 ( 2 δ j s j + 1 ) 1 2 δ j + 1 ( 4 s j + 1 ) 1 .
Since s j + 2 0 , we need 2 δ j + 1 ( 4 s j + 1 ) 1 , and with s j < 0.3666 this forces δ j + 1 2 . □
Proposition 2 
(“Small ⇒ big” with explicit budget). Fix s ( 0 , 1 ) and suppose s j s . Let r 0 be minimal with δ j + r = 1 (if it exists). Then the next gap δ j + r + 1 must satisfy
δ j + r + 1 G ( r , s ) : = log 2 2 2 r + 1 s 1 4 r 3 1 + ,
where · + : = max { 0 , · } . In particular, for s = 2 0.45 1 0.3666 (i.e. σ j > 0.55 ), one has:
r = 0 : forbidden ( contradiction ) , r = 1 : forbidden , r 2 : G ( r , s ) 1 ,
so a unit gap cannot appear among the first two tail gaps and, when it appears later, it is compensated by a (at least) unit next gap.
Remark 1 
(Fractional-part geometry behind compensation). Write ε j : = 1 σ j ( 0 , 1 ) ; then from (27)
ε j = log 2 1 + 2 ε j + 1 δ j , ε j : = ε j log 2 ( 1 + τ ) , τ = 2 1 δ j .
For δ j = 1 one has ε j log 2 ( 1.5 ) 0.585 ; hence asmall ε j (large σ j ) is incompatible with δ j = 1 . When nevertheless a unit gap occurs later, the term 2 ε j + 1 1 is minute only if ε j + 1 is very small, which by Cor. 2 forces alargenext gap. This is the precise “small produces big” mechanism.
Proposition 3 
(A clean sufficient condition for > 50 % zeros). Assume δ i 2 for all i j , and denote by h tail ( j ) the number of ones strictly below p j (i.e. from p j + 1 downwards). Then
l j : = p j + 1 2 h tail ( j ) + δ j 1 ,
z tail ( j ) l j = l j h tail ( j ) l j h tail ( j ) + δ j 1 2 h tail ( j ) + δ j 1 > 1 2 .
In particular, this holds if σ j 1 log 2 ( 1 + 2 q ) with q 3 and the tail gaps are all 2 (the first inequality gives δ j q ).
Numerical thresholds (for reference).
q 1 log 2 ( 1 + 2 q ) guaranteed δ j
2 0.73697 δ j 2
3 0.85355 δ j 3
4 0.91251 δ j 4 ( typically 5 if strict )
Theorem 7. 
Let a n = i = 0 n γ i 2 i , n > 1000 , γ i { 0 , 1 } , then there exists j < 10 such that a 4 n j < a n .
Proof. 
Introduce operators:
  • P f = f 2 ,
  • T f = 3 f + 1 ,
with T i { P , T } . Consider all possible scenarios of the Collatz sequence behavior:
a n + n = T 1 T 2 T n a n .
We need to estimate each 2 n -th term of the Collatz sequence based on the number of applied operators P and T over n steps. Let a n have m ones in its binary representation. Then:
m = R i = T i n 1 ,
and the number of applications of P is:
R i = P i n 1 = n ,
since each application of T is accompanied by P, and the number of P applications corresponds to the number of zeros in a n , which is n m . According to Collatz rules, after n steps:
a n + n = 3 m 2 n a n + B n ,
B n 2 n + m j = 1 m 3 j 2 j a n < 2 n + m · 3 m 2 m · a n 2 2 n + 1 · 3 m · a n .
Thus, the growth of each sequence term depends on the number of ones in the binary representation. Next, we show that a large number of ones at the 2 n -th step increases the number of zeros at the 3 n -th step, per Theorem 2, leading to a decrease in subsequent terms:
a 2 n = 3 m a n · 2 n + B n = 3 m + 3 m ( a n 2 n ) + B n .
Repeating the reasoning of Theorem 2, consider the equation:
2 x = a 2 n = 3 m + 3 m ( a n 2 n ) · 2 n + B n ,
x ln 2 = m ln 3 + ln 1 + ( a n 2 n ) · 2 n + B n · 3 m .
To apply Theorem 2, we need σ 1 > 1 2 ln 2 . To satisfy this, consider m j = m j , θ = ( a n 2 n ) · 2 n ,
{ x } = min j < 10 ( m j ) ln 3 ln 2 + ln ( 1 + θ ) ln 2 + F j 1 2 n ln 2 .
Consider p = ( m j ) ln 3 ln 2 = ( 2 k + l ) · 1.5849625007 , where ϵ = 1.5849625007 1.5 ,
p = ( 2 k + l ) 1.5 + ϵ + ln ( 1 + θ ) ln 2 = 3 k + ( 2 k + l ) · ϵ + ln ( 1 + θ ) ln 2 .
Using ϵ < 0.1 , we satisfy σ 1 = 1 { x } > 0.55 . Let m be the number of non-zero γ i . By Theorem 2,
m n 2 + ( n j ) · ln 3 ln 2 · 1 2 .
After 3 n j steps of applying Collatz rules,
a 4 n j q 1 · a n , q 1 < 1 .
Using n > 1000 , it follows that q 1 < 1 a 4 n j < a n . □ □
Theorem 8. 
Let a n = i = 0 n γ i 2 i , n > 1000 , γ i { 0 , 1 } , then the Collatz sequence starting at a n decreases below a n within 4 n steps and continues to exhibit bounded descent, providing strong heuristic and partial evidence for convergence to 1.
Proof. 
The proof follows from Theorems 1, 2, and 7. This is a partial proof and assumes the absence of cycles or divergent sequences. The equidistribution ensures the density bounds, and the sequence decrease shows convergence to 1 for large n. □ □

6. Conclusions

Our analysis demonstrates that after 3 n j steps, the sequence with initial binary length n reaches a number strictly smaller than the initial one, supporting the conjecture for large n by demonstrating a consistent decrease in sequence values. While our results suggest convergence for large n, further analysis is needed to address potential cycles or divergent cases. By applying this process n times, we are led to 1.

7. Appendix: Linear System Details

7.1. Fractional-Part Recurrence

Let M N , ϵ 1 < 0.45 , and
M = i = 1 j 1 2 α i + 2 α j = i = 1 j 2 α i + 2 α j + 1 ,
where α i are strictly decreasing. The fractional parts evolve according to:
( i ) δ j = 1 : σ j = 1 2 σ j + 1 1 ln 2 4 σ j + 1 + F j σ j + 1 3 12 ,
( ii ) δ j > 1 : σ j = c 0 ( δ j ) + c 1 ( δ j ) σ j + 1 + 1 2 c 2 ( δ j ) σ j + 1 2 + R j ( ln 2 ) 2 σ j + 1 3 8 ,
where for τ = 2 1 δ j ( 0 , 1 2 ] :
c 0 ( δ ) = 1 ln ( 1 + τ ) ln 2 , c 1 ( δ ) = τ 1 + τ , c 2 ( δ ) = ln 2 · τ ( 1 + τ ) 2 .
Remark 2. 
The formula for δ j = 1 is the quadratic Taylor expansion of f ( σ ) = 1 log 2 ( 1 + 2 σ ) about σ j + 1 = 0 , with remainder F j satisfying | F j ( x ) | | x | . Similarly, the case δ j > 1 expands f δ ( σ ) = 1 log 2 ( 1 + 2 1 δ σ ) . The exact inverse for δ j = 1 is σ j + 1 = log 2 ( 2 1 σ j 1 ) , enabling precise backward propagation.
Theorem 9 
(Uniform Cubic Bound for F j ). Let f ( σ ) = 1 log 2 ( 1 + 2 σ ) for σ [ 0 , 1 ] . Its quadratic Taylor polynomial at σ = 0 is
T 2 ( σ ) = 1 2 σ ln 2 8 σ 2 ,
and the remainder satisfies
| f ( σ ) T 2 ( σ ) | σ 3 12 , for all σ [ 0 , 1 ] .
Thus, define F j σ j + 1 3 12 = f ( σ j + 1 ) T 2 ( σ j + 1 ) , so | F j ( x ) | | x | .
Proof. 
Set u ( σ ) = 2 σ = e σ ln 2 and define g ( σ ) = ln ( 1 + u ( σ ) ) , so f ( σ ) = 1 g ( σ ) ln 2 . The derivatives are:
g = ln 2 · u 1 + u , g = ( ln 2 ) 2 u ( 1 + u ) 2 , g = ( ln 2 ) 3 u ( 1 u ) ( 1 + u ) 3 ,
f = u 1 + u , f = ln 2 · u ( 1 + u ) 2 , f = ( ln 2 ) 2 u ( 1 u ) ( 1 + u ) 3 0 for u [ 0 , 1 ] .
At σ = 0 , u ( 0 ) = 1 , so f ( 0 ) = 0 , f ( 0 ) = 1 2 , f ( 0 ) = ln 2 4 , yielding (). By Taylor’s theorem with remainder:
f ( σ ) T 2 ( σ ) = f ( ξ ) 6 σ 3 , ξ ( 0 , σ ) .
The function ϕ ( u ) = u ( 1 u ) ( 1 + u ) 3 attains max ϕ ( 2 3 ) 0.09623 < 3 4 , so:
0 f ( ξ ) ( ln 2 ) 2 · 3 4 ,
and the bound follows as 1 6 ( ln 2 ) 2 · 3 4 = ( ln 2 ) 2 8 < 1 12 . □ □
Theorem 10 
(Uniform Cubic Bound for R j ). Let δ 2 and f δ ( σ ) = 1 log 2 ( 1 + 2 1 δ σ ) . Its quadratic Taylor expansion at σ = 0 is
T 2 ( δ , σ ) = c 0 ( δ ) + c 1 ( δ ) σ + 1 2 c 2 ( δ ) σ 2 ,
and the remainder satisfies
| f δ ( σ ) T 2 ( δ , σ ) | ( ln 2 ) 2 48 σ 3 ( ln 2 ) 2 8 σ 3 , σ [ 0 , 1 ] .
Thus, define R j ( ln 2 ) 2 σ j + 1 3 8 so | R j ( x ) | | x | .
Proof. 
Set u ( σ ) = τ 2 σ , τ = 2 1 δ ( 0 , 1 2 ] . Then:
f δ = u 1 + u ,
f δ = ln 2 · u ( 1 + u ) 2 ,
f δ = ( ln 2 ) 2 u ( 1 u ) ( 1 + u ) 3 .
At σ = 0 , u ( 0 ) = τ , yielding (45). Since u ( σ ) ( 0 , τ ] ( 0 , 1 2 ] , ϕ ( u ) = u ( 1 u ) ( 1 + u ) 3 1 8 on ( 0 , 1 2 ] . Thus:
0 f δ ( ξ ) ( ln 2 ) 2 8 , ξ ( 0 , σ ) .
By Taylor’s theorem:
| f δ ( σ ) T 2 ( δ , σ ) | 1 6 · ( ln 2 ) 2 8 σ 3 = ( ln 2 ) 2 48 σ 3 ( ln 2 ) 2 8 σ 3 ,
matching the normalization | R j ( x ) | | x | . □ □
Corollary 3 
(Exact Inverse for δ = 1 ). The inverse of f ( σ ) = 1 log 2 ( 1 + 2 σ ) is σ j + 1 = log 2 ( 2 1 σ j 1 ) , defined for σ j [ 0 , f ( 1 ) ] [ 0 , 0.415 ] .
Proof. 
From σ j = 1 log 2 ( 1 + 2 σ j + 1 ) , we have 2 1 σ j = 1 + 2 σ j + 1 , so 2 1 σ j 1 = 2 σ j + 1 , and σ j + 1 = log 2 ( 2 1 σ j 1 ) . □ □

References

  1. O’Connor, J.J.; Robertson, E.F. Lothar Collatz. MacTutor History of Mathematics, University of St Andrews, 2006. Available online: https://mathshistory.st-andrews.ac.uk/Biographies/Collatz/.
  2. Tao, T. Almost all Collatz orbits attain almost bounded values. Forum Math. Pi 2022, 10, e12. [Google Scholar] [CrossRef]
  3. Lagarias, J.C. The 3x+1 Problem and Its Generalizations. Amer. Math. Monthly 2003, 110, 3–23. [Google Scholar] [CrossRef]
  4. Sequences of 1s in binary expression of powers of 3. MathOverflow, 2024, Question 479499. Available online: https://mathoverflow.net/questions/479499.
  5. Cook, J.D. Powers of 3 in binary. 2021. Available online: https://www.johndcook.com/blog/2021/04/28 /powers-of-3-in-binary/.
  6. Wolfram Research. Regularity versus Complexity in the Binary Representation of 3n. 2009. Available online: https://wpmedia.wolfram.com/sites/13/2018/02/18-3-6.pdf.
  7. Allouche, J.P.; Shallit, J. Automatic Sequences: Theory, Applications, Generalizations; Cambridge University Press: Cambridge, UK, 2003. [Google Scholar]
  8. Sinai, Y.G. Statistical properties of the 3x+1 problem. Adv. Soviet Math. 1993, 16, 1–22. [Google Scholar]
  9. Barina, D. Convergence verification of the Collatz problem. J. Supercomput. 2021, 77, 2681–2688. [Google Scholar] [CrossRef]
  10. Krasikov, I.; Lagarias, J.C. Bounds for the 3x + 1 problem using difference inequalities. Acta Arith. 2003, 109, 237–258. [Google Scholar] [CrossRef]
  11. Everest, G.; van der Poorten, A.; Shparlinski, I.; Ward, T. Recurrence Sequences; American Mathematical Society: Providence, RI, USA, 2007. [Google Scholar]
  12. Hempel, H. On the asymptotic distribution of the fractional parts of . J. Number Theory 2000, 80, 258–269. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Prerpints.org logo

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

Subscribe

Disclaimer

Terms of Use

Privacy Policy

Privacy Settings

© 2026 MDPI (Basel, Switzerland) unless otherwise stated