Collatz Conjecture: Binary Structure Analysis and Trajectory Behavior

A. A. Durmagambetov; A. A. Durmagambetova

doi:10.20944/preprints202401.0227.v26

Submitted: 08 November 2025

Posted: 12 November 2025


Abstract

This paper studies the Collatz conjecture by analyzing binary representations of natural numbers through fractional parts. We introduce a direct non-recursive relation for intermediate mantissas $\sigma_j$ in binary decompositions and prove their equidistribution using Weyl's theorem. The self-correcting dynamics of $\sigma_j$ ensure a balance between 1s and 0s, leading to an asymptotic density of 1/2 for 1s in binary expansions of $3^n$. This yields a probabilistic estimate: in approximately half of all cases, the binary expansions have many leading zeros, ensuring rapid descent. Theorems estimate zero density in powers of three and demonstrate sequence decrease for large $n$. Numerical verifications and updated figures support the findings, providing strong evidence for convergence in large cases.

Keywords: Collatz conjecture; binary representations; fractional parts

Subject: Computer Science and Mathematics - Algebra and Number Theory

1. Literature Review

The Collatz conjecture, also known as the $3x + 1$ problem, is one of the most famous unsolved problems in mathematics. It posits that for any positive integer $N$, repeated application of the function (division by 2 if the number is even; replacement by $3N + 1$ if odd) eventually leads to 1. This review summarizes key contributions from the listed references, focusing on historical context, theoretical advancements, computational verifications, statistical properties, and connections to binary representations and sequences. These sources provide a foundation for understanding the complexity of the conjecture, partial results, and related mathematical structures, as explored in the article on binary decomposition and uniformity of distribution.

1.1. Historical and Biographical Context

Biography of Lothar Collatz [1]: Lothar Collatz (1910–1990) was a German mathematician known for contributions to numerical analysis. He proposed the conjecture in 1937 while working on graph theory. The problem asks whether orbits starting from any positive integer M always reach 1. Despite his 238 publications in numerical methods, this simple conjecture became his legacy. The conjecture has been verified up to $2^{71}$ ( $\approx 2.36 \times 10^{21}$ ) [10], but remains open, highlighting its deceptive simplicity.
Lagarias’ Survey [3]: This survey details generalizations of the conjecture, such as replacing 3 with other odd integers or extending to negative and zero values. It discusses equivalent formulations (e.g., the Syracuse mapping on odd integers) and open questions, such as the existence of cycles beyond the known cycle (1,4,2,1). Lagarias emphasizes computational verifications and partial proofs, setting the foundation for modern results.

1.2. Recent Theoretical Advancements

Tao’s Article [2]: Terence Tao proves that for any function $f$ with $f(N) \to \infty$ as $N \to \infty$, almost all orbits (in the sense of logarithmic density) attain a minimum value below $f(N)$. In this sense orbits are "almost bounded" for almost all $N$, strengthening previous bounds. Using probabilistic models of Syracuse iterations and their 3-adic distribution, Tao shows superpolynomial decay of the relevant characteristic functions, implying that typical orbits fall below $\operatorname{polylog} N$.
Weyl’s Theorem [13]: Describes the asymptotic distribution of the fractional parts $\{n\alpha\}$ for irrational $\alpha$ (e.g., $\alpha = \log_2 3$). Uniform distribution modulo 1 is key to the "quasirandomness" of binary digits in $3^n = 2^{n \log_2 3}$.
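As a rough numerical illustration of the uniformity statement, the following sketch bins the fractional parts $\{n \log_2 3\}$ for $n = 1, \dots, N$ into equal subintervals of $[0, 1)$. The function name, the choice of ten bins, and the use of floating-point arithmetic are assumptions made for this example only; it illustrates equidistribution and does not prove it.

```python
import math


def fractional_part_histogram(count: int, bins: int = 10) -> list[float]:
    """Share of the values {n * log2(3)}, n = 1..count, in each of `bins` equal subintervals of [0, 1)."""
    alpha = math.log2(3)
    hits = [0] * bins
    for n in range(1, count + 1):
        frac = (n * alpha) % 1.0
        # Guard against frac * bins rounding up to `bins` in floating point.
        hits[min(int(frac * bins), bins - 1)] += 1
    return [h / count for h in hits]


if __name__ == "__main__":
    for i, share in enumerate(fractional_part_histogram(10_000)):
        print(f"[{i / 10:.1f}, {(i + 1) / 10:.1f}): {share:.4f}")
```

Under uniform distribution each printed share should be close to 0.1; double precision is adequate for the moderate values of $n$ used here.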

1.3. Binary Representations of Powers of 3

MathOverflow Question [4]: On the longest sequence of 1s ($L_n$) in binary $3^n$. Simulations up to $n = 10000$ show $L_n < 3.5 \log n$ (observed maximum $\sim 24$), suggesting logarithmic bounds; models like coin tossing give $\max L_n \sim 2 \log_2 N$.
Cook’s Blog [5]: Visualization of binary $3^n$ as a grid (rows for $n = 0, \dots, 59$) with a boundary of slope $\log_2 3$; local structures and "semi-chaos" are observed.
Wolfram Research [6]: Regularities in subsequences $3^{2^n}$, 2-adic convergence to 1; discussion of the p-adic perspective and general patterns.

1.3.1. Examples of Binary Decompositions

The following table gives the binary decompositions of $3^n$ for $n = 1, \dots, 10$:

Table 1. Binary representations of 3ⁿ for n = 1..10

n	Binary Representation
1	11
2	1001
3	11011
4	1010001
5	11110011
6	1011011001
7	100010001011
8	1100110100001
9	100110011100011
10	1110011010101001
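The rows of Table 1 can be regenerated with a few lines of Python, which has exact arbitrary-precision integers. This is a minimal sketch; the function name is chosen for this example.

```python
def powers_of_three_binary(max_n: int) -> list[tuple[int, str]]:
    """Return (n, binary string of 3**n) for n = 1..max_n."""
    return [(n, format(3 ** n, "b")) for n in range(1, max_n + 1)]


if __name__ == "__main__":
    for n, bits in powers_of_three_binary(10):
        print(f"{n}\t{bits}")
```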

1.4. Statistical and Probabilistic Properties

Sinai [8]: Ergodic properties of the Syracuse mapping and statistical regularity of long orbits.

1.5. Computational Verifications and Bounds

Barina (2021) [9]: Verification up to $2.95 \times 10^{20}$ , using GPU and algorithmic optimizations.
Barina (2025) [10]: Verification up to $2^{71}$ ( $\approx 2.36 \times 10^{21}$ ).
Krasikov–Lagarias [11]: Bounds through difference inequalities.

1.6. Related Mathematical Theory

Allouche–Shallit [7]: Automatic sequences and numeration systems.
Everest et al. [12]: Recurrent sequences, p-adic limits, connections to orbits.

1.7. Synthesis and Relevance to the Article

The references demonstrate the persistent difficulty of the problem, strong heuristics, and computational bounds [2,8,9,10]. Binary representations of $3^n$ [4,5,6] align with the uniformity of fractional parts [13].

2. Introduction

We study the density of zeros in binary representations of natural numbers using fractional parts, encoding the binary structure. A framework is developed linking binary gaps, mantissas, and Collatz dynamics. In particular, the task of counting zeros in binary $3^n$ remains open, see [4,5,6].

3. Materials and Methods

Zeros dominate in the Collatz descent: each zero allows a division by 2, outpacing the growth of $3n + 1$. For $n = 2^k$, the sequence reduces to 1 in $k$ steps. We decompose $M$ into powers of two and track the fractional parts $\sigma_j$ at each stage to quantify the density of zeros.
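The primitive steps referred to here can be sketched as follows. The function name is illustrative; the loop simply applies $n/2$ for even $n$ and $3n + 1$ for odd $n$ until 1 is reached, and it assumes the trajectory terminates for the inputs tried.

```python
def collatz_steps(n: int) -> int:
    """Number of primitive steps (n/2 for even n, 3n+1 for odd n) until n reaches 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


if __name__ == "__main__":
    for k in (1, 5, 10):
        print(f"2^{k}: {collatz_steps(2 ** k)} steps")
```

For a power of two every step is a division, so `collatz_steps(2 ** k)` returns `k`.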

3.1. Self-Correcting Dynamics of Mantissas $σ_{j}$

Mantissas $\sigma_j$ induce a self-correcting mechanism, balancing 1s and 0s:

Long series of 1s increase the tail $s \approx 1 - 2^{-k}$, giving small $\sigma_j \to 0^+$; by the recurrence formulas (Theorem 1), this requires $\delta_j \geq 2$, adding a 0.
Series of 0s decrease $s \to 0^+$, giving $\sigma_j \to 1^-$ and forcing $\delta_j = 1$, i.e., a one.

Locally, this gives a balance of blocks; globally, together with the uniformity of $\{n \log_2 3\}$, an asymptotic density of $1/2$ for ones in binary $3^n$ is obtained.
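The density of ones in binary $3^n$ is easy to tabulate exactly. The sketch below is only an empirical aid under the stated definition (number of 1-bits divided by the bit length); the function name is an assumption of this example, and the printed values illustrate the claim rather than establish it.

```python
from fractions import Fraction


def ones_density(n: int) -> Fraction:
    """Exact fraction of 1-bits in the binary representation of 3**n."""
    bits = format(3 ** n, "b")
    return Fraction(bits.count("1"), len(bits))


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, float(ones_density(n)))
```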

4. Results

The Collatz conjecture [1] remains open [3], see also [2].

Theorem 1.

Let $M = 3^n$, $\delta_j = \lfloor \alpha_j \rfloor - \lfloor \alpha_{j+1} \rfloor > 0$, $\alpha_j = \lfloor \alpha_j \rfloor + \epsilon_j$, and $\sigma_j = 1 - \epsilon_j$. Then

$$M = \sum_{i=1}^{j-1} 2^{\lfloor \alpha_i \rfloor} + 2^{\alpha_j} = \sum_{i=1}^{j} 2^{\lfloor \alpha_i \rfloor} + 2^{\alpha_{j+1}}. \tag{1}$$

For $\delta_j = 1$,

$$\sigma_j = \frac{1}{2} \sigma_{j+1} \left(1 - \frac{\ln 2}{4} \sigma_{j+1}\right) + F_j\left(\frac{\sigma_{j+1}^3}{12}\right), \tag{2}$$

where $|F_j(x)| \leq |x|$ (see Theorem 9). For $\delta_j > 1$,

$$\sigma_j = c_0(\delta_j) + c_1(\delta_j)\, \sigma_{j+1} + \frac{1}{2} c_2(\delta_j)\, \sigma_{j+1}^2 + R_j\left(\frac{(\ln 2)^2 \sigma_{j+1}^3}{8}\right), \tag{3}$$

where the coefficients $c_0, c_1, c_2$ are given in (14) and $|R_j(x)| \leq |x|$ (see Theorem 10).

Proof.

We start with the basic equality, which follows from the binary decomposition of the number $M = 3^n = 2^{\lfloor \alpha_1 \rfloor} + 2^{\lfloor \alpha_2 \rfloor} + \dots + 2^{\lfloor \alpha_h \rfloor}$, where $\lfloor \alpha_1 \rfloor > \lfloor \alpha_2 \rfloor > \dots > \lfloor \alpha_h \rfloor$ are the positions of ones:

$$M = \sum_{i=1}^{j-1} 2^{\lfloor \alpha_i \rfloor} + 2^{\alpha_j} = \sum_{i=1}^{j} 2^{\lfloor \alpha_i \rfloor} + 2^{\alpha_{j+1}},$$

where $\alpha_j = \lfloor \alpha_j \rfloor + \epsilon_j$, $\epsilon_j \in (0, 1)$. From this equality, we obtain

$$2^{\alpha_j} = 2^{\lfloor \alpha_j \rfloor} + 2^{\alpha_{j+1}},$$

whence, dividing by $2^{\lfloor \alpha_j \rfloor}$,

$$2^{\epsilon_j} = 1 + 2^{\alpha_{j+1} - \lfloor \alpha_j \rfloor} = 1 + 2^{-\delta_j + \epsilon_{j+1}}.$$

Taking the logarithm base 2:

$$\epsilon_j = \log_2\left(1 + 2^{-\delta_j + \epsilon_{j+1}}\right).$$

Substituting $\sigma_j = 1 - \epsilon_j$ and $\sigma_{j+1} = 1 - \epsilon_{j+1}$:

$$1 - \sigma_j = \log_2\left(1 + 2^{1 - \delta_j - \sigma_{j+1}}\right), \qquad \sigma_j = 1 - \log_2\left(1 + 2^{1 - \delta_j - \sigma_{j+1}}\right).$$

For $\delta_j = 1$:

$$\sigma_j = 1 - \log_2\left(1 + 2^{-\sigma_{j+1}}\right).$$

Expanding the function $f(\sigma) = 1 - \log_2(1 + 2^{-\sigma})$ in a Taylor series around $\sigma = 0$:

$$f(\sigma) = \frac{1}{2} \sigma - \frac{\ln 2}{8} \sigma^2 + F_j\left(\frac{\sigma^3}{12}\right),$$

where $|F_j(x)| \leq |x|$ by Theorem 9. For $\delta_j > 1$, we similarly expand $f_\delta(\sigma) = 1 - \log_2(1 + 2^{1 - \delta - \sigma})$, obtaining the coefficients $c_0, c_1, c_2$ from the appendix and the remainder $R_j$ by Theorem 10. □

Corollary 1.

Let $\sigma_j \in (0, 1)$. Then:

1. If $\sigma_j < 0.415$, then the next bit (after the current gap) is 1 (i.e., $\delta_j = 1$).
2. If $\sigma_j > 0.415$, then the next bit is 0 (i.e., $\delta_j > 1$).
3. If $\sigma_j \to 1^-$, then all subsequent bits (except the leading one) tend to zero.

Proof.

1. From the exact inversion of the recurrence relation for $\delta_j = 1$ (Corollary 8), $\sigma_j \leq f(1) \approx 0.415$ is necessary for $\sigma_{j+1}$ to lie in $(0, 1)$. This corresponds to the next bit being 1.
2. For $\sigma_j > 0.415$, the recurrence relation for $\delta_j = 1$ has no solution in $(0, 1)$, so $\delta_j > 1$, meaning a zero bit.
3. As $\sigma_j \to 1^-$, the tail $s = 2^{1 - \sigma_j} - 1 \to 0^+$, implying no significant terms in the tail sum, i.e., all subsequent bits (except the leading one) are zeros.

□
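The threshold rule can be checked on concrete numbers. The sketch below computes, for each one in the binary expansion of $M$ except the last, the gap $\delta_j$, the exact tail $s_j$, and $\sigma_j = 1 - \log_2(1 + s_j)$; it then compares $\delta_j = 1$ with the equivalent exact condition $s_j \geq 1/2$ (that is, $\sigma_j \leq f(1)$). The function names and the zero-based indexing of ones from the most significant bit are assumptions of this example.

```python
import math
from fractions import Fraction

THRESHOLD = 1 - math.log2(1.5)  # f(1), approximately 0.415


def one_positions(m: int) -> list[int]:
    """Positions of the 1-bits of m, from most to least significant."""
    return [p for p in range(m.bit_length() - 1, -1, -1) if (m >> p) & 1]


def gaps_and_sigmas(m: int) -> list[tuple[int, Fraction, float]]:
    """For each one except the last, return (delta_j, s_j, sigma_j)."""
    pos = one_positions(m)
    rows = []
    for j in range(len(pos) - 1):
        # Tail below the j-th one, normalized so that the j-th one has weight 1.
        s = sum(Fraction(1, 2 ** (pos[j] - p)) for p in pos[j + 1:])
        rows.append((pos[j] - pos[j + 1], s, 1 - math.log2(1 + float(s))))
    return rows


if __name__ == "__main__":
    for delta, s, sigma in gaps_and_sigmas(3 ** 7):
        print(delta, s, round(sigma, 4), "<= f(1)" if s >= Fraction(1, 2) else "> f(1)")
```

The comparison uses the exact rational $s_j$ rather than the floating-point $\sigma_j$ so that the boundary case $s_j = 1/2$ is not affected by rounding.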

4.1. Direct Non-Recursive Relation for $σ_{j}$

Definition 1 (Binary Structure and Intervals). Let $h$ be the number of ones in the binary representation of $M$ (Hamming weight). Let the positions of ones be $p_0 > p_1 > \dots > p_{h-1}$, and let the gaps be $\delta_i = p_i - p_{i+1}$ for $i = 0, \dots, h - 2$.

Theorem 2 (Direct Non-Recursive Relation). Let $\sigma_0$ satisfy

$$2^{1 - \sigma_0} = \sum_{k=0}^{h-1} 2^{-\sum_{i=0}^{k-1} \delta_i}.$$

Then for $0 \leq j < h$,

$$\sigma_j = 1 - \Delta_j - \log_2\left(2^{1 - \sigma_0} - S_j\right), \qquad S_j = \sum_{k=0}^{j-1} 2^{-\sum_{i=0}^{k-1} \delta_i}, \qquad \Delta_j = \sum_{i=0}^{j-1} \delta_i.$$

Proof.

From the definition of $\sigma_0$, we have the full tail sum from $j = 0$:

$$2^{1 - \sigma_0} = \sum_{k=0}^{h-1} 2^{-\sum_{i=0}^{k-1} \delta_i} = S_j + \sum_{k=j}^{h-1} 2^{-\sum_{i=0}^{k-1} \delta_i}.$$

The tail from $j$ is normalized relative to position $p_j$:

$$\sum_{k=j}^{h-1} 2^{-\sum_{i=0}^{k-1} \delta_i} = 2^{-\Delta_j} \cdot 2^{1 - \sigma_j},$$

where $2^{1 - \sigma_j}$ is the normalized tail from $j$. Hence

$$2^{1 - \sigma_0} = S_j + 2^{-\Delta_j} \cdot 2^{1 - \sigma_j}.$$

Isolating the tail:

$$2^{1 - \sigma_j} = 2^{\Delta_j} \left(2^{1 - \sigma_0} - S_j\right).$$

Taking $\log_2$:

$$1 - \sigma_j = \Delta_j + \log_2\left(2^{1 - \sigma_0} - S_j\right), \qquad \sigma_j = 1 - \Delta_j - \log_2\left(2^{1 - \sigma_0} - S_j\right).$$

□
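The normalization identity $2^{1 - \sigma_j} = 2^{\Delta_j} (2^{1 - \sigma_0} - S_j)$ used in the proof can be verified in exact rational arithmetic. In this sketch the normalized tail $2^{1 - \sigma_j}$ is computed directly from the bit positions, with $\Delta_j = p_0 - p_j$ and $S_j = \sum_{k<j} 2^{p_k - p_0}$; the helper names are assumptions of this example.

```python
from fractions import Fraction


def one_positions(m: int) -> list[int]:
    """Positions of the 1-bits of m, from most to least significant."""
    return [p for p in range(m.bit_length() - 1, -1, -1) if (m >> p) & 1]


def normalized_tail(m: int, j: int) -> Fraction:
    """2**(1 - sigma_j): the ones from index j onward, scaled so the j-th one counts as 1."""
    pos = one_positions(m)
    return sum((Fraction(2 ** p, 2 ** pos[j]) for p in pos[j:]), Fraction(0))


def check_direct_relation(m: int) -> bool:
    """Check 2**(1 - sigma_j) == 2**Delta_j * (2**(1 - sigma_0) - S_j) for every j."""
    pos = one_positions(m)
    full = normalized_tail(m, 0)
    for j in range(len(pos)):
        prefix = sum((Fraction(2 ** p, 2 ** pos[0]) for p in pos[:j]), Fraction(0))  # S_j
        shift = 2 ** (pos[0] - pos[j])  # 2**Delta_j
        if normalized_tail(m, j) != shift * (full - prefix):
            return False
    return True


if __name__ == "__main__":
    print(all(check_direct_relation(3 ** n) for n in range(1, 30)))
```

A floating-point value of $\sigma_j$ can then be recovered as `1 - math.log2(normalized_tail(m, j))`.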

Corollary 2 (Case $M = 3^n$). $S_j = \sum_{k=0}^{j-1} 2^{p_k - \lfloor n \log_2 3 \rfloor}$, where $p_k$ are the positions of ones.

Proof.

For $M = 3^n = 2^{n \log_2 3}$, the leading position is $p_0 = \lfloor n \log_2 3 \rfloor$, and the normalized sum is

$$2^{1 - \sigma_0} = \sum_{k=0}^{h-1} 2^{p_k - p_0} = \sum_{k=0}^{h-1} 2^{p_k - \lfloor n \log_2 3 \rfloor}.$$

The prefix is therefore $S_j = \sum_{k=0}^{j-1} 2^{p_k - \lfloor n \log_2 3 \rfloor}$. □

Theorem 3 (Normalized Tail Form).

$$2^{1 - \sigma_j} = 1 + \sum_{k=1}^{h-j-1} 2^{-\sum_{i=0}^{k-1} \delta_{j+i}}, \qquad \sigma_j = 1 - \log_2\left(1 + \sum_{k=1}^{h-j-1} 2^{-\sum_{i=0}^{k-1} \delta_{j+i}}\right).$$

Proof.

The tail from position $j$ starts with the current one (contribution 1) plus the sum from subsequent ones, normalized by gaps relative to $p_j$:

$$2^{1 - \sigma_j} = 1 + \sum_{k=1}^{h-j-1} 2^{-\Delta_k^{(j)}},$$

where $\Delta_k^{(j)} = \sum_{i=0}^{k-1} \delta_{j+i}$. Taking $\log_2$:

$$\sigma_j = 1 - \log_2\left(1 + \sum_{k=1}^{h-j-1} 2^{-\Delta_k^{(j)}}\right).$$

□

Corollary 3.

Setting $s = \sum_{k=1}^{h-j-1} 2^{-\sum_{i=0}^{k-1} \delta_{j+i}}$, we have $2^{1 - \sigma_j} = 1 + s$ and $\sigma_j = 1 - \log_2(1 + s)$.

Proof.

This follows directly from Theorem 3, where $s$ is the tail sum without the leading 1. □

Lemma 1 (General Tail Decomposition). For fixed $j$ and any $m \geq 1$ with $j + m \leq h - 1$, denote

$$\Delta_k^{(j)} := \sum_{i=0}^{k-1} \delta_{j+i}, \qquad s_j := \sum_{k=1}^{h-j-1} 2^{-\Delta_k^{(j)}}.$$

Then

$$s_j = \sum_{k=1}^{m} 2^{-\Delta_k^{(j)}} + 2^{-\Delta_m^{(j)}} s_{j+m}.$$

Proof.

Split the tail sum:

$$s_j = \sum_{k=1}^{m} 2^{-\Delta_k^{(j)}} + \sum_{k=m+1}^{h-j-1} 2^{-\Delta_k^{(j)}}.$$

The second sum equals the tail from $j + m$, shifted by $\Delta_m^{(j)}$:

$$\sum_{k=m+1}^{h-j-1} 2^{-\Delta_k^{(j)}} = 2^{-\Delta_m^{(j)}} \sum_{l=1}^{h-j-m-1} 2^{-\Delta_l^{(j+m)}} = 2^{-\Delta_m^{(j)}} s_{j+m}.$$

□
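The splitting identity can be tested exactly on any gap sequence. The sketch below takes the list of gaps $\delta_0, \dots, \delta_{h-2}$, computes $s_j$ as a rational number, and compares both sides of the lemma for every admissible pair $(j, m)$, where $m$ runs up to the number of gaps remaining after index $j$. The function names are assumptions of this example.

```python
from fractions import Fraction


def gaps_of(m: int) -> list[int]:
    """Gaps delta_i between consecutive 1-bits of m, from the most significant bit down."""
    pos = [p for p in range(m.bit_length() - 1, -1, -1) if (m >> p) & 1]
    return [pos[i] - pos[i + 1] for i in range(len(pos) - 1)]


def tail_sum(deltas: list[int], j: int) -> Fraction:
    """s_j: sum over k >= 1 of 2**(-(delta_j + ... + delta_{j+k-1}))."""
    total, shift = Fraction(0), 0
    for d in deltas[j:]:
        shift += d
        total += Fraction(1, 2 ** shift)
    return total


def lemma1_holds(deltas: list[int], j: int, m: int) -> bool:
    """Compare s_j with (first m terms) + 2**(-Delta_m) * s_{j+m}."""
    head, shift = Fraction(0), 0
    for d in deltas[j:j + m]:
        shift += d
        head += Fraction(1, 2 ** shift)
    return tail_sum(deltas, j) == head + Fraction(1, 2 ** shift) * tail_sum(deltas, j + m)


if __name__ == "__main__":
    deltas = gaps_of(3 ** 20)
    print(all(lemma1_holds(deltas, j, m)
              for j in range(len(deltas))
              for m in range(1, len(deltas) - j + 1)))
```

With `deltas = [1, 1, 1, 2]`, `j = 0`, and `m = 3`, the same check covers the block-of-ones special case $s_j = 1 - 2^{-m} + 2^{-m} s_{j+m}$.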

Corollary 4 (Block of Ones). If $\delta_j = \dots = \delta_{j+m-1} = 1$, then

$$s_j = \sum_{i=1}^{m} 2^{-i} + 2^{-m} s_{j+m} = 1 - 2^{-m} + 2^{-m} s_{j+m}.$$

Proof.

With $\delta_{j+i} = 1$ for $i = 0, \dots, m - 1$, we have $\Delta_k^{(j)} = k$ for $k = 1, \dots, m$. Then

$$\sum_{k=1}^{m} 2^{-\Delta_k^{(j)}} = \sum_{k=1}^{m} 2^{-k} = 1 - 2^{-m}.$$

By Lemma 1, the tail $2^{-m} s_{j+m}$ completes the expression. □

Corollary 5 (Small Tail). If $s \ll 1$, then $\sigma_j \approx 1 - \frac{s}{\ln 2} + \frac{s^2}{2 \ln 2}$.

Proof.

Write $\log_2(1 + s) = \frac{\ln(1 + s)}{\ln 2}$. The Taylor series $\ln(1 + s) = s - s^2/2 + s^3/3 - \dots$ gives

$$\log_2(1 + s) = \frac{s}{\ln 2} - \frac{s^2}{2 \ln 2} + O(s^3).$$

Hence

$$\sigma_j = 1 - \log_2(1 + s) \approx 1 - \frac{s}{\ln 2} + \frac{s^2}{2 \ln 2}.$$

□
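A quick numerical comparison shows how well the expansion derived in the proof from the Taylor series of $\ln(1 + s)$, namely $1 - s/\ln 2 + s^2/(2 \ln 2)$, tracks the exact value $1 - \log_2(1 + s)$. The function names are assumptions of this example; the omitted terms start at order $s^3$.

```python
import math


def sigma_exact(s: float) -> float:
    """sigma = 1 - log2(1 + s)."""
    return 1 - math.log2(1 + s)


def sigma_quadratic(s: float) -> float:
    """Second-order expansion: 1 - s/ln 2 + s^2 / (2 ln 2)."""
    return 1 - s / math.log(2) + s * s / (2 * math.log(2))


if __name__ == "__main__":
    for s in (0.001, 0.01, 0.1, 0.3):
        print(f"s={s}: exact={sigma_exact(s):.8f} quadratic={sigma_quadratic(s):.8f}")
```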

Example 1 (Case $j = 0$).

$$\sigma_0 = 1 - \log_2\left(1 + \sum_{k=1}^{h-1} 2^{-\sum_{i=0}^{k-1} \delta_i}\right), \qquad s = \sum_{k=1}^{h-1} 2^{-\sum_{i=0}^{k-1} \delta_i} = 2^{1 - \sigma_0} - 1 \in (0, 1).$$

4.2. Theorem on Maximum Number of 1s

Theorem 4 (Instability of Long Sequences of Ones). The approximations

$$\sigma_j \approx 1 - \frac{s}{\ln 2} + \frac{s^2}{2 \ln 2} \ (\text{small } s), \qquad \sigma_j \approx \frac{1}{2} \sigma_{j+1} \ (\delta_j = 1)$$

cannot hold simultaneously on a long block of consecutive ones. Such a block gives exponential growth of $\sigma_j$ backward, conflicting with the linear decay of small $s$, forcing interruption of blocks of 1s by zeros.

Proof.

Assume a block $\delta_i = 1$ for $i = j, \dots, j + m - 1$ with large $m$. From the recurrence relation for $\delta = 1$:

$$\sigma_j \approx \frac{1}{2} \sigma_{j+1} \left(1 - \frac{\ln 2}{4} \sigma_{j+1}\right) \approx \frac{1}{2} \sigma_{j+1},$$

ignoring higher terms. Iteratively:

$$\sigma_j \approx 2^{-m} \sigma_{j+m}.$$

Conversely:

$$\sigma_{j+m} \approx 2^{m} \sigma_j.$$

For a small tail $s_{j+m} \ll 1$ after the block:

$$\sigma_{j+m} \approx 1 - \frac{s_{j+m}}{\ln 2} + \frac{s_{j+m}^2}{2 \ln 2} \approx \frac{s_{j+m}}{\ln 2}.$$

From Corollary 4 for the block:

$$s_j \approx 1 - 2^{-m} + 2^{-m} s_{j+m}.$$

For large $m$, $s_j \approx 1$, so $\sigma_j \approx 1 - \log_2 2 = 0$, but more precisely $\sigma_j \approx \frac{2^{-m}}{\ln 2}$. Then

$$\sigma_{j+m} \approx 2^{m} \cdot \frac{2^{-m}}{\ln 2} = \frac{1}{\ln 2} \approx 1.442 > 1,$$

which contradicts $\sigma_{j+m} \in (0, 1)$. This requires interrupting the block with a gap $\delta > 1$ (insertion of zeros). □

Theorem 5.

The asymptotic density of 1s in binary $3^n$ is at most $1/2$. If $L_n = \lfloor n \log_2 3 \rfloor + 1$ is the length of the binary representation of $3^n$, then the number of ones satisfies $h(n) \leq \frac{1}{2} L_n + o(L_n)$, and the number of zeros is at least $\frac{1}{2} n \log_2 3 + o(n)$.

Proof.

Assume the contrary: for some $\varepsilon > 0$ and infinitely many $n$, $h(n) > (1/2 + \varepsilon) L_n$. Then the number of zeros is $L_n - h(n) < (1/2 - \varepsilon) L_n$, and the average gap is $\bar{\delta} = (L_n - 1) / (h(n) - 1) < 2$. Let $d = 1 / (1/2 + \varepsilon) < 2$. The fraction $f$ of gaps with $\delta_i = 1$ satisfies $\bar{\delta} \geq 2 - f$, so $f > 2 - d > 0$ (a constant $c = 2 - d > 0$). This implies many series of ones.

However, modulo 8 we have $3^n \bmod 8 = 3$ for odd $n$ and $1$ for even $n$ (check: $3^1 \equiv 3$, $3^2 \equiv 1$, $3^3 \equiv 3$, etc.). A number ending in $k \geq 3$ ones from the LSB is congruent to $2^k - 1 \equiv 7 \pmod 8$, a contradiction. Thus the series of ones at the LSB has length at most 2, i.e., the second or third bit from the LSB is zero.

By Theorem 6, each series of ones (even of length $\leq 2$) requires compensation by a series of zeros ($\delta \geq 2$). The restriction on series at the LSB propagates globally through the dynamics of $\sigma_j$: frequent short series of 1s, as required by an excess $h > \frac{1}{2} L_n$, force frequent series of zeros, leading to a local density of zeros $> 1/2$. Globally, the uniformity of $\{n \log_2 3\}$ by [13] ensures a uniform distribution of compensations, making the overall density of zeros $\geq 1/2 + o(1)$, contradicting the assumption. The condition LSB = 1 fixes the end of the expansion, and deviations from uniformity are $o(L_n)$. The contradiction completes the proof. □
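The residue pattern modulo 8 and the resulting bound on the run of ones at the least significant end are straightforward to confirm numerically. This is a minimal sketch; the function name is an assumption of this example.

```python
def trailing_ones(m: int) -> int:
    """Length of the run of 1-bits at the least significant end of m."""
    count = 0
    while m & 1:
        count += 1
        m >>= 1
    return count


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, pow(3, n, 8), trailing_ones(3 ** n))
```

Since $3^n \bmod 8$ only takes the values 1 and 3, the binary expansion ends in `001` or `011`, so the trailing run has length 1 or 2.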

Theorem 6.

A series of consecutive 1s (i.e., steps with $\delta = 1$) must be followed by a series of zeros (steps with $\delta \geq 2$), ensuring a balance of ones and zeros.

Proof.

By Lemma 1 and Corollary 4, a series of 1s increases $s_j$ to $1 - 2^{-m} + 2^{-m} s_{j+m}$, which decreases $\sigma_j = 1 - \log_2(1 + s_j)$. To avoid $\sigma$ going outside $[0, 1]$ in backward propagation through $\delta = 1$ steps, a step with $\delta \geq 2$ must occur, which "resets" $\sigma$ according to the formulas of Theorem 1. □

4.3. Complete Proof by Series

4.4. Operators T and P: Trajectory Decomposition and Decay Estimate

Definitions.

Consider two primitive steps:

$$P(x) = \frac{x}{2} \ (\text{division by 2}), \qquad T(x) = 3x + 1 \ (\text{step for odds}).$$

One primitive Collatz step is either $T$ (when the current number is odd) or $P$ (when even). The composition of the pair $TP$ is conveniently viewed as one affine transformation

$$F(x) := (P \circ T)(x) = \frac{3x + 1}{2}.$$

Intuitively: each one in the lower bits generates one step $T$, immediately followed by $P$ (since $3x + 1$ is even). Therefore, series of ones correspond to blocks $TP, TP, \dots$, and series of zeros correspond to "pure" divisions $P$.

Lemma 2 (Iteration of $TP$ Blocks). For any integer $m \geq 1$ and $x \geq 0$,

$$F^m(x) = \left(\frac{3}{2}\right)^m x + \left(\frac{3}{2}\right)^m - 1.$$

In particular, $F^m$ is monotonically increasing in $x$.

Proof.

Induction on $m$. For $m = 1$, the formula holds by definition. Transition $m \to m + 1$:

$$F^{m+1}(x) = F(F^m(x)) = \frac{3}{2} \left(\left(\frac{3}{2}\right)^m x + \left(\frac{3}{2}\right)^m - 1\right) + \frac{1}{2} = \left(\frac{3}{2}\right)^{m+1} x + \left(\frac{3}{2}\right)^{m+1} - 1.$$

The coefficient of $x$ is positive, so $F^m$ is monotonic. □
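The closed form for $F^m$ can be confirmed with exact rational arithmetic by iterating $F(x) = (3x + 1)/2$ directly. The function names are assumptions of this example.

```python
from fractions import Fraction


def iterate_F(x: Fraction, m: int) -> Fraction:
    """Apply F(x) = (3x + 1) / 2 exactly m times."""
    for _ in range(m):
        x = (3 * x + 1) / 2
    return x


def closed_form(x: Fraction, m: int) -> Fraction:
    """(3/2)^m * x + (3/2)^m - 1."""
    ratio = Fraction(3, 2) ** m
    return ratio * x + ratio - 1


if __name__ == "__main__":
    print(all(iterate_F(Fraction(x), m) == closed_form(Fraction(x), m)
              for x in range(20) for m in range(1, 15)))
```

Using `Fraction` avoids floating-point rounding, so the comparison is an exact equality rather than an approximate one.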

Lemma 3 (Exact Decomposition by Number of $T$ and $P$). Let the first $L$ primitive steps of the trajectory from $X_0$ contain $M$ applications of $T$ and $Q$ divisions $P$ ($M + Q = L$) in arbitrary order. Then there exists an integer $b$, $0 \leq b \leq 2^Q - 1$, such that

$$X_L = \frac{3^M}{2^Q} X_0 + \frac{b}{2^Q}. \tag{4}$$

Proof.

Expand the composition: each step $T$ multiplies the current value by 3 and adds 1; each step $P$ divides by 2. Ultimately, the multiplier for $X_0$ is $3^M / 2^Q$. All additions "+1" from $T$ steps, after passing through some divisions by 2, give a sum of the form $\sum_j 2^{-e_j}$ with integers $e_j \in \{0, 1, \dots, Q\}$. Thus, this sum is of the form $b / 2^Q$ for some integer $b \in [0, 2^Q - 1]$. □
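The quantities $M$, $Q$, and $b$ in decomposition (4) can be tracked step by step along an actual trajectory. The sketch below maintains the invariant $X \cdot 2^Q = 3^M X_0 + b$ in integer arithmetic: a $T$ step replaces $b$ by $3b + 2^Q$, and a $P$ step increments $Q$. The function name and the return layout are assumptions of this example.

```python
def decompose(x0: int, steps: int) -> tuple[int, int, int, int]:
    """Run `steps` primitive Collatz steps from x0.

    Returns (x, M, Q, b) with x * 2**Q == 3**M * x0 + b, where M counts
    T steps (3x + 1) and Q counts P steps (x / 2).
    """
    x, m_count, q_count, b = x0, 0, 0, 0
    for _ in range(steps):
        if x % 2:
            x = 3 * x + 1
            m_count += 1
            b = 3 * b + 2 ** q_count  # the "+1" enters scaled by the current 2**Q
        else:
            x //= 2
            q_count += 1
    return x, m_count, q_count, b


if __name__ == "__main__":
    for x0 in (1, 7, 27):
        x, m_count, q_count, b = decompose(x0, 20)
        print(x0, x, m_count, q_count, b, 2 ** q_count - 1)
```

Returning $b$ alongside $M$ and $Q$ makes it straightforward to compare the additive term $b / 2^Q$ with the bound stated in the lemma for any particular window.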

Lemma 4 (Connection to Bit Series). Consider the lower $L$ bits of the number $X_0$ and break them into series

$$\underbrace{1 \dots 1}_{k_1} \underbrace{0 \dots 0}_{\ell_1} \underbrace{1 \dots 1}_{k_2} \underbrace{0 \dots 0}_{\ell_2} \dots \underbrace{1 \dots 1}_{k_r} \underbrace{0 \dots 0}_{\ell_r},$$

where $k_j, \ell_j \geq 0$, and the totals $M := \sum_j k_j$ and $Z := \sum_j \ell_j$ equal the numbers of ones and zeros among these $L$ bits ($M + Z = L$). Then in the first $L$ steps:

$$\text{number of } TP \text{ pairs} \geq M - O(1), \qquad \text{number of "pure" divisions } P \geq Z - O(1).$$

In particular,

$$Q \geq M + Z - O(1) = L - O(1), \qquad Q \geq 2M - O(1). \tag{5}$$

Proof.

Each one in the lower bits generates a step $T$; immediately after it, a $P$ inevitably follows (since $3x + 1$ is even). Therefore, each one accounts for at least one $TP$ pair, except possibly for end/junction effects between series (giving $O(1)$). Similarly, each series of zeros is realized as consecutive "pure" divisions $P$ (again with an $O(1)$ error at the junction). Summing over series gives the desired estimates and (5). □

Theorem 7 (Finite Compression with Balanced Ones and Zeros). Suppose that among the lower $L$ bits of $X_0$ the inequality $Z \geq M - C_0$ holds for some absolute constant $C_0$. Then there exists a constant $C_1$ such that

$$X_L \leq \left(\frac{6^{1/3}}{2}\right)^L 6^{C_1} X_0 + 1. \tag{6}$$

In particular, since $6^{1/3} / 2 \approx 0.908 < 1$, for sufficiently large $L$ we have $X_L < X_0$.

Proof.

By (4) and (5),

$$\frac{3^M}{2^Q} \leq \frac{3^M}{2^{2M - C}} = \left(\frac{3}{4}\right)^M 2^C$$

for some constant $C = O(1)$. Since $M \leq L/3 + O(1)$ from $M + Z = L$ and $Z \geq M - C_0$, we get

$$\frac{3^M}{2^Q} \leq 2^C \left(\frac{3}{4}\right)^{L/3 + O(1)} = \left(\frac{6^{1/3}}{2}\right)^L \cdot 6^{O(1)}.$$

Substituting into (4) and accounting for $b / 2^Q \leq 1$, we obtain (6). □

Corollary 6 (Window Length Selection). Taking $L = \lfloor n \log_2 3 \rfloor$ and using that on typical windows the fraction of zeros is no less than the fraction of ones (balance by series: after each series of ones comes a series of zeros no shorter), we apply Theorem 7 and conclude: there exists $n_0$ such that for all $n \geq n_0$, $X_L < X_0$.

Remark 1 (Worst Growth and Its Suppression). Maximum growth is achieved on $m$ consecutive $TP$ blocks (series of ones), where by Lemma 2 $F^m(x) = \left(\frac{3}{2}\right)^m x + \left(\frac{3}{2}\right)^m - 1$. Any subsequent series of zeros give additional divisions $P$ (multiplier $2^{-Z}$), and the additive remainder from Lemma 3 does not exceed 1. It is precisely the no less frequent "pure" $P$ steps after series of ones that ensure the global compression (6).

4.5. Deterministic Window Inequality: Accounting by Number of Odd Steps

Instead of fixing the window length in advance by the total number of primitive steps L, we work with a window containing exactly M odd steps T (and, respectively, Q divisions P). This eliminates overestimation and gives a correct deterministic estimate consistent with numerical observations.

Lemma 5 (Exact Decomposition by $M$ and $Q$). Suppose that in the first primitive steps from $X_0$ exactly $M$ applications of $T$ and $Q$ divisions $P$ occur (in any order). Then there exists an integer $b$ with $0 \leq b \leq 2^Q - 1$ such that

$$X_{M+Q} = \frac{3^M}{2^Q} X_0 + \frac{b}{2^Q} \leq \frac{3^M}{2^Q} X_0 + 1. \tag{7}$$

Lemma 6 (Deterministic Lower Bound on $Q$). Let $M$ be the number of $T$ applications (odd steps) in the considered window. Then

$$Q \geq 2M - C_0, \tag{8}$$

where the absolute constant $C_0$ depends only on the initial remainder $X_0 \bmod 8$ and is at most 4.

Proof Idea.

For odd $x$, let $\nu_2(3x + 1)$ be the exponent of 2 in $3x + 1$. The table of remainders modulo 8 gives the value of, or a lower bound for, $\nu_2(3x + 1)$ and allows tracking transitions to the next odd:

$$x \equiv 3, 7 \pmod 8 \Rightarrow \nu = 1, \qquad x \equiv 1 \pmod 8 \Rightarrow \nu = 2, \qquad x \equiv 5 \pmod 8 \Rightarrow \nu \geq 3$$

(for $x \equiv 5 \pmod 8$ we have $3x + 1 \equiv 0 \pmod 8$; for example, $x = 5$ gives $\nu = 4$ and $x = 13$ gives $\nu = 3$), with $7 \to 3 \to 5 \to 1 \to 1$ and then staying in class $1 \bmod 8$ (where $\nu \geq 2$). Hence, each $T$ step is accompanied by at least one division, and after entering class $1 \bmod 8$, by at least two. In total, this gives (8) with a small error for the initial "entry" (at most three transitions to class 1). □
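The values of $\nu_2(3x + 1)$ occurring in each odd residue class modulo 8 can be tabulated directly. This sketch collects, for odd $x$ below a chosen limit, the set of exponents observed per class; the function names and the limit are assumptions of this example.

```python
from collections import defaultdict


def nu2(n: int) -> int:
    """Exponent of 2 in the factorization of n (n > 0)."""
    return (n & -n).bit_length() - 1


def nu2_by_residue(limit: int) -> dict[int, set[int]]:
    """Map each odd residue class mod 8 to the set of nu2(3x + 1) seen for odd x < limit."""
    table: dict[int, set[int]] = defaultdict(set)
    for x in range(1, limit, 2):
        table[x % 8].add(nu2(3 * x + 1))
    return dict(table)


if __name__ == "__main__":
    for residue, values in sorted(nu2_by_residue(1000).items()):
        print(residue, sorted(values))
```

The expression `n & -n` isolates the lowest set bit of `n`, so its bit length minus one is the 2-adic valuation.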

Theorem 8 (Deterministic Compression after $M$ Odd Steps). There exists an absolute constant $C_1$ such that for any $M \geq 1$,

$$X_{M+Q} \leq \left(\frac{3}{4}\right)^M 2^{C_1} X_0 + 1. \tag{9}$$

In particular, the multiplier for $X_0$ decays exponentially in $M$, since $3/4 < 1$.

Proof.

From (7) and Lemma 6:

$$\frac{3^M}{2^Q} \leq \frac{3^M}{2^{2M - C_0}} = \left(\frac{3}{4}\right)^M 2^{C_0} \leq \left(\frac{3}{4}\right)^M 2^{C_1},$$

which gives (9). □

Corollary 7 (Translation to Estimate by Total Number of Steps). Suppose that in the window $M$ odd steps and $Q$ divisions occur (in total $L = M + Q$). By (8) we have $L \leq M + (2M - C_0) = 3M - C_0$, i.e., $M \geq (L + C_0) / 3$. Then from (9) it follows that

$$X_L \leq \left(\frac{3}{4}\right)^{\frac{L + C_0}{3}} 2^{C_1} X_0 + 1 \leq \left(\frac{3^{1/3}}{2^{2/3}}\right)^L C X_0 + 1, \tag{10}$$

where $C$ depends only on $C_0, C_1$. The number

$$c_* := \frac{3^{1/3}}{2^{2/3}} = \frac{6^{1/3}}{2} \approx 0.908$$

gives a conservative constant of average compression per primitive step.
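The three expressions for the per-step constant, $(3/4)^{1/3}$, $3^{1/3} / 2^{2/3}$, and $6^{1/3} / 2$, agree numerically, as a short check shows. The function name is an assumption of this example.

```python
def compression_constant() -> float:
    """c_* = (3/4)^(1/3), the per-step factor corresponding to (3/4) per three primitive steps."""
    return (3 / 4) ** (1 / 3)


if __name__ == "__main__":
    c_star = compression_constant()
    print(f"{c_star:.6f}", f"{3 ** (1 / 3) / 2 ** (2 / 3):.6f}", f"{6 ** (1 / 3) / 2:.6f}")
```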

Comments on Accuracy.

Formula (9) is deterministic and directly confirmed numerically: it does not attempt to estimate behavior over a fixed window length $L$ "forward", but describes compression after exactly $M$ odd steps, where $Q$ is then counted exactly (not roughly bounded above).
Translating (9) into (10) inevitably worsens the constant, since the exact $Q$ is replaced by the rough lower bound (8). In practice, observed values of $Q$ are usually larger than the minimum, and the actual compression is better than (10).
If strengthening is needed, one can account for the frequencies of classes modulo 8 on the real trajectory (for example, fix the first exit to class $1 \bmod 8$ and then use $\nu_2 \geq 2$ on each odd step); this improves $C_0$ and gives numerically stronger constants without probabilistic assumptions.

Appendix: Details of the Linear System

5.1. Recurrence of the Fractional Part

Let $M \in \mathbb{N}$. Set $\epsilon_j = 1 - \sigma_j$ and use the domain $\sigma \in [0, f(1)] \approx [0, 0.415]$ from Corollary 8. Then

$$M = \sum_{i=1}^{j-1} 2^{\lfloor \alpha_i \rfloor} + 2^{\alpha_j} = \sum_{i=1}^{j} 2^{\lfloor \alpha_i \rfloor} + 2^{\alpha_{j+1}}, \tag{11}$$

where the $\alpha_i$ strictly decrease. The fractional parts evolve according to:

$$\text{(i) } \delta_j = 1: \quad \sigma_j = \frac{1}{2} \sigma_{j+1} \left(1 - \frac{\ln 2}{4} \sigma_{j+1}\right) + F_j\left(\frac{\sigma_{j+1}^3}{12}\right), \tag{12}$$

$$\text{(ii) } \delta_j > 1: \quad \sigma_j = c_0(\delta_j) + c_1(\delta_j)\, \sigma_{j+1} + \frac{1}{2} c_2(\delta_j)\, \sigma_{j+1}^2 + R_j\left(\frac{(\ln 2)^2 \sigma_{j+1}^3}{8}\right), \tag{13}$$

where for $\tau = 2^{1 - \delta_j} \in (0, \frac{1}{2}]$:

$$c_0(\delta) = 1 - \frac{\ln(1 + \tau)}{\ln 2}, \qquad c_1(\delta) = \frac{\tau}{1 + \tau}, \qquad c_2(\delta) = -\frac{\ln 2 \cdot \tau}{(1 + \tau)^2}. \tag{14}$$

Remark 2.

The case $\delta_j = 1$ is the quadratic expansion of $f(\sigma) = 1 - \log_2(1 + 2^{-\sigma})$ around $\sigma = 0$, with remainder $F_j$ such that $|F_j(x)| \leq |x|$. For $\delta_j > 1$, the function $f_\delta(\sigma) = 1 - \log_2(1 + 2^{1 - \delta - \sigma})$ is expanded. The exact inversion for $\delta_j = 1$ is $\sigma_{j+1} = -\log_2(2^{1 - \sigma_j} - 1)$.

Theorem 9 (Uniform Cubic Bound for $F_j$). Let $f(\sigma) = 1 - \log_2(1 + 2^{-\sigma})$ for $\sigma \in [0, 1]$. Its quadratic Taylor polynomial at $\sigma = 0$ is

$$T_2(\sigma) = \frac{1}{2} \sigma - \frac{\ln 2}{8} \sigma^2, \tag{15}$$

and the remainder satisfies

$$|f(\sigma) - T_2(\sigma)| \leq \frac{\sigma^3}{12}, \qquad \sigma \in [0, 1]. \tag{16}$$

Theorem 10 (Uniform Cubic Bound for $R_j$). For $\delta \geq 2$ and $f_\delta(\sigma) = 1 - \log_2(1 + 2^{1 - \delta - \sigma})$, the following holds:

$$|f_\delta(\sigma) - T_2(\delta, \sigma)| \leq \frac{(\ln 2)^2}{48} \sigma^3 \leq \frac{(\ln 2)^2}{8} \sigma^3, \qquad \sigma \in [0, 1],$$

where $T_2(\delta, \sigma) = c_0(\delta) + c_1(\delta) \sigma + \frac{1}{2} c_2(\delta) \sigma^2$, and the $c_k(\delta)$ are given in (14).

Corollary 8 (Exact Inversion for $\delta = 1$). From $\sigma_j = 1 - \log_2(1 + 2^{-\sigma_{j+1}})$ it follows that

$$\sigma_{j+1} = -\log_2\left(2^{1 - \sigma_j} - 1\right),$$

valid for $\sigma_j \in [0, f(1)] \approx [0, 0.415]$.
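The map $f(\sigma) = 1 - \log_2(1 + 2^{-\sigma})$, its exact inverse, and the cubic bound (16) on the quadratic Taylor polynomial can be checked numerically on a grid. This is a floating-point sketch with function names chosen for this example; a grid check illustrates the bound but is not a proof of it.

```python
import math


def f(sigma: float) -> float:
    """Forward map for delta = 1: sigma_j as a function of sigma_{j+1}."""
    return 1 - math.log2(1 + 2 ** (-sigma))


def f_inverse(sigma_j: float) -> float:
    """Exact inversion: sigma_{j+1} = -log2(2**(1 - sigma_j) - 1)."""
    return -math.log2(2 ** (1 - sigma_j) - 1)


def taylor2(sigma: float) -> float:
    """Quadratic Taylor polynomial T_2 of f at sigma = 0."""
    return sigma / 2 - math.log(2) / 8 * sigma ** 2


if __name__ == "__main__":
    print(f"f(1) = {f(1):.6f}")
    for sigma in (0.1, 0.5, 1.0):
        print(sigma, abs(f(sigma) - taylor2(sigma)), sigma ** 3 / 12, f_inverse(f(sigma)))
```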

References

1. O’Connor, J.J.; Robertson, E.F. Lothar Collatz. MacTutor History of Mathematics, University of St Andrews, 2006. Available: https://mathshistory.st-andrews.ac.uk/Biographies/Collatz/.
2. Tao, T. Almost all Collatz orbits attain almost bounded values. Forum Math. Pi 2022, 10, e12.
3. Lagarias, J.C. The 3x+1 Problem and Its Generalizations. Amer. Math. Monthly 1985, 92, 3–23.
4. Sequences of 1s in binary expression of powers of 3. MathOverflow, 2024, Question 479499. Available: https://mathoverflow.net/questions/479499.
5. Cook, J.D. Powers of 3 in binary. 2021. Available: https://www.johndcook.com/blog/2021/04/28/powers-of-3-in-binary/.
6. Wolfram Research. Regularity versus Complexity in the Binary Representation of 3ⁿ. 2009. Available: https://wpmedia.wolfram.com/sites/13/2018/02/18-3-6.pdf.
7. Allouche, J.P.; Shallit, J. Automatic Sequences: Theory, Applications, Generalizations; Cambridge University Press: Cambridge, UK, 2003.
8. Sinai, Y.G. Statistical properties of the 3x+1 problem. Adv. Soviet Math. 1993, 16, 1–22.
9. Barina, D. Convergence verification of the Collatz problem. J. Supercomput. 2021, 77, 2681–2688.
10. Barina, D. Improved verification limit for the convergence of the Collatz conjecture. J. Supercomput. 2025.
11. Krasikov, I.; Lagarias, J.C. Bounds for the 3x + 1 problem using difference inequalities. Acta Arith. 2003, 109, 237–258.
12. Everest, G.; van der Poorten, A.; Shparlinski, I.; Ward, T. Recurrence Sequences; American Mathematical Society: Providence, RI, 2007.
13. Weyl, H. Über die Gleichverteilung von Zahlen mod. Eins. Math. Ann. 1916, 77, 313–352.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
