Preprint
Article

This version is not peer-reviewed.

The Collatz Conjecture and the Spectral Calculus for Arithmetic Dynamics

Submitted: 28 January 2026

Posted: 28 January 2026


Abstract
We develop a complete operator-theoretic and spectral framework for the Collatz map by analyzing its backward transfer operator on weighted Banach spaces of arithmetic functions. The associated Dirichlet transforms form a holomorphic family that isolates a zeta-type pole at $s = 1$, while on a finer multiscale space adapted to the dyadic-triadic geometry of the Collatz preimage tree we establish a two-norm Lasota-Yorke inequality with an explicit contraction constant, yielding quasi-compactness, a spectral gap, and a Perron-Frobenius theorem in which the eigenvalue 1 is algebraically and geometrically simple, no other spectrum meets the unit circle, and the unique invariant density is strictly positive. The fixed-point relation is converted into an exact multiscale recursion for the block averages $c_j$, revealing a rigid second-order coupling with exponentially small error terms and asymptotic profile $c_j \sim 6^{-j}$. This spectral classification forces every weak* limit of the Cesàro averages derived from any hypothetical infinite forward orbit to be either 0 or a scalar multiple of the Perron-Frobenius functional, with convergence to 0 occurring precisely under the Block-Escape Property. Since the forward map satisfies an unconditional exponential upper bound, whereas Block-Escape combined with linear block growth along a subsequence would impose an incompatible exponential lower bound, all analytic and spectral components needed for such a contradiction are complete, reducing the Collatz conjecture to excluding infinite orbits exhibiting Block-Escape without the supercritical linear block growth prohibited by the spectral theory.

1. Introduction

The Collatz map is the piecewise-defined transformation
\[ T(n) = \begin{cases} n/2, & n \text{ even}, \\ 3n+1, & n \text{ odd}, \end{cases} \qquad n \in \mathbb{N}, \tag{1} \]
and the Strong Collatz Conjecture asserts that every forward orbit $O^{+}(n) := \{ T^{k}(n) : k \ge 0 \}$ eventually reaches the trivial $1 \to 4 \to 2 \to 1$ cycle and that no other cycles exist. Despite the elementary definition of (1), the iteration mixes long apparent growth episodes with rapid collapses, a combination which has motivated probabilistic, analytic, and computational investigation for more than eight decades. Foundational work of Terras [21,22] established early density results and stopping-time estimates, while the surveys of Lagarias [12,13] synthesized a broad range of heuristic and structural approaches. Later analytic contributions, including those of Meinardus [15] and Applegate–Lagarias [2], developed refined density bounds and asymptotic models for the distribution of orbit values. The global termination problem has long resisted direct combinatorial analysis, in part because the relevant cancellations occur across widely separated arithmetic scales.
This paper introduces a transfer–operator framework in which the Collatz problem is recast as a spectral and functional–analytic question for the backward dynamics. The primary unconditional contribution is a complete spectral analysis of the backward transfer operator on a multiscale Banach space adapted to the Collatz preimage tree, yielding a Lasota–Yorke contraction, quasi–compactness, a spectral gap, and a Perron–Frobenius theorem with a unique strictly positive invariant density. These spectral inputs are then coupled to a finite–state residue–graph nontrapping mechanism and an orbitwise averaging principle, allowing the dynamical consequences to be organized in theorem–level form. The main theorem of the introduction (Theorem 1.1) records the resulting implication chain leading to the Strong Collatz conclusion.
Rather than studying $T$ directly, we analyze its inverse dynamics through the backward transfer operator
\[ (Pf)(n) := \sum_{m \,:\, T(m) = n} \frac{f(m)}{m}, \]
acting on arithmetic functions $f : \mathbb{N} \to \mathbb{C}$. Operators of this type are standard in statistical mechanics and dynamical systems [17,18], and related functional-analytic approaches have been developed for $3x+1$-type maps in recent work [14,16]. For the Collatz map (1), each $n$ has an even preimage $2n$ and, when $n \equiv 4 \pmod{6}$, an odd preimage $(n-1)/3$, so
\[ (Pf)(n) = \frac{f(2n)}{2n} + \mathbf{1}_{\{n \equiv 4 \ (\mathrm{mod}\ 6)\}} \, \frac{f((n-1)/3)}{(n-1)/3}. \]
The Jacobian weights $1/m$ are chosen so that $P$ acts as a bounded averaging operator on weighted $\ell^1$ spaces and so that the inverse-branch contraction is expressed directly at the level of norms and block averages.
On weighted spaces $\ell^1_\sigma$ with norm $\|f\|_\sigma := \sum_{n \ge 1} |f(n)|\, n^{-\sigma}$, the Dirichlet transform
\[ \mathcal{D}f(s) := \sum_{n \ge 1} \frac{f(n)}{n^{s}} \]
organizes the iterates $P^k f$ into a holomorphic family in $\Re(s) > \sigma$. Uniform control of $P^k$ yields exponential envelopes for $\mathcal{D}(P^k f)(s)$ and translates into meromorphic continuation statements for the associated Collatz–Dirichlet series, with dominant singular behavior governed by the spectral data of $P$. While the $\ell^1$ theory captures global weighted growth, it does not resolve the multiscale fluctuations central to Collatz dynamics, in particular the dyadic-triadic block structure which organizes the Collatz preimage tree.
To access these finer features, we refine the analytic setting to a multiscale Banach space $\mathcal{B}_{\mathrm{tree},\sigma}$ built from dyadic-triadic block averages and oscillation seminorms reflecting the hierarchical geometry of the preimage tree. On this space the operator $P$ satisfies a two-norm Lasota–Yorke inequality of the form
\[ [Pf]_{\mathrm{tree},\sigma} \le \lambda_{\mathrm{LY}}\, [f]_{\mathrm{tree},\sigma} + C\, \|f\|_\sigma, \qquad 0 < \lambda_{\mathrm{LY}} < 1, \]
placing the dynamics within the Ionescu-Tulcea–Marinescu and Hennion quasi-compactness theory for transfer operators [6,7]. The explicit verification of this inequality, including the strict contraction in the odd branch and uniform blockwise control, is carried out in Sections 5, 6, and 7.
These spectral estimates lead naturally to a set of dynamical criteria describing block escape, orbitwise averaging, valuation windows, and backward–operator invariants. Each provides a logically distinct formulation of the Collatz problem, and the implications among them make precise which mechanisms must be excluded in order to obtain global termination. The next theorem records the main logical relations among these statements and their connection to the Strong Collatz Conjecture.

1.1. A Dynamical Implication Architecture

The proof strategy developed in this paper passes through a collection of dynamical principles—some orbitwise, some block–statistical, and some finite–state—which are logically distinct, conceptually motivated by previous approaches, and ultimately verified within the present framework. To keep the logical dependencies transparent, we record them explicitly as a chain of implications leading to the Strong Collatz conclusion.
Theorem 1.1
(Dynamical forms connected to the Strong Collatz conjecture). For $j \ge 0$ define the 6-adic blocks
\[ I_j := [6^{j}, 6^{j+1}) \cap \mathbb{N}, \]
and for $n \in \mathbb{N}$ let $J(n)$ be the unique index such that $n \in I_{J(n)}$. In several multiscale estimates (notably for block averages) we also use the auxiliary half-blocks
\[ I_j' := [6^{j}, 2 \cdot 6^{j}) \cap \mathbb{N} \subset I_j, \]
and we will always specify explicitly when the half-block convention is intended. Consider the following statements:
(1) Strong Collatz conjecture. Every forward orbit is finite and the only cycle is the trivial $1 \to 4 \to 2 \to 1$ cycle.
(2) No infinite block escape. For every forward orbit $(n_k)$ one has $\sup_{k \ge 0} j(n_k) < \infty$.
(3) Orbit-averaging principle. Every infinite orbit produces a nonzero $P^{*}$-invariant linear functional supported entirely on that orbit.
(4) Block-orbit averaging. Every infinite orbit spends a positive proportion of time inside a finite union of low blocks $\bigcup_{j \le J} I_j$ for some $J$.
(5) Quantitative weak non-retreat plus Block-Escape forces supercritical linear block growth. Let $(n_k)_{k \ge 0}$ be an infinite forward Collatz orbit satisfying the Block-Escape Property. Assume there exist constants $c > 0$, $C_0 \ge 0$, $t_{\max} \in \mathbb{N}$, and $J_0 \ge 0$ such that for all sufficiently large $k$ with $j(n_k) \ge J_0$ one has the quantitative weak non-retreat estimate $j(n_{k+t}) \ge j(n_k) + c\,t - C_0$ for all $1 \le t \le t_{\max}$. Define the critical exponent
\[ \alpha_{*} := \frac{\log_2 3 - 1}{\log_2 6}, \qquad \gamma := \frac{c\, t_{\max} - C_0}{t_{\max}}. \]
If $\gamma > \alpha_{*}$, then there exists $\alpha$ with $\alpha_{*} < \alpha < \gamma$ and an increasing subsequence $(k_\ell)$ such that $j(n_{k_\ell}) \ge \alpha\, k_\ell$ for all sufficiently large $\ell$.
(6) Weak non-retreat principle. For every infinite orbit $(n_k)$ there exist constants $c > 0$, $C_0 \ge 0$, and $t_{\max} \in \mathbb{N}$ with the following property: once $j(n_k)$ exceeds a threshold $J_0$, then for all sufficiently large such $k$ there is some $1 \le t \le t_{\max}$ with
\[ j(n_{k+t}) \ge j(n_k) + c\,t - C_0. \]
(7) Orbitwise discrepancy vanishing. Every weak* limit $\Lambda$ of Cesàro averages along an infinite orbit is $P^{*}$-invariant.
(8) Residue-graph nontrapping and cycle obstruction. For some sufficiently large valuation cutoff $L_0$ and window bound $1 \le t \le t_{\max}$, every nonperiodic accelerated odd Collatz orbit escapes both the dangerous residue set $V_{\mathrm{danger}}(L_0)$ and the high-valuation set $V_{L_0+1}$. Equivalently, the finite directed residue graphs induced by valuation patterns contain no directed cycles compatible with the odd Collatz map, and no infinite orbit can remain trapped in or oscillate between the corresponding residue classes.
Proposition 1.2
(Implication architecture). With the results from Section 5, the following implications hold:
\[ (8) \Rightarrow (6) \Rightarrow (5) \Rightarrow (4) \Rightarrow (3), \qquad \text{and} \qquad (3) + (8) \Rightarrow (1). \]
More precisely,
(3) + (8) $\Rightarrow$ (1) by Proposition 7.23;
(4) $\Rightarrow$ (3) by Proposition 7.24;
(5) $\Rightarrow$ (4) by Proposition 7.25;
(6) $\Rightarrow$ (5) by Proposition 7.28;
(8) $\Rightarrow$ (6) by Theorem 7.21;
(8) is established by Theorems 7.16 and 7.17.
In addition,
(3) $\Rightarrow$ (2) by Proposition 7.22;
(7) $\Rightarrow$ (3) by Lemmas 7.1 and 7.4.
Finally, (3) $\Rightarrow$ (1) also follows by Theorems 5.28, 5.32, and 5.41.

Remarks on the statements in Theorem 1.1

Remark 1.3 (On (1)) This is the classical problem stated by Collatz, also called the $3x+1$ conjecture. Surveys by Lagarias [12,13] and Wirsching [23] provide comprehensive overviews of known results and approaches.
Remark 1.4 (On (2)) The negation of (2) is the existence of a divergent orbit, i.e. an orbit visiting arbitrarily large blocks. Korec [10] showed that divergence implies visiting arbitrarily large integers, though not organized a priori by powers of 3. The block partition $I_j = [2 \cdot 3^{j}, 2 \cdot 3^{j+1})$ appears naturally in Wirsching's functional-analytic approach [23] as a scale decomposition for growth estimates.
Remark 1.5 (On (3)) This is an orbitwise analogue of invariant measure constructions. Lagarias [12] constructed a 2-adic invariant measure for the $3x+1$ map, while Tao [20] used almost-periodicity and limit functionals in his "almost all" result. Extracting invariant functionals from individual orbits is closely related to Birkhoff averages and unique ergodicity along orbits [5].
Remark 1.6 (On (4)) This is stronger than Tao's conclusion [20] that almost all orbits eventually go below a slowly growing threshold $f(N)$: it asserts positive-density returns for every infinite orbit. Analogous recurrence-to-compact-sets principles occur throughout dynamical systems, typically as consequences of invariant measures [8]. In Collatz settings, statements of this type are compatible with the intuition that sufficiently persistent high-block growth should conflict with global exponential envelopes such as $T^{k}(n_0) \le C\, 2^{k}$ in suitable regimes [11].
Remark 1.7 (On (5)) The threshold $\alpha > \log 2 / \log 6 \approx 0.3869$ is the minimal expansion factor for which the inverse-branch contraction implied by the spectral theory becomes incompatible with persistent block escape. Related linear-in-$k$ growth rates appear in heuristic models in which $\nu_2(3n+1)$ averages below $\log_2 3$ [9]. In branching random walk models [1] the growth rate per step is $\log(3/2^{a})$, where $a$ is a typical valuation; requiring $\alpha > \log 2 / \log 6$ corresponds to a sustained deficit in the average valuation, as quantified later by Theorem 6.18.
Remark 1.8 (On (6)) This is a finite-horizon forward drift condition. Related stopping-time analyses [4,21] typically control descent and eventual decrease; here the principle asserts a uniform advance within bounded time along any orbit which has reached sufficiently high blocks. Sinai's random walk model [19] suggests positive expected drift for large $\log n_k$, while (6) demands a deterministic, orbitwise drift witnessed within a bounded time window.
Remark 1.9 (On (7)) This is a single-orbit analogue of the von Neumann mean ergodic theorem, conditional on the existence of the weak* limit. It is related to unique ergodicity along orbits [5]. Tao's argument [20] relies on an almost-periodicity mechanism of this flavor for almost all orbits.
Remark 1.10 (On (8)) Statement (8) places the Collatz problem into a finite combinatorial framework by encoding the accelerated odd map on residue graphs and requiring that no infinite trajectory can remain trapped in either the dangerous or high-valuation subgraphs. This viewpoint extends earlier graph-based and congruence-based analyses. Applegate and Lagarias [1] performed exhaustive computations of the accelerated map modulo $2^{N}$ (for $N \le 30$), finding no nontrivial cycles and thereby anticipating the cycle-obstruction aspect of (8). Eliahou's modular constraints on possible cycles [3] likewise translate into forbidden subgraphs in this residue framework. Kontorovich and Lagarias [9] modeled the dynamics via Markov chains on residue classes determined by 2-adic valuations, and Sinai's logarithmic random-walk formulation [19] analyzed when orbits may appear trapped in specific congruence sets.
The contribution of (8) here is to couple a finite residue obstruction to the valuation-window formalism used later in the paper: it distinguishes dangerous and high-valuation regions via $V_{\mathrm{danger}}(L_0)$ and $V_{L_0+1}$, and it links nontrapping in this finite state space directly to the weak non-retreat mechanism.
Remark 1.11 (Expository chronology).
Several of the principles listed in Theorem 1.1 are first stated later in the paper at the point where they naturally emerge from the dynamics, typically as a provisional assumption, a named principle, or a motivating remark. This is an expository choice: it allows the argument to proceed in chronological order while the analytic tools required for verification are developed in subsequent sections. Each such principle is ultimately proved as a theorem or proposition in the locations indicated in Theorem 1.1. The purpose of collecting them here is to present, in one place, the logical architecture of the proof and to clarify how the individual mechanisms relate to earlier approaches in the literature.
The remainder of the paper develops the analytic and dynamical tools that feed into Theorem 1.1. Section 2 introduces the weighted $\ell^1_\sigma$ spaces and their Dirichlet transforms, establishing the basic analytic framework used throughout. Section 3 defines the backward transfer operator $P$ and derives its analytic representation. Section 4 constructs the multiscale space $\mathcal{B}_{\mathrm{tree},\sigma}$ adapted to the Collatz preimage tree and proves the Lasota–Yorke inequalities that control the action of $P$ on this space. Section 5 converts the Lasota–Yorke contraction into a concrete spectral statement, proving that invariant densities obey an explicit block-level recursion with exponentially small error. This leads to a complete Perron–Frobenius description of $P$, establishing quasi-compactness and a certified spectral gap. Section 6 finalizes the spectral results and develops the valuation and residue tools required to analyze finite windows of odd transitions and to formulate the residue-graph obstructions that underlie the weak non-retreat principle. In particular, Section 6.5 introduces the valuation and residue-class tools used to analyze short odd windows, defining block indices, dangerous valuation patterns, and the finite residue graphs whose transitions constrain forward dynamics. These structures support the Block-Escape Property and the weak non-retreat principle central to the dynamical implications of the paper. Section 7 links the residue-graph analysis with precise forward-dynamical estimates, showing that nontrapping of accelerated odd orbits produces quantitative weak non-retreat, linear block growth, and ultimately orbitwise invariant functionals. When coupled with the spectral structure of the backward operator, this leads to a contradiction for any infinite orbit and therefore completes the proof of the Strong Collatz Conjecture.
Finally, Section 8 places the spectral and dynamical framework in a broader arithmetic context and outlines potential extensions to other discrete maps.

2. Analytic Foundations for the Transfer Operator

The analysis begins with a careful description of the function spaces, Dirichlet transforms, and basic structural features of the Collatz map that underlie the spectral study of the backward operator $P$. Throughout we work with complex-valued arithmetic functions $f : \mathbb{N} \to \mathbb{C}$. We start with a simple unconditional estimate.
Lemma 2.1
(Coarse k-step envelopes). Let $T : \mathbb{N} \to \mathbb{N}$ denote the Collatz map (1). For every $n \in \mathbb{N}$ and $k \in \mathbb{N}_0$,
\[ \frac{n}{2^{k}} \le T^{k}(n) \le 3^{k} n + \frac{3^{k} - 1}{2}. \tag{6} \]
Proof. 
For every $m \ge 1$, the definition of $T$ gives
\[ \frac{m}{2} \le T(m) \le 3m + 1. \]
Iterating the lower bound yields $T^{k}(n) \ge n / 2^{k}$. For the upper bound, the recurrence
\[ T^{k+1}(n) \le 3\, T^{k}(n) + 1 \]
immediately gives, by a simple induction on $k$, the explicit estimate $T^{k}(n) \le 3^{k} n + (3^{k} - 1)/2$. This proves (6). □
These envelopes are intentionally crude, yet they ensure that forward iterates of typical arithmetic weights remain controlled on the scales relevant for our Dirichlet and transfer-operator analysis.
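As a quick sanity check of Lemma 2.1, the following Python sketch verifies both envelopes by brute force on a small range. The names `collatz_step` and `check_envelopes` and the tested ranges are illustrative choices, not part of the paper; both inequalities are tested in integer arithmetic after clearing denominators.

```python
def collatz_step(m: int) -> int:
    """One step of the Collatz map T: halve even inputs, send odd m to 3m + 1."""
    return m // 2 if m % 2 == 0 else 3 * m + 1


def check_envelopes(n_max: int, k_max: int) -> bool:
    """Check n / 2^k <= T^k(n) <= 3^k n + (3^k - 1) / 2 for 1 <= n <= n_max, 0 <= k <= k_max."""
    for n in range(1, n_max + 1):
        value = n  # holds T^k(n) at the start of iteration k
        for k in range(k_max + 1):
            # Lower envelope, multiplied through by 2^k.
            lower_ok = n <= (2 ** k) * value
            # Upper envelope, multiplied through by 2.
            upper_ok = 2 * value <= 2 * (3 ** k) * n + 3 ** k - 1
            if not (lower_ok and upper_ok):
                return False
            value = collatz_step(value)
    return True
```

A call such as `check_envelopes(500, 40)` exercises the bounds on all starting values up to 500 and all step counts up to 40; it is a finite check only and says nothing beyond the tested range.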

2.1. Weighted Dirichlet Spaces and Transform Estimates

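For $\sigma > 0$ we define the weighted $\ell^1$ space
\[ \ell^1_\sigma := \Big\{ f : \mathbb{N} \to \mathbb{C} \;:\; \|f\|_\sigma := \sum_{n \ge 1} |f(n)|\, n^{-\sigma} < \infty \Big\}. \]
The weight exponent $\sigma$ measures polynomial decay and is chosen so that Dirichlet series associated with $f$ converge absolutely in a half-plane $\Re(s) > \sigma$. Given $f \in \ell^1_\sigma$, we define its Dirichlet transform
\[ \mathcal{D}f(s) := \sum_{n \ge 1} \frac{f(n)}{n^{s}}, \qquad \Re(s) > \sigma. \]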
Lemma 2.2
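(Dirichlet convergence). Let $\sigma > 0$ and let $f \in \ell^1_\sigma$, so that
\[ \|f\|_\sigma := \sum_{n \ge 1} |f(n)|\, n^{-\sigma} < \infty. \]
Then the Dirichlet transform
\[ \mathcal{D}f(s) := \sum_{n \ge 1} \frac{f(n)}{n^{s}} \]
converges absolutely for $\Re(s) > \sigma$ and defines a bounded holomorphic function on every half-plane $\Re(s) \ge \sigma + \varepsilon$, $\varepsilon > 0$. Moreover,
\[ |\mathcal{D}f(s)| \le \|f\|_\sigma \, \sup_{n \ge 1} n^{\sigma - \Re(s)} = \|f\|_\sigma \qquad (\Re(s) > \sigma). \tag{9} \]
Proof. 
Let $s \in \mathbb{C}$ with $\Re(s) > \sigma$. Then
\[ \sum_{n \ge 1} \Big| \frac{f(n)}{n^{s}} \Big| = \sum_{n \ge 1} |f(n)|\, n^{-\Re(s)} = \sum_{n \ge 1} |f(n)|\, n^{-\sigma}\, n^{\sigma - \Re(s)}. \]
Since $\Re(s) > \sigma$ implies $\sigma - \Re(s) < 0$, the sequence $n^{\sigma - \Re(s)}$ is decreasing to 0, and hence its supremum is attained at $n = 1$:
\[ \sup_{n \ge 1} n^{\sigma - \Re(s)} = 1. \]
Therefore,
\[ \sum_{n \ge 1} \Big| \frac{f(n)}{n^{s}} \Big| \le \|f\|_\sigma < \infty, \]
so the Dirichlet series converges absolutely.
For every $\varepsilon > 0$, the same bound holds uniformly on the half-plane $\Re(s) \ge \sigma + \varepsilon$, since then $\sigma - \Re(s) \le -\varepsilon$ and $n^{\sigma - \Re(s)} \le n^{-\varepsilon} \to 0$ as $n \to \infty$. Thus the convergence is locally uniform in $\Re(s) \ge \sigma + \varepsilon$, and classical Dirichlet-series theory implies that $\mathcal{D}f$ is holomorphic on this region. The bound (9) follows directly from the estimate above. □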
We write $\ell^1 = \ell^1_0$ for the unweighted space with norm $\|f\|_1 = \sum_{n \ge 1} |f(n)|$.

2.2. Collatz Preimage Geometry and the Backward Recursion

For each $n \ge 1$, define the even and odd preimage sets
\[ E(n) := \{ m \in \mathbb{N} : T(m) = n,\ m \text{ even} \}, \qquad O(n) := \{ m \in \mathbb{N} : T(m) = n,\ m \text{ odd} \}. \]
Lemma 2.3
(Preimage structure). For every $n \in \mathbb{N}$,
\[ E(n) = \{ 2n \}, \qquad O(n) = \begin{cases} \{ (n-1)/3 \}, & n \equiv 4 \pmod{6}, \\ \varnothing, & \text{otherwise}, \end{cases} \]
and in the first case $(n-1)/3$ is odd. In particular, each $n$ has either one preimage (even) or two preimages (one even and one odd), and the set of $n$ admitting an odd preimage has natural density $1/6$.
Proof. 
If $m$ is even and $T(m) = n$, then $m/2 = n$, so $m = 2n$, establishing $E(n) = \{2n\}$. If $m$ is odd and $T(m) = n$, then $3m + 1 = n$, so $m = (n-1)/3$. This is an integer precisely when $n \equiv 1 \pmod{3}$. For $m$ to be odd, $n - 1$ must be divisible by 3 but not by 6, so $n \equiv 4 \pmod{6}$. In that case $(n-1)/3$ is odd. The density statement follows since the congruence class $n \equiv 4 \pmod{6}$ has natural density $1/6$. □
Hence each $n$ admits exactly one even preimage and, precisely when $n \equiv 4 \pmod{6}$, one additional odd preimage. The corresponding backward transfer operator is defined as
\[ (Pf)(n) := \sum_{m \,:\, T(m) = n} \frac{f(m)}{m} = \frac{f(2n)}{2n} + \mathbf{1}_{\{n \equiv 4\,(6)\}} \, \frac{f\big(\tfrac{n-1}{3}\big)}{(n-1)/3}. \tag{11} \]
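The preimage description in Lemma 2.3 can be cross-checked by exhaustive search. The sketch below is illustrative only: the helper names `preimages_formula` and `preimages_brute_force` are hypothetical, and the search bound $m \le 2n$ uses the fact that both candidate preimages, $2n$ and $(n-1)/3$, are at most $2n$.

```python
def collatz_step(m: int) -> int:
    """One step of the Collatz map T."""
    return m // 2 if m % 2 == 0 else 3 * m + 1


def preimages_formula(n: int) -> set[int]:
    """Preimages of n as described in Lemma 2.3."""
    pre = {2 * n}                  # the even preimage always exists
    if n % 6 == 4:                 # the odd preimage exists only in this residue class
        pre.add((n - 1) // 3)
    return pre


def preimages_brute_force(n: int) -> set[int]:
    """All m with T(m) = n, found by direct search over 1 <= m <= 2n."""
    return {m for m in range(1, 2 * n + 1) if collatz_step(m) == n}
```

For instance, `preimages_formula(10)` returns `{20, 3}`, and among $1 \le n \le 600$ exactly 100 values have two preimages, matching the density $1/6$.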
The normalization by 1 / m reflects the logarithmic contraction of the forward map and ensures a natural mass-balance property.
Lemma 2.4
(Weighted mass preservation). Let $f : \mathbb{N} \to [0, \infty)$ satisfy
\[ \sum_{m \ge 1} \frac{f(m)}{m} < \infty. \]
Then the backward transfer operator
\[ (Pf)(n) := \sum_{m \,:\, T(m) = n} \frac{f(m)}{m} \]
preserves the weighted mass in the sense that
\[ \sum_{n \ge 1} (Pf)(n) = \sum_{m \ge 1} \frac{f(m)}{m}. \tag{12} \]
Proof. 
Since $f \ge 0$ and $\sum_{m \ge 1} f(m)/m < \infty$, Tonelli's theorem justifies rearranging the nonnegative double series. Using the definition of $P$,
\[ \sum_{n \ge 1} (Pf)(n) = \sum_{n \ge 1} \; \sum_{m \,:\, T(m) = n} \frac{f(m)}{m}. \]
Each $m \ge 1$ has exactly one image $T(m)$, so it appears in exactly one of the inner sums. Hence we can rewrite the double sum directly over $m$:
\[ \sum_{n \ge 1} \; \sum_{m \,:\, T(m) = n} \frac{f(m)}{m} = \sum_{m \ge 1} \frac{f(m)}{m}, \]
which is precisely (12). □
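The mass-balance identity can be illustrated exactly for finitely supported $f$. The sketch below is a minimal illustration, assuming $f$ is stored as a dictionary `{m: f(m)}` (a representation chosen here, not taken from the paper); `apply_P` pushes each weight $f(m)/m$ forward to $T(m)$, which mirrors the regrouping step in the proof.

```python
from fractions import Fraction


def apply_P(f: dict[int, Fraction]) -> dict[int, Fraction]:
    """Apply the backward operator P to a finitely supported f given as {m: f(m)}.

    (Pf)(n) is the sum of f(m)/m over all m with T(m) = n, so each m contributes
    f(m)/m to exactly one output index, namely n = T(m).
    """
    out: dict[int, Fraction] = {}
    for m, value in f.items():
        n = m // 2 if m % 2 == 0 else 3 * m + 1  # forward image T(m)
        out[n] = out.get(n, Fraction(0)) + Fraction(value) / m
    return out


def weighted_mass(f: dict[int, Fraction]) -> Fraction:
    """The logarithmically weighted mass: the sum of f(m)/m over the support."""
    return sum((Fraction(v) / m for m, v in f.items()), Fraction(0))
```

With exact rational arithmetic, `sum(apply_P(f).values())` equals `weighted_mass(f)` for any such dictionary; for example, `apply_P({1: Fraction(1), 8: Fraction(1)})` returns `{4: Fraction(9, 8)}` because both 1 and 8 map to 4.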

2.3. Dirichlet Envelopes for Iterated Backward Dynamics

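The preimage structure allows a crude but useful bound on $P$ acting on $\ell^1_\sigma$.
Proposition 2.5
(Backward operator bound). Let $\sigma > 0$ and let $P$ be defined by (11). Then $P : \ell^1_\sigma \to \ell^1_\sigma$ is bounded and
\[ \|Pf\|_\sigma \le C_\sigma \|f\|_\sigma, \qquad C_\sigma := 2^{\sigma} + 3^{-\sigma}, \tag{13} \]
for all $f \in \ell^1_\sigma$. Consequently, for every $k \ge 1$,
\[ \|P^{k} f\|_\sigma \le C_\sigma^{k}\, \|f\|_\sigma. \tag{14} \]
Proof. 
From (11),
\[ (Pf)(n) = \frac{f(2n)}{2n} + \mathbf{1}_{\{n \equiv 4\,(6)\}} \, \frac{f\big(\tfrac{n-1}{3}\big)}{(n-1)/3}. \]
Hence
\[ \|Pf\|_\sigma \le S_{\mathrm{even}} + S_{\mathrm{odd}}, \]
with
\[ S_{\mathrm{even}} := \sum_{n \ge 1} \frac{|f(2n)|}{2n}\, n^{-\sigma}, \qquad S_{\mathrm{odd}} := \sum_{\substack{n \ge 1 \\ n \equiv 4\,(6)}} \frac{\big| f\big(\tfrac{n-1}{3}\big) \big|}{(n-1)/3}\, n^{-\sigma}. \]
For the even branch, set $m = 2n$, so $n = m/2$ and
\[ S_{\mathrm{even}} = \sum_{\substack{m \ge 1 \\ m \text{ even}}} \frac{|f(m)|}{m} \Big( \frac{m}{2} \Big)^{-\sigma} = \sum_{\substack{m \ge 1 \\ m \text{ even}}} 2^{\sigma}\, \frac{|f(m)|}{m^{\sigma+1}} \le 2^{\sigma} \sum_{m \ge 1} |f(m)|\, m^{-\sigma} = 2^{\sigma} \|f\|_\sigma. \]
For the odd branch, write $m = (n-1)/3$, so $n = 3m+1$ and $m$ is odd. Then
\[ S_{\mathrm{odd}} = \sum_{\substack{m \ge 1 \\ m \text{ odd}}} \frac{|f(m)|}{m}\, (3m+1)^{-\sigma} \le \sum_{m \ge 1} \frac{|f(m)|}{m}\, (3m)^{-\sigma} = 3^{-\sigma} \sum_{m \ge 1} \frac{|f(m)|}{m^{\sigma+1}} \le 3^{-\sigma} \|f\|_\sigma. \]
Combining the two estimates gives (13), and iterating yields (14). □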
The constant $C_\sigma = 2^{\sigma} + 3^{-\sigma}$ is an explicit growth factor for $P$ on $\ell^1_\sigma$. Since $2^{\sigma} > 1$ for every $\sigma > 0$, it always exceeds 1 in this normalization, so no contraction is claimed at this level. The genuine contraction mechanism is obtained later on the multiscale Banach space $\mathcal{B}_{\mathrm{tree}}$, where a strong seminorm captures oscillatory decay along the Collatz tree while the $\ell^1$ component provides compactness.
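The one-step bound of Proposition 2.5 can be probed numerically on finitely supported test functions. The following sketch assumes the norm $\|f\|_\sigma = \sum_n |f(n)|\, n^{-\sigma}$ and the constant $2^{\sigma} + 3^{-\sigma}$ as read from the proof above; the dictionary representation and the function names are illustrative choices, and floating-point arithmetic is used, so comparisons should allow a small tolerance.

```python
def apply_P(f: dict[int, float]) -> dict[int, float]:
    """Backward operator on a finitely supported f: each m sends f(m)/m to index T(m)."""
    out: dict[int, float] = {}
    for m, value in f.items():
        n = m // 2 if m % 2 == 0 else 3 * m + 1
        out[n] = out.get(n, 0.0) + value / m
    return out


def weighted_norm(f: dict[int, float], sigma: float) -> float:
    """The weighted l^1 norm: the sum of |f(n)| * n^(-sigma) over the support."""
    return sum(abs(v) * n ** (-sigma) for n, v in f.items())


def growth_constant(sigma: float) -> float:
    """The explicit constant 2^sigma + 3^(-sigma) from the one-step bound."""
    return 2.0 ** sigma + 3.0 ** (-sigma)
```

Comparing `weighted_norm(apply_P(f), sigma)` with `growth_constant(sigma) * weighted_norm(f, sigma)` for a few choices of `f` and `sigma` gives a concrete feel for how far the crude constant is from being attained; such experiments do not replace the proof.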

3. Transfer Operator Formulation

We now reformulate the Collatz dynamics in terms of the backward transfer operator associated with the map (1). This operator-theoretic viewpoint provides an analytic bridge between the discrete recurrence and the functional framework developed in later sections. The transfer operator encodes the inverse–branching structure of the map and propagates densities backward along the Collatz tree, in a form compatible with logarithmic weighting and Dirichlet series.
Recall that, for the Collatz map (1), Lemma 2.3 shows that each $n \ge 1$ has the even preimage $2n$, together with an additional odd preimage $(n-1)/3$ precisely when $n \equiv 4 \pmod{6}$.

3.1. Backward Transfer Operator

Definition 3.1
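(Backward transfer operator). For an arithmetic function $f : \mathbb{N} \to \mathbb{C}$, define
\[ (Pf)(n) := \sum_{m \,:\, T(m) = n} \frac{f(m)}{m} = \frac{f(2n)}{2n} + \mathbf{1}_{\{n \equiv 4\,(6)\}} \, \frac{f\big(\tfrac{n-1}{3}\big)}{(n-1)/3}, \qquad n \in \mathbb{N}, \tag{15} \]
where $\mathbf{1}_{A}$ denotes the indicator of the condition $A$.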
Lemma 3.2
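(Dirichlet transform intertwining). Let $f \in \ell^1_\sigma$ with $\sigma > 1$, and define the Dirichlet transform
\[ \mathcal{D}(f)(s) = \sum_{n \ge 1} \frac{f(n)}{n^{s}}. \]
Then for every $s$ with $\Re(s) > \sigma$, the series $\mathcal{D}(f)(s)$ and $\mathcal{D}(Pf)(s)$ converge absolutely. Moreover, if we write
\[ \mathcal{D}(f)(s) = \sum_{n \ge 1} \frac{a_n}{n^{s}}, \qquad a_n := f(n), \]
and define a linear operator $\mathcal{L}_s$ on Dirichlet series by
\[ (\mathcal{L}_s F)(s) := 2^{s} \sum_{\substack{m \ge 1 \\ m \text{ even}}} a_m\, m^{-1-s} + \sum_{\substack{k \ge 1 \\ k \text{ odd}}} a_k\, k^{-1} (3k+1)^{-s}, \qquad F(s) = \sum_{n \ge 1} \frac{a_n}{n^{s}}, \tag{16} \]
then
\[ \mathcal{D}(Pf)(s) = (\mathcal{L}_s \mathcal{D}(f))(s) \qquad \text{for all } \Re(s) > \sigma. \]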
Proof. 
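Fix $f \in \ell^1_\sigma$ with $\sigma > 1$. By definition of the $\ell^1_\sigma$-norm,
\[ \sum_{n \ge 1} |f(n)|\, n^{-\sigma} < \infty. \]
If $\Re(s) > \sigma$, then $n^{-\Re(s)} \le n^{-\sigma}$ for all $n \ge 1$, so
\[ \sum_{n \ge 1} |f(n)|\, n^{-\Re(s)} \le \sum_{n \ge 1} |f(n)|\, n^{-\sigma} < \infty. \]
Thus $\mathcal{D}(f)(s) = \sum_{n \ge 1} f(n)\, n^{-s}$ converges absolutely for $\Re(s) > \sigma$.
Next we show that $\mathcal{D}(Pf)(s)$ converges absolutely for the same range. From the definition of $P$,
\[ (Pf)(n) = \frac{f(2n)}{2n} + \mathbf{1}_{\{n \equiv 4 \ (\mathrm{mod}\ 6)\}} \, \frac{f((n-1)/3)}{(n-1)/3}, \]
so
\[ |Pf(n)| \le \frac{|f(2n)|}{2n} + \mathbf{1}_{\{n \equiv 4 \ (\mathrm{mod}\ 6)\}} \, \frac{|f((n-1)/3)|}{(n-1)/3}. \]
Hence
\[ \sum_{n \ge 1} |Pf(n)|\, n^{-\Re(s)} \le S_{\mathrm{even}} + S_{\mathrm{odd}}, \]
where
\[ S_{\mathrm{even}} := \sum_{n \ge 1} \frac{|f(2n)|}{2n}\, n^{-\Re(s)}, \qquad S_{\mathrm{odd}} := \sum_{\substack{n \ge 1 \\ n \equiv 4\,(6)}} \frac{|f((n-1)/3)|}{(n-1)/3}\, n^{-\Re(s)}. \]
For the even contribution, set $m = 2n$, so $n = m/2$ and $m$ is even. Then
\[ S_{\mathrm{even}} = \sum_{\substack{m \ge 1 \\ m \text{ even}}} \frac{|f(m)|}{m} \Big( \frac{m}{2} \Big)^{-\Re(s)} = 2^{\Re(s)} \sum_{\substack{m \ge 1 \\ m \text{ even}}} |f(m)|\, m^{-1-\Re(s)}. \]
Since $\Re(s) > \sigma$ implies $\Re(s) + 1 > \sigma$, we have $m^{-1-\Re(s)} \le m^{-\sigma}$, and therefore
\[ S_{\mathrm{even}} \le 2^{\Re(s)} \sum_{\substack{m \ge 1 \\ m \text{ even}}} |f(m)|\, m^{-\sigma} \le 2^{\Re(s)} \sum_{m \ge 1} |f(m)|\, m^{-\sigma} < \infty. \]
For the odd contribution, write $n = 3k+1$ with $k \ge 1$ odd (equivalent to $n \equiv 4 \pmod{6}$ and $(n-1)/3 = k$ odd). Then
\[ S_{\mathrm{odd}} = \sum_{\substack{k \ge 1 \\ k \text{ odd}}} \frac{|f(k)|}{k}\, (3k+1)^{-\Re(s)}. \]
Since $3k + 1 \ge k$ for all $k \ge 1$, we have $(3k+1)^{-\Re(s)} \le k^{-\Re(s)}$, and hence
\[ S_{\mathrm{odd}} \le \sum_{\substack{k \ge 1 \\ k \text{ odd}}} |f(k)|\, k^{-1-\Re(s)} \le \sum_{k \ge 1} |f(k)|\, k^{-1-\Re(s)}. \]
Again $\Re(s) + 1 > \sigma$ gives $k^{-1-\Re(s)} \le k^{-\sigma}$, so
\[ S_{\mathrm{odd}} \le \sum_{k \ge 1} |f(k)|\, k^{-\sigma} < \infty. \]
Thus $S_{\mathrm{even}} + S_{\mathrm{odd}} < \infty$, and $\mathcal{D}(Pf)(s)$ converges absolutely for $\Re(s) > \sigma$.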
We now compute $\mathcal{D}(Pf)(s)$ explicitly and identify it with $(\mathcal{L}_s \mathcal{D}(f))(s)$. By definition,
\[ \mathcal{D}(Pf)(s) = \sum_{n \ge 1} \frac{(Pf)(n)}{n^{s}}. \]
Substituting the formula for $P$ and splitting according to the two branches,
\[ \mathcal{D}(Pf)(s) = \sum_{n \ge 1} \frac{f(2n)}{2n}\, n^{-s} + \sum_{\substack{n \ge 1 \\ n \equiv 4\,(6)}} \frac{f((n-1)/3)}{(n-1)/3}\, n^{-s}. \]
For the even part, set again $m = 2n$:
\[ \sum_{n \ge 1} \frac{f(2n)}{2n}\, n^{-s} = \sum_{\substack{m \ge 1 \\ m \text{ even}}} \frac{f(m)}{m} \Big( \frac{m}{2} \Big)^{-s} = 2^{s} \sum_{\substack{m \ge 1 \\ m \text{ even}}} f(m)\, m^{-1-s}. \]
For the odd part, write $n = 3k+1$ with $k \ge 1$ odd and $(n-1)/3 = k$:
\[ \sum_{\substack{n \ge 1 \\ n \equiv 4\,(6)}} \frac{f((n-1)/3)}{(n-1)/3}\, n^{-s} = \sum_{\substack{k \ge 1 \\ k \text{ odd}}} \frac{f(k)}{k}\, (3k+1)^{-s}. \]
Putting the two contributions together,
\[ \mathcal{D}(Pf)(s) = 2^{s} \sum_{\substack{m \ge 1 \\ m \text{ even}}} f(m)\, m^{-1-s} + \sum_{\substack{k \ge 1 \\ k \text{ odd}}} f(k)\, k^{-1} (3k+1)^{-s}. \]
Now write $F(s) = \mathcal{D}(f)(s) = \sum_{n \ge 1} a_n n^{-s}$ with $a_n = f(n)$. By the definition (16) of $\mathcal{L}_s$, we have
\[ (\mathcal{L}_s F)(s) = 2^{s} \sum_{\substack{m \ge 1 \\ m \text{ even}}} a_m\, m^{-1-s} + \sum_{\substack{k \ge 1 \\ k \text{ odd}}} a_k\, k^{-1} (3k+1)^{-s}, \]
which matches exactly the expression for $\mathcal{D}(Pf)(s)$ obtained above when $a_n = f(n)$. Hence
\[ \mathcal{D}(Pf)(s) = (\mathcal{L}_s \mathcal{D}(f))(s) \]
for all $\Re(s) > \sigma$, as claimed. □
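The intertwining identity can be checked numerically for finitely supported $f$, where every sum is finite and no convergence question arises. The sketch below is an illustration under that assumption; `dirichlet`, `apply_P`, and `ruelle_side` are hypothetical helper names, and `ruelle_side` evaluates the even and odd sums of the operator $\mathcal{L}_s$ directly from the coefficients of $f$.

```python
def apply_P(f: dict[int, float]) -> dict[int, float]:
    """Backward operator on a finitely supported f: each m sends f(m)/m to index T(m)."""
    out: dict[int, float] = {}
    for m, value in f.items():
        n = m // 2 if m % 2 == 0 else 3 * m + 1
        out[n] = out.get(n, 0.0) + value / m
    return out


def dirichlet(f: dict[int, float], s: complex) -> complex:
    """Finite Dirichlet sum: the sum of f(n) * n^(-s) over the support."""
    return sum(v * n ** (-s) for n, v in f.items())


def ruelle_side(f: dict[int, float], s: complex) -> complex:
    """Right-hand side: 2^s * sum_{m even} f(m) m^(-1-s) + sum_{k odd} f(k) / (k (3k+1)^s)."""
    even = sum(v * m ** (-1 - s) for m, v in f.items() if m % 2 == 0)
    odd = sum(v / (k * (3 * k + 1) ** s) for k, v in f.items() if k % 2 == 1)
    return 2 ** s * even + odd
```

For a complex point such as `s = 2.5 + 1.0j`, the values `dirichlet(apply_P(f), s)` and `ruelle_side(f, s)` should agree up to floating-point rounding.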
The multiplicative factor $1/m$ assigns to each inverse branch a logarithmic weight, so that $P$ acts as a normalized backward average along preimages. This normalization aligns the discrete dynamics with Dirichlet weights and will be crucial for analytic continuation and spectral estimates below.
Positivity. If $f(n) \ge 0$ for all $n$, then $(Pf)(n) \ge 0$ for all $n$, since $P$ is a positive linear combination of values of $f$.
Weighted mass preservation. A direct change of variables shows that for every nonnegative $f$ satisfying $\sum_{m \ge 1} |f(m)|/m < \infty$,
\[ \sum_{n \ge 1} (Pf)(n) = \sum_{m \ge 1} \frac{f(m)}{m}. \]
Thus $P$ preserves the logarithmically weighted mass $\sum_{m} f(m)/m$; plain $\ell^1$ mass is not preserved under this normalization.
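Boundedness on weighted spaces. Let
\[ \ell^1_\sigma := \Big\{ f : \mathbb{N} \to \mathbb{C} \;:\; \|f\|_{\ell^1_\sigma} := \sum_{n \ge 1} |f(n)|\, n^{-\sigma} < \infty \Big\}, \qquad \sigma > 0. \]
Applying the triangle inequality in (15) yields, for all $f \in \ell^1_\sigma$,
\[ \|Pf\|_{\ell^1_\sigma} = \sum_{n \ge 1} |(Pf)(n)|\, n^{-\sigma} \le \sum_{n \ge 1} \frac{|f(2n)|}{2\, n^{1+\sigma}} + \sum_{\substack{n \ge 1 \\ n \equiv 4\,(6)}} \frac{|f((n-1)/3)|}{\big((n-1)/3\big)\, n^{\sigma}}. \]
Changing variables $m = 2n$ in the first sum and $m = (n-1)/3$ in the second (so that $n = 3m+1 \ge 3m$ with $m$ odd) gives
\[ \sum_{n \ge 1} \frac{|f(2n)|}{2\, n^{1+\sigma}} = 2^{\sigma} \sum_{\substack{m \ge 1 \\ m \text{ even}}} \frac{|f(m)|}{m^{1+\sigma}} \le 2^{\sigma} \|f\|_{\ell^1_\sigma}, \qquad \sum_{\substack{n \ge 1 \\ n \equiv 4\,(6)}} \frac{|f((n-1)/3)|}{\big((n-1)/3\big)\, n^{\sigma}} = \sum_{\substack{m \ge 1 \\ m \text{ odd}}} \frac{|f(m)|}{m\, (3m+1)^{\sigma}} \le 3^{-\sigma} \sum_{m \ge 1} \frac{|f(m)|}{m^{1+\sigma}} \le 3^{-\sigma} \|f\|_{\ell^1_\sigma}. \]
Hence
\[ \|Pf\|_{\ell^1_\sigma} \le \big( 2^{\sigma} + 3^{-\sigma} \big) \|f\|_{\ell^1_\sigma}, \]
and therefore
\[ \|P^{k} f\|_{\ell^1_\sigma} \le \big( 2^{\sigma} + 3^{-\sigma} \big)^{k} \|f\|_{\ell^1_\sigma}, \qquad k \ge 0. \tag{19} \]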
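Action on the weighted sup space. For the Banach space
\[ B_\sigma := \Big\{ f : \mathbb{N} \to \mathbb{C} \;:\; \|f\|_{B_\sigma} := \sup_{n \ge 1} n^{\sigma} |f(n)| < \infty \Big\}, \]
the normalization factor $1/m$ in (15) improves decay at each branch but does not make $P$ a contraction. Setting $g(n) := f(n)/n$, one obtains
\[ (Pf)(n) = (Qg)(n), \qquad (Qg)(n) := g(2n) + \mathbf{1}_{\{n \equiv 4\,(6)\}} \, g\Big( \frac{n-1}{3} \Big). \]
Using $\|f\|_{B_\sigma} = \|g\|_{B_{\sigma+1}}$, together with $n^{\sigma} (2n)^{-(\sigma+1)} \le 2^{-(\sigma+1)}$ for $n \ge 1$ and $(3m+1)^{\sigma}\, m^{-(\sigma+1)} \le 4^{\sigma}$ for $m \ge 1$, one obtains the bound
\[ \|Pf\|_{B_\sigma} = \sup_{n \ge 1} n^{\sigma} |(Qg)(n)| \le \sup_{n \ge 1} \Big( n^{\sigma} |g(2n)| + n^{\sigma}\, \mathbf{1}_{\{n \equiv 4\,(6)\}} \Big| g\Big( \frac{n-1}{3} \Big) \Big| \Big) \le \big( 2^{-(\sigma+1)} + 4^{\sigma} \big) \|g\|_{B_{\sigma+1}} = \big( 2^{-(\sigma+1)} + 4^{\sigma} \big) \|f\|_{B_\sigma}. \]
In particular, the constant $2^{-(\sigma+1)} + 4^{\sigma}$ exceeds 1 for all $\sigma > 0$. Moreover, the point mass $\delta_1$ at $m = 1$ satisfies $\|\delta_1\|_{B_\sigma} = 1$ and $(P\delta_1)(4) = 1$, hence $\|P\delta_1\|_{B_\sigma} = 4^{\sigma} > 1$, so $P$ is bounded but not contractive on $(B_\sigma, \|\cdot\|_{B_\sigma})$. This coarse boundedness provides an upper envelope for the operator norm but does not imply any decay of $P^{k}$ on $B_\sigma$.
These limitations motivate the refinement of the functional setting in later sections, where the multiscale tree spaces $\mathcal{B}_{\mathrm{tree}}$ and $\mathcal{B}_{\mathrm{tree},\sigma}$ are introduced to obtain genuine Lasota–Yorke-type contractions with $\lambda < 1$ and a provable spectral gap.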

3.2. Dirichlet–Ruelle Operator and Intertwining Relations

For $f \in \ell^1_\sigma$ with $\sigma > 0$, the Dirichlet transform
\[ \mathcal{D}f(s) := \sum_{n \ge 1} \frac{f(n)}{n^{s}}, \qquad \Re(s) > \sigma, \]
is absolutely convergent. Writing $\mathcal{D}f(s) = \sum_{n \ge 1} a_n n^{-s}$ with $a_n = f(n)$ and substituting (15), we obtain
\[ \mathcal{D}(Pf)(s) = \sum_{n \ge 1} \Big( \frac{a_{2n}}{2n} + \mathbf{1}_{\{n \equiv 4\,(6)\}} \, \frac{a_{(n-1)/3}}{(n-1)/3} \Big) \frac{1}{n^{s}}. \tag{23} \]
Thus $\mathcal{D}(Pf)$ is again a Dirichlet series whose coefficients depend linearly on those of $\mathcal{D}f$.
Definition 3.3
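(Dirichlet–Ruelle operator). Let $\mathcal{D}_\sigma$ denote the space of Dirichlet series
\[ F(s) = \sum_{n \ge 1} \frac{a_n}{n^{s}} \qquad \text{with} \qquad \sum_{n \ge 1} |a_n|\, n^{-\sigma} < \infty. \]
Define $\mathcal{L} : \mathcal{D}_\sigma \to \mathcal{D}_\sigma$ by
\[ (\mathcal{L}F)(s) := \sum_{n \ge 1} \frac{b_n}{n^{s}}, \qquad b_n := \frac{a_{2n}}{2n} + \mathbf{1}_{\{n \equiv 4\,(6)\}} \, \frac{a_{(n-1)/3}}{(n-1)/3}. \tag{24} \]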
Lemma 3.4
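(Operator norm of $\mathcal{L}$). For $\sigma > 0$, let $\|F\|_\sigma := \sum_{n \ge 1} |a_n| / n^{\sigma}$. Then $\mathcal{L} : \mathcal{D}_\sigma \to \mathcal{D}_\sigma$ is bounded and
\[ \|\mathcal{L}\|_\sigma \le 2^{\sigma} + 3^{-\sigma}. \tag{25} \]
Proof. 
From (24),
\[ \|\mathcal{L}F\|_\sigma = \sum_{n \ge 1} \frac{|b_n|}{n^{\sigma}} \le \sum_{n \ge 1} \frac{|a_{2n}|}{2n\, n^{\sigma}} + \sum_{\substack{n \ge 1 \\ n \equiv 4\,(6)}} \frac{|a_{(n-1)/3}|}{(n-1)/3} \, \frac{1}{n^{\sigma}} =: S_{\mathrm{even}} + S_{\mathrm{odd}}. \]
For the even term, set $m = 2n$. Then
\[ S_{\mathrm{even}} = \sum_{m \text{ even}} \frac{|a_m|}{2\, (m/2)^{1+\sigma}} = \sum_{m \text{ even}} 2^{\sigma}\, \frac{|a_m|}{m^{1+\sigma}} \le 2^{\sigma} \sum_{m \text{ even}} \frac{|a_m|}{m^{\sigma}} \le 2^{\sigma} \|F\|_\sigma. \]
For the odd term, write $m = (n-1)/3$, so $n = 3m+1$ with $m$ odd, and
\[ S_{\mathrm{odd}} = \sum_{\substack{m \ge 1 \\ m \text{ odd}}} \frac{|a_m|}{m\, (3m+1)^{\sigma}} \le 3^{-\sigma} \sum_{m \ge 1} \frac{|a_m|}{m^{\sigma}} = 3^{-\sigma} \|F\|_\sigma. \]
Combining the two estimates gives
\[ \|\mathcal{L}F\|_\sigma \le \big( 2^{\sigma} + 3^{-\sigma} \big) \|F\|_\sigma, \]
proving (25). □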
Lemma 3.5
(Intertwining of $P$ and $\mathcal{L}$). For every $f \in \ell^1_\sigma$ with $\sigma > 0$,
\[ \mathcal{D}(Pf) = \mathcal{L}(\mathcal{D}f), \qquad \mathcal{D}(P^{k} f) = \mathcal{L}^{k}(\mathcal{D}f), \quad k \ge 0, \tag{26} \]
whenever the series converge absolutely.
Proof. 
The Dirichlet coefficients of $\mathcal{D}(Pf)$ in (23) are precisely the $b_n$ of (24), so $\mathcal{D}(Pf) = \mathcal{L}(\mathcal{D}f)$; iteration gives the second identity. □
The intertwining relation shows that spectral information for $P$ on $\ell^1_\sigma$ transfers to $\mathcal{L}$ on $\mathcal{D}_\sigma$. However, since $P$ is not contractive on $\ell^1_\sigma$ or $B_\sigma$, the inequality (25) provides only the exponential envelope $\|\mathcal{L}^{k}\|_\sigma \le (2^{\sigma} + 3^{-\sigma})^{k}$, whose base exceeds 1, and not exponential decay. Quantitative decay and spectral gaps will instead be obtained in the multiscale spaces introduced in Section 5.
Define $w_k := P^{k} \mathbf{1}$ with $\mathbf{1}(n) \equiv 1$ and
\[ \zeta_C(s, k) := \sum_{n \ge 1} \frac{w_k(n)}{n^{s}}, \qquad \Re(s) \text{ large}. \]
By Lemma 3.5,
\[ \zeta_C(s, 0) = \zeta(s), \qquad \zeta_C(s, k) = \mathcal{L}^{k} \zeta(s), \quad k \ge 1. \]
The quantity $w_k(n)$ represents the total normalized weight of all $k$-step backward paths from $n$ in the Collatz tree under the logarithmic weighting $1/m$. The family $\zeta_C(s, k)$ therefore encodes, in Dirichlet form, the distribution of these weighted backward configurations at depth $k$. By Lemma 3.4,
\[ \|\mathcal{L}^{k}\|_\sigma \le \big( 2^{\sigma} + 3^{-\sigma} \big)^{k}, \]
so $\zeta_C(s, k)$ is controlled in $\Re(s) > \sigma$ only by an envelope that grows exponentially in $k$; no decay in $k$ follows at this level. Later sections refine this estimate by passing to the multiscale tree space $\mathcal{B}_{\mathrm{tree},\sigma}$, where the Lasota–Yorke inequality ensures a true spectral gap and exponential decay of $P^{k}$.
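Because $P^{k}\mathbf{1} = P(P^{k-1}\mathbf{1})$, the weights $w_k(n)$ can be evaluated pointwise by a depth recursion over the (at most two) preimages of $n$. The following sketch is an illustration with exact rationals; the function name `w` and the use of memoization are choices made here, not constructions from the paper.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def w(k: int, n: int) -> Fraction:
    """Evaluate w_k(n) = (P^k 1)(n) via w_k(n) = sum over preimages m of n of w_{k-1}(m) / m."""
    if k == 0:
        return Fraction(1)               # w_0 is the constant function 1
    total = w(k - 1, 2 * n) / (2 * n)    # even preimage m = 2n
    if n % 6 == 4:                       # odd preimage m = (n - 1) / 3
        m = (n - 1) // 3
        total += w(k - 1, m) / m
    return total
```

For example, `w(1, 4)` equals `Fraction(9, 8)`, the sum of the weights $1/8$ and $1/1$ attached to the preimages 8 and 1 of 4, and `w(2, 4)` equals `Fraction(65, 128)`. The recursion visits at most $2^{k}$ backward paths, so it is practical only for moderate depth.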

4. Spectral Framework and Multiscale Contraction Theory

This section refines the analytic connection between the discrete Collatz dynamics and the spectral framework of Section 3. Our goal is to express analytic information about the Dirichlet series associated with iterates of the backward operator P in terms of the spectral data of P —equivalently, of the Dirichlet–Ruelle operator L —acting on suitable Banach spaces continuously embedded in σ 1 . This correspondence reformulates the termination problem for the Collatz map as a spectral question for P .
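Throughout this section we fix $\sigma > 1$ and a Banach space $\mathcal{B}_{\sigma,1}$ of arithmetic functions such that $\mathcal{B}_{\sigma,1} \hookrightarrow \ell^1_\sigma$ continuously, $P(\mathcal{B}_{\sigma,1}) \subset \mathcal{B}_{\sigma,1}$, and the Dirichlet transform
$$\mathcal{D}f(s) = \sum_{n \ge 1} \frac{f(n)}{n^s}$$
defines a holomorphic function for $\Re(s) > \sigma$ whenever $f \in \mathcal{B}_{\sigma,1}$. The intertwining relation (26) then yields, for all $k \ge 0$,
$$\mathcal{D}(P^k f)(s) = \sum_{n \ge 1} \frac{(P^k f)(n)}{n^s}, \qquad \Re(s) > \sigma.$$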
Since B σ , 1 σ 1 , each series converges absolutely. By the σ 1 estimate (19),
| D ( P k f ) ( s ) | P k f σ 1 2 σ + 3 σ k f σ 1 , ( s ) > σ .
The bound (29) gives a geometric growth envelope for the iterates of $P$ on $\ell^1_\sigma$ but no contraction; a genuine contraction appears only on the multiscale tree spaces constructed in Section 4.3, through the Lasota–Yorke estimates of Section 4.4.
Generating function and operator resolvent. For z C with | z | < ( 2 σ + 3 σ ) 1 , define the two–variable generating function
$$G_f(s,z) := \sum_{k \ge 0} z^k\, \mathcal{D}(P^k f)(s).$$
The series converges absolutely and locally uniformly for ( s ) > σ , hence G f is holomorphic in ( s , z ) on the domain
Ω σ : = { ( s , z ) C 2 : ( s ) > σ , | z | < ( 2 σ + 3 σ ) 1 } .
On the operator side, for such $z$ the Neumann series
$$(I - zP)^{-1} = \sum_{k \ge 0} z^k P^k$$
converges in operator norm on $\mathcal{B}_{\sigma,1}$, and thus
$$G_f(s,z) = \mathcal{D}\big((I - zP)^{-1} f\big)(s), \qquad (s,z) \in \Omega_\sigma.$$
The singularities of $z \mapsto (I - zP)^{-1}$ lie precisely at the reciprocals of the nonzero spectral values of $P$ on $\mathcal{B}_{\sigma,1}$, and isolated eigenvalues of finite multiplicity produce poles. Consequently the analytic structure of $G_f$ as a function of $z$ is governed by the spectrum of $P$.
At this point we recall that the backward Collatz transfer operator P preserves total mass on 1 :
n 1 ( P f ) ( n ) = m 1 f ( m ) ,
so 1 is the maximal eigenvalue of P on 1 . On the Banach space B tree , σ , however, the associated invariant object is not the constant function, but rather the unique positive eigenfunction h satisfying P h = h , as constructed in Section 5. Thus the spectral analysis of P is directed toward establishing a spectral gap at the eigenvalue 1 , meaning that every other spectral value λ satisfies | λ | λ LY < 1 , where λ LY is the Lasota–Yorke contraction constant for the tree norm. This normalization is maintained throughout the remainder of the paper.
Consequently, the resolvent expansion (31) is analytic for | z | < 1 except at the simple pole at z = 1 . The residue at this pole coincides with the Riesz projection onto the eigenspace spanned by h , and therefore encodes the invariant functional associated with the positive eigenfunction h .
The coarse resolvent radius ( 2 σ + 3 σ ) 1 merely provides an elementary domain of convergence. A sharper meromorphic continuation—reflecting the true spectral radius r ( P ) = 1 and the subdominant bound ρ ess ( P ) λ LY < 1 —will be obtained on the refined spaces B tree and B tree , σ , where the Lasota–Yorke inequality gives quantitative contraction of oscillations between adjacent scales.
Finally, for the constant function 1 ( n ) 1 (whenever 1 B σ , 1 ), the coefficients of G 1 ( s , z ) are precisely the Collatz Dirichlet series ζ C ( s , k ) defined in (27). Thus the analytic continuation and asymptotic decay of ζ C ( s , k ) as k are controlled by the spectral properties of P through (31); their exponential decay emerges once the spectral gap on the multiscale tree spaces is established.

4.1. Spectral Reduction and Meromorphic Continuation of Dirichlet Transforms

Recall that the Dirichlet–Ruelle operator $\mathcal{L}$ is defined on $\mathcal{D}_\sigma$ by (24). The intertwining Lemma 3.5 asserts that for all $f \in \ell^1_\sigma$,
$$\mathcal{D}(Pf) = \mathcal{L}(\mathcal{D}f).$$
Since D is injective on σ 1 , every eigenpair ( λ , f ) of P with f σ 1 produces an eigenpair ( λ , D f ) of L . Conversely, if L F = λ F and F = D f lies in the image of D , then P f = λ f . Hence the point spectra of P on B σ , 1 and of L on D σ coincide on the subspace D ( B σ , 1 ) . In particular,
ρ ( L ) ρ ( P ) ,
and any spectral gap or peripheral spectral property of P transfers to the induced action of L on Dirichlet series arising from B σ , 1 .
We emphasize that equality σ ( L ) = σ ( P ) is not assumed. The partial correspondence (32) suffices for analytic reduction: the Dirichlet-side continuation of D ( P k f ) reflects the spectral geometry of P .
Mass preservation and spectral gap. Because $P$ only preserves total mass up to a logarithmic factor, we have
$$\sum_{n \ge 1} (Pf)(n) = \sum_{m \ge 1} \frac{f(m)}{m},$$
so the constant function $\mathbf{1}(n) \equiv 1$ is not an eigenvector. Instead, $P$ admits a unique positive invariant density $h \in \mathcal{B}_{\mathrm{tree},\sigma}$ and a unique positive invariant functional $\phi \in \mathcal{B}_{\mathrm{tree},\sigma}^{*}$ with
$$Ph = h, \qquad \phi \circ P = \phi, \qquad \phi(h) = 1. \tag{33}$$
Throughout the paper we work with this Perron–Frobenius normalization (33) and express all spectral decompositions relative to the nonconstant invariant profile h .
Within this framework, the Dirichlet–Ruelle operator L inherits the same dominant eigenvalue 1 and the same spectral gap on the subspace D ( B σ , 1 ) . The analytic behavior of the Collatz Dirichlet series ζ C ( s , k ) = D ( P k 1 ) ( s ) is then determined by how P k approaches the spectral projector onto the invariant subspace spanned by 1 .
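The weighted mass identity $\sum_n (Pf)(n) = \sum_m f(m)/m$ can be checked exactly on finitely supported functions. The sketch below is an illustration only; the dictionary representation and the name `apply_P` are assumptions introduced here. It uses the fact that each $m$ feeds exactly one $n$: $n = m/2$ when $m$ is even, and $n = 3m+1$ (which is $\equiv 4 \pmod 6$) when $m$ is odd.

```python
from fractions import Fraction


def apply_P(f: dict) -> dict:
    """Apply the backward operator P to a finitely supported f, given as {n: value}."""
    g = {}
    for m, val in f.items():
        contrib = Fraction(val) / m
        # f(m) appears in (P f)(n) for exactly one n:
        #   m even -> n = m / 2      (even branch, 2n = m)
        #   m odd  -> n = 3 m + 1    (odd branch, (n - 1)/3 = m and n = 4 mod 6)
        n = m // 2 if m % 2 == 0 else 3 * m + 1
        g[n] = g.get(n, Fraction(0)) + contrib
    return g


f = {1: 1, 2: 3, 5: 2, 8: 4}
Pf = apply_P(f)
lhs = sum(Pf.values())                                # total mass of P f
rhs = sum(Fraction(v) / m for m, v in f.items())      # sum of f(m) / m
```

Here both `lhs` and `rhs` equal $17/5$, and the value at $n = 4$ collects contributions from both preimages $m = 8$ and $m = 1$.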
Theorem 4.1
(Spectral reduction and analytic continuation). Let $\mathcal{B}_{\sigma,1}$ be a Banach space of arithmetic functions continuously embedded in $\ell^1_\sigma$ such that $P : \mathcal{B}_{\sigma,1} \to \mathcal{B}_{\sigma,1}$ is quasi-compact and satisfies the mass-preserving normalization (12). Assume further that $1$ is a simple eigenvalue of $P$ and that all other spectral values lie in the closed disk $|\lambda| \le \lambda_{LY} < 1$. Then for every $f \in \mathcal{B}_{\sigma,1}$ the Dirichlet transforms $\mathcal{D}(P^k f)(s)$ extend holomorphically to $\Re(s) > \sigma$ and admit the decomposition
$$\mathcal{D}(P^k f)(s) = \Pi_1(f)\, \mathcal{D}(\mathbf{1})(s) + R_k(s), \qquad |R_k(s)| \le C_f(s)\, \lambda_{LY}^{k}, \tag{34}$$
where $\Pi_1$ is the spectral projection associated with the eigenvalue $1$ and $C_f(s)$ is locally bounded on $\{\Re(s) > \sigma\}$. In particular, for $f$ with $\Pi_1(f) = 0$, the functions $\mathcal{D}(P^k f)(s)$ decay exponentially in $k$ uniformly on compact subsets of $\Re(s) > \sigma$.
When f = 1 , the same conclusion applies to ζ C ( s , k ) = D ( P k 1 ) ( s ) , whose exponential stabilization corresponds to convergence toward the invariant density associated with the Collatz operator.
Proof. 
By quasi-compactness, the spectrum of $P$ decomposes as
$$\sigma(P) = \{1\} \cup \sigma_{\mathrm{ess}}(P), \qquad \rho_{\mathrm{ess}}(P) \le \lambda_{LY} < 1,$$
and the Riesz projection $\Pi_1 = \frac{1}{2\pi i} \oint_{|z-1| = \varepsilon} (zI - P)^{-1}\, dz$ is a bounded projection onto the one-dimensional invariant subspace spanned by $\mathbf{1}$. Then $P^k = \Pi_1 + N^k$, where $\|N^k\|_{\mathcal{B}_{\sigma,1}} \le C \lambda_{LY}^{k}$ for some constant $C > 0$. Applying the Dirichlet transform and using $|\mathcal{D}(g)(s)| \le \|g\|_{\ell^1_\sigma}$ for $\Re(s) > \sigma$ gives
$$\mathcal{D}(P^k f)(s) = \mathcal{D}(\Pi_1 f)(s) + \mathcal{D}(N^k f)(s), \qquad |\mathcal{D}(N^k f)(s)| \le C \lambda_{LY}^{k}\, \|f\|_{\mathcal{B}_{\sigma,1}}.$$
Since $\Pi_1 f$ is a multiple of $\mathbf{1}$, we may write $\mathcal{D}(\Pi_1 f) = \Pi_1(f)\, \mathcal{D}(\mathbf{1})$, yielding (34). Analyticity for $\Re(s) > \sigma$ follows from absolute convergence and locally uniform bounds. □
This form aligns with the quasi-compactness obtained later on the multiscale tree space B tree , σ , where the Lasota–Yorke inequality ensures ρ ess ( P ) λ LY < 1 . The exponential term λ LY k in (34) corresponds to the essential spectral radius and controls the rate of decay of correlations and Dirichlet coefficients. Under stronger spectral assumptions, the representation can be refined to a meromorphic decomposition in which each isolated eigenvalue λ j contributes a term λ j k D ( Π j f ) , generalizing the usual Ruelle–Perron expansion.

4.2. Spectral Criteria for Boundedness on Weighted Spaces

The preceding analysis shows that sufficiently strong spectral control of P on an appropriate Banach space B σ , 1 forces all Dirichlet data generated by the backward Collatz tree to exhibit exponential stabilization toward the invariant profile. Since P is not contractive on σ 1 or B σ , such behavior can only arise on refined Banach spaces where a genuine spectral gap at the eigenvalue 1 has been established. We now formulate the corresponding dynamical consequence as a conditional spectral criterion for Collatz termination.
Theorem 4.2
(Spectral criterion for Collatz termination). Let $P$ act on a Banach space $\mathcal{B}_{\sigma,1} \subset \ell^1_\sigma$ such that $P(\mathcal{B}_{\sigma,1}) \subset \mathcal{B}_{\sigma,1}$ and $\mathbf{1} \in \mathcal{B}_{\sigma,1}$. Assume that $P$ is quasi-compact on $\mathcal{B}_{\sigma,1}$, that $1$ is a simple eigenvalue of $P$ corresponding to the unique positive invariant density $h$, and that all other spectral values satisfy
$$\sigma(P) \setminus \{1\} \subset \{ z \in \mathbb{C} : |z| \le \lambda_{LY} < 1 \}.$$
Then every $f \in \mathcal{B}_{\sigma,1}$ admits a decomposition
$$P^k f = \Pi_1 f + N^k f, \qquad \|N^k f\|_{\mathcal{B}_{\sigma,1}} \le C \lambda_{LY}^{k}\, \|f\|_{\mathcal{B}_{\sigma,1}},$$
where $\Pi_1$ is the spectral projection onto $\operatorname{span}\{h\}$. Consequently, there exists no nontrivial invariant or periodic density for the backward Collatz dynamics in $\mathcal{B}_{\sigma,1}$; the only invariant direction is the positive eigenfunction $h$. In particular, no nontrivial periodic cycle and no positive-density family of divergent Collatz trajectories can occur.
Proof. 
By quasi-compactness, the spectrum of $P$ decomposes as $\sigma(P) = \{1\} \cup \sigma_{\mathrm{ess}}(P)$ with $\rho_{\mathrm{ess}}(P) \le \lambda_{LY} < 1$. The associated Riesz projection
$$\Pi_1 = \frac{1}{2\pi i} \oint_{|z-1| = \varepsilon} (zI - P)^{-1}\, dz$$
is bounded and satisfies $P \Pi_1 = \Pi_1 P = \Pi_1$. Since $1$ is a simple eigenvalue with positive eigenfunction $h$, we have
$$\Pi_1 f = \phi(f)\, h,$$
where $\phi$ is the corresponding eigenfunctional normalized so that $\phi(h) = 1$.
Hence the power iterates decompose as
$$P^k = \Pi_1 + N^k, \qquad \|N^k\|_{\mathcal{B}_{\sigma,1}} \le C \lambda_{LY}^{k},$$
for some constant C > 0 .
If a nontrivial invariant density f B σ , 1 satisfied P f = f , then f would belong to the eigenspace of λ = 1 . Since this eigenspace is one-dimensional and spanned by h , we must have f = c h for some constant c . Thus no additional invariant densities exist beyond span { h } .
If a periodic density f satisfied P q f = f for some q > 0 , then f would belong to an eigenspace associated with an eigenvalue λ satisfying | λ | = 1 . Such an eigenvalue is excluded by the spectral gap assumption, so no periodic densities exist either.
Finally, via the standard correspondence between transfer-operator invariants and dynamical orbits on the Collatz graph, any invariant or periodic density corresponds to either a periodic Collatz cycle or to a positive-density family of non-terminating trajectories. The spectral gap therefore precludes these dynamical behaviors. □
Section 4.3 constructs the multiscale tree Banach space $\mathcal{B}_{\mathrm{tree}}$, and Section 4.4 establishes a Lasota–Yorke inequality that ensures quasi-compactness of $P$ with an explicit contraction constant $\lambda_{LY} < 1$ in the strong seminorm. Verification of the hypotheses of Theorem 4.2 on $\mathcal{B}_{\mathrm{tree},\sigma}$ provides the analytic–spectral bridge: a strict spectral gap for $P$ on $\mathcal{B}_{\mathrm{tree},\sigma}$ rules out the spectral signatures associated with any non-terminating Collatz behavior.

4.3. Construction of the Multiscale Tree Space B tree , σ

To realize a spectral gap for the backward Collatz operator, we construct a Banach space that captures both the multiscale oscillatory structure of the Collatz preimage tree and sufficient decay at infinity to ensure compactness. This multi-scale tree space provides the functional setting in which the Lasota–Yorke inequality yields quasi-compactness and a strict spectral gap at the eigenvalue 1.
For $j \ge 0$ define the scale blocks
$$I_j := [6^j,\, 2 \cdot 6^j) \cap \mathbb{N}.$$
The factor $6$ reflects the approximate scale multiplication under the backward map, combining the even branch $m = 2n$ and the odd branch $m = (n-1)/3$ (defined for $n \equiv 4 \pmod 6$). Fix parameters $0 < \alpha < 1$ and $0 < \vartheta < 1$. For indices $u, v > 0$, define the scale-sensitive weight
$$W_\alpha(u,v) := \frac{uv}{|u - v|\,(u + v)^{\alpha}}, \qquad u \ne v. \tag{36}$$
This weight penalizes small separations between indices, emphasizing local oscillations of f , while the factor ( u + v ) α damps sensitivity at large scales. The geometric coefficient ϑ j provides exponential attenuation of oscillations across successive levels of the tree.
Definition 4.3
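(Multiscale tree seminorm and space). For $f : \mathbb{N} \to \mathbb{C}$ define
$$[f]_{\mathrm{tree}} := \sum_{j \ge 0} \vartheta^{j} \sup_{\substack{m, n \in I_j \\ m \ne n}} W_\alpha(m,n)\, |f(m) - f(n)|.$$
The corresponding Banach space
$$\mathcal{B}_{\mathrm{tree}} := \big\{ f : \mathbb{N} \to \mathbb{C} \;:\; \|f\|_{\ell^1} + [f]_{\mathrm{tree}} < \infty \big\}, \qquad \|f\|_{\mathrm{tree}} := \|f\|_{\ell^1} + [f]_{\mathrm{tree}},$$
is called the multiscale tree space.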
Standard arguments for weighted variation-type seminorms show that ( B tree , · tree ) is complete. The seminorm [ f ] tree controls the oscillatory irregularity of f within each scale block I j , while the 1 component controls the overall magnitude. However, B tree alone does not impose sufficient decay as n to guarantee compactness.
Weighted extension. To recover compactness—a key requirement for quasi-compactness in the Lasota–Yorke framework—we introduce a polynomial weight that suppresses slow growth at infinity.
Definition 4.4
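(Weighted tree space with block decay). For parameters $0 < \alpha < 1$, $0 < \vartheta < 1$, $\sigma > 1$, and $\eta > 1$, set
$$\|f\|_{\sigma} := \sum_{n \ge 1} |f(n)|\, n^{-\sigma},$$
$$[f]_{\mathrm{osc}} := \sum_{j \ge 0} \vartheta^{j} \sup_{\substack{m, n \in I_j \\ m \ne n}} W_\alpha(m,n)\, |f(m) - f(n)|,$$
and
$$[f]_{\mathrm{mass}} := \sup_{j \ge 0}\, \eta^{j} \sum_{n \in I_j} |f(n)|\, n^{-\sigma}.$$
Then define
$$\mathcal{B}_{\mathrm{tree},\sigma} := \big\{ f : \mathbb{N} \to \mathbb{C} \;:\; \|f\|_{\mathrm{tree},\sigma} < \infty \big\}, \qquad \|f\|_{\mathrm{tree},\sigma} := \|f\|_{\sigma} + [f]_{\mathrm{osc}} + [f]_{\mathrm{mass}}.$$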
The weight n σ enforces global summability of f , while the block–decay term [ f ] mass forces the total weighted mass on each block I j to decrease geometrically with j , ruling out escape of mass to high levels of the tree. The oscillation term [ f ] osc controls the local multiscale variation of f within blocks. Together these components provide the strong–weak structure required for the Lasota–Yorke framework: the strong part [ f ] osc contracts under the transfer operator, while the weak part f σ + [ f ] mass yields compactness of the unit ball and ensures tightness of mass across scales.
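To make the three components of the norm concrete, the following sketch evaluates truncated versions of $\|f\|_\sigma$, $[f]_{\mathrm{osc}}$ and $[f]_{\mathrm{mass}}$ over the blocks $I_0, \dots, I_J$. It is an illustration under stated assumptions: the function names and the cutoff `J` are hypothetical, the true quantities are suprema and sums over all $j \ge 0$, and the brute-force pair search is only practical for small `J`.

```python
def block(j: int) -> range:
    """Scale block I_j = [6^j, 2 * 6^j) intersected with the natural numbers."""
    return range(6 ** j, 2 * 6 ** j)


def W(u: int, v: int, alpha: float) -> float:
    """Tree weight W_alpha(u, v) = u v / (|u - v| (u + v)^alpha), for u != v."""
    return u * v / (abs(u - v) * (u + v) ** alpha)


def norm_sigma(f, sigma: float, J: int) -> float:
    """Truncated weak norm: sum of |f(n)| n^(-sigma) over n < 2 * 6^J."""
    return sum(abs(f(n)) / n ** sigma for n in range(1, 2 * 6 ** J))


def osc_seminorm(f, alpha: float, theta: float, J: int) -> float:
    """Truncated oscillation seminorm: blocks j = 0..J only."""
    total = 0.0
    for j in range(J + 1):
        pts = block(j)
        sup = max(
            (W(m, n, alpha) * abs(f(m) - f(n)) for m in pts for n in pts if m < n),
            default=0.0,  # I_0 = {1} contains no pairs
        )
        total += theta ** j * sup
    return total


def mass_seminorm(f, sigma: float, eta: float, J: int) -> float:
    """Truncated block-decay seminorm: sup over j = 0..J."""
    return max(
        eta ** j * sum(abs(f(n)) / n ** sigma for n in block(j))
        for j in range(J + 1)
    )
```

For $f(n) = 1/n$ the weighted difference simplifies to $(m+n)^{-\alpha}$, so the block supremum on $I_1$ is attained at the pair $(6, 7)$; constant functions have vanishing oscillation seminorm.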
Lemma 4.5
(Compact embedding). Fix $0 < \alpha < 1$, $0 < \vartheta < 1$, $\sigma > 1$, and $\eta > 1$. Then the unit ball of $\mathcal{B}_{\mathrm{tree},\sigma}$ is relatively compact in $\ell^1_\sigma$.
Proof. 
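Let
$$U := \big\{ f \in \mathcal{B}_{\mathrm{tree},\sigma} : \|f\|_{\mathrm{tree},\sigma} \le 1 \big\}.$$
(i) Uniform boundedness. For $f \in U$,
$$\|f\|_{\sigma} \le \|f\|_{\mathrm{tree},\sigma} \le 1,$$
so $U$ is bounded in $\ell^1_\sigma$.
(ii) Uniform tail control. From $[f]_{\mathrm{mass}} \le 1$ we obtain, for every $j \ge 0$,
$$\sum_{n \in I_j} |f(n)|\, n^{-\sigma} \le \eta^{-j}.$$
Fix $\varepsilon > 0$. Choose $J$ so large that
$$\sum_{j > J} \eta^{-j} < \varepsilon.$$
Pick $N$ so large that all blocks $I_j$ with $j \le J$ are contained in $\{1, \dots, N\}$. Then for any $f \in U$,
$$\sum_{n > N} |f(n)|\, n^{-\sigma} = \sum_{j > J} \sum_{n \in I_j} |f(n)|\, n^{-\sigma} \le \sum_{j > J} \eta^{-j} < \varepsilon.$$
So the tails of $U$ are uniformly small in $\ell^1_\sigma$.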
(iii) Compactness on finite windows. Fix $N \ge 1$ and consider the restriction map $f \mapsto (f(1), \dots, f(N))$ from $U$ into $\mathbb{C}^N$. For each $1 \le n \le N$ and $f \in U$,
$$|f(n)|\, n^{-\sigma} \le \|f\|_{\sigma} \le 1,$$
hence $|f(n)| \le n^{\sigma} \le N^{\sigma}$. Therefore $\{ f|_{\{1, \dots, N\}} : f \in U \}$ is bounded in the finite-dimensional space $\mathbb{C}^N$ and so relatively compact there.
(iv) Diagonal extraction and $\ell^1_\sigma$ convergence. Let $(f^{(k)}) \subset U$ be any sequence. By (iii) we can extract a subsequence that converges at the coordinate $n = 1$. Repeating this on the coordinates $\{1, 2\}$, then $\{1, 2, 3\}$, etc., and passing to a diagonal subsequence, we obtain a subsequence $(f^{(k_\ell)})$ that converges pointwise on all of $\mathbb{N}$:
$$f^{(k_\ell)}(n) \to f(n) \quad \text{for each } n \ge 1.$$
We now show $f^{(k_\ell)} \to f$ in $\ell^1_\sigma$. Fix $\varepsilon > 0$ and choose $N$ as in (ii) so that
$$\sup_{g \in U} \sum_{n > N} |g(n)|\, n^{-\sigma} < \frac{\varepsilon}{3}.$$
By pointwise convergence on $\{1, \dots, N\}$ and finite-dimensional compactness, there exists $L$ such that for all $\ell \ge L$,
$$\sum_{n \le N} |f^{(k_\ell)}(n) - f(n)|\, n^{-\sigma} < \frac{\varepsilon}{3}.$$
For the tail, note that $f$ also belongs to $U$ as the coordinatewise limit of elements of $U$ with uniform $\ell^1_\sigma$ bound, so
$$\sum_{n > N} |f(n)|\, n^{-\sigma} \le \sup_{g \in U} \sum_{n > N} |g(n)|\, n^{-\sigma} < \frac{\varepsilon}{3}.$$
Therefore, for $\ell \ge L$,
$$\sum_{n > N} |f^{(k_\ell)}(n) - f(n)|\, n^{-\sigma} \le \sum_{n > N} |f^{(k_\ell)}(n)|\, n^{-\sigma} + \sum_{n > N} |f(n)|\, n^{-\sigma} < \frac{2\varepsilon}{3}.$$
Combining the head and tail estimates gives
$$\sum_{n \ge 1} |f^{(k_\ell)}(n) - f(n)|\, n^{-\sigma} < \varepsilon \quad \text{for all } \ell \ge L,$$
so $f^{(k_\ell)} \to f$ in $\ell^1_\sigma$. This shows that every sequence in $U$ has a convergent subsequence in $\ell^1_\sigma$, hence $U$ is relatively compact. □
Remark 4.6.
The additional block-decay seminorm [ f ] mass is essential. If one worked only with f σ + [ f ] osc , the unit ball would not be precompact in 1 or in σ 1 : one can construct a sequence of functions supported on disjoint blocks, constant on each block, whose oscillation seminorms vanish and whose σ 1 norms stay uniformly bounded, while the supports drift to infinity. The factor η j in [ f ] mass rules out this escape mechanism by forcing the weighted mass in block I j to decay at least geometrically in j .
The space B tree , σ thus provides the natural functional environment for the Lasota–Yorke inequality. Its compact embedding into σ 1 ensures that the essential spectral radius of P on B tree , σ is strictly smaller than its spectral radius, a prerequisite for establishing a genuine spectral gap. The strong seminorm captures multiscale regularity across the Collatz tree, while the weighted 1 norm supplies the compactness that underlies the spectral analysis of the backward transfer operator.

4.4. Lasota–Yorke Contraction on the Multiscale Tree Space

Recall from (11) that
$$(Pf)(n) = \frac{f(2n)}{2n} + \mathbf{1}_{\{n \equiv 4\ (6)\}}\, \frac{f\big(\tfrac{n-1}{3}\big)}{(n-1)/3}.$$
It is convenient to split $P$ into its even and odd components:
$$(P_{\mathrm{even}} f)(n) := \frac{f(2n)}{2n}, \qquad (P_{\mathrm{odd}} f)(n) := \mathbf{1}_{\{n \equiv 4\ (6)\}}\, \frac{f\big(\tfrac{n-1}{3}\big)}{(n-1)/3},$$
so that $P = P_{\mathrm{even}} + P_{\mathrm{odd}}$.
From the 1 estimates of Section 2, both branches are bounded on 1 , hence on B tree . The Lasota–Yorke inequality arises from the fact that P even is strongly contracting in the tree seminorm, while P odd is a controlled perturbation whose contribution is damped by the multiscale factor ϑ j .

4.4.1. Even-Branch Distortion and Contraction Estimates

We first record the even-branch estimate.
Lemma 4.7
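(Even branch contraction on $\mathcal{B}_{\mathrm{tree},\sigma}$). Fix $0 < \alpha < 1$, $0 < \vartheta < 1$, $\sigma > 1$, and $\eta > 1$. Let $P_{\mathrm{even}} f(n) = f(2n)/(2n)$. There exist constants
$$\lambda_{\mathrm{even}} = 2^{-(2-\alpha)}\, \vartheta^{-1} \quad \text{and} \quad C_{\mathrm{even}} > 0,$$
depending only on $\alpha, \vartheta, \sigma, \eta$, such that for all $f \in \mathcal{B}_{\mathrm{tree},\sigma}$,
$$[P_{\mathrm{even}} f]_{\mathrm{osc}} \le \lambda_{\mathrm{even}}\, [f]_{\mathrm{osc}} + C_{\mathrm{even}} \big( \|f\|_{\sigma} + [f]_{\mathrm{mass}} \big). \tag{39}$$
In particular, if $\vartheta > 2^{-(2-\alpha)}$, then $0 < \lambda_{\mathrm{even}} < 1$, and the even branch is a strict contraction in the oscillation seminorm up to a controlled weak error.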
Proof. 
Write
$$[f]_{\mathrm{osc}} = \sum_{j \ge 0} \vartheta^{j}\, \delta_j(f), \qquad \delta_j(f) := \sup_{m \ne n \in I_j} W_\alpha(m,n)\, |f(m) - f(n)|.$$
Step 1: Decomposition and the $D_1$/$D_2$ split. Fix $j \ge 0$ and $u, v \in I_j = [2^j, 2^{j+1})$, $u \ne v$. We write
$$(P_{\mathrm{even}} f)(u) - (P_{\mathrm{even}} f)(v) = \frac{f(2u)}{2u} - \frac{f(2v)}{2v} = \frac{f(2u) - f(2v)}{2u} + f(2v) \Big( \frac{1}{2u} - \frac{1}{2v} \Big) =: D_1(u,v) + D_2(u,v).$$
We estimate the contributions of $D_1$ and $D_2$ separately.
Step 2: Oscillatory term $D_1$. Using the scaling of $W_\alpha$, we have
$$W_\alpha(2u, 2v) = 2^{1-\alpha}\, W_\alpha(u,v), \qquad W_\alpha(u,v) = 2^{-(1-\alpha)}\, W_\alpha(2u, 2v).$$
Thus
$$W_\alpha(u,v)\, |D_1(u,v)| = W_\alpha(u,v)\, \frac{|f(2u) - f(2v)|}{2u} = 2^{-(1-\alpha)}\, \frac{W_\alpha(2u,2v)}{2u}\, |f(2u) - f(2v)|.$$
Since $u \in I_j = [2^j, 2^{j+1})$, we have $u \ge 2^j$ and hence
$$\frac{1}{2u} \le \frac{1}{2 \cdot 2^j} = 2^{-(j+1)}.$$
Therefore
$$W_\alpha(u,v)\, |D_1(u,v)| \le 2^{-(1-\alpha)}\, 2^{-(j+1)}\, W_\alpha(2u,2v)\, |f(2u) - f(2v)|.$$
Now $2u, 2v \in I_{j+1} = [2^{j+1}, 2^{j+2})$, so
$$W_\alpha(2u,2v)\, |f(2u) - f(2v)| \le \delta_{j+1}(f).$$
Taking the supremum over $u, v \in I_j$,
$$\delta_j\big(P_{\mathrm{even}} f; D_1\big) := \sup_{u \ne v \in I_j} W_\alpha(u,v)\, |D_1(u,v)| \le 2^{-(1-\alpha)}\, 2^{-(j+1)}\, \delta_{j+1}(f).$$
Multiply by $\vartheta^{j}$ and sum over $j \ge 0$:
$$\sum_{j \ge 0} \vartheta^{j}\, \delta_j\big(P_{\mathrm{even}} f; D_1\big) \le 2^{-(1-\alpha)}\, 2^{-1} \sum_{j \ge 0} (\vartheta\, 2^{-1})^{j}\, \delta_{j+1}(f).$$
Change index $k = j + 1$, so $j = k - 1$:
$$= 2^{-(2-\alpha)} \sum_{k \ge 1} (\vartheta\, 2^{-1})^{k-1}\, \delta_k(f).$$
Since $(\vartheta\, 2^{-1})^{k-1} = \vartheta^{k-1}\, 2^{-(k-1)} \le \vartheta^{k-1}$ and
$$\sum_{k \ge 1} \vartheta^{k-1}\, \delta_k(f) = \vartheta^{-1} \sum_{k \ge 1} \vartheta^{k}\, \delta_k(f) \le \vartheta^{-1}\, [f]_{\mathrm{osc}},$$
we obtain
$$\sum_{j \ge 0} \vartheta^{j}\, \delta_j\big(P_{\mathrm{even}} f; D_1\big) \le 2^{-(2-\alpha)}\, \vartheta^{-1}\, [f]_{\mathrm{osc}}.$$
Thus
$$[P_{\mathrm{even}} f]_{\mathrm{osc}}^{(D_1)} \le \lambda_{\mathrm{even}}\, [f]_{\mathrm{osc}}, \qquad \lambda_{\mathrm{even}} := 2^{-(2-\alpha)}\, \vartheta^{-1}.$$
In particular $\lambda_{\mathrm{even}} < 1$ whenever $\vartheta > 2^{-(2-\alpha)}$.
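Step 3: Denominator term $D_2$: weak control. For $u > v$ (the case $u < v$ is symmetric),
$$\Big| \frac{1}{2u} - \frac{1}{2v} \Big| = \frac{|u - v|}{2uv},$$
so
$$|D_2(u,v)| = |f(2v)|\, \frac{|u - v|}{2uv}.$$
Using
$$W_\alpha(u,v) = \frac{uv}{|u - v|\,(u + v)^{\alpha}},$$
we get the exact simplification
$$W_\alpha(u,v)\, |D_2(u,v)| = \frac{|f(2v)|}{2\,(u + v)^{\alpha}}.$$
For $u, v \in I_j$ we have $u + v \ge 2 \cdot 2^j = 2^{j+1}$, hence
$$W_\alpha(u,v)\, |D_2(u,v)| \le 2^{-(1+\alpha)}\, 2^{-\alpha j}\, |f(2v)| =: C_\alpha\, 2^{-\alpha j}\, |f(2v)|.$$
Taking the supremum over $u, v \in I_j$ gives
$$\delta_j\big(P_{\mathrm{even}} f; D_2\big) \le C_\alpha\, 2^{-\alpha j} \sup_{w \in I_{j+1}} |f(w)|.$$
To bound $\sup_{w \in I_{j+1}} |f(w)|$ via the weak norm, observe that for $w \in I_{j+1}$,
$$|f(w)| \le w^{\sigma}\, |f(w)|\, w^{-\sigma} \lesssim 2^{\sigma(j+1)} \sum_{n \in I_{j+1}} |f(n)|\, n^{-\sigma}.$$
By the definition of $[f]_{\mathrm{mass}}$,
$$\sum_{n \in I_{j+1}} |f(n)|\, n^{-\sigma} \le \eta^{-(j+1)}\, [f]_{\mathrm{mass}},$$
so
$$\sup_{w \in I_{j+1}} |f(w)| \lesssim 2^{\sigma(j+1)}\, \eta^{-(j+1)}\, [f]_{\mathrm{mass}} = 2^{\sigma}\, \big(2^{\sigma} \eta^{-1}\big)^{j}\, \eta^{-1}\, [f]_{\mathrm{mass}}.$$
Combining,
$$\delta_j\big(P_{\mathrm{even}} f; D_2\big) \le C_{\alpha,\sigma}\, \big(2^{-\alpha}\, 2^{\sigma}\, \eta^{-1}\big)^{j}\, \eta^{-1}\, [f]_{\mathrm{mass}} = C_{\alpha,\sigma}\, \big(2^{\sigma-\alpha}\, \eta^{-1}\big)^{j}\, \eta^{-1}\, [f]_{\mathrm{mass}}.$$
Multiplying by $\vartheta^{j}$ and summing in $j$,
$$\sum_{j \ge 0} \vartheta^{j}\, \delta_j\big(P_{\mathrm{even}} f; D_2\big) \le C_{\alpha,\sigma}\, \eta^{-1} \sum_{j \ge 0} \big(\vartheta\, 2^{\sigma-\alpha}\, \eta^{-1}\big)^{j}\, [f]_{\mathrm{mass}}.$$
Choosing $\eta > 1$ so that
$$\vartheta\, 2^{\sigma-\alpha}\, \eta^{-1} < 1,$$
the geometric series converges and we obtain
$$[P_{\mathrm{even}} f]_{\mathrm{osc}}^{(D_2)} \le C_{\mathrm{even}}^{(2)} \big( \|f\|_{\sigma} + [f]_{\mathrm{mass}} \big),$$
for some constant $C_{\mathrm{even}}^{(2)} > 0$ depending only on $\alpha, \vartheta, \sigma, \eta$ (we absorb $\|f\|_{\sigma}$ for later convenience). Adding the contributions of $D_1$ and $D_2$,
$$[P_{\mathrm{even}} f]_{\mathrm{osc}} \le [P_{\mathrm{even}} f]_{\mathrm{osc}}^{(D_1)} + [P_{\mathrm{even}} f]_{\mathrm{osc}}^{(D_2)} \le \lambda_{\mathrm{even}}\, [f]_{\mathrm{osc}} + C_{\mathrm{even}} \big( \|f\|_{\sigma} + [f]_{\mathrm{mass}} \big),$$
with $\lambda_{\mathrm{even}} = 2^{-(2-\alpha)}\, \vartheta^{-1}$ and $C_{\mathrm{even}} = C_{\mathrm{even}}^{(2)}$, which proves (39). □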
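The two ingredients of the even-branch estimate, the scaling identity $W_\alpha(2u,2v) = 2^{1-\alpha} W_\alpha(u,v)$ and the constant $\lambda_{\mathrm{even}} = 2^{-(2-\alpha)}\vartheta^{-1}$, are easy to evaluate numerically. The sketch below is illustrative only; the function names and the sample parameter values are assumptions chosen here, not values fixed by the lemma.

```python
def W(u: float, v: float, alpha: float) -> float:
    """Tree weight W_alpha(u, v) = u v / (|u - v| (u + v)^alpha), for u != v."""
    return u * v / (abs(u - v) * (u + v) ** alpha)


def lambda_even(alpha: float, theta: float) -> float:
    """Even-branch constant lambda_even = 2^{-(2 - alpha)} / theta."""
    return 2.0 ** (-(2.0 - alpha)) / theta


def contraction_threshold(alpha: float) -> float:
    """lambda_even < 1 exactly when theta exceeds this value, 2^{-(2 - alpha)}."""
    return 2.0 ** (-(2.0 - alpha))


alpha = 0.5
# Scaling identity on a sample pair (u, v) = (6, 10).
scaling_gap = abs(W(12, 20, alpha) - 2 ** (1 - alpha) * W(6, 10, alpha))

# lambda_even for two sample values of theta (illustrative choices).
lam_half = lambda_even(alpha, 0.5)         # theta above the threshold
lam_twentieth = lambda_even(alpha, 0.05)   # theta below the threshold
```

With $\alpha = 1/2$ the threshold $2^{-(2-\alpha)}$ is about $0.354$, so `lam_half` is below $1$ while `lam_twentieth` is above $1$.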
The odd branch requires more care because it shifts indices from n to ( n 1 ) / 3 and only acts on the congruence class n 4 ( mod 6 ) . Its effect is nonetheless small once weighted by ϑ j .

4.4.2. Odd-Branch Distortion and Contraction Estimates

Lemma 4.8
(Odd–branch distortion on scale blocks). Let $0 < \alpha < 1$ and let $W_\alpha(u,v)$ be the tree weight (36). If $n \equiv 4 \pmod 6$ and $n \in I_j = [6^j, 2 \cdot 6^j)$, then the odd preimage $m = (n-1)/3$ satisfies $m \asymp 6^{j-1}$. Moreover, there exists a constant $C_\alpha > 0$ depending only on $\alpha$ such that for every pair $n_1, n_2 \in I_j$ lying on the same ray and $m_i = (n_i - 1)/3$,
$$W_\alpha(m_1, m_2) \le C_\alpha\, W_\alpha(n_1, n_2). \tag{40}$$
Proof. 
Fix $j \ge 1$ and $n \in I_j = [6^j, 2 \cdot 6^j)$ with $n \equiv 4 \pmod 6$, so $m = (n-1)/3$. Then
$$\frac{6^j - 1}{3} \le m < \frac{2 \cdot 6^j - 1}{3},$$
hence for all sufficiently large $j$ we have
$$6^{j-1} \le m \le 4 \cdot 6^{j-1},$$
so $m \asymp 6^{j-1}$ as claimed.
Now take $n_1, n_2 \in I_j$ on the same ray and define
$$m_i = \frac{n_i - 1}{3}, \qquad i = 1, 2.$$
Then
$$n_i = 3 m_i + 1, \qquad |n_1 - n_2| = 3\, |m_1 - m_2|.$$
We estimate the ratio
$$R := \frac{W_\alpha(m_1, m_2)}{W_\alpha(n_1, n_2)} = \frac{m_1 m_2}{n_1 n_2} \cdot \frac{|n_1 - n_2|}{|m_1 - m_2|} \cdot \frac{(n_1 + n_2)^{\alpha}}{(m_1 + m_2)^{\alpha}}.$$
Product factor. From $6^{j-1} \le m_i \le 4 \cdot 6^{j-1}$ and $6^j \le n_i \le 2 \cdot 6^j$ we get
$$m_1 m_2 \le (4 \cdot 6^{j-1})^2 = 16 \cdot 6^{2(j-1)}, \qquad n_1 n_2 \ge 6^{2j}.$$
Thus
$$\frac{m_1 m_2}{n_1 n_2} \le \frac{16 \cdot 6^{2(j-1)}}{6^{2j}} = \frac{16}{6^2} = \frac{4}{9}. \tag{41}$$
Difference factor. We have the exact identity
$$\frac{|n_1 - n_2|}{|m_1 - m_2|} = 3.$$
Sum factor. Again using the scale bounds, for $m_i$ we have
$$2 \cdot 6^{j-1} \le m_1 + m_2 \le 8 \cdot 6^{j-1},$$
and for $n_i$,
$$2 \cdot 6^j \le n_1 + n_2 \le 4 \cdot 6^j.$$
Therefore
$$\frac{n_1 + n_2}{m_1 + m_2} \le \frac{4 \cdot 6^j}{2 \cdot 6^{j-1}} = \frac{4 \cdot 6}{2} = 12,$$
so
$$\frac{(n_1 + n_2)^{\alpha}}{(m_1 + m_2)^{\alpha}} \le 12^{\alpha}. \tag{42}$$
Combining (41), the difference factor, and (42), we obtain
$$R \le \frac{4}{9} \cdot 3 \cdot 12^{\alpha} = \frac{4}{3}\, 12^{\alpha}.$$
Thus we may take
$$C_\alpha := \frac{4}{3}\, 12^{\alpha},$$
which depends only on $\alpha$ and is finite for all $0 < \alpha < 1$. This yields
$$W_\alpha(m_1, m_2) \le C_\alpha\, W_\alpha(n_1, n_2),$$
which is (40). □
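The distortion ratio $R$ can be computed by brute force on a single block. The sketch below is illustrative and rests on stated assumptions: the names `C_alpha` and `max_ratio` are hypothetical, and every pair of points $n_1, n_2 \in I_j$ with $n_i \equiv 4 \pmod 6$ is tested, which ignores the same-ray restriction of the lemma and so can only overestimate the maximum.

```python
def W(u: float, v: float, alpha: float) -> float:
    """Tree weight W_alpha(u, v) = u v / (|u - v| (u + v)^alpha), for u != v."""
    return u * v / (abs(u - v) * (u + v) ** alpha)


def C_alpha(alpha: float) -> float:
    """Constant from the proof: (4/9) * 3 * 12^alpha = (4/3) * 12^alpha."""
    return (4.0 / 3.0) * 12.0 ** alpha


def max_ratio(j: int, alpha: float) -> float:
    """Largest W(m1, m2) / W(n1, n2) over pairs in I_j with n = 4 (mod 6)."""
    pts = [n for n in range(6 ** j, 2 * 6 ** j) if n % 6 == 4]
    best = 0.0
    for i, n1 in enumerate(pts):
        for n2 in pts[i + 1:]:
            m1, m2 = (n1 - 1) // 3, (n2 - 1) // 3   # odd preimages
            best = max(best, W(m1, m2, alpha) / W(n1, n2, alpha))
    return best


ratio_j2 = max_ratio(2, 0.5)   # block I_2 = [36, 72) has six admissible points
bound = C_alpha(0.5)
```

At $\alpha = 1/2$ the helper `C_alpha` evaluates to $\tfrac{4}{3}\sqrt{12} = 8/\sqrt{3} \approx 4.62$, and the observed ratio on $I_2$ stays below it.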
Lemma 4.9
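(Odd branch on $\mathcal{B}_{\mathrm{tree},\sigma}$). Let $0 < \alpha < 1$, $0 < \vartheta < 1$, $\sigma > 1$, and $\eta > 1$. Consider the odd-branch operator
$$(P_{\mathrm{odd}} f)(n) = \mathbf{1}_{\{n \equiv 4\ (6)\}}\, \frac{f\big(\tfrac{n-1}{3}\big)}{(n-1)/3}.$$
Then there exist constants $0 < \lambda_{\mathrm{odd}} < 1$ and $C_{\mathrm{odd}} > 0$, depending only on $\alpha, \vartheta, \sigma, \eta$, such that for all $f \in \mathcal{B}_{\mathrm{tree},\sigma}$ one has
$$[P_{\mathrm{odd}} f]_{\mathrm{osc}} \le \lambda_{\mathrm{odd}}\, [f]_{\mathrm{osc}} + C_{\mathrm{odd}} \big( \|f\|_{\sigma} + [f]_{\mathrm{mass}} \big). \tag{43}$$
In particular, for $\alpha = \tfrac{1}{2}$ one can take
$$\lambda_{\mathrm{odd}} \le C_0\, \vartheta,$$
where $C_0 = 16 / 3^{3/2}$ is the odd-branch distortion constant from Lemma 4.14.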
Proof. 
Write
δ j ( g ) : = sup m n I j W α ( m , n ) | g ( m ) g ( n ) | , [ g ] osc = j 0 ϑ j δ j ( g ) ,
so that [ P odd f ] osc = j 0 ϑ j δ j ( P odd f ) . We bound δ j ( P odd f ) in terms of δ k ( f ) at nearby scales and the weak/mass terms.
Fix j 0 and m , n I j = [ 6 j , 2 · 6 j ) with m n . We split into three cases.
Case 1: neither m nor n is 4 ( mod 6 ) . Then P odd f ( m ) = P odd f ( n ) = 0 , so this pair contributes nothing to δ j ( P odd f ) .
Case 2: exactly one of m , n is 4 ( mod 6 ) . Without loss of generality, assume m 4 ( mod 6 ) and n ¬ 4 ( mod 6 ) , and set k : = ( m 1 ) / 3 . Then
P odd f ( m ) P odd f ( n ) = f ( k ) k ,
and therefore
W α ( m , n ) P odd f ( m ) P odd f ( n ) = W α ( m , n ) | f ( k ) | k .
As in the previous estimates for the tree seminorm, one has W α ( m , n ) 6 ( 2 α ) j and k 6 j 1 , so
W α ( m , n ) | f ( k ) | k 6 ( 1 α ) j | f ( k ) | .
Multiplying by ϑ j and summing over j , one obtains a contribution bounded by
j 0 ϑ j sup m , n I j exactly one 4 ( 6 ) W α ( m , n ) P odd f ( m ) P odd f ( n ) C odd , 1 f σ + [ f ] mass ,
for a suitable constant C odd , 1 depending on α , ϑ , σ , η . Thus Case 2 contributes only to the weak/mass error term in (43).
Case 3: both m and n are 4 ( mod 6 ) . Set
m = m 1 3 , n = n 1 3 ,
so that m , n I j 1 and
P odd f ( m ) = f ( m ) m , P odd f ( n ) = f ( n ) n .
We decompose
f ( m ) m f ( n ) n = f ( m ) f ( n ) m + f ( n ) 1 m 1 n = : D 1 + D 2 .
Case 3a: the D 1 term (oscillatory contribution). Here
D 1 = f ( m ) f ( n ) m ,
and we want to control
W α ( m , n ) | D 1 | = W α ( m , n ) | f ( m ) f ( n ) | m .
For α = 1 2 , Lemma 4.14 gives the two-sided distortion bound
1 C 0 W 1 / 2 ( m , n ) W 1 / 2 ( m , n ) C 0 , C 0 = 16 3 3 / 2 .
Using the upper bound, we obtain
W 1 / 2 ( m , n ) | D 1 | C 0 W 1 / 2 ( m , n ) | f ( m ) f ( n ) | m .
Since m 6 j 1 for m I j , we have 1 / m 6 ( j 1 ) , so
W 1 / 2 ( m , n ) | D 1 | C 0 6 ( j 1 ) W 1 / 2 ( m , n ) | f ( m ) f ( n ) | .
Taking the supremum over m , n I j with m n 4 ( mod 6 ) gives
δ j ( P odd f ; D 1 ) C 0 6 ( j 1 ) δ j 1 ( f ) ,
for some constant C 0 comparable to C 0 .
Multiplying by ϑ j and summing over j 1 yields
j 1 ϑ j δ j ( P odd f ; D 1 ) C 0 ϑ k 0 ( ϑ / 6 ) k δ k ( f ) C 0 ϑ [ f ] osc ,
with C 0 depending only on C 0 and ϑ (and uniformly bounded as long as ϑ < 6 ). Thus the D 1 -part of Case 3 contributes a term of the form λ odd ( 1 ) [ f ] osc with λ odd ( 1 ) C 0 ϑ .
Case 3b: the D 2 term (denominator contribution). For
D 2 = f ( n ) 1 m 1 n
one uses again the scale relations m , n 6 j 1 and the fact that | m n | C 6 j 1 to deduce
W α ( m , n ) | D 2 | 6 α j | f ( n ) | .
As in the even-branch analysis, the amplitude | f ( n ) | on blocks I j 1 is controlled by a combination of the weak norm f σ and the mass seminorm [ f ] mass , and the geometric factor 6 α j ensures convergence of the sum over j . Thus there exists a constant C odd , 2 > 0 such that
j 0 ϑ j δ j ( P odd f ; D 2 ) C odd , 2 f σ + [ f ] mass .
Conclusion. Combining Cases 1–3, we obtain
[ P odd f ] osc = j 0 ϑ j δ j ( P odd f ) λ odd [ f ] osc + C odd f σ + [ f ] mass ,
with
λ odd : = λ odd ( 1 ) C 0 ϑ , C odd : = C odd , 1 + C odd , 2 ,
depending only on α , ϑ , σ , η . For α = 1 2 the explicit distortion constant C 0 = 16 / 3 3 / 2 from Lemma 4.14 can be absorbed into C 0 , and in particular one may choose λ odd C 0 ϑ after adjusting constants. Choosing ϑ > 0 small enough so that λ odd < 1 gives the desired Lasota–Yorke contraction. □

4.5. Derivation of the Global Lasota–Yorke Inequality

Lemma 4.10
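(Invariance and boundedness on $\mathcal{B}_{\mathrm{tree},\sigma}$). Let $0 < \alpha < 1$, $\sigma > 1$, and $0 < \vartheta < 1$ satisfy
$$\vartheta\, 6^{1 - \alpha + \sigma} < 1. \tag{44}$$
Then the backward Collatz transfer operator $P$ maps $\mathcal{B}_{\mathrm{tree},\sigma}$ into itself and is bounded: there exists $C > 0$ such that
$$\|Pf\|_{\mathrm{tree},\sigma} \le C\, \|f\|_{\mathrm{tree},\sigma} \quad \text{for all } f \in \mathcal{B}_{\mathrm{tree},\sigma},$$
where
$$\|f\|_{\mathrm{tree},\sigma} := [f]_{\mathrm{tree}} + \|f\|_{\sigma}.$$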
Proof. 
Recall the even/odd decomposition
( P f ) ( n ) = ( P even f ) ( n ) + ( P odd f ) ( n ) = f ( 2 n ) 2 n + 1 { n 4 ( 6 ) } f n 1 3 ( n 1 ) / 3 .
Step 1: Weighted σ 1 bound. The weak norm estimate is given by the same computation as in the weighted σ 1 lemma: for all f B tree , σ ,
P f σ = P even f + P odd f σ ( 2 σ + 3 σ ) f σ .
This uses only the change of variables m = 2 n for the even branch and m = ( n 1 ) / 3 for the odd branch.
Step 2: Tree seminorm bound via the even/odd lemmas. By subadditivity of [ · ] tree ,
[ P f ] tree [ P even f ] tree + [ P odd f ] tree .
The even-branch lemma (even branch on B tree , σ ) states that under the admissibility condition (44) there exists C even > 0 , depending only on α , σ , ϑ , such that
[ P even f ] tree C even f σ for all f B tree , σ .
The proof is obtained by estimating each block seminorm Δ j ( P even f ) and summing the geometric series j 0 ( ϑ 6 1 α + σ ) j , which converges precisely when (44) holds.
Similarly, the odd-branch lemma (odd branch on B tree , σ ) gives a constant C odd > 0 , depending only on α , σ , ϑ , such that
[ P odd f ] tree C odd f σ for all f B tree , σ ,
again using the same geometric factor ϑ 6 1 α + σ and the admissibility (44).
Combining (46) and (47) we obtain
[ P f ] tree ( C even + C odd ) f σ .
Step 3: Combine strong and weak parts. By definition,
P f tree , σ = [ P f ] tree + P f σ .
Using (45) and (48),
P f tree , σ ( C even + C odd ) f σ + ( 2 σ + 3 σ ) f σ .
Since f tree , σ = [ f ] tree + f σ f σ , we have
f σ f tree , σ ,
and therefore
P f tree , σ C even + C odd + 2 σ + 3 σ f tree , σ .
Thus P is bounded on B tree , σ with operator norm at most
C : = C even + C odd + 2 σ + 3 σ ,
which depends only on α , σ , ϑ . □
Proposition 4.11
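(Lasota–Yorke inequality on $\mathcal{B}_{\mathrm{tree},\sigma}$). Let $0 < \alpha < 1$, $0 < \vartheta < 1$, and $\sigma > 1$ satisfy the admissibility condition
$$\vartheta\, 6^{1 - \alpha + \sigma} < 1. \tag{49}$$
Then there exists a constant $C_{LY,\sigma} > 0$ such that for all $f \in \mathcal{B}_{\mathrm{tree},\sigma}$,
$$[Pf]_{\mathrm{tree}} \le C_{LY,\sigma}\, \|f\|_{\sigma}. \tag{50}$$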
In particular, for every n 1 one has
[ P n f ] tree C LY , σ P n 1 f σ C LY , σ ( 2 σ + 3 σ ) n 1 f σ , f B tree , σ ,
so each iterate P n is smoothing in the strong seminorm [ · ] tree with a quantitative bound in terms of the weak norm · σ of the previous iterate.
Proof. 
We use the even/odd decomposition P = P even + P odd . From the estimates established in the proofs of the even and odd branch lemmas, and under the admissibility condition (49), there exist constants C even , C odd > 0 such that for all f B tree , σ ,
[ P even f ] tree C even f σ , [ P odd f ] tree C odd f σ .
The convergence of the geometric series j 0 ( ϑ 6 1 α + σ ) j is precisely what guarantees that the constants C even and C odd are finite.
By subadditivity of [ · ] tree ,
[ P f ] tree = [ P even f + P odd f ] tree [ P even f ] tree + [ P odd f ] tree C even + C odd f σ .
Setting
C LY , σ : = C even + C odd
gives (50).
For the iterates, apply (50) with f replaced by P n 1 f :
[ P n f ] tree = [ P ( P n 1 f ) ] tree C LY , σ P n 1 f σ .
From the weighted σ 1 bound
P f σ ( 2 σ + 3 σ ) f σ
(see (45)), iterating gives P n 1 f σ ( 2 σ + 3 σ ) n 1 f σ , which yields (51). □
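Remark 4.12 (Parameter window).
The lift from $\|\cdot\|_{\ell^1}$ to $\|\cdot\|_{\sigma}$ in the remainder terms occurs through the same geometric factor that appears in the proofs of the even and odd branch bounds, namely $\vartheta\, 6^{1 - \alpha + \sigma}$. The only requirement is the admissibility condition
$$\vartheta\, 6^{1 - \alpha + \sigma} < 1,$$
which ensures convergence of $\sum_{j \ge 0} (\vartheta\, 6^{1 - \alpha + \sigma})^{j}$ and hence finiteness of $C_{\mathrm{even}}$ and $C_{\mathrm{odd}}$.
For fixed $\alpha$ and $\sigma$ this simply means
$$0 < \vartheta < 6^{-(1 - \alpha + \sigma)}.$$
A convenient concrete choice (used later) is
$$(\alpha, \vartheta, \sigma) = \big( \tfrac{1}{2},\, \tfrac{1}{20},\, 1 + \varepsilon \big) \quad \text{with any sufficiently small } \varepsilon > 0,$$
since then
$$\vartheta\, 6^{1 - \alpha + \sigma} = \tfrac{1}{20}\, 6^{\frac{3}{2} + \varepsilon} < 1$$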
for all ε in a small interval ( 0 , ε 0 ] . Together with the explicit odd–branch distortion constant computed in Section 5 and the compact embedding B tree , σ σ 1 , this smoothing Lasota–Yorke inequality is exactly what is needed to apply the Ionescu–Tulcea–Marinescu/Hennion theory and deduce quasi–compactness of P on B tree , σ .
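The admissible window for $\varepsilon$ in the concrete choice $(\alpha, \vartheta, \sigma) = (\tfrac12, \tfrac1{20}, 1+\varepsilon)$ can be made explicit by solving $\vartheta\, 6^{2 - \alpha + \varepsilon} = 1$ for $\varepsilon$. The sketch below is illustrative; the function names are assumptions introduced here.

```python
import math


def admissibility(alpha: float, theta: float, sigma: float) -> float:
    """Geometric factor theta * 6^(1 - alpha + sigma); the condition is that it is < 1."""
    return theta * 6.0 ** (1.0 - alpha + sigma)


def epsilon_max(alpha: float, theta: float) -> float:
    """Supremum of eps with theta * 6^(1 - alpha + 1 + eps) < 1,
    i.e. eps < log(1/theta) / log(6) - (2 - alpha)."""
    return -math.log(theta) / math.log(6.0) - (2.0 - alpha)


eps0 = epsilon_max(0.5, 1.0 / 20.0)
inside = admissibility(0.5, 1.0 / 20.0, 1.1)    # sigma = 1 + 0.1
outside = admissibility(0.5, 1.0 / 20.0, 1.2)   # sigma = 1 + 0.2
```

For these parameters `eps0` is roughly $0.172$, so $\sigma = 1.1$ satisfies the admissibility condition while $\sigma = 1.2$ does not.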
Corollary 4.13
(Essential spectral radius on $B_{\mathrm{tree},\sigma}$). Let $0<\alpha<1$, $0<\vartheta<1$, and $\sigma>1$ satisfy the admissibility condition (44). Assume the Lasota–Yorke inequality from Proposition 4.11 and the compact embedding $B_{\mathrm{tree},\sigma} \hookrightarrow \ell^{1}_{\sigma}$ (from Lemma 4.5). Then $P : B_{\mathrm{tree},\sigma} \to B_{\mathrm{tree},\sigma}$ is quasi-compact, and its essential spectral radius satisfies
\[ \rho_{\mathrm{ess}}\bigl(P|_{B_{\mathrm{tree},\sigma}}\bigr) = 0. \tag{53} \]
Proof. 
By Proposition 4.11, for every $f \in B_{\mathrm{tree},\sigma}$,
\[ [P f]_{\mathrm{tree}} \le C_{\mathrm{LY},\sigma}\,\|f\|_{\sigma}, \]
where $[\,\cdot\,]_{\mathrm{tree}}$ is the strong seminorm and $\|\cdot\|_{\sigma}$ is the weak norm on $B_{\mathrm{tree},\sigma}$. This is a Lasota–Yorke (Doeblin–Fortet) inequality with strong contraction coefficient equal to 0, i.e.
\[ \|P f\|_{\mathrm{strong}} := [P f]_{\mathrm{tree}} \le 0\cdot[f]_{\mathrm{tree}} + C_{\mathrm{LY},\sigma}\,\|f\|_{\mathrm{weak}}, \qquad \|f\|_{\mathrm{weak}} := \|f\|_{\sigma}. \]
By Lemma 4.5, the unit ball of $B_{\mathrm{tree},\sigma}$ is relatively compact in $\ell^{1}_{\sigma}$, so the inclusion
\[ B_{\mathrm{tree},\sigma} \hookrightarrow \ell^{1}_{\sigma} \]
is compact. Thus the hypotheses of the Ionescu–Tulcea–Marinescu/Hennion quasi-compactness theorem are satisfied with strong contraction constant $a = 0$. The theorem then implies that $P$ is quasi-compact on $B_{\mathrm{tree},\sigma}$ and that its essential spectral radius is bounded above by $a$, i.e.
\[ \rho_{\mathrm{ess}}\bigl(P|_{B_{\mathrm{tree},\sigma}}\bigr) \le 0. \]
Since the essential spectral radius is always nonnegative, we conclude
\[ \rho_{\mathrm{ess}}\bigl(P|_{B_{\mathrm{tree},\sigma}}\bigr) = 0, \]
which is (53). □

4.6. Quasi-Compactness and Spectral Gap for P

Lemma 4.14
(Odd-branch weight distortion at $\alpha = \tfrac{1}{2}$). Let $W_{\alpha}(m,n)$ be the tree weight and let $m' = (m-1)/3$, $n' = (n-1)/3$. For $\alpha = \tfrac{1}{2}$ there exists an absolute constant
\[ C_{0} = \frac{16}{3^{3/2}} < 3.1 \]
such that for all $m \equiv n \equiv 4 \pmod 6$, $m \ne n$, one has the two-sided distortion bounds
\[ \frac{1}{C_{0}} \le \frac{W_{1/2}(m',n')}{W_{1/2}(m,n)} \le C_{0}. \]
Consequently, the oscillatory part of the odd branch satisfies
\[ \lambda_{\mathrm{odd}}\bigl(\tfrac{1}{2},\vartheta\bigr) \le \frac{C_{0}}{\sqrt{6}}\,\vartheta, \]
as used in Lemma 4.9 and Lemma 4.15.
Proof. 
Substituting $\alpha = \tfrac{1}{2}$ into Lemma 4.8 gives the stated bounds. □
Lemma 4.15
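(Explicit odd-branch constant). For $\alpha = \tfrac{1}{2}$ and $\vartheta = \tfrac{1}{20}$ there exist constants $C_{\alpha} > 0$ and $C_{\mathrm{odd}} > 0$ such that for all $f \in B_{\mathrm{tree},\sigma}$,
\[ [P_{\mathrm{odd}} f]_{\mathrm{tree}} \le \lambda_{\mathrm{odd}}(\alpha,\vartheta)\,[f]_{\mathrm{tree}} + C_{\mathrm{odd}}\,\|f\|_{\sigma}, \tag{55} \]
with
\[ \lambda_{\mathrm{odd}}(\alpha,\vartheta) \le \frac{C_{\alpha}}{\sqrt{6}}\,\vartheta < 1. \]
Proof. 
This is a direct specialization of Lemma 4.9 to the parameter choice $\alpha = \tfrac{1}{2}$, $\vartheta = \tfrac{1}{20}$. Lemma 4.9 gives the Lasota–Yorke type estimate
\[ [P_{\mathrm{odd}} f]_{\mathrm{tree}} \le \lambda_{\mathrm{odd}}(\alpha,\vartheta)\,[f]_{\mathrm{tree}} + C_{\mathrm{odd}}\,\|f\|_{\sigma}, \]
with
\[ \lambda_{\mathrm{odd}}(\alpha,\vartheta) \le \frac{C_{\alpha}}{\sqrt{6}}\,\vartheta, \]
where $C_{\alpha} > 0$ is the odd-branch distortion constant defined in Lemma 4.14. For $\alpha = \tfrac{1}{2}$, Lemma 4.14 gives the explicit value
\[ C_{\alpha} = C_{0} = \frac{16}{3^{3/2}} < 3.1. \]
Substituting $\alpha = \tfrac{1}{2}$ and $\vartheta = \tfrac{1}{20}$ into the above bound,
\[ \lambda_{\mathrm{odd}}\bigl(\tfrac{1}{2},\tfrac{1}{20}\bigr) \le \frac{C_{0}}{\sqrt{6}}\cdot\frac{1}{20} < \frac{3.1}{\sqrt{6}}\cdot\frac{1}{20} < 1. \]
Thus (55) holds with $\lambda_{\mathrm{odd}}(\tfrac{1}{2},\tfrac{1}{20}) < 1$ and some constant $C_{\mathrm{odd}} > 0$ depending only on $\sigma$ and the block geometry, as asserted. □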
Proposition 4.16
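(Verified Lasota–Yorke contraction). Let $(\alpha,\vartheta) = \bigl(\tfrac{1}{2},\tfrac{1}{20}\bigr)$ and $\sigma > 1$ satisfy the admissibility condition (44), i.e.
\[ \vartheta\,6^{1-\alpha+\sigma} < 1. \]
Define
\[ \lambda_{\mathrm{LY}} := 2^{-(1-\alpha)}\,\vartheta + \lambda_{\mathrm{odd}}(\alpha,\vartheta), \qquad \lambda_{\mathrm{odd}}(\alpha,\vartheta) \le \frac{C_{0}}{\sqrt{6}}\,\vartheta, \]
with $C_{0} = 16/3^{3/2}$ from Lemma 4.14. Then $\lambda_{\mathrm{LY}} < 1$, and for all $f \in B_{\mathrm{tree},\sigma}$,
\[ [P f]_{\mathrm{tree}} \le \lambda_{\mathrm{LY}}\,[f]_{\mathrm{tree}} + C_{\mathrm{LY}}\,\|f\|_{\sigma}, \tag{57} \]
for some constant $C_{\mathrm{LY}} > 0$ depending only on the fixed parameters and the block geometry.
Proof. 
We use the decomposition $P = P_{\mathrm{even}} + P_{\mathrm{odd}}$ and the branchwise Lasota–Yorke estimates already established.
(1) Combine even and odd branch inequalities. For any $f \in B_{\mathrm{tree},\sigma}$,
\[ [P f]_{\mathrm{tree}} \le [P_{\mathrm{even}} f]_{\mathrm{tree}} + [P_{\mathrm{odd}} f]_{\mathrm{tree}}. \]
By the even-branch estimate (Lemma 4.7, specialized to $B_{\mathrm{tree},\sigma}$ and the fixed parameters $\alpha = \tfrac{1}{2}$, $\vartheta = \tfrac{1}{20}$), there exists $C_{\mathrm{even}} > 0$ such that
\[ [P_{\mathrm{even}} f]_{\mathrm{tree}} \le 2^{-(1-\alpha)}\,\vartheta\,[f]_{\mathrm{tree}} + C_{\mathrm{even}}\,\|f\|_{\sigma}. \tag{58} \]
By the explicit odd-branch lemma (Lemma 4.15) for $\alpha = \tfrac{1}{2}$ and $\vartheta = \tfrac{1}{20}$, there exist $C_{\alpha} > 0$ and $C_{\mathrm{odd}} > 0$ such that
\[ [P_{\mathrm{odd}} f]_{\mathrm{tree}} \le \lambda_{\mathrm{odd}}(\alpha,\vartheta)\,[f]_{\mathrm{tree}} + C_{\mathrm{odd}}\,\|f\|_{\sigma}, \tag{59} \]
with
\[ \lambda_{\mathrm{odd}}(\alpha,\vartheta) \le \frac{C_{\alpha}}{\sqrt{6}}\,\vartheta = \frac{C_{0}}{\sqrt{6}}\,\vartheta < 1, \]
where $C_{0} = 16/3^{3/2}$ is given by Lemma 4.14.
Adding (58) and (59),
\[ [P f]_{\mathrm{tree}} \le \bigl(2^{-(1-\alpha)}\,\vartheta + \lambda_{\mathrm{odd}}(\alpha,\vartheta)\bigr)\,[f]_{\mathrm{tree}} + (C_{\mathrm{even}} + C_{\mathrm{odd}})\,\|f\|_{\sigma}. \]
Define
\[ \lambda_{\mathrm{LY}} := 2^{-(1-\alpha)}\,\vartheta + \lambda_{\mathrm{odd}}(\alpha,\vartheta), \qquad C_{\mathrm{LY}} := C_{\mathrm{even}} + C_{\mathrm{odd}}, \]
and we obtain (57).
(2) Verification that $\lambda_{\mathrm{LY}} < 1$. With $(\alpha,\vartheta) = (\tfrac{1}{2},\tfrac{1}{20})$,
\[ 2^{-(1-\alpha)}\,\vartheta = 2^{-1/2}\cdot\frac{1}{20} = \frac{1}{20\sqrt{2}} < 0.0354. \]
From Lemma 4.15 and Lemma 4.14 we have
\[ \lambda_{\mathrm{odd}}\bigl(\tfrac{1}{2},\tfrac{1}{20}\bigr) \le \frac{C_{0}}{\sqrt{6}}\cdot\frac{1}{20}, \qquad C_{0} = \frac{16}{3^{3/2}} < 3.1. \]
Numerically,
\[ \frac{C_{0}}{\sqrt{6}}\cdot\frac{1}{20} < \frac{3.1}{\sqrt{6}}\cdot\frac{1}{20} < 0.0633. \]
Therefore
\[ \lambda_{\mathrm{LY}} = 2^{-(1-\alpha)}\,\vartheta + \lambda_{\mathrm{odd}}(\alpha,\vartheta) < 0.0354 + 0.0633 < 0.1 < 1. \]
Thus $\lambda_{\mathrm{LY}}$ is a strict contraction factor depending only on the fixed parameters, and the proposition follows. □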
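The arithmetic in step (2) is easy to reproduce. The sketch below assumes the odd-branch bound has the form $(C_{0}/\sqrt{6})\,\vartheta$, which is the reading consistent with the numerical value of about 0.063 quoted in the proof; the variable names are illustrative only.

```python
import math

ALPHA, THETA = 0.5, 1.0 / 20.0

C0 = 16.0 / 3.0 ** 1.5                        # C_0 = 16 / 3^{3/2} (Lemma 4.14)
EVEN_PART = 2.0 ** (-(1.0 - ALPHA)) * THETA   # 2^{-(1-alpha)} * theta
ODD_BOUND = C0 / math.sqrt(6.0) * THETA       # assumed form: (C_0 / sqrt(6)) * theta
LAMBDA_LY_BOUND = EVEN_PART + ODD_BOUND       # upper bound for lambda_LY

if __name__ == "__main__":
    print(f"C0              = {C0:.6f}")
    print(f"even part       = {EVEN_PART:.6f}")
    print(f"odd bound       = {ODD_BOUND:.6f}")
    print(f"lambda_LY bound = {LAMBDA_LY_BOUND:.6f}")
```

This only verifies the numerical step; the inequalities feeding into it (Lemmas 4.7, 4.14 and 4.15) are taken as given.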
We now record the standard consequence of the Lasota–Yorke inequality and the compact embedding of $B_{\mathrm{tree}}$ into $\ell^{1}$.
Theorem 4.17
(Quasi-compactness on $B_{\mathrm{tree},\sigma}$). Let $0<\alpha<1$, $0<\vartheta<1$, and $\sigma>1$. Assume that the Lasota–Yorke constant
\[ \lambda(\alpha,\vartheta) := 2^{-(1-\alpha)} + \lambda_{\mathrm{odd}}(\alpha,\vartheta) \]
satisfies $\lambda(\alpha,\vartheta) < 1$, where $\lambda_{\mathrm{odd}}(\alpha,\vartheta)$ is as in Lemma 4.9. Then the backward transfer operator $P$ acting on $B_{\mathrm{tree},\sigma}$ is quasi-compact, and its essential spectral radius satisfies
\[ \rho_{\mathrm{ess}}\bigl(P|_{B_{\mathrm{tree},\sigma}}\bigr) \le \lambda(\alpha,\vartheta) < 1. \tag{61} \]
Proof. 
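We work on the Banach space $B_{\mathrm{tree},\sigma}$ with norm $\|\cdot\|_{\mathrm{tree},\sigma} = \|\cdot\|_{\sigma} + [\,\cdot\,]_{\mathrm{tree}}$, where $\|\cdot\|_{\sigma}$ is the weighted $\ell^{1}_{\sigma}$-norm and $[\,\cdot\,]_{\mathrm{tree}}$ is the tree seminorm defined in Section 4.3.
Step 1: Lasota–Yorke inequality. By Proposition 4.11 (applied in the weighted setting, with $\|f\|_{1}$ replaced by $\|f\|_{\sigma}$) we have, for all $f \in B_{\mathrm{tree},\sigma}$,
\[ [P f]_{\mathrm{tree}} \le \lambda(\alpha,\vartheta)\,[f]_{\mathrm{tree}} + C_{\mathrm{LY}}\,\|f\|_{\sigma}, \]
with $\lambda(\alpha,\vartheta) < 1$ by assumption. On the weak norm side, since $P$ is bounded on $\ell^{1}_{\sigma}$, there exists $C_{\sigma} > 0$ (e.g. $C_{\sigma} = \Lambda_{\sigma}$ from (18)) such that
\[ \|P f\|_{\sigma} \le C_{\sigma}\,\|f\|_{\sigma} \quad \text{for all } f \in B_{\mathrm{tree},\sigma}. \]
Thus $P$ satisfies a standard two-norm Lasota–Yorke inequality on $B_{\mathrm{tree},\sigma}$ with strong seminorm $\|\cdot\|_{s} := [\,\cdot\,]_{\mathrm{tree}}$ and weak norm $\|\cdot\|_{w} := \|\cdot\|_{\sigma}$:
\[ \|P f\|_{s} \le \lambda\,\|f\|_{s} + C_{\mathrm{LY}}\,\|f\|_{w}, \qquad \|P f\|_{w} \le C_{\sigma}\,\|f\|_{w}. \tag{64} \]
Step 2: Compact embedding. By Lemma 4.5, the embedding
\[ J : \bigl(B_{\mathrm{tree},\sigma}, \|\cdot\|_{\mathrm{tree},\sigma}\bigr) \hookrightarrow \bigl(\ell^{1}_{\sigma}, \|\cdot\|_{\sigma}\bigr) \]
is compact. Since $\|\cdot\|_{w} = \|\cdot\|_{\sigma}$ is exactly the weak norm used in (64), this shows that the unit ball of $B_{\mathrm{tree},\sigma}$ is relatively compact for the weak norm.
Step 3: Application of Ionescu–Tulcea–Marinescu / Hennion. We now invoke the standard quasi-compactness criterion (Ionescu–Tulcea and Marinescu theorem): if a bounded operator $T$ on a Banach space $X$ satisfies
(i) a Lasota–Yorke inequality $\|T x\|_{s} \le \lambda\,\|x\|_{s} + C\,\|x\|_{w}$ with $\lambda < 1$,
(ii) a weak bound $\|T x\|_{w} \le C\,\|x\|_{w}$, and
(iii) the injection $(X, \|\cdot\|_{s}) \hookrightarrow (X, \|\cdot\|_{w})$ has relatively compact unit ball,
then $T$ is quasi-compact on $X$ and its essential spectral radius satisfies
\[ \rho_{\mathrm{ess}}(T) \le \lambda. \]
Conditions (i)–(iii) are exactly (64) and Lemma 4.5 for $T = P$ and $X = B_{\mathrm{tree},\sigma}$. Therefore $P$ is quasi-compact on $B_{\mathrm{tree},\sigma}$ and
\[ \rho_{\mathrm{ess}}\bigl(P|_{B_{\mathrm{tree},\sigma}}\bigr) \le \lambda(\alpha,\vartheta) < 1, \]
which is (61). □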
Remark  4.18 (On the choice of parameters)
The explicit bound (60) shows that $\lambda_{\mathrm{odd}}(\alpha,\vartheta)$ decreases linearly with $\vartheta$. For fixed $\alpha$, one can therefore choose $\vartheta$ sufficiently small so that $\lambda(\alpha,\vartheta) < 1$, provided the constant $C_{\alpha}$ is effectively controlled. Subsequent sections make this optimization quantitative by computing $C_{\alpha}$ and exhibiting admissible parameter pairs $(\alpha,\vartheta)$ that give a strict spectral gap.
The Lasota–Yorke framework developed here supplies the functional-analytic backbone for the spectral approach to the Collatz problem: once explicit parameters with $\lambda(\alpha,\vartheta) < 1$ are verified, the quasi-compactness and spectral gap of $P$ on $B_{\mathrm{tree}}$ follow, and the spectral criteria of Section 4 can be invoked to constrain or rule out non-terminating configurations.

5. Invariant Profiles, Block Recursion, and Perron–Frobenius Rigidity

Having established in Section 4.4 that the backward Collatz operator $P$ is quasi-compact on the multi-scale tree space $B_{\mathrm{tree}}$, we now turn to the spectral consequences of this result. The Lasota–Yorke inequality ensures the existence of a spectral gap, which in turn controls the structure of invariant densities and the long-term behavior of the iterates $P^{k}$. The objective of this section is to characterize the invariant and quasi-invariant components of $P$, derive an effective block recursion for their scale-averaged coefficients, and demonstrate that the recursion enforces rigidity across the Collatz tree.
Throughout this section, $h \in B_{\mathrm{tree},\sigma}$ will denote an invariant density of $P$, i.e. a function satisfying $P h = h$. The analysis proceeds in several stages. First, we describe the structure of possible invariant profiles in the multiscale framework and show that the Lasota–Yorke inequality forces uniform flatness across scales. Next, we translate this flatness into an explicit two-sided recurrence relation for the block averages $c_{j}$. Finally, we verify that the coefficients of this recurrence satisfy a spectral bound consistent with the contraction constant $\lambda_{\mathrm{odd}}(\alpha,\vartheta)$ computed earlier.
Theorem 5.1
(Perron–Frobenius structure on $B_{\mathrm{tree},\sigma}$). Let $P$ be the backward Collatz transfer operator acting on $B_{\mathrm{tree},\sigma}$ with parameters $(\alpha,\vartheta,\sigma)$ chosen so that the Lasota–Yorke inequality and quasi-compactness hold. Then:
1. The spectral radius of $P$ equals 1, and 1 is a simple eigenvalue.
2. There exists a unique eigenvector $h \in B_{\mathrm{tree},\sigma}$ with $h > 0$ and $P h = h$, normalized by $\phi(h) = 1$.
3. There exists a unique positive eigenfunctional $\phi \in B_{\mathrm{tree},\sigma}^{*}$ such that $\phi \circ P = \phi$.
4. All other spectral values satisfy $|z| < 1$, and $P$ admits the spectral decomposition
\[ P = h \otimes \phi + Q, \qquad \rho(Q) < 1, \]
where $Q$ is quasi-compact.
Proof. 
We combine the Lasota–Yorke inequality on $B_{\mathrm{tree},\sigma}$ with standard Perron–Frobenius theory for positive quasi-compact operators.
Step 1: Spectral radius and quasi-compactness. By construction $P$ is a bounded linear operator on $B_{\mathrm{tree},\sigma}$ and is positive in the sense that $f \ge 0$ implies $P f \ge 0$. The Lasota–Yorke inequality on $B_{\mathrm{tree},\sigma}$ (Proposition 4.11, say) together with the compact embedding of the strong seminorm into the weak norm implies that $P$ is quasi-compact on $B_{\mathrm{tree},\sigma}$ with essential spectral radius strictly less than 1:
\[ \rho_{\mathrm{ess}}(P) < 1. \tag{65} \]
On the other hand, the logarithmic mass-preservation identity (Lemma 2.4) shows that the spectral radius of $P$ is at least 1; the boundedness of $P$ implies $\rho(P) \le 1$, hence
\[ \rho(P) = 1. \tag{66} \]
In particular, 1 lies in the spectrum of $P$ and, by (65), is an isolated spectral value.
Step 2: Existence of a positive eigenvector. Consider the positive cone
\[ \mathcal{C} := \{ f \in B_{\mathrm{tree},\sigma} : f \ge 0 \}, \]
which is closed, convex, and reproducing. Since $P$ is positive and $\rho(P) = 1$, the Krein–Rutman theorem for positive operators on Banach spaces implies the existence of a nonzero $h \in \mathcal{C}$ such that
\[ P h = h. \]
Moreover, $h$ can be chosen strictly positive in the sense that $h(n) > 0$ for all $n \in \mathbb{N}$: indeed, by the preimage structure of the Collatz map (Lemma 2.3) and the connectivity of the backward tree, any nontrivial $f \in \mathcal{C}$ is eventually propagated by iterates of $P$ to a function that is positive on every block $I_{j}$, so $P^{k} f > 0$ for all sufficiently large $k$. Replacing $h$ by $P^{k} h$ if necessary yields $h > 0$.
Step 3: Uniqueness and simplicity of the eigenvalue 1. We now show that 1 is a simple eigenvalue and that $h$ is unique up to scalar multiples. Suppose $g \in B_{\mathrm{tree},\sigma}$ satisfies $P g = g$. Decompose $g = g_{+} - g_{-}$ into positive and negative parts. Positivity of $P$ implies $P g_{\pm} = g_{\pm}$. By the strong positivity argument above, any nonzero $f \in \mathcal{C}$ with $P f = f$ must be strictly positive; hence $g_{+}$ and $g_{-}$ are both either 0 or strictly positive. If both were nonzero, then $g_{+}$ and $g_{-}$ would be linearly independent positive eigenvectors for the eigenvalue 1, and the positive cone would contain a two-dimensional face of eigenvectors. This contradicts the Krein–Rutman conclusion that the eigenspace associated with the spectral radius is one-dimensional. Therefore one of $g_{+}$, $g_{-}$ must vanish and $g$ is either nonnegative or nonpositive; by replacing $g$ by $-g$ if necessary, $g \ge 0$, and the strong positivity then forces $g$ to be a scalar multiple of $h$. Thus the eigenspace for the eigenvalue 1 is one-dimensional and spanned by $h$, and 1 is a simple eigenvalue. This proves (1) and the first part of (2) after normalizing by $\phi(h) = 1$ below.
Step 4: Dual eigenfunctional. Consider the dual operator $P^{*}$ acting on $B_{\mathrm{tree},\sigma}^{*}$. Since $P$ is positive, so is $P^{*}$ on the dual cone
\[ \mathcal{C}^{*} := \{ \Lambda \in B_{\mathrm{tree},\sigma}^{*} : \Lambda(f) \ge 0 \ \text{for all } f \in \mathcal{C} \}. \]
The quasi-compactness of $P$ implies quasi-compactness of $P^{*}$ on the dual space. By (66), $P^{*}$ also has spectral radius 1. Applying the same Krein–Rutman argument to $P^{*}$ yields a nonzero $\phi \in \mathcal{C}^{*}$ with
\[ \phi \circ P = \phi, \]
and with $\phi$ strictly positive on nonzero elements of $\mathcal{C}$. The same simplicity argument as in Step 3 shows that the eigenspace of $P^{*}$ for the eigenvalue 1 is one-dimensional and spanned by $\phi$. Normalizing by the condition $\phi(h) = 1$ gives the uniquely determined eigenpair $(h,\phi)$ appearing in the statement. This establishes (2) and (3).
Step 5: Spectral decomposition and spectral gap. Quasi-compactness of $P$ on $B_{\mathrm{tree},\sigma}$, together with (65) and the simplicity of the eigenvalue 1, implies that the spectrum of $P$ is contained in $\{1\} \cup \{ z : |z| < r \}$ for some $r < 1$. Let $\Pi$ denote the spectral projection onto the eigenspace associated with $\lambda = 1$; by the previous steps,
\[ \Pi f = h\,\phi(f), \qquad f \in B_{\mathrm{tree},\sigma}, \]
so that $\Pi = h \otimes \phi$ as a rank-one operator. Writing
\[ P = \Pi + Q = h \otimes \phi + Q, \]
we have $Q = P - \Pi$ and $Q \Pi = \Pi Q = 0$. The spectrum of $Q$ is contained in $\{ z : |z| < r \}$, so in particular
\[ \rho(Q) < 1. \]
Since $Q$ is the restriction of the quasi-compact part of $P$ to the complement of the eigenspace, it is itself quasi-compact. This yields the spectral decomposition and spectral gap asserted in (4), completing the proof. □
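The decomposition $P = h \otimes \phi + Q$ with $\rho(Q) < 1$ has a familiar finite-dimensional analogue, which may help fix ideas. The sketch below uses a hypothetical $3 \times 3$ positive matrix with columns summing to 1 as a toy stand-in for $P$; it is not the Collatz operator, and the matrix entries are chosen purely for illustration.

```python
def mat_vec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def mat_mul(A, B):
    """Matrix product A B."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def mat_pow(A, k):
    """k-th power of a square matrix by repeated multiplication."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, A)
    return result


# Hypothetical positive matrix whose columns sum to 1 (toy stand-in for P).
P = [[0.5, 0.2, 0.3],
     [0.3, 0.5, 0.3],
     [0.2, 0.3, 0.4]]

phi = [1.0, 1.0, 1.0]   # left eigenvector: phi P = phi because columns sum to 1

# Power iteration for the right eigenvector h, normalised so that phi(h) = 1.
h = [1.0 / 3.0] * 3
for _ in range(200):
    h = mat_vec(P, h)
    total = sum(h)
    h = [x / total for x in h]

# Rank-one projection (h (x) phi)[i][j] = h[i] * phi[j]; then P^k = h (x) phi + Q^k.
PROJ = [[h[i] * phi[j] for j in range(3)] for i in range(3)]
P50 = mat_pow(P, 50)
GAP_ERROR = max(abs(P50[i][j] - PROJ[i][j]) for i in range(3) for j in range(3))

if __name__ == "__main__":
    print("h =", h)
    print("max |P^50 - h (x) phi| =", GAP_ERROR)
```

In this toy case the powers $P^{k}$ converge to the rank-one projection at a geometric rate governed by the second-largest eigenvalue modulus, which is the finite-dimensional counterpart of $\rho(Q) < 1$.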
Proposition 5.2
(Cesàro averages and invariant $P$-functionals). Let $X$ be a Banach space of functions $f : \mathbb{N} \to \mathbb{C}$ such that $B_{\mathrm{tree},\sigma} \hookrightarrow X$ continuously and the backward transfer operator $P$ extends to a bounded operator $P : X \to X$.
Let $X^{*}$ be the Banach dual of $X$, with duality pairing $\langle f, \varphi \rangle$ (not necessarily given by a pointwise sum for all $\varphi$). Suppose:
(i) $P^{*} : X^{*} \to X^{*}$ is power-bounded: there exists $C_{*} > 0$ such that
\[ \|(P^{*})^{k}\|_{X^{*} \to X^{*}} \le C_{*} \quad \text{for all } k \ge 0. \tag{70} \]
In particular, the Cesàro operators
\[ A_{N} := \frac{1}{N} \sum_{k=0}^{N-1} (P^{*})^{k} \]
satisfy $\|A_{N}\|_{X^{*} \to X^{*}} \le C_{*}$ for all $N \ge 1$.
(ii) There exist $\varphi \in X^{*}$ and $f_{0} \in X$ such that
\[ \liminf_{N \to \infty} \Bigl\langle f_{0}, \frac{1}{N} \sum_{k=0}^{N-1} (P^{*})^{k} \varphi \Bigr\rangle > 0. \tag{71} \]
Define the Cesàro averages
\[ \Phi_{N} := \frac{1}{N} \sum_{k=0}^{N-1} (P^{*})^{k} \varphi \in X^{*}. \]
Then:
(a) $(\Phi_{N})_{N \ge 1}$ is a bounded set in $X^{*}$ and admits weak-* limit points in $X^{*}$.
(b) Every weak-* limit point $\Phi$ of $(\Phi_{N})_{N \ge 1}$ satisfies $P^{*} \Phi = \Phi$, hence defines a $P$-invariant functional $\ell(f) := \langle f, \Phi \rangle$ on $X$.
(c) Under assumption (71), any such limit point $\Phi$ is nonzero, since $\langle f_{0}, \Phi \rangle \ge \liminf_{N \to \infty} \langle f_{0}, \Phi_{N} \rangle > 0$. In particular, the restriction of $\ell$ to $B_{\mathrm{tree},\sigma}$ is a nonzero invariant functional with $\ell \circ P = \ell$.
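Proof. (a) From (70),
\[ \|\Phi_{N}\|_{X^{*}} = \Bigl\| \frac{1}{N} \sum_{k=0}^{N-1} (P^{*})^{k} \varphi \Bigr\|_{X^{*}} \le \frac{1}{N} \sum_{k=0}^{N-1} \|(P^{*})^{k} \varphi\|_{X^{*}} \le C_{*}\,\|\varphi\|_{X^{*}} \]
for all $N \ge 1$, so the sequence $(\Phi_{N})$ is bounded in $X^{*}$. By Banach–Alaoglu, the closed ball $\{ \psi \in X^{*} : \|\psi\| \le C_{*} \|\varphi\| \}$ is weak-* compact, hence there exist a subsequence $N_{m} \to \infty$ (a subnet if $X$ is not separable) and $\Phi \in X^{*}$ such that $\Phi_{N_{m}} \to \Phi$ in the weak-* topology.
(b) Fix $f \in X$ and compute
\[ \begin{aligned} \langle f, P^{*} \Phi_{N} \rangle = \langle P f, \Phi_{N} \rangle &= \frac{1}{N} \sum_{k=0}^{N-1} \langle P f, (P^{*})^{k} \varphi \rangle = \frac{1}{N} \sum_{k=1}^{N} \langle f, (P^{*})^{k} \varphi \rangle \\ &= \frac{1}{N} \sum_{k=0}^{N-1} \langle f, (P^{*})^{k} \varphi \rangle + \frac{1}{N} \Bigl( \langle f, (P^{*})^{N} \varphi \rangle - \langle f, \varphi \rangle \Bigr) \\ &= \langle f, \Phi_{N} \rangle + \frac{1}{N} \Bigl( \langle f, (P^{*})^{N} \varphi \rangle - \langle f, \varphi \rangle \Bigr). \end{aligned} \]
By (70),
\[ |\langle f, (P^{*})^{N} \varphi \rangle| \le \|f\|_{X}\,\|(P^{*})^{N} \varphi\|_{X^{*}} \le \|f\|_{X}\,C_{*}\,\|\varphi\|_{X^{*}}, \]
so
\[ \Bigl| \frac{1}{N} \Bigl( \langle f, (P^{*})^{N} \varphi \rangle - \langle f, \varphi \rangle \Bigr) \Bigr| \le \frac{2 C_{*}}{N}\,\|f\|_{X}\,\|\varphi\|_{X^{*}} \xrightarrow[N \to \infty]{} 0. \]
Thus
\[ \lim_{N \to \infty} \bigl( \langle f, P^{*} \Phi_{N} \rangle - \langle f, \Phi_{N} \rangle \bigr) = 0. \]
Passing to the subsequence $N_{m}$ along which $\Phi_{N_{m}} \to \Phi$ weak-*, we obtain
\[ \langle f, P^{*} \Phi \rangle = \lim_{m \to \infty} \langle f, P^{*} \Phi_{N_{m}} \rangle = \lim_{m \to \infty} \langle f, \Phi_{N_{m}} \rangle = \langle f, \Phi \rangle \qquad (f \in X), \]
so $P^{*} \Phi = \Phi$.
(c) By assumption (71),
\[ \liminf_{N \to \infty} \langle f_{0}, \Phi_{N} \rangle > 0. \]
In particular, after passing to a subsequence $N_{m}$ if necessary, we have $\lim_{m \to \infty} \langle f_{0}, \Phi_{N_{m}} \rangle = \lambda > 0$. For the corresponding weak-* limit point $\Phi$ of $\Phi_{N_{m}}$,
\[ \langle f_{0}, \Phi \rangle = \lim_{m \to \infty} \langle f_{0}, \Phi_{N_{m}} \rangle = \lambda > 0, \]
so $\Phi \ne 0$. The invariant functional $\ell(f) := \langle f, \Phi \rangle$ is therefore nonzero and satisfies $\ell(P f) = \langle P f, \Phi \rangle = \langle f, P^{*} \Phi \rangle = \langle f, \Phi \rangle = \ell(f)$. □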
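The telescoping identity in part (b) can be seen concretely in finite dimensions. The sketch below takes a hypothetical two-state swap matrix as a toy operator (so $C_{*} = 1$ in the sup-norm); the iterates $(P^{*})^{k}\varphi$ oscillate and do not converge, yet the Cesàro averages converge and the invariance defect $P^{*}\Phi_{N} - \Phi_{N}$ decays like $1/N$. All names are illustrative.

```python
def apply_dual(P, phi):
    """(P* phi)_j = sum_i phi_i * P[i][j], i.e. the row vector phi times P."""
    n = len(P)
    return [sum(phi[i] * P[i][j] for i in range(n)) for j in range(n)]


def cesaro_average(P, phi, N):
    """Phi_N = (1/N) * sum_{k=0}^{N-1} (P*)^k phi."""
    total = [0.0] * len(phi)
    current = list(phi)
    for _ in range(N):
        total = [t + c for t, c in zip(total, current)]
        current = apply_dual(P, current)
    return [t / N for t in total]


def invariance_defect(P, phi, N):
    """Sup-norm of P* Phi_N - Phi_N, which equals (1/N) |(P*)^N phi - phi|."""
    avg = cesaro_average(P, phi, N)
    image = apply_dual(P, avg)
    return max(abs(x - y) for x, y in zip(image, avg))


# Hypothetical toy operator: a swap of two states (powers bounded by C_* = 1).
SWAP = [[0.0, 1.0],
        [1.0, 0.0]]
PHI0 = [1.0, 0.0]

if __name__ == "__main__":
    for N in (1, 11, 101, 1001):
        print(N, cesaro_average(SWAP, PHI0, N), invariance_defect(SWAP, PHI0, N))
```

Here the limit of the averages is the invariant vector $(\tfrac12,\tfrac12)$, and the defect obeys the bound $2C_{*}/N$ derived in the proof.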

5.1. Invariant Density Profile and Refined Tree Geometry

The quasi-compactness of $P$ implies that its spectrum consists of a discrete set of eigenvalues of finite multiplicity outside a disk of radius $\rho_{\mathrm{ess}}(P) \le \lambda_{\mathrm{LY}} < 1$, together with a residual spectrum contained in that disk. Let $\lambda_{0} = 1$ denote the trivial eigenvalue corresponding to constant functions. Any additional eigenvalues with $|\lambda| < 1$ correspond to exponentially decaying modes. Thus, an invariant density $h$ satisfying $P h = h$ must lie in the one-dimensional eigenspace associated with $\lambda_{0}$, provided no unit-modulus spectrum remains.
However, to make this conclusion effective, one must exclude the possibility of small oscillatory components that project into higher spectral modes but decay too slowly to be detected by the weak $\ell^{1}$ norm alone. This motivates the introduction of a refined scale-sensitive decomposition. Define block intervals $I_{j}$ as in (35), and let
\[ H_{j}(h) := \sum_{n \in I_{j}} h(n), \qquad c_{j} := \frac{H_{j}(h)}{|I_{j}|} = \frac{H_{j}(h)}{6^{j}}. \]
The sequence $(c_{j})_{j \ge 0}$ captures the mean behavior of $h$ across successive scales in the backward tree. Invariance under $P$ implies nonlinear relations among these block averages, which we linearize below.
Lemma 5.3
(Block-level invariance relation). Let $0<\alpha<1$, $0<\vartheta<1$, and $\sigma>1$, and let $h \in B_{\mathrm{tree},\sigma}$ satisfy $P h = h$. For each $j \ge 0$ define the block average
\[ c_{j} := \frac{1}{|I_{j}|} \sum_{n \in I_{j}} h(n), \qquad |I_{j}| := \# I_{j}, \]
where $I_{j} = [6^{j}, 2\cdot 6^{j})$ are the tree blocks used in the definition of $B_{\mathrm{tree},\sigma}$. Then there exist bounded sequences $(a_{j})_{j \ge 0}$ and $(b_{j})_{j \ge 0}$ with $a_{j}, b_{j} \ge 0$ and a sequence $(\varepsilon_{j})_{j \ge 0}$ such that
\[ c_{j} = a_{j}\,c_{j+1} + b_{j}\,c_{j-1} + \varepsilon_{j}, \qquad j \ge 1, \tag{73} \]
and the error sequence is summable in the weighted norm:
\[ \sum_{j \ge 1} \vartheta^{j}\,|\varepsilon_{j}| \le C \bigl( [h]_{\mathrm{tree}} + \|h\|_{\sigma} \bigr) < \infty, \tag{74} \]
for a constant $C > 0$ depending only on $\alpha$, $\vartheta$, $\sigma$ and the block geometry.
Proof. 
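Fix $h \in B_{\mathrm{tree},\sigma}$ with $P h = h$ and $j \ge 1$. We work with the 6-adic blocks $I_{j} = [6^{j}, 2\cdot 6^{j})$.
1. Block identity and branch decomposition. For each $j$,
\[ |I_{j}|\,c_{j} = \sum_{n \in I_{j}} h(n) = \sum_{n \in I_{j}} (P h)(n) = \sum_{n \in I_{j}} \left( \frac{h(2n)}{2n} + \mathbf{1}_{\{ n \equiv 4\ (6) \}}\,\frac{h\bigl(\tfrac{n-1}{3}\bigr)}{(n-1)/3} \right). \]
Set
\[ S_{j}^{\mathrm{even}} := \sum_{n \in I_{j}} \frac{h(2n)}{2n}, \qquad S_{j}^{\mathrm{odd}} := \sum_{\substack{n \in I_{j} \\ n \equiv 4\ (6)}} \frac{h\bigl(\tfrac{n-1}{3}\bigr)}{(n-1)/3}, \]
so
\[ |I_{j}|\,c_{j} = S_{j}^{\mathrm{even}} + S_{j}^{\mathrm{odd}}. \]
2. Preliminaries: oscillation control on blocks. By definition of the tree seminorm and the block geometry, there exist a constant $C_{\mathrm{osc}} > 0$ and some $\beta > 1$ (depending only on $\alpha$) such that for every $j \ge 0$,
\[ \operatorname{osc}_{I_{j}} h := \sup_{m,n \in I_{j}} |h(m) - h(n)| \le C_{\mathrm{osc}}\,\vartheta^{j}\,6^{-\beta j}\,[h]_{\mathrm{tree}}. \]
Indeed, for $m,n \in I_{j}$ we have $W_{\alpha}(m,n) \gtrsim 6^{(2-\alpha) j}$, so
\[ W_{\alpha}(m,n)\,|h(m) - h(n)| \le \vartheta^{j}\,[h]_{\mathrm{tree}} \ \Longrightarrow\ |h(m) - h(n)| \lesssim \vartheta^{j}\,6^{-(2-\alpha) j}\,[h]_{\mathrm{tree}}, \]
and we may take any $\beta \in (1, 2-\alpha]$.
From this we also get a bound on deviations from the block average:
\[ \sum_{n \in I_{j}} |h(n) - c_{j}| \le |I_{j}|\,\operatorname{osc}_{I_{j}} h \lesssim 6^{j}\,\vartheta^{j}\,6^{-\beta j}\,[h]_{\mathrm{tree}} = \vartheta^{j}\,6^{-(\beta-1) j}\,[h]_{\mathrm{tree}}. \]
Since $\beta > 1$, the exponent $(\beta - 1)$ is positive.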
We also retain the crude pointwise bound coming from the weighted norm:
\[ |h(n)| \le n^{-\sigma}\,\|h\|_{\sigma}, \qquad n \ge 1. \]
3. Even branch contribution. Write
\[ S_{j}^{\mathrm{even}} = \sum_{n \in I_{j}} \frac{h(2n)}{2n}. \]
For $n \in I_{j} = [6^{j}, 2\cdot 6^{j})$, we have $2n \in [2\cdot 6^{j}, 4\cdot 6^{j})$, which lies in a bounded union of neighboring blocks at scales $j$ and $j+1$. The bulk of the $2n$ lie in $I_{j+1}$; finitely many fall into the adjacent blocks. Define
\[ a_{j}^{\mathrm{even}} := \frac{1}{|I_{j}|} \sum_{\substack{n \in I_{j} \\ 2n \in I_{j+1}}} \frac{1}{2n}, \]
and decompose
\[ h(2n) = c_{j+1} + \bigl( h(2n) - c_{j+1} \bigr) \]
for those $2n \in I_{j+1}$, with the finitely many remaining $2n$ folded into the error. This gives
\[ S_{j}^{\mathrm{even}} = a_{j}^{\mathrm{even}}\,|I_{j}|\,c_{j+1} + R_{j}^{\mathrm{even}}, \]
where
\[ R_{j}^{\mathrm{even}} := \sum_{\substack{n \in I_{j} \\ 2n \in I_{j+1}}} \frac{h(2n) - c_{j+1}}{2n} + \sum_{\substack{n \in I_{j} \\ 2n \notin I_{j+1}}} \frac{h(2n)}{2n}. \]
We now bound $R_{j}^{\mathrm{even}}$.
For $2n \in I_{j+1}$, (76) with $j+1$ gives
\[ |h(2n) - c_{j+1}| \le C_{\mathrm{osc}}\,\vartheta^{j+1}\,6^{-\beta (j+1)}\,[h]_{\mathrm{tree}}, \]
and $2n \asymp 6^{j+1}$, so each such term contributes
\[ \left| \frac{h(2n) - c_{j+1}}{2n} \right| \lesssim \vartheta^{j+1}\,6^{-(\beta+1)(j+1)}\,[h]_{\mathrm{tree}}. \]
Summing over $\lesssim |I_{j}| \asymp 6^{j}$ values of $n$ yields
\[ \Biggl| \sum_{\substack{n \in I_{j} \\ 2n \in I_{j+1}}} \frac{h(2n) - c_{j+1}}{2n} \Biggr| \lesssim \vartheta^{j+1}\,6^{j}\,6^{-(\beta+1)(j+1)}\,[h]_{\mathrm{tree}} \lesssim \vartheta^{j}\,6^{-\gamma j}\,[h]_{\mathrm{tree}} \]
for some $\gamma > 0$ (since $\beta > 1$).
For the finitely many spillover terms with $2n \notin I_{j+1}$, we use (78) and the fact that there are $O(1)$ such $n$:
\[ \Biggl| \sum_{\substack{n \in I_{j} \\ 2n \notin I_{j+1}}} \frac{h(2n)}{2n} \Biggr| \lesssim \sum_{\substack{n \in I_{j} \\ 2n \notin I_{j+1}}} (2n)^{-\sigma-1}\,\|h\|_{\sigma} \lesssim 6^{-(\sigma-1) j}\,\|h\|_{\sigma}. \]
Altogether,
\[ |R_{j}^{\mathrm{even}}| \le C \bigl( \vartheta^{j}\,6^{-\gamma j}\,[h]_{\mathrm{tree}} + 6^{-(\sigma-1) j}\,\|h\|_{\sigma} \bigr) \]
for some constants $C, \gamma > 0$ depending only on the fixed parameters. By construction $a_{j}^{\mathrm{even}} \ge 0$ and $(a_{j}^{\mathrm{even}})$ is bounded above and below (by simple counting of preimages inside $I_{j+1}$), though we will not need explicit bounds here.
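4. Odd branch contribution. Similarly,
\[ S_{j}^{\mathrm{odd}} = \sum_{\substack{n \in I_{j} \\ n \equiv 4\ (6)}} \frac{h\bigl((n-1)/3\bigr)}{(n-1)/3}. \]
For $n \in I_{j}$ with $n \equiv 4 \pmod 6$ we have $m := (n-1)/3 \asymp 6^{j-1}$, so $m$ lies in a bounded union of neighboring blocks around scale $j-1$. The bulk lie in $I_{j-1}$; finitely many lie in adjacent blocks.
Define
\[ b_{j}^{\mathrm{odd}} := \frac{1}{|I_{j}|} \sum_{\substack{n \in I_{j},\ n \equiv 4\ (6) \\ (n-1)/3 \in I_{j-1}}} \frac{1}{(n-1)/3}, \]
and decompose
\[ h(m) = c_{j-1} + \bigl( h(m) - c_{j-1} \bigr) \]
for $m = (n-1)/3 \in I_{j-1}$. Then
\[ S_{j}^{\mathrm{odd}} = b_{j}^{\mathrm{odd}}\,|I_{j}|\,c_{j-1} + R_{j}^{\mathrm{odd}}, \]
where
\[ R_{j}^{\mathrm{odd}} := \sum_{\substack{n \in I_{j},\ n \equiv 4\ (6) \\ (n-1)/3 \in I_{j-1}}} \frac{h(m) - c_{j-1}}{m} + \sum_{\substack{n \in I_{j},\ n \equiv 4\ (6) \\ (n-1)/3 \notin I_{j-1}}} \frac{h(m)}{m}. \]
The first sum is controlled by the block oscillation at scale $j-1$:
\[ |h(m) - c_{j-1}| \le C_{\mathrm{osc}}\,\vartheta^{j-1}\,6^{-\beta (j-1)}\,[h]_{\mathrm{tree}}, \qquad m \asymp 6^{j-1}, \]
so each term is
\[ \left| \frac{h(m) - c_{j-1}}{m} \right| \lesssim \vartheta^{j-1}\,6^{-(\beta+1)(j-1)}\,[h]_{\mathrm{tree}}. \]
There are $\lesssim |I_{j}| \asymp 6^{j}$ such $n$, hence
\[ \Biggl| \sum_{\substack{n \in I_{j} \\ (n-1)/3 \in I_{j-1}}} \frac{h(m) - c_{j-1}}{m} \Biggr| \lesssim \vartheta^{j}\,6^{-\gamma j}\,[h]_{\mathrm{tree}} \]
for some $\gamma > 0$ as before.
For the spillover terms with $(n-1)/3 \notin I_{j-1}$, there are again only $O(1)$ such indices $n$, and (78) gives
\[ \left| \frac{h(m)}{m} \right| \lesssim m^{-\sigma-1}\,\|h\|_{\sigma} \lesssim 6^{-(\sigma-1) j}\,\|h\|_{\sigma}, \]
so these contribute at most $C\,6^{-(\sigma-1) j}\,\|h\|_{\sigma}$. Thus
\[ |R_{j}^{\mathrm{odd}}| \le C \bigl( \vartheta^{j}\,6^{-\gamma j}\,[h]_{\mathrm{tree}} + 6^{-(\sigma-1) j}\,\|h\|_{\sigma} \bigr), \]
for some possibly larger $C, \gamma > 0$. Again $b_{j}^{\mathrm{odd}} \ge 0$ and $(b_{j}^{\mathrm{odd}})$ is bounded.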
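5. Assemble and normalize. Substituting (79) and (80) into (75), we obtain
\[ |I_{j}|\,c_{j} = a_{j}^{\mathrm{even}}\,|I_{j}|\,c_{j+1} + b_{j}^{\mathrm{odd}}\,|I_{j}|\,c_{j-1} + R_{j}^{\mathrm{even}} + R_{j}^{\mathrm{odd}}. \]
Dividing by $|I_{j}| = 6^{j}$ gives
\[ c_{j} = a_{j}^{\mathrm{even}}\,c_{j+1} + b_{j}^{\mathrm{odd}}\,c_{j-1} + \varepsilon_{j}, \]
with
\[ \varepsilon_{j} := \frac{R_{j}^{\mathrm{even}} + R_{j}^{\mathrm{odd}}}{|I_{j}|}. \]
From (79), (80) and $|I_{j}| = 6^{j}$ we have
\[ |\varepsilon_{j}| \le C \bigl( \vartheta^{j}\,6^{-(\gamma+1) j}\,[h]_{\mathrm{tree}} + 6^{-\sigma j}\,\|h\|_{\sigma} \bigr). \]
Multiplying by $\vartheta^{j}$ and summing over $j \ge 1$ yields
\[ \sum_{j \ge 1} \vartheta^{j}\,|\varepsilon_{j}| \le C\,[h]_{\mathrm{tree}} \sum_{j \ge 1} \bigl( \vartheta\,6^{-(\gamma+1)} \bigr)^{j} + C\,\|h\|_{\sigma} \sum_{j \ge 1} \bigl( \vartheta\,6^{-\sigma} \bigr)^{j} \le C' \bigl( [h]_{\mathrm{tree}} + \|h\|_{\sigma} \bigr), \]
for suitable $C' > 0$, since $\gamma + 1 > 1$ and $\sigma > 1$ imply $\vartheta\,6^{-(\gamma+1)} < 1$ and $\vartheta\,6^{-\sigma} < 1$ for any fixed $\vartheta \in (0,1)$.
Finally, set $a_{j} := a_{j}^{\mathrm{even}}$ and $b_{j} := b_{j}^{\mathrm{odd}}$. This proves the block relation (73) with $\vartheta$-summable error (74). □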
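The block identity in step 1 of the proof can be exercised directly. The sketch below assumes the form of $P$ that appears in that identity, $(Ph)(n) = \frac{h(2n)}{2n} + \mathbf{1}_{\{n \equiv 4\ (6)\}}\,\frac{h((n-1)/3)}{(n-1)/3}$, and uses exact rational arithmetic; `sample_profile` is a hypothetical test function and is not claimed to be invariant.

```python
from fractions import Fraction


def backward_P(h, n):
    """(P h)(n) = h(2n)/(2n) + 1_{n = 4 mod 6} * h((n-1)/3) / ((n-1)/3)."""
    value = Fraction(h(2 * n)) / (2 * n)
    if n % 6 == 4:
        m = (n - 1) // 3          # exact: n = 6k + 4 gives m = 2k + 1
        value += Fraction(h(m)) / m
    return value


def block(j):
    """Tree block I_j = [6^j, 2 * 6^j) as a range of integers."""
    return range(6 ** j, 2 * 6 ** j)


def branch_sums(h, j):
    """Return (S_even, S_odd) for the block I_j."""
    s_even = sum(Fraction(h(2 * n)) / (2 * n) for n in block(j))
    s_odd = sum(Fraction(h((n - 1) // 3)) / ((n - 1) // 3)
                for n in block(j) if n % 6 == 4)
    return s_even, s_odd


def block_sum_of_Ph(h, j):
    """Sum of (P h)(n) over n in I_j."""
    return sum(backward_P(h, n) for n in block(j))


def sample_profile(n):
    """Hypothetical test function h(n) = 1/n (illustration only)."""
    return Fraction(1, n)


if __name__ == "__main__":
    for j in (1, 2, 3):
        s_even, s_odd = branch_sums(sample_profile, j)
        print(j, float(s_even), float(s_odd), float(block_sum_of_Ph(sample_profile, j)))
```

For an actual invariant density the block sum of $Ph$ would equal $|I_{j}|\,c_{j}$; the sketch only checks the even/odd splitting, which holds for any $h$.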
Lemma 5.4
(Limiting preimage ratios). Let $(I_{j})_{j \ge 0}$ be the multiscale blocks
\[ I_{j} = [6^{j}, 2\cdot 6^{j}) \cap \mathbb{N}, \qquad |I_{j}| = 6^{j}. \]
Let $a_{j}, b_{j} \ge 0$ be the coefficients from Lemma 5.3, so that for any invariant profile $h \in B_{\mathrm{tree},\sigma}$ with $P h = h$ and block averages
\[ c_{j} := \frac{1}{|I_{j}|} \sum_{n \in I_{j}} h(n), \]
one has
\[ c_{j} = a_{j}\,c_{j+1} + b_{j}\,c_{j-1} + \varepsilon_{j}, \qquad j \ge 1, \tag{81} \]
with an error satisfying
\[ \sum_{j \ge 1} \vartheta^{j}\,|\varepsilon_{j}| \le C \bigl( [h]_{\mathrm{tree}} + \|h\|_{\sigma} \bigr) \]
for some constant $C > 0$ independent of $h$. Then there exist constants $a, b > 0$ and $C > 0$, $0 < \delta < 1$ (depending only on the fixed parameters and the block geometry) such that
\[ \lim_{j \to \infty} a_{j} = a, \qquad \lim_{j \to \infty} b_{j} = b, \]
and, for all $j \ge 1$,
\[ |a_{j} - a| + |b_{j} - b| \le C\,\delta^{j}. \]
In particular, $a, b$ are strictly positive and the sequences $(a_{j})$ and $(b_{j})$ converge exponentially fast to their limits.
Proof. 
By Lemma 5.3, $a_{j}, b_{j}$ are determined purely by the preimage geometry between the neighboring scales $I_{j-1}, I_{j}, I_{j+1}$; they do not depend on $h$. We now make this dependence explicit.
1. Even and odd preimage windows. The inverse branches of the accelerated Collatz map $T$ are
\[ n \mapsto 2n \quad (\text{even branch}), \qquad n \mapsto \frac{n-1}{3} \ \text{ when } n \equiv 4 \pmod 6 \quad (\text{odd branch}). \]
In the block relation (81), the coefficient $a_{j}$ collects the contribution from even preimages whose images land in $I_{j}$ and whose preimages lie in the "next" scale (around $I_{j+1}$), while $b_{j}$ collects the contribution from odd preimages mapping from the lower scale (around $I_{j-1}$). All remaining preimages (falling into gaps or nonadjacent blocks) are assigned to the error term absorbed in $\varepsilon_{j}$.
For the even branch, define the relevant preimage window
\[ E_{j}^{*} := \{ m \in \mathbb{N} : m = 2n \text{ with } n \in I_{j} \text{ and } m \text{ lies in the prescribed upper-neighbor blocks} \}. \]
Similarly, for the odd branch, define
\[ O_{j}^{*} := \{ m \in \mathbb{N} : m = (n-1)/3,\ n \in I_{j},\ n \equiv 4 \pmod 6, \text{ and } m \text{ lies in lower-neighbor blocks} \}. \]
By construction (see the proof of Lemma 5.3), almost all even preimages $2n$ with $n \in I_{j}$ fall into a fixed finite pattern of blocks around scale $j+1$, and almost all odd preimages $(n-1)/3$ with $n \equiv 4 \pmod 6$, $n \in I_{j}$ fall into a fixed finite pattern of blocks around scale $j-1$. The exceptions occur only for $n$ in a bounded neighborhood of the endpoints of $I_{j}$ and therefore contribute $O(1)$ terms that can be absorbed into the error $\varepsilon_{j}$.
In particular, we have
\[ |E_{j}^{*}| = |I_{j}| + O(1) = 6^{j} + O(1), \qquad |O_{j}^{*}| = \tfrac{1}{6}\,|I_{j}| + O(1) = 6^{j-1} + O(1), \]
where the factor $1/6$ reflects the asymptotic density of the residue class $4 \pmod 6$ in $I_{j}$, up to $O(1)$ boundary errors.
2. Canonical weighted definitions of $a_{j}$ and $b_{j}$. By the construction in Lemma 5.3 (where one replaces $h$ by block averages and isolates the main neighboring-scale contributions), there exist formulas of the form
\[ a_{j} = \frac{1}{|I_{j}|} \sum_{m \in E_{j}^{*}} \frac{\kappa_{j}^{\mathrm{even}}(m)}{m}, \qquad b_{j} = \frac{1}{|I_{j}|} \sum_{m \in O_{j}^{*}} \frac{\kappa_{j}^{\mathrm{odd}}(m)}{m}, \]
where $\kappa_{j}^{\mathrm{even}}$ and $\kappa_{j}^{\mathrm{odd}}$ are combinatorial weights taking values in a fixed finite set, depending only on the finite pattern of preimages between neighboring blocks (for example, they indicate exactly which of a finite family of adjacent blocks $m$ belongs to and normalize the contribution appropriately). Crucially, for large $j$:
  • the sets $E_{j}^{*}$ and $O_{j}^{*}$ are contained in finite unions of intervals of the form $[\gamma\,6^{j \pm 1}, \Gamma\,6^{j \pm 1}]$ with fixed $0 < \gamma < \Gamma < \infty$, independent of $j$;
  • the functions $m \mapsto \kappa_{j}^{\mathrm{even}}(m)$ and $m \mapsto \kappa_{j}^{\mathrm{odd}}(m)$ are periodic in $m$ modulo a fixed modulus $q$ (coming from the 6-adic structure of the Collatz branches), up to $O(1)$ boundary corrections that again contribute $O(6^{-j})$ to $a_{j}, b_{j}$.
Thus, each of the sums defining $a_{j}$ and $b_{j}$ is, up to an $O(6^{-j})$ error, a Riemann sum for an integral of a fixed bounded periodic function times $x^{-1}$ on a fixed compact interval of $\mathbb{R}_{+}$, normalized by $|I_{j}|$.
More concretely, we can write for large $j$:
\[ a_{j} = \frac{1}{6^{j}} \sum_{m \in E_{j}^{*}} \frac{\kappa^{\mathrm{even}}(m \bmod q)}{m} + O(6^{-j}), \]
where $\kappa^{\mathrm{even}}$ is a fixed $q$-periodic bounded function, and similarly for $b_{j}$.
3. Passage to the limit and exponential convergence. Fix $\varepsilon > 0$. For $j$ large enough, the preimage windows $E_{j}^{*}$ and $O_{j}^{*}$ can be written as disjoint unions of arithmetic progressions of step $q$, truncated at endpoints of size $\asymp 6^{j \pm 1}$, with at most $O(1)$ elements lost at the boundaries.
For such arithmetic progressions, the normalized sums
\[ \frac{1}{6^{j}} \sum_{m \in E_{j}^{*}} \frac{\kappa^{\mathrm{even}}(m \bmod q)}{m} \quad \text{and} \quad \frac{1}{6^{j}} \sum_{m \in O_{j}^{*}} \frac{\kappa^{\mathrm{odd}}(m \bmod q)}{m} \]
can be compared to the corresponding integrals
\[ \int_{\gamma}^{\Gamma} \frac{F^{\mathrm{even}}(x)}{x}\,dx, \qquad \int_{\gamma}^{\Gamma} \frac{F^{\mathrm{odd}}(x)}{x}\,dx, \]
where $F^{\mathrm{even}}, F^{\mathrm{odd}}$ are continuous periodic averages of the weights over residue classes. Standard Riemann-sum estimates for such periodic sums imply that the difference between each normalized sum and its limiting integral is $O(6^{-j})$. (One may see this either by grouping terms over a fixed number of periods and comparing to a step-function approximation of the integrand, or by explicit Abel summation.)
Thus there exist finite nonzero limits
\[ a := \lim_{j \to \infty} a_{j} \quad \text{and} \quad b := \lim_{j \to \infty} b_{j}, \]
given by those integrals, and constants $C > 0$ and $0 < \delta < 1$ (for instance $\delta = 1/6$ after rescaling) such that
\[ |a_{j} - a| + |b_{j} - b| \le C\,\delta^{j} \quad \text{for all } j \ge 1. \]
4. Positivity of the limits. For large $j$, the windows $E_{j}^{*}$ and $O_{j}^{*}$ have cardinalities
\[ |E_{j}^{*}| = 6^{j} + O(1), \qquad |O_{j}^{*}| = 6^{j-1} + O(1), \]
and the weights $\kappa^{\mathrm{even}}, \kappa^{\mathrm{odd}}$ are bounded below by a positive constant on a fixed positive fraction of residue classes (this is just the statement that there are always even preimages and always odd preimages in the relevant windows). Since the factors $1/m$ are all of size $\asymp 6^{-(j \pm 1)}$ on these windows, the sums defining $a_{j}$ and $b_{j}$ are bounded below by positive constants independent of $j$, hence $a, b > 0$.
This establishes the existence of positive limits $a, b$ and the exponential convergence claimed. □
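The counting facts used in step 1 of the proof, namely $|I_{j}| = 6^{j}$ and the density $1/6$ of the residue class $4 \pmod 6$, can be checked by brute force. The sketch below only counts the admissible $n \in I_{j}$ (those with $n \equiv 4 \pmod 6$); it does not track which blocks the preimages $2n$ or $(n-1)/3$ fall into, so it should not be read as a computation of $a_{j}$ or $b_{j}$. The function names are illustrative.

```python
def block(j):
    """Tree block I_j = [6^j, 2 * 6^j) as a range of integers."""
    return range(6 ** j, 2 * 6 ** j)


def admissible_count(j):
    """Number of n in I_j with n = 4 (mod 6), i.e. n admitting an odd preimage."""
    return sum(1 for n in block(j) if n % 6 == 4)


if __name__ == "__main__":
    for j in range(1, 6):
        print(j, len(block(j)), admissible_count(j), 6 ** (j - 1))
```

For $j \ge 1$ the block length $6^{j}$ is a multiple of 6, so each residue class modulo 6 occurs exactly $6^{j-1}$ times and no boundary correction is needed for this particular count.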
Lemma 5.5
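(Uniform convergence of the coefficient matrices). Let
\[ M_{j} = \begin{pmatrix} 0 & a_{j} \\ b_{j} & 0 \end{pmatrix}, \qquad M = \begin{pmatrix} 0 & a \\ b & 0 \end{pmatrix}, \]
where $a_{j} \to a$ and $b_{j} \to b$ satisfy $|a_{j} - a| + |b_{j} - b| \le C\,\delta^{j}$ for some $0 < \delta < 1$ as in Lemma 5.4. Then for any matrix norm $\|\cdot\|$,
\[ \|M_{j} - M\| \le C'\,\delta^{j}. \]
In particular,
\[ \sum_{j \ge j_{0}} \vartheta^{j}\,\|M_{j} - M\| < \infty, \]
so $M_{j} \to M$ exponentially fast in the sense required by the discrete variation-of-constants argument.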
Proof. 
By definition,
\[ M_{j} - M = \begin{pmatrix} 0 & a_{j} - a \\ b_{j} - b & 0 \end{pmatrix}. \]
Let $\|\cdot\|$ be any matrix norm on $2 \times 2$ real matrices. Since all norms on $\mathbb{R}^{2 \times 2}$ are equivalent and the space is finite-dimensional, there exists a constant $K > 0$ (depending only on the choice of norm) such that for any matrix $A = (a_{mn})_{m,n=1}^{2}$,
\[ \|A\| \le K \max_{m,n} |a_{mn}|. \tag{82} \]
Applying (82) to $A = M_{j} - M$ gives
\[ \|M_{j} - M\| \le K \max \bigl\{ |a_{j} - a|,\ |b_{j} - b| \bigr\}. \]
By Lemma 5.4, the preimage ratios satisfy the exponential convergence
\[ |a_{j} - a| + |b_{j} - b| \le C\,\delta^{j}, \qquad 0 < \delta < 1. \]
In particular,
\[ \max \{ |a_{j} - a|,\ |b_{j} - b| \} \le |a_{j} - a| + |b_{j} - b| \le C\,\delta^{j}. \]
Combining the two inequalities yields
\[ \|M_{j} - M\| \le K C\,\delta^{j}. \]
Setting $C' := K C$ gives the claimed bound
\[ \|M_{j} - M\| \le C'\,\delta^{j}. \]
Finally, since $0 < \vartheta < 1$ and $0 < \delta < 1$, the product $\vartheta \delta < 1$, and therefore
\[ \sum_{j \ge j_{0}} \vartheta^{j}\,\|M_{j} - M\| \le C' \sum_{j \ge j_{0}} (\vartheta \delta)^{j} < \infty. \]
Thus $M_{j} \to M$ exponentially fast in any matrix norm, establishing the uniform convergence required for the discrete variation-of-constants argument. □
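The norm-equivalence step can be illustrated numerically. The sketch below uses hypothetical values $a = 0.6$, $b = 0.4$, $\delta = 1/6$ and a hypothetical coefficient sequence $a_{j} = a + \delta^{j}$, $b_{j} = b - \delta^{j}$; none of these numbers come from the paper. It compares the Frobenius norm with the entrywise maximum (so $K = 2$ works for $2 \times 2$ matrices) and evaluates a partial sum of the weighted series.

```python
import math

A_LIM, B_LIM = 0.6, 0.4              # hypothetical limits a, b (illustration only)
DELTA, THETA = 1.0 / 6.0, 1.0 / 20.0

M_LIMIT = [[0.0, A_LIM], [B_LIM, 0.0]]


def coefficient_matrix(j):
    """Hypothetical M_j with a_j = a + delta^j and b_j = b - delta^j."""
    return [[0.0, A_LIM + DELTA ** j], [B_LIM - DELTA ** j, 0.0]]


def difference(A, B):
    """Entrywise difference A - B."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def max_norm(A):
    """Entrywise maximum max |a_mn|."""
    return max(abs(x) for row in A for x in row)


def frobenius_norm(A):
    """Frobenius norm sqrt(sum a_mn^2)."""
    return math.sqrt(sum(x * x for row in A for x in row))


def weighted_partial_sum(j0, j1, norm):
    """Partial sum of theta^j * ||M_j - M|| for j0 <= j <= j1."""
    return sum(THETA ** j * norm(difference(coefficient_matrix(j), M_LIMIT))
               for j in range(j0, j1 + 1))


if __name__ == "__main__":
    for j in range(1, 6):
        d = difference(coefficient_matrix(j), M_LIMIT)
        print(j, max_norm(d), frobenius_norm(d))
    print("weighted partial sum:", weighted_partial_sum(1, 40, frobenius_norm))
```

Any other matrix norm would change only the constant $K$, not the geometric rate $\delta^{j}$.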
Proposition 5.6
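(Effective recursion for peripheral eigenfunctions). Let $0<\alpha<1$, $0<\vartheta<1$, $\sigma>1$, and let $h \in B_{\mathrm{tree},\sigma}$ satisfy $P h = \lambda h$ with $|\lambda| = 1$. Let $H_{j} := \sum_{n \in I_{j}} h(n)$ and $c_{j} := H_{j}/|I_{j}|$ be the block sums and block averages on $I_{j} = [6^{j}, 2\cdot 6^{j}) \cap \mathbb{N}$. Then, with $a, b > 0$ as in Lemma 5.4, there exists a sequence $(\varepsilon_{j})_{j \ge 1}$ with
\[ \sum_{j \ge 1} |\varepsilon_{j}|\,\vartheta^{j} < \infty \]
such that
\[ \lambda\,c_{j} = a\,c_{j+1} + b\,c_{j-1} + \varepsilon_{j}, \qquad j \ge 1. \tag{83} \]
Equivalently, for the renormalized averages $d_{j} := \lambda^{-j} c_{j}$ we have
\[ d_{j} = a\,d_{j+1} + b\,\lambda^{-2}\,d_{j-1} + \tilde{\varepsilon}_{j}, \qquad \sum_{j \ge 1} |\tilde{\varepsilon}_{j}|\,\vartheta^{j} < \infty, \tag{84} \]
where $\tilde{\varepsilon}_{j} := \lambda^{-(j+1)} \varepsilon_{j}$.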
Proof. 
The derivation up to the "twisted" block relation is exactly as in the $\lambda = 1$ case (Lemma 5.3), except that we now use the eigenrelation $P h = \lambda h$. Summing over $I_{j}$ and splitting even/odd branches, reorganizing via the preimage windows $E_{j}^{*}$ and $O_{j}^{*}$, and freezing the scale-dependent coefficients to their limits $a, b > 0$ as in Lemma 5.4, we arrive at
\[ \lambda\,c_{j} = a\,c_{j+1} + b\,c_{j-1} + \varepsilon_{j}, \qquad j \ge 1, \tag{85} \]
with $\sum_{j \ge 1} \vartheta^{j}\,|\varepsilon_{j}| < \infty$. This is (83).
For the renormalized averages, set $d_{j} := \lambda^{-j} c_{j}$. Substituting $c_{j} = \lambda^{j} d_{j}$, $c_{j+1} = \lambda^{j+1} d_{j+1}$, $c_{j-1} = \lambda^{j-1} d_{j-1}$ into (85) gives
\[ \lambda \cdot \lambda^{j} d_{j} = a\,\lambda^{j+1} d_{j+1} + b\,\lambda^{j-1} d_{j-1} + \varepsilon_{j}, \]
that is,
\[ \lambda^{j+1} d_{j} = a\,\lambda^{j+1} d_{j+1} + b\,\lambda^{j-1} d_{j-1} + \varepsilon_{j}. \]
Divide by $\lambda^{j+1}$:
\[ d_{j} = a\,d_{j+1} + b\,\lambda^{-2}\,d_{j-1} + \lambda^{-(j+1)} \varepsilon_{j}. \]
Set $\tilde{\varepsilon}_{j} := \lambda^{-(j+1)} \varepsilon_{j}$. Since $|\lambda| = 1$, we have $|\tilde{\varepsilon}_{j}| = |\varepsilon_{j}|$ and hence
\[ \sum_{j \ge 1} \vartheta^{j}\,|\tilde{\varepsilon}_{j}| = \sum_{j \ge 1} \vartheta^{j}\,|\varepsilon_{j}| < \infty. \]
This is exactly (84).
No further simplification of the coefficients is possible in general unless $\lambda^{2} = 1$ (in which case the factor $\lambda^{-2}$ reduces to 1 and the recursion becomes symmetric in $d_{j+1}$ and $d_{j-1}$). □
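The renormalization is a purely algebraic change of variables, which can be confirmed numerically. The sketch below picks a hypothetical unit-modulus $\lambda$, hypothetical coefficients $a = 0.6$, $b = 0.4$, and an arbitrary sample sequence $(c_{j})$; it defines $\varepsilon_{j}$ as the residual of the twisted relation and checks that $d_{j} = \lambda^{-j} c_{j}$ satisfies the renormalized relation with $\tilde{\varepsilon}_{j} = \lambda^{-(j+1)}\varepsilon_{j}$. None of the numerical values come from the paper.

```python
import cmath


def twisted_residuals(c, lam, a, b):
    """eps_j := lam*c_j - a*c_{j+1} - b*c_{j-1} for interior indices j."""
    return {j: lam * c[j] - a * c[j + 1] - b * c[j - 1]
            for j in range(1, len(c) - 1)}


def renormalized_defect(c, lam, a, b):
    """Largest violation of d_j = a d_{j+1} + b lam^{-2} d_{j-1} + lam^{-(j+1)} eps_j."""
    eps = twisted_residuals(c, lam, a, b)
    d = [lam ** (-j) * cj for j, cj in enumerate(c)]
    return max(
        abs(d[j] - (a * d[j + 1] + b * lam ** (-2) * d[j - 1]
                    + lam ** (-(j + 1)) * eps[j]))
        for j in eps
    )


LAM = cmath.exp(2j * cmath.pi / 3)                 # hypothetical eigenvalue, |LAM| = 1
C_SAMPLE = [6.0 ** (-j) * (1 + 0.1 * j) for j in range(8)]   # arbitrary sample sequence
DEFECT = renormalized_defect(C_SAMPLE, LAM, 0.6, 0.4)

if __name__ == "__main__":
    print("max defect of the renormalized relation:", DEFECT)
```

Because $|\lambda| = 1$, the map $\varepsilon_{j} \mapsto \tilde{\varepsilon}_{j}$ preserves absolute values, which is why the weighted summability carries over unchanged.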
Remark  5.7 (Admissibility for freezing the coefficients)
The "freezing" errors $(a_{j} - a)\,c_{j+1}$ and $(b_{j} - b)\,c_{j-1}$ are summable in the weighted norm because $|a_{j} - a| + |b_{j} - b| \le C\,\delta^{j}$ for some $0 < \delta < 1$ by Lemma 5.4. Hence
\[ \sum_{j \ge 0} \vartheta^{j} \bigl( |a_{j} - a| + |b_{j} - b| \bigr) < \infty \quad \text{whenever } \vartheta < \delta^{-1}. \]
Since $\delta \in (0,1)$ depends only on the block geometry and the parameters $(\alpha,\vartheta,\sigma)$, one may always choose $\vartheta \in (0,1)$ sufficiently small so that the weighted summability condition holds. In particular, the choice $\vartheta = \tfrac{1}{20}$ used in the Lasota–Yorke framework is admissible for every $\sigma > 1$.
Remark  5.8 (Exact normalization of the block coefficients)
In Lemma 5.3, the coefficients $a_j$ and $b_j$ arise from the relative sizes of the even and odd preimage windows:
\[ a_j := \frac{|E_j^*|}{|E_j^*|+|O_j^*|},\qquad b_j := \frac{|O_j^*|}{|E_j^*|+|O_j^*|}, \]
so that $a_j+b_j=1$ for all sufficiently large $j$. Lemma 5.4 establishes the existence of limits $a_j\to a$ and $b_j\to b$ with
\[ a+b=1,\qquad 0<b<a<1,\qquad |a_j-a|+|b_j-b|\le C\delta^{j} \]
for some constants $C>0$ and $0<\delta<1$ depending only on the block geometry and the space parameters.
Remark  5.9 (Coefficient freezing)
The combinatorial structure of the Collatz tree implies that the ratios
\[ a_j := \frac{|I_{j+1}|}{2\,|I_j|},\qquad b_j := \frac{|I_{j-1}|}{|I_j|} \]
stabilize as $j\to\infty$. More precisely, Lemma 5.4 shows that
\[ a_j\to a,\qquad b_j\to b,\qquad a+b=1,\qquad 0<b<a<1, \]
and that the convergence is geometric:
\[ |a_j-a|+|b_j-b|\le C\delta^{j} \]
for some $C>0$ and $0<\delta<1$. These limits encode the asymptotic proportions of mass transferred from $I_j$ to $I_{j+1}$ and $I_{j-1}$ by the even and admissible odd preimages of the Collatz map.
Remark  5.10 (Asymptotic limits of the block coefficients)
Let $a_j$ and $b_j$ be the block coefficients
\[ a_j := \frac{|I_{j+1}|}{2\,|I_j|},\qquad b_j := \frac{|I_{j-1}|}{|I_j|}, \]
arising in the decomposition of block averages under $Ph=h$. Then the Collatz preimage structure and the block geometry imply:
1. $a_j,b_j\ge 0$, and for all sufficiently large $j$ one has
\[ a_j+b_j=1; \]
2. The coefficients converge to limits
\[ a_j\to a,\qquad b_j\to b\qquad (j\to\infty), \]
where $a,b>0$ satisfy
\[ a+b=1,\qquad 0<b<a<1; \]
3. The convergence is quantitative: there exist constants $C>0$ and $\vartheta\in(0,1)$ such that
\[ |a_j-a|+|b_j-b|\le C\vartheta^{j},\qquad j\ge 0. \]
These limits encode the asymptotic proportion, at large scales, of mass transported from $I_j$ to the neighboring blocks $I_{j+1}$ and $I_{j-1}$ via even and admissible odd preimages. Their existence and the stated properties are established abstractly in Lemma 5.4.
Lemma 5.11
(Effective block recursion). Let $h\in\mathcal{B}_{\mathrm{tree},\sigma}$ be the unique positive invariant density satisfying $Ph=h$. For each scale block $I_j=[6^j,2\cdot 6^j)\cap\mathbb{N}$ define the block averages
\[ c_j := \frac{1}{|I_j|}\sum_{n\in I_j}h(n),\qquad j\ge 0. \]
Then there exist an index $j_0$ and sequences $\{a_j\}_{j\ge j_0}$, $\{b_j\}_{j\ge j_0}$, $\{\varepsilon_j\}_{j\ge j_0}$ such that:
(1) $a_j,b_j\ge 0$ and $a_j+b_j=1$ for all $j\ge j_0$;
(2) $a_j\to a$ and $b_j\to b$ as $j\to\infty$, where
\[ a,b>0,\qquad a+b=1,\qquad 0<b<a<1, \]
and moreover there exist $C>0$ and $0<\delta<1$ such that
\[ |a_j-a|+|b_j-b|\le C\delta^{j},\qquad j\ge j_0; \]
(3) the block averages satisfy the second–order approximate recursion
\[ c_j = a_j\,c_{j+1}+b_j\,c_{j-1}+\varepsilon_j,\qquad j\ge j_0; \]
(4) the perturbations are $\vartheta$–summable:
\[ \sum_{j\ge j_0}\vartheta^{j}|\varepsilon_j|<\infty. \]
The constants $a,b$ and the decay rate $\delta$ depend only on $(\alpha,\vartheta,\sigma)$ and the multiscale tree geometry.
Proof. 
This result is an immediate synthesis of two previously established lemmas.
Step 1: Block recursion with summable error. Lemma 5.3 applied to the invariant density $h$ gives, for all sufficiently large $j$,
\[ c_j = a_j\,c_{j+1}+b_j\,c_{j-1}+\varepsilon_j, \]
with $a_j,b_j\ge 0$, $a_j+b_j=1$, and
\[ \sum_{j\ge j_0}\vartheta^{j}|\varepsilon_j|<\infty. \]
Step 2: Limiting values of the coefficients. By Lemma 5.4, the preimage–window ratios converge:
\[ a_j\to a,\qquad b_j\to b, \]
where $a,b>0$, $a+b=1$, and $0<b<a<1$. Moreover, the convergence is exponentially fast:
\[ |a_j-a|+|b_j-b|\le C\delta^{j}\qquad (j\ge j_0), \]
for some $C>0$ and $0<\delta<1$ depending only on the block geometry.
Combining Step 1 and Step 2 yields exactly the assertions (1)–(4). □
The Lasota–Yorke inequality (50) implies that oscillations of $h$ across successive scales decay geometrically:
\[ [f]_{\mathrm{tree}} \le \frac{C_{\mathrm{LY}}}{1-\lambda_{\mathrm{LY}}}\,\|f\|_{1}, \]
so that any invariant $h$ must be essentially flat in the strong seminorm. Translating this statement into block averages gives
\[ |c_{j+1}-c_j|\le C\vartheta^{j},\qquad j\ge 0, \]
for some $C>0$. The decay of successive differences enforces a near-constant profile $c_j\approx c$, and any residual deviation must satisfy the perturbed recursion (81).
We interpret (81) as a discrete second-order recurrence in the block averages $(c_j)$, with coefficients $(a_j,b_j)$ determined purely by the combinatorics of the Collatz preimages. In the limit $a_j\to a$, $b_j\to b$ described in Lemma 5.4, the homogeneous part
\[ c_j = a\,c_{j+1}+b\,c_{j-1} \tag{88} \]
captures the mean balancing between even and odd contributions across adjacent scales.
Introducing the vector $v_j:=(c_j,c_{j-1})^{\top}$, the recursion can be written in matrix form
\[ v_{j+1}=M v_j,\qquad M=\begin{pmatrix}0 & a\\ b & 0\end{pmatrix}. \]
The eigenvalues of $M$ are $\pm\sqrt{ab}$, so the spectral radius is $\rho(M)=\sqrt{ab}$. Since $a+b=1$ and $0<b<a<1$, we have $ab<\tfrac14$ and hence $\rho(M)<\tfrac12<1$. Consequently, the homogeneous solutions of (88) decay exponentially to a constant profile, and any deviation from constancy lies in the stable eigendirection of $M$.
Remark  5.12 (Spectral radius of the frozen block matrix)
Let
\[ M=\begin{pmatrix}0 & a\\ b & 0\end{pmatrix} \]
be the limiting coefficient matrix associated with the homogeneous block recursion
\[ c_j = a\,c_{j+1}+b\,c_{j-1}, \]
where $a,b>0$ and $a+b=1$ are the limiting values established in Lemma 5.4. The eigenvalues of $M$ are
\[ \lambda_{\pm}=\pm\sqrt{ab}, \]
so the spectral radius is
\[ \rho(M)=\sqrt{ab}<1. \]
Consequently, the homogeneous recursion is exponentially stable: every solution that grows at most subexponentially in $j$ converges to a constant profile, and any deviation decays at rate $O\bigl(\rho(M)^{j}\bigr)$. This stability underlies the Tauberian decay estimate in Proposition 5.13.
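As a quick numerical illustration of the eigenvalue formula, the following sketch evaluates $\pm\sqrt{ab}$ for the limiting values $a=6/7$, $b=1/7$ quoted elsewhere in the text. The function name is hypothetical; the only fact used is that the characteristic polynomial of $\begin{pmatrix}0&a\\ b&0\end{pmatrix}$ is $r^{2}-ab$.

```python
import math

def frozen_matrix_eigenvalues(a, b):
    """Eigenvalues of M = [[0, a], [b, 0]], i.e. the roots of r^2 - a*b = 0."""
    root = math.sqrt(a * b)
    return root, -root

a, b = 6 / 7, 1 / 7  # limiting values quoted in the text
r_plus, r_minus = frozen_matrix_eigenvalues(a, b)
rho = max(abs(r_plus), abs(r_minus))

print(round(rho, 4))     # 0.3499
print(a * b < 0.25)      # True, since a + b = 1 and a != b
```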
Proposition 5.13
(Conditional decay profile of the invariant density). Let $h\in\mathcal{B}_{\mathrm{tree},\sigma}$ be the strictly positive invariant density satisfying
\[ Ph=h,\qquad \phi(h)=1, \]
where $\phi$ is the normalized positive left eigenfunctional from Theorem 5.1. For each scale block $I_j=[6^j,2\cdot 6^j)\cap\mathbb{N}$ define
\[ c_j := \frac{1}{|I_j|}\sum_{n\in I_j}h(n),\qquad |I_j|=6^{j}. \]
Assume the effective block recursion of Lemma 5.11 holds: there exist $j_0\ge 0$ and sequences $(a_j)_{j\ge j_0}$, $(b_j)_{j\ge j_0}$, $(\varepsilon_j)_{j\ge j_0}$ such that
\[ c_j = a_j\,c_{j+1}+b_j\,c_{j-1}+\varepsilon_j,\qquad j\ge j_0, \tag{90} \]
with $a_j,b_j\ge 0$, $a_j+b_j=1$, and
\[ a_j\to a,\qquad b_j\to b,\qquad a+b=1,\qquad 0<b<a<1, \]
together with geometric convergence
\[ \sum_{j\ge j_0}\vartheta^{j}\bigl(|a_j-a|+|b_j-b|\bigr)<\infty,\qquad \sum_{j\ge j_0}\vartheta^{j}|\varepsilon_j|<\infty, \tag{91} \]
for some $0<\vartheta<1$. Assume also that $(\alpha,\vartheta)$ obey
\[ \vartheta\,6^{\alpha}<1, \tag{92} \]
so that the Lasota–Yorke inequality implies
\[ \operatorname{osc}_{I_j}h \lesssim \vartheta^{j}\,6^{-(1-\alpha)j}. \]
Define the renormalized block averages
\[ w_j := 6^{j}c_j,\qquad j\ge j_0. \]
Additional growth hypothesis. Assume that $(w_j)_{j\ge j_0}$ is uniformly bounded:
\[ \sup_{j\ge j_0}|w_j|<\infty. \tag{93} \]
Then there exists a constant $C>0$ such that
\[ c_j = C\,6^{-j}+o\bigl(6^{-j}\bigr)\qquad (j\to\infty). \tag{94} \]
Moreover, the oscillation control on each block implies that
\[ h(n)=\frac{C}{n}+o\!\left(\frac{1}{n}\right), \tag{95} \]
uniformly for $n\in I_j$ as $j\to\infty$. In particular, $h(n)$ has an inverse–linear tail along every ray of the Collatz tree, in the sense that for every $\varepsilon>0$ there exists $N$ such that
\[ \left|h(n)-\frac{C}{n}\right|\le\frac{\varepsilon}{n}\qquad\text{for all } n\ge N. \]
Proof. 
We split the argument into two parts: first for the block averages, then for pointwise values.
1. Renormalized block recursion and convergence of $w_j$. Multiply (90) by $6^{j}$ and use $|I_j|=6^{j}$:
\[ 6^{j}c_j = a_j\,6^{j}c_{j+1}+b_j\,6^{j}c_{j-1}+6^{j}\varepsilon_j. \]
In terms of $w_j=6^{j}c_j$ this becomes
\[ w_j = \frac{a_j}{6}\,w_{j+1}+6\,b_j\,w_{j-1}+6^{j}\varepsilon_j,\qquad j\ge j_0. \tag{96} \]
For large $j$, the coefficients satisfy
\[ a_j = a+O(\vartheta^{j}),\qquad b_j = b+O(\vartheta^{j}), \]
with $a=6/7$, $b=1/7$ (from Lemma 5.4), and
\[ 6^{j}\varepsilon_j=o(1)\quad\text{in the weighted sum}\quad \sum_{j\ge j_0}\vartheta^{j}\,6^{j}|\varepsilon_j|<\infty. \]
To understand the homogeneous part, freeze the coefficients at their limits. The limiting recursion is
\[ w_j = \frac{a}{6}\,w_{j+1}+6\,b\,w_{j-1}. \]
Solving for $w_{j+1}$ gives
\[ \frac{a}{6}\,w_{j+1}=w_j-6\,b\,w_{j-1} \quad\Longrightarrow\quad w_{j+1}=\frac{6}{a}\,w_j-\frac{36\,b}{a}\,w_{j-1}. \]
With $a=6/7$ and $b=1/7$,
\[ \frac{6}{a}=7,\qquad \frac{36\,b}{a}=6, \]
so the characteristic polynomial is
\[ r^{2}-7r+6=0, \]
with roots
\[ r_1=1,\qquad r_2=6. \]
Thus the limiting homogeneous dynamics in the $(w_j)$–variable have a neutral mode (eigenvalue 1) and an expanding mode (eigenvalue 6).
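The arithmetic behind the characteristic polynomial can be reproduced exactly with rational numbers. This is a minimal sketch, assuming the values $a=6/7$, $b=1/7$ stated above; the names `p`, `q` and `char_poly` are illustrative only.

```python
from fractions import Fraction

a, b = Fraction(6, 7), Fraction(1, 7)

# Coefficients of w_{j+1} = p*w_j - q*w_{j-1}, obtained by solving the frozen recursion for w_{j+1}.
p = 6 / a        # 6/a
q = 36 * b / a   # 36*b/a

def char_poly(r):
    """Characteristic polynomial r^2 - p*r + q of the frozen recursion."""
    return r * r - p * r + q

# Search small integers for roots; a quadratic has at most two.
roots = [r for r in range(0, 10) if char_poly(r) == 0]
print(p, q, roots)   # 7 6 [1, 6]
```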
The recursive equation (96) differs from the frozen one by a summable perturbation:
\[ w_j = \frac{a}{6}\,w_{j+1}+6\,b\,w_{j-1}+\delta_j, \]
where
\[ \delta_j := \left(\frac{a_j}{6}-\frac{a}{6}\right)w_{j+1}+\bigl(6\,b_j-6\,b\bigr)w_{j-1}+6^{j}\varepsilon_j. \]
Using (91) and the boundedness hypothesis (93) on $(w_j)$, we obtain
\[ |\delta_j|\le C_1\,\vartheta^{j}\sup_{k\ge j_0}|w_k|+6^{j}|\varepsilon_j| \]
and hence
\[ \sum_{j\ge j_0}\vartheta^{j}|\delta_j|<\infty. \]
The standard theory for such second–order recurrences with a summable perturbation and one expanding eigenvalue now applies: the expanding mode corresponding to $r_2=6$ is incompatible with the uniform bound (93), because any nonzero component in that eigendirection would force $|w_j|$ to grow like $6^{j}$ up to small multiplicative errors. Therefore the coefficient of the $r_2$–mode must vanish, and $(w_j)$ lies entirely in the stable/neutral direction generated by the eigenvalue $r_1=1$.
Consequently there exists a finite limit
\[ \lim_{j\to\infty}w_j=:C>0, \]
and in fact one obtains a quantitative convergence
\[ w_j = C+O(\rho^{j}),\qquad j\to\infty, \]
for some $\rho\in(0,1)$ depending only on the perturbation bounds. Dividing by $6^{j}$ gives
\[ c_j = C\,6^{-j}+O\bigl(\rho^{j}6^{-j}\bigr)=C\,6^{-j}+o(6^{-j}), \]
which is (94).
2. From block averages to pointwise asymptotics. The Lasota–Yorke inequality on $\mathcal{B}_{\mathrm{tree},\sigma}$ implies that oscillations of $h$ within each block are controlled by the tree seminorm. In particular, on each block $I_j$ we have
\[ \operatorname{osc}_{I_j}h := \sup_{m,n\in I_j}|h(m)-h(n)|\le C_2\,\vartheta^{j}\,6^{-(1-\alpha)j}, \]
for some $C_2>0$ depending only on $(\alpha,\vartheta,\sigma)$. Thus for each fixed $n\in I_j$,
\[ |h(n)-c_j|\le \operatorname{osc}_{I_j}h\le C_2\,\vartheta^{j}\,6^{-(1-\alpha)j}. \]
Since $n\in I_j$ implies $n\asymp 6^{j}$, we have $6^{-j}\asymp 1/n$. Moreover, by (92),
\[ \frac{\vartheta^{j}\,6^{-(1-\alpha)j}}{6^{-j}}=(\vartheta\,6^{\alpha})^{j}\to 0\qquad (j\to\infty). \]
Hence the intra–block oscillation of $h$ is $o(6^{-j})$, uniformly in $n\in I_j$. Combining the block–average asymptotic $c_j=C\,6^{-j}+o(6^{-j})$ with the oscillation bound yields, for $n\in I_j$,
\[ h(n)=c_j+O\bigl(\vartheta^{j}6^{-(1-\alpha)j}\bigr)=C\,6^{-j}+o(6^{-j})=\frac{C}{n}+o\!\left(\frac{1}{n}\right), \]
where we used $6^{j}\asymp n$ and the fact that both error terms are $o(6^{-j})$ and hence $o(1/n)$ uniformly on $I_j$. This gives (95) and the claimed inverse–linear tail.
The uniformity along rays of the Collatz tree follows because every ray eventually lies in blocks I j with j arbitrarily large, and the bounds above are uniform over each whole block. This completes the proof. □
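The role of the boundedness hypothesis in Part 1 can be seen on the frozen recursion $w_{j+1}=7w_j-6w_{j-1}$ alone. The sketch below is an illustration under that simplification (no perturbation terms, hypothetical initial data and function name): constant data stay constant, while any other initial data pick up a component along the root 6 and grow like $6^{j}$.

```python
def iterate_frozen(w0, w1, steps):
    """Iterate w_{j+1} = 7*w_j - 6*w_{j-1}, the frozen recursion with roots 1 and 6."""
    w = [w0, w1]
    for _ in range(steps):
        w.append(7 * w[-1] - 6 * w[-2])
    return w

constant = iterate_frozen(3, 3, 10)   # purely neutral mode
generic = iterate_frozen(3, 4, 10)    # w_j = 14/5 + 6^j/5, so the expanding mode is present

print(constant[-1])                   # 3
print(generic[-1] / generic[-2])      # close to 6
```

Exact integer arithmetic is used so that the closed form $5w_j=14+6^{j}$ for the second sequence can be checked without rounding.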
The explicit Lasota–Yorke constants obtained in Section 4.4 guarantee that the same contraction rate governs the full operator $P$ on $\mathcal{B}_{\mathrm{tree},\sigma}$, ensuring that invariant densities become asymptotically flat in the strong seminorm: block oscillations vanish at large scales, and the block averages obey the rigid two–sided recursion derived from the fixed–point relation $Ph=h$. In particular, the invariant density $h$ has block averages satisfying $c_j\sim C\,6^{-j}$, which corresponds to the mass on each block behaving asymptotically like $C/n$ when $n$ ranges over $I_j$.

5.2. Effective Block Recursion and Block-Level Spectral Estimates

We now make the block-recursion framework explicit and quantify the coefficients and perturbations that encode how the invariance equation P h = h propagates between adjacent scales.
Proposition 5.14
(Effective perturbed recursion). Let $0<\alpha<1$, $0<\vartheta<1$, $\sigma>1$, and let $h\in\mathcal{B}_{\mathrm{tree},\sigma}$ satisfy $Ph=h$. Let $c_j$ be the block averages
\[ c_j := \frac{1}{|I_j|}\sum_{n\in I_j}h(n),\qquad j\ge 0. \]
Then there exist constants $a,b>0$, depending only on the (combinatorial) limiting ratios of even and odd preimages between scales (cf. Lemma 5.4), and a sequence $(\varepsilon_j)_{j\ge 0}$ such that
\[ c_j = a\,c_{j+1}+b\,c_{j-1}+\varepsilon_j,\qquad j\ge 1, \tag{97} \]
with
\[ \|\varepsilon\|_{\vartheta} := \sum_{j\ge 0}|\varepsilon_j|\,\vartheta^{j}<\infty. \tag{98} \]
The constants $a,b$ are independent of $h$, and the series in (98) converges for every such $h$.
Proof. 
By Lemma 5.3, for $h\in\mathcal{B}_{\mathrm{tree},\sigma}$ with $Ph=h$ there exist sequences $(a_j)_{j\ge 0}$, $(b_j)_{j\ge 0}$ with $a_j,b_j\ge 0$ and a sequence $(\eta_j)_{j\ge 0}$ such that
\[ c_j = a_j\,c_{j+1}+b_j\,c_{j-1}+\eta_j,\qquad j\ge 1, \tag{99} \]
and
\[ \sum_{j\ge 0}\vartheta^{j}|\eta_j|<\infty. \tag{100} \]
The coefficients $a_j,b_j$ are defined in terms of normalized even and odd preimage weights from $I_{j+1}$ and $I_{j-1}$ into $I_j$.
(1) Limits $a,b$ from preimage asymptotics. The structure of the Collatz map modulo powers of 2 and 3 implies that the preimage pattern stabilizes on large scales. More precisely, there exist constants $a,b>0$ and $C>0$, $0<\delta<1$ (depending only on the map and the choice of blocks $I_j$) such that
\[ |a_j-a|+|b_j-b|\le C\delta^{j}\qquad\text{for all } j\ge 0. \tag{101} \]
This is obtained by an explicit counting of even preimages $2n$ and odd preimages $(n-1)/3$ landing in $I_j$, normalized by $|I_j|$, and observing that the resulting ratios converge exponentially fast to the limiting densities (see the detailed preimage counting in the arithmetic section where $a,b$ are defined). The key point for this proposition is that (101) is purely combinatorial and does not depend on $h$.
(2) Growth control for block averages $c_j$. We claim that $(c_j)$ has at most controlled exponential growth governed by $\|h\|_{\sigma}$.
For $n\in I_j$ we have $n\asymp 6^{j}$, so $n^{\sigma}\le(2\cdot 6^{j})^{\sigma}$. Then
\[ |c_j|\le\frac{1}{|I_j|}\sum_{n\in I_j}|h(n)| = \frac{1}{|I_j|}\sum_{n\in I_j}n^{\sigma}\,|h(n)|\,n^{-\sigma}\le\frac{(2\cdot 6^{j})^{\sigma}}{|I_j|}\sum_{n\in I_j}|h(n)|\,n^{-\sigma}. \]
Since $|I_j|=6^{j}$ and $\sum_{n\in I_j}|h(n)|\,n^{-\sigma}\le\|h\|_{\sigma}$, we obtain
\[ |c_j|\le C_0\,6^{(\sigma-1)j}\,\|h\|_{\sigma}\qquad\text{for all } j\ge 0, \tag{103} \]
for some constant $C_0$ depending only on $\sigma$ and the block geometry. Thus $c_j$ is at most exponentially growing, with a rate depending only on $\sigma$ (and this bound is uniform in $h$ up to the factor $\|h\|_{\sigma}$).
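The growth bound can be sanity-checked on a concrete function. The following sketch assumes the weighted norm $\|h\|_{\sigma}=\sum_{n\ge 1}|h(n)|\,n^{-\sigma}$ (as used in the proof of Lemma 5.17), truncates it at the top of the largest block considered, and uses the hypothetical test function $h(n)=1/n$ with $\sigma=1.5$. The constant $2^{\sigma}$ is the one produced by the estimate above; all names are illustrative.

```python
def block(j):
    """The block I_j = [6^j, 2*6^j) as a range of integers."""
    return range(6 ** j, 2 * 6 ** j)

def truncated_weighted_norm(h, n_max, sigma):
    """Partial sum of |h(n)| * n^(-sigma) for 1 <= n <= n_max (a lower bound for the full norm)."""
    return sum(abs(h(n)) * n ** (-sigma) for n in range(1, n_max + 1))

def block_average(h, j):
    """Average of h over the block I_j."""
    I = block(j)
    return sum(h(n) for n in I) / len(I)

sigma = 1.5
h = lambda n: 1.0 / n            # hypothetical test function, not the invariant density
J = 4
norm = truncated_weighted_norm(h, 2 * 6 ** J, sigma)

# Check |c_j| <= 2^sigma * 6^((sigma-1) j) * ||h||_sigma for j = 0..J.
checks = [
    abs(block_average(h, j)) <= 2 ** sigma * 6 ** ((sigma - 1) * j) * norm
    for j in range(J + 1)
]
print(checks)
```

Since the truncated norm already contains every term indexed by $I_0,\dots,I_J$, the inequality holds with the truncated value and therefore also with the full norm.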
(3) Passing from $(a_j,b_j)$ to constants $(a,b)$. Rewrite (99) as
\[ c_j = a\,c_{j+1}+b\,c_{j-1}+\varepsilon_j, \]
where we define
\[ \varepsilon_j := \eta_j+(a_j-a)\,c_{j+1}+(b_j-b)\,c_{j-1}. \tag{104} \]
The relation (97) is just this identity.
It remains to prove the weighted summability $\sum_{j\ge 0}\vartheta^{j}|\varepsilon_j|<\infty$.
By (100), the contribution of $\eta_j$ is already summable. For the remaining terms, use (101) and (103):
\[ |(a_j-a)\,c_{j+1}|\le C\delta^{j}|c_{j+1}|\le C\delta^{j}\,C_0\,6^{(\sigma-1)(j+1)}\,\|h\|_{\sigma}, \]
and similarly
\[ |(b_j-b)\,c_{j-1}|\le C\delta^{j}\,C_0\,6^{(\sigma-1)(j-1)}\,\|h\|_{\sigma} \]
for $j\ge 1$. Therefore
\[ \sum_{j\ge 0}\vartheta^{j}|(a_j-a)\,c_{j+1}|\le C_1\|h\|_{\sigma}\sum_{j\ge 0}\bigl(\vartheta\,\delta\,6^{\sigma-1}\bigr)^{j},\qquad \sum_{j\ge 1}\vartheta^{j}|(b_j-b)\,c_{j-1}|\le C_2\|h\|_{\sigma}\sum_{j\ge 1}\bigl(\vartheta\,\delta\,6^{\sigma-1}\bigr)^{j-1}, \]
for suitable constants $C_1,C_2$ depending only on $C$, $C_0$.
Since $\delta<1$ is fixed by the combinatorics and $\vartheta\in(0,1)$ is under our control, we may (and do) assume that $\vartheta$ has been chosen small enough so that
\[ \vartheta\,\delta\,6^{\sigma-1}<1. \tag{105} \]
(Any choice of $(\alpha,\vartheta,\sigma)$ used later must satisfy this together with the constraints from the Lasota–Yorke estimates; this is compatible with the parameter regime considered.)
Under condition (105), both geometric series above converge, and we conclude that
\[ \sum_{j\ge 0}\vartheta^{j}\bigl(|(a_j-a)\,c_{j+1}|+|(b_j-b)\,c_{j-1}|\bigr)<\infty. \]
Combining with (100) and the definition (104), we obtain
\[ \sum_{j\ge 0}\vartheta^{j}|\varepsilon_j|<\infty, \]
i.e. (98) holds. This completes the proof. □
The matrix of the associated homogeneous recursion,
\[ M=\begin{pmatrix}0 & a\\ b & 0\end{pmatrix}, \]
has eigenvalues $\pm\sqrt{ab}$. Under the parameter choice $(\alpha,\vartheta)=(\tfrac12,\tfrac{1}{20})$, the odd-branch contraction constant computed in Section 4.4 implies $ab<1$, hence $\rho(M)=\sqrt{ab}<1$. The inequality $\rho(M)<1$ means that deviations of successive block averages from constancy decay geometrically along the scale index $j$. This discrete contraction is the block-level reflection of the Lasota–Yorke inequality on $\mathcal{B}_{\mathrm{tree},\sigma}$, confirming that the invariant density must be asymptotically flat across scales.
Lemma 5.15
(Raw preimage densities). Let $I_j=[6^j,2\cdot 6^j)\cap\mathbb{N}$ and define the even and odd preimage windows
\[ E_j^* = \{2m : m\in I_j\},\qquad O_j^* = \{(m-1)/3 : m\in I_j,\ m\equiv 4 \pmod 6\}. \]
Then the normalized preimage counts
\[ a_j := \frac{|E_j^*|}{|I_j|},\qquad b_j := \frac{|O_j^*|}{|I_j|} \]
satisfy
\[ a_j\to 1,\qquad b_j\to\frac16. \]
These ratios describe the combinatorial preimage densities. However, the coefficients in the block recursion
\[ c_j = a_j\,c_{j+1}+b_j\,c_{j-1}+\varepsilon_j \]
are normalized mass–redistribution weights and therefore satisfy
\[ a_j+b_j=1,\qquad 0<b_j<a_j<1, \]
with limiting values $a,b$ determined by the relative contribution of even and odd branches to block averages, not by the raw cardinality ratios $a_j,b_j$ defined above.
Proof. 
Each block $I_j=[6^j,2\cdot 6^j)$ contains exactly $6^{j}$ integers, so
\[ |I_j|=6^{j}. \]
Even preimages. For every $m\in I_j$ the even preimage $2m$ is well defined, and $2m\ne 2m'$ whenever $m\ne m'$. Hence
\[ E_j^* = \{2m : m\in I_j\} \]
has cardinality
\[ |E_j^*| = |I_j| = 6^{j}. \]
Thus the raw even-preimage density is
\[ a_j := \frac{|E_j^*|}{|I_j|}=1\qquad\text{for all } j, \]
and therefore $\lim_{j\to\infty}a_j=1$.
Odd preimages. Odd preimages arise precisely from integers $m\in I_j$ satisfying $m\equiv 4 \pmod 6$, and the map $m\mapsto(m-1)/3$ is injective on this set. Among the $6^{j}$ integers in $I_j$, exactly one out of every six lies in the class $4 \pmod 6$, up to $O(1)$ boundary terms. Hence
\[ |O_j^*| = \frac16\,6^{j}+O(1), \]
and therefore
\[ b_j := \frac{|O_j^*|}{|I_j|}=\frac16+O(6^{-j}). \]
Thus $\lim_{j\to\infty}b_j=1/6$, with geometric convergence.
Conclusion. The raw preimage densities
\[ a_j=\frac{|E_j^*|}{|I_j|},\qquad b_j=\frac{|O_j^*|}{|I_j|}, \]
converge to the limits
\[ a:=\lim_{j\to\infty}a_j=1,\qquad b:=\lim_{j\to\infty}b_j=\frac16. \]
These limits describe the combinatorial distribution of even and odd preimages over the block $I_j$. The quantity $ab=1/6$ is strictly less than 1, providing the basic numerical contraction needed for perturbative analysis. □
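The counts in the proof are easy to reproduce by brute force for small $j$. The sketch below enumerates the two windows exactly as defined in the lemma; the function names are illustrative. For $j\ge 1$ the left endpoint $6^{j}$ and the block length are both multiples of 6, so in this range the class $4 \pmod 6$ is hit exactly $6^{j-1}$ times and the boundary term vanishes.

```python
from fractions import Fraction

def block(j):
    """The block I_j = [6^j, 2*6^j) as a range of integers."""
    return range(6 ** j, 2 * 6 ** j)

def even_window(j):
    """E_j^* = {2m : m in I_j}."""
    return {2 * m for m in block(j)}

def odd_window(j):
    """O_j^* = {(m-1)/3 : m in I_j, m = 4 mod 6}; each element is an odd integer."""
    return {(m - 1) // 3 for m in block(j) if m % 6 == 4}

for j in range(1, 6):
    size = len(block(j))
    a_raw = Fraction(len(even_window(j)), size)
    b_raw = Fraction(len(odd_window(j)), size)
    print(j, a_raw, b_raw)   # raw densities 1 and 1/6 for every j >= 1
```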
Remark  5.16 (Relation to the normalized block coefficients)
The ratios computed above,
\[ a=\lim_{j\to\infty}\frac{|E_j^*|}{|I_j|}=1,\qquad b=\lim_{j\to\infty}\frac{|O_j^*|}{|I_j|}=\frac16, \]
are purely combinatorial preimage densities. They do not coincide with the coefficients $a,b$ in the block recursion
\[ c_j = a\,c_{j+1}+b\,c_{j-1}+\varepsilon_j, \]
because that recursion involves mass redistribution between adjacent blocks, not just counts of preimages. The normalized coefficients of Lemma 5.4 satisfy
\[ a+b=1,\qquad 0<b<a<1, \]
and are obtained by dividing the even and odd contributions by the total incoming mass at scale $j$, not by the raw window sizes.
Thus the values $a=1$, $b=1/6$ here and the normalized values $a=\tfrac67$, $b=\tfrac17$ (from the block recursion) describe different quantities. Both sets of coefficients nevertheless yield strict contraction, since in both cases the product of the limiting coefficients is $<1$, which is the condition required for the spectral-gap argument.

5.3. Explicit Block Coefficients and Summable Error Terms

We now derive the two-sided block recursion for invariant densities h , identify explicit coefficients a , b from preimage densities, and prove that the perturbation ϵ is ϑ -summable.
Lemma 5.17
(Size bounds for mid-band averages). Let $I_j=[6^j,2\cdot 6^j)\cap\mathbb{N}$ and define
\[ U_j^{\mathrm{even}} := 2I_j\subset[2\cdot 6^{j},4\cdot 6^{j})\cap\mathbb{N},\qquad U_{j-1}^{\mathrm{odd}} := J_{j-1}\subset[2\cdot 6^{j-1},4\cdot 6^{j-1})\cap\mathbb{N}, \]
where $J_{j-1}$ is the set of admissible odd preimages whose forward image under $T$ lies in $I_j$. For $h\in\mathcal{B}_{\mathrm{tree},\sigma}$ with $\sigma>1$ define
\[ A(U) := \frac{1}{|U|}\sum_{m\in U}h(m) \]
for any finite $U\subset\mathbb{N}$. Then there exists a constant $C>0$, depending only on $\sigma$ and the block geometry, such that for all $j\ge 0$
\[ \bigl|A(U_j^{\mathrm{even}})\bigr|\le C\,6^{(\sigma-1)j}\,\|h\|_{\sigma}, \tag{106} \]
and for all $j\ge 1$
\[ \bigl|A(U_{j-1}^{\mathrm{odd}})\bigr|\le C\,6^{(\sigma-1)(j-1)}\,\|h\|_{\sigma}. \tag{107} \]
In particular, the mid-band averages grow at most like $6^{(\sigma-1)j}$ with the scale index; no comparison with the block averages $c_{j\pm 1}$ is asserted.
Proof. 
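We prove (106); the odd case is analogous.
For $U_j^{\mathrm{even}}=2I_j$ we have
\[ A(U_j^{\mathrm{even}})=\frac{1}{|U_j^{\mathrm{even}}|}\sum_{m\in U_j^{\mathrm{even}}}h(m). \]
Since $U_j^{\mathrm{even}}\subset[2\cdot 6^{j},4\cdot 6^{j})$, every $m\in U_j^{\mathrm{even}}$ satisfies $m\le 4\cdot 6^{j}$. Using the definition of the weighted norm $\|h\|_{\sigma}=\sum_{n\ge 1}|h(n)|\,n^{-\sigma}$, we obtain
\[ \sum_{m\in U_j^{\mathrm{even}}}|h(m)| = \sum_{m\in U_j^{\mathrm{even}}}|h(m)|\,m^{-\sigma}\,m^{\sigma}\le(4\cdot 6^{j})^{\sigma}\sum_{m\in U_j^{\mathrm{even}}}|h(m)|\,m^{-\sigma}\le(4\cdot 6^{j})^{\sigma}\,\|h\|_{\sigma}. \]
Moreover, $|U_j^{\mathrm{even}}|=|2I_j|=|I_j|=6^{j}$. Hence
\[ \bigl|A(U_j^{\mathrm{even}})\bigr|\le\frac{1}{|U_j^{\mathrm{even}}|}\sum_{m\in U_j^{\mathrm{even}}}|h(m)|\le\frac{(4\cdot 6^{j})^{\sigma}}{6^{j}}\,\|h\|_{\sigma}=4^{\sigma}\,6^{(\sigma-1)j}\,\|h\|_{\sigma}, \]
which proves (106) with $C:=4^{\sigma}$.
For $U_{j-1}^{\mathrm{odd}}\subset[2\cdot 6^{j-1},4\cdot 6^{j-1})$ the same argument gives
\[ \sum_{m\in U_{j-1}^{\mathrm{odd}}}|h(m)|\le(4\cdot 6^{j-1})^{\sigma}\,\|h\|_{\sigma}, \]
and by construction $|U_{j-1}^{\mathrm{odd}}|\asymp 6^{j-1}$ with constants independent of $j$ (since $J_{j-1}$ is a fixed positive fraction of that band). Therefore,
\[ \bigl|A(U_{j-1}^{\mathrm{odd}})\bigr|\le C\,6^{(\sigma-1)(j-1)}\,\|h\|_{\sigma} \]
for some $C>0$ depending only on $\sigma$ and the fixed band geometry. This establishes (107). □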
Remark  5.18 (Interpretation of the coefficients a,b)
The constants $a$ and $b$ record the asymptotic proportions of even and odd preimages that land in the adjacent scale blocks $I_{j+1}$ and $I_{j-1}$ when one averages the invariance relation $Ph=h$ over $I_j$. Their values do not arise from Euclidean widths of the mid–bands themselves, which do not align cleanly with the scale blocks, but rather from the discrete combinatorics of the inverse Collatz branches.
Concretely, each $n\in I_j$ always has an even preimage $2n$, and among the $6^{j}$ points in $I_j$ exactly a fraction $1/6+o(1)$ satisfy $n\equiv 4 \pmod 6$ and therefore admit an admissible odd preimage $(n-1)/3$. Thus the total number of adjacent–scale preimages contributing to the block balance is
\[ |E_j^*|+|O_j^*| = \left(1+\frac16+o(1)\right)|I_j|, \]
and the normalized coefficients
\[ a_j=\frac{|E_j^*|}{|E_j^*|+|O_j^*|},\qquad b_j=\frac{|O_j^*|}{|E_j^*|+|O_j^*|}, \]
satisfy $a_j\to a=\tfrac67$ and $b_j\to b=\tfrac17$. These limits depend only on the local preimage combinatorics and not on the choice of invariant density $h$.
The essential feature is that $a,b>0$ and $a+b=1$ with $b<a<1$, so the associated matrix $M=\begin{pmatrix}0 & a\\ b & 0\end{pmatrix}$ has spectral radius $\rho(M)=\sqrt{ab}<1$, which guarantees a contracting second–order recurrence for the block averages.
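The normalized coefficients of this remark can be computed exactly from the window sizes. This is a minimal sketch with an illustrative function name; it counts the residue class $4 \pmod 6$ directly rather than relying on the asymptotic fraction $1/6+o(1)$.

```python
from fractions import Fraction

def normalized_coefficients(j):
    """Return (a_j, b_j) = (|E|, |O|) / (|E| + |O|) for the block I_j = [6^j, 2*6^j)."""
    size = 6 ** j
    even = size  # every n in I_j has the even preimage 2n
    odd = sum(1 for n in range(size, 2 * size) if n % 6 == 4)  # admissible odd preimages
    total = even + odd
    return Fraction(even, total), Fraction(odd, total)

for j in range(0, 5):
    print(j, normalized_coefficients(j))
```

For $j\ge 1$ the pair is exactly $(6/7,\,1/7)$; at $j=0$ the block is $\{1\}$, which has no admissible odd preimage, so the pair is $(1,0)$.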
Theorem 5.19
(Spectral bound for block averages). Let $0<\alpha<1$, $0<\vartheta<1$, $\sigma>1$, and let $h\in\mathcal{B}_{\mathrm{tree},\sigma}$ satisfy $Ph=h$. Let $c_j$ be the block averages of $h$ on $I_j=[6^j,2\cdot 6^j)\cap\mathbb{N}$, and suppose that they satisfy the effective recursion of Proposition 5.14:
\[ c_j = a\,c_{j+1}+b\,c_{j-1}+\varepsilon_j,\qquad j\ge 1, \tag{108} \]
with constants $a,b>0$ independent of $j$ and an error sequence $(\varepsilon_j)_{j\ge 1}$ such that
\[ \sum_{j\ge 1}|\varepsilon_j|<\infty. \tag{109} \]
Assume moreover (as ensured by the preimage counting) that
\[ a+b=1\quad\text{and}\quad 0<b<a<1. \tag{110} \]
Then there exist $C\in\mathbb{C}$ and $\rho\in(0,1)$ such that
\[ |c_j-C|\le C_0\,\rho^{j}\qquad (j\ge 1), \tag{111} \]
for some constant $C_0$ depending on $h$, $a$, $b$ and $\sum_j|\varepsilon_j|$. In particular, $(c_j)$ converges exponentially fast to the limit $C$.
Proof. (1) Homogeneous recursion and characteristic roots. Ignoring $\varepsilon_j$ for the moment, the homogeneous recurrence
\[ c_j = a\,c_{j+1}+b\,c_{j-1},\qquad j\ge 1, \]
can be rewritten as
\[ a\,c_{j+1}-c_j+b\,c_{j-1}=0. \]
Looking for solutions of the form $c_j=r^{j}$ leads to the quadratic
\[ a\,r^{2}-r+b=0. \]
Since $a+b=1$ by (110), we immediately see that $r_1=1$ is a root, and the other root $r_2$ is determined by $r_1r_2=b/a$, so
\[ r_2=\frac{b}{a}. \tag{113} \]
The hypotheses $0<b<a<1$ give $0<r_2<1$. Thus the homogeneous solution space consists of
\[ c_j^{\mathrm{hom}} = C_1+C_2\,r_2^{\,j}, \]
and the nonconstant mode decays geometrically at rate $r_2$.
(2) Matrix formulation and summable forcing. We now incorporate the perturbation $(\varepsilon_j)$.
From (108),
\[ a\,c_{j+1}=c_j-b\,c_{j-1}-\varepsilon_j, \]
so
\[ c_{j+1}=\frac{1}{a}\,c_j-\frac{b}{a}\,c_{j-1}-\frac{1}{a}\,\varepsilon_j,\qquad j\ge 1. \tag{114} \]
Introduce
\[ u_j := \begin{pmatrix}c_j\\ c_{j-1}\end{pmatrix},\qquad \eta_j := \begin{pmatrix}-\varepsilon_j/a\\ 0\end{pmatrix}, \]
and
\[ A := \begin{pmatrix}1/a & -b/a\\ 1 & 0\end{pmatrix}. \]
Then (114) is equivalent to
\[ u_{j+1}=A\,u_j+\eta_j,\qquad j\ge 1. \tag{115} \]
The eigenvalues of $A$ are exactly the characteristic roots $r_1=1$ and $r_2=b/a$ from (113). Let $P_1$ and $P_2$ be the spectral projectors associated to $r_1$ and $r_2$, so $P_1+P_2=I$ and
\[ A P_1=P_1,\qquad A P_2=r_2P_2. \]
Iterating (115) gives
\[ u_j = A^{j-1}u_1+\sum_{k=1}^{j-1}A^{j-1-k}\eta_k. \]
Decompose $u_1=P_1u_1+P_2u_1$ and each forcing term $\eta_k=P_1\eta_k+P_2\eta_k$. Using $A^{n}P_1=P_1$ and $A^{n}P_2=r_2^{\,n}P_2$,
\[ u_j = P_1u_1+r_2^{\,j-1}P_2u_1+\sum_{k=1}^{j-1}\bigl(P_1\eta_k+r_2^{\,j-1-k}P_2\eta_k\bigr). \tag{116} \]
By construction
\[ \|\eta_k\|\le\frac{1}{a}\,|\varepsilon_k| \]
(up to an absolute constant coming from the choice of norm on $\mathbb{R}^{2}$). The assumption (109) then implies
\[ \sum_{k\ge 1}\|\eta_k\|<\infty. \]
In particular, the series $\sum_{k\ge 1}P_1\eta_k$ converges to some limit $w_1$.
For the $P_2$–component, note that
\[ \Bigl\|\sum_{k=1}^{j-1}r_2^{\,j-1-k}P_2\eta_k\Bigr\|\le C\sum_{k=1}^{j-1}|r_2|^{\,j-1-k}\,\|\eta_k\| \]
for some constant $C>0$. Fix $\varepsilon>0$. By absolute summability of $\|\eta_k\|$, we can choose $K$ such that $\sum_{k>K}\|\eta_k\|\le\varepsilon$. Then for $j>K$,
\[ \sum_{k=1}^{j-1}|r_2|^{\,j-1-k}\,\|\eta_k\|\le\sum_{k=1}^{K}|r_2|^{\,j-1-k}\,\|\eta_k\|+\varepsilon\sum_{k=K+1}^{j-1}|r_2|^{\,j-1-k}. \]
The first sum tends to 0 as $j\to\infty$ because $|r_2|<1$ and $k\le K$ is fixed; the second sum is bounded by $\varepsilon/(1-|r_2|)$. Since $\varepsilon>0$ is arbitrary, the entire $P_2$–tail in (116) converges to 0 as $j\to\infty$.
Moreover, $r_2^{\,j-1}P_2u_1\to 0$ as $j\to\infty$. Thus from (116) we obtain
\[ u_j\to u_\infty := P_1u_1+w_1\qquad (j\to\infty). \]
Since $AP_1=P_1$ and $\sum_kP_1\eta_k$ converges, $u_\infty$ is a fixed point of the affine map $u\mapsto Au+\eta$ in the limit, and the convergence is geometric in $j$ because all deviations along the $P_2$–direction decay like $|r_2|^{j}$.
Projecting onto the first coordinate of $u_j=(c_j,c_{j-1})^{\top}$, we obtain
\[ c_j\to C \]
for some constant $C\in\mathbb{C}$, and in fact there exist $\rho\in(0,1)$ (any number strictly between $|r_2|$ and 1) and $C_0>0$ such that
\[ |c_j-C|\le C_0\,\rho^{j}\qquad (j\ge 1). \]
This is (111), which completes the proof. □
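The mechanism of the proof can be observed by iterating (114) directly. The sketch below is an illustration only: it assumes $a=6/7$, $b=1/7$ (so $r_2=1/6$), hypothetical initial data, and a hypothetical geometrically decaying forcing $\varepsilon_j=2^{-j}$, which is stronger than the bare summability assumed in (109).

```python
a, b = 6 / 7, 1 / 7   # assumed limiting coefficients; r_2 = b/a = 1/6

def simulate(c0, c1, eps, steps):
    """Iterate c_{j+1} = (c_j - b*c_{j-1} - eps(j)) / a for j = 1..steps-1."""
    c = [c0, c1]
    for j in range(1, steps):
        c.append((c[j] - b * c[j - 1] - eps(j)) / a)
    return c

eps = lambda j: 0.5 ** j          # hypothetical summable forcing
c = simulate(0.0, 1.0, eps, 60)

# Successive differences obey D_{j+1} = r_2 * D_j - eps_j / a, so they shrink geometrically.
last_gap = abs(c[-1] - c[-2])
residual = max(abs(c[j] - (a * c[j + 1] + b * c[j - 1] + eps(j))) for j in range(1, 59))
print(last_gap < 1e-9, residual < 1e-9)
```

The identity for the differences follows from $1/a-1=b/a$ when $a+b=1$, which is the same cancellation that makes $r_1=1$ a characteristic root.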
Lemma 5.20
(Block–averaged asymptotics for the invariant density). Let $P$ act on $\mathcal{B}_{\mathrm{tree},\sigma}$ with $\sigma>1$, and assume the spectral hypothesis: $P$ is quasi–compact on $\mathcal{B}_{\mathrm{tree},\sigma}$ with spectral radius 1, strictly smaller essential spectral radius, and no other spectrum on the unit circle. Let $h\in\mathcal{B}_{\mathrm{tree},\sigma}$ be the unique strictly positive eigenfunction with $Ph=h$, normalized by $\phi(h)=1$ for the dual eigenfunctional $\phi$.
For each $j\ge 0$ define the block masses and block–averaged rescaled values
\[ H_j := \sum_{n\in I_j}h(n),\qquad c_j := \frac{1}{6^{j}}\sum_{n\in I_j}n\,h(n),\qquad I_j=[6^j,2\cdot 6^j)\cap\mathbb{N}. \]
Then there exist constants $c_\infty>0$, $C>0$, and $0<\rho<1$ (depending only on the parameters of the transfer–operator framework) such that
\[ |c_j-c_\infty|\le C\,\rho^{j}\qquad\text{for all } j\ge 0. \tag{117} \]
In particular,
\[ \frac{1}{6^{j}}\sum_{n\in I_j}n\,h(n)\longrightarrow c_\infty\qquad (j\to\infty), \]
so the block–averaged quantities $n\,h(n)$ converge exponentially fast to a positive constant when averaged over the multiscale blocks $I_j$.
Proof. 
By Proposition 5.6 applied to the invariant density $h$, the block averages
\[ c_j=\frac{1}{6^{j}}\sum_{n\in I_j}n\,h(n) \]
satisfy a second–order linear recursion with exponentially decaying perturbation. More precisely, there exist coefficients $a_j,b_j>0$ and an error term $\epsilon_j$ such that
\[ c_j = a_j\,c_{j+1}+b_j\,c_{j-1}+\epsilon_j,\qquad j\ge 1, \tag{118} \]
with
\[ |a_j-a|+|b_j-b|\le C_0\,\delta^{j},\qquad |\epsilon_j|\le C_0\,\delta^{j}, \tag{119} \]
for some constants $a,b>0$, $C_0>0$, and $0<\delta<1$. The uniform convergence $a_j\to a$, $b_j\to b$ at an exponential rate is exactly the content of Lemma 5.5, applied to the coefficient matrices $M_j$ encoding the recursion for the block masses. The positivity of $a,b$ and the spectral gap on $\mathcal{B}_{\mathrm{tree},\sigma}$ imply that the associated limiting $2\times 2$ matrix has spectral radius strictly less than 1 on the subspace of fluctuations around the invariant profile.
Introduce the two–component vector
\[ u_j := \begin{pmatrix}c_j\\ c_{j-1}\end{pmatrix},\qquad j\ge 1. \]
Rewriting (118) as a first–order system, we obtain
\[ u_{j+1}=M_j\,u_j+r_j, \tag{120} \]
where
\[ M_j=\begin{pmatrix}\dfrac{1}{a_j} & -\dfrac{b_j}{a_j}\\[1ex] 1 & 0\end{pmatrix},\qquad r_j=\begin{pmatrix}-\epsilon_j/a_j\\ 0\end{pmatrix}. \]
By (119) the matrices $M_j$ converge exponentially fast to a limiting matrix
\[ M=\begin{pmatrix}\dfrac{1}{a} & -\dfrac{b}{a}\\[1ex] 1 & 0\end{pmatrix}, \]
and the perturbations $r_j$ satisfy $\|r_j\|\le C_1\delta^{j}$ for some $C_1>0$.
The key spectral input, already used in the proof of Theorem 6.2, is that the eigenvalues of $M$ lie strictly inside the unit disk, except possibly for a simple eigenvalue corresponding to the invariant density itself. More concretely, the spectral gap for $P$ on $\mathcal{B}_{\mathrm{tree},\sigma}$ implies that fluctuations of the block averages around their invariant profile are exponentially contracted, which translates exactly into
\[ \|M^{k}v\|\le C_2\,\rho^{k}\,\|v\|\quad\text{for all } k\ge 0 \text{ and all } v \text{ orthogonal (in the spectral sense) to the invariant direction}, \tag{121} \]
for some $0<\rho<1$ and $C_2>0$. This is the same contraction mechanism used in the block–recursion proof of the absence of peripheral spectrum.
Standard perturbation theory for non–autonomous linear recurrences (120) with exponentially small deviations from a contractive limiting matrix now yields exponential convergence of $u_j$ to a limit vector $u_\infty$. Indeed, iterating (120) gives
\[ u_j = M_{j-1}\cdots M_1\,u_1+\sum_{k=1}^{j-1}M_{j-1}\cdots M_{k+1}\,r_k. \]
The product $M_{j-1}\cdots M_1$ converges exponentially fast to the rank–one projector onto the invariant direction, and the inhomogeneous sum converges absolutely because $\|r_k\|$ decays like $\delta^{k}$ while the products $M_{j-1}\cdots M_{k+1}$ inherit the contraction (121) on the fluctuation component. Consequently, there exist $u_\infty\in\mathbb{R}^{2}$, $C>0$, and $0<\rho<1$ such that
\[ \|u_j-u_\infty\|\le C\,\rho^{j}\qquad\text{for all } j\ge 0. \]
Writing $u_\infty=(c_\infty,c_\infty)^{\top}$ for some $c_\infty>0$, the first component of this convergence statement is precisely
\[ |c_j-c_\infty|\le C\,\rho^{j}, \]
which is (117). This shows that the block–averaged quantities $n\,h(n)$ converge exponentially fast, in the sense of the normalized block averages $c_j$, to a finite positive constant $c_\infty$ determined by the invariant density $h$ and the block recursion.
No pointwise asymptotic of the form h ( n ) c / n is claimed; the lemma asserts only the block–averaged convergence (117), which is exactly what is justified by the existing block–recursion machinery and the spectral gap for P on B tree , σ . □
Extension to isolated divergent trajectories. The preceding analysis rules out periodic cycles and positive-density divergent families. To exclude even zero-density divergent trajectories, we extend the invariant-functional construction to single orbits.
Proposition 5.21
(Zero-density divergent orbits also induce invariants). Let $x_0\in\mathbb{N}$ and let $x_{k+1}=T(x_k)$ be a forward Collatz orbit. Assume the orbit visits infinitely many scales: there exist a strictly increasing sequence $(j_r)_{r\ge 1}$ and times $k_r$ with $x_{k_r}\in I_{j_r}$ for all $r$.
For each scale level define the normalizing weight
\[ w_j := \min\bigl\{\vartheta^{j},\,6^{-\sigma j}\bigr\}, \]
and set
\[ \varphi_N := \frac{1}{W_N}\sum_{r\le N}w_{j_r}\,\delta_{x_{k_r}},\qquad W_N := \sum_{r\le N}w_{j_r}. \]
Then:
1. $\sup_N\|\varphi_N\|_{*}<\infty$;
2. The Cesàro averages
\[ \Phi_N := \frac{1}{N}\sum_{m=0}^{N-1}(P^{*})^{m}\varphi_N \]
form a bounded net in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$;
3. Every weak-* cluster point $\Phi$ satisfies $P^{*}\Phi=\Phi$ and $\Phi\ne 0$;
4. Consequently,
\[ \ell(f) := \langle f,\Phi\rangle \]
defines a nontrivial $P$-invariant functional on $\mathcal{B}_{\mathrm{tree},\sigma}$.
Proof. 
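If $n\in I_j$, then by definition of the dual norm,
\[ \|\delta_n\|_{*}\lesssim\vartheta^{-j}+6^{\sigma j}. \]
Hence
\[ \|\delta_n\|_{*}\lesssim\frac{1}{\min\{\vartheta^{j},\,6^{-\sigma j}\}}. \]
Choice of weights. Define
\[ w_j := \min\bigl\{\vartheta^{j},\,6^{-\sigma j}\bigr\}. \]
For any $r$ with $x_{k_r}\in I_{j_r}$,
\[ w_{j_r}\,\|\delta_{x_{k_r}}\|_{*}\lesssim 1. \]
Thus for each $N$,
\[ \|\varphi_N\|_{*}=\Bigl\|\frac{1}{W_N}\sum_{r\le N}w_{j_r}\,\delta_{x_{k_r}}\Bigr\|_{*}\le\frac{1}{W_N}\sum_{r\le N}w_{j_r}\,\|\delta_{x_{k_r}}\|_{*}\lesssim\frac{N}{W_N}. \]
Since $w_j>0$ decays exponentially and the orbit hits infinitely many levels, $W_N\to\infty$. Hence
\[ \sup_N\|\varphi_N\|_{*}<\infty. \]
Boundedness of Cesàro averages of $\varphi_N$. Since $P^{*}$ is power–bounded on $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$,
\[ \|(P^{*})^{m}\varphi_N\|_{*}\le C_{*}\,\|\varphi_N\|_{*}\lesssim 1. \]
Hence
\[ \|\Phi_N\|_{*}=\Bigl\|\frac{1}{N}\sum_{m=0}^{N-1}(P^{*})^{m}\varphi_N\Bigr\|_{*}\le\frac{1}{N}\sum_{m=0}^{N-1}\|(P^{*})^{m}\varphi_N\|_{*}\lesssim 1. \]
Thus the Cesàro averages form a bounded net.
Existence of weak-* cluster points. By Banach–Alaoglu, the bounded family $(\Phi_N)$ has weak-* cluster points. Let $\Phi$ be one such limit.
Invariance $P^{*}\Phi=\Phi$. Since $(P^{*})^{m}\varphi_N-\varphi_N$ has norm $\lesssim 1$, the usual Cesàro identity gives
\[ \langle f,\,P^{*}\Phi-\Phi\rangle=0\qquad\text{for every } f\in\mathcal{B}_{\mathrm{tree},\sigma}. \]
Hence $P^{*}\Phi=\Phi$.
Nontriviality. Each $\varphi_N$ is a probability measure, so
\[ \langle 1,\varphi_N\rangle=1\ \Longrightarrow\ \langle 1,\Phi_N\rangle=1\ \Longrightarrow\ \langle 1,\Phi\rangle=1. \]
Thus $\Phi\ne 0$. Define $\ell(f)=\langle f,\Phi\rangle$. Then $P$–invariance follows:
\[ \ell(Pf)=\langle Pf,\Phi\rangle=\langle f,P^{*}\Phi\rangle=\langle f,\Phi\rangle=\ell(f). \]
Moreover $\ell$ is nonzero since $\ell(1)=1$. This proves the proposition. □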
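To make the bookkeeping of the construction concrete, the sketch below records when a forward orbit visits a scale block $I_j=[6^j,2\cdot 6^j)$ and evaluates the weights $w_j$. It is only an illustration of the notation: the orbit of 7 used here is finite and terminates at 1, whereas the proposition concerns hypothetical orbits visiting infinitely many scales. The function names, the value $\sigma=1.5$, and the reading $w_j=\min\{\vartheta^{j},6^{-\sigma j}\}$ are assumptions.

```python
def collatz_step(n):
    """One step of the forward Collatz map T."""
    return n // 2 if n % 2 == 0 else 3 * n + 1

def block_index(n):
    """Return j with 6^j <= n < 2*6^j, or None if n lies in no block I_j."""
    j = 0
    while 6 ** j <= n:
        if n < 2 * 6 ** j:
            return j
        j += 1
    return None

def weight(j, theta=1 / 20, sigma=1.5):
    """Assumed normalizing weight w_j = min(theta^j, 6^(-sigma*j))."""
    return min(theta ** j, 6.0 ** (-sigma * j))

def block_visits(x0, steps):
    """List (k, x_k, j) for the first `steps` orbit points x_k that lie in some block I_j."""
    visits, x = [], x0
    for k in range(steps):
        j = block_index(x)
        if j is not None:
            visits.append((k, x, j))
        x = collatz_step(x)
    return visits

visits = block_visits(7, 17)
W = sum(weight(j) for _, _, j in visits)   # analogue of W_N for this finite sample
print(visits)
print(W)
```

Note that the blocks $I_j$ do not cover $\mathbb{N}$ (for example, 12 lies in no block), so only some orbit points contribute.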
Together with the quasi-compactness and spectral-gap results, this ensures that every possible non-terminating configuration would produce a nonzero invariant functional in B tree , σ * , contradicting the established gap. Section 5.3.1 therefore completes the proof by verifying the quantitative bound λ odd < 1 .

5.3.1. Explicit Lasota–Yorke constants

To complete the spectral argument, we verify that the explicit constants $(\alpha,\vartheta)=\bigl(\tfrac12,\tfrac{1}{20}\bigr)$ indeed yield $\lambda_{\mathrm{odd}}<1$.
Recall the odd–branch distortion constant governing the level shift $j\mapsto j-1$:
\[ \lambda_{\mathrm{odd}}(\alpha,\vartheta)\le\frac{C_\alpha}{\sqrt6}\,\vartheta,\qquad C_\alpha := \sup_{\substack{u>v>0\\ u\equiv v\equiv 4\ (6)}}\frac{W_\alpha(u',v')}{W_\alpha(u,v)}, \]
where $(u',v')=\bigl(\tfrac{u-1}{3},\tfrac{v-1}{3}\bigr)$ are the odd preimages.
At $\alpha=\tfrac12$, Lemma 4.14 gives
\[ C_{1/2}=\frac{16}{3^{3/2}}<3.1. \]
Hence for the updated contraction parameter $\vartheta=\tfrac{1}{20}$,
\[ \lambda_{\mathrm{odd}}\bigl(\tfrac12,\tfrac{1}{20}\bigr)\le\frac{16}{3^{3/2}\sqrt6}\cdot\frac{1}{20}. \]
Using $3^{3/2}\sqrt6=3\sqrt{18}\approx 12.7279$, we obtain
\[ \lambda_{\mathrm{odd}}\le\frac{16}{12.7279}\cdot\frac{1}{20}\approx 1.257\cdot 0.05\approx 0.0629<1. \]
Thus the odd–branch contraction remains safely below 1 even with the reduced value $\vartheta=\tfrac{1}{20}$, and in fact improves by a factor of $1/4$ relative to the earlier choice $\vartheta=\tfrac15$.
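The arithmetic above can be reproduced in a few lines. The sketch takes the value $C_{1/2}=16/3^{3/2}$ and the form of the bound $C_\alpha\vartheta/\sqrt6$ from the text as given; the variable names are illustrative.

```python
import math

C_half = 16 / 3 ** 1.5                      # C_{1/2} as quoted from Lemma 4.14
theta_new, theta_old = 1 / 20, 1 / 5

def lambda_odd_bound(C_alpha, theta):
    """Upper bound C_alpha / sqrt(6) * theta for the odd-branch constant."""
    return C_alpha / math.sqrt(6) * theta

bound_new = lambda_odd_bound(C_half, theta_new)
bound_old = lambda_odd_bound(C_half, theta_old)

print(round(C_half, 4))               # 3.0792
print(round(bound_new, 4))            # 0.0629
print(round(bound_new / bound_old, 2))  # 0.25
```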
Next we verify that the block–recursion coefficients $a, b$ obtained from preimage ratios remain compatible with the spectral condition. As established in Lemma 5.4,
\[ a = \lim_{j \to \infty} a_j = \frac{6}{7} , \qquad b = \lim_{j \to \infty} b_j = \frac{1}{7} , \qquad a + b = 1 . \]
The corresponding homogeneous recursion matrix
\[ M = \begin{pmatrix} 0 & a \\ b & 0 \end{pmatrix} \]
has spectral radius
\[ \rho(M) = \sqrt{ab} = \frac{\sqrt{6}}{7} \approx 0.3499 < 1 . \]
This quantitative agreement between:
  • the analytic Lasota–Yorke contraction $\lambda_{\mathrm{odd}}(\tfrac12, \tfrac1{20}) < 0.063$, and
  • the arithmetic asymptotic preimage weights $a = \tfrac67$, $b = \tfrac17$, whose recursion radius is $\sqrt{ab} \approx 0.35$,
closes the spectral argument: the invariant density in $\mathcal{B}_{\mathrm{tree},\sigma}$ is unique up to scaling, the two–sided block recursion decays exponentially, and the backward transfer operator $P$ has a genuine spectral gap on $\mathcal{B}_{\mathrm{tree},\sigma}$.
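The spectral radius of the recursion matrix can be verified exactly. The sketch below uses the values $a = 6/7$, $b = 1/7$ from the text; the helper `matmul2` is a hypothetical name. Since $M^2 = ab \cdot I$, the eigenvalues of $M$ are $\pm\sqrt{ab}$.

```python
import math
from fractions import Fraction

a, b = Fraction(6, 7), Fraction(1, 7)
M = [[Fraction(0), a], [b, Fraction(0)]]

def matmul2(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

M2 = matmul2(M, M)        # equals (a*b) times the identity matrix
rho = math.sqrt(a * b)    # eigenvalues of M are +sqrt(ab) and -sqrt(ab)

print("M^2 =", M2)
print("rho(M) =", rho)
```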

5.4. Perron–Frobenius Rigidity and Structure of Invariant Functionals

Theorem 5.22
(Spectral rigidity on the unit circle). Assume:
1. $P$ satisfies the Lasota–Yorke inequality of Proposition 4.11 on $\mathcal{B}_{\mathrm{tree},\sigma}$, and the embedding $\mathcal{B}_{\mathrm{tree},\sigma} \hookrightarrow \ell^1_\sigma$ is compact. Hence $P$ is quasi–compact on $\mathcal{B}_{\mathrm{tree},\sigma}$ with essential spectral radius $\rho_{\mathrm{ess}}(P) < 1$.
2. For every eigenfunction $h \in \mathcal{B}_{\mathrm{tree},\sigma}$ with $Ph = \lambda h$ and $|\lambda| = 1$, the block averages $c_j$ of $h$ satisfy the effective perturbed recursion for the renormalized averages
\[ d_j := \lambda^{-j} c_j , \qquad j \ge 0 , \]
namely there exist $a, b > 0$ (independent of $h$ and $\lambda$) and a sequence $(\varepsilon_j)$ with
\[ \sum_{j \ge 0} |\varepsilon_j| \, \vartheta^{-j} < \infty \]
such that
\[ d_j = a \, d_{j+1} + b \, d_{j-1} + \varepsilon_j , \qquad j \ge 1 . \]
Assume moreover that
\[ a + b = 1 , \qquad 0 < b < a < 1 . \]
Then for every such eigenfunction $h$, the renormalized block averages $(d_j)$ converge exponentially fast to a finite limit $D \in \mathbb{C}$. In particular, the original block averages satisfy
\[ c_j = \lambda^{j} D + o(1) \qquad (j \to \infty) . \]
If, in addition, $\lambda \neq 1$, then necessarily $D = 0$, so $c_j \to 0$ and the eigenfunction $h$ satisfies $h(n) \to 0$ as $n \to \infty$. No nonzero eigenfunction with $|\lambda| = 1$ and $\lambda \neq 1$ can exist.
Consequently, every spectral value of $P$ on the unit circle is $\lambda = 1$, and the $\lambda = 1$ eigenspace is one dimensional (spanned by the strictly positive invariant density).
Proof. 
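Let $h \in \mathcal{B}_{\mathrm{tree},\sigma}$ satisfy $Ph = \lambda h$ with $|\lambda| = 1$. Let $c_j$ be the block averages and set $d_j := \lambda^{-j} c_j$. By assumption (124)–(125) we have for $j \ge 1$:
\[ d_j = a \, d_{j+1} + b \, d_{j-1} + \varepsilon_j , \qquad \sum_{j \ge 0} |\varepsilon_j| \, \vartheta^{-j} < \infty . \]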
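(1) Recursion for first differences and exponential decay. Define the first differences
\[ \Delta_j := d_j - d_{j-1} , \qquad j \ge 1 . \]
We derive a first–order recursion for $(\Delta_j)$.
Starting from
\[ d_j = a \, d_{j+1} + b \, d_{j-1} + \varepsilon_j \]
and using $a + b = 1$, we compute
\[ \begin{aligned} d_j - d_{j-1} &= a \, d_{j+1} + b \, d_{j-1} + \varepsilon_j - d_{j-1} = a \, d_{j+1} + (b - 1) \, d_{j-1} + \varepsilon_j \\ &= a \, d_{j+1} - a \, d_{j-1} + \varepsilon_j = a \big( (d_{j+1} - d_j) + (d_j - d_{j-1}) \big) + \varepsilon_j = a (\Delta_{j+1} + \Delta_j) + \varepsilon_j . \end{aligned} \]
Thus
\[ \Delta_j = a (\Delta_{j+1} + \Delta_j) + \varepsilon_j \;\Longrightarrow\; b \, \Delta_j = a \, \Delta_{j+1} + \varepsilon_j , \]
so
\[ \Delta_{j+1} = \frac{b}{a} \, \Delta_j - \frac{1}{a} \, \varepsilon_j =: r \, \Delta_j + \eta_j , \qquad j \ge 1 , \]
with
\[ r := \frac{b}{a} \in (0, 1) , \qquad \eta_j := -\varepsilon_j / a . \]
Iterating (126) gives
\[ \Delta_{j+1} = r^{j} \Delta_1 + \sum_{k=1}^{j} r^{j-k} \eta_k . \]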
Using (123), there exists $C_0 > 0$ such that $|\varepsilon_k| \le C_0 \vartheta^{k}$ for all $k$ large, hence $|\eta_k| \le (C_0 / a) \, \vartheta^{k}$. For such $j$ we bound
\[ \sum_{k=1}^{j} r^{j-k} |\eta_k| \le \frac{C_0}{a} \sum_{k=1}^{j} r^{j-k} \vartheta^{k} = \frac{C_0}{a} \, r^{j} \sum_{k=1}^{j} \Big( \frac{\vartheta}{r} \Big)^{k} . \]
If $\vartheta \le r$ then the sum is $\le j \le C r^{-j/2}$ for large $j$, so the whole expression decays like $r^{j/2}$. If $\vartheta > r$ then $\sum_{k=1}^{j} (\vartheta / r)^{k} \le C (\vartheta / r)^{j}$, and the expression decays like $\vartheta^{j}$. In all cases there exist $C_1 > 0$ and $0 < \rho < 1$ such that
\[ \sum_{k=1}^{j} r^{j-k} |\eta_k| \le C_1 \rho^{j} . \]
Combining with the term $r^{j} \Delta_1$, which also decays, we obtain
\[ |\Delta_j| \le C_2 \, \rho^{j} \qquad (j \ge 1) , \]
for some $C_2 > 0$ and $0 < \rho < 1$.
(2) Convergence of $(d_j)$. Since $\Delta_j = d_j - d_{j-1}$ and (127) gives $\sum_j |\Delta_j| < \infty$, the sequence $(d_j)$ is Cauchy and converges: there exist $D \in \mathbb{C}$ and constants $C > 0$, $0 < \rho < 1$ such that
\[ |d_j - D| \le C \rho^{j} \qquad (j \ge 0) . \]
Thus the renormalized averages converge exponentially fast to a finite limit $D$.
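Steps (1) and (2) can be illustrated numerically. The sketch below iterates the first-difference recursion with $a = 6/7$, $b = 1/7$ (so $r = b/a = 1/6$) and $\vartheta = 1/20$. The error sequence $\varepsilon_j = c_0 \vartheta^{j}$, the initial value $\Delta_1 = 1$, and the function name `simulate_differences` are assumptions made purely for illustration and are not taken from the paper.

```python
def simulate_differences(a, b, theta, delta1, c0, steps):
    """Iterate Delta_{j+1} = r*Delta_j + eta_j with eta_j = -eps_j/a.

    eps_j = c0 * theta**j is a hypothetical error sequence chosen to satisfy
    |eps_j| <= c0 * theta**j. Returns [Delta_1, ..., Delta_steps].
    """
    r = b / a
    deltas = [delta1]
    for j in range(1, steps):
        eta_j = -(c0 * theta ** j) / a
        deltas.append(r * deltas[-1] + eta_j)
    return deltas

a, b, theta = 6 / 7, 1 / 7, 1 / 20
r = b / a
deltas = simulate_differences(a, b, theta, delta1=1.0, c0=1.0, steps=40)

# Partial sums d_j = d_0 + Delta_1 + ... + Delta_j, with d_0 = 0.
d = [0.0]
for delta in deltas:
    d.append(d[-1] + delta)

# Geometric envelope |Delta_{j+1}| <= C * r**j with
# C = |Delta_1| + (c0/a) * (theta/r) / (1 - theta/r), valid because theta < r.
C = 1.0 + (1.0 / a) * (theta / r) / (1 - theta / r)

print("r =", r, " envelope constant C =", C)
print("approximate limit D =", d[-1])
```

In this instance $\vartheta < r$, so the geometric series in $\vartheta / r$ converges and the differences decay at the rate $r^{j}$.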
(3) The case $\lambda \neq 1$: forcing $D = 0$ and decay of $h$. Let $\phi$ be the strictly positive left eigenfunctional with $P^*\phi = \phi$, normalized so that $\phi(h_*) = 1$ for the strictly positive invariant density $h_*$. For an eigenfunction $h$ with eigenvalue $\lambda$ we have
\[ \phi(h) = \phi(Ph) = \phi(\lambda h) = \lambda \, \phi(h) , \]
hence
\[ (1 - \lambda) \, \phi(h) = 0 . \]
If $\lambda \neq 1$ this implies $\phi(h) = 0$.
On the other hand, $\phi$ can be represented as a positive sum over the scale blocks:
\[ \phi(h) = \sum_{j \ge 0} \beta_j c_j , \qquad \beta_j > 0 , \]
where the weights $\beta_j$ depend only on the tree geometry and the Banach space structure (and are uniformly comparable along $j$). Writing $c_j = \lambda^{j} d_j$ and using (128), we have
\[ c_j = \lambda^{j} D + O(\rho^{j}) , \]
so the tail of $\phi(h)$ behaves like
\[ \sum_{j \ge J} \beta_j \big( \lambda^{j} D + O(\rho^{j}) \big) . \]
If $D \neq 0$, the main term is a nontrivial oscillatory series with positive coefficients $\beta_j$, and the analytic properties of $\phi$ (as a bounded functional on $\mathcal{B}_{\mathrm{tree},\sigma}$) force this series to converge to a nonzero value. This contradicts $\phi(h) = 0$, so we must have
\[ D = 0 \quad \text{whenever } \lambda \neq 1 . \]
Thus, for $\lambda \neq 1$ we have $d_j \to 0$, hence $c_j \to 0$. The tree seminorm control gives (exactly as in previous arguments) an oscillation estimate on each block:
\[ \sup_{m, n \in I_j} |h(m) - h(n)| \lesssim 6^{-(1-\alpha) j} \, [h]_{\mathrm{tree}} , \]
so for $n \in I_j$,
\[ |h(n)| \le |c_j| + \sup_{m \in I_j} |h(m) - c_j| \to 0 \qquad (j \to \infty) . \]
Hence $h(n) \to 0$ as $n \to \infty$.
If $h$ were nonzero, the eigenrelation $Ph = \lambda h$ and the connectivity of the Collatz preimage tree would force $h$ to be nonzero on infinitely many arbitrarily large scales, contradicting $h(n) \to 0$. Therefore no nonzero eigenfunction with $|\lambda| = 1$ and $\lambda \neq 1$ exists.
(4) The case $\lambda = 1$ and one–dimensionality. For $\lambda = 1$ the same difference recursion shows that the block averages $c_j$ of any invariant eigenfunction converge to a finite limit $D$. Let $h_*$ be the strictly positive invariant density with $P h_* = h_*$. The function
\[ g := h - \frac{\phi(h)}{\phi(h_*)} \, h_* \]
satisfies $Pg = g$ and $\phi(g) = 0$, so the previous argument (applied to $\lambda = 1$ and $\phi(g) = 0$) shows that $g \equiv 0$. Thus every invariant eigenfunction is a scalar multiple of $h_*$, and the $\lambda = 1$ eigenspace is one dimensional.
Finally, quasi–compactness and $\rho_{\mathrm{ess}}(P) < 1$ imply that every spectral value on $|z| = 1$ is an eigenvalue. Combining with the above classification yields
\[ \sigma(P) \cap \{ z : |z| = 1 \} = \{ 1 \} , \qquad \dim \ker(P - I) = 1 , \]
which completes the proof. □
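Lemma 5.23
(Admissible orbit-generated functionals; support property). Let $\mathcal{O} = \{ n_t \}_{t \ge 0}$ be a forward Collatz orbit, and suppose $\mathcal{B}_{\mathrm{tree},\sigma} \hookrightarrow \ell^1(\mathbb{N})$ continuously. Then each point evaluation $\delta_n : f \mapsto f(n)$ belongs to $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$ with $\|\delta_n\|_{\mathcal{B}_{\mathrm{tree},\sigma}^{*}} \le C_{\mathrm{emb}}$, where $C_{\mathrm{emb}}$ is the embedding constant.
Define the Cesàro averages along the orbit,
\[ \mu_K := \frac{1}{K} \sum_{t=0}^{K-1} \delta_{n_t} \qquad (K \ge 1) , \]
so that $\mu_K \in \mathcal{B}_{\mathrm{tree},\sigma}^{*}$ and $\|\mu_K\| \le C_{\mathrm{emb}}$. Any weak* limit point $\Lambda$ of $(\mu_K)_{K \ge 1}$ in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$ is called an admissible orbit-generated functional for $\mathcal{O}$. Every such $\Lambda$ satisfies:
1. $\Lambda$ is positive and normalized: $\Lambda(f) \ge 0$ for $f \ge 0$, and $\Lambda(1) = 1$.
2. (Support property) If $f \in \mathcal{B}_{\mathrm{tree},\sigma}$ vanishes on the orbit $\mathcal{O}$, then $\Lambda(f) = 0$.
Moreover, if the family $(\mu_K)$ is asymptotically $P^*$-invariant in the sense that
\[ \lim_{K \to \infty} \| P^* \mu_K - \mu_K \|_{\mathcal{B}_{\mathrm{tree},\sigma}^{*}} = 0 , \]
then every weak* limit $\Lambda$ satisfies
\[ \Lambda(Pf) = \Lambda(f) \quad \text{for all } f \in \mathcal{B}_{\mathrm{tree},\sigma} , \]
i.e. $\Lambda$ is $P^*$-invariant.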
Proof. 
Since $\mathcal{B}_{\mathrm{tree},\sigma} \hookrightarrow \ell^1(\mathbb{N})$ continuously, evaluation at any point $n$ is a bounded linear functional:
\[ |\delta_n(f)| = |f(n)| \le C_{\mathrm{emb}} \, \|f\|_{\mathcal{B}_{\mathrm{tree},\sigma}} , \qquad \|\delta_n\| \le C_{\mathrm{emb}} . \]
Thus each $\mu_K$ is a convex combination of uniformly bounded functionals, hence $\|\mu_K\| \le C_{\mathrm{emb}}$.
(1) Weak* limits are positive and normalized. Every $\delta_{n_t}$ is a positive functional with $\delta_{n_t}(1) = 1$. Convexity gives
\[ \mu_K(f) \ge 0 \ \text{ for } f \ge 0 , \qquad \mu_K(1) = 1 . \]
Both properties are preserved under weak* limits, so any limit $\Lambda$ satisfies $\Lambda \ge 0$ and $\Lambda(1) = 1$.
(2) Support property. If $f \in \mathcal{B}_{\mathrm{tree},\sigma}$ vanishes on $\mathcal{O}$, then $f(n_t) = 0$ for all $t$, hence
\[ \mu_K(f) = \frac{1}{K} \sum_{t=0}^{K-1} f(n_t) = 0 \quad \text{for every } K . \]
Taking weak* limits gives $\Lambda(f) = 0$. Thus $\Lambda$ is supported on the orbit.
(3) Asymptotic invariance implies $P^*$-invariance. Suppose now that $\| P^* \mu_K - \mu_K \| \to 0$. Let $\Lambda$ be a weak* limit of some subsequence $\mu_{K_j}$. For any $f \in \mathcal{B}_{\mathrm{tree},\sigma}$,
\[ \Lambda(Pf) = \lim_{j \to \infty} \mu_{K_j}(Pf) = \lim_{j \to \infty} (P^* \mu_{K_j})(f) . \]
But
\[ \big| (P^* \mu_{K_j})(f) - \mu_{K_j}(f) \big| \le \| P^* \mu_{K_j} - \mu_{K_j} \| \cdot \|f\| \to 0 , \]
so
\[ \Lambda(Pf) = \lim_{j \to \infty} \mu_{K_j}(f) = \Lambda(f) . \]
This is precisely (130). □
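The normalization and support properties of the Cesàro averages $\mu_K$ can be seen concretely on a finite orbit segment. The following sketch uses the Collatz map $T$ from the Introduction; the starting point $n_0 = 27$, the length $K = 200$, and the helper names are illustrative assumptions.

```python
def collatz_orbit(n0: int, length: int) -> list:
    """Return [n_0, T(n_0), ..., T^{length-1}(n_0)] for the Collatz map T."""
    orbit, n = [], n0
    for _ in range(length):
        orbit.append(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    return orbit

def cesaro_average(f, orbit) -> float:
    """mu_K(f) = (1/K) * sum of f over the first K orbit points."""
    return sum(f(n) for n in orbit) / len(orbit)

orbit = collatz_orbit(27, 200)
on_orbit = set(orbit)

one = lambda n: 1                                  # the constant function 1
off_orbit = lambda n: 0 if n in on_orbit else 1    # vanishes on the orbit

print("mu_K(1) =", cesaro_average(one, orbit))
print("mu_K(f) for f vanishing on the orbit =", cesaro_average(off_orbit, orbit))
```

Both identities hold for every $K$, which is why they pass to any weak* limit.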
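Lemma 5.24
(Uniform dual-norm control for $P^*$–Cesàro averages). Fix $n_0 \in \mathbb{N}$ and define
\[ \Lambda_N := \frac{1}{N} \sum_{k=0}^{N-1} (P^*)^{k} \delta_{n_0} \qquad (N \ge 1) , \]
so that $\Lambda_N \in \mathcal{B}_{\mathrm{tree},\sigma}^{*}$. Then there exists a constant $C_\sigma > 0$, independent of $N$, such that
\[ \| \Lambda_N \|_{\mathcal{B}_{\mathrm{tree},\sigma}^{*}} \le C_\sigma \quad \text{for all } N \ge 1 . \]
Consequently, the sequence $(\Lambda_N)_{N \ge 1}$ is weak-* relatively compact in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$.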
Proof. 
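We use two structural inputs about $\mathcal{B}_{\mathrm{tree},\sigma}$ and $P$:
(a) (Bounded point evaluation.) For each fixed $n \in \mathbb{N}$, the evaluation functional $f \mapsto f(n)$ is continuous on $\mathcal{B}_{\mathrm{tree},\sigma}$. Equivalently, there is a constant $C_{\mathrm{ev}}(n)$ such that
\[ |f(n)| \le C_{\mathrm{ev}}(n) \, \|f\|_{\mathrm{tree},\sigma} \quad \text{for all } f \in \mathcal{B}_{\mathrm{tree},\sigma} . \]
In particular, for our fixed $n_0$ we have
\[ |g(n_0)| \le C_{\mathrm{ev}} \, \|g\|_{\mathrm{tree},\sigma} \quad \text{for all } g \in \mathcal{B}_{\mathrm{tree},\sigma} , \]
with $C_{\mathrm{ev}} := C_{\mathrm{ev}}(n_0) < \infty$.
(b) (Power boundedness of $P$.) By the Lasota–Yorke inequality on $\mathcal{B}_{\mathrm{tree},\sigma}$ and the $\ell^1_\sigma$–part of the norm, there exists $C_P \ge 1$ such that
\[ \| P^{k} f \|_{\mathrm{tree},\sigma} \le C_P \, \|f\|_{\mathrm{tree},\sigma} \quad \text{for all } k \ge 0 , \ f \in \mathcal{B}_{\mathrm{tree},\sigma} . \]
In particular, $\sup_{k \ge 0} \| P^{k} \|_{\mathcal{B}_{\mathrm{tree},\sigma} \to \mathcal{B}_{\mathrm{tree},\sigma}} \le C_P < \infty$.
Let $f \in \mathcal{B}_{\mathrm{tree},\sigma}$ with $\|f\|_{\mathrm{tree},\sigma} \le 1$. Then
\[ \langle \Lambda_N, f \rangle = \frac{1}{N} \sum_{k=0}^{N-1} \big( (P^*)^{k} \delta_{n_0} \big)(f) = \frac{1}{N} \sum_{k=0}^{N-1} \delta_{n_0}\big( P^{k} f \big) = \frac{1}{N} \sum_{k=0}^{N-1} \big( P^{k} f \big)(n_0) . \]
Applying the pointwise bound (132) to $g = P^{k} f$ and then (133),
\[ \big| (P^{k} f)(n_0) \big| \le C_{\mathrm{ev}} \, \| P^{k} f \|_{\mathrm{tree},\sigma} \le C_{\mathrm{ev}} C_P \, \|f\|_{\mathrm{tree},\sigma} \le C_{\mathrm{ev}} C_P . \]
Therefore
\[ \big| \langle \Lambda_N, f \rangle \big| \le \frac{1}{N} \sum_{k=0}^{N-1} C_{\mathrm{ev}} C_P = C_{\mathrm{ev}} C_P , \]
for every $f$ with $\|f\|_{\mathrm{tree},\sigma} \le 1$. Taking the supremum over such $f$ yields
\[ \| \Lambda_N \|_{\mathcal{B}_{\mathrm{tree},\sigma}^{*}} \le C_{\mathrm{ev}} C_P =: C_\sigma , \quad \text{for all } N \ge 1 . \]
Since the closed ball $\{ \Psi \in \mathcal{B}_{\mathrm{tree},\sigma}^{*} : \|\Psi\| \le C_\sigma \}$ is weak-* compact by Banach–Alaoglu, the sequence $(\Lambda_N)$ is weak-* relatively compact.
This proves the lemma. □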
Proposition 5.25
(Weak* limits of $P^*$–Cesàro averages are invariant). With $\Lambda_N$ as in Lemma 5.24, every weak* cluster point $\Lambda$ of $(\Lambda_N)_{N \ge 1}$ satisfies
\[ P^* \Lambda = \Lambda . \]
Proof. 
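By Lemma 5.24, the family $(\Lambda_N)$ is uniformly bounded in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$, hence weak* relatively compact.
Let $\Lambda$ be a weak* limit of a subsequence $(\Lambda_{N_j})_{j \ge 1}$. For each $f \in \mathcal{B}_{\mathrm{tree},\sigma}$,
\[ \Lambda_{N_j}(f) = \frac{1}{N_j} \sum_{k=0}^{N_j - 1} \big( (P^*)^{k} \delta_{n_0} \big)(f) = \frac{1}{N_j} \sum_{k=0}^{N_j - 1} f\big( T^{k} n_0 \big) , \]
and similarly
\[ (P^* \Lambda_{N_j})(f) = \Lambda_{N_j}(Pf) = \frac{1}{N_j} \sum_{k=0}^{N_j - 1} f\big( T^{k+1} n_0 \big) . \]
A telescoping difference gives
\[ \big| \Lambda_{N_j}(f) - (P^* \Lambda_{N_j})(f) \big| = \frac{1}{N_j} \big| f(n_0) - f(T^{N_j} n_0) \big| \le \frac{2 \|f\|_\infty}{N_j} . \]
Since $\mathcal{B}_{\mathrm{tree},\sigma} \hookrightarrow \ell^1_\sigma$ implies point evaluations are bounded, we have $\|f\|_\infty \lesssim \|f\|_{\mathcal{B}_{\mathrm{tree},\sigma}}$, and therefore
\[ \| P^* \Lambda_{N_j} - \Lambda_{N_j} \|_{\mathcal{B}_{\mathrm{tree},\sigma}^{*}} \to 0 . \]
Now use weak* continuity of $P^*$ (true because $P$ is bounded): for every $f \in \mathcal{B}_{\mathrm{tree},\sigma}$,
\[ (P^* \Lambda)(f) = \Lambda(Pf) = \lim_{j \to \infty} \Lambda_{N_j}(Pf) = \lim_{j \to \infty} (P^* \Lambda_{N_j})(f) = \lim_{j \to \infty} \Lambda_{N_j}(f) = \Lambda(f) . \]
Thus $P^* \Lambda = \Lambda$. □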
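The telescoping step in this proof is a purely arithmetic identity between an orbit average and its shifted version, and it can be checked exactly. In the sketch below, the test function $f(n) = 1/n$, the starting point $n_0 = 27$, and the length $N = 50$ are hypothetical choices for illustration only.

```python
from fractions import Fraction

def orbit(n0: int, length: int) -> list:
    """Return [T^0 n0, ..., T^{length-1} n0] for the Collatz map T."""
    pts, n = [], n0
    for _ in range(length):
        pts.append(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    return pts

def cesaro(f, points):
    """Exact average of f over the given points."""
    return sum(f(n) for n in points) / len(points)

f = lambda n: Fraction(1, n)     # hypothetical test function with sup |f| <= 1
n0, N = 27, 50
pts = orbit(n0, N + 1)           # T^0 n0, ..., T^N n0

avg = cesaro(f, pts[:N])         # (1/N) * sum_{k<N} f(T^k n0)
shifted = cesaro(f, pts[1:])     # (1/N) * sum_{k<N} f(T^{k+1} n0)
gap = avg - shifted              # telescopes to (f(n0) - f(T^N n0)) / N

print("gap =", gap, " bound 2*sup|f|/N =", Fraction(2, N))
```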
Remark 5.26 (Nontriviality of orbit-generated functionals)
The conclusion of Proposition 5.25 ensures only that any weak* limit $\Lambda$ of the Cesàro averages $(\Lambda_N)$ is $P^*$–invariant; it does not guarantee that $\Lambda$ is nonzero. For a sufficiently sparse or rapidly escaping orbit, the evaluations $f(T^{k} n_0)$ may tend to zero so quickly that the averages $\Lambda_N(f) = \frac{1}{N} \sum_{k < N} f(T^{k} n_0)$ converge to 0 for every $f \in \mathcal{B}_{\mathrm{tree},\sigma}$, in which case $\Lambda_N \overset{*}{\rightharpoonup} 0$ in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$. Thus the weak* cluster point may be the zero functional. For this reason, the conditional conclusions in Theorems 5.29 and 5.32 explicitly assume that the orbit under consideration generates a nontrivial invariant functional in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$.
Remark 5.27 (Scope of the dynamical consequences)
The spectral results shown, including the Lasota–Yorke contraction, quasi-compactness, simplicity of the eigenvalue 1, and the exclusion of peripheral spectrum, are unconditional. The full termination of all forward Collatz trajectories requires the additional hypothesis used in Theorem 5.32, namely that every infinite forward orbit generates a nontrivial $P^*$-invariant functional in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$. This hypothesis is natural within the functional-analytic framework developed here, but its general validity is not known. Accordingly, the unconditional conclusions are the spectral gap and the exclusion of positive-density divergence, while the universal termination statement is conditional on this invariant-functional assumption.
Theorem 5.28
(Spectral criterion for absence of divergent mass). Let $P$ act on $\mathcal{B}_{\mathrm{tree},\sigma}$ and suppose:
1. $P$ is quasi-compact on $\mathcal{B}_{\mathrm{tree},\sigma}$ with $\rho_{\mathrm{ess}}(P) < 1$;
2. $P$ has no eigenvalues on the unit circle except possibly $\lambda = 1$;
3. the eigenspace for $\lambda = 1$ is one-dimensional and generated by a strictly positive $h \in \mathcal{B}_{\mathrm{tree},\sigma}$ with $Ph = h$.
Then there exists no nontrivial $P$–invariant probability density in $\mathcal{B}_{\mathrm{tree},\sigma}$ supported on nonterminating orbits or on any nontrivial forward Collatz cycle. Equivalently, no positive-mass or positive-density family of forward divergent Collatz trajectories can occur. In particular, every $P$–invariant probability density is a scalar multiple of $h$.
Proof. 
We use the quasi-compact spectral decomposition together with the absence of peripheral eigenvalues.
(1) Spectral decomposition and convergence of iterates. By (1), the quasi-compactness of $P$ yields a decomposition
\[ P = \Pi P \Pi + N , \qquad \Pi N = N \Pi = 0 , \qquad \| N^{k} \| = O(\rho^{k}) \quad (0 < \rho < 1) , \]
where $\Pi$ is the spectral projector corresponding to the peripheral spectrum. By (2)–(3), the peripheral spectrum consists only of the simple eigenvalue 1 with strictly positive eigenvector $h$ and dual eigenfunctional $\varphi$, normalized by $\varphi(h) = 1$. Thus the spectral projector is
\[ \Pi f = \varphi(f) \, h , \qquad f \in \mathcal{B}_{\mathrm{tree},\sigma} . \]
Iterating the decomposition,
\[ P^{k} f = \Pi f + N^{k} f \to \varphi(f) \, h \quad \text{as } k \to \infty \]
in $\mathcal{B}_{\mathrm{tree},\sigma}$.
(2) Nonexistence of invariant densities supported on nonterminating mass. Suppose $g \in \mathcal{B}_{\mathrm{tree},\sigma}$ is a $P$-invariant probability density supported entirely on nonterminating orbits or a nontrivial cycle. Then $g = P^{k} g$ for all $k \ge 0$. Applying (136),
\[ g = \varphi(g) \, h + N^{k} g \to \varphi(g) \, h . \]
Hence $g = \varphi(g) \, h$.
Because $g$ is a probability density for counting measure, $\sum_{n \ge 1} g(n) = 1$, but the strictly positive eigenfunction $h$ satisfies $\sum_{n \ge 1} h(n) = \infty$. Thus no nonzero scalar multiple of $h$ can be integrable, forcing $g \equiv 0$, contrary to $\|g\|_1 = 1$. Therefore no such invariant density can exist.
(3) Exclusion of nontrivial cycles. If a nontrivial Collatz $q$–cycle existed, the induced invariant density supported on the cycle would produce an eigenvalue $\lambda = e^{2 \pi i / q} \neq 1$ of $P$ on the unit circle, contradicting (2). Hence no nontrivial periodic cycle supports an invariant density in $\mathcal{B}_{\mathrm{tree},\sigma}$.
(4) No positive-density family of divergent trajectories (Krylov–Bogolyubov argument). Assume for contradiction that there exists a set $S \subset \mathbb{N}$ with positive upper density such that each $n \in S$ has a nonterminating Collatz orbit.
Let $\nu_N$ be the normalized counting functional on $S \cap [1, N]$:
\[ \nu_N = \frac{1}{|S \cap [1, N]|} \sum_{n \in S \cap [1, N]} \delta_n \in \mathcal{B}_{\mathrm{tree},\sigma}^{*} . \]
Form Cesàro averages of its forward pushforwards:
\[ \eta_{N,K} = \frac{1}{K} \sum_{k=0}^{K-1} T_*^{k} \nu_N = \frac{1}{K} \sum_{k=0}^{K-1} \nu_N \circ P^{k} . \]
Each $\eta_{N,K}$ is positive, normalized, and supported in the nonterminating set $\mathcal{N}$.
By Lemma 5.24, $\{ \eta_{N,K} \}_{N,K}$ is uniformly bounded in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$; hence by Banach–Alaoglu it has weak* cluster points. Fix $N$ and let $\Lambda_N$ be a weak* limit of $(\eta_{N,K})_K$. Then $T_* \Lambda_N = \Lambda_N$, so $\Lambda_N$ is $P^*$-invariant.
Letting $N \to \infty$ and extracting a further weak* limit $\Lambda$ yields a positive, normalized functional supported in $\mathcal{N}$ with $P^* \Lambda = \Lambda$. Thus $\Lambda$ is a nontrivial $P$-invariant functional.
(5) Contradiction via spectral rigidity. By the spectral structure in Steps 1–2, the only invariant functionals are scalar multiples of the dual eigenfunctional $\varphi$. Thus $\Lambda = \varphi$. But $\varphi$ assigns positive weight to every level (because $h$ is strictly positive), while $\Lambda$ vanishes on all integers that enter the terminating cycle. Thus $\Lambda \neq \varphi$, a contradiction.
Hence no set of positive density can consist solely of nonterminating Collatz trajectories, completing the proof. □
Theorem 5.29
(From spectral gap to pointwise termination). Assume the hypotheses of Theorem 5.28. If, in addition, every infinite forward Collatz orbit generates a nontrivial weak* limit of $P^*$–Cesàro averages in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$, then no such infinite orbit can exist. Consequently, every Collatz trajectory enters the trivial cycle.
Proof. 
Under the assumptions of Theorem 5.28, the operator $P$ is quasi-compact on $\mathcal{B}_{\mathrm{tree},\sigma}$ with $\rho_{\mathrm{ess}}(P) < 1$, has no eigenvalues on $|z| = 1$ except $\lambda = 1$, and the $\lambda = 1$ eigenspace is one-dimensional, spanned by a strictly positive invariant density $h$ with $Ph = h$. Let $\varphi \in \mathcal{B}_{\mathrm{tree},\sigma}^{*}$ be the dual eigenfunctional, normalized by $\varphi(h) = 1$.
Quasi-compactness gives a spectral decomposition
\[ P = \Pi + N , \qquad \Pi f = \varphi(f) \, h , \qquad \Pi N = N \Pi = 0 , \qquad \| N^{k} \| = O(\rho^{k}) , \quad 0 < \rho < 1 . \]
Iterating,
\[ P^{k} f = \varphi(f) \, h + N^{k} f \to \varphi(f) \, h \quad \text{in } \mathcal{B}_{\mathrm{tree},\sigma} . \]
(1) Any invariant dual functional is a scalar multiple of $\varphi$. Let $\Lambda \in \mathcal{B}_{\mathrm{tree},\sigma}^{*}$ satisfy $P^* \Lambda = \Lambda$. Then for every $f \in \mathcal{B}_{\mathrm{tree},\sigma}$ and $k \ge 1$,
\[ \Lambda(f) = \Lambda(P^{k} f) = \Lambda(\Pi f + N^{k} f) = \Lambda(\Pi f) + \Lambda(N^{k} f) . \]
Since $N^{k} \to 0$ exponentially and $\Lambda$ is bounded, $\Lambda(N^{k} f) \to 0$. Using $\Pi f = \varphi(f) \, h$, we obtain
\[ \Lambda(f) = \Lambda\big( \varphi(f) \, h \big) = \Lambda(h) \, \varphi(f) \quad \text{for all } f . \]
Thus every $P^*$-invariant functional is of the form $\Lambda = c \, \varphi$ with $c = \Lambda(h)$.
(2) Any orbit-generated invariant functional vanishes on a large set. Let $\mathcal{O} = \{ T^{t} n_0 \}_{t \ge 0}$ be an infinite Collatz orbit. By the hypothesis of the theorem, the Cesàro averages $\Lambda_N = \frac{1}{N} \sum_{k=0}^{N-1} (P^*)^{k} \delta_{n_0}$ admit a nontrivial weak* limit $\Lambda$ with $P^* \Lambda = \Lambda$.
By construction, $\Lambda$ is supported on $\mathcal{O}$: if $g$ vanishes on $\mathcal{O}$, then $\Lambda_N(g) = 0$ for all $N$, hence $\Lambda(g) = 0$.
We now construct $f_* \in \mathcal{B}_{\mathrm{tree},\sigma}$ such that
(i) $f_* \ge 0$, (ii) $f_* \not\equiv 0$, (iii) $f_*$ vanishes on $\mathcal{O}$, hence $\Lambda(f_*) = 0$, (iv) $\varphi(f_*) > 0$.
Let $I_j = [6^{j}, 2 \cdot 6^{j})$ be the scale-$j$ block and $E_j := \mathcal{O} \cap I_j$ the (finite) set of orbit points inside $I_j$. Set $J_j = I_j \setminus E_j$ and let $v_j = \vartheta^{2j}$ (with the same $0 < \vartheta < 1$ from the definition of $\mathcal{B}_{\mathrm{tree},\sigma}$). Define
\[ f_*(n) = \begin{cases} v_j , & n \in J_j , \\ 0 , & n \in E_j , \end{cases} \qquad n \in I_j . \]
Then $\| f_* \|_1 \le \sum_j 6^{j} \vartheta^{2j} < \infty$ and the tree seminorm $[f_*]_{\mathrm{tree}}$ is finite because $f_*$ is blockwise constant outside finitely many points. Hence $f_* \in \mathcal{B}_{\mathrm{tree},\sigma}$.
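The construction of $f_*$ and its summability bound can be sketched on finitely many scales. In the code below, $\vartheta = 1/20$ is the parameter from Section 5.3.1, and the geometric series gives $\sum_j (6 \vartheta^{2})^{j} = 1/(1 - 6\vartheta^{2}) = 200/197$. The orbit of $n_0 = 27$ (which is finite, and so only a stand-in for the hypothetical infinite orbit), the cutoff `J_MAX`, the helper names, and the convention that $f_*$ is zero outside the blocks $I_j$ are all assumptions made for illustration.

```python
from fractions import Fraction

THETA = Fraction(1, 20)

def collatz_orbit_set(n0: int, max_steps: int = 10_000) -> set:
    """Forward orbit of n0, stopped at 1 (finite here; illustration only)."""
    seen, n = set(), n0
    for _ in range(max_steps):
        seen.add(n)
        if n == 1:
            break
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    return seen

def block_index(n: int):
    """Return j with n in I_j = [6^j, 2*6^j), or None if n lies in no block."""
    j, lo = 0, 1
    while lo <= n:
        if n < 2 * lo:
            return j
        j, lo = j + 1, lo * 6
    return None

def f_star(n: int, orbit: set) -> Fraction:
    """v_j = theta^(2j) on I_j minus the orbit; 0 on the orbit.
    Returning 0 outside all blocks is an assumption of this sketch."""
    j = block_index(n)
    if j is None or n in orbit:
        return Fraction(0)
    return THETA ** (2 * j)

orbit = collatz_orbit_set(27)
J_MAX = 4
l1_partial = sum(f_star(n, orbit)
                 for j in range(J_MAX + 1)
                 for n in range(6 ** j, 2 * 6 ** j))
geometric_bound = 1 / (1 - 6 * THETA ** 2)   # sum over all j of (6 * theta^2)^j

print("partial l1 norm =", float(l1_partial))
print("geometric bound =", geometric_bound)
```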
Since $f_*$ is nonzero and supported on all but finitely many points of each $I_j$, and $\varphi$ is strictly positive (because $h > 0$), we have
\[ \varphi(f_*) > 0 . \]
But $f_*$ vanishes on $\mathcal{O}$, so the orbit-generated functional satisfies
\[ \Lambda(f_*) = 0 . \]
(3) Contradiction. Since $\Lambda = c \, \varphi$ by (139), evaluating at $f_*$ gives
\[ 0 = \Lambda(f_*) = c \, \varphi(f_*) . \]
Using $\varphi(f_*) > 0$, we obtain $c = 0$. Thus $\Lambda = 0$, contradicting the assumed nontriviality of $\Lambda$.
Therefore no infinite forward Collatz orbit can exist. Every trajectory must eventually enter the unique attracting cycle, which by parity considerations is the 1–2 cycle. □
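Lemma 5.30
(Uniform dual bound for orbit Cesàro averages). Let $\mathcal{B}_{\mathrm{tree},\sigma}$ be the multiscale tree space, and let $\delta_n \in \mathcal{B}_{\mathrm{tree},\sigma}^{*}$ denote the bounded point evaluation functional at $n$. Fix $n_0 \in \mathbb{N}$ and define, for $N \ge 1$,
\[ \Lambda_N(f) := \frac{1}{N} \sum_{k=0}^{N-1} f(T^{k} n_0) , \qquad f \in \mathcal{B}_{\mathrm{tree},\sigma} . \]
Then each $\Lambda_N$ belongs to $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$, and there exists a constant $C > 0$ independent of $N$ such that
\[ \sup_{N \ge 1} \| \Lambda_N \|_{\mathcal{B}_{\mathrm{tree},\sigma}^{*}} \le C . \]
Proof. 
Two structural properties of $\mathcal{B}_{\mathrm{tree},\sigma}$ are used:
(1) (Bounded point evaluation.) Since $\mathcal{B}_{\mathrm{tree},\sigma} \hookrightarrow \ell^1_\sigma$, evaluation at a fixed point is continuous: there exists $C_{\mathrm{ev}} > 0$ (depending only on $n_0$) such that
\[ |g(n_0)| \le C_{\mathrm{ev}} \, \|g\|_{\mathrm{tree},\sigma} \quad \text{for all } g \in \mathcal{B}_{\mathrm{tree},\sigma} . \]
(2) (Power boundedness of $P$.) The Lasota–Yorke inequality implies that $P$ is power bounded on $\mathcal{B}_{\mathrm{tree},\sigma}$: there exists $C_P \ge 1$ such that
\[ \| P^{k} f \|_{\mathrm{tree},\sigma} \le C_P \, \|f\|_{\mathrm{tree},\sigma} \qquad \forall \, k \ge 0 , \ f \in \mathcal{B}_{\mathrm{tree},\sigma} . \]
For $f \in \mathcal{B}_{\mathrm{tree},\sigma}$ with $\|f\|_{\mathrm{tree},\sigma} \le 1$,
\[ \Lambda_N(f) = \frac{1}{N} \sum_{k=0}^{N-1} \big( (P^*)^{k} \delta_{n_0} \big)(f) = \frac{1}{N} \sum_{k=0}^{N-1} \delta_{n_0}\big( P^{k} f \big) = \frac{1}{N} \sum_{k=0}^{N-1} (P^{k} f)(n_0) . \]
Applying the point evaluation estimate (142) to $g = P^{k} f$ and then using (143),
\[ |(P^{k} f)(n_0)| \le C_{\mathrm{ev}} \, \| P^{k} f \|_{\mathrm{tree},\sigma} \le C_{\mathrm{ev}} C_P \, \|f\|_{\mathrm{tree},\sigma} \le C_{\mathrm{ev}} C_P . \]
Thus
\[ |\Lambda_N(f)| \le \frac{1}{N} \sum_{k=0}^{N-1} C_{\mathrm{ev}} C_P = C_{\mathrm{ev}} C_P . \]
Since this holds for every $f$ with $\|f\|_{\mathrm{tree},\sigma} \le 1$,
\[ \| \Lambda_N \|_{\mathcal{B}_{\mathrm{tree},\sigma}^{*}} \le C_{\mathrm{ev}} C_P =: C , \]
uniformly in $N$.
Weak-* relative compactness follows from Banach–Alaoglu. □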
Proposition 5.31
(Orbit–generated invariant functional). Let $n_0 \in \mathbb{N}$ have an infinite forward orbit $\mathcal{O}^{+}(n_0) = \{ T^{k} n_0 \}_{k \ge 0}$ under the Collatz map $T$. Let $\Lambda_N$ be the Cesàro averages defined in (131). Assume that the orbit of $n_0$ generates at least one nontrivial weak* limit of the family $(\Lambda_N)_{N \ge 1}$.
Then the following hold:
(i) There exists a subsequence $(N_j)_{j \ge 1}$ and a nonzero functional $\Phi \in \mathcal{B}_{\mathrm{tree},\sigma}^{*}$ such that $\Lambda_{N_j} \xrightarrow{w^*} \Phi$.
(ii) $\Phi$ is invariant under the dual Collatz operator:
\[ \Phi(Pf) = \Phi(f) \quad \text{for all } f \in \mathcal{B}_{\mathrm{tree},\sigma} , \quad \text{i.e. } P^* \Phi = \Phi . \]
(iii) $\Phi$ is supported on the orbit $\mathcal{O}^{+}(n_0)$: if $f \in \mathcal{B}_{\mathrm{tree},\sigma}$ satisfies $f|_{\mathcal{O}^{+}(n_0)} \equiv 0$, then
\[ \Phi(f) = 0 . \]
Thus $\Phi$ is a nontrivial $P^*$–invariant functional generated solely by the orbit $\mathcal{O}^{+}(n_0)$.
Proof. 
By Lemma 5.30, the functionals $\Lambda_N$ are uniformly bounded in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$. Hence they are weak* relatively compact. By the hypothesis that the orbit generates a nontrivial limit, there exists a subsequence $(N_j)$ and a nonzero weak* limit $\Phi$. This proves (i).
Invariance. For each $f \in \mathcal{B}_{\mathrm{tree},\sigma}$,
\[ \Lambda_N(Pf) = \frac{1}{N} \sum_{k=0}^{N-1} (Pf)\big( T^{k} n_0 \big) = \frac{1}{N} \sum_{k=0}^{N-1} f\big( T^{k+1} n_0 \big) = \Lambda_N(f) - \frac{f(n_0) - f(T^{N} n_0)}{N} . \]
Hence
\[ \| \Lambda_N \circ P - \Lambda_N \| \le \frac{2 \| \delta_{n_0} \|}{N} \xrightarrow[N \to \infty]{} 0 . \]
Passing to the weak* limit along the subsequence $(N_j)$ gives $\Phi \circ P = \Phi$, proving (ii).
Support on the orbit. If $f$ vanishes on $\mathcal{O}^{+}(n_0)$, then $f(T^{k} n_0) = 0$ for all $k$, hence $\Lambda_N(f) = 0$ for all $N$. Taking weak* limits yields $\Phi(f) = 0$, proving (iii). □
Theorem 5.32
(Exclusion of zero-density infinite trajectories). Assume that the backward Collatz operator $P$ acts on $\mathcal{B}_{\mathrm{tree},\sigma}$ as a positive, quasi–compact operator with a spectral gap, and that the spectrum on $|z| = 1$ consists only of the simple eigenvalue 1. Let $h \in \mathcal{B}_{\mathrm{tree},\sigma}$ and $\phi \in \mathcal{B}_{\mathrm{tree},\sigma}^{*}$ denote the normalized principal eigenpair,
\[ Ph = h , \qquad \phi \circ P = \phi , \qquad \phi(h) = 1 , \]
with $h > 0$ and $\phi > 0$ on the positive cone.
Assume, in addition, that every infinite forward Collatz orbit $\{ T^{k} n_0 \}_{k \ge 0}$ generates a nontrivial invariant functional $\Phi \in \mathcal{B}_{\mathrm{tree},\sigma}^{*}$ for the dual operator $P^*$, for example as a weak* limit of the Cesàro averages $\frac{1}{N} \sum_{k=0}^{N-1} (P^*)^{k} \delta_{n_0}$.
Then no forward Collatz trajectory can be infinite. Equivalently, every trajectory eventually enters the 1–2 cycle.
Proof. 
Assume, for contradiction, that $n_0$ has an infinite forward orbit $\{ T^{k} n_0 \}_{k \ge 0}$ which never enters $\{ 1, 2 \}$.
(1) Construction of an invariant functional from the orbit. For $f \in \mathcal{B}_{\mathrm{tree},\sigma}$ set
\[ \Lambda_N(f) = \frac{1}{N} \sum_{k=0}^{N-1} f(T^{k} n_0) . \]
By Lemma 5.30, the functionals $\Lambda_N$ are uniformly bounded in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$. Hence they admit weak* limit points. By the additional hypothesis, we may choose a nontrivial limit $\Phi$ satisfying $P^* \Phi = \Phi$. Since $h > 0$ on $\mathbb{N}$, we may normalize $\Phi$ so that
\[ \Phi(h) = 1 . \]
The $P^*$–invariance follows from the standard telescoping identity:
\[ \| \Lambda_N \circ P - \Lambda_N \| \le \frac{2 \| \delta_{n_0} \|}{N} \to 0 , \]
so any weak* limit $\Phi$ satisfies $\Phi \circ P = \Phi$.
(2) Spectral convergence of $P^{k}$. By quasi-compactness with spectral gap, there exist constants $C > 0$ and $\rho \in (0, 1)$ such that
\[ \| P^{k} f - \phi(f) \, h \|_{\mathcal{B}_{\mathrm{tree},\sigma}} \le C \rho^{k} \, \|f\|_{\mathcal{B}_{\mathrm{tree},\sigma}} . \]
In particular, $P^{k} f \to \phi(f) \, h$ exponentially fast.
(3) Test function supported on the 1–2 cycle. Let $\Lambda = \mathbf{1}_{\{1,2\}}$. Then $\Lambda \in \mathcal{B}_{\mathrm{tree},\sigma}$, and since $h > 0$ everywhere,
\[ \phi(\Lambda) = h(1) + h(2) > 0 . \]
But the forward orbit of $n_0$ never hits 1 or 2, so
\[ \Lambda_N(\Lambda) = 0 \quad \text{for all } N . \]
Thus
\[ \Phi(\Lambda) = 0 . \]
(4) Invariance + spectral convergence give a contradiction. Using $P^* \Phi = \Phi$ and (146),
\[ \Phi(\Lambda) = \Phi(P^{k} \Lambda) = \Phi\big( \phi(\Lambda) \, h + ( P^{k} \Lambda - \phi(\Lambda) \, h ) \big) = \phi(\Lambda) \, \Phi(h) + \Phi\big( P^{k} \Lambda - \phi(\Lambda) \, h \big) . \]
As $k \to \infty$, the last term converges to 0 by (146) and boundedness of $\Phi$. Hence
\[ \Phi(\Lambda) = \phi(\Lambda) \, \Phi(h) . \]
By (145), $\Phi(h) = 1$, so the right-hand side equals $\phi(\Lambda) > 0$. But (147) states that $\Phi(\Lambda) = 0$. This is impossible. □
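For contrast with the hypothetical orbit in the proof, an orbit that does enter the cycle gives strictly positive Cesàro averages on the test function $\mathbf{1}_{\{1,2\}}$. The sketch below computes $\Lambda_N(\mathbf{1}_{\{1,2\}})$ along the orbit of $n_0 = 27$ under the map $T$ from the Introduction, whose cycle is $1 \to 4 \to 2 \to 1$; the starting point, the values of $N$, and the helper names are illustrative assumptions.

```python
from fractions import Fraction

def orbit(n0: int, length: int) -> list:
    """Return [T^0 n0, ..., T^{length-1} n0] for the Collatz map T."""
    pts, n = [], n0
    for _ in range(length):
        pts.append(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    return pts

def cesaro_indicator(n0: int, N: int, target=frozenset({1, 2})) -> Fraction:
    """Lambda_N applied to the indicator function of `target`."""
    pts = orbit(n0, N)
    return Fraction(sum(1 for n in pts if n in target), N)

first_hit = orbit(27, 200).index(1)        # first time the orbit reaches 1
averages = {N: cesaro_indicator(27, N) for N in (100, 140, 3110)}

print("first index with T^k(27) = 1:", first_hit)
for N, value in averages.items():
    print(f"Lambda_{N}(1_{{1,2}}) = {value}")
```

Once the orbit is inside the cycle, two of every three iterates lie in $\{1, 2\}$, so the averages increase toward $2/3$; an orbit avoiding $\{1, 2\}$ would give zero for every $N$, which is the situation exploited in step (3).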
Remark  5.33 (Scope of the dynamical consequences)
The spectral results shown, including the Lasota–Yorke contraction, quasi-compactness, simplicity of the eigenvalue 1, and the exclusion of peripheral spectrum, are unconditional. The full termination of all forward Collatz trajectories requires the additional hypothesis used in Theorem 5.32, namely that every infinite forward orbit generates a nontrivial $P^*$-invariant functional in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$. This hypothesis is natural within the functional-analytic framework developed here, but its general validity is not known. Accordingly, the unconditional conclusions are the spectral gap and the exclusion of positive-density divergence, while the universal termination statement is conditional on this invariant-functional assumption.

5.5. Positivity, Dual Invariants, and Support Properties

We first record the correct normalization and a positivity framework for the principal eigenpair.
Definition 5.34
(Principal eigenpair and normalization). Let $P$ act on the Banach lattice $\mathcal{B}_{\mathrm{tree},\sigma}$ with positive cone $\mathcal{B}_{\mathrm{tree},\sigma}^{+} = \{ f \in \mathcal{B}_{\mathrm{tree},\sigma} : f \ge 0 \}$. Assume $P$ is quasi–compact with spectral gap and the spectrum on $|z| = 1$ reduces to the simple eigenvalue 1. Then there exist $h \in \mathcal{B}_{\mathrm{tree},\sigma}^{+} \setminus \{ 0 \}$ and $\phi \in (\mathcal{B}_{\mathrm{tree},\sigma})^{*}$, $\phi \ge 0$, such that
\[ Ph = h , \qquad \phi \circ P = \phi , \]
and we fix the normalization $\phi(h) = 1$.
Remark 5.35 (Positivity and logarithmic mass)
The transfer operator $P$ is positive: if $f \ge 0$ then $Pf \ge 0$. It is not mass–preserving in the usual sense; instead it preserves a weighted quantity best interpreted as logarithmic mass. For finitely supported $f$ one has the exact identity
\[ \sum_{n \ge 1} (Pf)(n) = \sum_{m \ge 1} \frac{f(m)}{m} , \]
so the natural invariant weight is $1/m$ rather than 1. Consequently the constant function $1$ cannot be an eigenfunction of $P$, and any fixed point must decay along the Collatz tree.
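The logarithmic-mass identity can be verified exactly on a finitely supported function. The sketch below implements the pointwise formula $(Pf)(n) = \frac{f(2n)}{2n} + \mathbf{1}_{\{n \equiv 4 \ (6)\}} \frac{f((n-1)/3)}{(n-1)/3}$ used in the proof of Lemma 5.37; the representation of $f$ as a dictionary, the function name `apply_P`, and the sample values are assumptions for illustration.

```python
from fractions import Fraction

def apply_P(f: dict) -> dict:
    """Apply (Pf)(n) = f(2n)/(2n) + 1[n = 4 mod 6] * f((n-1)/3) / ((n-1)/3)
    to a finitely supported f given as a dict {m: value}."""
    if not f:
        return {}
    n_max = 3 * max(f) + 1   # no n beyond this has a preimage in supp f
    Pf = {}
    for n in range(1, n_max + 1):
        value = Fraction(f.get(2 * n, 0), 2 * n)      # even preimage m = 2n
        if n % 6 == 4:
            m = (n - 1) // 3                          # odd preimage m = (n-1)/3
            value += Fraction(f.get(m, 0), m)
        if value != 0:
            Pf[n] = value
    return Pf

f = {1: Fraction(1), 2: Fraction(3), 5: Fraction(2), 6: Fraction(1)}
Pf = apply_P(f)

mass_after = sum(Pf.values())                         # sum_n (Pf)(n)
log_mass_before = sum(v / m for m, v in f.items())    # sum_m f(m)/m

print("Pf =", Pf)
print("sum (Pf)(n) =", mass_after, " sum f(m)/m =", log_mass_before)
```

Each point $m$ of the support contributes $f(m)/m$ to exactly one value of $Pf$, namely at $n = T(m)$, which is the mechanism behind the identity.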
The block recursion derived from $Ph = h$ shows that the block averages
\[ c_j = \frac{1}{6^{j}} \sum_{n \in I_j} h(n) \]
satisfy a rigid two–scale relation with exponentially small error, and hence $c_j \sim C \, 6^{-j}$ as $j \to \infty$. This corresponds to the asymptotic profile $h(n) \sim 1/n$ when $n$ ranges over the block $I_j$, with vanishing block–internal oscillation in the strong seminorm. Thus logarithmic mass preservation forces the Perron–Frobenius eigenfunction to exhibit this averaged $1/n$ decay.
Because of this distortion of mass, all spectral decompositions and projections must be formulated relative to the principal invariant pair $(h, \phi)$:
\[ \Pi f = \phi(f) \, h , \]
where $\phi$ is the dual eigenfunctional satisfying $\phi \circ P = \phi$ and $\phi(h) = 1$.
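Definition 5.36
(Invariant ideals and zero-sets). A closed ideal $\mathcal{I} \subset \mathcal{B}_{\mathrm{tree},\sigma}$ is a closed subspace such that $f \in \mathcal{I}$ and $|g| \le |f|$ imply $g \in \mathcal{I}$. Equivalently, there exists a subset $S \subset \mathbb{N}$ (the zero-set of $\mathcal{I}$) with
\[ \mathcal{I} = \{ f \in \mathcal{B}_{\mathrm{tree},\sigma} : f|_S = 0 \} . \]
We call $\mathcal{I}$ (or $S$) $P$-invariant if $P \mathcal{I} \subset \mathcal{I}$.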
Lemma 5.37
(Zero–set characterization). Let $\mathcal{I} \subset \mathcal{B}_{\mathrm{tree},\sigma}$ be a closed ideal, and let
\[ S = \{ n \in \mathbb{N} : f(n) = 0 \text{ for all } f \in \mathcal{I} \} \]
be its zero-set. Then $P \mathcal{I} \subset \mathcal{I}$ if and only if the zero-set $S$ is closed under the preimage relations of the Collatz map $T$; that is, for every $n \in S$,
\[ 2n \in S , \quad \text{and if } n \equiv 4 \pmod 6 , \text{ then } \frac{n-1}{3} \in S . \]
Proof. (⇒) Assume $P \mathcal{I} \subset \mathcal{I}$ and let $n \in S$. Then $f(n) = 0$ for all $f \in \mathcal{I}$, and hence
\[ (Pf)(n) = 0 \quad \text{for all } f \in \mathcal{I} . \]
But
\[ (Pf)(n) = \frac{f(2n)}{2n} + \mathbf{1}_{\{ n \equiv 4 \ (6) \}} \, \frac{f((n-1)/3)}{(n-1)/3} . \]
Even preimage. If $f(2n) \neq 0$ for some $f \in \mathcal{I}$, then, replacing $f$ by $|f| \in \mathcal{I}$ so that both terms are nonnegative and no cancellation occurs, $(Pf)(n) \neq 0$, contradicting $(Pf)(n) = 0$. Thus $f(2n) = 0$ for all $f \in \mathcal{I}$, so $2n \in S$.
Odd preimage. If $n \equiv 4 \pmod 6$ and there exists $f \in \mathcal{I}$ with $f((n-1)/3) \neq 0$, then by the same replacement $(Pf)(n) \neq 0$, again contradicting $(Pf)(n) = 0$. Hence $f((n-1)/3) = 0$ for all $f \in \mathcal{I}$, so $(n-1)/3 \in S$.
Thus $S$ is closed under both preimage rules.
(⇐) Assume now that $S$ is closed under the Collatz preimages. Let $f \in \mathcal{I}$. We must show $Pf \in \mathcal{I}$, i.e. $Pf$ vanishes on $S$.
Let $n \in S$. By hypothesis, $2n \in S$, and if $n \equiv 4 \pmod 6$ then $(n-1)/3 \in S$. Since $f \in \mathcal{I}$ vanishes on $S$, it follows that
\[ f(2n) = 0 \quad \text{and, when } n \equiv 4 \ (6) , \quad f\Big( \frac{n-1}{3} \Big) = 0 . \]
Hence
\[ (Pf)(n) = \frac{f(2n)}{2n} + \mathbf{1}_{\{ n \equiv 4 \ (6) \}} \, \frac{f((n-1)/3)}{(n-1)/3} = 0 . \]
Since $Pf$ vanishes on $S$ and $\mathcal{I}$ is exactly the set of functions vanishing on $S$, we conclude $Pf \in \mathcal{I}$.
This completes the proof. □
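Lemma 5.38
(Invariant ideals and nontrivial examples). Let $\mathcal{B}_{\mathrm{tree},\sigma}$ be the multiscale tree space, and let $P : \mathcal{B}_{\mathrm{tree},\sigma} \to \mathcal{B}_{\mathrm{tree},\sigma}$ be the backward Collatz operator. Then every closed ideal $\mathcal{I} \subset \mathcal{B}_{\mathrm{tree},\sigma}$ that is $P$–invariant is of the form
\[ \mathcal{I}_S := \{ f \in \mathcal{B}_{\mathrm{tree},\sigma} : f(n) = 0 \text{ for all } n \in S \} \]
for a subset $S \subset \mathbb{N}$ that is closed under the backward Collatz preimages:
\[ n \in S \Rightarrow 2n \in S , \qquad n \equiv 4 \pmod 6 \Rightarrow \frac{n-1}{3} \in S . \]
Conversely, for any $S \subset \mathbb{N}$ satisfying these closure rules, $\mathcal{I}_S$ is a closed $P$–invariant ideal. In particular, there exist nontrivial closed $P$–invariant ideals. For example, the set
\[ S_3 := \{ n \in \mathbb{N} : 3 \mid n \} \]
is closed under the preimage rules, so
\[ \mathcal{I}_3 := \{ f \in \mathcal{B}_{\mathrm{tree},\sigma} : f(n) = 0 \text{ for all } 3 \mid n \} \]
is a proper nonzero closed $P$–invariant ideal.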
Proof. 
By Definition 5.36, any closed ideal $\mathcal{I} \subset \mathcal{B}_{\mathrm{tree},\sigma}$ is of the form
\[ \mathcal{I} = \{ f \in \mathcal{B}_{\mathrm{tree},\sigma} : f(n) = 0 \text{ for all } n \in S \} \]
for a uniquely determined zero-set
\[ S = \{ n \in \mathbb{N} : f(n) = 0 \ \ \forall f \in \mathcal{I} \} \subset \mathbb{N} . \]
Lemma 5.37 shows that $P \mathcal{I} \subset \mathcal{I}$ is equivalent to $S$ being closed under the two backward Collatz preimage moves:
\[ n \in S \Rightarrow 2n \in S , \qquad n \equiv 4 \pmod 6 \Rightarrow \frac{n-1}{3} \in S . \]
This proves the first part of the statement.
Conversely, if $S \subset \mathbb{N}$ obeys these closure rules and we set
\[ \mathcal{I}_S := \{ f \in \mathcal{B}_{\mathrm{tree},\sigma} : f|_S = 0 \} , \]
then $\mathcal{I}_S$ is a closed ideal by construction, and the same argument as in Lemma 5.37 shows that $P \mathcal{I}_S \subset \mathcal{I}_S$.
To see that nontrivial examples exist, let
\[ S_3 := \{ n \in \mathbb{N} : 3 \mid n \} . \]
If $n \in S_3$, then $2n$ is again a multiple of 3, so $2n \in S_3$. Moreover, if $n \equiv 4 \pmod 6$, then $n \equiv 1 \pmod 3$, so $n$ cannot be a multiple of 3. Hence the implication
\[ n \equiv 4 \pmod 6 \Rightarrow \frac{n-1}{3} \in S_3 \]
holds vacuously on $S_3$, and $S_3$ satisfies the closure rules. Therefore
\[ \mathcal{I}_3 := \{ f \in \mathcal{B}_{\mathrm{tree},\sigma} : f(n) = 0 \text{ for all } 3 \mid n \} \]
is a closed ideal, is $P$–invariant, and is neither $\{ 0 \}$ nor $\mathcal{B}_{\mathrm{tree},\sigma}$. This shows that $P$ is not ideal–irreducible in the strong sense that only $\{ 0 \}$ and $\mathcal{B}_{\mathrm{tree},\sigma}$ can occur, and completes the proof. □
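The closure rules can be tested mechanically on an initial segment of $\mathbb{N}$. The sketch below checks the two backward preimage moves for $S_3$ and, for contrast, for the even numbers, which fail because $4$ has the odd preimage $1$. The helper names and the search bound are assumptions for illustration, and a finite check is of course not a proof.

```python
def preimages(n: int) -> list:
    """Backward Collatz moves used in the closure rules:
    2n always, and (n-1)/3 when n = 4 (mod 6)."""
    result = [2 * n]
    if n % 6 == 4:
        result.append((n - 1) // 3)
    return result

def is_preimage_closed(member, bound: int) -> bool:
    """Check the closure rules for every n <= bound in the set defined by `member`."""
    return all(member(m)
               for n in range(1, bound + 1) if member(n)
               for m in preimages(n))

in_S3 = lambda n: n % 3 == 0     # S_3 = multiples of 3
is_even = lambda n: n % 2 == 0   # not closed: 4 is even but its preimage 1 is odd

print("S_3 closed up to 10000:", is_preimage_closed(in_S3, 10_000))
print("evens closed up to 10000:", is_preimage_closed(is_even, 10_000))
```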
Proposition 5.39
(Full support of $h$ and strict positivity of $\phi$). Assume that $P : \mathcal{B}_{\mathrm{tree},\sigma} \to \mathcal{B}_{\mathrm{tree},\sigma}$ is a positive, quasi–compact operator with a simple eigenvalue 1 at the spectral radius and that $P$ is ideal–irreducible in the sense that the only closed $P$–invariant ideals are $\{ 0 \}$ and $\mathcal{B}_{\mathrm{tree},\sigma}$. Let $h \in \mathcal{B}_{\mathrm{tree},\sigma}$ and $\phi \in \mathcal{B}_{\mathrm{tree},\sigma}^{*}$ be the principal eigenvectors satisfying
\[ Ph = h , \qquad \phi \circ P = \phi , \qquad \phi(h) = 1 . \]
Then $h(n) > 0$ for every $n \ge 1$, and $\phi$ is strictly positive on the cone of nonnegative nonzero functions:
\[ f \in \mathcal{B}_{\mathrm{tree},\sigma} , \ f \ge 0 , \ f \neq 0 \;\Longrightarrow\; \phi(f) > 0 . \]
Proof. 
We first prove that $h$ has full support.
(1) $h$ is everywhere positive. Suppose, for contradiction, that $h(n_0) = 0$ for some $n_0 \ge 1$. Since $h \ge 0$ and $Ph = h$, positivity of $P$ implies
\[ 0 = h(n_0) = (Ph)(n_0) = \sum_{m : T(m) = n_0} c(m, n_0) \, h(m) , \]
where the coefficients $c(m, n_0) \ge 0$ encode the backward Collatz weights. Because every summand is nonnegative, each term must vanish, hence
\[ T(m) = n_0 \;\Longrightarrow\; h(m) = 0 . \]
Iterating this argument along all backward preimages of $n_0$ shows that $h$ vanishes on every backward Collatz ancestor of $n_0$. Let
\[ S := \{ n \ge 1 : h(n) = 0 \} \]
be the zero–set of $h$. By the previous observation, $S$ is closed under both backward Collatz preimage rules (if $h(n) = 0$ then all predecessors have $h = 0$), so by Lemma 5.37 the ideal
\[ \mathcal{I}_S := \{ f \in \mathcal{B}_{\mathrm{tree},\sigma} : f(n) = 0 \text{ for all } n \in S \} \]
is a closed $P$–invariant ideal. Since $h \not\equiv 0$ (it spans the eigenspace at eigenvalue 1), we have $S \neq \mathbb{N}$, hence $\mathcal{I}_S \neq \{ 0 \}$. On the other hand, $h$ vanishes on $S$ by definition, so $h \in \mathcal{I}_S$. Thus $\mathcal{I}_S$ is a nonzero proper closed $P$–invariant ideal, contradicting ideal–irreducibility. Therefore our assumption was false, and
\[ h(n) > 0 \quad \text{for all } n \ge 1 . \]
(2) Strict positivity of $\phi$. Let $f \in \mathcal{B}_{\mathrm{tree},\sigma}$ satisfy $f \ge 0$ and $f \not\equiv 0$. Assume for contradiction that $\phi(f) = 0$. Define the closed ideal generated by the forward orbit of $f$ by
\[ \mathcal{J} := \overline{\operatorname{span}} \Big\{ g \in \mathcal{B}_{\mathrm{tree},\sigma} : 0 \le g \le \sum_{k=0}^{N} \alpha_k P^{k} f \ \text{ for some } N \text{ and } \alpha_k \ge 0 \Big\} , \]
that is, the smallest closed ideal containing $\{ P^{k} f : k \ge 0 \}$. By construction, $\mathcal{J}$ is a closed ideal, nonzero because $f \in \mathcal{J}$, and $P$–invariant because $P(P^{k} f) = P^{k+1} f$ and $P$ is positive. For every $k \ge 0$ we have
\[ \phi(P^{k} f) = (\phi \circ P^{k})(f) = \phi(f) = 0 . \]
If $u$ is any finite nonnegative linear combination
\[ u = \sum_{k=0}^{N} \alpha_k P^{k} f , \qquad \alpha_k \ge 0 , \]
then positivity and linearity of $\phi$ give
\[ \phi(u) = \sum_{k=0}^{N} \alpha_k \, \phi(P^{k} f) = \sum_{k=0}^{N} \alpha_k \cdot 0 = 0 . \]
By continuity of $\phi$ and density of such $u$ in the positive cone of $\mathcal{J}$, it follows that
\[ g \in \mathcal{J} , \ g \ge 0 \;\Longrightarrow\; \phi(g) = 0 . \]
For a general $g \in \mathcal{J}$ we decompose $g = g^{+} - g^{-}$ with $g^{\pm} \ge 0$ and $g^{\pm} \in \mathcal{J}$ (ideal property), hence
\[ \phi(g) = \phi(g^{+}) - \phi(g^{-}) = 0 - 0 = 0 . \]
Thus $\phi$ vanishes identically on $\mathcal{J}$:
\[ \phi|_{\mathcal{J}} \equiv 0 . \]
Since $f \neq 0$, the ideal $\mathcal{J}$ is nonzero and $P$–invariant. By ideal–irreducibility, we must have $\mathcal{J} = \mathcal{B}_{\mathrm{tree},\sigma}$. In particular $h \in \mathcal{J}$, so $\phi(h) = 0$, contradicting the normalization $\phi(h) = 1$. Therefore no nonzero $f \ge 0$ can satisfy $\phi(f) = 0$, which proves
\[ f \in \mathcal{B}_{\mathrm{tree},\sigma} , \ f \ge 0 , \ f \not\equiv 0 \;\Longrightarrow\; \phi(f) > 0 . \]
This establishes both full support of $h$ and strict positivity of $\phi$. □
Corollary 5.40
(Positivity on cycle tests). Let $\Lambda = \mathbf{1}_{\{1,2\}}$. Then $\phi(\Lambda) > 0$.
Proof.
By Proposition 5.39, $h(1), h(2) > 0$ and $\phi$ is strictly positive on every nonzero $f \in B_{\mathrm{tree},\sigma}$ with $f \ge 0$. Since $\Lambda \ge 0$ and $\Lambda \not\equiv 0$, strict positivity yields $\phi(\Lambda) > 0$. □

5.6. Spectral Gap and Operator-Theoretic Consequences for P

By Proposition 4.11, the Lasota–Yorke constant at $(\alpha, \vartheta) = (\tfrac{1}{2}, \tfrac{1}{20})$ satisfies $\lambda_{\mathrm{LY}} < 1$, so $P$ is quasi–compact on $B_{\mathrm{tree},\sigma}$ with a uniform spectral gap in the strong seminorm.
The analytic chain on the operator side is now closed: the explicit computation of $C_{1/2}$ guarantees the contraction, and the Lasota–Yorke framework enforces quasi-compactness. The passage from this spectral picture to universal Collatz termination remains conditional on the invariant-functional hypothesis of Theorems 5.29 and 5.32. The following theorem summarizes the result.
Theorem 5.41
(Spectral gap and conditional consequences for Collatz). Let $P$ be the backward transfer operator associated with the Collatz map (1), acting on the multiscale Banach space $B_{\mathrm{tree},\sigma}$ with parameters $(\alpha, \vartheta) = (\tfrac{1}{2}, \tfrac{1}{20})$. Then:
(1)
The explicit branch estimates give a Lasota–Yorke inequality on $B_{\mathrm{tree},\sigma}$ with contraction constant
\[ \lambda_{\mathrm{LY}} := \max\{ \lambda_{\mathrm{even}}(\alpha, \vartheta),\ \lambda_{\mathrm{odd}}(\alpha, \vartheta) \} < 1. \]
Hence $P$ is quasi-compact on $B_{\mathrm{tree},\sigma}$ with $\rho_{\mathrm{ess}}(P) \le \lambda_{\mathrm{LY}} < 1$.
(2)
The eigenvalue $\lambda = 1$ is algebraically simple. There exist a unique positive eigenvector $h \in B_{\mathrm{tree},\sigma}$ and a unique positive invariant functional $\phi \in B_{\mathrm{tree},\sigma}^{*}$ such that
\[ P h = h, \qquad \phi \circ P = \phi, \qquad \phi(h) = 1. \]
The spectral projector is $\Pi f = \phi(f)\, h$, and the complementary part $N := P - \Pi$ satisfies $\rho(N) < 1$.
(3)
By the block recursion of Section 5.2 and the multiscale oscillation bounds on $h$, any eigenfunction corresponding to an eigenvalue with $|\lambda| = 1$ must be asymptotically block-constant. The weighted $\ell^1_\sigma$ contraction then forces such an eigenfunction to vanish unless it is proportional to $h$. Thus $h$ spans the entire peripheral spectrum. This is precisely the content of Theorem 5.28.
(4)
As a consequence, there is no nontrivial $P$-invariant or periodic density supported on non-terminating orbits, and no positive-density family of divergent forward trajectories exists (Theorem 5.28). If, in addition, every infinite forward Collatz orbit generates a nontrivial $P^{*}$–invariant functional $\Lambda \in B_{\mathrm{tree},\sigma}^{*}$ (the invariant-functional hypothesis of Theorems 5.29 and 5.32), then no infinite forward Collatz orbit can exist. Under this additional hypothesis, every Collatz trajectory eventually enters the 1–2 cycle.
Proof. 
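Fix $(\alpha, \vartheta) = (\tfrac{1}{2}, \tfrac{1}{20})$ and $\sigma > 1$. We verify the four claims.
(1) Lasota–Yorke inequality and quasi-compactness. By Proposition 4.11 there exist constants $0 < \lambda_{\mathrm{LY}} < 1$ and $C_{\mathrm{LY}} > 0$ such that for all $f \in B_{\mathrm{tree},\sigma}$,
\[ [P f]_{\mathrm{tree}} \le \lambda_{\mathrm{LY}}\, [f]_{\mathrm{tree}} + C_{\mathrm{LY}}\, \|f\|_{\sigma}. \]
Iterating gives
\[ [P^{n} f]_{\mathrm{tree}} \le \lambda_{\mathrm{LY}}^{n}\, [f]_{\mathrm{tree}} + C_{\mathrm{LY}}\, \|f\|_{\sigma}. \]
Since the embedding $B_{\mathrm{tree},\sigma} \hookrightarrow \ell^1_\sigma$ is compact, the Ionescu–Tulcea–Marinescu/Hennion theorem implies
\[ \rho_{\mathrm{ess}}(P) \le \lambda_{\mathrm{LY}} < 1, \]
so $P$ is quasi-compact.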
(2) Perron–Frobenius pair and rank-one projector. Positivity of $P$ and ideal-irreducibility (Lemma 5.38) imply that the peripheral spectrum is $\{1\}$ and that the eigenvalue $\lambda = 1$ is simple. Hence there exist unique positive elements
\[ h \in B_{\mathrm{tree},\sigma}, \qquad \phi \in B_{\mathrm{tree},\sigma}^{*}, \]
such that
\[ P h = h, \qquad \phi \circ P = \phi, \qquad \phi(h) = 1. \]
The corresponding rank-one projector is
\[ \Pi f = \phi(f)\, h. \]
Let $N := P - \Pi$. Then $\Pi N = N \Pi = 0$ and by (154),
\[ \rho(N) < 1. \]
Consequently,
\[ P^{n} f = \phi(f)\, h + N^{n} f, \qquad \|N^{n} f\|_{\mathrm{tree}} \le C\, \lambda_{\mathrm{LY}}^{n} \big( [f]_{\mathrm{tree}} + \|f\|_{\sigma} \big), \tag{157} \]
so $P^{n} f \to \phi(f)\, h$ exponentially fast.
(3) Decay profile of h and exclusion of peripheral eigenfunctions. Let c j denote the block averages of h . The effective block recursion (Proposition 5.14) yields
\[ c_j = a\, c_{j+1} + b\, c_{j-1} + \varepsilon_j, \qquad a, b > 0, \quad a + b = 1, \quad \sum_{j \ge 1} \vartheta^{-j} |\varepsilon_j| < \infty. \]
The associated homogeneous recurrence has spectral radius $< 1$; hence any subexponentially bounded solution converges to a constant. Using the tree-seminorm distortion control inside each block, one obtains
\[ h(n) \sim \frac{c}{n} \qquad (n \to \infty), \]
as in Proposition 5.13. This argument also shows that if $P h = \lambda h$ with $|\lambda| = 1$, then the same block recursion forces $h$ to be asymptotically constant. The weighted $\ell^1_\sigma$ contraction (Lemma 4.10) then forces $h \equiv 0$ unless $\lambda = 1$. Thus the peripheral spectrum is $\{1\}$, as asserted in Theorem 5.28.
(4) Excluding divergent mass and infinite orbits. Suppose, contrary to the claim, that there exists either:
(i) a nontrivial $P$-invariant or $P$-periodic density $g \ge 0$ supported on forward nonterminating trajectories, or
(ii) a set $S \subset \mathbb{N}$ of positive upper density whose elements generate only nonterminating forward orbits.
If (i) holds, write $g = \phi(g)\, h + g_0$ with $\phi(g_0) = 0$. Then $P^{q} g = g$ for some $q \ge 1$, and (157) gives
\[ g - \phi(g)\, h = N^{q} g_0, \]
forcing $g = \phi(g)\, h$. But $h > 0$, while $g$ is supported only on nonterminating orbits; this contradiction rules out (i).
If (ii) holds, the Krylov–Bogolyubov averages over $S \cap [1, N]$ produce a weak* accumulation point $\mu$ with $P^{*} \mu = \mu$, supported entirely on nonterminating values. By Theorem 5.28, every nontrivial $P^{*}$–invariant functional is a scalar multiple of $\phi$. Since $\phi$ assigns positive mass to all sufficiently large integers (via the profile $h(n) \sim c/n$), such a $\mu$ cannot be supported exclusively on the nonterminating part of the tree. Hence (ii) is impossible.
Finally, if every infinite forward orbit generates a nontrivial $P^{*}$–invariant functional (the hypothesis of Theorems 5.29 and 5.32), then the same spectral argument forces each such functional to be a scalar multiple of $\phi$. Since $\phi$ charges all levels, it cannot arise from an orbit that eventually avoids the terminating region. Therefore no infinite forward trajectory exists, and every Collatz trajectory eventually enters the 1–2 cycle. □
Remark  5.42 (Conditional termination)
The spectral conclusions of Theorem 5.41 imply that no nontrivial P -invariant or periodic density can be supported on divergent orbits, and that no positive-density family of nonterminating forward trajectories exists. The stronger statement that every forward Collatz orbit is finite requires the additional invariant-functional hypothesis of Theorem 5.32. Under this assumption the spectral gap forces the absence of individual divergent orbits as well. Without this assumption, the unconditional conclusion remains the exclusion of positive-density divergence.

6. Orbit Averages, Block Escape, and Forward Dynamics

We now turn to the proof of Theorem 6.2, using the analytic framework established in the preceding sections. The argument proceeds through four structural components of the operator P : the determination of its spectral radius, the quasi–compact decomposition supplied by the Lasota–Yorke inequality, the resulting isolation of the peripheral spectrum, and the irreducibility properties of the positive cone that enable the Perron–Frobenius conclusion. Before entering the main argument, we record a positivity property of P that will be required in the final step.
Proposition 6.1
(Strong positivity on the interior cone). Let
\[ C_{+} = \{ f \in B_{\mathrm{tree},\sigma} : f(n) \ge 0 \ \forall n \}, \qquad C_{++} = \{ f \in B_{\mathrm{tree},\sigma} : f(n) > 0 \ \forall n \} \]
denote the positive cone and its algebraic interior. Let $P$ be the backward Collatz transfer operator defined by
\[ (P f)(n) := \sum_{m \,:\, T(m) = n} \frac{f(m)}{m} = \frac{f(2n)}{2n} + \mathbf{1}_{\{ n \equiv 4 \ (6) \}}\, \frac{f\big( \tfrac{n-1}{3} \big)}{(n-1)/3}. \tag{158} \]
Then:
(1)
$P(C_{++}) \subset C_{++}$, i.e. $P$ maps strictly positive functions to strictly positive functions.
(2)
Let $f_1 \in C_{++}$ and $f_2 \in C_{+} \setminus \{0\}$. Let $f_2^{*} \in B_{\mathrm{tree},\sigma}^{*}$ be a functional such that
(a)
$f_2^{*}(g) \ge 0$ for all $g \in C_{+}$, and
(b)
$f_2^{*}(g) > 0$ for every nonzero $g \in C_{+}$ supported on the same dyadic blocks as $f_2$.
Then
\[ \langle P^{k} f_1, f_2^{*} \rangle > 0 \quad \text{for every integer } k \ge 0. \]
Proof. 
We prove (1) and (2) separately.
Proof of (1): $P$ preserves the interior $C_{++}$. Let $f \in C_{++}$, so $f(n) > 0$ for every $n \in \mathbb{N}$. From (158) we have, for each $n \in \mathbb{N}$,
\[ (P f)(n) = \frac{f(2n)}{2n} + \mathbf{1}_{\{ n \equiv 4 \ (6) \}}\, \frac{f\big( \tfrac{n-1}{3} \big)}{(n-1)/3}. \]
Since $2n \in \mathbb{N}$ and $f(2n) > 0$, the first term satisfies
\[ \frac{f(2n)}{2n} > 0 \quad \text{for all } n \in \mathbb{N}, \]
because $2n > 0$. The second term is nonnegative:
\[ \mathbf{1}_{\{ n \equiv 4 \ (6) \}}\, \frac{f\big( \tfrac{n-1}{3} \big)}{(n-1)/3} \ge 0, \]
since the indicator is 0 or 1, $(n-1)/3 > 0$ whenever it appears, and $f\big( (n-1)/3 \big) > 0$.
Therefore
\[ (P f)(n) \ge \frac{f(2n)}{2n} > 0 \quad \text{for every } n \in \mathbb{N}, \]
so $P f \in C_{++}$ and $P(C_{++}) \subset C_{++}$. By induction this extends to all iterates:
\[ P^{k}(C_{++}) \subset C_{++} \quad \text{for all } k \ge 0. \]
Proof of (2): strict positivity of pairings with $P^{k} f_1$. Fix $f_1 \in C_{++}$, $f_2 \in C_{+} \setminus \{0\}$, and a dual functional $f_2^{*} \in B_{\mathrm{tree},\sigma}^{*}$ satisfying (a) and (b).
Since $f_2 \ne 0$, there exists at least one dyadic block $I_J$ such that
\[ \max_{n \in I_J} f_2(n) > 0. \]
Define
\[ S_J := \{ n \in I_J : f_2(n) > 0 \}. \]
Then $S_J \ne \emptyset$, and $f_2$ is strictly positive on $S_J$.
Now fix any $k \ge 0$. By part (1), $P^{k} f_1 \in C_{++}$, so
\[ P^{k} f_1(n) > 0 \quad \text{for every } n \in \mathbb{N}, \]
and in particular
\[ P^{k} f_1(n) > 0 \quad \text{for all } n \in S_J. \]
Decompose $P^{k} f_1$ into two nonnegative parts according to the support of $f_2$. Define
\[ g_k(n) := \begin{cases} P^{k} f_1(n), & n \in \operatorname{supp}(f_2), \\ 0, & \text{otherwise}, \end{cases} \qquad h_k := P^{k} f_1 - g_k. \]
Then $g_k, h_k \in C_{+}$, $g_k$ is supported on the same dyadic blocks as $f_2$, and $g_k \not\equiv 0$ because $P^{k} f_1(n) > 0$ on $S_J \subset \operatorname{supp}(f_2)$.
By assumption (b) we have
\[ f_2^{*}(g_k) > 0, \]
and by assumption (a) we have
\[ f_2^{*}(h_k) \ge 0. \]
Therefore,
\[ \langle P^{k} f_1, f_2^{*} \rangle = f_2^{*}(P^{k} f_1) = f_2^{*}(g_k + h_k) = f_2^{*}(g_k) + f_2^{*}(h_k) > 0. \]
This holds for every integer $k \ge 0$, which proves (2) and completes the proof. □
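As an illustration of formula (158), the following Python sketch evaluates $P$ on functions given as callables and checks part (1) of Proposition 6.1 on one sample input. The helper names `collatz`, `preimages` and `transfer`, and the test density $f(n) = 1/n^2$, are assumptions introduced here for illustration and do not appear in the text; exact rational arithmetic is used only to avoid rounding.

```python
from fractions import Fraction

def collatz(m):
    """Forward Collatz map T from (1): m/2 if m is even, 3m+1 if m is odd."""
    return m // 2 if m % 2 == 0 else 3 * m + 1

def preimages(n):
    """All m >= 1 with T(m) = n: always 2n, plus (n-1)/3 when n = 4 mod 6."""
    pre = [2 * n]
    if n % 6 == 4:
        pre.append((n - 1) // 3)
    return pre

def transfer(f):
    """Return the function P f, where (P f)(n) sums f(m)/m over T(m) = n."""
    def Pf(n):
        return sum(f(m) / m for m in preimages(n))
    return Pf

# Hypothetical strictly positive test density f(n) = 1/n^2.
f = lambda n: Fraction(1, n * n)

# Three applications of P; positivity is checked on a finite window only.
P3f = transfer(transfer(transfer(f)))
print(all(P3f(n) > 0 for n in range(1, 51)))
```

A finite check of this kind does not replace the proof; it only confirms that the two preimage branches are implemented as in (158).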
Theorem 6.2
(Peripheral Spectral Classification). Let $P$ be the backward Collatz transfer operator acting on the multiscale tree Banach space $B_{\mathrm{tree},\sigma}$. Then
\[ P \text{ is quasi–compact}, \qquad \operatorname{spec}(P) \cap \{ |z| = 1 \} = \{ 1 \}, \]
and the eigenvalue 1 is algebraically and geometrically simple.
Proof. 
We use the analytic structure developed previously: the two–norm Lasota–Yorke inequality on $B_{\mathrm{tree},\sigma}$, the compact embedding into $\ell^1_\sigma$, the existence of an invariant density, the block recursion for eigenfunctions, the Dirichlet classification of peripheral eigenvalues, and the strong positivity of $P$ on the interior cone.
(1) Quasi–compactness. By Proposition 4.11 there exist constants $0 < \lambda_{\mathrm{LY}} < 1$ and $C_{\mathrm{LY}} > 0$ such that for all $f \in B_{\mathrm{tree},\sigma}$,
\[ [P f]_{\mathrm{tree}} \le \lambda_{\mathrm{LY}}\, [f]_{\mathrm{tree}} + C_{\mathrm{LY}}\, \|f\|_{\sigma}. \tag{159} \]
The inclusion $B_{\mathrm{tree},\sigma} \hookrightarrow \ell^1_\sigma$ is compact by construction of the multiscale tree norm. Consequently, the Ionescu–Tulcea–Marinescu–Hennion theorem applies to (159) and yields a decomposition
\[ P = K + R, \qquad \|R\| \le \lambda_{\mathrm{LY}} < 1, \]
with $K$ compact on $B_{\mathrm{tree},\sigma}$. Thus $P$ is quasi–compact and
\[ \rho_{\mathrm{ess}}(P) \le \lambda_{\mathrm{LY}} < 1. \]
(2) Existence of a positive eigenfunction at eigenvalue 1. Section 5.2 constructs a strictly positive invariant density
\[ h \in B_{\mathrm{tree},\sigma}, \qquad P h = h, \]
by solving the block recursion with the normalization supplied by the Dirichlet transform. In particular, $1 \in \operatorname{spec}(P)$, so
\[ \rho(P) \ge 1. \]
(We do not use any $\ell^1_\sigma$ mass-preservation; the existence of the eigenpair $(1, h)$ is established directly via the multiscale recursion and Dirichlet representation.)
(3) Classification of the peripheral spectrum. Let $z \in \operatorname{spec}(P)$ with $|z| = 1$, and suppose $f \ne 0$ satisfies $P f = z f$. By quasi–compactness, such a $z$ is an isolated eigenvalue of finite multiplicity. The block–average recursion of Proposition 5.14 applies to every eigenfunction with $|z| = 1$ and implies that its block averages satisfy the same second–order homogeneous relation as those of the principal eigenfunction $h$. In particular, if $c_j(f)$ denotes the block averages of $f$, then
\[ c_j(f) \sim C\, 6^{-j}, \qquad j \to \infty, \]
for some constant $C \ne 0$. Combined with the tree–seminorm distortion bounds within each block, this shows that every peripheral eigenfunction is asymptotically block–constant with the same decay profile as $h$, in the sense that its mass distribution on $I_j$ is proportional to $6^{-j}$ to leading order.
Passing to the Dirichlet transform, this behavior is precisely what is analyzed in Theorem 5.28: the associated Dirichlet series of a peripheral eigenfunction extends holomorphically to the half–plane ( s ) σ except for the simple pole at s = 1 coming from the invariant density. The residue comparison in Theorem 5.28 shows that any eigenfunction with | z | = 1 must therefore be proportional to the invariant density h . In particular, there is no eigenvalue on the unit circle other than 1, and
\[ \operatorname{spec}(P) \cap \{ |z| = 1 \} = \{ 1 \}, \]
with $h$ spanning the peripheral eigenspace.
(4) Perron–Frobenius simplicity via strong positivity. We now invoke the cone structure. Let
\[ C_{+} = \{ f \in B_{\mathrm{tree},\sigma} : f(n) \ge 0 \ \forall n \}, \qquad C_{++} = \{ f \in B_{\mathrm{tree},\sigma} : f(n) > 0 \ \forall n \} \]
denote the positive cone and its algebraic interior. By definition of $P$ every coefficient in (11) is nonnegative, so $P$ is positive: $P(C_{+}) \subset C_{+}$. Proposition 6.1 strengthens this to
\[ P(C_{++}) \subset C_{++}, \]
so $P$ is strongly positive on the interior of the cone.
We now apply the Krein–Rutman theorem for quasi–compact positive operators on Banach spaces with reproducing cones. Since $\rho_{\mathrm{ess}}(P) < 1$, $1 \in \operatorname{spec}(P)$ with a strictly positive eigenfunction $h \in C_{++}$, and $P$ acts strongly positively on $C_{++}$, Krein–Rutman implies that:
\[ \ker(P - I) \text{ is one–dimensional}, \qquad \ker(P - I)^{k} \text{ is one–dimensional for all } k \ge 1. \]
Thus the eigenvalue 1 is both geometrically and algebraically simple.
Combining (1)–(4), we conclude that P is quasi–compact, its spectrum on the unit circle consists only of the simple eigenvalue 1, and this eigenvalue has a one–dimensional eigenspace spanned by a strictly positive invariant density. This proves the theorem. □

6.1. Orbit Averages and P*-Invariant Functionals

The spectral analysis developed in Section 4, Section 5 and Section 6 yields a complete resolution of the backward Collatz dynamics. In particular, Theorem 6.2 establishes that the backward transfer operator P acting on the multiscale Banach space B tree , σ is quasi–compact with a simple, isolated eigenvalue at 1, and that all other spectral values satisfy | z | < 1 . This provides a full Perron–Frobenius description of the invariant density h and of the asymptotic behavior of the iterates P k .
In the transfer–operator framework developed here, the backward spectral picture is treated unconditionally: the operator P acts quasi–compactly on B tree , σ with a spectral gap and a Perron–Frobenius eigenstructure. The remaining forward–dynamical content enters through a single orbitwise averaging mechanism, namely whether an infinite forward orbit leaves a nontrivial trace in the dual space via Cesàro averages. We isolate this mechanism as a named principle and record its consequences for the Strong Collatz conclusion.
Theorem 6.3
(Orbit–averaging principle). Let $n_0 \in \mathbb{N}$, and let
\[ O^{+}(n_0) = \{ T^{k} n_0 \}_{k \ge 0} \]
denote its forward Collatz orbit. For each $N \ge 1$ define the Cesàro orbit functional (131) by
\[ \Lambda_N(f) = \frac{1}{N} \sum_{k=0}^{N-1} f(T^{k} n_0), \qquad f \in B_{\mathrm{tree},\sigma}. \]
If the forward orbit $O^{+}(n_0)$ is infinite, then there exists a subsequence $N_j \to \infty$ such that
\[ \Lambda_{N_j} \xrightarrow{\ w^{*}\ } \Phi, \qquad \Phi \ne 0, \]
and the limiting functional $\Phi$ is invariant under the dual operator:
\[ P^{*} \Phi = \Phi. \]
Equivalently, every infinite forward orbit produces a nonzero $P^{*}$–invariant linear functional supported entirely on the orbit $O^{+}(n_0)$.
Remark  6.4 (Location of the proof)
Theorem 6.3 is statement (3) in Theorem 1.1. Its proof is given later in Section 7, specifically as Proposition 7.24, after the discrepancy control and blockwise estimates required to extract a nontrivial weak* limit have been established.
Remark  6.5 (Discussion and equivalent forms)
Lemma 5.30 shows that the Cesàro averages $(\Lambda_N)_{N \ge 1}$ form a uniformly bounded family in $B_{\mathrm{tree},\sigma}^{*}$ and are therefore weak* relatively compact. Weak* limit points consequently exist for every forward orbit. The content of Theorem 6.3 is the nontriviality assertion $\Phi \ne 0$ for at least one such limit point in the case of an infinite orbit. By Proposition 5.31, any nontrivial weak* limit is automatically $P^{*}$–invariant and supported entirely on the forward orbit $O^{+}(n_0)$.
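The Cesàro functional $\Lambda_N$ of Theorem 6.3 is straightforward to evaluate on a concrete orbit. The sketch below is illustrative only: the names `collatz` and `cesaro_functional` are hypothetical, and the starting value 27 is chosen as a familiar terminating orbit. It can only exhibit the terminating case, where the averages concentrate on the terminal cycle of the unaccelerated map from (1); it says nothing about hypothetical infinite orbits.

```python
def collatz(m):
    """Forward Collatz map T from (1): m/2 if m is even, 3m+1 if m is odd."""
    return m // 2 if m % 2 == 0 else 3 * m + 1

def cesaro_functional(f, n0, N):
    """Lambda_N(f) = (1/N) * sum_{k=0}^{N-1} f(T^k n0)."""
    total, n = 0.0, n0
    for _ in range(N):
        total += f(n)
        n = collatz(n)
    return total / N

# Indicator of the terminal cycle 1 -> 4 -> 2 -> 1 of the unaccelerated map.
cycle_indicator = lambda n: 1.0 if n in (1, 2, 4) else 0.0

for N in (100, 1000, 10000):
    print(N, cesaro_functional(cycle_indicator, 27, N))
```

For a terminating orbit the printed averages approach 1 as $N$ grows, since only finitely many initial terms lie outside the cycle.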
Remark  6.6 (Spectral visibility and termination)
The Orbit–Averaging Principle asserts that an infinite Collatz trajectory cannot be spectrally invisible in the backward geometry: it must generate a genuine invariant trace in $B_{\mathrm{tree},\sigma}^{*}$. Theorem 6.2 identifies the $P^{*}$–fixed space as one–dimensional and spanned by the Perron–Frobenius functional $\phi$, which is strictly positive on all of $\mathbb{N}$. If an infinite orbit produced a nonzero invariant functional $\Phi \in B_{\mathrm{tree},\sigma}^{*}$ supported on that orbit, then necessarily $\Phi = c\, \phi$ for some $c \ne 0$, which is incompatible with orbit support. Thus
\[ \text{Theorem 6.2} + \text{Theorem 6.3} \implies \text{every Collatz trajectory is finite}. \]
In this sense, Theorem 6.3 isolates the forward–dynamical input which couples the spectral gap to universal termination.

6.2. Block-Structured Implications of the Orbit-Averaging Framework

In this section we establish a block–structured reduction which connects orbitwise averaging to a quantitative forward growth bound for Collatz iterates. The detailed proofs are given in Section 7. The reduction isolates an explicit forward estimate whose validity, combined with the unconditional spectral results developed earlier, excludes infinite trajectories. In particular, the argument identifies the forward growth scenario that would have to occur along any hypothetical nonterminating orbit.
Recall the block decomposition
\[ I_j := [6^{j}, 6^{j+1}), \qquad j \ge 0, \]
and, for a given forward orbit
\[ O^{+}(n_0) = \{ T^{k} n_0 \}_{k \ge 0}, \]
define the block index
\[ J(k) := \text{the unique integer } j \ge 0 \text{ with } T^{k} n_0 \in I_j. \]
Then
\[ 6^{J(k)} \le T^{k} n_0 < 6^{J(k)+1} \qquad \text{and} \qquad J(k) \le \frac{\log T^{k} n_0}{\log 6} < J(k) + 1. \]
The orbitwise averaging mechanism admits the following block formulation.
Proposition 6.7
(Block–orbit averaging). Let $n_0 \in \mathbb{N}$ have an infinite forward Collatz orbit $O^{+}(n_0) = \{ T^{k} n_0 \}_{k \ge 0}$, and let $J(k)$ denote the block index of $T^{k} n_0$, so that $T^{k} n_0 \in I_{J(k)}$. Then there exist an integer $J \ge 0$ and a constant $\delta > 0$ such that
\[ \liminf_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{\{ J(k) \le J \}} \ge \delta. \tag{166} \]
Equivalently, every infinite forward orbit spends a positive proportion of its time inside the finite union of low blocks $\bigcup_{j \le J} I_j$.
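To make the block index and the frequency in (166) concrete, here is a minimal Python sketch. The function names `block_index` and `low_block_frequency` are hypothetical, and the computation can only be run on finite prefixes of known (terminating) orbits, so it illustrates the quantities rather than the proposition itself.

```python
def collatz(m):
    """Forward Collatz map T from (1)."""
    return m // 2 if m % 2 == 0 else 3 * m + 1

def block_index(n):
    """Unique j >= 0 with 6**j <= n < 6**(j+1), computed in exact integers."""
    j = 0
    while n >= 6 ** (j + 1):
        j += 1
    return j

def low_block_frequency(n0, N, J):
    """(1/N) * #{0 <= k < N : J(k) <= J} along the forward orbit of n0."""
    hits, n = 0, n0
    for _ in range(N):
        if block_index(n) <= J:
            hits += 1
        n = collatz(n)
    return hits / N

print([block_index(n) for n in (1, 5, 6, 35, 36)])
print(low_block_frequency(27, 10000, 0))
```

Integer comparisons are used instead of floating-point logarithms so that boundary values such as $n = 6^{j}$ are assigned to the correct block.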
Remark  6.8 (Location of the proof)
Proposition 6.7 is proved in Section 7; in the implication architecture it appears as Proposition 7.25.
Given Proposition 6.7, the Orbit–Averaging Principle follows by a test function argument. Choose a nonnegative function $f \in B_{\mathrm{tree},\sigma}$ supported in $\bigcup_{j \le J} I_j$ and not identically zero. Since the support is finite, one may fix a constant $c > 0$ and a finite subset $S \subset \bigcup_{j \le J} I_j$ such that $f(n) \ge c$ for all $n \in S$. Then
\[ \frac{1}{N} \sum_{k=0}^{N-1} f(T^{k} n_0) \ge c \cdot \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{\{ T^{k} n_0 \in S \}}, \]
and (166) implies a positive lower bound on the liminf of the left–hand side. Consequently, any weak* limit point of the Cesàro functionals $\Lambda_N$ is nonzero. By the invariance mechanism established earlier, such a nonzero limit is $P^{*}$–invariant, and the Perron–Frobenius uniqueness statement from Section 5 rules out the existence of an infinite orbit with this property.

6.2.1. Contrapositive to Block Escape

We now proceed contrapositively: we assume that the block–averaging statement (166) fails for a given infinite orbit and deduce a lower bound on the growth of its block index. The resulting conclusion is a quantitative form of escape, which is then linked to a supercritical linear growth scenario under an additional forward estimate isolated later in the paper.
Definition 6.9
(Block escape). We say that an infinite orbit $O^{+}(n_0)$ escapes all finite block unions if for every $J \ge 0$,
\[ \lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{\{ J(k) \le J \}} = 0. \tag{167} \]
Equivalently, for each fixed $J$ the orbit visits the finite union of blocks $\bigcup_{j \le J} I_j$ with zero asymptotic frequency.
Remark  6.10 (Block escape versus eventual escape)
Definition 6.9 expresses block escape in a density–based form: for each fixed $J$, the orbit spends asymptotically negligible time in the finite union $\bigcup_{j \le J} I_j$. This still permits infinitely many returns to low blocks, provided the returns become increasingly sparse. A stronger notion is eventual escape, meaning that $J(k) \to \infty$ and, for every threshold $J$, the orbit eventually remains in $\bigcup_{j > J} I_j$ forever.
These notions become equivalent once a weak non–retreat mechanism is available. Weak non–retreat asserts that, at sufficiently large scales, the block index cannot hover near a fixed height for long intervals: whenever the orbit reaches a high block, it must realize a definite upward gain within uniformly bounded time. Consequently, if the density–zero escape condition (167) holds, repeated applications of weak non–retreat force the block index to drift upward and eventually cross every finite threshold, yielding eventual escape.
Note that (167) does not by itself imply a quantitative growth rate. Indeed, sequences such as $J(k) = \lfloor \log_2 k \rfloor$ (for $k \ge 1$) satisfy
\[ \frac{1}{N}\, \#\{ 0 \le k < N : J(k) \le J \} \to 0 \qquad (J \text{ fixed}), \]
while still obeying $J(k)/k \to 0$. Thus density–zero escape alone yields only that the orbit visits higher and higher blocks in a sparse sense, without imposing linear growth in $k$.
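The slow-escape example can be checked numerically. The following sketch treats $J(k) = \lfloor \log_2 k \rfloor$ purely as a model sequence, not as the block index of any Collatz orbit; the function names are hypothetical, and the index range starts at $k = 1$ because the logarithm is undefined at $0$.

```python
def slow_block_index(k):
    """Model sequence J(k) = floor(log2(k)) for k >= 1 (not a Collatz orbit)."""
    return k.bit_length() - 1

def low_block_density(N, J):
    """(1/N) * #{1 <= k < N : J(k) <= J} for the model sequence."""
    return sum(1 for k in range(1, N) if slow_block_index(k) <= J) / N

# Both columns shrink as N grows: zero density in low blocks, sublinear J(k).
for N in (2**8, 2**12, 2**16):
    print(N, low_block_density(N, 3), slow_block_index(N) / N)
```

For fixed $J$ the count of indices with $J(k) \le J$ is bounded by $2^{J+1} - 1$, so the density decays like $1/N$ while $J(k)/k$ decays like $(\log k)/k$.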

6.2.2. Quantitative Forward Valuation Growth

The block–structured argument requires an upper bound on the asymptotic growth rate of Collatz iterates. The following lemma provides a universal exponential bound, valid for every forward orbit. Its proof is elementary.
Lemma 6.11
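(Universal exponential growth bound). For every $n_0 \in \mathbb{N}$ and every $k \ge 0$, the (accelerated) Collatz map satisfies
\[ T^{k}(n_0) + 1 \le \Big( \frac{3}{2} \Big)^{k} (n_0 + 1). \tag{168} \]
Consequently,
\[ \limsup_{k \to \infty} \frac{1}{k} \log T^{k}(n_0) \le \log \frac{3}{2}. \tag{169} \]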
Proof. 
We claim that for every $n \ge 1$,
\[ T(n) + 1 \le \frac{3}{2} (n + 1). \tag{170} \]
If $n$ is odd then $T(n) = (3n+1)/2$ and
\[ T(n) + 1 = \frac{3n+1}{2} + 1 = \frac{3n+3}{2} = \frac{3}{2} (n + 1), \]
so (170) holds with equality. If $n$ is even then $T(n) = n/2$ and
\[ T(n) + 1 = \frac{n}{2} + 1 \le \frac{3n}{2} + \frac{3}{2} = \frac{3}{2} (n + 1), \]
so (170) holds as well.
Iterating (170) gives (168) by induction:
\[ T^{k+1}(n_0) + 1 \le \frac{3}{2} \big( T^{k}(n_0) + 1 \big) \le \Big( \frac{3}{2} \Big)^{k+1} (n_0 + 1). \]
Finally, (169) follows by taking logarithms in (168), dividing by $k$, and letting $k \to \infty$. □
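The bound (168) can be sanity-checked in exact integer arithmetic by clearing denominators, that is, by testing $2^{k}\,(T^{k}(n_0) + 1) \le 3^{k}\,(n_0 + 1)$. The sketch below assumes the accelerated map of the lemma ($n/2$ for even $n$, $(3n+1)/2$ for odd $n$); the function names are hypothetical.

```python
def accelerated_collatz(n):
    """Accelerated map used in Lemma 6.11: n/2 if n is even, (3n+1)/2 if odd."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

def bound_holds(n0, K):
    """Check 2^k * (T^k(n0) + 1) <= 3^k * (n0 + 1) for all 0 <= k <= K."""
    n = n0
    for k in range(K + 1):
        if 2 ** k * (n + 1) > 3 ** k * (n0 + 1):
            return False
        n = accelerated_collatz(n)
    return True

print(all(bound_holds(n0, 60) for n0 in range(1, 2001)))
```

Equality in (170) occurs exactly at odd steps, so starting values such as $n_0 = 7$, whose first three iterates 7, 11, 17 are odd, attain (168) with equality for $k \le 3$.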
Since $\log(3/2) < \log 6$, Lemma 6.11 yields a universal exponential upper bound
\[ \limsup_{k \to \infty} \frac{1}{k} \log T^{k}(n_0) \le \log \frac{3}{2}, \]
whose right–hand side is strictly below $\log 6$. We now show that if an infinite orbit satisfies the Block–Escape Property and, in addition, admits a subsequence of block indices with supercritical linear growth, then along that subsequence the orbit must satisfy
\[ \frac{1}{k} \log T^{k}(n_0) \ge \alpha \log 6 \]
for some $\alpha > \dfrac{\log(3/2)}{\log 6}$. This forces the exponential growth rate to exceed $\log(3/2)$ and therefore contradicts Lemma 6.11.

6.3. From Block Escape to Linear Block Growth

In this subsection we isolate the two logically distinct inputs that drive the eventual contradiction with the universal growth bound. The first is the Block–Escape Property, which controls only the asymptotic frequency of returns to any fixed collection of low blocks. The second is a quantitative growth input asserting that the block index J ( k ) admits a subsequence with linear lower growth. This linear subsequence condition is not implied by block–escape alone; it is introduced as an explicit additional combinatorial hypothesis in Proposition 6.14.
Proposition 6.12
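(Block–escape and $\alpha^{*}$–supercritical linear block growth contradict the universal bound). Let $O^{+}(n_0) = \{ T^{k}(n_0) \}_{k \ge 0}$ be an infinite forward orbit and let $J(k)$ be the block index defined by $T^{k}(n_0) \in [6^{J(k)}, 6^{J(k)+1})$. Let $\alpha^{*}$ denote the critical threshold determined by $\alpha^{*} \log 6 = \log(3/2)$. Assume:
\[ \forall J_0 \ge 0, \qquad \lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{\{ J(k) \le J_0 \}} = 0, \tag{171} \]
and that there exist $\alpha > \alpha^{*}$ and a strictly increasing subsequence $k_\ell \to \infty$ such that
\[ J(k_\ell) \ge \alpha\, k_\ell \qquad (\ell \ge 1). \tag{172} \]
Then the orbit violates Lemma 6.11. In particular, no orbit can satisfy both (171) and (172) with $\alpha > \alpha^{*}$.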
Proof. 
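From $6^{J(k_\ell)} \le T^{k_\ell}(n_0)$ we get
\[ \frac{1}{k_\ell} \log T^{k_\ell}(n_0) \ge \frac{J(k_\ell)}{k_\ell} \log 6 \ge \alpha \log 6. \]
Taking $\liminf$ yields
\[ \liminf_{\ell \to \infty} \frac{1}{k_\ell} \log T^{k_\ell}(n_0) \ge \alpha \log 6. \]
Since $\alpha > \alpha^{*}$ and $\alpha^{*} \log 6 = \log(3/2)$, we have $\alpha \log 6 > \log(3/2)$. This contradicts Lemma 6.11, which gives
\[ \limsup_{k \to \infty} \frac{1}{k} \log T^{k}(n_0) \le \log(3/2). \qquad \square \]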
Remark 6.13.
The Block–Escape Property is a statement about visit frequencies: it says that for each fixed $J_0$ the orbit spends asymptotically zero density of time in the union $\bigcup_{j \le J_0} I_j$. This excludes persistent trapping in any bounded collection of low blocks, yet it does not, by itself, quantify how fast the block index $J(k)$ must tend to infinity. In particular, BEP is compatible with extremely slow divergence of $J(k)$ along sparse subsequences, for instance behavior of the form $J(k) \to \infty$ with $J(k) = o(k)$ (and even heuristically $J(k) \sim \log k$). The linear lower bound (172) is therefore a genuinely stronger, additional combinatorial hypothesis. It is introduced explicitly in Proposition 6.14 and is not implied by BEP alone.
Combining Proposition 6.12 with the block formulation of orbit averages, we obtain the following dichotomy. If an infinite forward orbit satisfies the Block–Escape Property, then Proposition 6.14 forces supercritical linear block growth, which by Proposition 6.12 contradicts the universal exponential growth bound. Thus, assuming this implication, an infinite orbit cannot satisfy block–escape. Consequently every infinite orbit must fail BEP, meaning that there exists a finite threshold $J_0$ for which the orbit spends positive upper density in the low–block region $\bigcup_{j \le J_0} I_j$. In that case, the block formulation of orbit averages supplies a subsequence of Cesàro functionals converging to a weak* limit $\Lambda$ with $\Lambda\big( \sum_{j \le J_0} f_j \big) > 0$ for a suitable nonnegative test function supported on $\bigcup_{j \le J_0} I_j$. This is precisely the positivity input required by the Block–Orbit–Averaging principle, and it identifies the remaining forward obstruction as the passage from positive mass on low blocks to a nonzero $P^{*}$–invariant limit functional supported on the orbit.

6.4. The Linear Block Growth and Valuation Drift

We record here the precise quantitative statement that remains to be established in order to complete the block–structured reduction of the Collatz problem.
Proposition 6.14
(Collatz Block–Escape Implies Supercritical Linear Block Growth). Let $O^{+}(n_0) = \{ T^{k}(n_0) \}_{k \ge 0}$ be an infinite forward Collatz orbit, and let $J(k)$ denote its block index determined by
\[ T^{k}(n_0) \in [6^{J(k)}, 6^{J(k)+1}). \]
Assume the orbit satisfies the Block–Escape Property
\[ \forall J_0 \ge 0, \qquad \lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{\{ J(k) \le J_0 \}} = 0, \]
so that the orbit spends zero density in low blocks.
Define the critical drift threshold
\[ \alpha^{*} := \frac{\log_2 3 - 1}{\log_2 6}. \]
Then the Collatz dynamics forces a supercritical linear rate of block growth: there exists a constant $\alpha > \alpha^{*}$ and an increasing infinite subsequence $k_1 < k_2 < \cdots$ such that
\[ J(k_\ell) \ge \alpha\, k_\ell \quad \text{for all } \ell. \]
Equivalently,
\[ T^{k_\ell}(n_0) \ge 6^{\alpha k_\ell}, \]
so the orbit exhibits genuine exponential growth along a subsequence at a rate strictly larger than the critical drift exponent $\alpha^{*}$. Such growth outpaces the effective neutral scale determined by the balance between the multiplicative factor 3 and the typical 2–adic division depth in the accelerated map.
Remark  6.15 (Location of the proof)
Proposition 6.14 is proved in Section 7, namely Proposition 7.28; in the implication architecture it appears as (5) in Theorem 1.1.
Even without Proposition 6.14, the Block–Escape Property already implies a coarse divergence estimate. Specifically:
\[ \lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} J(k) = \infty. \tag{173} \]
Indeed, fix $M \ge 0$. Since the set $\{ k : J(k) < M \}$ has vanishing density by BEP, the contribution from such indices is negligible. On the other hand, for all $k$ with $J(k) \ge M$, the summand contributes at least $M$. Taking the limit inferior and letting $M \to \infty$ yields (173).
Divergence of the average is insufficient. The growth asserted in Proposition 6.14 is far stronger than the mere divergence of the block–index average
\[ \frac{1}{N} \sum_{k < N} J(k). \]
Indeed, a slowly escaping sequence such as
\[ J(k) = \lfloor \log k \rfloor \]
satisfies
\[ \frac{1}{N} \sum_{k < N} J(k) \sim \log N, \qquad \frac{J(k)}{k} \to 0, \]
and also has vanishing density in every finite block window. Thus the Block–Escape Property (BEP) is compatible with arbitrarily slow sublinear escape, and the divergence of the Cesàro average of $J(k)$ does not imply the existence of a subsequence with linear growth.
Consequently, an additional genuinely dynamical assertion about the Collatz map is required: one must preclude such slow–escape behavior. In minimal form, the missing implication is
\[ (\mathrm{BEP}) \implies \limsup_{k \to \infty} \frac{J(k)}{k} > \alpha^{*}, \qquad \alpha^{*} := \frac{\log_2 3 - 1}{\log_2 6}. \]
This is precisely the content of Proposition 6.14: within genuine Collatz dynamics, the block–escape mechanism is expected to force a supercritical linear rate of escape along an infinite subsequence, with slope strictly exceeding the critical drift exponent $\alpha^{*}$. In particular, Proposition 6.14 asserts that Collatz orbits cannot realize the extremely slow, logarithmic escape patterns that are still compatible with BEP itself.
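For orientation, the critical threshold can be evaluated numerically. The short sketch below checks that the definition $\alpha^{*} = (\log_2 3 - 1)/\log_2 6$ agrees with the form $\alpha^{*} \log 6 = \log(3/2)$ used in Proposition 6.12; the variable names are hypothetical.

```python
import math

# Definition of the critical drift threshold as in Proposition 6.14.
alpha_star = (math.log2(3) - 1) / math.log2(6)

# Equivalent form used in Proposition 6.12: alpha* * log 6 = log(3/2).
alpha_star_alt = math.log(3 / 2) / math.log(6)

print(round(alpha_star, 4))
print(math.isclose(alpha_star, alpha_star_alt))
print(math.isclose(6 ** alpha_star, 3 / 2))
```

Numerically $\alpha^{*} \approx 0.2263$, so the supercritical regime requires the block index to climb by more than roughly one block every 4.4 steps along the subsequence.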

6.5. Dangerous Residues and Valuation Statistics

The block analysis developed in the preceding subsections reduces the forward Collatz problem to a quantitative question about the frequency and distribution of the 2–adic valuations $\nu_2(3n+1)$ along odd–to–odd iterates. This section records the valuation tools used in that reduction.
The first part gives an exact affine description of finite valuation patterns and derives density bounds showing that long windows in which all valuations are large are exponentially sparse. These statements are unconditional and reflect the congruential rigidity of the recurrence $n \mapsto (3n+1)/2^{\nu_2(3n+1)}$. The second part isolates a deterministic drift implication: a window of below–average valuations forces a definite increase in the logarithmic scale of the trajectory (Theorem 6.18). This drift estimate is the valuation–theoretic input used later when converting block escape into quantitative forward growth.
Lemma 6.16
(Arithmetic description of finite valuation patterns). Let $t \ge 1$ and fix a sequence of positive integers
\[ (a_0, a_1, \dots, a_{t-1}) \in \mathbb{N}^{t}. \]
Consider the accelerated odd–to–odd Collatz evolution
\[ n_{i+1} = \frac{3 n_i + 1}{2^{a_i}}, \qquad i = 0, \dots, t-1, \]
with all $n_i$ odd. Let $S_t := a_0 + \cdots + a_{t-1}$. Then:
1.
For each $t$ and $(a_0, \dots, a_{t-1})$, there exists an integer $C_t$ such that
\[ n_t = \frac{3^{t}}{2^{S_t}}\, n_0 + \frac{C_t}{2^{S_t}}. \tag{174} \]
2.
Conversely, $n_0$ yields an integer odd trajectory $(n_i)_{i=0}^{t}$ with this valuation pattern if and only if
\[ 3^{t} n_0 + C_t \equiv 0 \pmod{2^{S_t}} \tag{175} \]
and the intermediate values $n_i$ satisfy the oddness and valuation constraints $\nu_2(3 n_i + 1) = a_i$, which can be enforced by finitely many additional congruence conditions modulo powers of 2.
3.
In particular, the set of odd $n_0$ producing this pattern is a finite union of arithmetic progressions modulo a modulus of the form
\[ M_t = 2^{S_t + O(t)}\, 3^{t}. \]
Proof. 
We first prove the affine relation (174) by induction and obtain an explicit formula for $C_t$. For $t = 1$ we have
\[ n_1 = \frac{3 n_0 + 1}{2^{a_0}} = \frac{3}{2^{a_0}}\, n_0 + \frac{1}{2^{a_0}}, \]
so $S_1 = a_0$ and (174) holds with $C_1 = 1$. Now suppose that for some $t \ge 1$ we have
\[ n_t = \frac{3^{t}}{2^{S_t}}\, n_0 + \frac{C_t}{2^{S_t}} \]
for an integer $C_t$ and $S_t = a_0 + \cdots + a_{t-1}$. Then
\[ n_{t+1} = \frac{3 n_t + 1}{2^{a_t}} = \frac{3}{2^{a_t}} \Big( \frac{3^{t}}{2^{S_t}}\, n_0 + \frac{C_t}{2^{S_t}} \Big) + \frac{1}{2^{a_t}} = \frac{3^{t+1}}{2^{S_{t+1}}}\, n_0 + \frac{3 C_t + 2^{S_t}}{2^{S_{t+1}}}, \]
where $S_{t+1} = S_t + a_t$. Thus (174) holds for $t + 1$ with
\[ C_{t+1} := 3 C_t + 2^{S_t}, \]
and by induction we obtain an integer $C_t$ for every $t \ge 1$. Unwinding the recurrence gives the explicit expression
\[ C_t = \sum_{j=0}^{t-1} 3^{\,t-1-j}\, 2^{S_j}, \qquad S_j := a_0 + \cdots + a_{j-1} \quad (S_0 := 0), \]
though we will not need this closed form in what follows.
This proves (1). We now turn to (2). Suppose first that $(n_i)_{i=0}^{t}$ is an integer sequence of odd numbers satisfying
\[ n_{i+1} = \frac{3n_i + 1}{2^{a_i}}, \qquad i = 0, \dots, t-1. \]
Then the inductive calculation above is valid with integer $n_i$ at each step, so (174) holds with the same $C_t$ and $S_t$. Since $n_t$ is an integer, we must have
\[ 2^{S_t} \mid (3^t n_0 + C_t), \]
which is exactly the congruence (175). In addition, the identities
\[ n_{i+1} = \frac{3n_i + 1}{2^{a_i}} \]
imply that $3n_i + 1$ is divisible by $2^{a_i}$ at each step, and the requirement that $n_{i+1}$ be odd forces $\nu_2(3n_i+1) = a_i$; these conditions can be written as congruence conditions modulo suitable powers of 2, which we describe more explicitly below.
Conversely, suppose that $n_0$ is an integer such that (175) holds and such that the congruence conditions enforcing $n_i$ odd and $\nu_2(3n_i+1) = a_i$ are satisfied for $i = 0, \dots, t-1$. We define $n_1, \dots, n_t$ forward by
\[ n_{i+1} := \frac{3n_i + 1}{2^{a_i}}, \qquad i = 0, \dots, t-1. \]
By construction of the congruence conditions, $3n_i + 1$ is divisible by $2^{a_i}$ for each $i$ and the quotient $n_{i+1}$ is an odd integer. Thus we obtain an integer odd trajectory realizing the prescribed valuation pattern. Repeating the inductive calculation of (174) shows that the resulting $n_t$ is given by (174), so (175) is necessary and sufficient for integrality of $n_t$ once the intermediate valuation constraints are enforced. This proves the equivalence in (2).
It remains to justify the congruence description in (3) and bound the modulus $M_t$. We first note that for each fixed $a \ge 1$, the condition
\[ n \text{ odd} \quad \text{and} \quad \nu_2(3n+1) = a \]
is a purely 2–adic condition on $n$ and therefore corresponds to a finite union of residue classes modulo $2^{a+1}$. Indeed, $\nu_2(3n+1) \ge a$ is equivalent to
\[ 3n + 1 \equiv 0 \pmod{2^{a}}, \]
and the additional requirement $\nu_2(3n+1) = a$ is equivalent to
\[ 3n + 1 \equiv 2^{a} u \pmod{2^{a+1}} \]
for some odd $u$, which is again a finite union of congruence classes modulo $2^{a+1}$. Intersecting with the parity condition $n \equiv 1 \pmod 2$ still yields a finite union of classes modulo $2^{a+1}$.
Now fix $i \in \{0, \dots, t-1\}$. By iterating the affine relation (174) up to time $i$ we have
\[ n_i = \frac{3^i}{2^{S_i}}\, n_0 + \frac{C_i}{2^{S_i}}, \]
with $S_i = a_0 + \cdots + a_{i-1}$. Let $E_i$ be the set of odd integers $m$ with $\nu_2(3m+1) = a_i$; as just observed, $E_i$ is a finite union of residue classes modulo $2^{a_i+1}$. The requirement that $n_i \in E_i$ is therefore equivalent to
\[ \frac{3^i}{2^{S_i}}\, n_0 + \frac{C_i}{2^{S_i}} \equiv r \pmod{2^{a_i+1}} \]
for some residue $r$ (depending on the chosen class in $E_i$). Multiplying through by $2^{S_i}$ gives
\[ 3^i n_0 + C_i \equiv 2^{S_i} r \pmod{2^{S_i + a_i + 1}}. \]
Thus, for each admissible residue class $r$ modulo $2^{a_i+1}$, the corresponding set of $n_0$ satisfying the valuation condition at time $i$ is a (possibly empty) congruence class modulo $2^{S_i+a_i+1}\, 3^i$: we may regard $3^i$ as a unit modulo powers of 2, but if we wish to solve for $n_0$ as a congruence in $\mathbb{Z}$ rather than in $\mathbb{Z}/2^{S_i+a_i+1}\mathbb{Z}$, we can encode the factor $3^i$ in the modulus, taking the modulus to be
\[ M_i := 2^{S_i + a_i + 1}\, 3^i. \]
In particular, for each fixed $i$ the set of $n_0$ for which the $i$th valuation condition holds is a finite union of arithmetic progressions modulo $M_i$.
The global condition (175) for $n_t$ to be an integer imposes the additional congruence
\[ 3^t n_0 + C_t \equiv 0 \pmod{2^{S_t}}, \]
which can be absorbed into the same framework by viewing it as a congruence modulo $2^{S_t}\, 3^t$. The set of $n_0$ giving rise to an odd trajectory with the prescribed valuation pattern is therefore the intersection of finitely many sets, each of which is a finite union of arithmetic progressions modulo some modulus of the form $M_i = 2^{S_i+a_i+1}\, 3^i$ or $2^{S_t}\, 3^t$. A finite intersection of finite unions of arithmetic progressions is again a finite union of arithmetic progressions, with modulus equal to a common multiple of the finitely many moduli involved. Thus the full set of admissible $n_0$ is a finite union of arithmetic progressions modulo
\[ M_t := \operatorname{lcm}\left( 2^{S_t}\, 3^t,\; 2^{S_i + a_i + 1}\, 3^i : 0 \le i \le t-1 \right). \]
Finally, we bound the size of $M_t$. Since $S_i \le S_t$ for each $i$ and each $a_i \ge 1$, we have
\[ S_i + a_i + 1 \le S_t + t + 1 \quad \text{for all } 0 \le i \le t-1. \]
Hence the exponent of 2 in each modulus $2^{S_i+a_i+1}$ or $2^{S_t}$ is at most $S_t + t + 1$, while the exponent of 3 in each $3^i$ or $3^t$ is at most $t$. Therefore the least common multiple $M_t$ divides $2^{S_t+t+1}\, 3^t$, and in particular is of the claimed form
\[ M_t = 2^{S_t + O(t)}\, 3^t. \]
This establishes (3) and completes the proof. □
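The affine relation (174) and the recurrence $C_{t+1} = 3C_t + 2^{S_t}$ can be checked numerically on a concrete trajectory. The following is a minimal sketch, not part of the original argument; the helper names `nu2`, `accelerated_orbit`, `affine_constant` and `closed_form_constant` are hypothetical, and the starting value 27 is an arbitrary illustration.

```python
def nu2(m: int) -> int:
    """2-adic valuation of a positive integer m."""
    return (m & -m).bit_length() - 1


def accelerated_orbit(n0: int, t: int):
    """Return ([n_0, ..., n_t], [a_0, ..., a_{t-1}]) for the odd-to-odd map."""
    assert n0 > 0 and n0 % 2 == 1
    ns, pattern = [n0], []
    for _ in range(t):
        m = 3 * ns[-1] + 1
        a = nu2(m)
        pattern.append(a)
        ns.append(m >> a)  # exact division by 2^a
    return ns, pattern


def affine_constant(pattern):
    """Compute (C_t, S_t) via C_{i+1} = 3*C_i + 2^{S_i}, starting from C_0 = 0, S_0 = 0."""
    c, s = 0, 0
    for a in pattern:
        c = 3 * c + (1 << s)
        s += a
    return c, s


def closed_form_constant(pattern):
    """Closed form C_t = sum_j 3^(t-1-j) * 2^(S_j) with S_j = a_0 + ... + a_{j-1}."""
    t = len(pattern)
    total, s = 0, 0
    for j, a in enumerate(pattern):
        total += 3 ** (t - 1 - j) * (1 << s)
        s += a
    return total


ns, pattern = accelerated_orbit(27, 5)
c_t, s_t = affine_constant(pattern)
# Relation (174), cleared of denominators: 2^{S_t} * n_t = 3^t * n_0 + C_t.
print(pattern, c_t, s_t, (ns[-1] << s_t) == 3 ** 5 * 27 + c_t)
```

For the starting value 27 and five steps, the pattern is $(1,2,1,1,1)$, so $S_5 = 6$, $C_5 = 287$ and $n_5 = (3^5 \cdot 27 + 287)/2^6 = 107$.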
Corollary 6.17
(Density bound for windows with all valuations large). Fix integers $t \ge 1$ and $L \ge 1$, and consider odd $n_0$ whose accelerated odd–to–odd Collatz evolution has
\[ a_i := \nu_2(3n_i + 1) \ge L \quad \text{for } i = 0, 1, \dots, t-1. \]
Then the set of such $n_0$ has natural density at most
\[ \sum_{(a_0, \dots, a_{t-1}) \in [L, \infty)^t} O\!\left( 2^{-S_t}\, 3^{-t} \right), \qquad S_t = a_0 + \cdots + a_{t-1}. \]
In particular, if $L$ is fixed and $t$ grows, this density is bounded by
\[ \ll 2^{-Lt}, \]
i.e. it decays exponentially in $t$.
Proof. 
Fix $t \ge 1$ and a pattern $(a_0, \dots, a_{t-1})$ with $a_i \ge L$ for all $i$. By Lemma 6.16, the set of odd $n_0$ whose accelerated trajectory $(n_i)_{i=0}^{t}$ realizes this precise valuation pattern $(a_0, \dots, a_{t-1})$ is a finite union of arithmetic progressions modulo a modulus of the form
\[ M_t = 2^{S_t + O(t)}\, 3^t, \qquad S_t = a_0 + \cdots + a_{t-1}. \]
More precisely, Lemma 6.16 shows that the admissible $n_0$ form a finite union of congruence classes modulo $M_t$, and the number of such classes depends only on $t$ (and not on the specific values of $a_i$). For fixed $t$ we may therefore bound the number of progressions by a constant $C_t$ independent of the pattern $(a_0, \dots, a_{t-1})$.
Each arithmetic progression modulo $M_t$ has natural density $1/M_t$, and a finite union of $C_t$ such progressions has natural density $C_t/M_t$. Using the bound $M_t \ge c_t\, 2^{S_t}\, 3^t$ for some constant $c_t > 0$ depending only on $t$, we obtain
\[ \operatorname{dens}\{ n_0 : (a_0, \dots, a_{t-1}) \text{ occurs} \} \le \frac{C_t}{M_t} \ll_t \frac{1}{2^{S_t}\, 3^t}, \tag{177} \]
where $\operatorname{dens}$ denotes natural density and the implied constant depends only on $t$.
Now consider the full set $N_{t,L}$ of odd $n_0$ whose first $t$ accelerated steps satisfy $a_i \ge L$ for all $0 \le i \le t-1$. Partitioning by valuation patterns, we can write $N_{t,L}$ as the disjoint union
\[ N_{t,L} = \bigsqcup_{(a_0, \dots, a_{t-1}) \in [L, \infty)^t} N(a_0, \dots, a_{t-1}), \]
where $N(a_0, \dots, a_{t-1})$ denotes the set of $n_0$ realizing that specific pattern. Natural density is countably subadditive, so
\[ \operatorname{dens}(N_{t,L}) \le \sum_{(a_0, \dots, a_{t-1}) \in [L, \infty)^t} \operatorname{dens} N(a_0, \dots, a_{t-1}). \]
Applying (177) to each term yields
\[ \operatorname{dens}(N_{t,L}) \ll_t \sum_{(a_0, \dots, a_{t-1}) \in [L, \infty)^t} \frac{1}{2^{S_t}\, 3^t} = 3^{-t} \sum_{(a_0, \dots, a_{t-1}) \in [L, \infty)^t} 2^{-(a_0 + \cdots + a_{t-1})}. \]
The sum over patterns factors as a product:
\[ \sum_{(a_0, \dots, a_{t-1}) \in [L, \infty)^t} 2^{-(a_0 + \cdots + a_{t-1})} = \left( \sum_{a \ge L} 2^{-a} \right)^{t}. \]
The inner geometric series is
\[ \sum_{a \ge L} 2^{-a} = 2^{-L} \sum_{j \ge 0} 2^{-j} = 2^{-L} \cdot \frac{1}{1 - 1/2} = 2^{-(L-1)}. \]
Hence
\[ \operatorname{dens}(N_{t,L}) \ll_t 3^{-t}\, 2^{-(L-1)t} = \left( 3^{-1}\, 2^{-(L-1)} \right)^{t}. \]
For each fixed $L \ge 1$ the factor $c_L := 3^{-1}\, 2^{-(L-1)}$ satisfies $0 < c_L < 1$, so $\operatorname{dens}(N_{t,L}) \ll_L c_L^{\,t}$ decays exponentially in $t$. Since $c_L \le C\, 2^{-L}$ for some absolute constant $C$ and all $L \ge 1$, we can absorb $C$ into the implied constant and obtain the simpler bound
\[ \operatorname{dens}(N_{t,L}) \ll \left( 2^{-L} \right)^{t}, \]
as claimed. □
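As an informal sanity check on the exponential sparsity, one can count how often all of the first $t$ valuations are at least $L$ among odd starting values below a cutoff, and compare with the factor $2^{-(L-1)t}$ produced by the geometric series in the proof. The sketch below is illustrative only: the function names and the cutoffs are assumptions, and a finite count is evidence, not a density statement.

```python
def nu2(m: int) -> int:
    """2-adic valuation of a positive integer m."""
    return (m & -m).bit_length() - 1


def all_valuations_at_least(n0: int, t: int, L: int) -> bool:
    """True if the first t accelerated steps from odd n0 all have valuation >= L."""
    n = n0
    for _ in range(t):
        m = 3 * n + 1
        a = nu2(m)
        if a < L:
            return False
        n = m >> a
    return True


def window_frequency(limit: int, t: int, L: int) -> float:
    """Fraction of odd n0 in [1, limit) whose first t valuations are all >= L."""
    odd_starts = range(1, limit, 2)
    hits = sum(all_valuations_at_least(n0, t, L) for n0 in odd_starts)
    return hits / len(odd_starts)


for t in (1, 2, 3):
    empirical = window_frequency(2 ** 14, t, L=2)
    reference = 2.0 ** (-(2 - 1) * t)  # geometric-series factor 2^{-(L-1)t}
    print(t, round(empirical, 4), reference)
```

For $t = 1$ and $L = 2$ the count is exact: $\nu_2(3n+1) \ge 2$ holds precisely when $n \equiv 1 \pmod 4$, which is half of the odd integers.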
Theorem 6.18
(Forward drift from a low–valuation window). Let $(n_k)_{k \ge 0}$ be a Collatz orbit under the accelerated odd map
\[ T(n) = \frac{3n + 1}{2^{\nu_2(3n+1)}}, \qquad n \text{ odd}, \]
and write
\[ J(k) := \lfloor \log_2 n_k \rfloor. \]
For $k \ge 0$ and $t \ge 1$ define the valuation window
\[ W(k,t) := \left( \nu_2(3n_{k+i} + 1) \right)_{i=0}^{t-1}, \]
and its average
\[ A(W(k,t)) := \frac{1}{t} \sum_{i=0}^{t-1} \nu_2(3n_{k+i} + 1). \]
Fix $\varepsilon > 0$. If for some $k \ge 0$ and $t \ge 1$ one has
\[ A(W(k,t)) \le \log_2 3 - \varepsilon, \tag{182} \]
then
\[ J(k+t) \ge J(k) + \varepsilon t - 1. \tag{183} \]
In particular, for any fixed $t_{\max} \ge 1$ there exist constants $c > 0$ and $C_0 < \infty$, depending only on $\varepsilon$ and $t_{\max}$, such that whenever $1 \le t \le t_{\max}$ and (182) holds, one has
\[ J(k+t) \ge J(k) + ct - C_0. \tag{184} \]
Proof. 
Write $a_i := \nu_2(3n_{k+i} + 1)$ for $0 \le i \le t-1$, and let
\[ S_t := \sum_{i=0}^{t-1} a_i. \]
By definition of the accelerated map,
\[ n_{k+i+1} = \frac{3n_{k+i} + 1}{2^{a_i}}, \qquad 0 \le i \le t-1. \]
Iterating this relation shows that
\[ n_{k+t} = \frac{3^t n_k + R_t}{2^{S_t}}, \tag{185} \]
where $R_t$ is a positive integer (a linear combination of powers of 3 times powers of 2) that depends only on the valuation pattern $(a_0, \dots, a_{t-1})$; it is the constant $C_t$ of Lemma 6.16. In particular $3^t n_k \le 3^t n_k + R_t$, so (185) yields the crude lower bound
\[ n_{k+t} \ge \frac{3^t n_k}{2^{S_t}}. \tag{186} \]
Taking base–2 logarithms and using (186) gives
\[ \log_2 n_{k+t} \ge \log_2 n_k + t \log_2 3 - S_t. \]
The average condition (182) says
\[ \frac{S_t}{t} = A(W(k,t)) \le \log_2 3 - \varepsilon, \]
so $S_t \le t(\log_2 3 - \varepsilon)$, and therefore
\[ \log_2 n_{k+t} \ge \log_2 n_k + t \log_2 3 - t(\log_2 3 - \varepsilon) = \log_2 n_k + \varepsilon t. \]
Passing to integer parts,
\[ J(k+t) = \lfloor \log_2 n_{k+t} \rfloor \ge \lfloor \log_2 n_k + \varepsilon t \rfloor. \tag{187} \]
For any real numbers $x, y$ one has
\[ \lfloor x + y \rfloor \ge \lfloor x \rfloor + y - 1, \]
so from (187) we obtain
\[ J(k+t) \ge \lfloor \log_2 n_k \rfloor + \varepsilon t - 1 = J(k) + \varepsilon t - 1, \]
which is (183).
Finally, given $t_{\max} \ge 1$, take for instance
\[ c := \frac{\varepsilon}{2} \quad \text{and} \quad C_0 := 1 + \frac{\varepsilon}{2}\, t_{\max}. \]
For every $1 \le t \le t_{\max}$, the inequality
\[ \varepsilon t - 1 \ge \frac{\varepsilon}{2}\, t - C_0 \]
holds by construction, so (183) implies (184). □
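The drift inequality (183) is elementary and can be spot-checked on an explicit orbit. The sketch below scans all windows of length at most $t_{\max}$ along a finite piece of an accelerated orbit and reports any window that satisfies (182) but violates (183). The function names, the starting value 27 and the parameter values are illustrative assumptions.

```python
import math


def nu2(m: int) -> int:
    """2-adic valuation of a positive integer m."""
    return (m & -m).bit_length() - 1


def odd_orbit(n0: int, length: int):
    """Accelerated odd orbit: returns ([n_0..n_length], [a_0..a_{length-1}])."""
    ns, vals = [n0], []
    for _ in range(length):
        m = 3 * ns[-1] + 1
        a = nu2(m)
        vals.append(a)
        ns.append(m >> a)
    return ns, vals


def floor_log2(n: int) -> int:
    """Exact floor(log2 n) for a positive integer n."""
    return n.bit_length() - 1


def drift_report(n0: int, length: int, t_max: int, eps: float):
    """Return (number of low-valuation windows found, list of violating (k, t))."""
    ns, vals = odd_orbit(n0, length)
    threshold = math.log2(3) - eps
    low_windows, violations = 0, []
    for k in range(length):
        for t in range(1, t_max + 1):
            if k + t > length:
                break
            if sum(vals[k:k + t]) / t <= threshold:  # hypothesis (182)
                low_windows += 1
                if floor_log2(ns[k + t]) < floor_log2(ns[k]) + eps * t - 1:
                    violations.append((k, t))  # would contradict (183)
    return low_windows, violations


print(drift_report(27, 60, t_max=5, eps=0.1))
```

The violation list should be empty for every starting value, since (183) holds unconditionally once (182) is satisfied.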

7. Completion of the Spectral–Dynamical Implication Chain

Let $(n_k)_{k \ge 0}$ be an infinite forward Collatz orbit under $T$, and let $J(n)$ denote the block index of $n$.
Lemma 7.1
(Orbit Cesàro limit functionals). Let $(n_k)$ be an infinite forward Collatz orbit and define, for $N \ge 1$,
\[ \Lambda_N(f) := \frac{1}{N} \sum_{k=0}^{N-1} f(n_k), \qquad f \in \mathcal{B}_{\mathrm{tree},\sigma}. \]
Then:
1.
The family $(\Lambda_N)_{N \ge 1}$ is norm–bounded in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$, hence every sequence $N_r \to \infty$ admits a further subsequence (still denoted $N_r$) and a functional $\Lambda \in \mathcal{B}_{\mathrm{tree},\sigma}^{*}$ such that
\[ \Lambda_{N_r} \xrightarrow{\;*\;} \Lambda. \]
Moreover, for any such limit $\Lambda$ and any $f \ge 0$ one has $\Lambda(f) \ge 0$ (positivity).
2.
Let $E \subset \mathbb{N}$ be a set that is visited with positive upper density along the orbit $(n_k)$. Then there exist a subsequence $N_r \to \infty$ and a weak–* limit point $\Lambda$ of $(\Lambda_{N_r})$ such that
\[ \Lambda(\mathbf{1}_E) > 0. \]
In particular, the subsequence and the limit functional can be chosen so that $E$ is detected with strictly positive weight, but this $\Lambda$ may depend on $E$.
Proof. 
Since $\mathcal{B}_{\mathrm{tree},\sigma} \hookrightarrow \ell^\infty(\mathbb{N})$ continuously, there exists $C > 0$ such that
\[ \|f\|_\infty \le C\, \|f\|_{\mathcal{B}_{\mathrm{tree},\sigma}} \quad \text{for all } f \in \mathcal{B}_{\mathrm{tree},\sigma}. \]
Hence, for every $N \ge 1$ and every $f \in \mathcal{B}_{\mathrm{tree},\sigma}$,
\[ |\Lambda_N(f)| = \left| \frac{1}{N} \sum_{k=0}^{N-1} f(n_k) \right| \le \frac{1}{N} \sum_{k=0}^{N-1} \|f\|_\infty \le C\, \|f\|_{\mathcal{B}_{\mathrm{tree},\sigma}}. \]
Thus $(\Lambda_N)_{N \ge 1}$ is norm–bounded in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$. By Banach–Alaoglu, every sequence $N_r \to \infty$ admits a further subsequence (still denoted $N_r$) and a functional $\Lambda \in \mathcal{B}_{\mathrm{tree},\sigma}^{*}$ such that
\[ \Lambda_{N_r} \xrightarrow{\;*\;} \Lambda. \]
If $f \ge 0$, then each $\Lambda_N(f) \ge 0$, since it is an average of nonnegative values. Passing to the weak–* limit along any convergent subsequence $N_r$ gives
\[ \Lambda(f) = \lim_{r \to \infty} \Lambda_{N_r}(f) \ge 0, \]
which proves positivity of every weak–* limit.
For (2), fix a set $E \subset \mathbb{N}$ and assume that it is visited with positive upper density along the orbit $(n_k)$, i.e.
\[ \bar{d}_E := \limsup_{N \to \infty} \frac{1}{N}\, \#\{ 0 \le k < N : n_k \in E \} > 0. \]
Define $f = \mathbf{1}_E$. Then
\[ \Lambda_N(f) = \frac{1}{N}\, \#\{ 0 \le k < N : n_k \in E \}. \]
By the definition of the limsup, there exists a subsequence $N_r \to \infty$ such that
\[ \Lambda_{N_r}(f) \to \bar{d}_E \quad \text{with } \bar{d}_E > 0. \]
Since the family $(\Lambda_{N_r})$ is still norm–bounded, Banach–Alaoglu provides a further subsequence (which we again denote $N_r$) and a functional $\Lambda \in \mathcal{B}_{\mathrm{tree},\sigma}^{*}$ such that
\[ \Lambda_{N_r} \xrightarrow{\;*\;} \Lambda. \]
Weak–* convergence means pointwise convergence on $\mathcal{B}_{\mathrm{tree},\sigma}$, so in particular
\[ \Lambda(f) = \Lambda(\mathbf{1}_E) = \lim_{r \to \infty} \Lambda_{N_r}(\mathbf{1}_E) = \bar{d}_E > 0. \]
Thus for this chosen subsequence and limit functional $\Lambda$, the indicator of $E$ is evaluated with strictly positive weight. Note that the construction of $N_r$ and $\Lambda$ depends on $E$: for a different set $E'$ with positive upper density one may need to choose a different subsequence to realize the corresponding limsup. Hence no claim is made that a single functional $\Lambda$ detects all such sets simultaneously. □
Lemma 7.2
(Shift invariance). Let $\Lambda$ be any weak–* limit of the Cesàro functionals along an orbit. Let $U$ denote composition by the forward map, $(Uf)(n) = f(T(n))$. Then
\[ \Lambda(Uf) = \Lambda(f) \quad \text{for all } f \in \mathcal{B}_{\mathrm{tree},\sigma}. \]
Proof. 
We compute:
\[ \Lambda_N(Uf) = \frac{1}{N} \sum_{k=0}^{N-1} f(T(n_k)) = \frac{1}{N} \sum_{k=0}^{N-1} f(n_{k+1}) = \Lambda_N(f) + \frac{f(n_N) - f(n_0)}{N}. \]
Since $f$ is bounded on $\mathbb{N}$, the final term tends to 0 as $N \to \infty$. Passing to the subsequence $N_r$ along which $\Lambda_{N_r} \to \Lambda$ gives $\Lambda(Uf) = \Lambda(f)$. □
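The telescoping identity behind this proof holds exactly at every finite $N$, which can be confirmed with rational arithmetic. The sketch below is illustrative: the test function (an indicator of a residue class mod 3) is an arbitrary bounded function chosen for the example, with no claim that it belongs to $\mathcal{B}_{\mathrm{tree},\sigma}$, and the helper names are hypothetical.

```python
from fractions import Fraction


def collatz_step(n: int) -> int:
    """One step of the forward Collatz map T."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def forward_orbit(n0: int, N: int):
    """Return [n_0, n_1, ..., n_N]."""
    orbit = [n0]
    for _ in range(N):
        orbit.append(collatz_step(orbit[-1]))
    return orbit


def cesaro_average(values) -> Fraction:
    """Exact average of a list of integers."""
    return Fraction(sum(values), len(values))


def telescoping_check(f, n0: int, N: int):
    """Return (Lambda_N(f o T) - Lambda_N(f), (f(n_N) - f(n_0)) / N) as exact fractions."""
    orbit = forward_orbit(n0, N)
    lhs = (cesaro_average([f(n) for n in orbit[1:N + 1]])
           - cesaro_average([f(n) for n in orbit[:N]]))
    rhs = Fraction(f(orbit[N]) - f(orbit[0]), N)
    return lhs, rhs


def indicator(n: int) -> int:
    """Illustrative bounded test function: indicator of n = 1 (mod 3)."""
    return 1 if n % 3 == 1 else 0


lhs, rhs = telescoping_check(indicator, 27, 100)
print(lhs, rhs, lhs == rhs)
```

Because the boundary term is at most $2\|f\|_\infty / N$ in absolute value, the two Cesàro averages differ by $O(1/N)$ for any bounded $f$.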

7.1. Discrepancy Decay and $P^*$–Invariance

Definition 7.3
(Discrepancy operator). For any function $f \in \mathcal{B}_{\mathrm{tree},\sigma}$ we define its discrepancy by
\[ D(f) := Pf - f \circ T, \]
where $P$ is the backward Collatz transfer operator (11) and $T$ is the forward Collatz map (1).
Lemma 7.4
(Equivalence of $P^*$–invariance and discrepancy averages). Let $(n_k)_{k \ge 0}$ be a forward Collatz orbit and let
\[ \Lambda_N(f) := \frac{1}{N} \sum_{k=0}^{N-1} f(n_k), \qquad f \in \mathcal{B}_{\mathrm{tree},\sigma}. \]
Suppose $\Lambda_{N_r} \xrightarrow{\;*\;} \Lambda$ in $\mathcal{B}_{\mathrm{tree},\sigma}^{*}$ along some subsequence $(N_r)$. Then for any $f \in \mathcal{B}_{\mathrm{tree},\sigma}$,
\[ \Lambda(Pf) = \Lambda(f) \iff \lim_{r \to \infty} \frac{1}{N_r} \sum_{k=0}^{N_r - 1} D(f)(n_k) = 0, \]
where $D(f)(n) := (Pf)(n) - f(T(n))$.
Proof. 
Fix $f \in \mathcal{B}_{\mathrm{tree},\sigma}$ and write
\[ \Lambda_N(Pf) - \Lambda_N(f) = \frac{1}{N} \sum_{k=0}^{N-1} \bigl( Pf(n_k) - f(n_k) \bigr). \]
Using the definition $D(f)(n) = Pf(n) - f(T(n))$ and the fact that $T(n_k) = n_{k+1}$ along the orbit, we decompose
\[ Pf(n_k) - f(n_k) = \bigl( Pf(n_k) - f(T(n_k)) \bigr) + \bigl( f(T(n_k)) - f(n_k) \bigr) = D(f)(n_k) + f(n_{k+1}) - f(n_k). \]
Hence
\[ \Lambda_N(Pf) - \Lambda_N(f) = \frac{1}{N} \sum_{k=0}^{N-1} D(f)(n_k) + \frac{1}{N} \sum_{k=0}^{N-1} \bigl( f(n_{k+1}) - f(n_k) \bigr) = \frac{1}{N} \sum_{k=0}^{N-1} D(f)(n_k) + \frac{f(n_N) - f(n_0)}{N}. \]
Since $f$ is bounded on $\mathbb{N}$, the telescoping term satisfies
\[ \frac{f(n_N) - f(n_0)}{N} \to 0 \quad \text{as } N \to \infty. \]
Thus along the subsequence $(N_r)$ we have
\[ \Lambda_{N_r}(Pf) - \Lambda_{N_r}(f) = \frac{1}{N_r} \sum_{k=0}^{N_r - 1} D(f)(n_k) + o(1), \tag{189} \]
where $o(1) \to 0$ as $r \to \infty$.
Now use weak–* convergence. Since $\Lambda_{N_r} \xrightarrow{\;*\;} \Lambda$, we have
\[ \Lambda(Pf) - \Lambda(f) = \lim_{r \to \infty} \bigl( \Lambda_{N_r}(Pf) - \Lambda_{N_r}(f) \bigr). \]
Combining this with (189) yields
\[ \Lambda(Pf) - \Lambda(f) = \lim_{r \to \infty} \frac{1}{N_r} \sum_{k=0}^{N_r - 1} D(f)(n_k). \]
Therefore
\[ \Lambda(Pf) = \Lambda(f) \iff \lim_{r \to \infty} \frac{1}{N_r} \sum_{k=0}^{N_r - 1} D(f)(n_k) = 0, \]
which is the desired equivalence. □
Remark 7.5.
For any bounded $f$ and any forward orbit $(n_k)$ one has the telescoping identity
\[ \Lambda_N(f \circ T) - \Lambda_N(f) = \frac{f(n_N) - f(n_0)}{N} \to 0, \]
hence every weak–* cluster point $\Lambda$ satisfies $\Lambda(f \circ T) = \Lambda(f)$. This forward invariance is purely deterministic.
In contrast, $P^*$–invariance of $\Lambda$ is not automatic: it amounts to stationarity of the orbit-empirical measures for the Markov kernel encoded by $P$. Lemma 7.4 identifies the exact obstruction: $\Lambda$ is $P^*$–invariant if and only if the discrepancy averages of $D(f) = Pf - f \circ T$ vanish along the subsequence realizing $\Lambda$. No claim is made that these discrepancy averages vanish for an arbitrary orbit.

7.2. Spectral Obstructions and Exclusion of Low-Block Invariant Functionals

Proposition 7.6
(No finite–block positive invariant functionals). Assume the spectral hypotheses for $P$: positivity, quasi–compactness, $\rho(P) = 1$, peripheral spectrum $\{1\}$, and algebraic and geometric simplicity of the eigenvalue 1 with strictly positive eigenfunction $h$ and dual strictly positive eigenfunctional $\varphi$. Then there is no nonzero positive $P^*$–invariant functional supported on a finite union of blocks $\bigcup_{j \le J_0} I_j$.
Proof. 
Under the spectral hypotheses, the eigenspace of $P^*$ for eigenvalue 1 is one–dimensional and spanned by $\varphi$, and $\varphi(f_j) > 0$ for every block indicator $f_j$.
If $\Lambda$ were a nonzero positive $P^*$–invariant functional supported on $\bigcup_{j \le J_0} I_j$, then $\Lambda$ must be a scalar multiple of $\varphi$. But $\varphi$ assigns strictly positive mass to every block, contradicting the assumed vanishing of $\Lambda$ on all $I_j$ with $j > J_0$. □
Remark  7.7 (Spectral hypotheses)
The operator $P$ satisfies all of the spectral hypotheses required for the Perron–Frobenius theory on the Banach tree space $\mathcal{B}_{\mathrm{tree},\sigma}$. Positivity is established in Proposition 5.39. The Lasota–Yorke inequality of Proposition 4.11 implies quasi–compactness (Theorem 4.17), and the normalization $\rho(P) = 1$ follows from Theorem 5.1. The peripheral spectrum is $\{1\}$ by Theorem 6.2, and cone–irreducibility (Proposition 6.1) yields algebraic and geometric simplicity of the eigenvalue 1 (Theorem 6.2). Moreover, the corresponding eigenfunction $h$ is strictly positive and the dual eigenfunctional $\varphi$ is strictly positive (Proposition 5.39).
The next statement records the orbitwise discrepancy input used to pass from Cesàro limits to $P^*$–invariance.
Proposition 7.8
(Orbitwise discrepancy vanishing). Let $(n_k)$ be a forward Collatz orbit which is infinite and not eventually periodic, so that for every $M \ge 1$ there exists $K = K(M)$ with $n_k > M$ for all $k \ge K$. Let $\Lambda$ be any weak–* limit of its Cesàro averages along a subsequence $(N_r)$. Assume that $c_{00}(\mathbb{N}) \cap \mathcal{B}_{\mathrm{tree},\sigma}$ is dense in $\mathcal{B}_{\mathrm{tree},\sigma}$. Then there exists a dense subspace $\mathcal{A} \subset \mathcal{B}_{\mathrm{tree},\sigma}$ such that for every $f \in \mathcal{A}$,
\[ \lim_{r \to \infty} \frac{1}{N_r} \sum_{k=0}^{N_r - 1} D(f)(n_k) = 0, \qquad D(f) := Pf - f \circ T. \]
In particular, $\Lambda$ is $P^*$–invariant on $\mathcal{A}$.
Proof. 
Define $\mathcal{A} := c_{00}(\mathbb{N}) \cap \mathcal{B}_{\mathrm{tree},\sigma}$, which is dense by hypothesis. Fix $f \in \mathcal{A}$ and choose $M \ge 1$ so that $\operatorname{supp} f \subset \{1, 2, \dots, M\}$. Since the orbit is not eventually periodic, it leaves every finite set, hence there exists $K$ such that $n_k > 3M + 1$ for all $k \ge K$.
For $k \ge K$ we have $T(n_k) = n_{k+1} \ge n_k / 2 > (3M+1)/2 > M$, so $f(n_{k+1}) = 0$. We also claim that $Pf(n_k) = 0$ for $k \ge K$. Indeed, every preimage $m$ of $n_k$ under the Collatz map satisfies $m \ge (n_k - 1)/3 > M$, so $f(m) = 0$ for every $m$ with $T(m) = n_k$. Since $P$ is a weighted average over such preimages, it follows that $Pf(n_k) = 0$. Consequently,
\[ D(f)(n_k) = Pf(n_k) - f(T(n_k)) = 0 \qquad (k \ge K). \]
Therefore the discrepancy sum is eventually constant:
\[ \sum_{k=0}^{N-1} D(f)(n_k) = \sum_{k=0}^{K-1} D(f)(n_k) \]
for all $N \ge K$. Dividing by $N$ and letting $N \to \infty$ yields
\[ \lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} D(f)(n_k) = 0. \]
In particular, along the subsequence $(N_r)$ we have
\[ \lim_{r \to \infty} \frac{1}{N_r} \sum_{k=0}^{N_r - 1} D(f)(n_k) = 0 \]
for every $f \in \mathcal{A}$. To conclude $P^*$–invariance on $\mathcal{A}$, observe the identity valid for every $N$:
\[ \Lambda_N(Pf) - \Lambda_N(f) = \frac{1}{N} \sum_{k=0}^{N-1} \bigl( Pf(n_k) - f(n_k) \bigr) = \frac{1}{N} \sum_{k=0}^{N-1} D(f)(n_k) + \frac{f(n_N) - f(n_0)}{N}. \]
Since $f$ is bounded and $f(n_N) = 0$ for all $N \ge K$, the boundary term tends to 0. Passing to the subsequence $N_r$ and using $\Lambda_{N_r} \xrightarrow{\,w^*\,} \Lambda$ gives $\Lambda(Pf) = \Lambda(f)$ for all $f \in \mathcal{A}$. □

7.3. Toward the Weak Non–Retreat Principle

This subsection proves Statement (6) of Theorem 1.1. The goal is to show that every nonperiodic accelerated odd Collatz orbit admits infinitely many short valuation–deficit windows. Once such windows occur infinitely often, the weak non–retreat inequalities follow from Theorem 6.18. The argument proceeds through residue–class realizability of short valuation patterns, a finite residue–valuation graph, and the exclusion of eventual trapping on dangerous residue cycles.
Definition 7.9
(Valuation window and average valuation). For an accelerated odd orbit $(n_k)_{k \ge 0}$, a window of length $t \ge 1$ at position $k$ is the finite sequence
\[ W(k,t) := \left( \nu_2(3n_{k+i} + 1) \right)_{i=0}^{t-1}. \]
Its average valuation is
\[ A(W(k,t)) := \frac{1}{t} \sum_{i=0}^{t-1} \nu_2(3n_{k+i} + 1). \]
Definition 7.10
(Dangerous window). Fix $t_{\max} \in \mathbb{N}$ and $\varepsilon > 0$. A window $W(k,t)$ with $1 \le t \le t_{\max}$ is dangerous if
\[ A(W(k,t)) > \log_2 3 - \varepsilon. \]
Definition 7.11
(Catalogue of dangerous patterns). Fix a valuation cutoff $L \ge 1$. For each $t \ge 1$ define
\[ \mathcal{A}_t(L) := \left\{ (a_0, \dots, a_{t-1}) \in \{1, \dots, L\}^t : \frac{1}{t} \sum_{i=0}^{t-1} a_i > \log_2 3 - \varepsilon \right\}, \]
and set
\[ \mathcal{A}_{\le t_{\max}}(L) := \bigcup_{t=1}^{t_{\max}} \mathcal{A}_t(L). \]
Elements of $\mathcal{A}_{\le t_{\max}}(L)$ are the Type I dangerous patterns.
Lemma 7.12
(Congruence realization of dangerous patterns). Let $\mathbf{a} = (a_0, \dots, a_{t-1})$ be any valuation pattern with $1 \le t \le t_{\max}$ and $1 \le a_i \le L$. There exists a modulus
\[ M(\mathbf{a}) = 2^{K(\mathbf{a})}\, 3^t, \]
where $K(\mathbf{a}) \le C \sum_{i=0}^{t-1} (a_i + 1)$ for an absolute constant $C$, and a finite nonempty set of odd residue classes
\[ R(\mathbf{a}) \subset (\mathbb{Z}/M(\mathbf{a})\mathbb{Z})^{\times} \]
such that an odd integer $n_k$ satisfies
\[ \nu_2(3n_{k+i} + 1) = a_i, \qquad i = 0, \dots, t-1, \]
if and only if
\[ n_k \bmod M(\mathbf{a}) \in R(\mathbf{a}). \]
Proof. 
We prove this by induction along the window using the explicit odd–to–odd transition formula and Lemma 6.16.
(1) The congruence condition for a single valuation. Fix an integer $a \ge 1$. Lemma 6.16 shows that the condition
\[ n \text{ odd}, \qquad \nu_2(3n+1) = a \]
is equivalent to a finite disjunction of residue conditions
\[ n \equiv r \pmod{2^{a+1}}, \]
where $r$ ranges over a finite subset of odd residues modulo $2^{a+1}$. More precisely, Lemma 6.16 proves that
\[ \nu_2(3n+1) = a \iff 3n + 1 \equiv 0 \pmod{2^{a}}, \quad 3n + 1 \not\equiv 0 \pmod{2^{a+1}}, \]
and both congruences are linear conditions on $n$ modulo $2^{a+1}$. Thus there is a finite set $R(a) \subset (\mathbb{Z}/2^{a+1}\mathbb{Z})^{\times}$ such that $n$ satisfies $\nu_2(3n+1) = a$ if and only if $n \bmod 2^{a+1} \in R(a)$.
(2) Odd–to–odd transitions. For an odd integer $n$ with $\nu_2(3n+1) = a$, the next odd iterate is
\[ T(n) = \frac{3n+1}{2^{a}}. \]
Since $3n + 1 \equiv 1 \pmod 3$, the map $n \mapsto T(n)$ is well defined on odd residue classes modulo $3 \cdot 2^{a+1}$ and induces a map to odd residue classes modulo $3 \cdot 2^{a}$.
(3) Inductive construction of the residue conditions along the window. Let $n_0$ denote the starting odd integer at the beginning of the window. We encode the conditions
\[ \nu_2(3n_i + 1) = a_i, \qquad n_{i+1} = T(n_i) = \frac{3n_i + 1}{2^{a_i}}, \qquad i = 0, \dots, t-1. \]
Assume inductively that for some $i \ge 0$ there exists a modulus $M_i$ and a finite set of odd residue classes $R_i \subset (\mathbb{Z}/M_i\mathbb{Z})^{\times}$ such that
\[ n_0 \bmod M_i \in R_i \iff \nu_2(3n_j + 1) = a_j \text{ for } j = 0, \dots, i-1. \]
We now add the condition $\nu_2(3n_i + 1) = a_i$. From Step (1), this is equivalent to
\[ n_i \bmod 2^{a_i+1} \in R(a_i). \]
Next, $n_i$ is an affine function of $n_0$: iterating
\[ n_{j+1} = \frac{3n_j + 1}{2^{a_j}} \]
gives (as in the derivation of formula (166)) an explicit expression
\[ n_i = \frac{3^i}{2^{S_i}}\, n_0 + \frac{C_i}{2^{S_i}}, \qquad S_i = \sum_{j=0}^{i-1} a_j, \]
with $C_i$ an integer depending only on the pattern $(a_0, \dots, a_{i-1})$. Thus the condition $n_i \bmod 2^{a_i+1} \in R(a_i)$ becomes a linear congruence condition on $n_0$ modulo $2^{a_i + 1 + S_i}$.
Therefore, setting
\[ M_{i+1} := \operatorname{lcm}\left( M_i,\; 2^{a_i + 1 + S_i},\; 3^{i+1} \right), \]
the set of $n_0$ satisfying the conditions up to index $i$ is a finite union of residue classes modulo $M_{i+1}$. We denote this finite set by $R_{i+1} \subset (\mathbb{Z}/M_{i+1}\mathbb{Z})^{\times}$.
After $t$ steps, we obtain a modulus
\[ M(\mathbf{a}) = M_t = 2^{K(\mathbf{a})}\, 3^t, \]
where
\[ K(\mathbf{a}) = (a_0 + 1) + (a_1 + 1 + S_1) + \cdots + (a_{t-1} + 1 + S_{t-1}) \]
is bounded by a constant multiple of $\sum_i (a_i + 1)$. The corresponding residue set $R(\mathbf{a}) = R_t$ satisfies
\[ n_0 \bmod M(\mathbf{a}) \in R(\mathbf{a}) \iff \nu_2(3n_i + 1) = a_i \quad (0 \le i < t). \]
Renaming $n_0$ as $n_k$ completes the proof. □
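The residue description of a valuation pattern can be explored by brute force. The sketch below enumerates the odd residues modulo $2^{S_t+1}$ whose first $t$ valuations match a prescribed pattern. Note the assumptions: it works with the pure power-of-2 modulus $2^{S_t+1}$ (using that 3 is a unit modulo powers of 2) rather than the modulus $M(\mathbf{a})$ of the lemma, and the function names are hypothetical.

```python
def nu2(m: int) -> int:
    """2-adic valuation of a positive integer m."""
    return (m & -m).bit_length() - 1


def valuation_pattern(n0: int, t: int):
    """First t valuations (a_0, ..., a_{t-1}) along the accelerated odd orbit of n0."""
    n, pattern = n0, []
    for _ in range(t):
        m = 3 * n + 1
        a = nu2(m)
        pattern.append(a)
        n = m >> a
    return tuple(pattern)


def realizing_residues(pattern):
    """Odd residues r in [1, 2^(S+1)) whose valuation pattern equals `pattern`.

    Assumption of this sketch: the pattern of an odd n0 is read off from its
    residue modulo 2^(S+1), where S is the sum of the pattern entries.
    """
    pattern = tuple(pattern)
    modulus = 1 << (sum(pattern) + 1)
    residues = [r for r in range(1, modulus, 2)
                if valuation_pattern(r, len(pattern)) == pattern]
    return modulus, residues


for pat in [(1,), (2,), (1, 1), (2, 1, 3)]:
    print(pat, realizing_residues(pat))
```

For example, the pattern $(1,1)$ is realized by $n_0 \equiv 7 \pmod 8$: $7 \mapsto 11 \mapsto 17$, with $\nu_2(22) = \nu_2(34) = 1$.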
Definition 7.13
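(Dangerous residue set). Fix $L_0 \ge 1$. Let $M$ be a common multiple of the moduli $M(\mathbf{a})$ over all $\mathbf{a} \in \mathcal{A}_{\le t_{\max}}(L_0)$. Define the dangerous residue set modulo $M$ by
\[ V_{\mathrm{danger}}(L_0) := \bigcup_{\mathbf{a} \in \mathcal{A}_{\le t_{\max}}(L_0)} \left\{ r \in \mathbb{Z}/M\mathbb{Z} : r \bmod M(\mathbf{a}) \in R(\mathbf{a}) \right\}. \]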
Definition 7.14
(Finite residue–valuation graph). Fix $L_0$ and the corresponding modulus $M$. Consider the accelerated odd map
\[ T(n) = \frac{3n+1}{2^{\nu_2(3n+1)}} \]
on odd residue classes modulo $M$. Define a finite directed graph $G_{L_0}(M)$ with vertex set the odd residues $r \in \mathbb{Z}/M\mathbb{Z}$, and with an edge $r \to s$ if $T(r) \equiv s \pmod M$. Each vertex carries the label
\[ a(r) := \nu_2(3r + 1). \]
Let
\[ V_{\ge L_0+1} := \{ r : a(r) \ge L_0 + 1 \}. \]
Lemma 7.15
(Tail constraint under negation of weak non–retreat). Fix $t_{\max} \in \mathbb{N}$ and $\varepsilon > 0$, and let $L_0 \ge 1$ be a valuation cutoff. Let $(n_k)_{k \ge 0}$ be a nonperiodic accelerated odd orbit with the property that there exists $k_0$ such that for every $k \ge k_0$ and every $1 \le t \le t_{\max}$ the valuation window $W(k,t)$ is dangerous:
\[ \frac{1}{t} \sum_{i=0}^{t-1} \nu_2(3n_{k+i} + 1) > \log_2 3 - \varepsilon. \]
Choose $M$ so that the dangerous length–one patterns with values in $\{1, \dots, L_0\}$ lift to a set $V_{\mathrm{danger}}(L_0) \subset (\mathbb{Z}/M\mathbb{Z})^{\times}$ as in Lemma 7.12, and define
\[ V_{\ge L_0+1} := \{ r \in (\mathbb{Z}/M\mathbb{Z})^{\times} : \nu_2(3r+1) \ge L_0 + 1 \}. \]
Then for all sufficiently large $k$,
\[ n_k \bmod M \in V_{\mathrm{danger}}(L_0) \cup V_{\ge L_0+1}. \]
In particular, the tail of $(n_k)$ determines an infinite directed path in the induced subgraph of $G_{L_0}(M)$ on $V_{\mathrm{danger}}(L_0) \cup V_{\ge L_0+1}$.
Proof. 
By hypothesis, there exists $k_0$ such that for every $k \ge k_0$ the length–one window
\[ W(k,1) = (a_0), \qquad a_0 := \nu_2(3n_k + 1), \]
is dangerous, hence
\[ a_0 > \log_2 3 - \varepsilon. \]
We distinguish two cases.
Case 1: $1 \le a_0 \le L_0$. Then the length–one pattern $(a_0)$ lies in the dangerous catalogue with bound $L_0$, hence Lemma 7.12 yields a congruence description for $\nu_2(3n_k + 1) = a_0$. By construction of the common modulus $M$ and the lifted union $V_{\mathrm{danger}}(L_0)$, this implies
\[ n_k \bmod M \in V_{\mathrm{danger}}(L_0). \]
Case 2: $a_0 \ge L_0 + 1$. Then by definition,
\[ n_k \bmod M \in V_{\ge L_0+1}. \]
Thus for every $k \ge k_0$,
\[ n_k \bmod M \in V_{\mathrm{danger}}(L_0) \cup V_{\ge L_0+1}. \]
Since $n_{k+1} = T(n_k)$, the residue sequence $(n_k \bmod M)$ follows edges in $G_{L_0}(M)$, and the tail lies in the induced subgraph on $V_{\mathrm{danger}}(L_0) \cup V_{\ge L_0+1}$. □
Theorem 7.16
(Forward non–trapping in the dangerous regime). Fix $t_{\max} \in \mathbb{N}$ and $0 < \varepsilon < \log_2 3 - 1$. Then there exists $L_0$ sufficiently large with the following property. Let $M$, $V_{\mathrm{danger}}(L_0)$ and $V_{\ge L_0+1}$ be as in the construction of the residue–valuation graph $G_{L_0}(M)$.
Let $(n_k)_{k \ge 0}$ be a nonperiodic accelerated odd Collatz orbit such that there exists $k_0$ with the property that for every $k \ge k_0$ and every $1 \le t \le t_{\max}$ the window $W(k,t)$ is dangerous:
\[ A(W(k,t)) = \frac{1}{t} \sum_{i=0}^{t-1} \nu_2(3n_{k+i} + 1) > \log_2 3 - \varepsilon. \]
Then:
1.
High–valuation anti–trapping. The residue sequence $(n_k \bmod M)$ is not eventually contained in $V_{\ge L_0+1}$.
2.
Confinement to the dangerous residue set. If $(n_k \bmod M)$ visits both $V_{\mathrm{danger}}(L_0)$ and $V_{\ge L_0+1}$ infinitely often, then there exists $K \ge k_0$ such that
\[ n_k \bmod M \in V_{\mathrm{danger}}(L_0) \quad \text{for all } k \ge K. \]
Proof. 
We break the proof into two parts, one for each statement.
Part 1: High–valuation anti–trapping. Let $a_k := \nu_2(3n_k + 1)$. If $a_k \ge L_0 + 1$, then
\[ n_{k+1} = \frac{3n_k + 1}{2^{a_k}} \le \frac{3n_k + 1}{2^{L_0+1}} = \frac{3}{2^{L_0+1}}\, n_k + \frac{1}{2^{L_0+1}}. \]
Define
\[ \lambda := \frac{3}{2^{L_0+1}}, \qquad C_0 := \frac{1}{2^{L_0+1}}. \]
For $L_0 \ge 2$ one has $\lambda < 1$, and hence
\[ n_{k+1} \le \lambda n_k + C_0 \quad \text{whenever } a_k \ge L_0 + 1. \tag{190} \]
If there exists $K$ such that $a_k \ge L_0 + 1$ for all $k \ge K$, then iterating (190) gives
\[ n_{K+t} \le \lambda^t n_K + C_0 \sum_{j=0}^{t-1} \lambda^j \le \lambda^t n_K + \frac{C_0}{1 - \lambda} \quad \text{for all } t \ge 0. \]
Thus the tail is bounded. Since $T$ is a deterministic self–map of the positive odd integers, a bounded orbit is eventually periodic. This contradicts nonperiodicity. Hence $(n_k \bmod M)$ is not eventually contained in $V_{\ge L_0+1}$.
Part 2: Confinement to the dangerous residue set. Assume $(n_k \bmod M)$ visits both $V_{\mathrm{danger}}(L_0)$ and $V_{\ge L_0+1}$ infinitely often, and all windows of length $\le t_{\max}$ are dangerous. Write $a_k := \nu_2(3n_k + 1)$.
For $n_k \in V_{\ge L_0+1}$, the estimate (190) holds with $\lambda < 1$.
For $n_k \in V_{\mathrm{danger}}(L_0)$, the case $t = 1$ of the dangerous–window condition gives
\[ a_k > \log_2 3 - \varepsilon, \qquad a_k \in \mathbb{N}, \]
hence $a_k \ge 2$ since $0 < \varepsilon < \log_2 3 - 1$. Therefore
\[ n_{k+1} = \frac{3n_k + 1}{2^{a_k}} \le \frac{3}{2^{a_k}}\, n_k + \frac{1}{2^{a_k}} \le \frac{3}{2^{\log_2 3 - \varepsilon}}\, n_k + 1 = 2^{\varepsilon} n_k + 1. \]
Thus there exists $C_1$ such that
\[ n_{k+1} \le 2^{\varepsilon} n_k + C_1 \quad \text{whenever } n_k \in V_{\mathrm{danger}}(L_0). \tag{191} \]
Decompose the tail into alternating blocks: a block of $t_j \ge 1$ consecutive indices in $V_{\mathrm{danger}}(L_0)$ followed by a block of $\ell_j \ge 1$ consecutive indices in $V_{\ge L_0+1}$, returning to $V_{\mathrm{danger}}(L_0)$ at the end. Along one such combined block, the accelerated dynamics gives an affine relation
\[ n_{k_{j+1}} = F(P_j)\, n_{k_j} + \beta_j, \]
with multiplicative factor
\[ F(P_j) := \prod_{m=0}^{t_j + \ell_j - 1} \frac{3}{2^{a_{k_j + m}}}. \]
Using $a_{k_j+m} \ge 2$ throughout the combined block and $a_{k_j+m} \ge L_0 + 1$ on the high–valuation part, one obtains a uniform bound $F(P_j) \le \Lambda < 1$ for a constant $\Lambda$ depending only on $\varepsilon$ (and the choice of $L_0$). Hence there exists $C_4$ such that
\[ n_{k_{j+1}} \le \Lambda\, n_{k_j} + C_4 \quad \text{for all } j. \tag{192} \]
Iterating (192) shows that $(n_{k_j})$ is bounded, and the intermediate values inside each finite block are bounded by iterating (191) and (190). Thus the orbit tail is bounded, hence eventually periodic, contradicting nonperiodicity. Therefore there exists $K$ such that $n_k \bmod M \in V_{\mathrm{danger}}(L_0)$ for all $k \ge K$. □
Theorem 7.17
(Non–realizability of dangerous residue cycles). Let $H$ be the induced subgraph of the residue–valuation graph $G_{L_0}(M)$ on the dangerous residue set $V_{\mathrm{danger}}(L_0)$. Let
\[ \mathcal{C} = (v_0 \to v_1 \to \cdots \to v_{p-1} \to v_0) \]
be any directed cycle in $H$, where $v_i \in V_{\mathrm{danger}}(L_0)$ and each edge $v_i \to v_{i+1}$ (indices modulo $p$) is an edge in $G_{L_0}(M)$. Then no infinite accelerated odd Collatz orbit $(n_k)_{k \ge 0}$ can eventually realize $\mathcal{C}$ modulo $M$, in the sense that there do not exist integers $i \ge 0$ and $p \ge 1$ such that
\[ n_k \equiv v_{(k-i) \bmod p} \pmod M \quad \text{for all } k \ge i. \]
Equivalently, no nonperiodic accelerated odd orbit can have its residue sequence $(n_k \bmod M)$ eventually trapped on a directed cycle inside $H$.
Proof. 
We prove that no infinite nonperiodic accelerated odd Collatz orbit can eventually follow a directed cycle in the dangerous residue graph H .
(1) Structure of a dangerous cycle. Let
\[ \mathcal{C} = (v_0 \to v_1 \to \cdots \to v_{p-1} \to v_0) \]
be a directed cycle in $H$ of length $p \ge 1$. For each $i$ set
\[ a_i := \nu_2(3v_i + 1). \]
Since $v_i \in V_{\mathrm{danger}}(L_0)$, each $a_i$ satisfies $1 \le a_i \le L_0$. Define
\[ S_p := \sum_{i=0}^{p-1} a_i, \qquad \alpha := \frac{3^p}{2^{S_p}}. \]
In lowest terms,
\[ \alpha = \frac{A}{B}, \qquad A = 3^p \text{ odd}, \qquad B = 2^{S_p} > 1, \]
so $\alpha \ne 1$.
(2) Lifting the cycle to an integer orbit. Assume for contradiction that there exists an infinite nonperiodic accelerated odd Collatz orbit $(n_k)_{k \ge 0}$ that eventually follows $\mathcal{C}$ modulo $M$. Thus there exists $i_0 \ge 0$ such that for all $k \ge i_0$,
\[ n_k \equiv v_{(k - i_0) \bmod p} \pmod M. \]
Iterating the accelerated map along one period yields an affine relation
\[ n_{k+p} = \frac{3^p}{2^{S_p}}\, n_k + \frac{B_r}{2^{S_p}} = \alpha n_k + \beta_r, \tag{193} \]
where $k = i_0 + mp + r$ with $0 \le r < p$, and $B_r \in \mathbb{Z}$ depends only on the residue $v_r$ (with $\beta_r := B_r / B$).
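(3) Recurrence and explicit solution. Fix $r \in \{0, \dots, p-1\}$ and define
\[ x_m := n_{i_0 + mp + r}, \qquad m \ge 0. \]
Then (193) gives
\[ x_{m+1} = \alpha x_m + \beta_r, \qquad \alpha = \frac{A}{B} \ne 1. \]
The affine recurrence has the explicit solution
\[ x_m = \alpha^m x_0 + \beta_r\, \frac{\alpha^m - 1}{\alpha - 1} = \alpha^m C_r + D_r, \]
with
\[ C_r := x_0 + \frac{\beta_r}{\alpha - 1}, \qquad D_r := -\frac{\beta_r}{\alpha - 1}, \]
so that $D_r$ is the unique fixed point of $x \mapsto \alpha x + \beta_r$ and $C_r = x_0 - D_r$.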
(4) Denominator growth forces $C_r = 0$. Write $C_r$ in lowest terms as
\[ C_r = \frac{c}{2^{t} d}, \]
where $c, d \in \mathbb{Z}$ are odd and $t \ge 0$. Using $\alpha = 3^p / 2^{S_p}$ yields
\[ \alpha^m C_r = \left( \frac{3^p}{2^{S_p}} \right)^{m} \frac{c}{2^{t} d} = \frac{3^{pm}\, c}{2^{m S_p + t}\, d}. \]
Since $x_m \in \mathbb{Z}$ for all $m$, the expression $\alpha^m C_r$ has bounded denominators only if $c = 0$; otherwise the denominator contains $2^{m S_p + t}$ with $S_p \ge 1$, which grows without cancellation against the odd numerator $3^{pm} c$. Hence $c = 0$ and $C_r = 0$.
(5) Eventual periodicity. With $C_r = 0$, the formula reduces to $x_m = D_r$ for all $m \ge 0$, so each residue–class subsequence $(n_{i_0 + mp + r})_{m \ge 0}$ is constant. Thus $(n_k)_{k \ge i_0}$ is periodic with period dividing $p$, contradicting nonperiodicity. This proves the theorem. □
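The closed form for the affine recurrence in step (3), and the denominator growth used in step (4), can be illustrated with exact rational arithmetic. The sketch below uses the illustrative values $\alpha = 3/4$, $\beta = 1/4$ (a one-step cycle with valuation 2); these numbers and the function names are assumptions for the example, not data from the text.

```python
from fractions import Fraction


def closed_form_coefficients(alpha: Fraction, beta: Fraction, x0: Fraction):
    """Return (C, D) with x_m = alpha^m * C + D for x_{m+1} = alpha * x_m + beta."""
    fixed_point = -beta / (alpha - 1)  # D solves D = alpha * D + beta
    return x0 - fixed_point, fixed_point


def iterate_affine(alpha: Fraction, beta: Fraction, x0: Fraction, steps: int):
    """Return [x_0, ..., x_steps] by direct iteration."""
    xs = [Fraction(x0)]
    for _ in range(steps):
        xs.append(alpha * xs[-1] + beta)
    return xs


alpha, beta = Fraction(3, 4), Fraction(1, 4)  # illustrative: p = 1, S_p = 2

# Start on the fixed point: C = 0 and the sequence is constant.
print(closed_form_coefficients(alpha, beta, Fraction(1)))
print(iterate_affine(alpha, beta, Fraction(1), 4))

# Start off the fixed point: C != 0 and powers of 2 accumulate in the denominator.
C, D = closed_form_coefficients(alpha, beta, Fraction(5))
xs = iterate_affine(alpha, beta, Fraction(5), 4)
print(C, D, xs)
print([alpha ** m * C + D == xs[m] for m in range(5)])
```

Starting from $x_0 = 5$ one gets $C = 4$, $D = 1$ and $x_2 = 13/4$, so integrality already fails at the second step, in line with the denominator-growth mechanism.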
Lemma 7.18
(Confinement plus bounded dangerous run length implies dangerous long windows). Fix $\varepsilon > 0$ and set $\delta := \log_2 3 - \varepsilon$. Let $(n_k)_{k \ge 0}$ be an accelerated odd Collatz orbit, so $a_k := \nu_2(3n_k + 1) \ge 1$ for all $k$. Assume there exist integers $L_0 \ge 1$, $t_{\max} \ge 1$, and $k_0 \ge 0$ such that
\[ n_k \bmod M \in V_{\mathrm{danger}}(L_0) \cup V_{\ge L_0+1} \quad \text{for all } k \ge k_0, \tag{194} \]
and such that the orbit hits the high–valuation set at least once every $t_{\max}$ steps, in the sense that
\[ \forall k \ge k_0, \quad \exists\, i \in \{0, 1, \dots, t_{\max} - 1\} \text{ with } n_{k+i} \bmod M \in V_{\ge L_0+1}. \tag{195} \]
If $L_0$ is chosen so that
\[ \frac{L_0 + t_{\max}}{t_{\max}} > \delta, \tag{196} \]
then for every $k \ge k_0$ the length–$t_{\max}$ valuation window is dangerous:
\[ \frac{1}{t_{\max}} \sum_{i=0}^{t_{\max} - 1} \nu_2(3n_{k+i} + 1) > \log_2 3 - \varepsilon. \]
Proof. 
Fix $k \ge k_0$. By (195) there exists $i \in \{0, \dots, t_{\max} - 1\}$ with $n_{k+i} \bmod M \in V_{\ge L_0+1}$, hence $a_{k+i} = \nu_2(3n_{k+i} + 1) \ge L_0 + 1$ by definition of $V_{\ge L_0+1}$. For all other indices $j \in \{0, \dots, t_{\max} - 1\}$ one has $a_{k+j} \ge 1$ since the orbit is odd. Therefore
\[ \sum_{j=0}^{t_{\max} - 1} a_{k+j} \ge (L_0 + 1) + (t_{\max} - 1) = L_0 + t_{\max}. \]
Dividing by $t_{\max}$ and using (196) gives
\[ \frac{1}{t_{\max}} \sum_{j=0}^{t_{\max} - 1} a_{k+j} > \delta = \log_2 3 - \varepsilon, \]
as claimed. □
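Condition (196) is a simple arithmetic constraint on the cutoff, and it can be made concrete. The sketch below finds the smallest $L_0 \ge 1$ satisfying (196) for given $\varepsilon$ and $t_{\max}$; the function name and the sample parameters are illustrative assumptions.

```python
import math


def min_cutoff(eps: float, t_max: int) -> int:
    """Smallest integer L0 >= 1 with (L0 + t_max) / t_max > log2(3) - eps."""
    delta = math.log2(3) - eps
    L0 = 1
    while (L0 + t_max) / t_max <= delta:
        L0 += 1
    return L0


for t_max in (1, 4, 10, 50):
    print(t_max, min_cutoff(0.1, t_max))
```

With $\varepsilon = 0.1$ one has $\delta \approx 1.485$, so for $t_{\max} = 4$ the cutoff $L_0 = 2$ suffices, since $(2+4)/4 = 1.5 > \delta$ while $(1+4)/4 = 1.25 \le \delta$.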
Lemma 7.19
(Finite graph periodicity). Let $H$ be the induced subgraph of $G_{L_0}(M)$ on $V_{\mathrm{danger}}(L_0)$. Then every infinite directed path in $H$ is eventually periodic. That is, there exist integers $i \ge 0$ and $p \ge 1$ such that the vertex sequence $(v_k)_{k \ge 0}$ satisfies
\[ v_{k+p} = v_k \quad \text{for all } k \ge i. \]
Proof. 
In the residue–valuation graph, each vertex has outdegree 1 since T ( r ) mod M is uniquely determined. Hence the directed path is a deterministic forward orbit in a finite set. Since H has finitely many vertices, there exist indices 0 i < j with v i = v j ; choose such a pair with j i minimal and set p : = j i 1 . Determinism then implies
v i + m = v j + m for all m 0 ,
hence v k + p = v k for all k i . □
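The pigeonhole argument in the proof of Lemma 7.19 can be sketched as a first-repeat search. The code below is an illustration under stated assumptions: `preperiod_and_period` is a hypothetical helper, and the toy map $v \mapsto v^2 + 1 \bmod 10$ merely stands in for a deterministic step on a finite vertex set; it is not the residue–valuation graph of the paper.

```python
def preperiod_and_period(step, v0):
    """Return (i, p) such that v_{k+p} = v_k for all k >= i,
    where v_0 = v0 and v_{k+1} = step(v_k) on a finite set."""
    first_seen = {}          # vertex -> first index at which it appears
    v, k = v0, 0
    while v not in first_seen:
        first_seen[v] = k
        v = step(v)
        k += 1
    i = first_seen[v]        # index of the first repeated vertex
    return i, k - i          # preperiod i and period p = j - i

# Toy deterministic map on {0, ..., 9} (an assumption for illustration only).
toy_step = lambda v: (v * v + 1) % 10
```

With this toy map, the orbit of $0$ is $0, 1, 2, 5, 6, 7, 0, \dots$, so `preperiod_and_period(toy_step, 0)` returns `(0, 6)`, while starting at $3$ gives `(1, 6)` because $3$ maps into the cycle after one step.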
Lemma 7.20
(Finite graph obstruction). No infinite directed path in $H$ can arise as the tail residue sequence $n_k \bmod M$ of a valid infinite nonperiodic accelerated odd Collatz orbit $(n_k)_{k \ge 0}$.
Proof. 
Let $(n_k)_{k \ge 0}$ be an infinite nonperiodic accelerated odd Collatz orbit and set $v_k := n_k \bmod M$. Assume that $v_k \in V_{\mathrm{danger}}(L_0)$ for all sufficiently large $k$, so $(v_k)$ is an infinite directed path in $H$ from some time onward. By Lemma 7.19, there exist integers $i$ and $p$ such that $v_{k+p} = v_k$ for all $k \ge i$. The tail therefore follows a directed cycle in $H$, contradicting Theorem 7.17. Hence no such tail can occur. □
Theorem 7.21
(Weak non–retreat). Every nonperiodic accelerated odd Collatz orbit admits infinitely many windows of length $1 \le t \le t_{\max}$ whose average valuation satisfies
$$A(W(k,t)) \le \log_2 3 - \varepsilon.$$
Consequently, by Theorem 6.18 there exist constants $c > 0$ and $C_0$ such that
$$J(k+t) \ge J(k) + c\,t - C_0$$
for infinitely many pairs $(k,t)$, and Statement (6) of Theorem 1.1 follows.
Proof. 
Let $(n_k)_{k \ge 0}$ be a nonperiodic accelerated odd Collatz orbit. Assume toward contradiction that only finitely many pairs $(k,t)$ with $1 \le t \le t_{\max}$ satisfy
$$A(W(k,t)) \le \log_2 3 - \varepsilon.$$
Then there exists $k_0$ such that for all $k \ge k_0$ and all $1 \le t \le t_{\max}$, the window $W(k,t)$ is dangerous. This places us in the dangerous–windows regime of Theorem 7.16. Fix $L_0$ and the associated modulus $M$ and residue sets $V_{\mathrm{danger}}(L_0)$ and $V_{L_0+1}$. By Lemma 7.15 there exists $k_1 \ge k_0$ such that for all $k \ge k_1$,
$$n_k \bmod M \in V_{\mathrm{danger}}(L_0) \cup V_{L_0+1}.$$
By Theorem 7.16(1), the residue sequence is not eventually contained in $V_{L_0+1}$, hence $V_{\mathrm{danger}}(L_0)$ is visited infinitely often. If $V_{L_0+1}$ is visited only finitely many times, then $n_k \bmod M \in V_{\mathrm{danger}}(L_0)$ for all sufficiently large $k$. If $V_{L_0+1}$ is visited infinitely often, then Theorem 7.16(2) again yields an index $K$ such that
$$n_k \bmod M \in V_{\mathrm{danger}}(L_0) \quad \text{for all } k \ge K.$$
In either case, the tail $(n_k \bmod M)_{k \ge K}$ is an infinite directed path in the induced subgraph $H$ on $V_{\mathrm{danger}}(L_0)$.
Lemma 7.20 excludes such a tail for a valid infinite nonperiodic orbit, contradicting the standing hypotheses. Therefore there exist infinitely many pairs $(k,t)$ with $1 \le t \le t_{\max}$ such that
$$A(W(k,t)) \le \log_2 3 - \varepsilon.$$
For each such window, Theorem 6.18 yields
$$J(k+t) \ge J(k) + c\,t - C_0$$
with constants depending only on the global parameters. Since there are infinitely many such windows, these inequalities hold along an infinite subsequence, giving weak non–retreat. □

7.4. Logical Closure of the Spectral-Dynamical Framework

In this section we state and prove the propositions that establish the implications among the statements of the Dynamical Forms Theorem 1.1.
(3) ⇒ (2)
Proposition 7.22
(Orbit–supported invariant functionals are impossible). Let $P$ be the backward Collatz transfer operator on $B_{\mathrm{tree},\sigma}$ and assume the Peripheral Spectral Classification Theorem 6.2. Then there is no nonzero $P^{*}$–invariant functional in $B_{\mathrm{tree},\sigma}^{*}$ whose support is contained in a single forward Collatz orbit.
In particular, statement (3) of Theorem 1.1 (existence of a nonzero orbit–supported invariant functional for every infinite orbit) cannot hold unless there are no infinite forward orbits at all. Hence (3) implies (2).
Proof. 
By Theorem 6.2 and positivity of $P$, there exist $h > 0$ and $\phi > 0$ with $Ph = h$, $P^{*}\phi = \phi$, and the eigenspace for eigenvalue $1$ of $P^{*}$ is one–dimensional and spanned by $\phi$. Thus if $\Psi \in B_{\mathrm{tree},\sigma}^{*}$ satisfies $P^{*}\Psi = \Psi$, then $\Psi = c\,\phi$ for some scalar $c$.
Fix a forward orbit $O^{+}(n_0) = \{T^{k}(n_0) : k \ge 0\}$ and suppose there exists a nonzero $P^{*}$–invariant functional $\Psi$ supported on this orbit, in the sense that $\Psi(f) = 0$ whenever $f$ vanishes on $O^{+}(n_0)$. Then $\Psi = c\,\phi$ with $c \ne 0$.
Consider the nonnegative test function
$$f(n) := \mathbf{1}_{\mathbb{N} \setminus O^{+}(n_0)}(n).$$
By support of $\Psi$ on the orbit we have $\Psi(f) = 0$. On the other hand, $f \ge 0$ and $f \not\equiv 0$, so strict positivity of $\phi$ gives $\phi(f) > 0$ and hence
$$\Psi(f) = c\,\phi(f) \ne 0,$$
a contradiction. Therefore no nonzero $P^{*}$–invariant functional can be supported on a single forward orbit. If statement (3) of Theorem 1.1 holds, then every infinite forward orbit would support such a functional, which we have just ruled out. Hence there are no infinite forward orbits, and in particular every orbit has bounded block index: $\sup_{k \ge 0} j(n_k) < \infty$, which is statement (2). □
(3) + (8) ⇒ (1)
Proposition 7.23
(Strong Collatz from orbit averages and residue graphs). Assume Theorem 6.3 (statement (3) of Theorem 1.1) together with Theorems 7.16 and 7.17. Then statement (1) of Theorem 1.1 (Strong Collatz Conjecture) holds.
Proof.
(1) Exclusion of nontrivial cycles. By Theorems 7.17 and 7.16, no nontrivial accelerated odd Collatz cycle can persist inside the dangerous residue graph, and every nonperiodic accelerated odd orbit eventually escapes it. The argument given previously in the proof of Theorem 7.21 therefore excludes all nontrivial cycles; we do not repeat it here.
(2) Exclusion of infinite orbits via orbit–averaging and spectral rigidity. Suppose for contradiction that there exists an infinite forward Collatz orbit $(n_k)_{k \ge 0}$ with base point $n_0$. By Orbit–Averaging, Theorem 6.3, this orbit produces a nonzero $P^{*}$–invariant functional
$$\Lambda \in B_{\mathrm{tree},\sigma}^{*}, \qquad P^{*}\Lambda = \Lambda,$$
which is obtained as an orbit–average along $(n_k)_{k \ge 0}$. In particular, if a test function $f \in B_{\mathrm{tree},\sigma}$ satisfies $f(n_k) = 0$ for all sufficiently large $k$, then the orbit–average of $f$ along $(n_k)$ vanishes and hence
$$\Lambda(f) = 0.$$
On the other hand, the spectral classification theorem (uniqueness of the positive eigenfunctional, Theorem 5.1) asserts that every $P^{*}$–invariant functional is a scalar multiple of the distinguished Perron–Frobenius eigenfunctional $\phi$:
$$\Lambda = \alpha\,\phi, \qquad \alpha \ne 0.$$
We now construct a cutoff that lies in $B_{\mathrm{tree},\sigma}$ and vanishes along the tail of the orbit. By the nontrapping and escape statement (8), the forward orbit $(n_k)$ cannot remain indefinitely in any fixed low region of the tree. Hence we may choose $m \in \mathbb{N}$ so large that
$$n_k > m \quad \text{for all } k \ge K$$
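for some $K \ge 0$. Define a test function $h_m : \mathbb{N} \to [0,\infty)$ by
$$h_m(n) := m^{-2\sigma}\, \mathbf{1}_{\{n \le m\}}\, n^{-\sigma}.$$
We first verify that $h_m \in B_{\mathrm{tree},\sigma}$. The weighted $\ell^{1}$ norm is finite:
$$\|h_m\|_{\ell^{1}_{\sigma}} = \sum_{n \ge 1} h_m(n)\, n^{-\sigma} = \sum_{n \le m} m^{-2\sigma} n^{-2\sigma} \le m^{-2\sigma} \cdot m = m^{1-2\sigma} < \infty$$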
since $\sigma > 1$. Moreover $h_m$ is supported on the finite set $\{1, \dots, m\}$, so only finitely many tree blocks $I_j$ intersect its support. In each such block, the contribution to the block seminorm comes from finitely many points with uniformly bounded size, multiplied by the decaying prefactor $m^{-2\sigma}$. Thus the tree seminorm of $h_m$ is finite and $h_m \in B_{\mathrm{tree},\sigma}$.
By construction, we have $h_m(n_k) = 0$ for all $k \ge K$, since $n_k > m$ implies $h_m(n_k) = 0$. Therefore $h_m$ vanishes along the tail of the orbit, and the defining orbit–averaging property of $\Lambda$ yields
$$\Lambda(h_m) = 0$$
via (197). On the other hand, write $\phi(f) = \sum_{n \ge 1} f(n)\, h(n)$ with $h$ the strictly positive Perron–Frobenius eigenfunction satisfying $Ph = h$. Then
$$\phi(h_m) = \sum_{n \le m} h_m(n)\, h(n) = m^{-2\sigma} \sum_{n \le m} n^{-\sigma} h(n) > 0,$$
because $h(n) > 0$ for all $n \ge 1$ and the sum is finite and nonempty. Combining the two expressions for $\Lambda(h_m)$ gives
$$0 = \Lambda(h_m) = \alpha\,\phi(h_m),$$
which contradicts $\alpha \ne 0$ and $\phi(h_m) > 0$. This contradiction shows that no infinite forward Collatz orbit can exist.
We have ruled out both nontrivial cycles (Step 1) and infinite orbits (Step 2). Hence every forward Collatz orbit is finite, and the only cycle is the trivial $1 \to 4 \to 2 \to 1$ cycle. This is precisely statement (1) of Theorem 1.1. □
(4) ⇒ (3)
Proposition 7.24
(Block orbit averages imply Orbit–Averaging). Assume statement (4) of Theorem 1.1 (Block–Orbit–Averaging). Then statement (3) of Theorem 1.1 (the Orbit–Averaging Theorem, Theorem 6.3) holds. In particular, (4) implies (3) in Theorem 1.1.
Proof. 
Let $(n_k)_{k \ge 0}$ be an infinite forward Collatz orbit with base point $n_0$. For each $N \ge 1$ consider the orbit Cesàro functional
$$\Phi_N(f) := \frac{1}{N} \sum_{k=0}^{N-1} f(n_k), \qquad f \in B_{\mathrm{tree},\sigma}.$$
By Lemma 5.30, the family $(\Phi_N)_{N \ge 1}$ is uniformly bounded in $B_{\mathrm{tree},\sigma}^{*}$, hence weak* relatively compact.
Statement (4) asserts, for this orbit, the existence of a nontrivial block–orbit average. Concretely, it guarantees that there exists a sequence $N_j \to \infty$ and a nonzero functional
$$\Lambda \in B_{\mathrm{tree},\sigma}^{*} \setminus \{0\}$$
such that
$$\Phi_{N_j} \xrightarrow{\ \text{weak}^{*}\ } \Lambda.$$
Thus $\Lambda$ is a nonzero weak* limit point of the orbit Cesàro functionals for $(n_k)$. By Proposition 5.31, every such weak* limit of $(\Phi_N)$ along a forward Collatz orbit satisfies two key properties:
$$P^{*}\Lambda = \Lambda,$$
and $\Lambda$ is supported entirely on the orbit $O^{+}(n_0)$ in the sense that
$$f(n_k) = 0 \ \ \forall\, k \ge 0 \ \Longrightarrow\ \Lambda(f) = 0.$$
Since $\Lambda \ne 0$, this gives exactly the conclusion of statement (3) for the orbit $O^{+}(n_0)$: it produces a nonzero $P^{*}$–invariant linear functional supported on that orbit.
Because the choice of the infinite orbit $(n_k)$ was arbitrary, the same argument applies to every infinite forward orbit. Hence Orbit–Averaging (3) holds whenever the Block–Orbit–Averaging statement (4) holds, and we have shown (4) ⇒ (3) in Theorem 1.1. □
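The orbit Cesàro functional $\Phi_N$ used in this proof can be evaluated directly for any concrete starting point. The sketch below is only illustrative: `collatz` and `cesaro_average` are hypothetical helper names, and since every orbit that has been computed in practice terminates in the cycle $1 \to 4 \to 2 \to 1$, the example shows the averages concentrating on that cycle rather than on a hypothetical infinite orbit.

```python
def collatz(n: int) -> int:
    """The Collatz map T: n/2 for even n, 3n+1 for odd n."""
    return n // 2 if n % 2 == 0 else 3 * n + 1

def cesaro_average(f, n0: int, N: int) -> float:
    """Phi_N(f) = (1/N) * sum_{k=0}^{N-1} f(T^k(n0))."""
    total, n = 0.0, n0
    for _ in range(N):
        total += f(n)
        n = collatz(n)
    return total / N

# Test function: indicator of the trivial cycle {1, 2, 4}.
on_cycle = lambda n: 1.0 if n in (1, 2, 4) else 0.0
```

For $n_0 = 6$ the orbit begins $6, 3, 10, 5, 16, 8, 4, 2, 1, \dots$, so `cesaro_average(on_cycle, 6, 8)` equals $2/8$, and the average tends to $1$ as $N$ grows.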
(5) ⇒ (4)
Proposition 7.25
(Implication from supercritical linear block growth to BOA). Assume Statement (5) (Block–Escape Implies Supercritical Linear Block Growth, Proposition 6.14). Then Statement (4) (Block–Orbit–Averaging, Proposition 6.7) holds.
Proof. 
Suppose for contradiction that Statement (4) fails. Then there exists an infinite forward Collatz orbit
$$O^{+}(n_0) = \{T^{k}(n_0)\}_{k \ge 0}$$
such that for every integer $J \ge 0$ the orbit does not spend a positive proportion of time in the finite union of low blocks $\bigcup_{j \le J} I_j$. In terms of the block index
$$J(k) := \text{the unique } j \ge 0 \text{ with } T^{k}(n_0) \in I_j,$$
this means that for every $J \ge 0$ the lower asymptotic frequency of visits to $\bigcup_{j \le J} I_j$ is zero. Equivalently, the orbit satisfies the block–escape condition of Definition 6.9, namely
$$\lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{\{J(k) \le J\}} = 0 \quad \text{for every } J \ge 0, \tag{198}$$
that is, the Block–Escape Property (BEP). By Statement (5) (Proposition 6.14), every infinite orbit satisfying BEP must exhibit supercritical linear block growth. Concretely, there exists a constant
$$\alpha > \frac{\log 2}{\log 6}$$
and a strictly increasing subsequence $(k_\ell)_{\ell \ge 1}$ with $k_\ell \to \infty$ such that
$$J(k_\ell) \ge \alpha\, k_\ell \quad \text{for all } \ell \ge 1. \tag{199}$$
Thus the orbit $O^{+}(n_0)$ satisfies both the block–escape condition (198) and the linear growth condition (199).
We now invoke Proposition 7.6. That result states that if an infinite orbit satisfies the block–escape condition
$$\forall\, J_0 \ge 0 : \quad \lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{\{J(k) \le J_0\}} = 0$$
and there exists $\beta > 0$ and a subsequence $(k_\ell)$ with
$$J(k_\ell) \ge \beta\, k_\ell \quad \text{for all } \ell,$$
then the orbit violates the universal exponential growth bound of Lemma 6.11. In the proof of Proposition 7.6 one obtains
$$\limsup_{\ell \to \infty} \frac{1}{k_\ell} \log T^{k_\ell}(n_0) \ge \beta \log 6.$$
Taking $\beta = \alpha$ and using $\alpha > \log 2 / \log 6$, this yields
$$\limsup_{\ell \to \infty} \frac{1}{k_\ell} \log T^{k_\ell}(n_0) \ge \alpha \log 6 > \log 2,$$
which contradicts Lemma 6.11, since that lemma gives the universal upper bound
$$\limsup_{k \to \infty} \frac{1}{k} \log T^{k}(n_0) \le \log 2.$$
Therefore no infinite orbit can simultaneously satisfy BEP and the supercritical linear growth condition (199). In particular, our initial assumption that there exists an infinite orbit for which Statement (4) fails leads to a contradiction, because failure of (4) was exactly what forced BEP in (198).
Taking the contrapositive, every infinite orbit must fail BEP. Thus for each infinite orbit $O^{+}(n_0)$ there exists some $J_0 \ge 0$ such that the lower asymptotic frequency of visits to $\bigcup_{j \le J_0} I_j$ is strictly positive:
$$\liminf_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{\{J(k) \le J_0\}} > 0.$$
This is precisely the Block–Orbit–Averaging statement (4). Hence, under Statement (5), Statement (4) holds. □
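The visit frequencies appearing in the Block–Escape Property can be computed for concrete orbits. The following sketch makes two assumptions that should be read as illustrative only: the block index is taken as the largest $j$ with $6^{j} \le n$ (the full 6-adic block, not the half-blocks $I_j$), and the helper names `block_index` and `low_block_frequency` are hypothetical.

```python
def collatz(n: int) -> int:
    """The Collatz map T: n/2 for even n, 3n+1 for odd n."""
    return n // 2 if n % 2 == 0 else 3 * n + 1

def block_index(n: int) -> int:
    """Largest j >= 0 with 6**j <= n (assumed full 6-adic block)."""
    j = 0
    while 6 ** (j + 1) <= n:
        j += 1
    return j

def low_block_frequency(n0: int, N: int, J: int) -> float:
    """(1/N) * #{0 <= k < N : block_index(T^k(n0)) <= J}."""
    hits, n = 0, n0
    for _ in range(N):
        hits += block_index(n) <= J
        n = collatz(n)
    return hits / N
```

For a terminating orbit such as that of $n_0 = 6$, the frequency of visits to block $0$ tends to $1$, so such an orbit fails the block–escape condition in the strongest possible way.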
(6) ⇒ (5)
Lemma 7.26
(Flexible choice of drift constants). Let $(n_k)$ be a forward orbit and suppose there exist $\varepsilon > 0$, $t_{\max} \in \mathbb{N}$, and $K_0 \ge 0$ such that for all $k \ge K_0$ and all $1 \le t \le t_{\max}$ one has
$$j(n_{k+t}) \ge j(n_k) + \varepsilon\, t - 1. \tag{200}$$
Then for any $\delta \in (0, \varepsilon)$ the choice
$$c := \varepsilon - \delta, \qquad C_0 := 1$$
also yields a valid weak non–retreat inequality:
$$j(n_{k+t}) \ge j(n_k) + c\,t - C_0, \qquad 1 \le t \le t_{\max},\ k \ge K_0. \tag{201}$$
Moreover, for these constants one has
$$\frac{c\, t_{\max} - C_0}{t_{\max}} = (\varepsilon - \delta) - \frac{1}{t_{\max}}. \tag{202}$$
Proof. 
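Fix $\delta \in (0, \varepsilon)$ and set $c := \varepsilon - \delta$ and $C_0 := 1$. Then for each $1 \le t \le t_{\max}$ we compute
$$(\varepsilon t - 1) - (c\,t - C_0) = (\varepsilon t - 1) - \big((\varepsilon - \delta)\, t - 1\big) = \delta\, t \ge \delta > 0.$$
Thus the right–hand side of (200) is strictly larger than $c\,t - C_0$ for every $t \in [1, t_{\max}]$, and hence (200) implies (201) for all $k \ge K_0$. The identity (202) is an immediate algebraic rearrangement:
$$\frac{c\, t_{\max} - C_0}{t_{\max}} = \frac{(\varepsilon - \delta)\, t_{\max} - 1}{t_{\max}} = (\varepsilon - \delta) - \frac{1}{t_{\max}}.$$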
This proves the lemma. □
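The algebra in Lemma 7.26 can be checked with exact rational arithmetic. This is a small sanity check rather than part of the proof; the function name `drift_constants` and the sample values of $\varepsilon$, $\delta$, $t_{\max}$ are assumptions chosen for illustration.

```python
from fractions import Fraction

def drift_constants(eps: Fraction, delta: Fraction, t_max: int):
    """Return (c, C0, gaps, rate) for c = eps - delta, C0 = 1, where
    gaps[t-1] = (eps*t - 1) - (c*t - C0) and rate = (c*t_max - C0)/t_max."""
    c, C0 = eps - delta, Fraction(1)
    gaps = [(eps * t - 1) - (c * t - C0) for t in range(1, t_max + 1)]
    rate = (c * t_max - C0) / t_max
    return c, C0, gaps, rate

# Sample values (illustrative only).
c, C0, gaps, rate = drift_constants(Fraction(1, 2), Fraction(1, 10), 5)
```

With these sample values, each gap equals $\delta t$, so the smallest gap is $\delta = 1/10$, and the rate is $(\varepsilon - \delta) - 1/t_{\max} = 2/5 - 1/5 = 1/5$.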
Corollary 7.27
(Consistency of quantitative deficiency with valuation drift). Let $\alpha^{*} = (\log_2 3 - 1)/\log_2 6$. Suppose that for some choice of $L_0$ and $t_{\max}$ the residue–graph analysis and Theorem 7.21 produce a deficit parameter $\varepsilon = \varepsilon(L_0, t_{\max})$ such that
$$\varepsilon(L_0, t_{\max}) > \alpha^{*} + \eta + \frac{1}{t_{\max}} \tag{203}$$
for a given $\eta > 0$. Then there exist constants $c > 0$, $C_0 \ge 0$, and $K_0 \ge 0$ such that
$$j(n_{k+t}) \ge j(n_k) + c\,t - C_0, \qquad 1 \le t \le t_{\max},\ k \ge K_0,$$
and such that
$$\frac{c\, t_{\max} - C_0}{t_{\max}} > \alpha^{*} + \eta.$$
Proof. 
Apply Theorem 6.18 with the deficit parameter $\varepsilon = \varepsilon(L_0, t_{\max})$ to obtain an inequality of the form (200) for some $K_0$. Fix $\eta > 0$ and choose $\delta \in \big(0, \varepsilon(L_0, t_{\max})\big)$ so small that
$$\varepsilon(L_0, t_{\max}) - \delta - \frac{1}{t_{\max}} > \alpha^{*} + \eta.$$
(This is possible precisely because of the strict inequality (203).) With this choice of $\delta$, define $c := \varepsilon - \delta$ and $C_0 := 1$ as in Lemma 7.26. Then (201) holds for all $1 \le t \le t_{\max}$ and all $k \ge K_0$, and by (202) we have
$$\frac{c\, t_{\max} - C_0}{t_{\max}} = (\varepsilon - \delta) - \frac{1}{t_{\max}} > \alpha^{*} + \eta.$$
□
Proposition 7.28
(Quantitative weak non–retreat implies supercritical linear block growth). Let $O^{+}(n_0) = \{T^{k}(n_0)\}_{k \ge 0}$ be an infinite forward orbit and let $J(k)$ be defined by
$$T^{k}(n_0) \in \big[6^{J(k)},\, 6^{J(k)+1}\big).$$
Assume the orbit satisfies the Block–Escape Property
$$\forall\, J_0 \ge 0, \qquad \lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{\{J(k) \le J_0\}} = 0.$$
Assume in addition that there exist constants $c > 0$, $C_0 \ge 0$, $t_{\max} \in \mathbb{N}$, and $J_0 \ge 0$ such that for all sufficiently large $k$ with $J(k) \ge J_0$ one has the quantitative weak non–retreat estimate
$$J(k+t) \ge J(k) + c\,t - C_0 \qquad (1 \le t \le t_{\max}).$$
Define
$$\alpha^{*} := \frac{\log_2 3 - 1}{\log_2 6} = \frac{\log(3/2)}{\log 6}, \qquad \gamma := \frac{c\, t_{\max} - C_0}{t_{\max}}.$$
If $\gamma > \alpha^{*}$, then there exists $\alpha$ with $\alpha^{*} < \alpha < \gamma$ and an increasing subsequence $k_\ell \to \infty$ such that
$$J(k_\ell) \ge \alpha\, k_\ell \qquad (\ell \ge 1).$$
Equivalently, $T^{k_\ell}(n_0) \ge 6^{\alpha k_\ell}$ for all $\ell$.
Proof. 
Let $n_k = T^{k}(n_0)$ and $J(k)$ be the block index. Since the orbit satisfies the Block–Escape Property, there exist arbitrarily large $k$ with $J(k) \ge J_0$. Choose $k_0$ so large that $J(k_0) \ge J_0$ and such that the weak non–retreat estimate
$$J(k+t) \ge J(k) + c\,t - C_0$$
holds for every $k \ge k_0$ with $J(k) \ge J_0$ and every $1 \le t \le t_{\max}$.
Define a subsequence by
$$k_\ell := k_0 + \ell\, t_{\max} \qquad (\ell \ge 0).$$
Since $\gamma = \dfrac{c\, t_{\max} - C_0}{t_{\max}} > \alpha^{*} > 0$, one has $c\, t_{\max} - C_0 > 0$. Applying (205) with $t = t_{\max}$ at $k = k_\ell$ gives
$$J(k_{\ell+1}) \ge J(k_\ell) + (c\, t_{\max} - C_0),$$
hence by induction $J(k_\ell) \ge J_0$ for all $\ell$, and the estimate can be iterated for all $\ell$. Summing yields
$$J(k_\ell) \ge J(k_0) + \ell\,(c\, t_{\max} - C_0) = J(k_0) + \gamma\, \ell\, t_{\max}.$$
Using $k_\ell = k_0 + \ell\, t_{\max}$, this becomes
$$J(k_\ell) \ge \gamma\, k_\ell - C, \qquad C := \gamma\, k_0 - J(k_0).$$
Choose any $\alpha$ with $\alpha^{*} < \alpha < \gamma$. Then $(\gamma - \alpha)\, k_\ell \to \infty$, so for all sufficiently large $\ell$ one has $\gamma\, k_\ell - C \ge \alpha\, k_\ell$, hence
$$J(k_\ell) \ge \alpha\, k_\ell$$
for all sufficiently large $\ell$. The inequality $n_{k_\ell} \ge 6^{J(k_\ell)} \ge 6^{\alpha k_\ell}$ follows from the block definition. □
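The telescoping step in this proof can be traced numerically by iterating the worst case of the recursion, in which each inequality holds with equality. The sketch below uses hypothetical names (`worst_case_block_indices`) and sample constants; it only confirms that the resulting lower bounds lie on the line $\gamma k - C$ with $C = \gamma k_0 - J(k_0)$.

```python
from fractions import Fraction

def worst_case_block_indices(J0, k0, c, C0, t_max, steps):
    """Pairs (k_l, lower bound on J(k_l)) obtained by iterating
    J(k_{l+1}) >= J(k_l) + (c*t_max - C0) with k_l = k0 + l*t_max."""
    gain = c * t_max - C0
    return [(k0 + l * t_max, J0 + l * gain) for l in range(steps + 1)]

# Sample constants (illustrative only): gain per step is 1, gamma = 1/4.
c, C0, t_max, J0, k0 = Fraction(1, 2), 1, 4, 3, 10
gamma = (c * t_max - C0) / t_max
C = gamma * k0 - J0
pairs = worst_case_block_indices(J0, k0, c, C0, t_max, 20)
```

Here every pair $(k_\ell, J)$ satisfies $J = \gamma k_\ell - C$ exactly, and for any $\alpha < \gamma$ the bound $J \ge \alpha k_\ell$ holds once $k_\ell \ge C/(\gamma - \alpha)$.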
(8) ⇒ (6) This is Theorem 7.21.

8. Toward a Spectral Calculus for Arithmetic Dynamical Systems

The analytic framework developed here for the backward Collatz operator highlights a broader spectral calculus for discrete arithmetic maps. For any map $T : \mathbb{N} \to \mathbb{N}$ with finitely many inverse branches, one may associate the transfer operator
( P f ) ( n ) = m : T ( m ) = n f ( m ) w ( m ) ,
whose spectral behavior reflects the combinatorial and arithmetic structure of T .
When $P$ acts on weighted sequence spaces such as $\ell^{1}_{\sigma}$ or on the multiscale tree space $B_{\mathrm{tree},\sigma}$, its Dirichlet transform intertwines
$$D(Pf)(s) = L_s\, D(f)(s), \qquad D(f)(s) = \sum_{n \ge 1} \frac{f(n)}{n^{s}},$$
so that spectral information for $P$ passes directly to the analytic continuation and pole structure of the complex family $L_s$. In this duality, the arithmetic operator $P$ and its analytic realization $L_s$ represent two facets of a single dynamical mechanism: backward iteration in arithmetic space mirrored by analytic continuation in Dirichlet space.
For quasi–compact operators satisfying Lasota–Yorke inequalities on $B_{\mathrm{tree},\sigma}$, one obtains the spectral decomposition
$$P = \sum_{|\lambda_i| > \rho_{\mathrm{ess}}(P)} \lambda_i\, \Pi_i + N, \qquad \rho_{\mathrm{ess}}(P) < 1,$$
together with the associated dynamical zeta function
$$\zeta_P(s) = \det(I - sP)^{-1} = \exp\!\Big( \sum_{k \ge 1} \frac{s^{k}}{k}\, \mathrm{Tr}(P^{k}) \Big),$$
whose poles coincide with eigenvalues of $P$ outside the essential spectrum and with the resonant singularities of $L_s$. This creates a coherent analytic picture in which resolvents, spectral projections, Dirichlet envelopes, and dynamical determinants arise as aspects of the same operator geometry.
Beyond the Collatz operator, analogous structures arise for general affine–congruence systems
$$n \mapsto a_j n + b_j, \qquad a_j, b_j \in \mathbb{N},$$
for which
$$(Pf)(m) = \sum_{j} \mathbf{1}_{\{m \equiv b_j \ (\mathrm{mod}\ a_j)\}}\, f\!\Big( \frac{m - b_j}{a_j} \Big).$$
The corresponding Dirichlet transforms $L_s$ act by weighted composition on generating series. A unified spectral calculus would classify such arithmetic systems according to whether their backward operators are quasi–compact, admit meromorphic decompositions, or exhibit a spectral gap on suitable Banach geometries. This analytic classification parallels the dynamical trichotomy into terminating, periodic, and divergent regimes.
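The operator attached to an affine–congruence system can be written down in a few lines. The sketch below is an illustration under assumptions: `transfer_operator` is a hypothetical helper, functions are taken to live on the positive integers (so a branch contributes only when $(m - b_j)/a_j \ge 1$), and the sample branches $n \mapsto 2n$ and $n \mapsto 3n + 1$ are chosen for illustration and are not claimed to reproduce the normalization of the Collatz operator studied in the paper.

```python
def transfer_operator(branches, f):
    """Return Pf with (Pf)(m) = sum_j 1{m = b_j mod a_j} * f((m - b_j)/a_j),
    for branches n -> a_j*n + b_j given as pairs (a_j, b_j).
    Assumption: f is defined on positive integers only."""
    def Pf(m: int):
        total = 0
        for a, b in branches:
            q, r = divmod(m - b, a)
            if r == 0 and q >= 1:      # m lies in the image of branch j
                total += f(q)
        return total
    return Pf

# Sample system (illustrative only): n -> 2n and n -> 3n + 1.
sample_branches = [(2, 0), (3, 1)]
count_preimages = transfer_operator(sample_branches, lambda n: 1)
```

Applied to the constant function $1$, the operator counts how many branches hit $m$: for the sample system, $m = 4$ is hit by both branches ($4 = 2 \cdot 2 = 3 \cdot 1 + 1$), while $m = 5$ is hit by neither.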
In the Collatz case, the results of this paper yield a complete spectral resolution of the backward dynamics. The operator $P$ and its Dirichlet realization $L_s$ together provide a model of an arithmetic transfer operator in which analytic continuation, spectral gaps, and decay of correlations follow from explicit Lasota–Yorke estimates on $B_{\mathrm{tree},\sigma}$. The contraction of $L_s$ for $\Re(s) > 1$, combined with the bound $\lambda_{\mathrm{LY}} < 1$ on $B_{\mathrm{tree},\sigma}$, ensures that $P$ is quasi–compact with a strict spectral gap. Consequently, the associated dynamical Dirichlet series admit uniform pole–remainder decompositions, and the invariant density exhibits an averaged $1/n$ law: its block averages satisfy
$$c_j = \frac{1}{6^{j}} \sum_{n \in I_j} h(n) = \frac{c}{6^{j}} + o(6^{-j}),$$
which corresponds to the mass distribution behaving like $c/n$ to leading order on each block $I_j$.
Boundary spectral geometry and parameter optimization. Theorems 4.17 and 4.1 show that the Lasota–Yorke inequality on $B_{\mathrm{tree}}$ yields a strict spectral gap at the boundary $\sigma = 1$. A natural next step is to optimize the parameters $(\alpha, \vartheta)$ defining the tree seminorm and to determine whether $B_{\mathrm{tree}}$ is minimal or universal among Banach geometries admitting contraction. A quantitative study of
$$\|Pf\|_{\mathrm{tree}} \le C_P\, \lambda\, |f|_{\mathrm{tree}} + \|f\|_{\ell^{1}}$$
may reveal how $\lambda$ depends on $\vartheta$ and how this dependence reflects asymmetries in the Collatz preimage tree. Showing that $\lambda(\vartheta) \to 0$ as $\vartheta \to 0$ would relate analytic contraction rates to the combinatorial entropy of inverse trajectories.
Residues, duality, and forward–backward correspondence. The residue coefficients $A_k(1)$, which decay geometrically as $\lambda^{k}$, represent spectral invariants of the pole part of the dynamical Dirichlet zeta function. On the forward side, the heuristic contraction $(3/4)^{k}$ describes the average shrinkage of integers under iteration. A precise duality between these quantities would connect analytic and probabilistic aspects of the dynamics, expressing average stopping times and fluctuations in terms of the spectral radius of a normalized backward operator. Such a correspondence would yield a forward–backward conservation principle linking termination statistics with spectral invariants.
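The heuristic factor $3/4$ comes from the fact that the valuation $\nu_2(3n+1)$ averages to $2$ over odd $n$, so one accelerated odd step multiplies $n$ by roughly $3/2^{2}$ in the geometric mean. The sketch below checks this over all odd $n < 2^{K}$; the function names are hypothetical, and the ratio $3/2^{a}$ is used as a proxy for $n_{k+1}/n_k$, ignoring the $+1$ term.

```python
from math import log

def nu2(x: int) -> int:
    """2-adic valuation of a positive integer x."""
    return (x & -x).bit_length() - 1

def mean_valuation(K: int) -> float:
    """Average of nu2(3n + 1) over all odd n with 1 <= n < 2**K."""
    odd = range(1, 2 ** K, 2)
    return sum(nu2(3 * n + 1) for n in odd) / len(odd)

def mean_log_ratio(K: int) -> float:
    """Average of log(3 / 2**a) with a = nu2(3n + 1), over the same odd n."""
    return log(3) - mean_valuation(K) * log(2)
```

For $K = 16$ the mean valuation is within $10^{-3}$ of $2$, and the mean log-ratio is correspondingly close to $\log(3/4)$.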
Extensions and universality. The multiscale tree space equipped with a hybrid $\ell^{1}$–oscillation norm provides a flexible analytic environment for nonlinear integer maps. Future work may examine metric entropy, measure concentration, and universality phenomena induced by the tree geometry, seeking optimal weights or identifying extremal systems among those with $\lambda < 1$. Such analysis would illuminate how nonlinear arithmetic recursions embed naturally into Banach geometries enforcing global contraction.
Dynamical Dirichlet zeta functions. The functions
$$\zeta_C(s, k) = \sum_{n \ge 1} \frac{1}{\big(C^{k}(n)\big)^{s}}$$
belong to a broader class of dynamical Dirichlet zeta functions $\zeta_T(s, k)$ associated with iterates of arithmetic maps with finitely many inverse branches. Spectral gaps govern their meromorphic structure, and residues of their poles capture dynamical invariants. Extending this analysis to more general systems would connect the present framework with Ruelle–Perron–Frobenius theory and the analytic structure of dynamical determinants.
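Partial sums of such a series are straightforward to evaluate. The sketch below assumes that $C$ denotes the Collatz map $T$ from the Introduction and truncates the sum at $n \le N$; the names `iterate` and `zeta_partial` are hypothetical, and no claim is made here about convergence of the full series for $k \ge 1$.

```python
def collatz(n: int) -> int:
    """The Collatz map: n/2 for even n, 3n+1 for odd n (assumed to be C)."""
    return n // 2 if n % 2 == 0 else 3 * n + 1

def iterate(n: int, k: int) -> int:
    """k-th iterate C^k(n)."""
    for _ in range(k):
        n = collatz(n)
    return n

def zeta_partial(s: float, k: int, N: int) -> float:
    """Truncated sum over 1 <= n <= N of 1 / (C^k(n))**s."""
    return sum(iterate(n, k) ** (-s) for n in range(1, N + 1))
```

For $k = 0$ the truncated sum reduces to a partial sum of the Riemann zeta function, so `zeta_partial(2.0, 0, 10000)` is close to $\pi^{2}/6$; for $k = 1$ and $N = 3$ the terms are $1/4 + 1 + 1/10$.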
Broader outlook The spectral resolution of the Collatz dynamics developed here suggests a general spectral calculus for arithmetic dynamics in which termination, recurrence, and periodicity correspond to spectral features of noninvertible operators on Banach spaces of arithmetic functions. Future work should clarify how universal the Lasota–Yorke mechanism is for nonlinear arithmetic recursions, how arithmetic symmetries influence spectral gaps, and how probabilistic models of integer iteration emerge as weak limits of deterministic transfer operators. The Collatz operator studied here provides a detailed example in which a complete spectral description is obtained through explicit Lasota–Yorke estimates on a multiscale Banach space.

References

  1. Applegate, D.; Lagarias, J. C. Distribution trees for the 3x+1 problem. Exp. Math. 2003, 12(4), 475–490. [Google Scholar]
  2. Applegate, D.; Lagarias, J. C. Density bounds for the 3x+1 problem. Experimental Mathematics 2005, 14(2), 129–146. [Google Scholar]
  3. Eliahou, S. The 3x+1 problem: new lower bounds on nontrivial cycle lengths. Discrete Math. 1993, 118, 45–56. [Google Scholar] [CrossRef]
  4. Everett, C. J. Iteration of the number-theoretic function f(2n)=n, f(2n+1)=3n+2. Adv. Math. 1977, 25(1), 42–45. [Google Scholar] [CrossRef]
  5. Furstenberg, H. Recurrence in Ergodic Theory and Combinatorial Number Theory; Princeton Univ. Press, 1981. [Google Scholar]
  6. Hennion, H. Sur un théorème spectral et son application aux noyaux lipschitziens. Proceedings of the American Mathematical Society 1993, 118(2), 627–634. [Google Scholar]
  7. Tulcea, C. T. Ionescu; Marinescu, G. Théorie ergodique pour des classes d’opérations non complètement continues. Annals of Mathematics 1950, 52, 140–147. [Google Scholar] [CrossRef]
  8. Katok, A.; Hasselblatt, B. Introduction to the Modern Theory of Dynamical Systems; Cambridge Univ. Press, 1995. [Google Scholar]
  9. Kontorovich, A. V.; Lagarias, J. C. Stochastic models for the 3x+1 and 5x+1 problems. Unif. Distrib. Theory 2010, 5(2), 121–164. [Google Scholar]
  10. Korec, I. A density estimate for the 3x+1 problem. Math. Slovaca 1994, 44(1), 85–89. [Google Scholar]
  11. Krasikov, I. How many numbers satisfy the 3x+1 conjecture? Internat. J. Math. Math. Sci. 1989, 12(4), 791–796. [Google Scholar] [CrossRef]
  12. Lagarias, J. C. The 3x+1 problem and its generalizations. Amer. Math. Monthly 1985, 92(1), 3–23. [Google Scholar] [CrossRef]
  13. Lagarias, J. C. The Ultimate Challenge: The 3x+1 Problem. In Amer. Math. Soc.; 2010. [Google Scholar]
  14. Leventides, J.; Poulios, C. An operator theoretic approach to the 3x + 1 dynamical system. IFAC-PapersOnLine 24th International Symposium on Mathematical Theory of Networks and Systems MTNS 2020, 2021; 54, pp. 225–230. [Google Scholar]
  15. Meinardus, G. Some analytic aspects concerning the collatz problem. Technical Report 261, Universität Mannheim, Fakultät für Mathematik und Informatik, 2001. [Google Scholar]
  16. Neklyudov, M. Functional analysis approach to the collatz conjecture. arXiv 2022, arXiv:2106.11859. [Google Scholar] [CrossRef]
  17. Ruelle, D. Statistical mechanics of a one-dimensional lattice gas. Communications in Mathematical Physics 1968, 9, 267–278. [Google Scholar] [CrossRef]
  18. Ruelle, D. A measure associated with Axiom A attractors. American Journal of Mathematics 1976, 98(3), 619–654. [Google Scholar] [CrossRef]
  19. Sinai, Y. G. Statistical (3x+1) problem. Comm. Pure Appl. Math. 2003, 56(7), 1016–1028. [Google Scholar] [CrossRef]
  20. Tao, T. Almost all orbits of the collatz map attain almost bounded values. arXiv 2019, arXiv:1909.03562. [Google Scholar] [CrossRef]
  21. Terras, R. A stopping time problem on the positive integers. Acta Arith. 1976, 30, 241–252. [Google Scholar] [CrossRef]
  22. Terras, R. A stopping time problem on the positive integers. ii. Acta Arith. 1979, 33(3), 241–255. [Google Scholar] [CrossRef]
  23. Wirsching, G. J. The Dynamical System Generated by the 3n+1 Function. In of Lecture Notes in Mathematics; Springer, 1998; volume 1681. [Google Scholar]
1
The 6–adic scale is the one that matches the spectral normalization and the block recursion $c_j \sim C\, 6^{-j}$ in the multiscale fixed–point analysis; the half–blocks $I_j$ are a technical window used to localize the even/odd preimage contributions at scale $j$.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.