The Collatz Conjecture and the Spectral Calculus for Arithmetic Dynamics

James Hateley

doi:10.20944/preprints202511.1440.v5

Submitted: 06 December 2025

Posted: 08 December 2025


Abstract

We develop a complete operator-theoretic and spectral framework for the Collatz map by analyzing its backward transfer operator on weighted Banach spaces of arithmetic functions. The associated Dirichlet transforms form a holomorphic family that isolates a zeta-type pole at s=1, while on a finer multiscale space adapted to the dyadic-triadic geometry of the Collatz preimage tree we establish a two-norm Lasota-Yorke inequality with an explicit contraction constant, yielding quasi-compactness, a spectral gap, and a Perron-Frobenius theorem in which the eigenvalue 1 is algebraically and geometrically simple, no other spectrum meets the unit circle, and the unique invariant density is strictly positive. The fixed-point relation is converted into an exact multiscale recursion for the block averages c_j, revealing a rigid second-order coupling with exponentially small error terms and asymptotic profile c_j \sim 6^{-j}. This spectral classification forces every weak* limit of the Cesàro averages derived from any hypothetical infinite forward orbit to be either 0 or a scalar multiple of the Perron-Frobenius functional, with convergence to 0 occurring precisely under the Block-Escape Property. Since the forward map satisfies an unconditional exponential upper bound, whereas Block-Escape combined with linear block growth along a subsequence would impose an incompatible exponential lower bound, all analytic and spectral components needed for such a contradiction are complete, reducing the Collatz conjecture to excluding infinite orbits exhibiting Block-Escape without the supercritical linear block growth prohibited by the spectral theory.

Keywords: Collatz conjecture; transfer operators; spectral gap; quasi-compactness; Perron-Frobenius theory; Dirichlet transforms; arithmetic dynamics; multiscale analysis; block structure; Lasota-Yorke inequality

Subject: Computer Science and Mathematics - Algebra and Number Theory

1. Introduction

The Collatz conjecture asserts that every positive integer n eventually reaches the trivial

1 \to 4 \to 2 \to 1

cycle under repeated application of

T (n) = \begin{cases} n / 2, & n \text{ even}, \\ 3 n + 1, & n \text{ odd} . \end{cases}

(1)

Equivalently, every forward orbit

O^{+} (n) = {T^{k} (n) : k \geq 0}

is conjectured to terminate in this trivial cycle. Despite its elementary definition, the iteration exhibits highly irregular behavior, with long episodes of apparent expansion intertwined with rapid collapses. This mixture of multiplicative and divisive effects has motivated extensive probabilistic, analytic, and computational investigation for more than eight decades. Foundational work of Terras [1,2] established early density results and stopping-time estimates, while the surveys of Lagarias [3,4] synthesized a broad range of heuristic and structural approaches. Later analytic contributions, including those of Meinardus [5] and Applegate–Lagarias [6], developed refined density bounds and asymptotic models for the distribution of orbit values. Nevertheless, the global termination problem remains open, and the fine-scale structure of Collatz trajectories continues to resist direct combinatorial analysis.
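As a concrete illustration of the map (1) and the forward orbit O^{+} (n), the following minimal Python sketch iterates T until the value 1 is first reached. The names `collatz_step` and `forward_orbit` and the `max_steps` cutoff are illustrative choices made here, not notation from the paper.

```python
def collatz_step(n: int) -> int:
    """One application of the map T from (1)."""
    if n < 1:
        raise ValueError("T is defined on positive integers only")
    return n // 2 if n % 2 == 0 else 3 * n + 1


def forward_orbit(n: int, max_steps: int = 10_000) -> list[int]:
    """Return [n, T(n), T^2(n), ...], stopping when 1 is first reached
    or after max_steps applications of T (a safety cutoff)."""
    orbit = [n]
    while orbit[-1] != 1 and len(orbit) <= max_steps:
        orbit.append(collatz_step(orbit[-1]))
    return orbit


print(forward_orbit(6))  # [6, 3, 10, 5, 16, 8, 4, 2, 1]
```

The orbit of 27 already shows the irregular expansion and collapse described above: it needs 111 steps to reach 1 and climbs as high as 9232 on the way.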

The goal of this paper is to place the Collatz conjecture within an analytic and operator–theoretic framework and to formulate a system of dynamical criteria whose interrelations isolate the precise mechanisms that must be ruled out in any proof of the conjecture. Rather than studying T directly, we analyze its inverse dynamics through the backward transfer operator

(P f) (n) : = \sum_{m : T (m) = n} \frac{f (m)}{m},

(2)

acting on arithmetic functions

f : N \to C

. Such operators arise naturally in statistical mechanics and dynamical systems [7,8] and have recently been applied to

3 x + 1

–type maps in analytic and functional–analytic settings [9,10]. For the Collatz map (1), each n has an even preimage

2 n

and, when

n \equiv 4 (mod 6)

, an odd preimage

(n - 1) / 3

, giving

(P f) (n) = \frac{f (2 n)}{2 n} + 1_{{n \equiv 4 (mod 6)}} \frac{f ((n - 1) / 3)}{(n - 1) / 3} .

(3)

The weights

1 / m

normalize the operator so that P acts as a logarithmically mass–preserving averaging operator on nonnegative

ℓ^{1}

sequences, reflecting the contraction inherent in the preimage structure of T.

On weighted spaces

ℓ_{σ}^{1}

, with norm

{∥ f ∥}_{σ} = \sum_{n \geq 1} | f (n) | n^{- σ}

, the Dirichlet transform

D f (s) = \sum_{n \geq 1} \frac{f (n)}{n^{s}},

(4)

intertwines P with analytic continuation in the half-plane

ℜ (s) > σ

. Uniform bounds on

P^{k}

yield exponential envelopes for

D (P^{k} f) (s)

and translate into meromorphic continuations of the associated Collatz–Dirichlet series. In this picture, the spectral radius of P on

ℓ_{σ}^{1}

encodes the global weighted expansion rate of inverse branches and determines the analytic location of dominant singularities. However, the

ℓ^{1}

theory does not capture the multiscale fluctuations central to Collatz dynamics, such as the dyadic–triadic block structure that organizes the Collatz preimage tree.

To resolve these finer features, we refine the analytic setting to a multiscale Banach space

B_{tree, σ}

built from dyadic–triadic block averages and oscillation seminorms mirroring the hierarchical geometry of the Collatz preimage tree. On this space the operator P satisfies a two-norm Lasota–Yorke inequality

{[P f]}_{tree, σ} \leq λ_{LY} {[f]}_{tree, σ} + C {∥ f ∥}_{σ}, 0 < λ_{LY} < 1,

placing the dynamics squarely within the Ionescu–Tulcea–Marinescu and Hennion theories for quasi–compact transfer operators [11,12]. The explicit verification of this inequality, including the strict contraction in the odd branch and the uniform blockwise control, is carried out in Section 5.

These spectral estimates give rise to a set of dynamical criteria describing block escape, orbitwise averaging, valuation windows, and backward-operator invariants. Each furnishes a logically distinct formulation of the Collatz problem, and the implications among them clarify what phenomena must be precluded in order to establish global termination. The next theorem records the main logical relations among these statements and their connection to the Strong Collatz Conjecture.

Theorem 1.1

(Dynamical Forms Connected to the Strong Collatz Conjecture). Let

I_{j} = [2 \cdot 3^{j}, 2 \cdot 3^{j + 1})

denote the jth Collatz block, and for

n \in N

let

j (n)

be the unique index with

n \in I_{j (n)}

. Consider the following statements:

(1) Strong Collatz Conjecture. Every forward orbit is finite and the only cycle is the trivial

1 \to 4 \to 2 \to 1

cycle.

Remark 1.2. This is the classical problem stated by Collatz, also called the

3 x + 1

conjecture. Surveys by Lagarias [3,4] and Wirsching [13] provide comprehensive overviews of known results and approaches.

(2) No infinite block escape. For every forward orbit

(n_{k})

,

sup_{k \geq 0} j (n_{k}) < \infty .

Remark 1.3. Under the hypothesis that the only cycle is the trivial one, this is equivalent to (1) by definition of

j (n)

. The negation (existence of an orbit with unbounded

j (n_{k})

) corresponds to a divergent orbit. Korec [14] showed that divergence implies visiting arbitrarily large numbers, but not specifically structured by powers of 3. The block structure

I_{j} = [2 \cdot 3^{j}, 2 \cdot 3^{j + 1})

appears in Wirsching’s functional-analytic approach [13] as a natural partition for studying growth rates.

(3) Orbit–Averaging Conjecture (Conjecture 6.3). Every infinite orbit produces a nonzero

P^{*}

–invariant linear functional supported entirely on that orbit.

Remark 1.4. This is an orbit-wise version of invariant measure constructions. Lagarias [3] constructed a 2-adic invariant measure for the

3 x + 1

map, while Tao [15] used almost-periodicity and limit functionals in his "almost all" result. The notion of extracting invariant functionals from single orbits relates to Birkhoff averages and unique ergodicity along orbits [16].

(4) Block–Orbit–Averaging (Conjecture 6.4). Every infinite orbit spends a positive proportion of time inside a finite union of low blocks

⋃_{j \leq J} I_{j}

for some J.

Remark 1.5. This is stronger than Tao’s result [15] that almost all orbits eventually go below

f (N)

for any function f tending to infinity, however slowly. This claims positive-density returns for every infinite orbit. Similar "recurrence to compact sets" appears in dynamical systems as a consequence of invariant measures [17]. For Collatz, it would follow from sufficiently fast growth in high blocks conflicting with known exponential bounds

T^{k} (n_{0}) \leq C 2^{k}

[18].

(5) Block–Escape Implies Supercritical Linear Block Growth (Conjecture 6.10). For every infinite orbit satisfying the Block–Escape Property, there exist

α > \frac{log 2}{log 6}

and a subsequence

k_{ℓ} \to \infty

such that

j (n_{k_{ℓ}}) \geq α k_{ℓ} .

Remark 1.6. The constant

α > log 2 / log 6 \approx 0.3868

is a minimal expansion factor. Similar linear-in-k lower bounds appear in heuristic models where

ν_{2} (3 n + 1)

averages less than

{log}_{2} 3

[19]. In branching random walk models [20], the growth rate per step is

log (3 / 2^{a})

where a is typical valuation; requiring

α > log 2 / log 6

corresponds to

a < {log}_{2} 3 - ε

on average, i.e., deficit windows as in Theorem 6.13.

(6) Weak Non–Retreat Principle (Theorem 7.20). For every infinite orbit

(n_{k})

there exist constants

c > 0

,

C_{0} \geq 0

, and

t_{max} \in N

with the following property: once

j (n_{k})

exceeds a threshold

J_{0}

, then for all sufficiently large such k there is some

1 \leq t \leq t_{max}

with

j (n_{k + t}) \geq j (n_{k}) + c t - C_{0} .

Remark 1.7. This is a finite-horizon forward drift condition. Similar "advance in bounded time" ideas appear in stopping time analyses [1,21], but those guarantee eventual decrease, not forward increase. The closest is Krasikov’s lower bounds on numbers satisfying

T^{k} (n) > n

[18], but not orbit-wise. Sinai’s random walk model [22] suggests that when

log n_{k}

is large, the expected drift per step is positive, but (6) demands a deterministic, uniform forward jump within bounded

t_{max}

.

(7) Orbitwise Discrepancy Vanishing (Conjecture 7.7; established later as Proposition 7.28). Every

{weak}^{*}

limit Λ of Cesàro averages along an infinite orbit is

P^{*}

–invariant.

Remark 1.8. This is a version of the von Neumann mean ergodic theorem for a single orbit, assuming the limit exists. Related to unique ergodicity along orbits [16]. For

3 x + 1

, it posits asymptotic regularity of time averages, akin to the existence of rotation numbers for circle maps. Tao’s argument [15] uses such almost-periodicity for almost all orbits.

(8) Residue–Graph Nontrapping and Cycle Obstruction (Theorems 7.16 and 7.17). For some sufficiently large valuation cutoff

L_{0}

and window bound

1 \leq t \leq t_{max}

, every nonperiodic accelerated odd Collatz orbit escapes both the dangerous residue set

V_{danger} (L_{0})

and the high–valuation set

V_{\geq L_{0} + 1}

. Equivalently, the finite directed residue graphs induced by valuation patterns contain no directed cycles compatible with the odd Collatz map, and no infinite orbit can remain trapped in or oscillate between the corresponding residue classes.

Remark 1.9. Statement (8) places the Collatz problem into a finite combinatorial framework by encoding the accelerated odd map on residue graphs and requiring that no infinite trajectory can remain trapped in either the dangerous or high–valuation subgraphs. This approach extends several earlier graph–based and congruence–based analyses. Applegate and Lagarias [20] performed exhaustive computations of the accelerated map modulo

2^{N}

(for

N \leq 30

), finding no nontrivial cycles and thereby anticipating the cycle–obstruction aspect of (8). Eliahou’s modular constraints on possible cycles [23] likewise translate into forbidden subgraphs in this residue framework. Kontorovich and Lagarias [19] modeled the dynamics via Markov chains on residue classes determined by 2–adic valuations, and Sinai’s logarithmic random–walk formulation [22] analyzed when orbits may appear trapped in specific congruence sets.

The contribution of (8) is to integrate these ideas into a unified dynamical criterion: it couples the residue graph to the valuation window formalism, distinguishes dangerous and high–valuation regions via

V_{danger} (L_{0})

and

V_{\geq L_{0} + 1}

, and asserts that no infinite orbit can persist in, or oscillate between, these regions. This strengthened nontrapping principle links finite residue obstructions directly to the weak non–retreat mechanism used later in the paper.

The following implications hold:

(i) (3)+(8)⟹(1) by Proposition 7.22

(ii) (4)⟹(3) by Proposition 7.23

(iii) (5)⟹(4) by Proposition 7.24

(iv) (6)⟹(5) by Proposition 7.27

(v) (8)⟹(6) by Theorem 7.20.

(vi) (8) is established by Theorems 7.16 and 7.17.

The resulting chain of implications yields the Strong Collatz Conjecture:

(8) ⟹ (6) ⟹ (5) ⟹ (4) ⟹ (3) and (3) + (8) ⟹ (1)

Other implications:

(3)⟹(2) by Proposition 7.21

(3)⟹(1) by Theorems 5.28, 5.32, 5.41

(7)⟹(3) by Lemmas 7.1, 7.4

(7) is established by Proposition 7.28
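The logical bookkeeping of these implications can be sketched as a small forward-chaining computation. The sketch below only records which statements follow from which under the implications listed above; it says nothing about whether the cited propositions themselves hold. The names `RULES` and `closure` are illustrative assumptions.

```python
# Each rule is (set of hypotheses, conclusion), using the statement
# numbers of Theorem 1.1 and the implications listed in the text.
RULES = [
    ({8}, 6),     # (v)   Theorem 7.20
    ({6}, 5),     # (iv)  Proposition 7.27
    ({5}, 4),     # (iii) Proposition 7.24
    ({4}, 3),     # (ii)  Proposition 7.23
    ({3, 8}, 1),  # (i)   Proposition 7.22
    ({3}, 2),     # Proposition 7.21
    ({3}, 1),     # Theorems 5.28, 5.32, 5.41
    ({7}, 3),     # Lemmas 7.1, 7.4
]


def closure(assumed: set[int]) -> set[int]:
    """All statements derivable from `assumed` by repeatedly applying RULES."""
    known = set(assumed)
    changed = True
    while changed:
        changed = False
        for hypotheses, conclusion in RULES:
            if hypotheses <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


print(sorted(closure({8})))  # [1, 2, 3, 4, 5, 6, 8]
```

Starting from statement (8) alone, the closure contains (1), which is exactly the chain displayed above.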

Remark 1.10.

Several statements in this paper initially appear in conjectural form. This is deliberate. When a dynamical or spectral principle first arises, we present it as a conjecture and defer the development of the required analytic tools. Subsequent sections supply the necessary machinery, and the conjectures are later confirmed as theorems or propositions, most notably in Section 7, where the essential conjectural statements are fully resolved. We retain their original presentation as conjectures to preserve the logical flow of the exposition and to highlight the structural ideas as they emerge.

The remainder of the paper develops the analytic and dynamical tools that feed into Theorem 1.1. Section 2 introduces the weighted

ℓ_{σ}^{1}

spaces and their Dirichlet transforms, establishing the basic analytic framework used throughout. Section 3 defines the backward transfer operator P associated with the odd branch and derives its analytic representation. Section 4 constructs the multiscale space

B_{tree, σ}

adapted to the Collatz preimage tree and proves the Lasota–Yorke inequalities that control the action of P on this space. Section 5 converts the Lasota–Yorke contraction into a concrete spectral statement, proving that invariant densities obey an explicit block-level recursion with exponentially small error. This leads to a complete Perron–Frobenius description of P and to a certified spectral gap, establishing quasi–compactness. Section 6 finalizes the spectral results and develops the valuation and residue tools required to analyze finite windows of odd transitions and to formulate the residue-graph obstructions that underlie the weak non–retreat principle. Section 6.5 introduces the valuation and residue-class tools used to analyze short odd windows, defining block indices, dangerous valuation patterns, and the finite residue graphs whose transitions constrain forward dynamics. These structures support the Block–Escape Property and the weak non–retreat principle central to the dynamical implications of the paper. Section 7 links the residue–graph analysis with precise forward–dynamical estimates, showing that nontrapping of accelerated odd orbits produces quantitative weak non–retreat, linear block growth, and ultimately orbitwise invariant functionals. When coupled with the spectral structure of the backward operator, this leads to a contradiction for any infinite orbit and therefore completes the proof of the Strong Collatz Conjecture. Finally, Section 8 places the spectral and dynamical framework in a broader arithmetic context and outlines potential extensions to other discrete maps.

2. Analytic Foundations for the Transfer Operator

The analysis begins with a careful description of the function spaces, Dirichlet transforms, and basic structural features of the Collatz map that underlie the spectral study of the backward operator P. Throughout we work with complex-valued arithmetic functions

f : N \to C

. We start with a simple unbounded estimate.

Lemma 2.1

(Coarse k-step envelopes). Let

T : N \to N

denote the Collatz map (1). For every

n \in N

and

k \in N_{0}

,

\frac{n}{2^{k}} \leq T^{k} (n) \leq 3^{k} n + \frac{3^{k} - 1}{2} .

(5)

Proof.

For every

m \geq 1

, the definition of T gives

\frac{m}{2} \leq T (m) \leq 3 m + 1 .

Iterating the lower bound yields

T^{k} (n) \geq n / 2^{k}

. For the upper bound, the recurrence

T^{k + 1} (n) \leq 3 T^{k} (n) + 1

immediately gives, by a simple induction on k, the explicit estimate

T^{k} (n) \leq 3^{k} n + (3^{k} - 1) / 2

. This proves (5). □
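The envelope (5) is easy to test numerically. The following sketch checks both inequalities in exact integer arithmetic for a range of starting values. The helper name `envelope_holds` and the chosen ranges are assumptions made for illustration.

```python
def collatz_step(n: int) -> int:
    """One application of the Collatz map T from (1)."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def envelope_holds(n: int, k_max: int) -> bool:
    """Check n / 2^k <= T^k(n) <= 3^k n + (3^k - 1) / 2 for 0 <= k <= k_max.

    Both inequalities are cleared of denominators so that only integer
    arithmetic is used.
    """
    x = n  # x holds T^k(n)
    for k in range(k_max + 1):
        lower_ok = n <= x * 2**k
        upper_ok = 2 * x <= 2 * 3**k * n + 3**k - 1
        if not (lower_ok and upper_ok):
            return False
        x = collatz_step(x)
    return True


print(all(envelope_holds(n, 60) for n in range(1, 500)))
```

The check is of course no substitute for the induction in the proof; it only confirms that the stated constants are consistent with the definition of T.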

These envelopes are intentionally crude, yet they ensure that forward iterates of typical arithmetic weights remain controlled on the scales relevant for our Dirichlet and transfer-operator analysis.

2.1. Weighted Dirichlet Spaces and Transform Estimates

For

σ > 0

we define the weighted

ℓ^{1}

space

ℓ_{σ}^{1} : = \{ f : N \to C : {∥ f ∥}_{σ} : = \sum_{n \geq 1} \frac{| f (n) |}{n^{σ}} < \infty \} .

(6)

The weight exponent

σ

measures polynomial decay and is chosen so that Dirichlet series associated with f converge absolutely in a half-plane

ℜ (s) > σ

. Given

f \in ℓ_{σ}^{1}

, we define its Dirichlet transform

D f (s) : = \sum_{n \geq 1} \frac{f (n)}{n^{s}}, ℜ (s) > σ .

(7)

Lemma 2.2

(Dirichlet convergence). Let

σ > 0

and let

f \in ℓ_{σ}^{1}

, so that

{∥ f ∥}_{σ} : = \sum_{n \geq 1} \frac{| f (n) |}{n^{σ}} < \infty .

Then the Dirichlet transform

D f (s) : = \sum_{n \geq 1} \frac{f (n)}{n^{s}}

converges absolutely for

ℜ (s) > σ

and defines a bounded holomorphic function on every half-plane

ℜ (s) \geq σ + ε

,

ε > 0

. Moreover,

| D f (s) | \leq {∥ f ∥}_{σ} sup_{n \geq 1} n^{σ - ℜ (s)} = {∥ f ∥}_{σ} (ℜ (s) > σ) .

(8)

Proof.

Let

s \in C

with

ℜ (s) > σ

. Then

\sum_{n \geq 1} |\frac{f (n)}{n^{s}}| = \sum_{n \geq 1} \frac{| f (n) |}{n^{ℜ (s)}} = \sum_{n \geq 1} \frac{| f (n) |}{n^{σ}} n^{σ - ℜ (s)} .

Since

ℜ (s) > σ

implies

σ - ℜ (s) < 0

, the sequence

n^{σ - ℜ (s)}

is decreasing to 0, and hence

sup_{n \geq 1} n^{σ - ℜ (s)} = 1 .

Therefore,

\sum_{n \geq 1} |\frac{f (n)}{n^{s}}| \leq {∥ f ∥}_{σ} < \infty,

so the Dirichlet series converges absolutely.

For every

ε > 0

and every s in the half-plane

ℜ (s) \geq σ + ε

, each term satisfies

| f (n) n^{- s} | \leq | f (n) | n^{- σ}

, and the right-hand side is summable and independent of s. By the Weierstrass M-test the series therefore converges uniformly on this half-plane. Since every term is an entire function of s, the limit

D f

is holomorphic on the interior of this region and bounded there by

{∥ f ∥}_{σ}

. The bound (8) follows directly from the estimate above. □
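For finitely supported f, the bound (8) can be checked directly. The sketch below represents f as a dictionary from n to f(n); the names `sigma_norm` and `dirichlet_transform` and the sample values are illustrative assumptions.

```python
def sigma_norm(f: dict[int, complex], sigma: float) -> float:
    """Weighted norm ||f||_sigma = sum |f(n)| n^(-sigma) from (6)."""
    return sum(abs(value) * n ** (-sigma) for n, value in f.items())


def dirichlet_transform(f: dict[int, complex], s: complex) -> complex:
    """Dirichlet transform D f(s) = sum f(n) n^(-s) from (7)."""
    return sum(value * n ** (-s) for n, value in f.items())


f = {1: 1.0, 2: -0.5, 7: 2.0, 10: 1j}
sigma = 0.5
for s in (0.6, 1 + 3j, 2 - 1j):  # all satisfy Re(s) > sigma
    # Second printed value is True when (8) holds at this s.
    print(s, abs(dirichlet_transform(f, s)) <= sigma_norm(f, sigma))
```

Every sample point lies in the half-plane ℜ (s) > σ, so Lemma 2.2 predicts that each comparison is satisfied.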

We write

ℓ^{1} = ℓ_{0}^{1}

for the unweighted space with norm

{∥ f ∥}_{1} = \sum_{n \geq 1} | f (n) |

.

2.2. Collatz Preimage Geometry and the Backward Recursion

For each

n \geq 1

, define the even and odd preimage sets

E (n) : = {m \in N : T (m) = n, m even}, O (n) : = {m \in N : T (m) = n, m odd} .

Lemma 2.3

(Preimage structure). For every

n \in N

,

E (n) = \{2 n\}, O (n) = \begin{cases} \{(n - 1) / 3\}, & n \equiv 4 (mod 6), \\ \emptyset, & \text{otherwise}, \end{cases}

(9)

and in the first case

(n - 1) / 3

is odd. In particular, each n has either one preimage (even) or two preimages (one even and one odd), and the odd preimage occurs with natural density

1 / 6

.

Proof.

If m is even and

T (m) = n

, then

m / 2 = n

, so

m = 2 n

, establishing

E (n) = {2 n}

. If m is odd and

T (m) = n

, then

3 m + 1 = n

, so

m = (n - 1) / 3

. This is an integer precisely when

n \equiv 1 (mod 3)

. For m to be odd,

n - 1

must be divisible by 3 but not by 6, so

n \equiv 4 (mod 6)

. In that case

(n - 1) / 3

is odd. The density statement follows since the congruence class

n \equiv 4 (mod 6)

has natural density

1 / 6

. □
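Lemma 2.3 can be compared against a brute-force search. The sketch below, with the illustrative names `preimages` and `brute_force_preimages`, lists the preimages given by (9) and checks them against all m up to 2n, which suffices because both candidate preimages are at most 2n.

```python
def collatz_step(m: int) -> int:
    """One application of the Collatz map T from (1)."""
    return m // 2 if m % 2 == 0 else 3 * m + 1


def preimages(n: int) -> list[int]:
    """Preimages of n under T according to (9): always 2n, plus the odd
    preimage (n - 1) / 3 exactly when n is congruent to 4 mod 6."""
    result = [2 * n]
    if n % 6 == 4:
        result.append((n - 1) // 3)
    return result


def brute_force_preimages(n: int) -> list[int]:
    """All m in [1, 2n] with T(m) = n, found by exhaustive search."""
    return sorted(m for m in range(1, 2 * n + 1) if collatz_step(m) == n)


print(preimages(4), preimages(10), preimages(5))  # [8, 1] [20, 3] [10]
```

Counting the n up to 6000 with two preimages gives exactly 1000, matching the density 1 / 6 stated in the lemma.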

Hence each n admits exactly one even preimage, together with one odd preimage precisely when

n \equiv 4 (mod 6)

. The corresponding backward transfer operator is defined as

(P f) (n) : = \sum_{m : T (m) = n} \frac{f (m)}{m} = \frac{f (2 n)}{2 n} + 1_{{n \equiv 4 (6)}} \frac{f (\frac{n - 1}{3})}{(n - 1) / 3} .

(10)

The normalization by

1 / m

reflects the logarithmic contraction of the forward map and ensures a natural mass-balance property.

Lemma 2.4

(Weighted mass preservation). Let

f : N \to [0, \infty)

satisfy

\sum_{m \geq 1} \frac{f (m)}{m} < \infty .

Then the backward transfer operator

(P f) (n) : = \sum_{m : T (m) = n} \frac{f (m)}{m}

preserves the weighted mass in the sense that

\sum_{n \geq 1} (P f) (n) = \sum_{m \geq 1} \frac{f (m)}{m} .

(11)

Proof.

Since

f \geq 0

and

\sum_{m \geq 1} f (m) / m < \infty

, Tonelli’s theorem justifies rearranging the nonnegative double series. Using the definition of P,

\sum_{n \geq 1} (P f) (n) = \sum_{n \geq 1} \sum_{m : T (m) = n} \frac{f (m)}{m} .

Each

m \geq 1

has exactly one image

T (m)

, so it appears in exactly one of the inner sums. Hence we can rewrite the double sum directly over m:

\sum_{n \geq 1} \sum_{m : T (m) = n} \frac{f (m)}{m} = \sum_{m \geq 1} \frac{f (m)}{m},

which is precisely (11). □
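The argument above amounts to pushing each weight f (m) / m forward to n = T (m). The following sketch implements P in exactly that way for finitely supported f and verifies (11) in exact rational arithmetic. The function name `backward_transfer` and the test function f are assumptions made for illustration.

```python
from fractions import Fraction


def backward_transfer(f: dict[int, Fraction]) -> dict[int, Fraction]:
    """Apply P to a finitely supported f: each m contributes f(m)/m
    to the single value n = T(m), as in the proof of Lemma 2.4."""
    out: dict[int, Fraction] = {}
    for m, value in f.items():
        n = m // 2 if m % 2 == 0 else 3 * m + 1  # n = T(m)
        out[n] = out.get(n, Fraction(0)) + Fraction(value) / m
    return out


f = {m: Fraction(m % 5 + 1) for m in range(1, 101)}
total_Pf = sum(backward_transfer(f).values())
weighted_mass = sum(value / m for m, value in f.items())
print(total_Pf == weighted_mass)  # True
```

Because each m appears in exactly one inner sum, the two totals agree term by term, so the equality holds exactly rather than up to rounding.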

2.3. Dirichlet Envelopes for Iterated Backward Dynamics

The preimage structure allows a crude but useful bound on P acting on

ℓ_{σ}^{1}

.

Proposition 2.5

(Backward operator bound). Let

σ > 0

and let P be defined by (10). Then

P : ℓ_{σ}^{1} \to ℓ_{σ}^{1}

is bounded and

{∥ P f ∥}_{σ} \leq C_{σ} {∥ f ∥}_{σ}, C_{σ} : = 2^{σ} + 3^{- σ},

(12)

for all

f \in ℓ_{σ}^{1}

. Consequently, for every

k \geq 1

,

{∥ P^{k} f ∥}_{σ} \leq C_{σ}^{k} {∥ f ∥}_{σ} .

(13)

Proof.

From (10),

(P f) (n) = \frac{f (2 n)}{2 n} + 1_{{n \equiv 4 (6)}} \frac{f (\frac{n - 1}{3})}{(n - 1) / 3} .

Hence

{∥ P f ∥}_{σ} \leq S_{even} + S_{odd},

with

S_{even} : = \sum_{n \geq 1} \frac{| f (2 n) |}{2 n n^{σ}}, S_{odd} : = \sum_{\begin{matrix} n \geq 1 \\ n \equiv 4 (6) \end{matrix}} \frac{|f (\frac{n - 1}{3})|}{(\frac{n - 1}{3}) n^{σ}} .

For the even branch, set

m = 2 n

, so

n = m / 2

and

S_{even} = \sum_{\begin{matrix} m \geq 1 \\ m even \end{matrix}} \frac{| f (m) |}{m {(m / 2)}^{σ}} = \sum_{\begin{matrix} m \geq 1 \\ m even \end{matrix}} \frac{2^{σ} | f (m) |}{m^{σ + 1}} \leq 2^{σ} \sum_{m \geq 1} \frac{| f (m) |}{m^{σ}} = 2^{σ} {∥ f ∥}_{σ} .

For the odd branch, write

m = (n - 1) / 3

, so

n = 3 m + 1

and m is odd. Then

S_{odd} = \sum_{\begin{matrix} m \geq 1 \\ m odd \end{matrix}} \frac{| f (m) |}{m {(3 m + 1)}^{σ}} \leq \sum_{m \geq 1} \frac{| f (m) |}{m {(3 m)}^{σ}} = 3^{- σ} \sum_{m \geq 1} \frac{| f (m) |}{m^{σ + 1}} \leq 3^{- σ} {∥ f ∥}_{σ} .

Combining the two estimates gives (12), and iterating yields (13). □
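As a sanity check on (12) and (13), one can compare norms numerically for a randomly chosen finitely supported f. The sketch below is only an illustration under stated assumptions: the names `apply_P` and `sigma_norm`, the random test function, and the value σ = 0.75 are all choices made here.

```python
import random


def apply_P(f: dict[int, float]) -> dict[int, float]:
    """Backward transfer operator (10) on finitely supported f."""
    out: dict[int, float] = {}
    for m, value in f.items():
        n = m // 2 if m % 2 == 0 else 3 * m + 1  # n = T(m)
        out[n] = out.get(n, 0.0) + value / m
    return out


def sigma_norm(f: dict[int, float], sigma: float) -> float:
    """Weighted norm ||f||_sigma = sum |f(n)| n^(-sigma)."""
    return sum(abs(value) / n**sigma for n, value in f.items())


random.seed(0)
sigma = 0.75
C = 2**sigma + 3 ** (-sigma)  # the constant C_sigma from (12)
f = {n: random.uniform(-1.0, 1.0) for n in range(1, 201)}

g = dict(f)
for k in range(1, 6):
    g = apply_P(g)  # g = P^k f
    # Second printed value is True when (13) holds for this k.
    print(k, sigma_norm(g, sigma) <= C**k * sigma_norm(f, sigma))
```

In such experiments the observed ratio is typically far below C_{σ}, in line with the remark that these envelopes are crude.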

The constant

C_{σ} = 2^{σ} + 3^{- σ}

is an explicit growth factor for P on

ℓ_{σ}^{1}

. It is not

< 1

in this normalization, so no contraction is claimed at this level. The genuine contraction mechanism is obtained later on the multiscale Banach space

B_{tree}

, where a strong seminorm captures oscillatory decay along the Collatz tree while the

ℓ^{1}

component provides compactness.

3. Transfer Operator Formulation

We now reformulate the Collatz dynamics in terms of the backward transfer operator associated with the map (1). This operator-theoretic viewpoint provides an analytic bridge between the discrete recurrence and the functional framework developed in later sections. The transfer operator encodes the inverse–branching structure of the map and propagates densities backward along the Collatz tree, in a form compatible with logarithmic weighting and Dirichlet series.

Recall that, for the Collatz map (1), Lemma 2.3 shows that each

n \geq 1

has the even preimage

2 n

, together with an additional odd preimage

(n - 1) / 3

precisely when

n \equiv 4 (mod 6)

.

3.1. Backward Transfer Operator

Definition 3.1

(Backward transfer operator). For an arithmetic function

f : N \to C

, define

(P f) (n) : = \sum_{m : T (m) = n} \frac{f (m)}{m} = \frac{f (2 n)}{2 n} + 1_{{n \equiv 4 (6)}} \frac{f (\frac{n - 1}{3})}{(n - 1) / 3}, n \in N,

(14)

where

1_{A}

denotes the indicator of the condition A.

Lemma 3.2

(Dirichlet transform intertwining). Let

f \in ℓ_{σ}^{1}

with

σ > 1

, and define the Dirichlet transform

D (f) (s) = \sum_{n \geq 1} f (n) n^{- s} .

Then for every s with

ℜ (s) > σ

, the series

D (f) (s)

and

D (P f) (s)

converge absolutely. Moreover, if we write

D (f) (s) = \sum_{n \geq 1} a_{n} n^{- s}, a_{n} : = f (n),

and define a linear operator

L_{s}

on Dirichlet series by

(L_{s} F) (s) : = 2^{s} \sum_{\begin{matrix} m \geq 1 \\ m even \end{matrix}} a_{m} m^{- 1 - s} + \sum_{\begin{matrix} k \geq 1 \\ k odd \end{matrix}} a_{k} k^{- 1} {(3 k + 1)}^{- s}, F (s) = \sum_{n \geq 1} a_{n} n^{- s},

(15)

then

D (P f) (s) = (L_{s} D (f)) (s) for all ℜ (s) > σ .

Proof.

Fix

f \in ℓ_{σ}^{1}

with

σ > 1

. By definition of the

ℓ_{σ}^{1}

-norm,

\sum_{n \geq 1} | f (n) | n^{- σ} < \infty .

If

ℜ (s) > σ

, then

n^{- ℜ (s)} \leq n^{- σ}

for all

n \geq 1

, so

\sum_{n \geq 1} | f (n) | n^{- ℜ (s)} \leq \sum_{n \geq 1} | f (n) | n^{- σ} < \infty .

Thus

D (f) (s) = \sum_{n \geq 1} f (n) n^{- s}

converges absolutely for

ℜ (s) > σ

.

Next we show that

D (P f) (s)

converges absolutely for the same range. From the definition of P,

(P f) (n) = \frac{f (2 n)}{2 n} + 1_{{n \equiv 4 (\mod 6)}} \frac{f ((n - 1) / 3)}{(n - 1) / 3},

so

| P f (n) | \leq \frac{| f (2 n) |}{2 n} + 1_{{n \equiv 4 (\mod 6)}} \frac{| f ((n - 1) / 3) |}{(n - 1) / 3} .

Hence

\sum_{n \geq 1} | P f (n) | n^{- ℜ (s)} \leq S_{even} + S_{odd},

where

S_{even} : = \sum_{n \geq 1} \frac{| f (2 n) |}{2 n} n^{- ℜ (s)}, S_{odd} : = \sum_{\begin{matrix} n \geq 1 \\ n \equiv 4 (6) \end{matrix}} \frac{| f ((n - 1) / 3) |}{(n - 1) / 3} n^{- ℜ (s)} .

For the even contribution, set

m = 2 n

so

n = m / 2

and m is even. Then

S_{even} = \sum_{\begin{matrix} m \geq 1 \\ m even \end{matrix}} \frac{| f (m) |}{m} {(\frac{m}{2})}^{- ℜ (s)} = 2^{ℜ (s)} \sum_{\begin{matrix} m \geq 1 \\ m even \end{matrix}} | f (m) | m^{- 1 - ℜ (s)} .

Since

ℜ (s) > σ

implies

ℜ (s) + 1 > σ

, we have

m^{- 1 - ℜ (s)} \leq m^{- σ}

, and therefore

S_{even} \leq 2^{ℜ (s)} \sum_{\begin{matrix} m \geq 1 \\ m even \end{matrix}} | f (m) | m^{- σ} \leq 2^{ℜ (s)} \sum_{m \geq 1} | f (m) | m^{- σ} < \infty .

For the odd contribution, write

n = 3 k + 1

with

k \geq 1

odd (equivalent to

n \equiv 4 (mod 6)

and

(n - 1) / 3 = k

odd). Then

S_{odd} = \sum_{\begin{matrix} k \geq 1 \\ k odd \end{matrix}} \frac{| f (k) |}{k} {(3 k + 1)}^{- ℜ (s)} .

Since

3 k + 1 \geq k

for all

k \geq 1

, we have

{(3 k + 1)}^{- ℜ (s)} \leq k^{- ℜ (s)}

, and hence

S_{odd} \leq \sum_{\begin{matrix} k \geq 1 \\ k odd \end{matrix}} | f (k) | k^{- 1 - ℜ (s)} \leq \sum_{k \geq 1} | f (k) | k^{- 1 - ℜ (s)} .

Again

ℜ (s) + 1 > σ

gives

k^{- 1 - ℜ (s)} \leq k^{- σ}

, so

S_{odd} \leq \sum_{k \geq 1} | f (k) | k^{- σ} < \infty .

Thus

S_{even} + S_{odd} < \infty

, and

D (P f) (s)

converges absolutely for

ℜ (s) > σ

.

We now compute

D (P f) (s)

explicitly and identify it with

(L_{s} D (f)) (s)

. By definition,

D (P f) (s) = \sum_{n \geq 1} (P f) (n) n^{- s} .

Substituting the formula for P and splitting according to the two branches,

D (P f) (s) = \sum_{n \geq 1} \frac{f (2 n)}{2 n} n^{- s} + \sum_{\begin{matrix} n \geq 1 \\ n \equiv 4 (6) \end{matrix}} \frac{f ((n - 1) / 3)}{(n - 1) / 3} n^{- s} .

For the even part, set again

m = 2 n

:

\sum_{n \geq 1} \frac{f (2 n)}{2 n} n^{- s} = \sum_{\begin{matrix} m \geq 1 \\ m even \end{matrix}} \frac{f (m)}{m} {(\frac{m}{2})}^{- s} = 2^{s} \sum_{\begin{matrix} m \geq 1 \\ m even \end{matrix}} f (m) m^{- 1 - s} .

For the odd part, write

n = 3 k + 1

with

k \geq 1

odd and

(n - 1) / 3 = k

:

\sum_{\begin{matrix} n \geq 1 \\ n \equiv 4 (6) \end{matrix}} \frac{f ((n - 1) / 3)}{(n - 1) / 3} n^{- s} = \sum_{\begin{matrix} k \geq 1 \\ k odd \end{matrix}} \frac{f (k)}{k} {(3 k + 1)}^{- s} .

Putting the two contributions together,

D (P f) (s) = 2^{s} \sum_{\begin{matrix} m \geq 1 \\ m even \end{matrix}} f (m) m^{- 1 - s} + \sum_{\begin{matrix} k \geq 1 \\ k odd \end{matrix}} f (k) k^{- 1} {(3 k + 1)}^{- s} .

Now write

F (s) = D (f) (s) = \sum_{n \geq 1} a_{n} n^{- s}

with

a_{n} = f (n)

. By the definition (15) of

L_{s}

, we have

(L_{s} F) (s) = 2^{s} \sum_{\begin{matrix} m \geq 1 \\ m even \end{matrix}} a_{m} m^{- 1 - s} + \sum_{\begin{matrix} k \geq 1 \\ k odd \end{matrix}} a_{k} k^{- 1} {(3 k + 1)}^{- s},

which matches exactly the expression for

D (P f) (s)

obtained above when

a_{n} = f (n)

. Hence

D (P f) (s) = (L_{s} D (f)) (s)

for all

ℜ (s) > σ

, as claimed. □
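The identity of Lemma 3.2 can be confirmed numerically for a finitely supported f by computing both sides independently: the left side applies P and then the Dirichlet transform, while the right side evaluates formula (15) on the coefficients of f. The names `apply_P`, `dirichlet`, and `L_s`, and the sample f and s, are illustrative assumptions.

```python
def apply_P(f: dict[int, float]) -> dict[int, float]:
    """Backward transfer operator (14) on finitely supported f."""
    out: dict[int, float] = {}
    for m, value in f.items():
        n = m // 2 if m % 2 == 0 else 3 * m + 1  # n = T(m)
        out[n] = out.get(n, 0.0) + value / m
    return out


def dirichlet(f: dict[int, float], s: complex) -> complex:
    """Dirichlet transform D(f)(s) = sum f(n) n^(-s)."""
    return sum(value * n ** (-s) for n, value in f.items())


def L_s(f: dict[int, float], s: complex) -> complex:
    """Right-hand side of (15), evaluated on the coefficients a_n = f(n)."""
    even = sum(a * m ** (-1 - s) for m, a in f.items() if m % 2 == 0)
    odd = sum(a / k * (3 * k + 1) ** (-s) for k, a in f.items() if k % 2 == 1)
    return 2**s * even + odd


f = {n: (-1) ** n / n for n in range(1, 41)}
s = 2.5 + 1.0j
lhs = dirichlet(apply_P(f), s)
rhs = L_s(f, s)
print(abs(lhs - rhs) < 1e-12)
```

The two values agree up to floating-point rounding, since the change of variables in the proof is an exact rearrangement of a finite sum in this case.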

The multiplicative factor

1 / m

assigns to each inverse branch a logarithmic weight, so that P acts as a normalized backward average along preimages. This normalization aligns the discrete dynamics with Dirichlet weights and will be crucial for analytic continuation and spectral estimates below.

Positivity. If

f (n) \geq 0

for all n, then

(P f) (n) \geq 0

for all n, since P is a positive linear combination of values of f.

Weighted mass preservation. A direct change of variables shows that for every nonnegative f satisfying

\sum_{m \geq 1} | f (m) | / m < \infty

,

\sum_{n \geq 1} (P f) (n) = \sum_{m \geq 1} \frac{f (m)}{m} .

(16)

Thus P preserves the logarithmically weighted mass

\sum f (m) / m

; plain

ℓ^{1}

mass is not preserved under this normalization.

Boundedness on weighted spaces. Let

ℓ_{σ}^{1} : = \{ f : N \to C : {∥ f ∥}_{ℓ_{σ}^{1}} : = \sum_{n \geq 1} \frac{| f (n) |}{n^{σ}} < \infty \}, σ > 0 .

A direct change of variables in (14) yields, for all

f \in ℓ_{σ}^{1}

,

\begin{matrix} {∥ P f ∥}_{ℓ_{σ}^{1}} & = \sum_{n \geq 1} \frac{| (P f) (n) |}{n^{σ}} \leq \sum_{n \geq 1} (\frac{| f (2 n) |}{2 n^{1 + σ}} + 1_{{n \equiv 4 (6)}} \frac{|f ((n - 1) / 3)|}{((n - 1) / 3) n^{σ}}) \\ & = \frac{1}{2} \sum_{n \geq 1} \frac{| f (2 n) |}{n^{1 + σ}} + \sum_{\begin{matrix} n \geq 1 \\ n \equiv 4 (6) \end{matrix}} \frac{| f ((n - 1) / 3) |}{((n - 1) / 3) n^{σ}} . \end{matrix}

(17)

Changing variables

m = 2 n

in the first sum and

m = (n - 1) / 3

in the second, so that

n = 3 m + 1 \geq 3 m

, gives

\begin{matrix} \sum_{n \geq 1} \frac{| f (2 n) |}{2 n^{1 + σ}} & = 2^{σ} \sum_{\begin{matrix} m \geq 1 \\ m even \end{matrix}} \frac{| f (m) |}{m^{1 + σ}} \leq 2^{σ} {∥ f ∥}_{ℓ_{σ}^{1}}, \\ \sum_{\begin{matrix} n \geq 1 \\ n \equiv 4 (6) \end{matrix}} \frac{| f ((n - 1) / 3) |}{((n - 1) / 3) n^{σ}} & = \sum_{\begin{matrix} m \geq 1 \\ m odd \end{matrix}} \frac{| f (m) |}{m {(3 m + 1)}^{σ}} \leq 3^{- σ} \sum_{\begin{matrix} m \geq 1 \\ m odd \end{matrix}} \frac{| f (m) |}{m^{1 + σ}} \leq 3^{- σ} {∥ f ∥}_{ℓ_{σ}^{1}} . \end{matrix}

Hence

{∥ P f ∥}_{ℓ_{σ}^{1}} \leq (2^{σ} + 3^{- σ}) {∥ f ∥}_{ℓ_{σ}^{1}},

(18)

and therefore

{∥ P^{k} f ∥}_{ℓ_{σ}^{1}} \leq {(2^{σ} + 3^{- σ})}^{k} {∥ f ∥}_{ℓ_{σ}^{1}}, k \geq 0 .

(19)

Action on the weighted sup space. For the Banach space

B_{σ} : = \{ f : N \to C : {∥ f ∥}_{B_{σ}} : = sup_{n \geq 1} n^{σ} | f (n) | < \infty \},

the normalization factor

1 / m

in (14) improves decay at each branch but does not make P a contraction. Setting

g (n) : = n f (n)

, one obtains

n (P f) (n) = g (2 n) + 1_{{n \equiv 4 (6)}} g (\frac{n - 1}{3}), (P f) (n) = \frac{(Q g) (n)}{n}, (Q g) (n) : = g (2 n) + 1_{{n \equiv 4 (6)}} g (\frac{n - 1}{3}) .

Using

{∥ f ∥}_{B_{σ}} = {∥ g ∥}_{B_{σ - 1}}

, one obtains the bound

\begin{matrix} {∥ P f ∥}_{B_{σ}} & = sup_{n \geq 1} n^{σ - 1} | (Q g) (n) | \leq sup_{n \geq 1} (n^{σ - 1} | g (2 n) | + n^{σ - 1} 1_{{n \equiv 4 (6)}} |g (\frac{n - 1}{3})|) \\ \leq (2^{- (σ - 1)} + 3^{σ - 1}) {∥ g ∥}_{B_{σ - 1}} = (2^{- (σ - 1)} + 3^{σ - 1}) {∥ f ∥}_{B_{σ}} . \end{matrix}

(20)

In particular, the constant

2^{- (σ - 1)} + 3^{σ - 1} \geq 1

for all

σ > 0

, so P is bounded but not contractive on

(B_{σ}, ∥ \cdot ∥_{B_{σ}})

. This coarse boundedness provides an upper envelope for the operator norm but does not imply any decay of

P^{k}

on

B_{σ}

.
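The claim that the constant in (20) never drops below 1 is easy to probe numerically. The sketch below evaluates it on a grid; the function name `sup_norm_constant` and the grid range are assumptions chosen here for illustration.

```python
def sup_norm_constant(sigma):
    """The constant 2^{-(sigma-1)} + 3^{sigma-1} appearing in (20)."""
    return 2.0 ** (-(sigma - 1)) + 3.0 ** (sigma - 1)


# Evaluate on a grid of sigma values in (0, 5] (grid choice is arbitrary).
grid = [k / 100 for k in range(1, 501)]
smallest = min(sup_norm_constant(s) for s in grid)

if __name__ == "__main__":
    print("smallest value on the grid:", smallest)
```

A grid search is only a heuristic check; since the function is a sum of two convex exponentials in σ, its minimum on the grid is representative of its global behaviour.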

These limitations motivate the refinement of the functional setting in later sections, where the multiscale tree spaces

B_{tree}

and

B_{tree, σ}

are introduced to obtain genuine Lasota–Yorke-type contractions with

λ < 1

and a provable spectral gap.

3.2. Dirichlet–Ruelle Operator and Intertwining Relations

For

f \in ℓ_{σ}^{1}

with

σ > 0

, the Dirichlet transform

D f (s) : = \sum_{n \geq 1} \frac{f (n)}{n^{s}}, ℜ (s) > σ,

(21)

is absolutely convergent. Writing

D f (s) = \sum_{n \geq 1} a_{n} n^{- s}

with

a_{n} = f (n)

and substituting (14), we obtain

\begin{matrix} D (P f) (s) & = \sum_{n \geq 1} (\frac{a_{2 n}}{2 n} + 1_{{n \equiv 4 (6)}} \frac{a_{(n - 1) / 3}}{(n - 1) / 3}) \frac{1}{n^{s}} . \end{matrix}

(22)

Thus

D (P f)

is again a Dirichlet series whose coefficients depend linearly on those of

D f

.

Definition 3.3

(Dirichlet–Ruelle operator). Let

D_{σ}

denote the space of Dirichlet series

F (s) = \sum_{n \geq 1} a_{n} n^{- s} with \sum_{n \geq 1} \frac{| a_{n} |}{n^{σ}} < \infty .

Define

L : D_{σ} \to D_{σ}

by

(L F) (s) : = \sum_{n \geq 1} b_{n} n^{- s}, b_{n} : = \frac{a_{2 n}}{2 n} + 1_{{n \equiv 4 (6)}} \frac{a_{(n - 1) / 3}}{(n - 1) / 3} .

(23)

Lemma 3.4

(Operator norm of L). For

σ > 0

, let

{∥ F ∥}_{σ} : = \sum_{n \geq 1} | a_{n} | / n^{σ}

. Then

L : D_{σ} \to D_{σ}

is bounded and

{∥ L ∥}_{σ} \leq 2^{σ} + 3^{- σ} .

(24)

Proof.

From (23),

{∥ L F ∥}_{σ} = \sum_{n \geq 1} \frac{| b_{n} |}{n^{σ}} \leq \sum_{n \geq 1} \frac{| a_{2 n} |}{2 n n^{σ}} + \sum_{\begin{matrix} n \geq 1 \\ n \equiv 4 (6) \end{matrix}} \frac{| a_{(n - 1) / 3} |}{(n - 1) / 3} \frac{1}{n^{σ}} = : S_{even} + S_{odd} .

For the even term, set

m = 2 n

. Then

S_{even} = \sum_{m even} \frac{| a_{m} |}{2 {(m / 2)}^{1 + σ}} = \sum_{m even} \frac{2^{σ} | a_{m} |}{m^{1 + σ}} \leq 2^{σ} \sum_{m even} \frac{| a_{m} |}{m^{σ}} \leq 2^{σ} {∥ F ∥}_{σ} .

For the odd term, write

m = (n - 1) / 3

, so

n = 3 m + 1

and

S_{odd} = \sum_{m \geq 1} \frac{| a_{m} |}{m {(3 m + 1)}^{σ}} \leq 3^{- σ} \sum_{m \geq 1} \frac{| a_{m} |}{m^{σ}} = 3^{- σ} {∥ F ∥}_{σ} .

Combining the two estimates gives

{∥ L F ∥}_{σ} \leq (2^{σ} + 3^{- σ}) {∥ F ∥}_{σ},

proving (24). □

Lemma 3.5

(Intertwining of P and L). For every

f \in ℓ_{σ}^{1}

with

σ > 0

,

D (P f) = L (D f), D (P^{k} f) = L^{k} (D f), k \geq 0,

(25)

whenever the series converge absolutely.

Proof.

The Dirichlet coefficients of

D (P f)

in (22) are precisely the

b_{n}

of (23), so

D (P f) = L (D f)

; iteration gives the second identity. □

The intertwining relation shows that spectral information for P on

ℓ_{σ}^{1}

transfers to L on

D_{σ}

. However, since P is not contractive on

ℓ_{σ}^{1}

or

B_{σ}

, the inequality (24) provides only an exponentially growing upper envelope for

∥ L^{k} ∥_{σ}

, not exponential decay. Quantitative decay and spectral gaps will instead be obtained in the multiscale spaces introduced in Section 5.

Define

w_{k} : = P^{k} 1

with

1 (n) \equiv 1

and

ζ_{C} (s, k) : = \sum_{n \geq 1} \frac{w_{k} (n)}{n^{s}}, ℜ (s) large .

(26)

By Lemma 3.5,

ζ_{C} (s, 0) = ζ (s), ζ_{C} (s, k) = (L^{k} ζ) (s), k \geq 1 .

(27)

The quantity

w_{k} (n)

represents the total normalized weight of all k–step backward paths from n in the Collatz tree under the logarithmic weighting

1 / m

. The family

ζ_{C} (s, k)

therefore encodes, in Dirichlet form, the distribution of these weighted backward configurations at depth k. By Lemma 3.4,

∥ L^{k} ∥_{σ} \leq {(2^{σ} + 3^{- σ})}^{k},

so the norms

{∥ ζ_{C} (\cdot, k) ∥}_{σ}

grow at most exponentially in k; this estimate by itself gives no decay in k. Later sections refine this estimate by passing to the multiscale tree space

B_{tree, σ}

, where the Lasota–Yorke inequality ensures a true spectral gap and exponential decay of

P^{k}

.
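The weights w_k(n) can be computed exactly from the recursion w_k = P w_{k-1} with w_0 ≡ 1. The following sketch uses exact rational arithmetic; the names `w` and `zeta_C_partial` and the truncation level `N` are assumptions made here, and the partial sum is only a finite approximation of the series (26).

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def w(k, n):
    """Exact value of w_k(n) = (P^k 1)(n), using w_k = P w_{k-1} and w_0 = 1."""
    if k == 0:
        return Fraction(1)
    total = w(k - 1, 2 * n) / (2 * n)        # even branch: preimage m = 2n
    if n % 6 == 4:                           # odd branch: preimage m = (n-1)/3
        m = (n - 1) // 3
        total += w(k - 1, m) / m
    return total


def zeta_C_partial(s, k, N):
    """Partial sum over n <= N of w_k(n) / n^s (a truncation of the series (26))."""
    return sum(float(w(k, n)) / n ** s for n in range(1, N + 1))


if __name__ == "__main__":
    print(w(1, 4), w(2, 4), w(2, 1))
    print(zeta_C_partial(2.0, 1, 1000))
```

For instance, n = 4 has the preimages 8 and 1, so w_1(4) = 1/8 + 1 = 9/8.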

4. Spectral Framework and Multiscale Contraction Theory

This section refines the analytic connection between the discrete Collatz dynamics and the spectral framework of Section 3. Our goal is to express analytic information about the Dirichlet series associated with iterates of the backward operator P in terms of the spectral data of P—equivalently, of the Dirichlet–Ruelle operator L—acting on suitable Banach spaces continuously embedded in

ℓ_{σ}^{1}

. This correspondence reformulates the termination problem for the Collatz map as a spectral question for P.

Throughout this section we fix

σ > 1

and a Banach space

B_{σ, 1}

of arithmetic functions such that

B_{σ, 1} \subset ℓ_{σ}^{1}

continuously,

P (B_{σ, 1}) \subset B_{σ, 1}

, and the Dirichlet transform

D f (s) = \sum_{n \geq 1} \frac{f (n)}{n^{s}}

defines a holomorphic function for

ℜ (s) > σ

whenever

f \in B_{σ, 1}

. The intertwining relation (25) then yields, for all

k \geq 0

,

D (P^{k} f) (s) = \sum_{n \geq 1} \frac{(P^{k} f) (n)}{n^{s}}, ℜ (s) > σ .

Since

B_{σ, 1} \subset ℓ_{σ}^{1}

, each series converges absolutely. By the

ℓ_{σ}^{1}

estimate (19),

| D (P^{k} f) (s) | \leq {∥ P^{k} f ∥}_{ℓ_{σ}^{1}} \leq {(2^{σ} + 3^{- σ})}^{k} {∥ f ∥}_{ℓ_{σ}^{1}}, ℜ (s) > σ .

(28)

The bound (28) controls the growth of the iterates of P on

ℓ_{σ}^{1}

but gives no contraction; a genuine contraction will appear only after the refinement to the multiscale tree spaces constructed in Section 4.3 and analyzed in Section 4.4.

Generating function and operator resolvent. For

z \in C

with

| z | < {(2^{σ} + 3^{- σ})}^{- 1}

, define the two–variable generating function

G_{f} (s, z) : = \sum_{k \geq 0} z^{k} D (P^{k} f) (s) .

(29)

The series converges absolutely and locally uniformly for

ℜ (s) > σ

, hence

G_{f}

is holomorphic in

(s, z)

on the domain

Ω_{σ} : = {(s, z) \in C^{2} : ℜ (s) > σ, | z | < {(2^{σ} + 3^{- σ})}^{- 1}} .

On the operator side, for such z the Neumann series

{(I - z P)}^{- 1} = \sum_{k \geq 0} z^{k} P^{k}

converges in operator norm on

B_{σ, 1}

, and thus

G_{f} (s, z) = D [{(I - z P)}^{- 1} f] (s), (s, z) \in Ω_{σ} .

(30)

The poles of

{(I - z P)}^{- 1}

in the z–plane occur precisely at the reciprocals of the spectral values of P on

B_{σ, 1}

. Consequently the analytic structure of

G_{f}

as a function of z is governed by the spectrum of P.

At this point we recall that the backward Collatz transfer operator P preserves total mass on

ℓ^{1}

:

\sum_{n \geq 1} (P f) (n) = \sum_{m \geq 1} f (m),

so 1 is the maximal eigenvalue of P on

ℓ^{1}

. On the Banach space

B_{tree, σ}

, however, the associated invariant object is not the constant function, but rather the unique positive eigenfunction h satisfying

P h = h,

as constructed in Section 5. Thus the spectral analysis of P is directed toward establishing a spectral gap at the eigenvalue 1, meaning that every other spectral value

λ

satisfies

| λ | \leq λ_{LY} < 1

, where

λ_{LY}

is the Lasota–Yorke contraction constant for the tree norm. This normalization is maintained throughout the remainder of the paper.

Consequently, the resolvent expansion (30) is analytic for

| z | < 1

except at the simple pole at

z = 1

. The residue at this pole coincides with the Riesz projection onto the eigenspace spanned by h, and therefore encodes the invariant functional associated with the positive eigenfunction h.

The coarse resolvent radius

{(2^{σ} + 3^{- σ})}^{- 1}

merely provides an elementary domain of convergence. A sharper meromorphic continuation—reflecting the true spectral radius

r (P) = 1

and the subdominant bound

ρ_{ess} (P) \leq λ_{LY} < 1

—will be obtained on the refined spaces

B_{tree}

and

B_{tree, σ}

, where the Lasota–Yorke inequality gives quantitative contraction of oscillations between adjacent scales.

Finally, for the constant function

1 (n) \equiv 1

(whenever

1 \in B_{σ, 1}

), the coefficients of

G_{1} (s, z)

are precisely the Collatz Dirichlet series

ζ_{C} (s, k)

defined in (26). Thus the analytic continuation and asymptotic decay of

ζ_{C} (s, k)

as

k \to \infty

are controlled by the spectral properties of P through (30); their exponential decay emerges once the spectral gap on the multiscale tree spaces is established.

4.1. Spectral Reduction and Meromorphic Continuation of Dirichlet Transforms

Recall that the Dirichlet–Ruelle operator L is defined on

D_{σ}

by (23). The intertwining Lemma 3.5 asserts that for all

f \in ℓ_{σ}^{1}

,

D (P f) = L (D f) .

Since

D

is injective on

ℓ_{σ}^{1}

, every eigenpair

(λ, f)

of P with

f \in ℓ_{σ}^{1}

produces an eigenpair

(λ, D f)

of L. Conversely, if

L F = λ F

and

F = D f

lies in the image of

D

, then

P f = λ f

. Hence the point spectra of P on

B_{σ, 1}

and of L on

D_{σ}

coincide on the subspace

D (B_{σ, 1})

. In particular,

ρ (L) \geq ρ (P),

(31)

and any spectral gap or peripheral spectral property of P transfers to the induced action of L on Dirichlet series arising from

B_{σ, 1}

.

We emphasize that equality

σ (L) = σ (P)

is not assumed. The partial correspondence (31) suffices for analytic reduction: the Dirichlet-side continuation of

D (P^{k} f)

reflects the spectral geometry of P.

Mass preservation and spectral gap. Because P only preserves total mass up to a logarithmic factor, we have

\sum_{n \geq 1} (P f) (n) = \sum_{m \geq 1} \frac{f (m)}{m},

so the constant function

1 (n) \equiv 1

is not an eigenvector. Instead, P admits a unique positive invariant density

h \in B_{tree, σ}

and a unique positive invariant functional

ϕ \in B_{tree, σ}^{*}

with

P h = h, ϕ \circ P = ϕ, ϕ (h) = 1 .

(32)

Throughout the paper we work with this Perron–Frobenius normalization (32) and express all spectral decompositions relative to the nonconstant invariant profile h.
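The mass identity displayed above can be verified exactly for finitely supported sequences, since every m contributes f(m)/m to exactly one n (n = m/2 for even m, n = 3m + 1 for odd m). The sketch below uses rational arithmetic; `apply_P` and the sample sequence are assumptions introduced for illustration.

```python
from fractions import Fraction


def apply_P(f):
    """Backward operator on a finitely supported f given as {m: Fraction}."""
    out = {}
    for m, val in f.items():
        if m % 2 == 0:                       # even branch: n = m / 2
            out[m // 2] = out.get(m // 2, 0) + val / m
        if (3 * m + 1) % 6 == 4:             # odd branch: n = 3m + 1 (m odd)
            out[3 * m + 1] = out.get(3 * m + 1, 0) + val / m
    return out


def total_mass(f):
    """Sum of all values of f."""
    return sum(f.values())


def log_weighted_mass(f):
    """Sum over m of f(m) / m."""
    return sum(val / m for m, val in f.items())


if __name__ == "__main__":
    f = {m: Fraction(m % 5 + 1) for m in range(1, 50)}  # hypothetical sample
    print(total_mass(apply_P(f)), log_weighted_mass(f))
```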

Within this framework, the Dirichlet–Ruelle operator L inherits the same dominant eigenvalue 1 and the same spectral gap on the subspace

D (B_{σ, 1})

. The analytic behavior of the Collatz Dirichlet series

ζ_{C} (s, k) = D (P^{k} 1) (s)

is then determined by how

P^{k}

approaches the spectral projector onto the invariant subspace spanned by

1

.

Theorem 4.1

(Spectral reduction and analytic continuation). Let

B_{σ, 1}

be a Banach space of arithmetic functions continuously embedded in

ℓ_{σ}^{1}

such that

P : B_{σ, 1} \to B_{σ, 1}

is quasi-compact and satisfies the mass-preserving normalization (11). Assume further that 1 is a simple eigenvalue of P and that all other spectral values lie in the closed disk

| λ | \leq λ_{LY} < 1

. Then for every

f \in B_{σ, 1}

the Dirichlet transforms

D (P^{k} f) (s)

extend holomorphically to

ℜ (s) > σ

and admit the decomposition

D (P^{k} f) (s) = Π_{1} (f) D (1) (s) + R_{k} (s), | R_{k} (s) | \leq C_{f} (s) λ_{LY}^{k},

(33)

where

Π_{1}

is the spectral projection associated with the eigenvalue 1 and

C_{f} (s)

is locally bounded on

{ℜ (s) > σ}

. In particular, for f with

Π_{1} (f) = 0

, the functions

D (P^{k} f) (s)

decay exponentially in k uniformly on compact subsets of

ℜ (s) > σ

.

When

f = 1

, the same conclusion applies to

ζ_{C} (s, k) = D (P^{k} 1) (s)

, whose exponential stabilization corresponds to convergence toward the invariant density associated with the Collatz operator.

Proof.

By quasi-compactness, the spectrum of P decomposes as

σ (P) = {1} \cup σ_{ess} (P), ρ_{ess} (P) \leq λ_{LY} < 1,

and the Riesz projection

Π_{1} = \frac{1}{2 π i} \oint_{| z - 1 | = ε} {(z I - P)}^{- 1} d z

is a bounded projection onto the one-dimensional invariant subspace spanned by

1

. Then

P^{k} = Π_{1} + N^{k}

, where

∥ N^{k} ∥_{B_{σ, 1}} \leq C λ_{LY}^{k}

for some constant

C > 0

. Applying the Dirichlet transform and using

| D (g) (s) | \leq {∥ g ∥}_{ℓ_{σ}^{1}}

for

ℜ (s) > σ

gives

D (P^{k} f) (s) = D (Π_{1} f) (s) + D (N^{k} f) (s), | D (N^{k} f) (s) | \leq C λ_{LY}^{k} {∥ f ∥}_{B_{σ, 1}} .

Since

Π_{1} f

is a multiple of

1

, we may write

D (Π_{1} f) = Π_{1} (f) D (1)

, yielding (33). Analyticity for

ℜ (s) > σ

follows from absolute convergence and locally uniform bounds. □

This form aligns with the quasi-compactness obtained later on the multiscale tree space

B_{tree, σ}

, where the Lasota–Yorke inequality ensures

ρ_{ess} (P) \leq λ_{LY} < 1

. The exponential term

λ_{LY}^{k}

in (33) corresponds to the essential spectral radius and controls the rate of decay of correlations and Dirichlet coefficients. Under stronger spectral assumptions, the representation can be refined to a meromorphic decomposition in which each isolated eigenvalue

λ_{j}

contributes a term

λ_{j}^{k} D (Π_{j} f)

, generalizing the usual Ruelle–Perron expansion.
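The decomposition P^k = Π_1 + N^k with geometrically decaying remainder can be seen in miniature on a finite matrix. The sketch below uses a toy 2×2 column-stochastic matrix, which is an assumption chosen purely for illustration and is not the Collatz operator; its eigenvalues are 1 and 7/10, so 7/10 plays the role of λ_LY.

```python
from fractions import Fraction as Fr

# Toy column-stochastic matrix (illustrative assumption, not the Collatz operator).
P = [[Fr(9, 10), Fr(2, 10)],
     [Fr(1, 10), Fr(8, 10)]]

# Right eigenvector h (P h = h) and left eigenvector phi (phi P = phi), phi(h) = 1.
h = [Fr(2, 3), Fr(1, 3)]
phi = [Fr(1), Fr(1)]

# Rank-one spectral projection Pi f = phi(f) h.
Pi = [[h[i] * phi[j] for j in range(2)] for i in range(2)]


def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def power(A, k):
    """k-th power of a 2x2 matrix by repeated multiplication."""
    R = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
    for _ in range(k):
        R = matmul(R, A)
    return R


def remainder_size(k):
    """Largest absolute entry of P^k - Pi, i.e. the size of N^k."""
    Pk = power(P, k)
    return max(abs(Pk[i][j] - Pi[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    for k in range(6):
        print(k, remainder_size(k))
```

In this toy case the remainder equals (7/10)^k times a fixed matrix, which mirrors the bound on N^k in the proof.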

4.2. Spectral Criteria for Boundedness on Weighted Spaces

The preceding analysis shows that sufficiently strong spectral control of P on an appropriate Banach space

B_{σ, 1}

forces all Dirichlet data generated by the backward Collatz tree to exhibit exponential stabilization toward the invariant profile. Since P is not contractive on

ℓ_{σ}^{1}

or

B_{σ}

, such behavior can only arise on refined Banach spaces where a genuine spectral gap at the eigenvalue 1 has been established. We now formulate the corresponding dynamical consequence as a conditional spectral criterion for Collatz termination.

Theorem 4.2

(Spectral criterion for Collatz termination). Let P act on a Banach space

B_{σ, 1} \subset ℓ_{σ}^{1}

such that

P (B_{σ, 1}) \subset B_{σ, 1}

and

1 \in B_{σ, 1}

. Assume that P is quasi-compact on

B_{σ, 1}

, that 1 is a simple eigenvalue of P corresponding to the unique positive invariant density h, and that all other spectral values satisfy

σ (P) ∖ {1} \subset {z \in C : | z | \leq λ_{LY} < 1} .

Then every

f \in B_{σ, 1}

admits a decomposition

P^{k} f = Π_{1} f + N^{k} f, {∥ N^{k} f ∥}_{B_{σ, 1}} \leq C λ_{LY}^{k} {∥ f ∥}_{B_{σ, 1}},

where

Π_{1}

is the spectral projection onto

span {h}

. Consequently, there exists no nontrivial invariant or periodic density for the backward Collatz dynamics in

B_{σ, 1}

; the only invariant direction is the positive eigenfunction h. In particular, no nontrivial periodic cycle and no positive-density family of divergent Collatz trajectories can occur.

Proof.

By quasi-compactness, the spectrum of P decomposes as

σ (P) = {1} \cup σ_{ess} (P)

with

ρ_{ess} (P) \leq λ_{LY} < 1

. The associated Riesz projection

Π_{1} = \frac{1}{2 π i} \oint_{| z - 1 | = ε} {(z I - P)}^{- 1} d z

is bounded and satisfies

P Π_{1} = Π_{1} P = Π_{1}

. Since 1 is a simple eigenvalue with positive eigenfunction h, we have

Π_{1} f = (ϕ (f)) h,

where

ϕ

is the corresponding eigenfunctional normalized so that

ϕ (h) = 1

.

Hence the power iterates decompose as

P^{k} = Π_{1} + N^{k}, {∥ N^{k} ∥}_{B_{σ, 1}} \leq C λ_{LY}^{k},

for some constant

C > 0

.

If a nontrivial invariant density

f \in B_{σ, 1}

satisfied

P f = f

, then f would belong to the eigenspace of

λ = 1

. Since this eigenspace is one-dimensional and spanned by h, we must have

f = c h

for some constant c. Thus no additional invariant densities exist beyond

span {h}

.

If a periodic density f satisfied

P^{q} f = f

for some

q > 0

, then f would decompose into eigenvectors of P whose eigenvalues

λ

satisfy

λ^{q} = 1, so that | λ | = 1

. Every such eigenvalue other than 1 is excluded by the spectral gap assumption, so f again lies in the span of h and no genuinely periodic densities exist.

Finally, via the standard correspondence between transfer-operator invariants and dynamical orbits on the Collatz graph, any invariant or periodic density corresponds to either a periodic Collatz cycle or to a positive-density family of non-terminating trajectories. The spectral gap therefore precludes these dynamical behaviors. □

Sections 4.3 and 4.4 construct the multiscale tree Banach space

B_{tree}

and establish a Lasota–Yorke inequality that ensures quasi-compactness of P with an explicit contraction constant

λ_{LY} < 1

in the strong seminorm. Verification of the hypotheses of Theorem 4.2 on

B_{tree, σ}

provides the analytic–spectral bridge: a strict spectral gap for P on

B_{tree, σ}

rules out the spectral signatures associated with any non-terminating Collatz behavior.

4.3. Construction of the Multiscale Tree Space $B_{tree, σ}$

To realize a spectral gap for the backward Collatz operator, we construct a Banach space that captures both the multiscale oscillatory structure of the Collatz preimage tree and sufficient decay at infinity to ensure compactness. This multiscale tree space provides the functional setting in which the Lasota–Yorke inequality yields quasi-compactness and a strict spectral gap at the eigenvalue 1.

For

j \geq 0

define the scale blocks

I_{j} : = [6^{j}, 2 \cdot 6^{j}) \cap N .

(34)

The factor 6 reflects the approximate scale multiplication under the backward map, combining the even branch

m = 2 n

and the odd branch

m = (n - 1) / 3

(defined for

n \equiv 4 (mod 6)

). Fix parameters

0 < α < 1

and

0 < ϑ < 1

. For indices

u, v > 0

, define the scale-sensitive weight

W_{α} (u, v) : = \frac{u v}{| u - v | {(u + v)}^{α}}, u \neq v .

(35)

This weight penalizes small separations between indices, emphasizing local oscillations of f, while the factor

{(u + v)}^{- α}

damps sensitivity at large scales. The geometric coefficient

ϑ^{j}

provides exponential attenuation of oscillations across successive levels of the tree.
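A few lines of Python make the behaviour of the weight concrete. The function name `W` and the sample arguments are assumptions made here; the checks illustrate the symmetry of the weight, its growth for nearby indices, and the dilation rule W_α(2u, 2v) = 2^{1-α} W_α(u, v) that follows from the formula (u v scales by 4, |u - v| by 2, and (u + v)^α by 2^α).

```python
import math


def W(alpha, u, v):
    """Scale-sensitive weight u*v / (|u - v| * (u + v)^alpha), defined for u != v."""
    return (u * v) / (abs(u - v) * (u + v) ** alpha)


if __name__ == "__main__":
    alpha = 0.5
    print(W(alpha, 10, 11), W(alpha, 10, 19))          # close pair vs. distant pair
    print(W(alpha, 20, 22), 2 ** (1 - alpha) * W(alpha, 10, 11))
```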

Definition 4.3

(Multiscale tree seminorm and space). For

f : N \to C

define

{[f]}_{tree} : = \sum_{j \geq 0} ϑ^{j} sup_{\begin{matrix} m, n \in I_{j} \\ m \neq n \end{matrix}} W_{α} (m, n) | f (m) - f (n) | .

(36)

The corresponding Banach space

B_{tree} : = \{f : N \to C : {∥ f ∥}_{1} + {[f]}_{tree} < \infty\}, {∥ f ∥}_{tree} : = {∥ f ∥}_{1} + {[f]}_{tree},

is called the multiscale tree space.

Standard arguments for weighted variation-type seminorms show that

(B_{tree}, ∥ \cdot ∥_{tree})

is complete. The seminorm

{[f]}_{tree}

controls the oscillatory irregularity of f within each scale block

I_{j}

, while the

ℓ^{1}

component controls the overall magnitude. However,

B_{tree}

alone does not impose sufficient decay as

n \to \infty

to guarantee compactness.

Weighted extension. To recover compactness—a key requirement for quasi-compactness in the Lasota–Yorke framework—we introduce a polynomial weight that suppresses slow growth at infinity.

Definition 4.4

(Weighted tree space with block decay). For parameters

0 < α < 1

,

0 < ϑ < 1

,

σ > 1

, and

η > 1

, set

{∥ f ∥}_{σ} : = \sum_{n \geq 1} \frac{| f (n) |}{n^{σ}},

{[f]}_{osc} : = \sum_{j \geq 0} ϑ^{j} sup_{\begin{matrix} m, n \in I_{j} \\ m \neq n \end{matrix}} W_{α} (m, n) | f (m) - f (n) |,

and

{[f]}_{mass} : = sup_{j \geq 0} η^{j} \sum_{n \in I_{j}} \frac{| f (n) |}{n^{σ}} .

Then define

B_{tree, σ} : = \{f : N \to C : {∥ f ∥}_{tree, σ} < \infty\}, {∥ f ∥}_{tree, σ} : = {∥ f ∥}_{σ} + {[f]}_{osc} + {[f]}_{mass} .

The weight

n^{- σ}

enforces global summability of f, while the block–decay term

{[f]}_{mass}

forces the total weighted mass on each block

I_{j}

to decrease geometrically with j, ruling out escape of mass to high levels of the tree. The oscillation term

{[f]}_{osc}

controls the local multiscale variation of f within blocks. Together these components provide the strong–weak structure required for the Lasota–Yorke framework: the strong part

{[f]}_{osc}

contracts under the transfer operator, while the weak part

{∥ f ∥}_{σ} + {[f]}_{mass}

yields compactness of the unit ball and ensures tightness of mass across scales.
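The three components of the norm in Definition 4.4 can be evaluated for a concrete function once the sums are truncated. The sketch below is a finite approximation under stated assumptions: only blocks I_j with j ≤ J are used, the weighted sum runs over n < 2·6^J, and all function names are hypothetical.

```python
import math


def W(alpha, u, v):
    """Tree weight u*v / (|u - v| * (u + v)^alpha)."""
    return (u * v) / (abs(u - v) * (u + v) ** alpha)


def block(j):
    """Scale block I_j = [6^j, 2*6^j) as a range of integers."""
    return range(6 ** j, 2 * 6 ** j)


def osc_seminorm(f, alpha, theta, J):
    """Truncated [f]_osc: sum over j <= J of theta^j * sup over pairs in I_j."""
    total = 0.0
    for j in range(J + 1):
        pts = list(block(j))
        delta = max(
            (W(alpha, m, n) * abs(f(m) - f(n))
             for i, m in enumerate(pts) for n in pts[i + 1:]),
            default=0.0,                     # I_0 = {1} contains no pairs
        )
        total += theta ** j * delta
    return total


def mass_seminorm(f, sigma, eta, J):
    """Truncated [f]_mass: max over j <= J of eta^j * weighted mass on I_j."""
    return max(eta ** j * sum(abs(f(n)) / n ** sigma for n in block(j))
               for j in range(J + 1))


def tree_norm(f, alpha, theta, sigma, eta, J):
    """Truncated norm: weighted l^1 part plus oscillation plus block decay."""
    weak = sum(abs(f(n)) / n ** sigma for n in range(1, 2 * 6 ** J))
    return weak + osc_seminorm(f, alpha, theta, J) + mass_seminorm(f, sigma, eta, J)


if __name__ == "__main__":
    f = lambda n: 1.0 / n
    print(tree_norm(f, alpha=0.5, theta=0.5, sigma=1.5, eta=2.0, J=2))
```

For f(n) = 1/n the product W_α(m, n) |f(m) - f(n)| simplifies to (m + n)^{-α}, which gives a closed form to compare against.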

Lemma 4.5

(Compact embedding). Fix

0 < α < 1

,

0 < ϑ < 1

,

σ > 1

, and

η > 1

. Then the unit ball of

B_{tree, σ}

is relatively compact in

ℓ_{σ}^{1}

.

Proof.

Let

U : = \{f \in B_{tree, σ} : {∥ f ∥}_{tree, σ} \leq 1\} .

(i) Uniform boundedness. For

f \in U

,

{∥ f ∥}_{σ} \leq {∥ f ∥}_{tree, σ} \leq 1,

so

U

is bounded in

ℓ_{σ}^{1}

.

(ii) Uniform tail control. From

{[f]}_{mass} \leq 1

we obtain, for every

j \geq 0

,

\sum_{n \in I_{j}} \frac{| f (n) |}{n^{σ}} \leq η^{- j} .

Fix

ε > 0

. Choose J so large that

\sum_{j > J} η^{- j} < ε .

Pick N so large that all blocks

I_{j}

with

j \leq J

are contained in

{1, \dots, N}

. Then for any

f \in U

,

\sum_{n > N} \frac{| f (n) |}{n^{σ}} = \sum_{j > J} \sum_{n \in I_{j}} \frac{| f (n) |}{n^{σ}} \leq \sum_{j > J} η^{- j} < ε .

So the tails of

U

are uniformly small in

ℓ_{σ}^{1}

.

(iii) Compactness on finite windows. Fix

N \geq 1

and consider the restriction map

f \mapsto (f (1), \dots, f (N))

from

U

into

C^{N}

. For each

1 \leq n \leq N

and

f \in U

,

\frac{| f (n) |}{n^{σ}} \leq {∥ f ∥}_{σ} \leq 1,

hence

| f (n) | \leq n^{σ} \leq N^{σ}

. Therefore

{f |_{{1, \dots, N}} : f \in U}

is bounded in the finite-dimensional space

C^{N}

and so relatively compact there.

(iv) Diagonal extraction and

ℓ_{σ}^{1}

convergence. Let

(f^{(k)}) \subset U

be any sequence. By (iii) we can extract a subsequence

(f^{(k_{ℓ})})

that converges in

C^{1}

at coordinate

n = 1

. Repeating this on coordinates

1, 2

, then

1, 2, 3

, etc., and passing to a diagonal subsequence, we obtain a subsequence, still denoted

(f^{(k_{ℓ})})

, that converges pointwise on all of

N

:

f^{(k_{ℓ})} (n) \to f (n) for each n \geq 1 .

We now show

f^{(k_{ℓ})} \to f

in

ℓ_{σ}^{1}

. Fix

ε > 0

and choose N as in (ii) so that

sup_{g \in U} \sum_{n > N} \frac{| g (n) |}{n^{σ}} < \frac{ε}{3} .

By pointwise convergence on

{1, \dots, N}

and finite-dimensional compactness, there exists L such that for all

ℓ \geq L

,

\sum_{n \leq N} \frac{| f^{(k_{ℓ})} (n) - f (n) |}{n^{σ}} < \frac{ε}{3} .

For the tail, the uniform bound from (ii) passes to the pointwise limit f by Fatou's lemma, so

\sum_{n > N} \frac{| f (n) |}{n^{σ}} \leq sup_{g \in U} \sum_{n > N} \frac{| g (n) |}{n^{σ}} < \frac{ε}{3} .

Therefore, for

ℓ \geq L

,

\sum_{n > N} \frac{| f^{(k_{ℓ})} (n) - f (n) |}{n^{σ}} \leq \sum_{n > N} \frac{| f^{(k_{ℓ})} (n) |}{n^{σ}} + \sum_{n > N} \frac{| f (n) |}{n^{σ}} < \frac{2 ε}{3} .

Combining the head and tail estimates gives

\sum_{n \geq 1} \frac{| f^{(k_{ℓ})} (n) - f (n) |}{n^{σ}} < ε for all ℓ \geq L,

so

f^{(k_{ℓ})} \to f

in

ℓ_{σ}^{1}

. This shows that every sequence in

U

has a convergent subsequence in

ℓ_{σ}^{1}

, hence

U

is relatively compact. □

Remark 4.6.

The additional block-decay seminorm

{[f]}_{mass}

is essential. If one worked only with

{∥ f ∥}_{σ} + {[f]}_{osc}

, the unit ball would not be precompact in

ℓ^{1}

or in

ℓ_{σ}^{1}

: one can construct a sequence of functions supported on disjoint blocks, constant on each block, whose oscillation seminorms vanish and whose

ℓ_{σ}^{1}

norms stay uniformly bounded, while the supports drift to infinity. The factor

η^{j}

in

{[f]}_{mass}

rules out this escape mechanism by forcing the weighted mass in block

I_{j}

to decay at least geometrically in j.
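The escape mechanism described in this remark can be made explicit. The sketch below builds block-constant bumps f_j = c_j 1_{I_j} normalized in the weighted norm; all names and parameter values are assumptions for illustration. Each bump is constant on every block, so its oscillation seminorm vanishes, while distinct bumps stay at weighted distance 2 and the block-decay seminorm grows like η^j.

```python
import math


def block(j):
    """Scale block I_j = [6^j, 2*6^j)."""
    return range(6 ** j, 2 * 6 ** j)


def bump(j, sigma):
    """f_j = c_j * indicator of I_j, scaled so that its weighted l^1 norm is 1."""
    c = 1.0 / sum(n ** -sigma for n in block(j))
    return {n: c for n in block(j)}


def norm_sigma(f, sigma):
    """Weighted l^1 norm of a finitely supported f given as a dict."""
    return sum(abs(v) / n ** sigma for n, v in f.items())


def distance(f, g, sigma):
    """Weighted l^1 distance between two finitely supported functions."""
    keys = set(f) | set(g)
    return sum(abs(f.get(n, 0.0) - g.get(n, 0.0)) / n ** sigma for n in keys)


def mass_seminorm(f, sigma, eta, J):
    """Truncated [f]_mass over blocks j <= J."""
    return max(eta ** j * sum(abs(f.get(n, 0.0)) / n ** sigma for n in block(j))
               for j in range(J + 1))


if __name__ == "__main__":
    sigma, eta = 1.5, 2.0
    for j in range(1, 5):
        fj = bump(j, sigma)
        print(j, norm_sigma(fj, sigma), mass_seminorm(fj, sigma, eta, 4))
```

Since the pairwise distances do not shrink, no subsequence of the bumps converges, which is the failure of precompactness that the block-decay term is designed to exclude.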

The space

B_{tree, σ}

thus provides the natural functional environment for the Lasota–Yorke inequality. Its compact embedding into

ℓ_{σ}^{1}

ensures, in combination with the Lasota–Yorke inequality of Section 4.4, that the essential spectral radius of P on

B_{tree, σ}

is strictly smaller than its spectral radius, a prerequisite for establishing a genuine spectral gap. The strong seminorm captures multiscale regularity across the Collatz tree, while the weighted

ℓ^{1}

norm supplies the compactness that underlies the spectral analysis of the backward transfer operator.

4.4. Lasota–Yorke Contraction on the Multiscale Tree Space

Recall from (10) that

(P f) (n) = \frac{f (2 n)}{2 n} + 1_{{n \equiv 4 (6)}} \frac{f (\frac{n - 1}{3})}{(n - 1) / 3} .

It is convenient to split P into its even and odd components:

(P_{even} f) (n) : = \frac{f (2 n)}{2 n}, (P_{odd} f) (n) : = 1_{{n \equiv 4 (6)}} \frac{f (\frac{n - 1}{3})}{(n - 1) / 3},

(37)

so that

P = P_{even} + P_{odd}

.

From the

ℓ^{1}

estimates of Section 2, both branches are bounded on

ℓ^{1}

, hence on

B_{tree}

. The Lasota–Yorke inequality arises from the fact that

P_{even}

is strongly contracting in the tree seminorm, while

P_{odd}

is a controlled perturbation whose contribution is damped by the multiscale factor

ϑ^{j}

.

4.4.1. Even-Branch Distortion and Contraction Estimates

We first record the even-branch estimate.

Lemma 4.7

(Even branch contraction on

B_{tree, σ}

). Fix

0 < α < 1

,

0 < ϑ < 1

,

σ > 1

, and

η > 1

. Let

P_{even} f (n) = f (2 n) / (2 n)

. There exist constants

λ_{even} = 2^{- (2 - α)} ϑ^{- 1} and C_{even} > 0,

depending only on

α, ϑ, σ, η

, such that for all

f \in B_{tree, σ}

,

{[P_{even} f]}_{osc} \leq λ_{even} {[f]}_{osc} + C_{even} ({∥ f ∥}_{σ} + {[f]}_{mass}) .

(38)

In particular, if

ϑ > 2^{- (2 - α)}

, then

0 < λ_{even} < 1

, and the even branch is a strict contraction in the oscillation seminorm up to a controlled weak error.

Proof.

Write

{[f]}_{osc} = \sum_{j \geq 0} ϑ^{j} δ_{j} (f), δ_{j} (f) : = sup_{\begin{matrix} m \neq n \in I_{j} \end{matrix}} W_{α} (m, n) | f (m) - f (n) | .

Step 1: Decomposition and the

D_{1}

/

D_{2}

split. Fix

j \geq 0

and

u, v \in I_{j} = [2^{j}, 2^{j + 1})

,

u \neq v

. We write

(P_{even} f) (u) - (P_{even} f) (v) = \frac{f (2 u)}{2 u} - \frac{f (2 v)}{2 v} = \frac{f (2 u) - f (2 v)}{2 u} + f (2 v) (\frac{1}{2 u} - \frac{1}{2 v}) = : D_{1} (u, v) + D_{2} (u, v) .

We estimate the contributions of

D_{1}

and

D_{2}

separately.

Step 2: Oscillatory term

D_{1}

: corrected contraction. Using the scaling of

W_{α}

, we have

W_{α} (2 u, 2 v) = 2^{1 - α} W_{α} (u, v), W_{α} (u, v) = 2^{- (1 - α)} W_{α} (2 u, 2 v) .

Thus

W_{α} (u, v) | D_{1} (u, v) | = W_{α} (u, v) \frac{| f (2 u) - f (2 v) |}{2 u} = 2^{- (1 - α)} \frac{W_{α} (2 u, 2 v)}{2 u} | f (2 u) - f (2 v) | .

Since

u \in I_{j} = [2^{j}, 2^{j + 1})

, we have

u \geq 2^{j}

and hence

\frac{1}{2 u} \leq \frac{1}{2 \cdot 2^{j}} = 2^{- (j + 1)} .

Therefore

W_{α} (u, v) | D_{1} (u, v) | \leq 2^{- (1 - α)} 2^{- (j + 1)} W_{α} (2 u, 2 v) | f (2 u) - f (2 v) | .

Now

2 u, 2 v \in I_{j + 1} = [2^{j + 1}, 2^{j + 2})

, so

W_{α} (2 u, 2 v) | f (2 u) - f (2 v) | \leq δ_{j + 1} (f) .

Taking the supremum over

u, v \in I_{j}

,

δ_{j} (P_{even} f; D_{1}) : = sup_{u \neq v \in I_{j}} W_{α} (u, v) | D_{1} (u, v) | \leq 2^{- (1 - α)} 2^{- (j + 1)} δ_{j + 1} (f) .

Multiply by

ϑ^{j}

and sum over

j \geq 0

:

\sum_{j \geq 0} ϑ^{j} δ_{j} (P_{even} f; D_{1}) \leq 2^{- (1 - α)} 2^{- 1} \sum_{j \geq 0} {(ϑ 2^{- 1})}^{j} δ_{j + 1} (f) .

Change index

k = j + 1

, so

j = k - 1

:

= 2^{- (2 - α)} \sum_{k \geq 1} {(ϑ 2^{- 1})}^{k - 1} δ_{k} (f) .

Since

{(ϑ 2^{- 1})}^{k - 1} = ϑ^{k - 1} 2^{- (k - 1)} \leq ϑ^{k - 1}

and

\sum_{k \geq 1} ϑ^{k - 1} δ_{k} (f) = ϑ^{- 1} \sum_{k \geq 1} ϑ^{k} δ_{k} (f) \leq ϑ^{- 1} {[f]}_{osc},

we obtain

\sum_{j \geq 0} ϑ^{j} δ_{j} (P_{even} f; D_{1}) \leq 2^{- (2 - α)} ϑ^{- 1} {[f]}_{osc} .

Thus

{[P_{even} f]}_{osc}^{(D_{1})} \leq λ_{even} {[f]}_{osc}, λ_{even} : = 2^{- (2 - α)} ϑ^{- 1} .

This is the corrected contraction constant; in particular

λ_{even} < 1

whenever

ϑ > 2^{- (2 - α)}

.

Step 3: Denominator term

D_{2}

: weak control. For

u > v

(the case

u < v

is symmetric),

|\frac{1}{2 u} - \frac{1}{2 v}| = \frac{| u - v |}{2 u v},

so

| D_{2} (u, v) | = | f (2 v) | \frac{| u - v |}{2 u v} .

Using

W_{α} (u, v) = \frac{u v}{| u - v | {(u + v)}^{α}},

we get the exact simplification

W_{α} (u, v) | D_{2} (u, v) | = \frac{| f (2 v) |}{2 {(u + v)}^{α}} .

For

u, v \in I_{j}

we have

u + v \geq 2 \cdot 2^{j} = 2^{j + 1}

, hence

W_{α} (u, v) | D_{2} (u, v) | \leq 2^{- (1 + α)} 2^{- α j} | f (2 v) | = : C_{α} 2^{- α j} | f (2 v) | .

Taking the supremum over

u, v \in I_{j}

gives

δ_{j} (P_{even} f; D_{2}) \leq C_{α} 2^{- α j} sup_{w \in I_{j + 1}} | f (w) | .

To bound

{sup}_{w \in I_{j + 1}} | f (w) |

via the weak norm, observe that for

w \in I_{j + 1}

,

| f (w) | = w^{σ} \frac{| f (w) |}{w^{σ}} \leq 2^{σ (j + 2)} \sum_{n \in I_{j + 1}} \frac{| f (n) |}{n^{σ}} .

By the definition of

{[f]}_{mass}

,

\sum_{n \in I_{j + 1}} \frac{| f (n) |}{n^{σ}} \leq η^{- (j + 1)} {[f]}_{mass},

so

sup_{w \in I_{j + 1}} | f (w) | \leq 2^{σ (j + 2)} η^{- (j + 1)} {[f]}_{mass} = 2^{2 σ} {(2^{σ} η^{- 1})}^{j} η^{- 1} {[f]}_{mass} .

Combining,

δ_{j} (P_{even} f; D_{2}) \leq C_{α, σ} {(2^{- α} 2^{σ} η^{- 1})}^{j} η^{- 1} {[f]}_{mass} = C_{α, σ} {(2^{σ - α} η^{- 1})}^{j} η^{- 1} {[f]}_{mass} .

Multiplying by

ϑ^{j}

and summing in j,

\sum_{j \geq 0} ϑ^{j} δ_{j} (P_{even} f; D_{2}) \leq C_{α, σ} η^{- 1} \sum_{j \geq 0} {(ϑ 2^{σ - α} η^{- 1})}^{j} {[f]}_{mass} .

Choosing

η > 1

so that

ϑ 2^{σ - α} η^{- 1} < 1,

the geometric series converges and we obtain

{[P_{even} f]}_{osc}^{(D_{2})} \leq C_{even}^{(2)} ({∥ f ∥}_{σ} + {[f]}_{mass}),

for some constant

C_{even}^{(2)} > 0

depending only on

α, ϑ, σ, η

(we absorb

{∥ f ∥}_{σ}

for later convenience). Adding the contributions of

D_{1}

and

D_{2}

,

{[P_{even} f]}_{osc} \leq {[P_{even} f]}_{osc}^{(D_{1})} + {[P_{even} f]}_{osc}^{(D_{2})} \leq λ_{even} {[f]}_{osc} + C_{even} ({∥ f ∥}_{σ} + {[f]}_{mass}),

with

λ_{even} = 2^{- (2 - α)} ϑ^{- 1}

and

C_{even} = C_{even}^{(2)}

, which proves (38). □
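The algebra behind the D_1/D_2 split and the exact simplification of the weighted D_2 term can be checked numerically. The sketch below is illustrative: the names `split_even` and `lambda_even` and the sample function f are assumptions introduced here.

```python
import math


def W(alpha, u, v):
    """Tree weight u*v / (|u - v| * (u + v)^alpha)."""
    return (u * v) / (abs(u - v) * (u + v) ** alpha)


def P_even(f, n):
    """Even branch: (P_even f)(n) = f(2n) / (2n)."""
    return f(2 * n) / (2 * n)


def split_even(f, u, v):
    """Return (D_1, D_2) with (P_even f)(u) - (P_even f)(v) = D_1 + D_2."""
    d1 = (f(2 * u) - f(2 * v)) / (2 * u)
    d2 = f(2 * v) * (1 / (2 * u) - 1 / (2 * v))
    return d1, d2


def lambda_even(alpha, theta):
    """Contraction constant 2^{-(2 - alpha)} / theta from Lemma 4.7."""
    return 2 ** (-(2 - alpha)) / theta


if __name__ == "__main__":
    f = lambda n: math.sin(n) / n            # hypothetical test function
    u, v, alpha = 17, 29, 0.5
    d1, d2 = split_even(f, u, v)
    print(d1 + d2, P_even(f, u) - P_even(f, v))
    print(W(alpha, u, v) * abs(d2), abs(f(2 * v)) / (2 * (u + v) ** alpha))
    print(lambda_even(0.5, 0.5), lambda_even(0.5, 0.3))
```

The last line shows both regimes: the constant is below 1 when ϑ exceeds 2^{-(2-α)} and above 1 otherwise.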

The odd branch requires more care because it shifts indices from n to

(n - 1) / 3

and only acts on the congruence class

n \equiv 4 (mod 6)

. Its effect is nonetheless small once weighted by

ϑ^{j}

.

4.4.2. Odd-Branch Distortion and Contraction Estimates

Lemma 4.8

(Odd–branch distortion on scale blocks). Let

0 < α < 1

and

W_{α} (u, v)

be the tree weight (35). If

n \equiv 4 (mod 6)

and

n \in I_{j} = [6^{j}, 2 \cdot 6^{j})

, then the odd preimage

m = (n - 1) / 3

satisfies

m ≍ 6^{j - 1}

. Moreover, there exists a constant

C_{α} > 0

depending only on α such that for every pair

n_{1}, n_{2} \in I_{j}

lying on the same ray and

m_{i} = (n_{i} - 1) / 3

,

W_{α} (m_{1}, m_{2}) \leq C_{α} W_{α} (n_{1}, n_{2}) .

(39)

Proof.

Fix

j \geq 1

and

n \in I_{j} = [6^{j}, 2 \cdot 6^{j})

with

n \equiv 4 (mod 6)

, so

m = (n - 1) / 3

. Then

\frac{6^{j} - 1}{3} \leq m < \frac{2 \cdot 6^{j} - 1}{3},

hence for every such j we have

6^{j - 1} \leq m \leq 4 \cdot 6^{j - 1},

so

m ≍ 6^{j - 1}

as claimed.

Now take

n_{1}, n_{2} \in I_{j}

on the same ray and define

m_{i} = \frac{n_{i} - 1}{3}, i = 1, 2 .

Then

n_{i} = 3 m_{i} + 1, | n_{1} - n_{2} | = 3 | m_{1} - m_{2} | .

We estimate the ratio

R : = \frac{W_{α} (m_{1}, m_{2})}{W_{α} (n_{1}, n_{2})} = \frac{m_{1} m_{2}}{n_{1} n_{2}} \cdot \frac{| n_{1} - n_{2} |}{| m_{1} - m_{2} |} \cdot \frac{{(n_{1} + n_{2})}^{α}}{{(m_{1} + m_{2})}^{α}} .

Product factor. From

6^{j - 1} \leq m_{i} \leq 4 \cdot 6^{j - 1}

and

6^{j} \leq n_{i} \leq 2 \cdot 6^{j}

we get

m_{1} m_{2} \leq {(4 \cdot 6^{j - 1})}^{2} = 16 \cdot 6^{2 (j - 1)}, n_{1} n_{2} \geq 6^{2 j} .

Thus

\frac{m_{1} m_{2}}{n_{1} n_{2}} \leq \frac{16 \cdot 6^{2 (j - 1)}}{6^{2 j}} = \frac{16}{6^{2}} = \frac{4}{9} .

(40)

Difference factor. We have the exact identity

\frac{| n_{1} - n_{2} |}{| m_{1} - m_{2} |} = 3 .

Sum factor. Again using the scale bounds, for

m_{i}

we have

2 \cdot 6^{j - 1} \leq m_{1} + m_{2} \leq 8 \cdot 6^{j - 1},

and for

n_{i}

,

2 \cdot 6^{j} \leq n_{1} + n_{2} \leq 4 \cdot 6^{j} .

Therefore

\frac{n_{1} + n_{2}}{m_{1} + m_{2}} \leq \frac{4 \cdot 6^{j}}{2 \cdot 6^{j - 1}} = \frac{4 \cdot 6}{2} = 12,

so

\frac{{(n_{1} + n_{2})}^{α}}{{(m_{1} + m_{2})}^{α}} \leq 12^{α} .

(41)

Combining (40), the difference factor, and (41), we obtain

R \leq \frac{4}{9} \cdot 3 \cdot 12^{α} = \frac{4}{3} 12^{α} .

Thus we may take

C_{α} : = \frac{4}{3} 12^{α},

which depends only on

α

and is finite for all

0 < α < 1

. This yields

W_{α} (m_{1}, m_{2}) \leq C_{α} W_{α} (n_{1}, n_{2}),

which is (39). □
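The distortion bound (39) can be probed by brute force on small blocks. The sketch below checks every pair of points n ≡ 4 (mod 6) in I_j; this ignores the "same ray" restriction and therefore tests a larger family of pairs than the lemma requires. The names `W` and `worst_ratio` are assumptions made here.

```python
from itertools import combinations


def W(alpha, u, v):
    """Tree weight u*v / (|u - v| * (u + v)^alpha)."""
    return (u * v) / (abs(u - v) * (u + v) ** alpha)


def worst_ratio(alpha, j):
    """Largest W(m1, m2) / W(n1, n2) over pairs n1 != n2 in I_j with n = 4 mod 6,
    where m = (n - 1) / 3 is the odd preimage."""
    pts = [n for n in range(6 ** j, 2 * 6 ** j) if n % 6 == 4]
    return max(
        (W(alpha, (a - 1) // 3, (b - 1) // 3) / W(alpha, a, b)
         for a, b in combinations(pts, 2)),
        default=0.0,                         # blocks with fewer than two such points
    )


if __name__ == "__main__":
    for alpha in (0.25, 0.5, 0.75):
        for j in (2, 3):
            print(alpha, j, worst_ratio(alpha, j), (4 / 3) * 12 ** alpha)
```

A finite search of this kind cannot replace the proof, but it gives a quick consistency check of the constant C_α = (4/3)·12^α on the first few scales.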

Lemma 4.9

(Odd branch on

B_{tree, σ}

). Let

0 < α < 1

,

0 < ϑ < 1

,

σ > 1

, and

η > 1

. Consider the odd-branch operator

(P_{odd} f) (n) = 1_{{n \equiv 4 (6)}} \frac{f (\frac{n - 1}{3})}{(n - 1) / 3} .

Then there exist constants

0 < λ_{odd} < 1

and

C_{odd} > 0

, depending only on

α, ϑ, σ, η

, such that for all

f \in B_{tree, σ}

one has

{[P_{odd} f]}_{osc} \leq λ_{odd} {[f]}_{osc} + C_{odd} ({∥ f ∥}_{σ} + {[f]}_{mass}) .

(42)

In particular, for

α = \frac{1}{2}

one can take

λ_{odd} \leq C_{0} ϑ,

where

C_{0} = \frac{16}{3^{3 / 2}}

is the odd-branch distortion constant from Lemma 4.14.

Proof.

Write

δ_{j} (g) : = sup_{m \neq n \in I_{j}} W_{α} (m, n) | g (m) - g (n) |, {[g]}_{osc} = \sum_{j \geq 0} ϑ^{j} δ_{j} (g),

so that

{[P_{odd} f]}_{osc} = \sum_{j \geq 0} ϑ^{j} δ_{j} (P_{odd} f)

. We bound

δ_{j} (P_{odd} f)

in terms of

δ_{k} (f)

at nearby scales and the weak/mass terms.

Fix

j \geq 0

and

m, n \in I_{j} = [6^{j}, 2 \cdot 6^{j})

with

m \neq n

. We split into three cases.

Case 1: neither m nor n is

4 (mod 6)

. Then

P_{odd} f (m) = P_{odd} f (n) = 0

, so this pair contributes nothing to

δ_{j} (P_{odd} f)

.

Case 2: exactly one of

m, n

is

4 (mod 6)

. Without loss of generality, assume

m \equiv 4 (mod 6)

and

n \not\equiv 4 (mod 6)

, and set

k : = (m - 1) / 3

. Then

P_{odd} f (m) - P_{odd} f (n) = \frac{f (k)}{k},

and therefore

W_{α} (m, n) |P_{odd} f (m) - P_{odd} f (n)| = W_{α} (m, n) \frac{| f (k) |}{k} .

As in the previous estimates for the tree seminorm, one has

W_{α} (m, n) ≪ 6^{(2 - α) j}

and

k ≍ 6^{j - 1}

, so

W_{α} (m, n) \frac{| f (k) |}{k} ≪ 6^{(1 - α) j} | f (k) | .

Multiplying by

ϑ^{j}

and summing over j, one obtains a contribution bounded by

\sum_{j \geq 0} ϑ^{j} sup_{\begin{matrix} m, n \in I_{j} \\ exactly one \equiv 4 (6) \end{matrix}} W_{α} (m, n) |P_{odd} f (m) - P_{odd} f (n)| \leq C_{odd, 1} ({∥ f ∥}_{σ} + {[f]}_{mass}),

for a suitable constant

C_{odd, 1}

depending on

α, ϑ, σ, η

. Thus Case 2 contributes only to the weak/mass error term in (42).

Case 3: both m and n are

4 (mod 6)

. Set

m^{'} = \frac{m - 1}{3}, n^{'} = \frac{n - 1}{3},

so that

m^{'}, n^{'} \in I_{j - 1}

and

P_{odd} f (m) = \frac{f (m^{'})}{m^{'}}, P_{odd} f (n) = \frac{f (n^{'})}{n^{'}} .

We decompose

\frac{f (m^{'})}{m^{'}} - \frac{f (n^{'})}{n^{'}} = \frac{f (m^{'}) - f (n^{'})}{m^{'}} + f (n^{'}) (\frac{1}{m^{'}} - \frac{1}{n^{'}}) = : D_{1} + D_{2} .

Case 3a: the

D_{1}

term (oscillatory contribution). Here

D_{1} = \frac{f (m^{'}) - f (n^{'})}{m^{'}},

and we want to control

W_{α} (m, n) | D_{1} | = W_{α} (m, n) \frac{| f (m^{'}) - f (n^{'}) |}{m^{'}} .

For

α = \frac{1}{2}

, Lemma 4.14 gives the two-sided distortion bound

\frac{1}{C_{0}} \leq \frac{W_{1 / 2} (m, n)}{W_{1 / 2} (m^{'}, n^{'})} \leq C_{0}, C_{0} = \frac{16}{3^{3 / 2}} .

Using the upper bound, we obtain

W_{1 / 2} (m, n) | D_{1} | \leq C_{0} W_{1 / 2} (m^{'}, n^{'}) \frac{| f (m^{'}) - f (n^{'}) |}{m^{'}} .

Since

m^{'} ≍ 6^{j - 1}

for

m \in I_{j}

, we have

1 / m^{'} ≪ 6^{- (j - 1)}

, so

W_{1 / 2} (m, n) | D_{1} | ≪ C_{0} 6^{- (j - 1)} W_{1 / 2} (m^{'}, n^{'}) | f (m^{'}) - f (n^{'}) | .

Taking the supremum over

m, n \in I_{j}

with

m \equiv n \equiv 4 (mod 6)

gives

δ_{j} (P_{odd} f; D_{1}) \leq C_{0}^{'} 6^{- (j - 1)} δ_{j - 1} (f),

for some constant

C_{0}^{'}

comparable to

C_{0}

.

Multiplying by

ϑ^{j}

and summing over

j \geq 1

yields

\sum_{j \geq 1} ϑ^{j} δ_{j} (P_{odd} f; D_{1}) \leq C_{0}^{'} ϑ \sum_{k \geq 0} {(ϑ / 6)}^{k} δ_{k} (f) \leq C_{0}^{''} ϑ {[f]}_{osc},

with

C_{0}^{''}

depending only on

C_{0}

and

ϑ

(and uniformly bounded as long as

ϑ < 6

). Thus the

D_{1}

-part of Case 3 contributes a term of the form

λ_{odd}^{(1)} {[f]}_{osc}

with

λ_{odd}^{(1)} \leq C_{0}^{''} ϑ

.
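The re-indexing k = j - 1 behind the last summation can be checked on a truncated sum. In the sketch below the list delta stands in for the block quantities δ_k(f). Its random values, the truncation length, and the choice of 1 for the constant in front are illustrative assumptions.

```python
import random


def shifted_sum(delta, theta):
    """sum_{j>=1} theta^j * 6^{-(j-1)} * delta[j-1], truncated to len(delta) terms."""
    return sum(theta**j * 6.0 ** (-(j - 1)) * delta[j - 1]
               for j in range(1, len(delta) + 1))


def reindexed_sum(delta, theta):
    """theta * sum_{k>=0} (theta/6)^k * delta[k], with the same truncation."""
    return theta * sum((theta / 6.0) ** k * d for k, d in enumerate(delta))


def osc_seminorm(delta, theta):
    """[f]_osc = sum_{k>=0} theta^k * delta[k], with the same truncation."""
    return sum(theta**k * d for k, d in enumerate(delta))


random.seed(0)
theta = 0.05
delta = [random.uniform(0.0, 10.0) for _ in range(30)]

lhs = shifted_sum(delta, theta)
mid = reindexed_sum(delta, theta)
rhs = theta * osc_seminorm(delta, theta)
print(lhs, mid, rhs)
```

The first two values agree up to rounding, and the third dominates them because each factor (ϑ/6)^k is at most ϑ^k.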

Case 3b: the

D_{2}

term (denominator contribution). For

D_{2} = f (n^{'}) (\frac{1}{m^{'}} - \frac{1}{n^{'}})

one uses again the scale relations

m^{'}, n^{'} ≍ 6^{j - 1}

and the fact that

| m^{'} - n^{'} | \leq C 6^{j - 1}

to deduce

W_{α} (m, n) | D_{2} | ≪ 6^{- α j} | f (n^{'}) | .

As in the even-branch analysis, the amplitude

| f (n^{'}) |

on blocks

I_{j - 1}

is controlled by a combination of the weak norm

{∥ f ∥}_{σ}

and the mass seminorm

{[f]}_{mass}

, and the geometric factor

6^{- α j}

ensures convergence of the sum over j. Thus there exists a constant

C_{odd, 2} > 0

such that

\sum_{j \geq 0} ϑ^{j} δ_{j} (P_{odd} f; D_{2}) \leq C_{odd, 2} ({∥ f ∥}_{σ} + {[f]}_{mass}) .

Conclusion. Combining Cases 1–3, we obtain

{[P_{odd} f]}_{osc} = \sum_{j \geq 0} ϑ^{j} δ_{j} (P_{odd} f) \leq λ_{odd} {[f]}_{osc} + C_{odd} ({∥ f ∥}_{σ} + {[f]}_{mass}),

with

λ_{odd} : = λ_{odd}^{(1)} \leq C_{0}^{''} ϑ, C_{odd} : = C_{odd, 1} + C_{odd, 2},

depending only on

α, ϑ, σ, η

. For

α = \frac{1}{2}

the explicit distortion constant

C_{0} = 16 / 3^{3 / 2}

from Lemma 4.14 can be absorbed into

C_{0}^{''}

, and in particular one may choose

λ_{odd} \leq C_{0} ϑ

after adjusting constants. Choosing

ϑ > 0

small enough so that

λ_{odd} < 1

gives the desired Lasota–Yorke contraction. □

4.5. Derivation of the Global Lasota–Yorke Inequality

Lemma 4.10

(Invariance and boundedness on

B_{tree, σ}

). Let

0 < α < 1

,

σ > 1

, and

0 < ϑ < 1

satisfy

ϑ 6^{1 - α + σ} < 1 .

(43)

Then the backward Collatz transfer operator P maps

B_{tree, σ}

into itself and is bounded: there exists

C > 0

such that

{∥ P f ∥}_{tree, σ} \leq C {∥ f ∥}_{tree, σ} for all f \in B_{tree, σ},

where

{∥ f ∥}_{tree, σ} : = {[f]}_{tree} + {∥ f ∥}_{σ} .

Proof.

Recall the even/odd decomposition

(P f) (n) = (P_{even} f) (n) + (P_{odd} f) (n) = \frac{f (2 n)}{2 n} + 1_{{n \equiv 4 (6)}} \frac{f (\frac{n - 1}{3})}{(n - 1) / 3} .

Step 1: Weighted

ℓ_{σ}^{1}

bound. The weak norm estimate is given by the same computation as in the weighted

ℓ_{σ}^{1}

lemma: for all

f \in B_{tree, σ}

,

{∥ P f ∥}_{σ} = {∥ P_{even} f + P_{odd} f ∥}_{σ} \leq (2^{σ} + 3^{- σ}) {∥ f ∥}_{σ} .

(44)

This uses only the change of variables

m = 2 n

for the even branch and

m = (n - 1) / 3

for the odd branch.
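A quick numerical sanity check of (44) on finitely supported functions is sketched below. It assumes that the weighted norm is the sum of |f(n)| n^{-σ} over n; this normalization is an assumption made for illustration (it is consistent with the pointwise bound |h(n)| ≤ n^σ ‖h‖_σ used in Section 5), and the function names are hypothetical.

```python
import random


def P(f, n):
    """Backward transfer operator at n: even branch plus odd branch (n ≡ 4 mod 6)."""
    value = f.get(2 * n, 0.0) / (2 * n)
    if n % 6 == 4:
        m = (n - 1) // 3
        value += f.get(m, 0.0) / m
    return value


def weighted_norm(f, sigma):
    """Assumed weak norm: sum over the support of |f(n)| * n^(-sigma)."""
    return sum(abs(v) * n ** (-sigma) for n, v in f.items())


def apply_P(f):
    """P f as a dict; P f can be nonzero only at n = m/2 or n = 3m + 1, m in supp f."""
    candidates = set()
    for m in f:
        if m % 2 == 0:
            candidates.add(m // 2)
        candidates.add(3 * m + 1)
    return {n: P(f, n) for n in candidates}


random.seed(1)
sigma = 1.1
f = {n: random.uniform(-1.0, 1.0) for n in random.sample(range(1, 2000), 200)}

lhs = weighted_norm(apply_P(f), sigma)
rhs = (2**sigma + 3 ** (-sigma)) * weighted_norm(f, sigma)
print(lhs, "<=", rhs)
```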

Step 2: Tree seminorm bound via the even/odd lemmas. By subadditivity of

{[\cdot]}_{tree}

,

{[P f]}_{tree} \leq {[P_{even} f]}_{tree} + {[P_{odd} f]}_{tree} .

The even-branch lemma (even branch on

B_{tree, σ}

) states that under the admissibility condition (43) there exists

C_{even} > 0

, depending only on

α, σ, ϑ

, such that

{[P_{even} f]}_{tree} \leq C_{even} {∥ f ∥}_{σ} for all f \in B_{tree, σ} .

(45)

The proof is obtained by estimating each block seminorm

Δ_{j} (P_{even} f)

and summing the geometric series

\sum_{j \geq 0} {(ϑ 6^{1 - α + σ})}^{j}

, which converges precisely when (43) holds.

Similarly, the odd-branch lemma (odd branch on

B_{tree, σ}

) gives a constant

C_{odd} > 0

, depending only on

α, σ, ϑ

, such that

{[P_{odd} f]}_{tree} \leq C_{odd} {∥ f ∥}_{σ} for all f \in B_{tree, σ},

(46)

again using the same geometric factor

ϑ 6^{1 - α + σ}

and the admissibility (43).

Combining (45) and (46) we obtain

{[P f]}_{tree} \leq (C_{even} + C_{odd}) {∥ f ∥}_{σ} .

(47)

Step 3: Combine strong and weak parts. By definition,

{∥ P f ∥}_{tree, σ} = {[P f]}_{tree} + {∥ P f ∥}_{σ} .

Using (44) and (47),

{∥ P f ∥}_{tree, σ} \leq (C_{even} + C_{odd}) {∥ f ∥}_{σ} + (2^{σ} + 3^{- σ}) {∥ f ∥}_{σ} .

Since

{∥ f ∥}_{tree, σ} = {[f]}_{tree} + {∥ f ∥}_{σ} \geq {∥ f ∥}_{σ}

, we have

{∥ f ∥}_{σ} \leq {∥ f ∥}_{tree, σ},

and therefore

{∥ P f ∥}_{tree, σ} \leq (C_{even} + C_{odd} + 2^{σ} + 3^{- σ}) {∥ f ∥}_{tree, σ} .

Thus P is bounded on

B_{tree, σ}

with operator norm at most

C : = C_{even} + C_{odd} + 2^{σ} + 3^{- σ},

which depends only on

α, σ, ϑ

. □

Proposition 4.11

(Lasota–Yorke inequality on

B_{tree, σ}

). Let

0 < α < 1

,

0 < ϑ < 1

, and

σ > 1

satisfy the admissibility condition

ϑ 6^{1 - α + σ} < 1 .

(48)

Then there exists a constant

C_{LY, σ} > 0

such that for all

f \in B_{tree, σ}

,

{[P f]}_{tree} \leq C_{LY, σ} {∥ f ∥}_{σ} .

(49)

In particular, for every

n \geq 1

one has

{[P^{n} f]}_{tree} \leq C_{LY, σ} {∥ P^{n - 1} f ∥}_{σ} \leq C_{LY, σ} {(2^{σ} + 3^{- σ})}^{n - 1} {∥ f ∥}_{σ}, f \in B_{tree, σ},

(50)

so each iterate

P^{n}

is smoothing in the strong seminorm

{[\cdot]}_{tree}

with a quantitative bound in terms of the weak norm

{∥ \cdot ∥}_{σ}

of the previous iterate.

Proof.

We use the even/odd decomposition

P = P_{even} + P_{odd}

. From the estimates established in the proofs of the even and odd branch lemmas, and under the admissibility condition (48), there exist constants

C_{even}, C_{odd} > 0

such that for all

f \in B_{tree, σ}

,

{[P_{even} f]}_{tree} \leq C_{even} {∥ f ∥}_{σ}, {[P_{odd} f]}_{tree} \leq C_{odd} {∥ f ∥}_{σ} .

(51)

The convergence of the geometric series

\sum_{j \geq 0} {(ϑ 6^{1 - α + σ})}^{j}

is precisely what guarantees that the constants

C_{even}

and

C_{odd}

are finite.

By subadditivity of

{[\cdot]}_{tree}

,

{[P f]}_{tree} = {[P_{even} f + P_{odd} f]}_{tree} \leq {[P_{even} f]}_{tree} + {[P_{odd} f]}_{tree} \leq (C_{even} + C_{odd}) {∥ f ∥}_{σ} .

Setting

C_{LY, σ} : = C_{even} + C_{odd}

gives (49).

For the iterates, apply (49) with f replaced by

P^{n - 1} f

:

{[P^{n} f]}_{tree} = {[P (P^{n - 1} f)]}_{tree} \leq C_{LY, σ} {∥ P^{n - 1} f ∥}_{σ} .

From the weighted

ℓ_{σ}^{1}

bound

{∥ P f ∥}_{σ} \leq (2^{σ} + 3^{- σ}) {∥ f ∥}_{σ}

(see (44)), iterating gives

{∥ P^{n - 1} f ∥}_{σ} \leq {(2^{σ} + 3^{- σ})}^{n - 1} {∥ f ∥}_{σ}

, which yields (50). □

Remark 4.12

(Parameter window). The lift from

{∥ \cdot ∥}_{1}

to

{∥ \cdot ∥}_{σ}

in the remainder terms occurs through the same geometric factor that appears in the proofs of the even and odd branch bounds, namely

ϑ 6^{1 - α + σ}

. The only requirement is the admissibility condition

ϑ 6^{1 - α + σ} < 1,

which ensures convergence of

\sum_{j \geq 0} {(ϑ 6^{1 - α + σ})}^{j}

and hence finiteness of

C_{even}

and

C_{odd}

.

For fixed α and σ this simply means

0 < ϑ < 6^{- (1 - α + σ)} .

A convenient concrete choice (used later) is

(α, ϑ, σ) = (\frac{1}{2}, \frac{1}{20}, 1 + ε) with any sufficiently small ε > 0,

since then

ϑ 6^{1 - α + σ} = \frac{1}{20} 6^{\frac{3}{2} + ε} < 1

for all ε in a small interval

(0, ε_{0}]

. Together with the explicit odd–branch distortion constant computed in Section 5 and the compact embedding

B_{tree, σ} ↪ ℓ_{σ}^{1}

, this smoothing Lasota–Yorke inequality is exactly what is needed to apply the Ionescu–Tulcea–Marinescu/Hennion theory and deduce quasi–compactness of P on

B_{tree, σ}

.
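The parameter window in this remark is easy to evaluate numerically. The sketch below computes the geometric ratio ϑ 6^{1 - α + σ} and the supremum of admissible ε for σ = 1 + ε; the function names are hypothetical helpers introduced only for this illustration.

```python
import math


def admissibility_factor(alpha, theta, sigma):
    """The geometric ratio theta * 6^(1 - alpha + sigma) from (43) and (48)."""
    return theta * 6.0 ** (1.0 - alpha + sigma)


def max_epsilon(alpha, theta):
    """Supremum of eps with theta * 6^(2 - alpha + eps) < 1 (not attained)."""
    return -math.log(theta) / math.log(6.0) - (2.0 - alpha)


alpha, theta = 0.5, 1.0 / 20.0
print(admissibility_factor(alpha, theta, 1.0))  # ratio at eps = 0
print(max_epsilon(alpha, theta))                # supremum of admissible eps
```

For (α, ϑ) = (1/2, 1/20) the ratio at ε = 0 is roughly 0.73 and the supremum of admissible ε is roughly 0.17, so any ε_0 below that value works.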

Corollary 4.13

(Essential spectral radius on

B_{tree, σ}

). Let

0 < α < 1

,

0 < ϑ < 1

, and

σ > 1

satisfy the admissibility condition (43). Assume the Lasota–Yorke inequality from Proposition 4.11 and the compact embedding

B_{tree, σ} ↪ ℓ_{σ}^{1}

(from Lemma 4.5). Then

P : B_{tree, σ} \to B_{tree, σ}

is quasi–compact, and its essential spectral radius satisfies

ρ_{ess} (P ↾_{B_{tree, σ}}) = 0 .

(52)

Proof.

By Proposition 4.11, for every

f \in B_{tree, σ}

,

{[P f]}_{tree} \leq C_{LY, σ} {∥ f ∥}_{σ},

where

{[\cdot]}_{tree}

is the strong seminorm and

{∥ \cdot ∥}_{σ}

is the weak norm on

B_{tree, σ}

. This is a Lasota–Yorke (Doeblin–Fortet) inequality with strong contraction coefficient equal to 0, i.e.

{∥ P f ∥}_{strong} : = {[P f]}_{tree} \leq 0 \cdot {[f]}_{tree} + C_{LY, σ} {∥ f ∥}_{weak}, {∥ f ∥}_{weak} : = {∥ f ∥}_{σ} .

By Lemma 4.5, the unit ball of

B_{tree, σ}

is relatively compact in

ℓ_{σ}^{1}

, so the inclusion

B_{tree, σ} ↪ ℓ_{σ}^{1}

is compact. Thus the hypotheses of the Ionescu–Tulcea–Marinescu/Hennion quasi–compactness theorem are satisfied with strong contraction constant

a = 0

. The theorem then implies that P is quasi–compact on

B_{tree, σ}

and that its essential spectral radius is bounded above by a, i.e.

ρ_{ess} (P ↾_{B_{tree, σ}}) \leq 0 .

Since the essential spectral radius is always nonnegative, we conclude

ρ_{ess} (P ↾_{B_{tree, σ}}) = 0,

which is (52). □

4.6. Quasi-Compactness and Spectral Gap for P

Lemma 4.14

(Odd–branch weight distortion at

α = \frac{1}{2}

). Let

W_{α} (m, n)

be the tree weight (35) and let

m^{'} = (m - 1) / 3

,

n^{'} = (n - 1) / 3

. For

α = \frac{1}{2}

there exists an absolute constant

C_{0} = \frac{16}{3^{3 / 2}} < 3.1

such that for all

m \equiv n \equiv 4 (mod 6)

,

m \neq n

, one has the two–sided distortion bounds

\frac{1}{C_{0}} \leq \frac{W_{1 / 2} (m, n)}{W_{1 / 2} (m^{'}, n^{'})} \leq C_{0} .

(53)

Consequently, the oscillatory part of the odd branch satisfies

λ_{odd} (\frac{1}{2}, ϑ) \leq \frac{C_{0}}{\sqrt{6}} ϑ,

as used in Lemma 4.9 and Lemma 4.15.

Proof.

Set

α = \frac{1}{2}

in Lemma 4.8; the bounds (53) then follow from that lemma. □

Lemma 4.15

(Explicit odd-branch constant). For

α = \frac{1}{2}

and

ϑ = \frac{1}{20}

there exist constants

C_{α} > 0

and

C_{odd} > 0

such that for all

f \in B_{tree, σ}

,

{[P_{odd} f]}_{tree} \leq λ_{odd} (α, ϑ) {[f]}_{tree} + C_{odd} {∥ f ∥}_{σ},

(54)

with

λ_{odd} (α, ϑ) \leq \frac{C_{α}}{\sqrt{6}} ϑ < 1 .

(55)

Proof.

This is a direct specialization of Lemma 4.9 to the parameter choice

α = \frac{1}{2}

,

ϑ = \frac{1}{20}

. Lemma 4.9 gives the Lasota–Yorke type estimate

{[P_{odd} f]}_{tree} \leq λ_{odd} (α, ϑ) {[f]}_{tree} + C_{odd} {∥ f ∥}_{σ},

with

λ_{odd} (α, ϑ) \leq \frac{C_{α}}{\sqrt{6}} ϑ,

where

C_{α} > 0

is the odd-branch distortion constant defined in Lemma 4.14. For

α = \frac{1}{2}

, Lemma 4.14 gives the explicit value

C_{α} = C_{0} = \frac{16}{3^{3 / 2}} < 3.1 .

Substituting

α = \frac{1}{2}

and

ϑ = \frac{1}{20}

into the above bound,

λ_{odd} (\frac{1}{2}, \frac{1}{20}) \leq \frac{C_{0}}{\sqrt{6}} \cdot \frac{1}{20} < \frac{3.1}{\sqrt{6}} \cdot \frac{1}{20} < 1 .

Thus (54) holds with

λ_{odd} (\frac{1}{2}, \frac{1}{20}) < 1

and some constant

C_{odd} > 0

depending only on

σ

and the block geometry, as asserted. □

Proposition 4.16

(Verified Lasota–Yorke contraction). Let

(α, ϑ) = (\frac{1}{2}, \frac{1}{20})

and

σ > 1

satisfy the admissibility condition (43), i.e.

ϑ 6^{1 - α + σ} < 1 .

Define

λ_{LY} : = 2^{- (1 - α)} ϑ + λ_{odd} (α, ϑ), λ_{odd} (α, ϑ) \leq \frac{C_{0}}{\sqrt{6}} ϑ,

with

C_{0} = 16 / 3^{3 / 2}

from Lemma 4.14. Then

λ_{LY} < 1

, and for all

f \in B_{tree, σ}

,

{[P f]}_{tree} \leq λ_{LY} {[f]}_{tree} + C_{LY} {∥ f ∥}_{σ},

(56)

for some constant

C_{LY} > 0

depending only on the fixed parameters and the block geometry.

Proof.

We use the decomposition

P = P_{even} + P_{odd}

and the branchwise Lasota–Yorke estimates already established.

(1) Combine even and odd branch inequalities. For any

f \in B_{tree, σ}

,

{[P f]}_{tree} \leq {[P_{even} f]}_{tree} + {[P_{odd} f]}_{tree} .

By the even-branch estimate (Lemma 4.7, specialized to

B_{tree, σ}

and the fixed parameters

α = \frac{1}{2}

,

ϑ = \frac{1}{20}

), there exists

C_{even} > 0

such that

{[P_{even} f]}_{tree} \leq 2^{- (1 - α)} ϑ {[f]}_{tree} + C_{even} {∥ f ∥}_{σ} .

(57)

By the explicit odd-branch lemma (Lemma 4.15) for

α = \frac{1}{2}

and

ϑ = \frac{1}{20}

, there exist

C_{α} > 0

and

C_{odd} > 0

such that

{[P_{odd} f]}_{tree} \leq λ_{odd} (α, ϑ) {[f]}_{tree} + C_{odd} {∥ f ∥}_{σ},

(58)

with

λ_{odd} (α, ϑ) \leq \frac{C_{α}}{\sqrt{6}} ϑ = \frac{C_{0}}{\sqrt{6}} ϑ < 1,

(59)

where

C_{0} = 16 / 3^{3 / 2}

is given by Lemma 4.14.

Adding (57) and (58),

{[P f]}_{tree} \leq (2^{- (1 - α)} ϑ + λ_{odd} (α, ϑ)) {[f]}_{tree} + (C_{even} + C_{odd}) {∥ f ∥}_{σ} .

Define

λ_{LY} : = 2^{- (1 - α)} ϑ + λ_{odd} (α, ϑ), C_{LY} : = C_{even} + C_{odd},

and we obtain (56).

(2) Verification that

λ_{LY} < 1

. With

(α, ϑ) = (\frac{1}{2}, \frac{1}{20})

,

2^{- (1 - α)} ϑ = 2^{- 1 / 2} \cdot \frac{1}{20} = \frac{1}{20 \sqrt{2}} \approx 0.0354 .

From Lemma 4.15 and Lemma 4.14 we have

λ_{odd} (\frac{1}{2}, \frac{1}{20}) \leq \frac{C_{0}}{\sqrt{6}} \cdot \frac{1}{20}, C_{0} = \frac{16}{3^{3 / 2}} < 3.1 .

Numerically,

\frac{C_{0}}{\sqrt{6}} \cdot \frac{1}{20} < \frac{3.1}{\sqrt{6}} \cdot \frac{1}{20} \approx 0.063 .

Therefore

λ_{LY} = 2^{- (1 - α)} ϑ + λ_{odd} (α, ϑ) < 0.0354 + 0.063 < 0.1 < 1 .

Thus

λ_{LY}

is a strict contraction factor depending only on the fixed parameters, and the proposition follows. □
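The arithmetic in step (2) can be reproduced directly. The following sketch evaluates the constants quoted in the proof for (α, ϑ) = (1/2, 1/20); the variable names are illustrative.

```python
import math

alpha, theta = 0.5, 1.0 / 20.0

C0 = 16.0 / 3.0**1.5                         # distortion constant quoted from Lemma 4.14
even_part = 2.0 ** (-(1.0 - alpha)) * theta  # even-branch coefficient in (57)
odd_bound = C0 / math.sqrt(6.0) * theta      # upper bound (59) for lambda_odd
lambda_LY_bound = even_part + odd_bound

print(f"C0 = {C0:.4f}")
print(f"even part = {even_part:.4f}, odd bound = {odd_bound:.4f}")
print(f"lambda_LY <= {lambda_LY_bound:.4f}")
```

The computation only confirms the numerical inequalities; the validity of the bounds (57) and (59) themselves rests on the lemmas cited in the proof.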

We now record the standard consequence of the Lasota–Yorke inequality and the compact embedding of

B_{tree, σ}

into

ℓ_{σ}^{1}

.

Theorem 4.17

(Quasi-compactness on

B_{tree, σ}

). Let

0 < α < 1

,

0 < ϑ < 1

, and

σ > 1

. Assume that the Lasota–Yorke constant

λ (α, ϑ) : = 2^{- (1 - α)} ϑ + λ_{odd} (α, ϑ)

satisfies

λ (α, ϑ) < 1

, where

λ_{odd} (α, ϑ)

is as in Lemma 4.9. Then the backward transfer operator P acting on

B_{tree, σ}

is quasi-compact, and its essential spectral radius satisfies

ρ_{ess} (P |_{B_{tree, σ}}) \leq λ (α, ϑ) < 1 .

(60)

Proof.

We work on the Banach space

B_{tree, σ}

with norm

{∥ \cdot ∥}_{tree, σ} = {∥ \cdot ∥}_{σ} + {[\cdot]}_{tree}

, where

{∥ \cdot ∥}_{σ}

is the weighted

ℓ_{σ}^{1}

-norm and

{[\cdot]}_{tree}

is the tree seminorm defined in Section 4.3.

Step 1: Lasota–Yorke inequality. By Proposition 4.11 (applied in the weighted setting, with

{∥ f ∥}_{1}

replaced by

{∥ f ∥}_{σ}

) we have, for all

f \in B_{tree, σ}

,

{[P f]}_{tree} \leq λ (α, ϑ) {[f]}_{tree} + C_{LY} {∥ f ∥}_{σ},

(61)

with

λ (α, ϑ) < 1

by assumption. On the weak norm side, since P is bounded on

ℓ_{σ}^{1}

, there exists

C_{σ} > 0

(e.g.

C_{σ} = Λ_{σ}

from (17)) such that

{∥ P f ∥}_{σ} \leq C_{σ} {∥ f ∥}_{σ} for all f \in B_{tree, σ} .

(62)

Thus P satisfies a standard two-norm Lasota–Yorke inequality on

B_{tree, σ}

with strong seminorm

{∥ \cdot ∥}_{s} : = {[\cdot]}_{tree}

and weak norm

{∥ \cdot ∥}_{w} : = {∥ \cdot ∥}_{σ}

:

{∥ P f ∥}_{s} \leq λ {∥ f ∥}_{s} + C_{LY} {∥ f ∥}_{w}, {∥ P f ∥}_{w} \leq C_{σ} {∥ f ∥}_{w} .

(63)

Step 2: Compact embedding. By Lemma 4.5, the embedding

J : (B_{tree, σ}, {∥ \cdot ∥}_{tree, σ}) ↪ (ℓ_{σ}^{1}, {∥ \cdot ∥}_{σ})

is compact. Since

{∥ \cdot ∥}_{w} = {∥ \cdot ∥}_{σ}

is exactly the weak norm used in (63), this shows that the unit ball of

B_{tree, σ}

is relatively compact for the weak norm.

Step 3: Application of Ionescu–Tulcea–Marinescu / Hennion. We now invoke the standard quasi-compactness criterion (Ionescu–Tulcea and Marinescu theorem): if a bounded operator T on a Banach space X satisfies

(i): a Lasota–Yorke inequality ${∥ T x ∥}_{s} \leq λ {∥ x ∥}_{s} + C {∥ x ∥}_{w}$ with $λ < 1$ ,
(ii): a weak bound ${∥ T x ∥}_{w} \leq C^{'} {∥ x ∥}_{w}$ , and
(iii): the injection $(X, {∥ \cdot ∥}_{s}) ↪ (X, {∥ \cdot ∥}_{w})$ has relatively compact unit ball,

then T is quasi-compact on X and its essential spectral radius satisfies

ρ_{ess} (T) \leq λ .

Conditions (i)–(iii) are exactly (63) and Lemma 4.5 for

T = P

and

X = B_{tree, σ}

. Therefore P is quasi-compact on

B_{tree, σ}

and

ρ_{ess} (P |_{B_{tree, σ}}) \leq λ (α, ϑ) < 1,

which is (60). □

Remark 4.18

(On the choice of parameters). The explicit bound (59) shows that

λ_{odd} (α, ϑ)

decreases linearly with ϑ. For fixed α, one can therefore choose ϑ sufficiently small so that

λ (α, ϑ) < 1

, provided the constant

C_{α}

is effectively controlled. Subsequent sections make this optimization quantitative by computing

C_{α}

and exhibiting admissible parameter pairs

(α, ϑ)

that give a strict spectral gap.

The Lasota–Yorke framework developed here supplies the functional-analytic backbone for the spectral approach to the Collatz problem: once explicit parameters with

λ (α, ϑ) < 1

are verified, the quasi-compactness and spectral gap of P on

B_{tree}

follow, and the spectral criteria of Section 4 can be invoked to constrain or rule out non-terminating configurations.

5. Invariant Profiles, Block Recursion, and Perron–Frobenius Rigidity

Having established in Section 4.6 that the backward Collatz operator P is quasi-compact on the multi-scale tree space

B_{tree}

, we now turn to the spectral consequences of this result. The Lasota–Yorke inequality ensures the existence of a spectral gap, which in turn controls the structure of invariant densities and the long-term behavior of iterates

P^{k}

. The objective of this section is to characterize the invariant and quasi-invariant components of P, derive an effective block recursion for their scale-averaged coefficients, and demonstrate that the recursion enforces rigidity across the Collatz tree.

Throughout this section,

h \in B_{tree, σ}

will denote an invariant density of P, i.e. a function satisfying

P h = h

. The analysis proceeds in several stages. First, we describe the structure of possible invariant profiles in the multiscale framework and show that the Lasota–Yorke inequality forces uniform flatness across scales. Next, we translate this flatness into an explicit two-sided recurrence relation for block averages

c_{j}

. Finally, we verify that the coefficients of this recurrence satisfy a spectral bound consistent with the contraction constant

λ_{odd} (α, ϑ)

computed earlier.

Theorem 5.1

(Perron–Frobenius structure on

B_{tree, σ}

). Let P be the backward Collatz transfer operator acting on

B_{tree, σ}

with parameters

(α, ϑ, σ)

chosen so that the Lasota–Yorke inequality and quasi–compactness hold. Then:

(1): The spectral radius of P equals 1, and 1 is a simple eigenvalue.
(2): There exists a unique eigenvector $h \in B_{tree, σ}$ with $h > 0$ and $P h = h$ , normalized by $ϕ (h) = 1$ .
(3): There exists a unique positive eigenfunctional $ϕ \in B_{tree, σ}^{*}$ such that $ϕ \circ P = ϕ$ .
(4): All other spectral values satisfy $| z | < 1$ , and P admits the spectral decomposition

$P = h \otimes ϕ + Q, ρ (Q) < 1,$

where Q is quasi–compact.

Proof.

We combine the Lasota–Yorke inequality on

B_{tree, σ}

with standard Perron–Frobenius theory for positive quasi–compact operators.

Step 1: Spectral radius and quasi–compactness. By construction P is a bounded linear operator on

B_{tree, σ}

and is positive in the sense that

f \geq 0

implies

P f \geq 0

. The Lasota–Yorke inequality on

B_{tree, σ}

(Proposition 4.11), together with the compact embedding of the strong space into the weak space (Lemma 4.5), implies that P is quasi–compact on

B_{tree, σ}

with essential spectral radius strictly less than 1:

ρ_{ess} (P) < 1 .

(64)

On the other hand, the logarithmic mass–preservation identity (Lemma 2.4) shows that the spectral radius of P is at least 1; the boundedness of P implies

ρ (P) \leq 1

, hence

ρ (P) = 1 .

(65)

In particular, 1 lies in the spectrum of P and, by (64), is an isolated spectral value.

Step 2: Existence of a positive eigenvector. Consider the positive cone

C : = {f \in B_{tree, σ} : f \geq 0},

which is closed, convex, and reproducing. Since P is positive and

ρ (P) = 1

, the Krein–Rutman theorem for positive operators on Banach spaces implies the existence of a nonzero

h \in C

such that

P h = h .

(66)

Moreover, h can be chosen strictly positive in the sense that

h (n) > 0

for all

n \in N

: indeed, by the preimage structure of the Collatz map (Lemma 2.3) and the connectivity of the backward tree, any nontrivial

f \in C

is eventually propagated by iterates of P to a function that is positive on every block

I_{j}

, so

P^{k} f > 0

for all sufficiently large k. Replacing h by

P^{k} h

if necessary yields

h > 0

.

Step 3: Uniqueness and simplicity of the eigenvalue 1. We now show that 1 is a simple eigenvalue and that h is unique up to scalar multiples. Suppose

g \in B_{tree, σ}

satisfies

P g = g

. Decompose

g = g^{+} - g^{-}

into its positive and negative parts. Positivity of P implies

P g^{\pm} = g^{\pm}

. By the strong positivity argument above, any nonzero

f \in C

with

P f = f

must be strictly positive; hence

g^{+}

and

g^{-}

are both either 0 or strictly positive. If both were nonzero, then

g^{+}

and

g^{-}

would be linearly independent positive eigenvectors for the eigenvalue 1, and the positive cone would contain a two-dimensional face of eigenvectors. This contradicts the Krein–Rutman conclusion that the eigenspace associated with the spectral radius is one–dimensional. Therefore one of

g^{+}, g^{-}

must vanish and g is either nonnegative or nonpositive; by replacing g by

- g

if necessary,

g \geq 0

, and the strong positivity then forces g to be a scalar multiple of h. Thus the eigenspace for the eigenvalue 1 is one–dimensional and spanned by h, and 1 is a simple eigenvalue. This proves (1) and the first part of (2) after normalizing by

ϕ (h) = 1

below.

Step 4: Dual eigenfunctional. Consider the dual operator

P^{*}

acting on

B_{tree, σ}^{*}

. Since P is positive, so is

P^{*}

on the dual cone

C^{*} : = {Λ \in B_{tree, σ}^{*} : Λ (f) \geq 0 for all f \in C} .

The quasi–compactness of P implies quasi–compactness of

P^{*}

on the dual space. By (65),

P^{*}

also has spectral radius 1. Applying the same Krein–Rutman argument to

P^{*}

yields a nonzero

ϕ \in C^{*}

such that

ϕ \circ P = ϕ,

(67)

with

ϕ

strictly positive on nonzero elements of

C

. The same simplicity argument as in Step 3 shows that the eigenspace of

P^{*}

for the eigenvalue 1 is one–dimensional and spanned by

ϕ

. Normalizing by the condition

ϕ (h) = 1

gives the uniquely determined eigenpair

(h, ϕ)

appearing in the statement. This establishes (2) and (3).

Step 5: Spectral decomposition and spectral gap. Quasi–compactness of P on

B_{tree, σ}

, together with (64) and the simplicity of the eigenvalue 1, implies that the spectrum of P is contained in

{1} \cup {z : | z | < r}

for some

r < 1

. Let

Π

denote the spectral projection onto the eigenspace associated with

λ = 1

; by the previous steps,

Π f = h ϕ (f), f \in B_{tree, σ},

so that

Π = h \otimes ϕ

as a rank–one operator. Writing

P = Π + Q = h \otimes ϕ + Q,

(68)

we have

Q = P - Π

and

Q Π = Π Q = 0

. The spectrum of Q is contained in

{z : | z | < r}

, so in particular

ρ (Q) < 1 .

Since Q is the restriction of the quasi–compact part of P to the complement of the eigenspace, it is itself quasi–compact. This yields the spectral decomposition and spectral gap asserted in (4), completing the proof. □
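The decomposition P = h ⊗ ϕ + Q with ρ(Q) < 1 can be visualized on a finite-dimensional toy model. The sketch below uses a hypothetical 2×2 column-stochastic matrix in place of the Collatz operator; the matrix, the starting vector, and the iteration count are assumptions chosen only to illustrate the mechanism, and nothing here is specific to the space B_{tree, σ}.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Hypothetical column-stochastic matrix (each column sums to 1).
P = [[0.9, 0.2],
     [0.1, 0.8]]
h = [2.0 / 3.0, 1.0 / 3.0]  # right eigenvector: P h = h, h > 0


def phi(f):
    """Left eigenfunctional phi = (1, 1): phi(P f) = phi(f) and phi(h) = 1."""
    return f[0] + f[1]


def iterate(f, k):
    """Apply P to f a total of k times."""
    for _ in range(k):
        f = matvec(P, f)
    return f


f0 = [5.0, -1.0]
limit = [phi(f0) * hi for hi in h]  # (h ⊗ phi) f0
approx = iterate(f0, 60)            # P^60 f0; the Q-part decays like 0.7^60
print(approx, limit)
```

In this toy model the second eigenvalue is 0.7, so the iterates converge geometrically to the rank-one part, which is the behavior asserted in (4) for the operator Q.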

Proposition 5.2

(Cesàro averages and invariant P–functionals). Let X be a Banach space of functions

f : N \to C

such that

B_{tree, σ} \subset X

continuously and the backward transfer operator P extends to a bounded operator

P : X \to X

.

Let

X^{*}

be the Banach dual of X, with duality pairing

〈 f, φ 〉

(not necessarily given by a pointwise sum for all φ). Suppose:

(i): $P^{*} : X^{*} \to X^{*}$ is power–bounded: there exists $C_{*} > 0$ such that

$∥ {(P^{*})}^{k} ∥_{X^{*} \to X^{*}} \leq C_{*} for all k \geq 0 .$

(69)

In particular, the Cesàro operators

$A_{N} : = \frac{1}{N} \sum_{k = 0}^{N - 1} {(P^{*})}^{k}$

satisfy $∥ A_{N} ∥_{X^{*} \to X^{*}} \leq C_{*}$ for all $N \geq 1$ .
(ii): There exists $φ \in X^{*}$ and $f_{0} \in X$ such that

$\underset{N \to \infty}{lim inf} 〈f_{0}, \frac{1}{N} \sum_{k = 0}^{N - 1} {(P^{*})}^{k} φ〉 > 0 .$

(70)

Define the Cesàro averages

Φ_{N} : = \frac{1}{N} \sum_{k = 0}^{N - 1} {(P^{*})}^{k} φ \in X^{*} .

Then:

(a): ${(Φ_{N})}_{N \geq 1}$ is a bounded set in $X^{*}$ and admits weak–* limit points in $X^{*}$ .
(b): Every weak–* limit point Φ of ${(Φ_{N})}_{N \geq 1}$ satisfies $P^{*} Φ = Φ$ , hence defines a P–invariant functional $ℓ (f) : = 〈 f, Φ 〉$ on X.
(c): Under assumption (70), any such limit point Φ is nonzero, since $〈 f_{0}, Φ 〉 \geq {lim inf}_{N \to \infty} 〈 f_{0}, Φ_{N} 〉 > 0 .$ In particular, the restriction of ℓ to $B_{tree, σ}$ is a nonzero invariant functional with $ℓ \circ P = ℓ$ .

Proof. (a) From (69),

{∥ Φ_{N} ∥}_{X^{*}} = {∥ \frac{1}{N} \sum_{k = 0}^{N - 1} {(P^{*})}^{k} φ ∥}_{X^{*}} \leq \frac{1}{N} \sum_{k = 0}^{N - 1} {∥ {(P^{*})}^{k} ∥}_{X^{*} \to X^{*}} {∥ φ ∥}_{X^{*}} \leq C_{*} {∥ φ ∥}_{X^{*}}

for all

N \geq 1

, so the sequence

(Φ_{N})

is bounded in

X^{*}

. By Banach–Alaoglu, the closed ball

{ψ \in X^{*} : ∥ ψ ∥ \leq C_{*} ∥ φ ∥}

is weak–* compact, hence there exists a subsequence

N_{m} \to \infty

and

Φ \in X^{*}

such that

Φ_{N_{m}} \to Φ

in the weak–* topology.

(b) Fix

f \in X

and compute

\begin{matrix} 〈 f, P^{*} Φ_{N} 〉 & = 〈 P f, Φ_{N} 〉 = \frac{1}{N} \sum_{k = 0}^{N - 1} 〈 P f, {(P^{*})}^{k} φ 〉 = \frac{1}{N} \sum_{k = 1}^{N} 〈 f, {(P^{*})}^{k} φ 〉 \\ & = \frac{1}{N} \sum_{k = 0}^{N - 1} 〈 f, {(P^{*})}^{k} φ 〉 + \frac{1}{N} (〈 f, {(P^{*})}^{N} φ 〉 - 〈 f, φ 〉) \\ & = 〈 f, Φ_{N} 〉 + \frac{1}{N} (〈 f, {(P^{*})}^{N} φ 〉 - 〈 f, φ 〉) . \end{matrix}

By (69),

| 〈 f, {(P^{*})}^{N} φ 〉 | \leq {∥ f ∥}_{X} {∥ {(P^{*})}^{N} φ ∥}_{X^{*}} \leq {∥ f ∥}_{X} C_{*} {∥ φ ∥}_{X^{*}},

so

|\frac{1}{N} (〈 f, {(P^{*})}^{N} φ 〉 - 〈 f, φ 〉)| \leq \frac{2 C_{*}}{N} {∥ f ∥}_{X} {∥ φ ∥}_{X^{*}} \underset{N \to \infty}{\to} 0 .

Thus

lim_{N \to \infty} (〈 f, P^{*} Φ_{N} 〉 - 〈 f, Φ_{N} 〉) = 0 .

Passing to the subsequence

N_{m}

along which

Φ_{N_{m}} \to Φ

weak–*, we obtain

〈 f, P^{*} Φ 〉 = lim_{m \to \infty} 〈 f, P^{*} Φ_{N_{m}} 〉 = lim_{m \to \infty} 〈 f, Φ_{N_{m}} 〉 = 〈 f, Φ 〉 (f \in X),

so

P^{*} Φ = Φ

.

(c) By assumption (70),

\underset{N \to \infty}{lim inf} 〈 f_{0}, Φ_{N} 〉 > 0 .

In particular, after passing to a subsequence

N_{m}

if necessary, we have

{lim}_{m \to \infty} 〈 f_{0}, Φ_{N_{m}} 〉 = λ > 0 .

For the corresponding weak–* limit point

Φ

of

Φ_{N_{m}}

,

〈 f_{0}, Φ 〉 = lim_{m \to \infty} 〈 f_{0}, Φ_{N_{m}} 〉 = λ > 0,

so

Φ \neq 0

. The invariant functional

ℓ (f) : = 〈 f, Φ 〉

is therefore nonzero and satisfies

ℓ (P f) = 〈 P f, Φ 〉 = 〈 f, P^{*} Φ 〉 = 〈 f, Φ 〉 = ℓ (f)

. □
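The telescoping identity used in part (b) can be seen in a small finite-dimensional example. In the sketch below a cyclic shift on R^3 plays the role of the power-bounded operator P^*; this stand-in, the vector phi, and the value of N are assumptions for illustration only. The powers of the shift do not converge, but the Cesàro averages do, and they become invariant in the limit.

```python
def T(v):
    """Cyclic shift on R^3: a power-bounded stand-in for P^* (every power is an isometry)."""
    return [v[2], v[0], v[1]]


def cesaro(phi, N):
    """Return Phi_N = (1/N) * sum_{k=0}^{N-1} T^k phi together with T^N phi."""
    total = [0.0, 0.0, 0.0]
    current = list(phi)
    for _ in range(N):
        total = [t + c for t, c in zip(total, current)]
        current = T(current)
    return [t / N for t in total], current


phi = [3.0, 0.0, -1.5]
N = 1000
Phi_N, T_N_phi = cesaro(phi, N)

# Telescoping identity from part (b): T Phi_N - Phi_N = (T^N phi - phi) / N.
lhs = [a - b for a, b in zip(T(Phi_N), Phi_N)]
rhs = [(a - b) / N for a, b in zip(T_N_phi, phi)]
print(Phi_N, lhs, rhs)
```

Here the defect T Φ_N - Φ_N is of size O(1/N), and Φ_N approaches the constant vector with entries equal to the mean of phi, which is fixed by the shift.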

5.1. Invariant Density Profile and Refined Tree Geometry

The quasi-compactness of P implies that its spectrum consists of a discrete set of eigenvalues of finite multiplicity outside a disk of radius

ρ_{ess} (P) \leq λ_{LY} < 1

, together with a residual spectrum contained in that disk. Let

λ_{0} = 1

denote the trivial eigenvalue corresponding to constant functions. Any additional eigenvalues with

| λ | < 1

correspond to exponentially decaying modes. Thus, an invariant density h satisfying

P h = h

must lie in the one-dimensional eigenspace associated with

λ_{0}

, provided no unit-modulus spectrum remains.

However, to make this conclusion effective, one must exclude the possibility of small oscillatory components that project into higher spectral modes but decay too slowly to be detected by the weak

ℓ^{1}

norm alone. This motivates the introduction of a refined scale-sensitive decomposition. Define block intervals

I_{j}

as in (34), and let

H_{j} (h) : = \sum_{n \in I_{j}} h (n), c_{j} : = \frac{H_{j} (h)}{| I_{j} |} = \frac{H_{j} (h)}{6^{j}} .

(71)

The sequence

{(c_{j})}_{j \geq 0}

captures the mean behavior of h across successive scales in the backward tree. Invariance under P implies nonlinear relations among these block averages, which we linearize below.
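The block quantities in (71) are straightforward to compute for any explicitly given function. The sketch below does so for the hypothetical test profile h(n) = 1/n, which is chosen only because its block masses are easy to predict; it is not claimed to be the invariant density, and the function names are illustrative.

```python
import math


def block(j):
    """The tree block I_j = [6^j, 2*6^j) as a range of integers."""
    return range(6**j, 2 * 6**j)


def block_mass(h, j):
    """H_j(h) = sum of h(n) over n in I_j."""
    return sum(h(n) for n in block(j))


def block_average(h, j):
    """c_j = H_j(h) / |I_j|, with |I_j| = 6^j."""
    return block_mass(h, j) / 6**j


def sample_density(n):
    """Hypothetical test profile h(n) = 1/n (not the invariant density)."""
    return 1.0 / n


for j in range(5):
    print(j, block_mass(sample_density, j), block_average(sample_density, j))
```

For this test profile the block masses stay close to log 2, so the averages c_j decay like a constant times 6^{-j}; the example only illustrates the normalization in (71).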

Lemma 5.3

(Block–level invariance relation). Let

0 < α < 1

,

0 < ϑ < 1

, and

σ > 1

, and let

h \in B_{tree, σ}

satisfy

P h = h

. For each

j \geq 0

define the block average

c_{j} : = \frac{1}{| I_{j} |} \sum_{n \in I_{j}} h (n), | I_{j} | : = # I_{j},

where

I_{j} = [6^{j}, 2 \cdot 6^{j})

are the tree blocks used in the definition of

B_{tree, σ}

. Then there exist bounded sequences

{(a_{j})}_{j \geq 0}

and

{(b_{j})}_{j \geq 0}

with

a_{j}, b_{j} \geq 0

and a sequence

{(ε_{j})}_{j \geq 0}

such that

c_{j} = a_{j} c_{j + 1} + b_{j} c_{j - 1} + ε_{j}, j \geq 1,

(72)

and the error sequence is summable in the weighted norm:

\sum_{j \geq 1} ϑ^{j} | ε_{j} | \leq C ({[h]}_{tree} + {∥ h ∥}_{σ}) < \infty,

(73)

for a constant

C > 0

depending only on

α, ϑ, σ

and the block geometry.

Proof.

Fix

h \in B_{tree, σ}

with

P h = h

and

j \geq 1

. We work with the 6–adic blocks

I_{j} = [6^{j}, 2 \cdot 6^{j})

.

1. Block identity and branch decomposition. For each j,

| I_{j} | c_{j} = \sum_{n \in I_{j}} h (n) = \sum_{n \in I_{j}} (P h) (n) = \sum_{n \in I_{j}} (\frac{h (2 n)}{2 n} + 1_{{n \equiv 4 (6)}} \frac{h (\frac{n - 1}{3})}{(n - 1) / 3}) .

Set

S_{j}^{even} : = \sum_{n \in I_{j}} \frac{h (2 n)}{2 n}, S_{j}^{odd} : = \sum_{\begin{matrix} n \in I_{j} \\ n \equiv 4 (6) \end{matrix}} \frac{h (\frac{n - 1}{3})}{(n - 1) / 3},

so

| I_{j} | c_{j} = S_{j}^{even} + S_{j}^{odd} .

(74)
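The branch decomposition can be checked mechanically. The sketch below implements the operator exactly as it is written in the block identity above and confirms, in exact rational arithmetic, that the sum of (P h)(n) over a block splits into the even and odd parts. The integer-valued test function h is a hypothetical choice; the check concerns only the splitting of the sum, not invariance.

```python
from fractions import Fraction

def P(h, n):
    """(P h)(n) = h(2n)/(2n) + 1_{n = 4 mod 6} * h((n-1)/3) / ((n-1)/3)."""
    value = Fraction(h(2 * n), 2 * n)
    if n % 6 == 4:
        m = (n - 1) // 3          # exact: n = 6k + 4 gives m = 2k + 1
        value += Fraction(h(m), m)
    return value

def even_odd_sums(h, j):
    """Return (S_even, S_odd) over the block I_j = [6^j, 2*6^j)."""
    I_j = range(6**j, 2 * 6**j)
    s_even = sum(Fraction(h(2 * n), 2 * n) for n in I_j)
    s_odd = sum(Fraction(h((n - 1) // 3), (n - 1) // 3)
                for n in I_j if n % 6 == 4)
    return s_even, s_odd

# Hypothetical integer-valued test function (an assumption for illustration).
h = lambda n: n % 7 + 1
j = 2
s_even, s_odd = even_odd_sums(h, j)
total = sum(P(h, n) for n in range(6**j, 2 * 6**j))
```

Here `total` equals `s_even + s_odd` exactly, which is the content of (74) before the fixed-point relation is used.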

2. Preliminaries: oscillation control on blocks. By definition of the tree seminorm and the block geometry, there exist a constant

C_{osc} > 0

and some

β > 1

(depending only on

α

) such that for every

j \geq 0

,

{osc}_{I_{j}} h : = sup_{m, n \in I_{j}} | h (m) - h (n) | \leq C_{osc} ϑ^{- j} 6^{- β j} {[h]}_{tree} .

(75)

Indeed, for

m, n \in I_{j}

we have

W_{α} (m, n) ≍ 6^{(2 - α) j}

, so

W_{α} (m, n) | h (m) - h (n) | \leq ϑ^{- j} {[h]}_{tree} \Rightarrow | h (m) - h (n) | ≪ ϑ^{- j} 6^{- (2 - α) j} {[h]}_{tree},

and we may take any

β \in (1, 2 - α]

.

From this we also get a bound on deviations from the block average:

\sum_{n \in I_{j}} | h (n) - c_{j} | \leq | I_{j} | {osc}_{I_{j}} h ≪ 6^{j} ϑ^{- j} 6^{- β j} {[h]}_{tree} = ϑ^{- j} 6^{- (β - 1) j} {[h]}_{tree} .

(76)

Since

β > 1

, the exponent

(β - 1)

is positive.

We also retain the crude pointwise bound coming from the weighted norm:

| h (n) | \leq n^{σ} {∥ h ∥}_{σ}, n \geq 1 .

(77)

3. Even branch contribution. Write

S_{j}^{even} = \sum_{n \in I_{j}} \frac{h (2 n)}{2 n} .

For

n \in I_{j} = [6^{j}, 2 \cdot 6^{j})

, we have

2 n \in [2 \cdot 6^{j}, 4 \cdot 6^{j})

, which lies in a bounded union of neighboring blocks at scales j and

j + 1

. The bulk of

2 n

lie in

I_{j + 1}

; finitely many fall into the adjacent blocks. Define

a_{j}^{even} : = \frac{1}{| I_{j} |} \sum_{n \in I_{j} \cap \{2 n \in I_{j + 1}\}} \frac{1}{2 n},

and decompose

h (2 n) = c_{j + 1} + (h (2 n) - c_{j + 1})

for those

2 n \in I_{j + 1}

, with the finitely many remaining

2 n

folded into the error. This gives

S_{j}^{even} = a_{j}^{even} | I_{j} | c_{j + 1} + R_{j}^{even},

where

R_{j}^{even} : = \sum_{n \in I_{j} \cap \{2 n \in I_{j + 1}\}} \frac{h (2 n) - c_{j + 1}}{2 n} + \sum_{n \in I_{j} \cap \{2 n \notin I_{j + 1}\}} \frac{h (2 n)}{2 n} .

We now bound

R_{j}^{even}

.

For

2 n \in I_{j + 1}

, (75) with

j + 1

gives

| h (2 n) - c_{j + 1} | \leq C_{osc} ϑ^{- (j + 1)} 6^{- β (j + 1)} {[h]}_{tree},

and

2 n ≍ 6^{j + 1}

, so each such term contributes

|\frac{h (2 n) - c_{j + 1}}{2 n}| ≪ ϑ^{- (j + 1)} 6^{- (β + 1) (j + 1)} {[h]}_{tree} .

Summing over

| I_{j} | ≍ 6^{j}

values of n yields

\sum_{n \in I_{j} \cap \{2 n \in I_{j + 1}\}} |\frac{h (2 n) - c_{j + 1}}{2 n}| ≪ ϑ^{- (j + 1)} 6^{j} 6^{- (β + 1) (j + 1)} {[h]}_{tree} ≪ ϑ^{- j} 6^{- γ j} {[h]}_{tree}

for some

γ > 0

(since

β > 1

).

For the finitely many spillover terms with

2 n \notin I_{j + 1}

, we use (77) and the fact that there are

O (1)

such n:

|\sum_{n \in I_{j} \cap \{2 n \notin I_{j + 1}\}} \frac{h (2 n)}{2 n}| ≪ \sum_{n \in I_{j} \cap \{2 n \notin I_{j + 1}\}} {(2 n)}^{σ - 1} {∥ h ∥}_{σ} ≪ 6^{(σ - 1) j} {∥ h ∥}_{σ} .

Altogether,

| R_{j}^{even} | \leq C (ϑ^{- j} 6^{- γ j} {[h]}_{tree} + 6^{(σ - 1) j} {∥ h ∥}_{σ})

(78)

for some constants

C, γ > 0

depending only on the fixed parameters. By construction

a_{j}^{even} \geq 0

and

(a_{j}^{even})

is bounded above and below (by simple counting of preimages inside

I_{j + 1}

), though we will not need explicit bounds here.

4. Odd branch contribution. Similarly,

S_{j}^{odd} = \sum_{\begin{matrix} n \in I_{j} \\ n \equiv 4 (6) \end{matrix}} \frac{h ((n - 1) / 3)}{(n - 1) / 3} .

For

n \in I_{j}

with

n \equiv 4 (mod 6)

we have

m^{'} : = (n - 1) / 3 ≍ 6^{j - 1}

, so

m^{'}

lies in a bounded union of neighboring blocks around scale

j - 1

. The bulk lie in

I_{j - 1}

; finitely many lie in adjacent blocks.

Define

b_{j}^{odd} : = \frac{1}{| I_{j} |} \sum_{\begin{matrix} n \in I_{j} \\ n \equiv 4 (6) \\ (n - 1) / 3 \in I_{j - 1} \end{matrix}} \frac{1}{(n - 1) / 3},

and decompose

h (m^{'}) = c_{j - 1} + (h (m^{'}) - c_{j - 1})

for

m^{'} = (n - 1) / 3 \in I_{j - 1}

. Then

S_{j}^{odd} = b_{j}^{odd} | I_{j} | c_{j - 1} + R_{j}^{odd},

where

R_{j}^{odd} : = \sum_{\begin{matrix} n \in I_{j} \\ n \equiv 4 (6) \\ (n - 1) / 3 \in I_{j - 1} \end{matrix}} \frac{h (m^{'}) - c_{j - 1}}{m^{'}} + \sum_{\begin{matrix} n \in I_{j} \\ n \equiv 4 (6) \\ (n - 1) / 3 \notin I_{j - 1} \end{matrix}} \frac{h (m^{'})}{m^{'}} .

The first sum is controlled by the block oscillation at scale

j - 1

:

| h (m^{'}) - c_{j - 1} | \leq C_{osc} ϑ^{- (j - 1)} 6^{- β (j - 1)} {[h]}_{tree}, m^{'} ≍ 6^{j - 1},

so each term is

|\frac{h (m^{'}) - c_{j - 1}}{m^{'}}| ≪ ϑ^{- (j - 1)} 6^{- (β + 1) (j - 1)} {[h]}_{tree} .

There are

≍ | I_{j} | ≍ 6^{j}

such n, hence

\sum_{\begin{matrix} n \in I_{j} \\ n \equiv 4 (6) \\ (n - 1) / 3 \in I_{j - 1} \end{matrix}} |\frac{h (m^{'}) - c_{j - 1}}{m^{'}}| ≪ ϑ^{- j} 6^{- γ j} {[h]}_{tree}

for some

γ > 0

as before.

For the spillover terms with

(n - 1) / 3 \notin I_{j - 1}

, there are again only

O (1)

such indices n, and (77) gives

|\frac{h (m^{'})}{m^{'}}| \leq {(m^{'})}^{σ - 1} {∥ h ∥}_{σ} ≪ 6^{(σ - 1) j} {∥ h ∥}_{σ},

so these contribute at most

C 6^{(σ - 1) j} {∥ h ∥}_{σ}

. Thus

| R_{j}^{odd} | \leq C (ϑ^{- j} 6^{- γ j} {[h]}_{tree} + 6^{(σ - 1) j} {∥ h ∥}_{σ}),

(79)

for some possibly larger

C, γ > 0

. Again

b_{j}^{odd} \geq 0

and

(b_{j}^{odd})

is bounded.

5. Assemble and normalize. Substituting the even and odd branch decompositions from Steps 3 and 4 into (74), we obtain

| I_{j} | c_{j} = a_{j}^{even} | I_{j} | c_{j + 1} + b_{j}^{odd} | I_{j} | c_{j - 1} + R_{j}^{even} + R_{j}^{odd} .

Dividing by

| I_{j} | ≍ 6^{j}

gives

c_{j} = a_{j}^{even} c_{j + 1} + b_{j}^{odd} c_{j - 1} + ε_{j},

with

ε_{j} : = \frac{R_{j}^{even} + R_{j}^{odd}}{| I_{j} |} .

From (78), (79) and

| I_{j} | ≍ 6^{j}

we have

| ε_{j} | \leq C (ϑ^{- j} 6^{- (γ + 1) j} {[h]}_{tree} + 6^{- σ j} {∥ h ∥}_{σ}) .

Multiplying by

ϑ^{j}

and summing over

j \geq 1

yields

\sum_{j \geq 1} ϑ^{j} | ε_{j} | \leq C {[h]}_{tree} \sum_{j \geq 1} {(ϑ 6^{- (γ + 1)})}^{j} + C {∥ h ∥}_{σ} \sum_{j \geq 1} {(ϑ 6^{- σ})}^{j} \leq C^{'} ({[h]}_{tree} + {∥ h ∥}_{σ}),

for suitable

C^{'} > 0

, since

γ + 1 > 1

and

σ > 1

imply

ϑ 6^{- (γ + 1)} < 1

and

ϑ 6^{- σ} < 1

for any fixed

ϑ \in (0, 1)

.

Finally, set

a_{j} : = a_{j}^{even}

and

b_{j} : = b_{j}^{odd}

. This proves the block relation (72) with

ϑ

–summable error (73). □

Lemma 5.4

(Limiting preimage ratios). Let

{(I_{j})}_{j \geq 0}

be the multiscale blocks

I_{j} = [6^{j}, 2 \cdot 6^{j}) \cap N, | I_{j} | = 6^{j} .

Let

a_{j}, b_{j} \geq 0

be the coefficients from Lemma 5.3, so that for any invariant profile

h \in B_{tree, σ}

with

P h = h

and block averages

c_{j} : = \frac{1}{| I_{j} |} \sum_{n \in I_{j}} h (n),

one has

c_{j} = a_{j} c_{j + 1} + b_{j} c_{j - 1} + ε_{j}, j \geq 1,

(80)

with an error satisfying

\sum_{j \geq 1} ϑ^{j} | ε_{j} | \leq C ({[h]}_{tree} + {∥ h ∥}_{σ})

for some constant

C > 0

independent of h. Then there exist constants

a, b > 0

and

C^{'} > 0

,

0 < δ < 1

(depending only on the fixed parameters and the block geometry) such that

lim_{j \to \infty} a_{j} = a, lim_{j \to \infty} b_{j} = b,

and, for all

j \geq 1

,

| a_{j} - a | + | b_{j} - b | \leq C^{'} δ^{j} .

In particular,

a, b

are strictly positive and the sequences

(a_{j})

and

(b_{j})

converge exponentially fast to their limits.

Proof.

By Lemma 5.3,

a_{j}, b_{j}

are determined purely by the preimage geometry between the neighboring scales

I_{j - 1}, I_{j}, I_{j + 1}

; they do not depend on h. We now make this dependence explicit.

1. Even and odd preimage windows. The inverse branches of the accelerated Collatz map T are

n \mapsto 2 n (even branch), n \mapsto \frac{n - 1}{3} when n \equiv 4 (mod 6) (odd branch) .

In the block relation (80), the coefficient

a_{j}

collects the contribution from the even preimages

2 n

of points

n \in I_{j}

that lie in the “next” scale (around

I_{j + 1}

), while

b_{j}

collects the contribution from odd preimages mapping from the lower scale (around

I_{j - 1}

). All remaining preimages (falling into gaps or nonadjacent blocks) are assigned to the error term absorbed in

ε_{j}

.

For the even branch, define the relevant preimage window

E_{j}^{*} : = \{m \in N : m = 2 n \text{ with } n \in I_{j} \text{ and } m \text{ lies in the prescribed upper-neighbor blocks}\} .

Similarly, for the odd branch, define

O_{j}^{*} : = \{m \in N : m = (n - 1) / 3, n \in I_{j}, n \equiv 4 (mod 6), \text{ and } m \text{ lies in the lower-neighbor blocks}\} .

By construction (see the proof of Lemma 5.3), almost all even preimages

2 n

with

n \in I_{j}

fall into a fixed finite pattern of blocks around scale

j + 1

, and almost all odd preimages

(n - 1) / 3

with

n \equiv 4 (mod 6), n \in I_{j}

fall into a fixed finite pattern of blocks around scale

j - 1

. The exceptions occur only for n in a bounded neighborhood of the endpoints of

I_{j}

and therefore contribute

O (1)

terms that can be absorbed into the error

ε_{j}

.

In particular, we have

| E_{j}^{*} | = | I_{j} | + O (1) = 6^{j} + O (1), | O_{j}^{*} | = \frac{1}{6} | I_{j} | + O (1) = 6^{j - 1} + O (1),

where the factor

1 / 6

reflects the asymptotic density of the residue class

4 (mod 6)

in

I_{j}

, up to

O (1)

boundary errors.
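The factor 1/6 can be checked by direct enumeration. The sketch below counts the points of I_j in the residue class 4 (mod 6) and verifies that each has an odd integer preimage under the odd branch. It counts the residue class only; the additional requirement in the definition of O_j^* that the preimage lie in the lower-neighbor blocks is not modeled here.

```python
def odd_branch_points(j):
    """Points n in I_j = [6^j, 2*6^j) with n = 4 (mod 6)."""
    return [n for n in range(6**j, 2 * 6**j) if n % 6 == 4]

# Number of admissible points per block, for j = 1..5.
counts = {j: len(odd_branch_points(j)) for j in range(1, 6)}

# Each n = 6k + 4 has odd-branch preimage m = (n - 1)/3 = 2k + 1,
# an odd integer with 3m + 1 = n.
preimages_odd = all(
    ((n - 1) // 3) % 2 == 1 and 3 * ((n - 1) // 3) + 1 == n
    for n in odd_branch_points(3)
)
```

For j ≥ 1 the block starts at a multiple of 6 and has length divisible by 6, so the count is exactly 6^{j-1} in this enumeration.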

2. Canonical weighted definitions of

a_{j}

and

b_{j}

. By the construction in Lemma 5.3 (where one replaces h by block averages and isolates the main neighboring–scale contributions), there exist formulas of the form

a_{j} = \frac{1}{| I_{j} |} \sum_{m \in E_{j}^{*}} \frac{κ_{j}^{even} (m)}{m}, b_{j} = \frac{1}{| I_{j} |} \sum_{m \in O_{j}^{*}} \frac{κ_{j}^{odd} (m)}{m},

where

κ_{j}^{even}

and

κ_{j}^{odd}

are combinatorial weights taking values in a fixed finite set, depending only on the finite pattern of preimages between neighboring blocks (for example, they indicate exactly which of a finite family of adjacent blocks m belongs to and normalize the contribution appropriately). Crucially, for large j:

the sets $E_{j}^{*}$ and $O_{j}^{*}$ are contained in finite unions of intervals of the form $[γ 6^{j \pm 1}, Γ 6^{j \pm 1})$ with fixed $0 < γ < Γ < \infty$ , independent of j;
the functions $m \mapsto κ_{j}^{even} (m)$ and $m \mapsto κ_{j}^{odd} (m)$ are periodic in m modulo a fixed modulus q (coming from the 6–adic structure of the Collatz branches), up to $O (1)$ boundary corrections that again contribute $O (6^{- j})$ to $a_{j}, b_{j}$ .

Thus, each of the sums defining

a_{j}

and

b_{j}

is, up to an

O (6^{- j})

error, a Riemann sum for an integral of a fixed bounded periodic function times

x^{- 1}

on a fixed compact interval of

R_{+}

, normalized by

| I_{j} |

.

More concretely, we can write for large j:

a_{j} = \frac{1}{6^{j}} \sum_{m \in E_{j}^{*}} \frac{κ^{even} (m mod q)}{m} + O (6^{- j}),

where

κ^{even}

is a fixed q–periodic bounded function, and similarly for

b_{j}

.

3. Passage to the limit and exponential convergence. For j large enough, the preimage windows

E_{j}^{*}

and

O_{j}^{*}

can be written as disjoint unions of arithmetic progressions of step q, truncated at endpoints of size

≍ 6^{j \pm 1}

, with at most

O (1)

elements lost at the boundaries.

For such arithmetic progressions, the normalized sums

\frac{1}{6^{j}} \sum_{m \in E_{j}^{*}} \frac{κ^{even} (m mod q)}{m} and \frac{1}{6^{j}} \sum_{m \in O_{j}^{*}} \frac{κ^{odd} (m mod q)}{m}

can be compared to the corresponding integrals

\int_{γ}^{Γ} \frac{F_{even} (x)}{x} d x, \int_{γ^{'}}^{Γ^{'}} \frac{F_{odd} (x)}{x} d x,

where

F_{even}, F_{odd}

are continuous periodic averages of the weights over residue classes. Standard Riemann–sum estimates for such periodic sums imply that the difference between each normalized sum and its limiting integral is

O (6^{- j})

. (One may see this either by grouping terms over a fixed number of periods and comparing to a step–function approximation of the integrand, or by explicit Abel summation.)
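The kind of Riemann-sum estimate invoked here can be illustrated on a model sum. The sketch below compares the sum of 1/m over a single arithmetic progression in [N, 2N) with the corresponding logarithmic integral and records the error scaled by N. The window [N, 2N), the modulus q = 6, and the residue r = 4 are hypothetical model choices, not the windows E_j^* and O_j^* of the proof, and the extra normalization by 6^j is omitted.

```python
import math

def progression_sum(N, q=6, r=4, lo=1, hi=2):
    """Sum of 1/m over m in [lo*N, hi*N) with m = r (mod q)."""
    return math.fsum(1.0 / m for m in range(lo * N, hi * N) if m % q == r)

# Limiting value: (1/q) times the integral of dx/x over [1, 2].
limit = math.log(2) / 6

# Absolute error at N = 6^j, and the same error multiplied by N.
errors = {j: abs(progression_sum(6**j) - limit) for j in range(2, 7)}
scaled = {j: 6**j * err for j, err in errors.items()}
```

In this model the scaled errors stay bounded, which is consistent with a discrepancy of order 1/N = 6^{-j} between the sum and its integral.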

Thus there exist finite nonzero limits

a : = lim_{j \to \infty} a_{j} and b : = lim_{j \to \infty} b_{j},

given by those integrals, and constants

C > 0

and

0 < δ < 1

(for instance

δ = 1 / 6

after rescaling) such that

| a_{j} - a | + | b_{j} - b | \leq C δ^{j} for all j \geq 1 .

4. Positivity of the limits. For large j, the windows

E_{j}^{*}

and

O_{j}^{*}

have cardinalities

| E_{j}^{*} | = 6^{j} + O (1), | O_{j}^{*} | = 6^{j - 1} + O (1),

and the weights

κ^{even}, κ^{odd}

are bounded below by a positive constant on a fixed positive fraction of residue classes (this is just the statement that there are always even preimages and always odd preimages in the relevant windows). Since the factors

1 / m

are all of size

≍ 6^{- (j \pm 1)}

on these windows, the sums defining

a_{j}

and

b_{j}

are bounded below by positive constants independent of j, hence

a, b > 0

.

This establishes the existence of positive limits

a, b

and the exponential convergence claimed. □

Lemma 5.5

(Uniform convergence of the coefficient matrices). Let

M_{j} = (\begin{matrix} 0 & a_{j} \\ b_{j} & 0 \end{matrix}), M = (\begin{matrix} 0 & a \\ b & 0 \end{matrix}),

where

a_{j} \to a

and

b_{j} \to b

satisfy

| a_{j} - a | + | b_{j} - b | \leq C δ^{j}

for some

0 < δ < 1

as in Lemma 5.4. Then for any matrix norm

∥ \cdot ∥

,

∥ M_{j} - M ∥ \leq C^{'} δ^{j} .

In particular,

\sum_{j \geq j_{0}} ϑ^{j} ∥ M_{j} - M ∥ < \infty,

so

M_{j} \to M

exponentially fast in the sense required by the discrete variation-of-constants argument.

Proof.

By definition,

M_{j} - M = (\begin{matrix} 0 & a_{j} - a \\ b_{j} - b & 0 \end{matrix}) .

Let

∥ \cdot ∥

be any matrix norm on

2 \times 2

real matrices. Since

R^{2 \times 2}

is finite-dimensional, all norms on it are equivalent, so there exists a constant

K > 0

(depending only on the choice of norm) such that for any matrix

A = {(a_{m n})}_{m, n = 1}^{2}

,

∥ A ∥ \leq K max_{m, n} | a_{m n} | .

(81)

Applying (81) to

A = M_{j} - M

gives

∥ M_{j} - M ∥ \leq K max \{| a_{j} - a |, | b_{j} - b |\} .

By Lemma 5.4, the preimage ratios satisfy the exponential convergence

| a_{j} - a | + | b_{j} - b | \leq C δ^{j}, 0 < δ < 1 .

In particular,

max \{| a_{j} - a |, | b_{j} - b |\} \leq | a_{j} - a | + | b_{j} - b | \leq C δ^{j} .

Combining the two inequalities yields

∥ M_{j} - M ∥ \leq K C δ^{j} .

Setting

C^{'} : = K C

gives the claimed bound

∥ M_{j} - M ∥ \leq C^{'} δ^{j} .

Finally, since

0 < ϑ < 1

and

0 < δ < 1

, the product

ϑ δ < 1

, and therefore

\sum_{j \geq j_{0}} ϑ^{j} ∥ M_{j} - M ∥ \leq C^{'} \sum_{j \geq j_{0}} {(ϑ δ)}^{j} < \infty .

Thus

M_{j} \to M

exponentially fast in any matrix norm, establishing the uniform convergence required for the discrete variation-of-constants argument. □
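A small sketch of inequality (81) for two concrete norms may be helpful. It checks that the Frobenius norm and the maximum row-sum norm of a 2 × 2 matrix are each bounded by K = 2 times the largest entry in absolute value. The geometric model for the coefficient errors (rate 1/6, constant 1) is an assumption made only for this illustration.

```python
import math

def diff_matrix(da, db):
    """M_j - M for the anti-diagonal matrices: entries a_j - a and b_j - b."""
    return [[0.0, da], [db, 0.0]]

def max_entry(A):
    """Largest absolute value of an entry of A."""
    return max(abs(x) for row in A for x in row)

def frobenius(A):
    """Frobenius norm of A."""
    return math.sqrt(sum(x * x for row in A for x in row))

def max_row_sum(A):
    """Operator norm induced by the sup-norm on R^2."""
    return max(sum(abs(x) for x in row) for row in A)

# Hypothetical geometric model of the coefficient errors (assumption).
delta = 1 / 6
checks = []
for j in range(1, 12):
    A = diff_matrix(delta**j, -0.5 * delta**j)
    checks.append(frobenius(A) <= 2 * max_entry(A)
                  and max_row_sum(A) <= 2 * max_entry(A))
```

For these two norms the constant K = 2 suffices for every 2 × 2 matrix; other norms only change the value of K.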

Proposition 5.6

(Effective recursion for peripheral eigenfunctions). Let

0 < α < 1

,

0 < ϑ < 1

,

σ > 1

, and let

h \in B_{tree, σ}

satisfy

P h = λ h

with

| λ | = 1

. Let

H_{j} : = \sum_{n \in I_{j}} h (n)

and

c_{j} : = H_{j} / | I_{j} |

be the block sums and block averages on

I_{j} = [6^{j}, 2 \cdot 6^{j}) \cap N

. Then, with

a, b > 0

as in Lemma 5.4, there exists a sequence

{(ε_{j})}_{j \geq 1}

with

\sum_{j \geq 1} | ε_{j} | ϑ^{j} < \infty

such that

λ c_{j} = a c_{j + 1} + b c_{j - 1} + ε_{j}, j \geq 1 .

(82)

Equivalently, for the renormalized averages

d_{j} : = λ^{- j} c_{j}

we have

d_{j} = a d_{j + 1} + b λ^{- 2} d_{j - 1} + {\tilde{ε}}_{j}, \sum_{j \geq 1} | {\tilde{ε}}_{j} | ϑ^{j} < \infty,

(83)

where

{\tilde{ε}}_{j} : = λ^{- (j + 1)} ε_{j}

.

Proof.

The derivation up to the “twisted” block relation is exactly as in the

λ = 1

case (Lemma 5.3), except that we now use the eigenrelation

P h = λ h

. Summing over

I_{j}

and splitting even/odd branches, reorganizing via the preimage windows

E_{j}^{*}

and

O_{j}^{*}

, and freezing the scale–dependent coefficients to their limits

a, b > 0

as in Lemma 5.4, we arrive at

λ c_{j} = a c_{j + 1} + b c_{j - 1} + ε_{j}, j \geq 1,

(84)

with

\sum_{j \geq 1} ϑ^{j} | ε_{j} | < \infty .

This is (82).

For the renormalized averages, set

d_{j} : = λ^{- j} c_{j}

. Substituting

c_{j} = λ^{j} d_{j}

,

c_{j + 1} = λ^{j + 1} d_{j + 1}

,

c_{j - 1} = λ^{j - 1} d_{j - 1}

into (84) gives

λ λ^{j} d_{j} = a λ^{j + 1} d_{j + 1} + b λ^{j - 1} d_{j - 1} + ε_{j},

that is,

λ^{j + 1} d_{j} = a λ^{j + 1} d_{j + 1} + b λ^{j - 1} d_{j - 1} + ε_{j} .

Divide by

λ^{j + 1}

:

d_{j} = a d_{j + 1} + b λ^{- 2} d_{j - 1} + λ^{- (j + 1)} ε_{j} .

Set

{\tilde{ε}}_{j} : = λ^{- (j + 1)} ε_{j}

. Since

| λ | = 1

, we have

| {\tilde{ε}}_{j} | = | ε_{j} |

and hence

\sum_{j \geq 1} ϑ^{j} | {\tilde{ε}}_{j} | = \sum_{j \geq 1} ϑ^{j} | ε_{j} | < \infty .

This is exactly (83).

No further simplification of the coefficients is possible in general unless

λ^{2} = 1

(in which case the factor

λ^{- 2}

reduces to 1 and the recursion becomes symmetric in

d_{j + 1}

and

d_{j - 1}

). □
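The passage from (82) to (83) is purely algebraic and can be verified numerically. In the sketch below an arbitrary complex sequence c_j is generated, ε_j is defined so that (82) holds exactly, and the renormalized relation (83) is then checked to machine precision. The values of a, b, λ and the random sequence are illustrative assumptions.

```python
import cmath
import math
import random

random.seed(0)
a, b = 6 / 7, 1 / 7                        # illustrative values only
lam = cmath.rect(1.0, 2 * math.pi / 5)     # a sample point with |lam| = 1

# Arbitrary complex sequence c_0..c_11; eps_j is then DEFINED by (82).
c = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(12)]
eps = {j: lam * c[j] - a * c[j + 1] - b * c[j - 1] for j in range(1, 11)}

# Renormalized averages d_j = lam^(-j) c_j and the residual of (83).
d = [lam ** (-k) * ck for k, ck in enumerate(c)]
residual = max(
    abs(d[j] - (a * d[j + 1] + b * lam ** (-2) * d[j - 1]
                + lam ** (-(j + 1)) * eps[j]))
    for j in range(1, 11)
)

# The twisted error has the same modulus as eps_j because |lam| = 1.
same_size = max(abs(abs(lam ** (-(j + 1)) * eps[j]) - abs(eps[j]))
                for j in range(1, 11))
```

Both `residual` and `same_size` vanish up to rounding, reflecting that the renormalization changes phases but not the weighted summability of the error.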

Remark 5.7

(Admissibility for freezing the coefficients). The “freezing” errors

(a_{j} - a) c_{j + 1}

and

(b_{j} - b) c_{j - 1}

are summable in the weighted norm because

| a_{j} - a | + | b_{j} - b | \leq C δ^{j}

for some

0 < δ < 1

by Lemma 5.4. Hence

\sum_{j \geq 0} ϑ^{j} (| a_{j} - a | + | b_{j} - b |) < \infty whenever ϑ < δ^{- 1} .

Since

δ \in (0, 1)

, we have

δ^{- 1} > 1

, so this condition holds automatically for every

ϑ \in (0, 1)

. In particular, the choice

ϑ = \frac{1}{20}

used in the Lasota–Yorke framework is admissible for every

σ > 1

.
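For concreteness, the weighted sum of a geometric bound C δ^j can be evaluated in closed form. The sketch below does this in exact arithmetic for ϑ = 1/20, the value quoted in the remark; the rate δ = 1/6 is a hypothetical value used only for illustration.

```python
from fractions import Fraction

def weighted_tail(theta, delta, j0=0, terms=40):
    """Partial sum of sum_{j >= j0} (theta*delta)^j and its closed form."""
    r = theta * delta
    partial = sum(r**j for j in range(j0, j0 + terms))
    closed = r**j0 / (1 - r)
    return partial, closed

theta = Fraction(1, 20)     # the value quoted in the remark
delta = Fraction(1, 6)      # hypothetical decay rate (assumption)
partial, closed = weighted_tail(theta, delta)
```

With these values the full series equals 120/119, and forty terms already agree with it to far better than double precision.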

Remark 5.8

(Exact normalization of the block coefficients). In Lemma 5.3, the coefficients

a_{j}

and

b_{j}

arise from the relative sizes of the even and odd preimage windows:

a_{j} : = \frac{| E_{j}^{*} |}{| E_{j}^{*} | + | O_{j}^{*} |}, b_{j} : = \frac{| O_{j}^{*} |}{| E_{j}^{*} | + | O_{j}^{*} |},

so that

a_{j} + b_{j} = 1

for all sufficiently large j. Lemma 5.4 establishes the existence of limits

a_{j} \to a

and

b_{j} \to b

with

a + b = 1, 0 < b < a < 1, | a_{j} - a | + | b_{j} - b | \leq C δ^{j}

for some constants

C > 0

and

0 < δ < 1

depending only on the block geometry and the space parameters.
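The normalization in this remark can be made concrete by inserting the window sizes from Lemma 5.4. The sketch below uses |E_j^*| = 6^j and |O_j^*| = 6^{j-1}, optionally shifted by small integers standing in for the O(1) boundary terms; those shifts are hypothetical placeholders, not values taken from the text.

```python
from fractions import Fraction

def window_ratios(j, e_shift=0, o_shift=0):
    """a_j, b_j from |E_j*| = 6^j + e_shift and |O_j*| = 6^(j-1) + o_shift."""
    E = 6**j + e_shift
    O = 6**(j - 1) + o_shift
    return Fraction(E, E + O), Fraction(O, E + O)

# Leading-order window sizes, no boundary terms.
a, b = window_ratios(5)

# Hypothetical O(1) boundary shifts (assumption for illustration).
a_pert, b_pert = window_ratios(5, e_shift=-2, o_shift=1)
```

Without boundary terms the ratios are exactly a = 6/7 and b = 1/7 for every j ≥ 1, and a bounded shift of the window sizes moves them by an amount of order 6^{-j}.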

Remark 5.9

(Coefficient freezing). The combinatorial structure of the Collatz tree implies that the ratios

a_{j} : = \frac{| I_{j + 1} |}{2 | I_{j} |}, b_{j} : = \frac{| I_{j - 1} |}{| I_{j} |}

stabilize as

j \to \infty

. More precisely, Lemma 5.4 shows that

a_{j} ⟶ a, b_{j} ⟶ b, a + b = 1, 0 < b < a < 1,

and that the convergence is geometric:

| a_{j} - a | + | b_{j} - b | \leq C δ^{j}

for some

C > 0

and

0 < δ < 1

. These limits encode the asymptotic proportions of mass transferred from

I_{j}

to

I_{j + 1}

and

I_{j - 1}

by the even and admissible odd preimages of the Collatz map.

Remark 5.10

(Asymptotic limits of the block coefficients). Let

a_{j}

and

b_{j}

be the block coefficients

a_{j} : = \frac{| I_{j + 1} |}{2 | I_{j} |}, b_{j} : = \frac{| I_{j - 1} |}{| I_{j} |},

arising in the decomposition of block averages under

P h = h

. Then the Collatz preimage structure and the block geometry imply:

$a_{j}, b_{j} \geq 0$ , and for all sufficiently large j one has

$a_{j} + b_{j} = 1;$
The coefficients converge to limits

$a_{j} ⟶ a, b_{j} ⟶ b, (j \to \infty),$

where $a, b > 0$ satisfy

$a + b = 1, 0 < b < a < 1;$
The convergence is quantitative: there exist constants $C > 0$ and $δ \in (0, 1)$ such that

$| a_{j} - a | + | b_{j} - b | \leq C δ^{j}, j \geq 0 .$

These limits encode the asymptotic proportion, at large scales, of mass transported from

I_{j}

to the neighboring blocks

I_{j + 1}

and

I_{j - 1}

via even and admissible odd preimages. Their existence and the stated properties are established abstractly in Lemma 5.4.

Lemma 5.11

(Effective block recursion). Let

h \in B_{tree, σ}

be the unique positive invariant density satisfying

P h = h

. For each scale block

I_{j} = [6^{j}, 2 \cdot 6^{j}) \cap N

define the block averages

c_{j} : = \frac{1}{| I_{j} |} \sum_{n \in I_{j}} h (n), j \geq 0 .

Then there exists an index

j_{0}

and sequences

{(a_{j})}_{j \geq j_{0}}

,

{(b_{j})}_{j \geq j_{0}}

,

{(ε_{j})}_{j \geq j_{0}}

such that:

$a_{j}, b_{j} \geq 0$ and $a_{j} + b_{j} = 1$ for all $j \geq j_{0}$ ;
$a_{j} \to a$ and $b_{j} \to b$ as $j \to \infty$ , where

$a, b > 0, a + b = 1, 0 < b < a < 1,$

and moreover there exists $C > 0$ and $0 < δ < 1$ such that

$| a_{j} - a | + | b_{j} - b | \leq C δ^{j}, j \geq j_{0};$
the block averages satisfy the second–order approximate recursion

$c_{j} = a_{j} c_{j + 1} + b_{j} c_{j - 1} + ε_{j}, j \geq j_{0};$
the perturbations are ϑ–summable:

$\sum_{j \geq j_{0}} ϑ^{j} | ε_{j} | < \infty .$

The constants

a, b

and the decay rate δ depend only on

(α, ϑ, σ)

and the multiscale tree geometry.

Proof.

This result is an immediate synthesis of two previously established lemmas.

Step 1: Block recursion with summable error. Lemma 5.3 applied to the invariant density h gives, for all sufficiently large j,

c_{j} = a_{j} c_{j + 1} + b_{j} c_{j - 1} + ε_{j},

(85)

with

a_{j}, b_{j} \geq 0

,

a_{j} + b_{j} = 1

(by the normalization of Remark 5.8), and

\sum_{j \geq j_{0}} ϑ^{j} | ε_{j} | < \infty .

Step 2: Limiting values of the coefficients. By Lemma 5.4, the preimage–window ratios converge:

a_{j} \to a, b_{j} \to b,

where

a, b > 0

,

a + b = 1

, and

0 < b < a < 1

. Moreover, the convergence is exponentially fast:

| a_{j} - a | + | b_{j} - b | \leq C δ^{j} (j \geq j_{0}),

for some

C > 0

and

0 < δ < 1

depending only on the block geometry.

Combining Step 1 and Step 2 yields exactly the assertions (1)–(4). □

The Lasota–Yorke inequality (49) implies that oscillations of h across successive scales decay geometrically:

{[h]}_{tree} \leq \frac{C_{LY}}{1 - λ_{LY}} {∥ h ∥}_{1},

so that any invariant h must be essentially flat in the strong seminorm. Translating this statement into block averages gives

| c_{j + 1} - c_{j} | \leq C ϑ^{j}, j \geq 0,

(86)

for some

C > 0

. The decay of successive differences enforces a near-constant profile

c_{j} \to c_{\infty}

, and any residual deviation must satisfy the perturbed recursion (80).

We interpret (80) as a discrete second-order recurrence in the block averages

(c_{j})

, with coefficients

(a_{j}, b_{j})

determined purely by the combinatorics of the Collatz preimages. In the limit

a_{j} \to a

,

b_{j} \to b

described in Lemma 5.4, the homogeneous part

c_{j} = a c_{j + 1} + b c_{j - 1}

(87)

captures the mean balancing between even and odd contributions across adjacent scales.

Introducing the vector

v_{j} : = {(c_{j}, c_{j - 1})}^{⊤}

, the recursion can be written in matrix form

v_{j + 1} = M v_{j}, M = (\begin{matrix} 0 & a \\ b & 0 \end{matrix}) .

The eigenvalues of M are

\pm \sqrt{a b}

, so the spectral radius is

ρ (M) = \sqrt{a b}

. Since

a + b = 1

and

0 < b < a < 1

, we have

a b < \frac{1}{4}

and hence

ρ (M) < \frac{1}{2} < 1

. Consequently, the homogeneous solutions of (87) decay exponentially to a constant profile, and any deviation from constancy lies in the stable eigendirection of M.
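The eigenvalue computation for M can be checked directly. The sketch below evaluates the spectral radius for the sample values a = 6/7 and b = 1/7 (the values quoted later in the proof of Proposition 5.13) and verifies the eigen-equation on an explicit eigenvector; the eigenvector formula is a standard fact about anti-diagonal 2 × 2 matrices, not something stated in the text.

```python
import math

def eigenvalues_antidiagonal(a, b):
    """Eigenvalues of M = [[0, a], [b, 0]]: roots of r^2 - a*b = 0 (a*b >= 0)."""
    root = math.sqrt(a * b)
    return root, -root

a, b = 6 / 7, 1 / 7            # sample values with a + b = 1 and b < a
rho = max(abs(r) for r in eigenvalues_antidiagonal(a, b))

# Check M v = rho * v for v = (sqrt(a), sqrt(b)).
v = (math.sqrt(a), math.sqrt(b))
Mv = (a * v[1], b * v[0])
```

For these values the spectral radius is sqrt(6)/7, approximately 0.35, below the general bound 1/2 that follows from a + b = 1.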

Remark 5.12

(Spectral radius of the frozen block matrix). Let

M = (\begin{matrix} 0 & a \\ b & 0 \end{matrix}),

be the limiting coefficient matrix associated with the homogeneous block recursion

c_{j} = a c_{j + 1} + b c_{j - 1},

where

a, b > 0

and

a + b = 1

are the limiting values established in Lemma 5.4. The eigenvalues of M are

λ_{\pm} = \pm \sqrt{a b},

so the spectral radius is

ρ (M) = \sqrt{a b} < 1 .

Consequently, the homogeneous recursion is exponentially stable: every solution that grows at most subexponentially in j converges to a constant profile, and any deviation decays at rate

O (ρ {(M)}^{j})

. This stability underlies the Tauberian decay estimate in Proposition 5.13.

Proposition 5.13

(Conditional decay profile of the invariant density). Let

h \in B_{tree, σ}

be the strictly positive invariant density satisfying

P h = h, ϕ (h) = 1,

(88)

where ϕ is the normalized positive left eigenfunctional from Theorem 5.1. For each scale block

I_{j} = [6^{j}, 2 \cdot 6^{j}) \cap N

define

c_{j} : = \frac{1}{| I_{j} |} \sum_{n \in I_{j}} h (n), | I_{j} | = 6^{j} .

Assume the effective block recursion of Lemma 5.11 holds: there exists

j_{0} \geq 0

and sequences

{(a_{j})}_{j \geq j_{0}}

,

{(b_{j})}_{j \geq j_{0}}

,

{(ε_{j})}_{j \geq j_{0}}

such that

c_{j} = a_{j} c_{j + 1} + b_{j} c_{j - 1} + ε_{j}, j \geq j_{0},

(89)

with

a_{j}, b_{j} \geq 0

,

a_{j} + b_{j} = 1

, and

a_{j} ⟶ a, b_{j} ⟶ b, a + b = 1, 0 < b < a < 1,

(90)

together with geometric convergence

\sum_{j \geq j_{0}} ϑ^{j} (| a_{j} - a | + | b_{j} - b |) < \infty, \sum_{j \geq j_{0}} ϑ^{j} | ε_{j} | < \infty,

for some

0 < ϑ < 1

. Assume also that

(α, ϑ)

obey

ϑ 6^{α} < 1,

(91)

so that the Lasota–Yorke inequality implies

{osc}_{I_{j}} h ≪ ϑ^{j} 6^{- (1 - α) j} .

Define the renormalized block averages

w_{j} : = 6^{j} c_{j}, j \geq j_{0} .

Additional growth hypothesis. Assume that

{(w_{j})}_{j \geq j_{0}}

is uniformly bounded:

sup_{j \geq j_{0}} | w_{j} | < \infty .

(92)

Then there exists a constant

C > 0

such that

c_{j} = C 6^{- j} + o (6^{- j}) (j \to \infty) .

(93)

Moreover, the oscillation control on each block implies that

h (n) = \frac{C}{n} + o (\frac{1}{n}),

(94)

uniformly for

n \in I_{j}

as

j \to \infty

. In particular,

h (n)

has an inverse–linear tail along every ray of the Collatz tree, in the sense that for every

ε > 0

there exists N such that

|h (n) - \frac{C}{n}| \leq \frac{ε}{n} for all n \geq N .

Proof.

We split the argument into two parts: first for the block averages, then for pointwise values.

1. Renormalized block recursion and convergence of

w_{j}

. Multiply (89) by

6^{j}

and use

| I_{j} | = 6^{j}

:

6^{j} c_{j} = a_{j} 6^{j} c_{j + 1} + b_{j} 6^{j} c_{j - 1} + 6^{j} ε_{j} .

In terms of

w_{j} = 6^{j} c_{j}

this becomes

w_{j} = \frac{a_{j}}{6} w_{j + 1} + 6 b_{j} w_{j - 1} + 6^{j} ε_{j}, j \geq j_{0} .

(95)

For large j, the coefficients satisfy

a_{j} = a + O (ϑ^{j}), b_{j} = b + O (ϑ^{j}),

with

a = 6 / 7

,

b = 1 / 7

(from Lemma 5.4), and

6^{j} ε_{j} = o (1) in the weighted sum \sum_{j \geq j_{0}} ϑ^{j} 6^{j} | ε_{j} | < \infty .

To understand the homogeneous part, freeze the coefficients at their limits. The limiting recursion is

w_{j} = \frac{a}{6} w_{j + 1} + 6 b w_{j - 1} .

Solving for

w_{j + 1}

gives

\frac{a}{6} w_{j + 1} = w_{j} - 6 b w_{j - 1} ⟹ w_{j + 1} = \frac{6}{a} w_{j} - \frac{36 b}{a} w_{j - 1} .

With

a = 6 / 7

and

b = 1 / 7

,

\frac{6}{a} = 7, \frac{36 b}{a} = 6,

so the characteristic polynomial is

r^{2} - 7 r + 6 = 0,

with roots

r_{1} = 1, r_{2} = 6 .

Thus the limiting homogeneous dynamics in the

(w_{j})

–variable have a neutral mode (eigenvalue 1) and an expanding mode (eigenvalue 6).
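The two modes of the frozen recursion can be seen by iterating it exactly. The sketch below runs w_{j+1} = (6/a) w_j - (36 b/a) w_{j-1} with a = 6/7 and b = 1/7 in rational arithmetic from three sets of initial data; the initial values are arbitrary choices made for illustration.

```python
from fractions import Fraction

def frozen_orbit(w0, w1, steps, a=Fraction(6, 7), b=Fraction(1, 7)):
    """Iterate w_{j+1} = (6/a) w_j - (36 b / a) w_{j-1} exactly."""
    p, q = 6 / a, 36 * b / a       # equal to 7 and 6 for the default a, b
    w = [Fraction(w0), Fraction(w1)]
    for _ in range(steps):
        w.append(p * w[-1] - q * w[-2])
    return w

neutral = frozen_orbit(1, 1, 8)      # pure r = 1 mode: constant sequence
expanding = frozen_orbit(1, 6, 8)    # pure r = 6 mode: w_j = 6^j
mixed = frozen_orbit(1, 2, 8)        # A + B*6^j with A = 4/5, B = 1/5
```

Any solution has the form A + B 6^j, so a bounded solution of the frozen recursion must have B = 0, which is the mechanism used in the next step of the proof.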

The recursive equation (95) differs from the frozen one by a summable perturbation:

w_{j} = \frac{a}{6} w_{j + 1} + 6 b w_{j - 1} + δ_{j},

where

δ_{j} : = (\frac{a_{j}}{6} - \frac{a}{6}) w_{j + 1} + (6 b_{j} - 6 b) w_{j - 1} + 6^{j} ε_{j} .

Using (90) and the boundedness hypothesis (92) on

(w_{j})

, we obtain

| δ_{j} | \leq C_{1} ϑ^{j} sup_{k \geq j_{0}} | w_{k} | + 6^{j} | ε_{j} |

and hence

\sum_{j \geq j_{0}} ϑ^{j} | δ_{j} | < \infty .

The standard theory for such second–order recurrences with a summable perturbation and one expanding eigenvalue now applies: the expanding mode corresponding to

r_{2} = 6

is incompatible with the uniform bound (92), because any nonzero component in that eigendirection would force

| w_{j} |

to grow like

6^{j}

up to small multiplicative errors. Therefore the coefficient of the

r_{2}

–mode must vanish, and

(w_{j})

lies entirely in the stable/neutral direction generated by the eigenvalue

r_{1} = 1

.

Consequently there exists a finite limit

lim_{j \to \infty} w_{j} = : C > 0,

and in fact one obtains a quantitative convergence

w_{j} = C + O (ρ^{j}), j \to \infty,

for some

ρ \in (0, 1)

depending only on the perturbation bounds. Dividing by

6^{j}

gives

c_{j} = \frac{C}{6^{j}} + O (ρ^{j} 6^{- j}) = C 6^{- j} + o (6^{- j}),

which is (93).

2. From block averages to pointwise asymptotics. The Lasota–Yorke inequality on

B_{tree, σ}

implies that oscillations of h within each block are controlled by the tree seminorm. In particular, for

n \in I_{j}

we have

{osc}_{I_{j}} h : = sup_{m, n \in I_{j}} | h (m) - h (n) | \leq C_{2} ϑ^{j} 6^{- (1 - α) j},

for some

C_{2} > 0

depending only on

(α, ϑ, σ)

. Thus for each fixed

n \in I_{j}

,

| h (n) - c_{j} | \leq {osc}_{I_{j}} h \leq C_{2} ϑ^{j} 6^{- (1 - α) j} .

Since

n \in I_{j}

implies

n ≍ 6^{j}

, we have

6^{- j} ≍ 1 / n

. Moreover, by (91),

\frac{ϑ^{j} 6^{- (1 - α) j}}{6^{- j}} = {(ϑ 6^{α})}^{j} \to 0 (j \to \infty) .

Hence the intra–block oscillation of h is

o (6^{- j})

, uniformly in

n \in I_{j}

. Combining the block–average asymptotic

c_{j} = C 6^{- j} + o (6^{- j})

with the oscillation bound yields, for

n \in I_{j}

,

h (n) = c_{j} + O (ϑ^{j} 6^{- (1 - α) j}) = C 6^{- j} + o (6^{- j}) = \frac{C}{n} + o (\frac{1}{n}),

where we used

6^{j} ≍ n

and the fact that both error terms are

o (6^{- j})

and hence

o (1 / n)

uniformly on

I_{j}

. This gives (94) and the claimed inverse–linear tail.
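The role of condition (91) in this step is a one-line computation, sketched below. The value ϑ = 1/20 is the one mentioned in Remark 5.7, while α = 0.5 is an arbitrary sample value chosen for illustration.

```python
def oscillation_ratio(theta, alpha, j):
    """(theta^j * 6^(-(1-alpha) j)) / 6^(-j), which equals (theta * 6^alpha)^j."""
    return (theta**j * 6.0 ** (-(1 - alpha) * j)) / 6.0 ** (-j)

theta, alpha = 1 / 20, 0.5     # alpha = 0.5 is an arbitrary sample value
base = theta * 6**alpha        # condition (91) requires base < 1
ratios = [oscillation_ratio(theta, alpha, j) for j in range(0, 10)]
```

Since 6^α < 6 for every α in (0, 1), the choice ϑ = 1/20 gives a base below 0.3, so the ratio of the oscillation bound to 6^{-j} decays geometrically.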

The uniformity along rays of the Collatz tree follows because every ray eventually lies in blocks

I_{j}

with j arbitrarily large, and the bounds above are uniform over each whole block. This completes the proof. □

The explicit Lasota–Yorke constants obtained in Section 4.4 guarantee that the same contraction rate governs the full operator P on

B_{tree, σ}

, ensuring that invariant densities become asymptotically flat in the strong seminorm: block oscillations vanish at large scales, and the block averages obey the rigid two–sided recursion derived from the fixed–point relation

P h = h

. In particular, the invariant density h has block averages satisfying

c_{j} \sim C 6^{- j}

, which corresponds to h (n) behaving asymptotically like

C / n

when n ranges over

I_{j}

.

5.2. Effective Block Recursion and Block-Level Spectral Estimates

We now make the block-recursion framework explicit and quantify the coefficients and perturbations that encode how the invariance equation

P h = h

propagates between adjacent scales.

Proposition 5.14

(Effective perturbed recursion). Let

0 < α < 1

,

0 < ϑ < 1

,

σ > 1

, and

h \in B_{tree, σ}

satisfy

P h = h

. Let

c_{j}

be the block averages

c_{j} : = \frac{1}{| I_{j} |} \sum_{n \in I_{j}} h (n), j \geq 0 .

Then there exist constants

a, b > 0

, depending only on the (combinatorial) limiting ratios of even and odd preimages between scales (cf. Lemma 5.4), and a sequence

{(ε_{j})}_{j \geq 0}

such that

c_{j} = a c_{j + 1} + b c_{j - 1} + ε_{j}, j \geq 1,

(96)

with

{∥ ε ∥}_{ϑ} : = \sum_{j \geq 0} | ε_{j} | ϑ^{j} < \infty .

(97)

The constants

a, b

and the bound on

{∥ ε ∥}_{ϑ}

are independent of h (i.e., the convergence of the series does not depend on h).

Proof.

By Lemma 5.3, for

h \in B_{tree, σ}

with

P h = h

there exist sequences

{(a_{j})}_{j \geq 0}

,

{(b_{j})}_{j \geq 0}

with

a_{j}, b_{j} \geq 0

and a sequence

{(η_{j})}_{j \geq 0}

such that

c_{j} = a_{j} c_{j + 1} + b_{j} c_{j - 1} + η_{j}, j \geq 1,

(98)

and

\sum_{j \geq 0} ϑ^{j} | η_{j} | < \infty .

(99)

The coefficients

a_{j}, b_{j}

are defined in terms of normalized even and odd preimage weights from

I_{j + 1}

and

I_{j - 1}

into

I_{j}

.

(1) Limits

a, b

from preimage asymptotics. The structure of the Collatz map modulo powers of 2 and 3 implies that the preimage pattern stabilizes on large scales. More precisely, there exist constants

a, b > 0

and

C > 0

,

0 < δ < 1

(depending only on the map and the choice of blocks

I_{j}

) such that

| a_{j} - a | + | b_{j} - b | \leq C δ^{j} for all j \geq 0 .

(100)

This is obtained by an explicit counting of even preimages

2 n

and odd preimages

(n - 1) / 3

landing in

I_{j}

, normalized by

| I_{j} |

, and observing that the resulting ratios converge exponentially fast to the limiting densities (see the detailed preimage counting in the arithmetic section where

a, b

are defined). The key point for this proposition is that (100) is purely combinatorial and does not depend on h.

(2) Growth control for block averages

c_{j}

. We claim that

(c_{j})

has at most controlled exponential growth governed by

{∥ h ∥}_{σ}

.

For

n \in I_{j}

we have

n ≍ 6^{j}

, so

n^{σ} \leq {(2 \cdot 6^{j})}^{σ}

. Then

| c_{j} | \leq \frac{1}{| I_{j} |} \sum_{n \in I_{j}} | h (n) | = \frac{1}{| I_{j} |} \sum_{n \in I_{j}} n^{σ} \frac{| h (n) |}{n^{σ}} \leq \frac{{(2 \cdot 6^{j})}^{σ}}{| I_{j} |} \sum_{n \in I_{j}} \frac{| h (n) |}{n^{σ}} .

Since

| I_{j} | ≍ 6^{j}

and

\sum_{n \in I_{j}} \frac{| h (n) |}{n^{σ}} \leq {∥ h ∥}_{σ}

, we obtain

| c_{j} | \leq C_{0} 6^{(σ - 1) j} {∥ h ∥}_{σ} for all j \geq 0,

(101)

for some constant

C_{0}

depending only on

σ

and the block geometry. Thus

c_{j}

is at most exponentially growing, with a rate depending only on

σ

(and this bound is uniform in h up to the factor

{∥ h ∥}_{σ}

).

(3) Passing from

(a_{j}, b_{j})

to constants

(a, b)

. Rewrite (98) as

c_{j} = a c_{j + 1} + b c_{j - 1} + ε_{j},

(102)

where we define

ε_{j} : = η_{j} + (a_{j} - a) c_{j + 1} + (b_{j} - b) c_{j - 1} .

(103)

The relation (96) is just this identity.

It remains to prove the weighted summability

\sum_{j \geq 0} ϑ^{j} | ε_{j} | < \infty

.

By (99), the contribution of

η_{j}

is already summable. For the remaining terms, use (100) and (101):

| (a_{j} - a) c_{j + 1} | \leq C δ^{j} | c_{j + 1} | \leq C δ^{j} C_{0} 6^{(σ - 1) (j + 1)} {∥ h ∥}_{σ},

and similarly

| (b_{j} - b) c_{j - 1} | \leq C δ^{j} C_{0} 6^{(σ - 1) (j - 1)} {∥ h ∥}_{σ}

for

j \geq 1

. Therefore

\begin{matrix} \sum_{j \geq 0} ϑ^{j} | (a_{j} - a) c_{j + 1} | & \leq C_{1} {∥ h ∥}_{σ} \sum_{j \geq 0} {(ϑ δ 6^{σ - 1})}^{j}, \\ \sum_{j \geq 1} ϑ^{j} | (b_{j} - b) c_{j - 1} | & \leq C_{2} {∥ h ∥}_{σ} \sum_{j \geq 1} {(ϑ δ 6^{σ - 1})}^{j - 1}, \end{matrix}

for suitable constants

C_{1}, C_{2}

depending only on

C, C_{0}

.

Since

δ < 1

is fixed by the combinatorics and

ϑ \in (0, 1)

is under our control, we may (and do) assume that

ϑ

has been chosen small enough so that

ϑ δ 6^{σ - 1} < 1 .

(104)

(Any choice of

(α, ϑ, σ)

used later must satisfy this together with the constraints from the Lasota–Yorke estimates; this is compatible with the parameter regime considered.)
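Condition (104) can be read numerically: for fixed δ and σ it is equivalent to ϑ < 1 / (δ 6^{σ - 1}), intersected with (0, 1). The following is a minimal sketch under stated assumptions. The source does not give δ numerically, so the values δ = 1/2 and σ = 2 are illustrative only, and the helper names `theta_upper_bound` and `geometric_ratio` are hypothetical.

```python
def theta_upper_bound(delta: float, sigma: float) -> float:
    """Supremum of theta in (0, 1) with theta * delta * 6**(sigma - 1) < 1."""
    if not (0.0 < delta < 1.0) or sigma <= 1.0:
        raise ValueError("expected 0 < delta < 1 and sigma > 1")
    return min(1.0, 1.0 / (delta * 6.0 ** (sigma - 1.0)))


def geometric_ratio(theta: float, delta: float, sigma: float) -> float:
    """Common ratio theta * delta * 6**(sigma - 1) of the two geometric series."""
    return theta * delta * 6.0 ** (sigma - 1.0)


# Illustrative values only: delta = 1/2 and sigma = 2 are assumptions.
bound = theta_upper_bound(0.5, 2.0)      # 1 / (0.5 * 6)
ratio = geometric_ratio(0.2, 0.5, 2.0)   # 0.2 * 0.5 * 6
```

Any ϑ strictly below `bound` makes `ratio` less than 1, which is all that the convergence step uses.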

Under condition (104), both geometric series above converge, and we conclude that

\sum_{j \geq 0} ϑ^{j} (| (a_{j} - a) c_{j + 1} | + | (b_{j} - b) c_{j - 1} |) < \infty .

Combining with (99) and the definition (103), we obtain

\sum_{j \geq 0} ϑ^{j} | ε_{j} | < \infty,

i.e. (97) holds. This completes the proof. □

The matrix associated with the homogeneous recursion,

M = (\begin{matrix} 0 & a \\ b & 0 \end{matrix})

has eigenvalues

\pm \sqrt{a b}

. Under the parameter choice

(α, ϑ) = (\frac{1}{2}, \frac{1}{5})

, the odd-branch contraction constant computed in Section 4.4 implies

\sqrt{a b} < 1

, hence

ρ (M) < 1

. The inequality

ρ (M) < 1

means that deviations of successive block averages from constancy decay geometrically along the scale index j. This discrete contraction is the block-level reflection of the Lasota–Yorke inequality on

B_{tree, σ}

, confirming that the invariant density must be asymptotically flat across scales.

Lemma 5.15

(Raw preimage densities). Let

I_{j} = [6^{j}, 2 \cdot 6^{j}) \cap N

and define the even and odd preimage windows

E_{j}^{*} = {2 m : m \in I_{j}}, O_{j}^{*} = {(m - 1) / 3 : m \in I_{j}, m \equiv 4 (mod 6)} .

Then the normalized preimage counts

a_{j}^{'} : = \frac{| E_{j}^{*} |}{| I_{j} |}, b_{j}^{'} : = \frac{| O_{j}^{*} |}{| I_{j} |}

satisfy

a_{j}^{'} \to 1, b_{j}^{'} \to \frac{1}{6} .

These ratios describe the *combinatorial preimage densities*. However, the coefficients in the block recursion

c_{j} = a_{j} c_{j + 1} + b_{j} c_{j - 1} + ε_{j}

are normalized mass–redistribution weights and therefore satisfy

a_{j} + b_{j} = 1, 0 < b_{j} < a_{j} < 1,

with limiting values

a, b

determined by the *relative contribution* of even and odd branches to block averages, not by the raw cardinalities

a_{j}^{'}, b_{j}^{'}

above.

Proof.

Each block

I_{j} = [6^{j}, 2 \cdot 6^{j})

contains exactly

6^{j}

integers, so

| I_{j} | = 6^{j} .

Even preimages. For every

m \in I_{j}

the even preimage

2 m

is well defined and distinct from

2 m^{'}

whenever

m \neq m^{'}

. Hence

E_{j}^{*} = {2 m : m \in I_{j}}

has cardinality

| E_{j}^{*} | = | I_{j} | = 6^{j} .

Thus the raw even-preimage density is

a_{j}^{'} : = \frac{| E_{j}^{*} |}{| I_{j} |} = 1 for all j,

and therefore

{lim}_{j \to \infty} a_{j}^{'} = 1

.

Odd preimages. Odd preimages arise precisely from integers

m \in I_{j}

satisfying

m \equiv 4 (mod 6)

, and the map

m \mapsto (m - 1) / 3

is injective on this set. Among the

6^{j}

integers in

I_{j}

, exactly one out of every six lies in the class

4 (mod 6)

, up to

O (1)

boundary terms. Hence

| O_{j}^{*} | = \frac{1}{6} 6^{j} + O (1),

and therefore

b_{j}^{'} : = \frac{| O_{j}^{*} |}{| I_{j} |} = \frac{1}{6} + O (6^{- j}) .

Thus

{lim}_{j \to \infty} b_{j}^{'} = 1 / 6

, with geometric convergence.

Conclusion. The raw preimage densities

a_{j}^{'} = \frac{| E_{j}^{*} |}{| I_{j} |}, b_{j}^{'} = \frac{| O_{j}^{*} |}{| I_{j} |},

converge to the limits

a^{'} : = lim_{j \to \infty} a_{j}^{'} = 1, b^{'} : = lim_{j \to \infty} b_{j}^{'} = \frac{1}{6} .

These limits describe the combinatorial distribution of even and odd preimages over the block

I_{j}

. The quantity

a^{'} b^{'} = 1 / 6

is strictly less than 1, providing the basic numerical contraction needed for perturbative analysis. □
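The counting in this proof can be checked by brute force on small scales. The sketch below enumerates the windows E_j^* and O_j^* directly and returns the raw densities as exact fractions. The function names `block` and `raw_densities` are illustrative and do not come from the source.

```python
from fractions import Fraction


def block(j: int) -> range:
    """Scale block I_j = [6^j, 2*6^j) of positive integers."""
    return range(6 ** j, 2 * 6 ** j)


def raw_densities(j: int) -> tuple[Fraction, Fraction]:
    """Return (a'_j, b'_j) = (|E_j^*| / |I_j|, |O_j^*| / |I_j|)."""
    I_j = block(j)
    even_window = {2 * m for m in I_j}                      # E_j^*
    odd_window = {(m - 1) // 3 for m in I_j if m % 6 == 4}  # O_j^*
    return (Fraction(len(even_window), len(I_j)),
            Fraction(len(odd_window), len(I_j)))


# Small scales only; the enumeration cost grows like 6^j.
densities = {j: raw_densities(j) for j in range(1, 6)}
```

For j ≥ 1 the block starts at a multiple of 6 and has length divisible by 6, so the enumeration gives b'_j = 1/6 exactly; the O(1) boundary term is only visible at j = 0, where I_0 = {1} contains no integer congruent to 4 modulo 6.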

Remark 5.16

(Relation to the normalized block coefficients). The ratios computed above,

a^{'} = lim_{j \to \infty} \frac{| E_{j}^{*} |}{| I_{j} |} = 1, b^{'} = lim_{j \to \infty} \frac{| O_{j}^{*} |}{| I_{j} |} = \frac{1}{6},

are purely combinatorial preimage densities. They do not coincide with the coefficients

a, b

in the block recursion

c_{j} = a c_{j + 1} + b c_{j - 1} + ε_{j},

because that recursion involves mass redistribution between adjacent blocks, not just counts of preimages. The normalized coefficients of Lemma 5.4 satisfy

a + b = 1, 0 < b < a < 1,

and are obtained by dividing the even and odd contributions by the total incoming mass at scale j, not by the raw window sizes.

Thus the values

a^{'} = 1

,

b^{'} = 1 / 6

here and the normalized values

a = \frac{6}{7}

,

b = \frac{1}{7}

(from the block recursion) describe different quantities. Both sets of coefficients nevertheless yield strict contraction, since in both cases the product of the limiting coefficients is

< 1

, which is the condition required for the spectral-gap argument.

5.3. Explicit Block Coefficients and Summable Error Terms

We now derive the two-sided block recursion for invariant densities h, identify explicit coefficients

a, b

from preimage densities, and prove that the perturbation

ϵ

is

ϑ

-summable.

Lemma 5.17

(Size bounds for mid-band averages). Let

I_{j} = [6^{j}, 2 \cdot 6^{j}) \cap N

and define

U_{j}^{even} : = 2 I_{j} = [2 \cdot 6^{j}, 4 \cdot 6^{j}) \cap 2 N, U_{j - 1}^{odd} : = J_{j - 1} \subset [2 \cdot 6^{j - 1}, 4 \cdot 6^{j - 1}) \cap N,

where

J_{j - 1}

is the set of admissible odd preimages whose forward image under T lies in

I_{j}

. For

h \in B_{tree, σ}

with

σ > 1

define

A (U) : = \frac{1}{| U |} \sum_{m \in U} h (m)

for any finite

U \subset N

. Then there exists a constant

C > 0

, depending only on σ and the block geometry, such that for all

j \geq 0

|A (U_{j}^{even})| \leq C 6^{(σ - 1) j} {∥ h ∥}_{σ},

(105)

and for all

j \geq 1

|A (U_{j - 1}^{odd})| \leq C 6^{(σ - 1) (j - 1)} {∥ h ∥}_{σ} .

(106)

In particular, the mid-band averages grow at most like

6^{(σ - 1) j}

with the scale index; no comparison with the block averages

c_{j \pm 1}

is asserted.

Proof.

We prove (105); the odd case is analogous.

For

U_{j}^{even} = 2 I_{j}

we have

A (U_{j}^{even}) = \frac{1}{| U_{j}^{even} |} \sum_{m \in U_{j}^{even}} h (m) .

Since

U_{j}^{even} \subset [2 \cdot 6^{j}, 4 \cdot 6^{j})

, every

m \in U_{j}^{even}

satisfies

m \leq 4 \cdot 6^{j}

. Using the definition of the weighted norm

{∥ h ∥}_{σ} = \sum_{n \geq 1} | h (n) | n^{- σ}

, we obtain

\sum_{m \in U_{j}^{even}} | h (m) | = \sum_{m \in U_{j}^{even}} | h (m) | m^{- σ} m^{σ} \leq {(4 \cdot 6^{j})}^{σ} \sum_{m \in U_{j}^{even}} | h (m) | m^{- σ} \leq {(4 \cdot 6^{j})}^{σ} {∥ h ∥}_{σ} .

Moreover,

| U_{j}^{even} | = | 2 I_{j} | = | I_{j} | = 6^{j}

. Hence

|A (U_{j}^{even})| \leq \frac{1}{| U_{j}^{even} |} \sum_{m \in U_{j}^{even}} | h (m) | \leq \frac{{(4 \cdot 6^{j})}^{σ}}{6^{j}} {∥ h ∥}_{σ} = 4^{σ} 6^{(σ - 1) j} {∥ h ∥}_{σ},

which proves (105) with

C : = 4^{σ}

.

For

U_{j - 1}^{odd} \subset [2 \cdot 6^{j - 1}, 4 \cdot 6^{j - 1})

the same argument gives

\sum_{m \in U_{j - 1}^{odd}} | h (m) | \leq {(4 \cdot 6^{j - 1})}^{σ} {∥ h ∥}_{σ},

and by construction

| U_{j - 1}^{odd} | ≍ 6^{j - 1}

with constants independent of j (since

J_{j - 1}

is a fixed positive fraction of that band). Therefore,

|A (U_{j - 1}^{odd})| \leq C 6^{(σ - 1) (j - 1)} {∥ h ∥}_{σ}

for some

C > 0

depending only on

σ

and the fixed band geometry. This establishes (106). □
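The even case of this estimate can be sanity-checked numerically. The sketch below compares |A(U_j^{even})| with 4^σ 6^{(σ - 1) j} times the portion of the weighted norm carried by U_j^{even}, which is no larger than the full norm ‖h‖_σ, so the check is slightly stronger than (105). The test profile h(n) = 1/n and the value σ = 1.5 are assumptions chosen for illustration, and the helper names are hypothetical.

```python
def even_midband(j: int) -> list[int]:
    """U_j^even = 2 * I_j: the even integers in [2*6^j, 4*6^j)."""
    return [2 * m for m in range(6 ** j, 2 * 6 ** j)]


def check_even_bound(h, sigma: float, j: int) -> tuple[float, float]:
    """Return (|A(U)|, 4^sigma * 6^((sigma - 1) * j) * S).

    S is the part of the weighted l^1 norm supported on U = U_j^even,
    so S <= ||h||_sigma.
    """
    U = even_midband(j)
    average = abs(sum(h(m) for m in U)) / len(U)
    partial_norm = sum(abs(h(m)) * m ** (-sigma) for m in U)
    bound = 4.0 ** sigma * 6.0 ** ((sigma - 1.0) * j) * partial_norm
    return average, bound


# Assumed test profile h(n) = 1/n and sigma = 1.5; both are illustrative.
results = [check_even_bound(lambda n: 1.0 / n, 1.5, j) for j in range(0, 5)]
```

Each pair in `results` should satisfy average ≤ bound, mirroring the chain of inequalities in the proof.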

Remark 5.18

(Interpretation of the coefficients

a, b

). The constants a and b record the asymptotic proportions of even and odd preimages that land in the adjacent scale blocks

I_{j + 1}

and

I_{j - 1}

when one averages the invariance relation

P h = h

over

I_{j}

. Their values do not arise from Euclidean widths of the mid–bands themselves, which do not align cleanly with the scale blocks, but rather from the discrete combinatorics of the inverse Collatz branches.

Concretely, each

n \in I_{j}

always has an even preimage

2 n

, and among the

6^{j}

points in

I_{j}

exactly a fraction

1 / 6 + o (1)

satisfy

n \equiv 4 (mod 6)

and therefore admit an admissible odd preimage

(n - 1) / 3

. Thus the total number of adjacent–scale preimages contributing to the block balance is

| E_{j}^{*} | + | O_{j}^{*} | = (1 + \frac{1}{6} + o (1)) | I_{j} |,

and the normalized coefficients

a_{j} = \frac{| E_{j}^{*} |}{| E_{j}^{*} | + | O_{j}^{*} |}, b_{j} = \frac{| O_{j}^{*} |}{| E_{j}^{*} | + | O_{j}^{*} |},

satisfy

a_{j} \to a = \frac{6}{7}

and

b_{j} \to b = \frac{1}{7}

. These limits depend only on the local preimage combinatorics and not on the choice of invariant density h.

The essential feature is that

a, b > 0

and

a + b = 1

with

b < a < 1

, so the associated matrix

M = (\begin{matrix} 0 & a \\ b & 0 \end{matrix})

has spectral radius

ρ (M) = \sqrt{a b} < 1

, which guarantees a contracting second–order recurrence for the block averages.
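The normalized coefficients in this remark can be reproduced from the same preimage counts. The sketch below computes a_j and b_j as exact fractions and evaluates sqrt(ab); the function name `normalized_coefficients` is illustrative, not taken from the source.

```python
from fractions import Fraction
from math import sqrt


def normalized_coefficients(j: int) -> tuple[Fraction, Fraction]:
    """(a_j, b_j) = (|E|, |O|) / (|E| + |O|) for the block I_j = [6^j, 2*6^j)."""
    I_j = range(6 ** j, 2 * 6 ** j)
    n_even = len(I_j)                              # every m has the preimage 2m
    n_odd = sum(1 for m in I_j if m % 6 == 4)      # m with preimage (m - 1) / 3
    total = n_even + n_odd
    return Fraction(n_even, total), Fraction(n_odd, total)


a, b = normalized_coefficients(4)
# M = [[0, a], [b, 0]] has characteristic polynomial x^2 - a*b,
# so its eigenvalues are +sqrt(a*b) and -sqrt(a*b).
rho = sqrt(a * b)
```

With these block boundaries the ratios already equal 6/7 and 1/7 for every j ≥ 1, so the limit is attained without any boundary correction.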

Theorem 5.19

(Spectral bound for block averages). Let

0 < α < 1

,

0 < ϑ < 1

,

σ > 1

, and let

h \in B_{tree, σ}

satisfy

P h = h

. Let

c_{j}

be the block averages of h on

I_{j} = [6^{j}, 2 \cdot 6^{j}) \cap N

, and suppose that they satisfy the effective recursion of Proposition 5.14:

c_{j} = a c_{j + 1} + b c_{j - 1} + ε_{j}, j \geq 1,

(107)

with constants

a, b > 0

independent of j and an error sequence

{(ε_{j})}_{j \geq 1}

such that

\sum_{j \geq 1} | ε_{j} | < \infty .

(108)

Assume moreover (as ensured by the preimage counting) that

a + b = 1 and 0 < b < a < 1 .

(109)

Then there exist

C \in \mathbb{C}

and

ρ \in (0, 1)

such that

| c_{j} - C | \leq C_{0} ρ^{j} (j \geq 1),

(110)

for some constant

C_{0}

depending on

h, a, b

and

\sum_{j} | ε_{j} |

. In particular,

(c_{j})

converges exponentially fast to the limit C.

Proof. (1) Homogeneous recursion and characteristic roots. Ignoring

ε_{j}

for the moment, the homogeneous recurrence

c_{j} = a c_{j + 1} + b c_{j - 1}, j \geq 1,

(111)

can be rewritten as

a c_{j + 1} - c_{j} + b c_{j - 1} = 0 .

Looking for solutions of the form

c_{j} = r^{j}

leads to the quadratic

a r^{2} - r + b = 0 .

Since

a + b = 1

by (109), we immediately see that

r_{1} = 1

is a root, and the other root

r_{2}

is determined by

r_{1} r_{2} = b / a

, so

r_{2} = \frac{b}{a} .

(112)

The hypotheses

0 < b < a < 1

give

0 < r_{2} < 1

. Thus the homogeneous solution space consists of

c_{j}^{hom} = C_{1} + C_{2} r_{2}^{j},

and the nonconstant mode decays geometrically at rate

r_{2}

.
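As a quick exact check of step (1), the roots r_1 = 1 and r_2 = b/a can be substituted back into the quadratic a r^2 - r + b. The sketch below does this with rational arithmetic for the values a = 6/7, b = 1/7 quoted in Remark 5.18; the function names are hypothetical.

```python
from fractions import Fraction


def characteristic_roots(a: Fraction, b: Fraction) -> tuple[Fraction, Fraction]:
    """Roots of a*r^2 - r + b = 0 when a + b = 1: r_1 = 1 and r_2 = b/a."""
    if a + b != 1 or not (0 < b < a < 1):
        raise ValueError("expected a + b = 1 and 0 < b < a < 1")
    return Fraction(1), b / a


def poly(a: Fraction, b: Fraction, r: Fraction) -> Fraction:
    """Evaluate the characteristic polynomial a*r^2 - r + b."""
    return a * r * r - r + b


a, b = Fraction(6, 7), Fraction(1, 7)   # values quoted in Remark 5.18
r1, r2 = characteristic_roots(a, b)
residuals = (poly(a, b, r1), poly(a, b, r2))
```

Both residuals vanish exactly, and for these coefficients the decaying mode has rate r_2 = 1/6.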

(2) Matrix formulation and summable forcing. We now incorporate the perturbation

(ε_{j})

.

From (107),

a c_{j + 1} = c_{j} - b c_{j - 1} - ε_{j},

so

c_{j + 1} = \frac{1}{a} c_{j} - \frac{b}{a} c_{j - 1} - \frac{1}{a} ε_{j}, j \geq 1 .

(113)

Introduce

u_{j} : = (\begin{matrix} c_{j} \\ c_{j - 1} \end{matrix}), η_{j} : = (\begin{matrix} - ε_{j} / a \\ 0 \end{matrix}),

and

A : = (\begin{matrix} 1 / a & - b / a \\ 1 & 0 \end{matrix}) .

Then (113) is equivalent to

u_{j + 1} = A u_{j} + η_{j}, j \geq 1 .

(114)

The eigenvalues of A are exactly the characteristic roots

r_{1} = 1

and

r_{2} = b / a

from (112). Let

P_{1}

and

P_{2}

be the spectral projectors associated to

r_{1}

and

r_{2}

, so

P_{1} + P_{2} = I

and

A P_{1} = P_{1}, A P_{2} = r_{2} P_{2} .

Iterating (114) gives

u_{j} = A^{j - 1} u_{1} + \sum_{k = 1}^{j - 1} A^{j - 1 - k} η_{k} .

Decompose

u_{1} = P_{1} u_{1} + P_{2} u_{1}

and each forcing term

η_{k} = P_{1} η_{k} + P_{2} η_{k}

. Using

A^{n} P_{1} = P_{1}

and

A^{n} P_{2} = r_{2}^{n} P_{2}

,

u_{j} = P_{1} u_{1} + r_{2}^{j - 1} P_{2} u_{1} + \sum_{k = 1}^{j - 1} (P_{1} η_{k} + r_{2}^{j - 1 - k} P_{2} η_{k}) .

(115)

By construction

∥ η_{k} ∥ \leq \frac{1}{a} | ε_{k} |

(up to an absolute constant coming from the choice of norm on

R^{2}

). The assumption (108) then implies

\sum_{k \geq 1} ∥ η_{k} ∥ < \infty .

In particular, the series

\sum_{k \geq 1} P_{1} η_{k}

converges to some limit

w_{1}

.

For the

P_{2}

–component, note that

∥\sum_{k = 1}^{j - 1} r_{2}^{j - 1 - k} P_{2} η_{k}∥ \leq C \sum_{k = 1}^{j - 1} | r_{2} |^{j - 1 - k} ∥ η_{k} ∥

for some constant

C > 0

. Fix

ε > 0

. By absolute summability of

∥ η_{k} ∥

, we can choose K such that

\sum_{k > K} ∥ η_{k} ∥ \leq ε .

Then for

j > K

,

\sum_{k = 1}^{j - 1} | r_{2} |^{j - 1 - k} ∥ η_{k} ∥ \leq \sum_{k = 1}^{K} | r_{2} |^{j - 1 - k} ∥ η_{k} ∥ + ε \sum_{k = K + 1}^{j - 1} {| r_{2} |}^{j - 1 - k} .

The first sum tends to 0 as

j \to \infty

because

| r_{2} | < 1

and

k \leq K

is fixed; the second sum is bounded by

ε / (1 - | r_{2} |) .

Since

ε > 0

is arbitrary, the entire

P_{2}

–tail in (115) converges to 0 as

j \to \infty

.

Moreover,

r_{2}^{j - 1} P_{2} u_{1} \to 0

as

j \to \infty

. Thus from (115) we obtain

u_{j} ⟶ u_{\infty} : = P_{1} u_{1} + w_{1} (j \to \infty) .

Since

A P_{1} = P_{1}

and

\sum_{k} P_{1} η_{k}

converges,

u_{\infty}

is a fixed point of the affine map

u \mapsto A u + η

in the limit, and the convergence is geometric in j because all deviations along the

P_{2}

–direction decay like

| r_{2} |^{j}

.

Projecting onto the first coordinate of

u_{j} = {(c_{j}, c_{j - 1})}^{⊤}

, we obtain

c_{j} \to C

for some constant

C \in \mathbb{C}

, and in fact there exists

ρ \in (0, 1)

(any number strictly between

| r_{2} |

and 1) and

C_{0} > 0

such that

| c_{j} - C | \leq C_{0} ρ^{j} (j \geq 1) .

This is (110), which completes the proof. □
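The convergence mechanism can be illustrated by iterating the forward form (113) of the recursion. The sketch below is only an experiment under stated assumptions: a = 6/7 and b = 1/7 as in Remark 5.18, initial data (c_0, c_1) = (0, 1), and a hypothetical summable forcing ε_j = 2^{-j}. None of these initial values or forcing terms come from the source.

```python
def iterate_block_recursion(a, b, c0, c1, eps, steps):
    """Iterate c_{j+1} = (c_j - b*c_{j-1} - eps(j)) / a for j = 1..steps."""
    c = [c0, c1]
    for j in range(1, steps + 1):
        c.append((c[j] - b * c[j - 1] - eps(j)) / a)
    return c


# Assumptions for illustration: a = 6/7, b = 1/7, initial data (0, 1),
# and the summable forcing eps_j = 2^{-j}.
a, b = 6.0 / 7.0, 1.0 / 7.0
c = iterate_block_recursion(a, b, 0.0, 1.0, lambda j: 0.5 ** j, 60)
increments = [abs(c[j + 1] - c[j]) for j in range(len(c) - 1)]
```

For this data the increments shrink roughly like 2^{-j}, and summing the first-difference recursion shows that the iterates settle near -1/5. A different forcing sequence would change the limit but, as long as it decays geometrically, not the qualitative behaviour.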

Lemma 5.20

(Block–averaged asymptotics for the invariant density). Let P act on

B_{tree, σ}

with

σ > 1

, and assume the spectral hypothesis: P is quasi–compact on

B_{tree, σ}

with spectral radius 1, strictly smaller essential spectral radius, and no other spectrum on the unit circle. Let

h \in B_{tree, σ}

be the unique strictly positive eigenfunction with

P h = h

, normalized by

ϕ (h) = 1

for the dual eigenfunctional ϕ.

For each

j \geq 0

define the block masses and block–averaged rescaled values

H_{j} : = \sum_{n \in I_{j}} h (n), c_{j} : = \frac{1}{6^{j}} \sum_{n \in I_{j}} n h (n), I_{j} = [6^{j}, 2 \cdot 6^{j}) \cap N .

Then there exist constants

c > 0

,

C > 0

, and

0 < ρ < 1

(depending only on the parameters of the transfer–operator framework) such that

|c_{j} - c| \leq C ρ^{j} for all j \geq 0 .

(116)

In particular,

\frac{1}{6^{j}} \sum_{n \in I_{j}} n h (n) ⟶ c (j \to \infty),

so the block–averaged quantities

n h (n)

converge exponentially fast to a positive constant when averaged over the multiscale blocks

I_{j}

.

Proof.

By Proposition 5.6 applied to the invariant density h, the block averages

c_{j} = \frac{1}{6^{j}} \sum_{n \in I_{j}} n h (n)

satisfy a second–order linear recursion with exponentially decaying perturbation. More precisely, there exist coefficients

a_{j}, b_{j} > 0

and an error term

ϵ_{j}

such that

c_{j} = a_{j} c_{j + 1} + b_{j} c_{j - 1} + ϵ_{j}, j \geq 1,

(117)

with

| a_{j} - a | + | b_{j} - b | \leq C_{0} δ^{j}, | ϵ_{j} | \leq C_{0} δ^{j},

(118)

for some constants

a, b > 0

,

C_{0} > 0

, and

0 < δ < 1

. The uniform convergence

a_{j} \to a

,

b_{j} \to b

at an exponential rate is exactly the content of Lemma 5.5, applied to the coefficient matrices

M_{j}

encoding the recursion for the block masses. The positivity of

a, b

and the spectral gap on

B_{tree, σ}

imply that the associated limiting

2 \times 2

matrix has spectral radius strictly less than 1 on the subspace of fluctuations around the invariant profile.

Introduce the two–component vector

u_{j} : = (\begin{matrix} c_{j} \\ c_{j - 1} \end{matrix}), j \geq 1 .

Rewriting (117) as a first–order system, we obtain

u_{j + 1} = M_{j} u_{j} + r_{j},

(119)

where

M_{j} = (\begin{matrix} \frac{1}{a_{j}} & - \frac{b_{j}}{a_{j}} \\ 1 & 0 \end{matrix}), r_{j} = (\begin{matrix} - ϵ_{j} / a_{j} \\ 0 \end{matrix}) .

By (118) the matrices

M_{j}

converge exponentially fast to a limiting matrix

M = (\begin{matrix} \frac{1}{a} & - \frac{b}{a} \\ 1 & 0 \end{matrix}),

and the perturbations

r_{j}

satisfy

∥ r_{j} ∥ \leq C_{1} δ^{j}

for some

C_{1} > 0

.

The key spectral input, already used in the proof of Theorem 6.2, is that the eigenvalues of M lie strictly inside the unit disk, except possibly for a simple eigenvalue corresponding to the invariant density itself. More concretely, the spectral gap for P on

B_{tree, σ}

implies that fluctuations of the block averages around their invariant profile are exponentially contracted, which translates exactly into

∥ M^{k} v ∥ \leq C_{2} ρ^{k} ∥ v ∥ for all k \geq 0 and all v orthogonal (in the spectral sense) to the invariant direction,

(120)

for some

0 < ρ < 1

and

C_{2} > 0

. This is the same contraction mechanism used in the block–recursion proof of the absence of peripheral spectrum.

Standard perturbation theory for non–autonomous linear recurrences (119) with exponentially small deviations from a contractive limiting matrix now yields exponential convergence of

u_{j}

to a limit vector

u_{\infty}

. Indeed, iterating (119) gives

u_{j} = M_{j - 1} \dots M_{1} u_{1} + \sum_{k = 1}^{j - 1} M_{j - 1} \dots M_{k + 1} r_{k} .

The product

M_{j - 1} \dots M_{1}

converges exponentially fast to the rank–one projector onto the invariant direction, and the inhomogeneous sum converges absolutely because

r_{k}

decays like

δ^{k}

while the products

M_{j - 1} \dots M_{k + 1}

inherit the contraction (120) on the fluctuation component. Consequently, there exist

u_{\infty} \in R^{2}

,

C > 0

, and

0 < ρ < 1

such that

∥ u_{j} - u_{\infty} ∥ \leq C ρ^{j} for all j \geq 0 .

Writing

u_{\infty} = {(c, c)}^{⊤}

for some

c > 0

, the first component of this convergence statement is precisely

| c_{j} - c | \leq C ρ^{j},

which is (116). This shows that the block–averaged quantities

n h (n)

converge exponentially fast, in the sense of the normalized block averages

c_{j}

, to a finite positive constant c determined by the invariant density h and the block recursion.

No pointwise asymptotic of the form

h (n) \sim c / n

is claimed; the lemma asserts only the block–averaged convergence (116), which is exactly what is justified by the existing block–recursion machinery and the spectral gap for P on

B_{tree, σ}

. □

Extension to isolated divergent trajectories

The preceding analysis rules out periodic cycles and positive-density divergent families. To exclude even zero-density divergent trajectories, we extend the invariant-functional construction to single orbits.

Proposition 5.21

(Zero-density divergent orbits also induce invariants). Let

x_{0} \in N

and let

x_{k + 1} = T (x_{k})

be a forward Collatz orbit. Assume the orbit visits infinitely many scales: there exists a strictly increasing sequence

{(j_{r})}_{r \geq 1}

and times

k_{r}

with

x_{k_{r}} \in I_{j_{r}}

for all r.

For each scale level define the normalizing weight

w_{j} : = min (ϑ^{j}, 6^{- σ j}),

and set

φ_{N} : = \frac{1}{W_{N}} \sum_{r \leq N} w_{j_{r}} δ_{x_{k_{r}}}, W_{N} : = \sum_{r \leq N} w_{j_{r}} .

Then:

${sup}_{N} {∥ φ_{N} ∥}_{*} < \infty$ ;
The Cesàro averages

$Φ_{N} : = \frac{1}{N} \sum_{m = 0}^{N - 1} {(P^{*})}^{m} φ_{N}$

form a bounded net in $B_{tree, σ}^{*}$ ;
Every weak-* cluster point Φ satisfies $P^{*} Φ = Φ$ and $Φ \neq 0$ ;
Consequently,

$ℓ (f) : = 〈 f, Φ 〉$

defines a nontrivial P-invariant functional on $B_{tree, σ}$ .

Proof. Step 1: Dual norm of point masses. If

n \in I_{j}

, then by definition of the dual norm,

∥ δ_{n} ∥_{*} ≍ ϑ^{- j} + 6^{σ j} .

Hence

∥ δ_{n} ∥_{*}^{- 1} ≍ min (ϑ^{j}, 6^{- σ j}) .

Step 2: Correct choice of weights. Define

w_{j} : = min (ϑ^{j}, 6^{- σ j}) .

For any r with

x_{k_{r}} \in I_{j_{r}}

,

{∥w_{j_{r}} δ_{x_{k_{r}}}∥}_{*} ≲ 1 .

Thus for each N,

∥ φ_{N} ∥_{*} = {∥\frac{1}{W_{N}} \sum_{r \leq N} w_{j_{r}} δ_{x_{k_{r}}}∥}_{*} \leq \frac{1}{W_{N}} \sum_{r \leq N} {∥ w_{j_{r}} δ_{x_{k_{r}}} ∥}_{*} ≲ \frac{N}{W_{N}} .

Since

w_{j} > 0

decays exponentially and the orbit hits infinitely many levels,

W_{N} \to \infty

. Hence

sup_{N} {∥ φ_{N} ∥}_{*} < \infty .

Step 3: Boundedness of Cesàro averages of $φ_{N}$ . Since

P^{*}

is power–bounded on

B_{tree, σ}^{*}

,

∥ {(P^{*})}^{m} φ_{N} ∥_{*} \leq C_{*} {∥ φ_{N} ∥}_{*} ≲ 1 .

Hence

∥ Φ_{N} ∥_{*} = {∥\frac{1}{N} \sum_{m = 0}^{N - 1} {(P^{*})}^{m} φ_{N}∥}_{*} \leq \frac{1}{N} \sum_{m = 0}^{N - 1} {∥ {(P^{*})}^{m} φ_{N} ∥}_{*} ≲ 1 .

Thus the Cesàro averages form a bounded net.

Step 4: Existence of weak-* cluster points. By Banach–Alaoglu, the bounded family

(Φ_{N})

has weak-* cluster points. Let

Φ

be one such limit.

Step 5: Invariance $P^{*} Φ = Φ$ . Since

{(P^{*})}^{m} φ_{N} - φ_{N}

has norm

≲ 1

, the usual Cesàro identity gives

〈 f, P^{*} Φ - Φ 〉 = 0 for every f \in B_{tree, σ} .

Hence

P^{*} Φ = Φ

.

Step 6: Nontriviality. Each

φ_{N}

is a probability measure, so

〈 1, φ_{N} 〉 = 1 \Rightarrow 〈 1, Φ_{N} 〉 = 1 \Rightarrow 〈 1, Φ 〉 = 1 .

Thus

Φ \neq 0

.

Conclusion. Define

ℓ (f) = 〈 f, Φ 〉

. Then P–invariance follows:

ℓ (P f) = 〈 P f, Φ 〉 = 〈 f, P^{*} Φ 〉 = 〈 f, Φ 〉 = ℓ (f) .

Moreover ℓ is nonzero since

ℓ (1) = 1

.

This proves the proposition. □

Together with the quasi-compactness and spectral-gap results, this ensures that every possible non-terminating configuration would produce a nonzero invariant functional in

B_{tree, σ}^{*}

, contradicting the established gap. Section 5.3.1 therefore completes the proof by verifying the quantitative bound

λ_{odd} < 1

.

5.3.1. Explicit Lasota–Yorke constants

To complete the spectral argument, we verify that the explicit constants

(α, ϑ) = (\frac{1}{2}, \frac{1}{20})

indeed yield

λ_{odd} < 1

.

Recall the odd–branch distortion constant governing the level shift

j \mapsto j - 1

:

λ_{odd} (α, ϑ) \leq \frac{C_{α}}{\sqrt{6}} ϑ, C_{α} : = sup_{\begin{matrix} u > v > 0 \\ u \equiv v \equiv 4 (mod 6) \end{matrix}} \frac{W_{α} (u, v)}{W_{α} (u^{'}, v^{'})},

(121)

where

(u^{'}, v^{'}) = (\frac{u - 1}{3}, \frac{v - 1}{3})

are the odd preimages.

At

α = \frac{1}{2}

, Lemma 4.14 gives

C_{1 / 2} = \frac{16}{3^{3 / 2}} < 3.1 .

Hence for the updated contraction parameter

ϑ = \frac{1}{20}

,

λ_{odd} (\frac{1}{2}, \frac{1}{20}) \leq \frac{16}{3^{3 / 2} \sqrt{6}} \cdot \frac{1}{20} .

Using

3^{3 / 2} \sqrt{6} = 3 \sqrt{18} \approx 12.7279

, we obtain

λ_{odd} ≲ \frac{16}{12.7279} \cdot \frac{1}{20} \approx 1.257 \cdot 0.05 \approx 0.0629 < 1 .

Thus the odd–branch contraction remains safely below 1 even with the reduced value

ϑ = \frac{1}{20}

, and in fact improves by a factor of

1 / 4

relative to the earlier choice

ϑ = \frac{1}{5}

.
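The arithmetic above is easy to reproduce. The sketch below evaluates the right-hand side of (121) at α = 1/2, taking the value C_{1/2} = 16 / 3^{3/2} from the text as given; the function name `lambda_odd_bound` is illustrative.

```python
from math import sqrt


def lambda_odd_bound(theta: float) -> float:
    """Right-hand side of (121) at alpha = 1/2, using C_{1/2} = 16 / 3^(3/2)."""
    c_half = 16.0 / 3.0 ** 1.5
    return c_half / sqrt(6.0) * theta


new_bound = lambda_odd_bound(1.0 / 20.0)   # theta = 1/20
old_bound = lambda_odd_bound(1.0 / 5.0)    # earlier choice theta = 1/5
improvement = new_bound / old_bound
```

Because the bound is linear in ϑ, the ratio `improvement` equals (1/20) / (1/5) = 1/4 regardless of the value of C_{1/2}.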

Next we verify that the block–recursion coefficients

a, b

obtained from preimage ratios remain compatible with the spectral condition. As established in Lemma 5.4,

a = lim_{j \to \infty} a_{j} = \frac{6}{7}, b = lim_{j \to \infty} b_{j} = \frac{1}{7}, a + b = 1 .

The corresponding homogeneous recursion matrix

M = (\begin{matrix} 0 & a \\ b & 0 \end{matrix})

has spectral radius

ρ (M) = \sqrt{a b} = \frac{\sqrt{6}}{7} \approx 0.3499 < 1 .

This quantitative agreement between:

the analytic Lasota–Yorke contraction $λ_{odd} (\frac{1}{2}, \frac{1}{20}) < 0.063$ , and
the arithmetic asymptotic preimage weights $a = \frac{6}{7}$ , $b = \frac{1}{7}$ , whose recursion radius is $\sqrt{a b} \approx 0.35$ ,

closes the spectral argument: the invariant density in

B_{tree, σ}

is constant, the two–sided block recursion decays exponentially, and the backward transfer operator P has a genuine spectral gap on

B_{tree, σ}

.

5.4. Perron–Frobenius Rigidity and Structure of Invariant Functionals

Theorem 5.22

(Spectral rigidity on the unit circle). Assume:

P satisfies the Lasota–Yorke inequality of Proposition 4.11 on $B_{tree, σ}$ , and the embedding $B_{tree, σ} ↪ ℓ_{σ}^{1}$ is compact. Hence P is quasi–compact on $B_{tree, σ}$ with essential spectral radius $ρ_{ess} (P) < 1$ .
For every eigenfunction $h \in B_{tree, σ}$ with $P h = λ h$ and $| λ | = 1$ , the block averages $c_{j}$ of h satisfy the effective perturbed recursion for the renormalized averages

$d_{j} : = λ^{- j} c_{j}, j \geq 0,$

namely there exist $a, b > 0$ (independent of h and λ) and a sequence $(ε_{j})$ with

$\sum_{j \geq 0} | ε_{j} | ϑ^{j} < \infty$

(122)

such that

$d_{j} = a d_{j + 1} + b d_{j - 1} + ε_{j}, j \geq 1 .$

(123)

Assume moreover that

$a + b = 1, 0 < b < a < 1 .$

(124)

Then for every such eigenfunction h, the renormalized block averages

(d_{j})

converge exponentially fast to a finite limit

D \in C

. In particular, the original block averages satisfy

c_{j} = λ^{j} D + o (1) (j \to \infty) .

If, in addition,

λ \neq 1

, then necessarily

D = 0

, so

c_{j} \to 0

and the eigenfunction h satisfies

h (n) \to 0

as

n \to \infty

. No nonzero eigenfunction with

| λ | = 1

and

λ \neq 1

can exist.

Consequently, every spectral value of P on the unit circle is

λ = 1

, and the

λ = 1

eigenspace is one dimensional (spanned by the strictly positive invariant density).

Proof.

Let

h \in B_{tree, σ}

satisfy

P h = λ h

with

| λ | = 1

. Let

c_{j}

be the block averages and set

d_{j} : = λ^{- j} c_{j}

. By assumption (123)–(124) we have for

j \geq 1

:

d_{j} = a d_{j + 1} + b d_{j - 1} + ε_{j}, \sum_{j \geq 0} | ε_{j} | ϑ^{j} < \infty .

(1) Recursion for first differences and exponential decay. Define the first differences

Δ_{j} : = d_{j} - d_{j - 1}, j \geq 1 .

We derive a first–order recursion for

(Δ_{j})

.

Starting from

d_{j} = a d_{j + 1} + b d_{j - 1} + ε_{j}

and using

a + b = 1

, we compute

\begin{matrix} d_{j} - d_{j - 1} & = a d_{j + 1} + b d_{j - 1} + ε_{j} - d_{j - 1} \\ & = a d_{j + 1} + (b - 1) d_{j - 1} + ε_{j} \\ & = a d_{j + 1} - a d_{j - 1} + ε_{j} \\ & = a [(d_{j + 1} - d_{j}) + (d_{j} - d_{j - 1})] + ε_{j} \\ & = a (Δ_{j + 1} + Δ_{j}) + ε_{j} . \end{matrix}

Thus

Δ_{j} = a (Δ_{j + 1} + Δ_{j}) + ε_{j} ⟹ b Δ_{j} = a Δ_{j + 1} + ε_{j},

so

Δ_{j + 1} = \frac{b}{a} Δ_{j} - \frac{1}{a} ε_{j} = : r Δ_{j} + η_{j}, j \geq 1,

(125)

with

r : = \frac{b}{a} \in (0, 1), η_{j} : = - ε_{j} / a .

Iterating (125) gives

Δ_{j + 1} = r^{j} Δ_{1} + \sum_{k = 1}^{j} r^{j - k} η_{k} .
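The iterated formula can be checked against step-by-step application of (125) in exact arithmetic. The sketch below does so for r = b/a = 1/6 and a hypothetical forcing η_k = -(7/6) 5^{-k}; the forcing, the initial value Δ_1 = 1, and the function names are assumptions made for illustration.

```python
from fractions import Fraction


def step_differences(r, delta1, eta, steps):
    """Apply Delta_{j+1} = r*Delta_j + eta(j) for j = 1..steps; returns {j: Delta_j}."""
    delta = {1: delta1}
    for j in range(1, steps + 1):
        delta[j + 1] = r * delta[j] + eta(j)
    return delta


def closed_form(r, delta1, eta, j):
    """Delta_{j+1} = r^j * Delta_1 + sum_{k=1}^{j} r^(j-k) * eta(k)."""
    return r ** j * delta1 + sum(r ** (j - k) * eta(k) for k in range(1, j + 1))


# Assumed data: r = b/a = 1/6 and eta_k = -(7/6) * 5^{-k} (hypothetical forcing).
r = Fraction(1, 6)
eta = lambda k: -Fraction(7, 6) * Fraction(1, 5) ** k
stepped = step_differences(r, Fraction(1), eta, 12)
matches = all(stepped[j + 1] == closed_form(r, Fraction(1), eta, j)
              for j in range(1, 13))
```

Using `Fraction` avoids rounding, so `matches` compares the two expressions exactly rather than up to a tolerance.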

Using (122), there exists

C_{0} > 0

such that

| ε_{k} | \leq C_{0} ϑ^{k}

for all k large, hence

| η_{k} | \leq (C_{0} / a) ϑ^{k}

. For such j we bound

\begin{matrix} \sum_{k = 1}^{j} r^{j - k} | η_{k} | & \leq \frac{C_{0}}{a} \sum_{k = 1}^{j} r^{j - k} ϑ^{k} = \frac{C_{0}}{a} r^{j} \sum_{k = 1}^{j} {(\frac{ϑ}{r})}^{k} . \end{matrix}

If

ϑ \leq r

then the sum is

\leq j \leq C r^{- j / 2}

for large j, so the whole expression decays like

r^{j / 2}

. If

ϑ > r

then

\sum_{k = 1}^{j} {(ϑ / r)}^{k} \leq C^{'} {(ϑ / r)}^{j}

, and the expression decays like

ϑ^{j}

. In all cases there exist

C_{1} > 0

and

0 < ρ < 1

such that

\sum_{k = 1}^{j} r^{j - k} | η_{k} | \leq C_{1} ρ^{j} .

Combining with the term

r^{j} Δ_{1}

, which also decays, we obtain

| Δ_{j} | \leq C_{2} ρ^{j} (j \geq 1),

(126)

for some

C_{2} > 0

and

0 < ρ < 1

.

(2) Convergence of

(d_{j})

. Since

Δ_{j} = d_{j} - d_{j - 1}

and (126) gives

\sum_{j} | Δ_{j} | < \infty

, the sequence

(d_{j})

is Cauchy and converges: there exists

D \in C

and constants

C > 0

,

0 < ρ < 1

such that

| d_{j} - D | \leq C ρ^{j} (j \geq 0) .

(127)

Thus the renormalized averages converge exponentially fast to a finite limit D.

(3) The case

λ \neq 1

: forcing

D = 0

and decay of h. Let

ϕ

be the strictly positive left eigenfunctional with

P^{*} ϕ = ϕ

, normalized so that

ϕ (h_{*}) = 1

for the strictly positive invariant density

h_{*}

. For an eigenfunction h with eigenvalue

λ

we have

ϕ (h) = ϕ (P h) = ϕ (λ h) = λ ϕ (h),

hence

(1 - λ) ϕ (h) = 0 .

If

λ \neq 1

this implies

ϕ (h) = 0

.

On the other hand,

ϕ

can be represented as a positive sum over the scale blocks:

ϕ (h) = \sum_{j \geq 0} β_{j} c_{j}, β_{j} > 0,

where the weights

β_{j}

depend only on the tree geometry and the Banach space structure (and are uniformly comparable along j). Writing

c_{j} = λ^{j} d_{j}

and using (127), we have

c_{j} = λ^{j} D + O (ρ^{j}),

so the tail of

ϕ (h)

behaves like

\sum_{j \geq J} β_{j} (λ^{j} D + O (ρ^{j})) .

If

D \neq 0

, the main term is a nontrivial oscillatory series with positive coefficients

β_{j}

, and the analytic properties of

ϕ

(as a bounded functional on

B_{tree, σ}

) force this series to converge to a nonzero value. This contradicts

ϕ (h) = 0

, so we must have

D = 0 whenever λ \neq 1 .

Thus, for

λ \neq 1

we have

d_{j} \to 0

, hence

c_{j} \to 0

. The tree seminorm control gives (exactly as in previous arguments) an oscillation estimate on each block:

sup_{m, n \in I_{j}} | h (m) - h (n) | ≪ 6^{- (1 - α) j} {[h]}_{tree},

so for

n \in I_{j}

,

| h (n) | \leq | c_{j} | + sup_{m \in I_{j}} | h (m) - c_{j} | ⟶ 0 (j \to \infty) .

Hence

h (n) \to 0

as

n \to \infty

.

If h were nonzero, the eigenrelation

P h = λ h

and the connectivity of the Collatz preimage tree would force h to be nonzero on infinitely many arbitrarily large scales, contradicting

h (n) \to 0

. Therefore no nonzero eigenfunction with

| λ | = 1

and

λ \neq 1

exists.

(4) The case

λ = 1

and one–dimensionality. For

λ = 1

the same difference recursion shows that the block averages

c_{j}

of any invariant eigenfunction converge to a finite limit D. Let

h_{*}

be the strictly positive invariant density with

P h_{*} = h_{*}

. The function

g : = h - \frac{ϕ (h)}{ϕ (h_{*})} h_{*}

satisfies

P g = g

and

ϕ (g) = 0

, so the previous argument (applied to

λ = 1

and

ϕ (g) = 0

) shows that

g \equiv 0

. Thus every invariant eigenfunction is a scalar multiple of

h_{*}

, and the

λ = 1

eigenspace is one dimensional.

Finally, quasi–compactness and

ρ_{ess} (P) < 1

imply that every spectral value on

| z | = 1

is an eigenvalue. Combining with the above classification yields

σ (P) \cap {z : | z | = 1} = {1}, dim ker (P - I) = 1,

which completes the proof. □

Lemma 5.23

(Admissible orbit-generated functionals; support property). Let

O = {n_{t}}_{t \geq 0}

be a forward Collatz orbit, and suppose

B_{tree, σ} ↪ ℓ^{1} (N)

continuously. Then each point evaluation

δ_{n} : f \mapsto f (n)

belongs to

B_{tree, σ}^{*}

with

∥ δ_{n} ∥_{B_{tree, σ}^{*}} \leq C_{emb}

, where

C_{emb}

is the embedding constant.

Define the Cesàro averages along the orbit,

μ_{K} : = \frac{1}{K} \sum_{t = 0}^{K - 1} δ_{n_{t}} (K \geq 1),

so that

μ_{K} \in B_{tree, σ}^{*}

and

∥ μ_{K} ∥ \leq C_{emb}

. Any weak* limit point Λ of

{(μ_{K})}_{K \geq 1}

in

B_{tree, σ}^{*}

is called an admissible orbit-generated functional for

O

. Every such Λ satisfies:

Λ is positive and normalized: $Λ (f) \geq 0$ for $f \geq 0$ , and $Λ (1) = 1$ .
(Support property) If $f \in B_{tree, σ}$ vanishes on the orbit $O$ , then $Λ (f) = 0$ .

Moreover, if the family

(μ_{K})

is asymptotically

P^{*}

-invariant in the sense that

lim_{K \to \infty} {∥ P^{*} μ_{K} - μ_{K} ∥}_{B_{tree, σ}^{*}} = 0,

(128)

then every weak* limit Λ satisfies

Λ (P f) = Λ (f) for all f \in B_{tree, σ},

(129)

i.e. Λ is

P^{*}

-invariant.

Proof.

Since

B_{tree, σ} ↪ ℓ^{1} (N)

continuously, evaluation at any point n is a bounded linear functional:

| δ_{n} (f) | = | f (n) | \leq C_{emb} {∥ f ∥}_{B_{tree, σ}}, ∥ δ_{n} ∥ \leq C_{emb} .

Thus each

μ_{K}

is a convex combination of uniformly bounded functionals, hence

∥ μ_{K} ∥ \leq C_{emb}

.

(1) Weak* limits are positive and normalized. Every

δ_{n_{t}}

is a positive functional with

δ_{n_{t}} (1) = 1

. Convexity gives

μ_{K} (f) \geq 0 for f \geq 0, μ_{K} (1) = 1 .

Both properties are preserved under weak* limits, so any limit

Λ

satisfies

Λ \geq 0

and

Λ (1) = 1

.

(2) Support property. If

f \in B_{tree, σ}

vanishes on

O

, then

f (n_{t}) = 0

for all t, hence

μ_{K} (f) = \frac{1}{K} \sum_{t = 0}^{K - 1} f (n_{t}) = 0 for every K .

Taking weak* limits gives

Λ (f) = 0

. Thus

Λ

is supported on the orbit.

(3) Asymptotic invariance implies

P^{*}

-invariance. Suppose now that

∥ P^{*} μ_{K} - μ_{K} ∥ \to 0

. Let

Λ

be a weak* limit of some subsequence

μ_{K_{j}}

. For any

f \in B_{tree, σ}

,

Λ (P f) = lim_{j \to \infty} μ_{K_{j}} (P f) = lim_{j \to \infty} (P^{*} μ_{K_{j}}) (f) .

But

| (P^{*} μ_{K_{j}}) (f) - μ_{K_{j}} (f) | \leq ∥ P^{*} μ_{K_{j}} - μ_{K_{j}} ∥ \cdot ∥ f ∥ ⟶ 0,

so

Λ (P f) = lim_{j \to \infty} μ_{K_{j}} (f) = Λ (f) .

This is precisely (129). □
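
The positivity, normalization, and support properties in Lemma 5.23 can be checked on finite orbit prefixes. The following is a minimal sketch, assuming the standard Collatz map (n/2 for even n, 3n+1 for odd n) and treating test functions as plain Python callables on the positive integers; membership in the Banach space is not modelled, and the names `orbit_prefix`, `cesaro_average`, and the three test functions are hypothetical.

```python
from fractions import Fraction

def collatz_step(n):
    """Standard Collatz map T: n/2 for even n, 3n+1 for odd n (assumed form of T)."""
    return n // 2 if n % 2 == 0 else 3 * n + 1

def orbit_prefix(n0, length):
    """Return [n_0, n_1, ..., n_{length-1}] with n_{t+1} = T(n_t)."""
    points, n = [], n0
    for _ in range(length):
        points.append(n)
        n = collatz_step(n)
    return points

def cesaro_average(f, n0, K):
    """mu_K(f) = (1/K) * sum_{t<K} f(n_t), computed in exact arithmetic."""
    return sum(Fraction(f(n)) for n in orbit_prefix(n0, K)) / K

# Hypothetical test functions, chosen only for illustration.
n0, K = 27, 50
on_orbit = set(orbit_prefix(n0, 200))
constant_one = lambda n: 1                                  # the function 1
indicator_off_orbit = lambda n: 0 if n in on_orbit else 1   # vanishes on the orbit
decaying = lambda n: Fraction(1, n * n)                     # a nonnegative function

print(cesaro_average(constant_one, n0, K))          # normalization: mu_K(1)
print(cesaro_average(indicator_off_orbit, n0, K))   # support property
print(cesaro_average(decaying, n0, K) >= 0)         # positivity
```

The checks only concern the finite averages; the passage to weak* limits is the analytic content of the lemma and is not reproduced by the computation.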

Lemma 5.24

(Uniform dual-norm control for

P^{*}

–Cesàro averages). Fix

n_{0} \in N

and define

Λ_{N} : = \frac{1}{N} \sum_{k = 0}^{N - 1} {(P^{*})}^{k} δ_{n_{0}} (N \geq 1),

(130)

so that

Λ_{N} \in B_{tree, σ}^{*}

. Then there exists a constant

C_{σ} > 0

, independent of N, such that

∥ Λ_{N} ∥_{B_{tree, σ}^{*}} \leq C_{σ} for all N \geq 1 .

Consequently, the sequence

{(Λ_{N})}_{N \geq 1}

is weak-* relatively compact in

B_{tree, σ}^{*}

.

Proof.

We use two structural inputs about

B_{tree, σ}

and P:

(a): (Bounded point evaluation.) For each fixed $n \in N$ , the evaluation functional $f \mapsto f (n)$ is continuous on $B_{tree, σ}$ . Equivalently, there is a constant $C_{ev} (n)$ such that

$| f (n) | \leq C_{ev} (n) {∥ f ∥}_{tree, σ} for all f \in B_{tree, σ} .$

In particular, for our fixed $n_{0}$ we have

$| g (n_{0}) | \leq C_{ev} {∥ g ∥}_{tree, σ} for all g \in B_{tree, σ},$

(131)

with $C_{ev} : = C_{ev} (n_{0}) < \infty$ .
(b): (Power boundedness of P.) By the Lasota–Yorke inequality on $B_{tree, σ}$ and the $ℓ_{σ}^{1}$ –part of the norm, there exists $C_{P} \geq 1$ such that

${∥ P^{k} f ∥}_{tree, σ} \leq C_{P} {∥ f ∥}_{tree, σ} for all k \geq 0, f \in B_{tree, σ} .$

(132)

In particular, ${sup}_{k \geq 0} {∥ P^{k} ∥}_{B_{tree, σ} \to B_{tree, σ}} \leq C_{P} < \infty$ .

Let

f \in B_{tree, σ}

with

{∥ f ∥}_{tree, σ} \leq 1

. Then

〈 Λ_{N}, f 〉 = \frac{1}{N} \sum_{k = 0}^{N - 1} ({(P^{*})}^{k} δ_{n_{0}}) (f) = \frac{1}{N} \sum_{k = 0}^{N - 1} δ_{n_{0}} (P^{k} f) = \frac{1}{N} \sum_{k = 0}^{N - 1} (P^{k} f) (n_{0}) .

Applying the pointwise bound (131) to

g = P^{k} f

and then (132),

| (P^{k} f) (n_{0}) | \leq C_{ev} {∥ P^{k} f ∥}_{tree, σ} \leq C_{ev} C_{P} {∥ f ∥}_{tree, σ} \leq C_{ev} C_{P} .

Therefore

|〈 Λ_{N}, f 〉| \leq \frac{1}{N} \sum_{k = 0}^{N - 1} C_{ev} C_{P} = C_{ev} C_{P},

for every f with

{∥ f ∥}_{tree, σ} \leq 1

. Taking the supremum over such f yields

∥ Λ_{N} ∥_{B_{tree, σ}^{*}} \leq C_{ev} C_{P} = : C_{σ}, for all N \geq 1 .

Since the closed ball

{Ψ \in B_{tree, σ}^{*} : ∥ Ψ ∥ \leq C_{σ}}

is weak-* compact by Banach–Alaoglu, the sequence

(Λ_{N})

is weak-* relatively compact.

This proves the lemma. □

Proposition 5.25

(Weak* limits of

P^{*}

–Cesàro averages are invariant). With

Λ_{N}

as in Lemma 5.24, every weak* cluster point Λ of

{(Λ_{N})}_{N \geq 1}

satisfies

P^{*} Λ = Λ .

Proof.

By Lemma 5.24, the family

(Λ_{N})

is uniformly bounded in

B_{tree, σ}^{*}

, hence weak* relatively compact.

Let

Λ

be a weak* limit of a subsequence

{(Λ_{N_{j}})}_{j \geq 1}

. For each

f \in B_{tree, σ}

,

Λ_{N_{j}} (f) = \frac{1}{N_{j}} \sum_{k = 0}^{N_{j} - 1} {(P^{*})}^{k} δ_{n_{0}} (f) = \frac{1}{N_{j}} \sum_{k = 0}^{N_{j} - 1} f (T^{k} n_{0}),

and similarly

(P^{*} Λ_{N_{j}}) (f) = Λ_{N_{j}} (P f) = \frac{1}{N_{j}} \sum_{k = 0}^{N_{j} - 1} f (T^{k + 1} n_{0}) .

A telescoping difference gives

| Λ_{N_{j}} (f) - (P^{*} Λ_{N_{j}}) (f) | = \frac{1}{N_{j}} | f (n_{0}) - f (T^{N_{j}} n_{0}) | \leq \frac{2 {∥ f ∥}_{\infty}}{N_{j}} .

Since

B_{tree, σ} ↪ ℓ^{1}

implies point evaluations are bounded, we have

{∥ f ∥}_{\infty} ≲ {∥ f ∥}_{B_{tree, σ}}

, and therefore

∥ P^{*} Λ_{N_{j}} - Λ_{N_{j}} ∥_{B_{tree, σ}^{*}} ⟶ 0 .

Now use weak* continuity of

P^{*}

(true because P is bounded): for every

f \in B_{tree, σ}

,

(P^{*} Λ) (f) = Λ (P f) = lim_{j \to \infty} Λ_{N_{j}} (P f) = lim_{j \to \infty} (P^{*} Λ_{N_{j}}) (f) = lim_{j \to \infty} Λ_{N_{j}} (f) = Λ (f) .

Thus

P^{*} Λ = Λ

. □
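
The telescoping step in the proof can be verified exactly on finite orbit segments. The sketch below assumes, as in the displayed formulas of the proof, that the averages act by evaluation along the forward orbit, so that composing with P corresponds to replacing f by n ↦ f(T n); the helper names and the test function 1/n are hypothetical.

```python
from fractions import Fraction

def collatz_step(n):
    """Standard Collatz map T (assumed form): n/2 for even n, 3n+1 for odd n."""
    return n // 2 if n % 2 == 0 else 3 * n + 1

def iterate(n0, N):
    """Return T^N(n0)."""
    n = n0
    for _ in range(N):
        n = collatz_step(n)
    return n

def orbit_average(f, n0, N):
    """Lambda_N(f) = (1/N) * sum_{k<N} f(T^k n0), in exact arithmetic."""
    total, n = Fraction(0), n0
    for _ in range(N):
        total += f(n)
        n = collatz_step(n)
    return total / N

def telescoping_gap(f, n0, N):
    """Return (shifted average minus average, predicted boundary term)."""
    shifted = lambda n: f(collatz_step(n))           # n -> f(T n)
    gap = orbit_average(shifted, n0, N) - orbit_average(f, n0, N)
    predicted = (f(iterate(n0, N)) - f(n0)) / N      # only the two end points survive
    return gap, predicted

f = lambda n: Fraction(1, n)   # hypothetical test function with sup|f| <= 1
for N in (1, 10, 100, 1000):
    gap, predicted = telescoping_gap(f, 27, N)
    print(N, gap == predicted, abs(gap) <= Fraction(2, N))
```

The bound 2 sup|f| / N is visible directly: all interior terms cancel, and only the first and last evaluations remain.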

Remark 5.26

(Nontriviality of orbit-generated functionals). The conclusion of Proposition 5.25 ensures only that any weak* limit Λ of the Cesàro averages

(Λ_{N})

is

P^{*}

–invariant; it does not guarantee that Λ is nonzero. For a sufficiently sparse or rapidly escaping orbit, the evaluations

f (T^{k} n_{0})

may tend to zero so quickly that the averages

Λ_{N} (f) = \frac{1}{N} \sum_{k < N} f (T^{k} n_{0})

converge to 0 for every

f \in B_{tree, σ}

, in which case

Λ_{N} \overset{*}{⟶} 0

in

B_{tree, σ}^{*}

. Thus the weak* cluster point may be the zero functional. For this reason, the conditional conclusions in Theorems 5.29 and 5.32 explicitly assume that the orbit under consideration generates a nontrivial invariant functional in

B_{tree, σ}^{*}

.

Remark 5.27

(Scope of the dynamical consequences). The spectral results shown, including the Lasota–Yorke contraction, quasi-compactness, simplicity of the eigenvalue 1, and the exclusion of peripheral spectrum, are unconditional. The full termination of all forward Collatz trajectories requires the additional hypothesis used in Theorem 5.32, namely that every infinite forward orbit generates a nontrivial

P^{*}

-invariant functional in

B_{tree, σ}^{*}

. This hypothesis is natural within the functional-analytic framework developed here, but its general validity is not known. Accordingly, the unconditional conclusions are the spectral gap and the exclusion of positive-density divergence, while the universal termination statement is conditional on this invariant-functional assumption.

Theorem 5.28

(Spectral criterion for absence of divergent mass). Let P act on

B_{tree, σ}

and suppose:

(1): P is quasi-compact on $B_{tree, σ}$ with $ρ_{ess} (P) < 1$ ;
(2): P has no eigenvalues on the unit circle except possibly $λ = 1$ ;
(3): the eigenspace for $λ = 1$ is one-dimensional and generated by a strictly positive $h \in B_{tree, σ}$ with $P h = h$ .

Then there exists no nontrivial P–invariant probability density in

B_{tree, σ}

supported on nonterminating orbits or on any nontrivial forward Collatz cycle. Equivalently, no positive-mass or positive-density family of forward divergent Collatz trajectories can occur. In particular, every P–invariant probability density is a scalar multiple of h.

Proof.

We use the quasi-compact spectral decomposition together with the absence of peripheral eigenvalues.

(1) Spectral decomposition and convergence of iterates. By (1), the quasi-compactness of P yields a decomposition

P = Π P Π + N, Π N = N Π = 0, ∥ N^{k} ∥ = O (ρ^{k}) (0 < ρ < 1),

(133)

where

Π

is the spectral projector corresponding to the peripheral spectrum. By (2)–(3), the peripheral spectrum consists only of the simple eigenvalue 1 with strictly positive eigenvector h and dual eigenfunctional

φ

, normalized by

φ (h) = 1

. Thus the spectral projector is

Π f = φ (f) h, f \in B_{tree, σ} .

(134)

Iterating the decomposition,

P^{k} f = Π f + N^{k} f ⟶ φ (f) h as k \to \infty

(135)

in

B_{tree, σ}

.

(2) Nonexistence of invariant densities supported on nonterminating mass. Suppose

g \in B_{tree, σ}

is a P-invariant probability density supported entirely on nonterminating orbits or a nontrivial cycle. Then

g = P^{k} g

for all

k \geq 0

. Applying (135),

g = φ (g) h + N^{k} g ⟶ φ (g) h .

Hence

g = φ (g) h

.

Because g is a probability density for counting measure,

\sum_{n \geq 1} g (n) = 1

, but the strictly positive eigenfunction h satisfies

\sum_{n \geq 1} h (n) = \infty

. Thus no scalar multiple of h can be integrable, forcing

g \equiv 0

, contrary to

\sum g = 1

. Therefore no such invariant density can exist.

(3) Exclusion of nontrivial cycles. If a nontrivial Collatz q–cycle existed, the induced invariant density supported on the cycle would produce an eigenvalue

λ = e^{2 π i / q} \neq 1

of P on the unit circle, contradicting (2). Hence no nontrivial periodic cycle supports an invariant density in

B_{tree, σ}

.

(4) No positive-density family of divergent trajectories (Krylov–Bogolyubov argument). Assume for contradiction that there exists a set

S \subset N

with positive upper density such that each

n \in S

has a nonterminating Collatz orbit.

Let

ν_{N}

be the normalized counting functional on

S \cap [1, N]

:

ν_{N} = \frac{1}{| S \cap [1, N] |} \sum_{n \in S \cap [1, N]} δ_{n} \in B_{tree, σ}^{*} .

Form Cesàro averages of its forward pushforwards:

η_{N, K} = \frac{1}{K} \sum_{k = 0}^{K - 1} T_{*}^{k} ν_{N} = \frac{1}{K} \sum_{k = 0}^{K - 1} ν_{N} \circ P^{k} .

Each

η_{N, K}

is positive, normalized, and supported in the nonterminating set

N

.

By Lemma 5.24,

{η_{N, K}}_{N, K}

is uniformly bounded in

B_{tree, σ}^{*}

; hence by Banach–Alaoglu it has weak* cluster points. Fix N and let

Λ_{N}

be a weak* limit of

{(η_{N, K})}_{K}

. Then

T_{*} Λ_{N} = Λ_{N}

, so

Λ_{N}

is

P^{*}

-invariant.

Letting

N \to \infty

and extracting a further weak* limit

Λ

yields a positive, normalized functional supported in

N

with

P^{*} Λ = Λ

. Thus

Λ

is a nontrivial P-invariant functional.

(5) Contradiction via spectral rigidity. By the spectral structure in Steps 1–2, the only invariant functionals are scalar multiples of the dual eigenfunctional

φ

. Thus

Λ = c φ

for some scalar c, and

c \neq 0

because Λ is nontrivial. But

φ

assigns positive weight to every level (because h is strictly positive), while

Λ

vanishes on all integers that enter the terminating cycle. Thus

Λ \neq c φ

, a contradiction.

Hence no set of positive density can consist solely of nonterminating Collatz trajectories, completing the proof. □

Theorem 5.29

(From spectral gap to pointwise termination). Assume the hypotheses of Theorem 5.28. If, in addition, every infinite forward Collatz orbit generates a nontrivial weak* limit of

P^{*}

–Cesàro averages in

B_{tree, σ}^{*}

, then no such infinite orbit can exist. Consequently, every Collatz trajectory enters the trivial cycle.

Proof.

Under the assumptions of Theorem 5.28, the operator P is quasi-compact on

B_{tree, σ}

with

ρ_{ess} (P) < 1

, has no eigenvalues on

| z | = 1

except

λ = 1

, and the

λ = 1

eigenspace is one-dimensional, spanned by a strictly positive invariant density h with

P h = h

. Let

φ \in B_{tree, σ}^{*}

be the dual eigenfunctional, normalized by

φ (h) = 1

.

Quasi-compactness gives a spectral decomposition

P = Π + N, Π f = φ (f) h, Π N = N Π = 0, ∥ N^{k} ∥ = O (ρ^{k}), 0 < ρ < 1 .

(136)

Iterating,

P^{k} f = φ (f) h + N^{k} f ⟶ φ (f) h in B_{tree, σ} .

(137)

(1) Any invariant dual functional is a scalar multiple of φ. Let

Λ \in B_{tree, σ}^{*}

satisfy

P^{*} Λ = Λ

. Then for every

f \in B_{tree, σ}

and

k \geq 1

,

Λ (f) = Λ (P^{k} f) = Λ (Π f + N^{k} f) = Λ (Π f) + Λ (N^{k} f) .

Since

∥ N^{k} ∥ \to 0

exponentially and

Λ

is bounded,

Λ (N^{k} f) \to 0

. Using

Π f = φ (f) h

, we obtain

Λ (f) = Λ (φ (f) h) = Λ (h) φ (f) for all f .

(138)

Thus every

P^{*}

-invariant functional is of the form

Λ = c φ

with

c = Λ (h)

.

(2) Any orbit-generated invariant functional vanishes on a large set. Let

O = {T^{t} n_{0}}_{t \geq 0}

be an infinite Collatz orbit. By the hypothesis of the theorem, the Cesàro averages

Λ_{N} = \frac{1}{N} \sum_{k = 0}^{N - 1} {(P^{*})}^{k} δ_{n_{0}}

admit a nontrivial weak* limit

Λ

with

P^{*} Λ = Λ

.

By construction,

Λ

is supported on

O

: if g vanishes on

O

, then

Λ_{N} (g) = 0

for all N, hence

Λ (g) = 0

.

We now construct

f_{*} \in B_{tree, σ}

such that

(i)

f_{*} \geq 0

, (ii)

f_{*} \not\equiv 0

, (iii)

f_{*}

vanishes on

O

, hence

Λ (f_{*}) = 0

, (iv)

φ (f_{*}) > 0

.

Let

I_{j} = [6^{j}, 2 \cdot 6^{j})

be the scale-j block and

E_{j} : = O \cap I_{j}

the (finite) set of orbit points inside

I_{j}

. Set

J_{j} = I_{j} ∖ E_{j}

and let

v_{j} = ϑ^{2 j}

(with the same

0 < ϑ < 1

from the definition of

B_{tree, σ}

). Define

f_{*} (n) = \begin{cases} v_{j}, & n \in J_{j}, \\ 0, & n \in E_{j}, \end{cases} \quad n \in I_{j} .

Then

∥ f_{*} ∥_{1} \leq \sum_{j} 6^{j} ϑ^{2 j} < \infty

and the tree seminorm

{[f_{*}]}_{tree}

is finite because

f_{*}

is blockwise constant outside finitely many points. Hence

f_{*} \in B_{tree, σ}

.

Since

f_{*}

is nonzero and supported on all but finitely many points of each

I_{j}

, and

φ

is strictly positive (because

h > 0

), we have

φ (f_{*}) > 0 .

(139)

But

f_{*}

vanishes on

O

, so the orbit-generated functional satisfies

Λ (f_{*}) = 0 .

(140)

(3) Contradiction. Since

Λ = c φ

by (138), evaluating at

f_{*}

gives

0 = Λ (f_{*}) = c φ (f_{*}) .

Using

φ (f_{*}) > 0

, we obtain

c = 0

. Thus

Λ = 0

, contradicting the assumed nontriviality of

Λ

.

Therefore no infinite forward Collatz orbit can exist. Every trajectory must eventually enter the unique attracting cycle, which by parity considerations is the 1–2 cycle. □
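
The test function f_* used in step (2) can be built explicitly for a finite orbit prefix. The sketch below is illustrative only: it takes ϑ = 1/20 from Section 5.6, extends f_* by zero outside the blocks I_j (an assumption, since the definition above only specifies f_* on the blocks), and uses the terminating orbit of 27 as a stand-in for a hypothetical infinite orbit. The names `block_index`, `make_f_star`, and `block_l1_mass` are hypothetical.

```python
from fractions import Fraction

THETA = Fraction(1, 20)   # value of the parameter from Section 5.6 (assumed here)

def collatz_step(n):
    """Standard Collatz map T (assumed form)."""
    return n // 2 if n % 2 == 0 else 3 * n + 1

def orbit_prefix(n0, length):
    points, n = [], n0
    for _ in range(length):
        points.append(n)
        n = collatz_step(n)
    return points

def block_index(n):
    """Return j with 6^j <= n < 2*6^j, or None if n lies in no block I_j."""
    j, low = 0, 1
    while low <= n:
        if n < 2 * low:
            return j
        j, low = j + 1, low * 6
    return None

def make_f_star(orbit_points):
    """f_*(n) = THETA^(2j) on I_j minus the orbit, and 0 on the orbit or off the blocks."""
    orbit = set(orbit_points)
    def f_star(n):
        j = block_index(n)
        if j is None or n in orbit:
            return Fraction(0)
        return THETA ** (2 * j)
    return f_star

def block_l1_mass(f, j):
    """Sum of f over the block I_j = [6^j, 2*6^j)."""
    return sum(f(n) for n in range(6 ** j, 2 * 6 ** j))

orbit = orbit_prefix(27, 200)
f_star = make_f_star(orbit)
partial_l1 = sum(block_l1_mass(f_star, j) for j in range(5))
geometric_bound = 1 / (1 - 6 * THETA ** 2)   # sum over j of (6 * THETA^2)^j
print(partial_l1, geometric_bound, partial_l1 <= geometric_bound)
```

The comparison with the geometric series reproduces the ℓ^1 bound stated above; finiteness of the tree seminorm is not checked by this computation.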

Lemma 5.30

(Uniform dual bound for orbit Cesàro averages). Let

B_{tree, σ}

be the multiscale tree space, and let

δ_{n} \in B_{tree, σ}^{*}

denote the bounded point evaluation functional at n. Fix

n_{0} \in N

and define, for

N \geq 1

,

Λ_{N} (f) : = \frac{1}{N} \sum_{k = 0}^{N - 1} f (T^{k} n_{0}), f \in B_{tree, σ} .

Then each

Λ_{N}

belongs to

B_{tree, σ}^{*}

, and there exists a constant

C > 0

independent of N such that

sup_{N \geq 1} {∥ Λ_{N} ∥}_{B_{tree, σ}^{*}} \leq C .

Proof.

Two structural properties of

B_{tree, σ}

are used:

(1): (Bounded point evaluation.) Since $B_{tree, σ} ↪ ℓ_{σ}^{1}$ , evaluation at a fixed point is continuous: there exists $C_{ev} > 0$ (depending only on $n_{0}$ ) such that

$| g (n_{0}) | \leq C_{ev} {∥ g ∥}_{tree, σ} for all g \in B_{tree, σ} .$

(141)
(2): (Power boundedness of P.) The Lasota–Yorke inequality implies that P is power bounded on $B_{tree, σ}$ : there exists $C_{P} \geq 1$ such that

${∥ P^{k} f ∥}_{tree, σ} \leq C_{P} {∥ f ∥}_{tree, σ} \forall k \geq 0, \forall f \in B_{tree, σ} .$

(142)

For

f \in B_{tree, σ}

with

{∥ f ∥}_{tree, σ} \leq 1

,

Λ_{N} (f) = \frac{1}{N} \sum_{k = 0}^{N - 1} ({(P^{*})}^{k} δ_{n_{0}}) (f) = \frac{1}{N} \sum_{k = 0}^{N - 1} δ_{n_{0}} (P^{k} f) = \frac{1}{N} \sum_{k = 0}^{N - 1} (P^{k} f) (n_{0}) .

Applying the point evaluation estimate (141) to

g = P^{k} f

and then using (142),

| (P^{k} f) (n_{0}) | \leq C_{ev} {∥ P^{k} f ∥}_{tree, σ} \leq C_{ev} C_{P} {∥ f ∥}_{tree, σ} \leq C_{ev} C_{P} .

Thus

| Λ_{N} (f) | \leq \frac{1}{N} \sum_{k = 0}^{N - 1} C_{ev} C_{P} = C_{ev} C_{P} .

Since this holds for every f with

{∥ f ∥}_{tree, σ} \leq 1

,

∥ Λ_{N} ∥_{B_{tree, σ}^{*}} \leq C_{ev} C_{P} = : C,

uniformly in N.

Weak-* relative compactness follows from Banach–Alaoglu. □

Proposition 5.31

(Orbit–generated invariant functional). Let

n_{0} \in N

have an infinite forward orbit

O^{+} (n_{0}) = {T^{k} n_{0}}_{k \geq 0}

under the Collatz map T. Let

Λ_{N}

be the Cesàro averages defined in (130). Assume that the orbit of

n_{0}

generates at least one nontrivial weak* limit of the family

{(Λ_{N})}_{N \geq 1}

.

Then the following hold:

(i) There exist a subsequence

{(N_{j})}_{j \geq 1}

and a nonzero functional

Φ \in B_{tree, σ}^{*}

such that

Λ_{N_{j}} \overset{w^{*}}{⟶} Φ

.

(ii) Φ is invariant under the dual Collatz operator:

Φ (P f) = Φ (f) for all f \in B_{tree, σ}, i.e. P^{*} Φ = Φ .

(143)

(iii) Φ is supported on the orbit

O^{+} (n_{0})

: if

f \in B_{tree, σ}

satisfies

{f |}_{O^{+} (n_{0})} \equiv 0

, then

Φ (f) = 0 .

Thus Φ is a nontrivial

P^{*}

–invariant functional generated solely by the orbit

O^{+} (n_{0})

.

Proof.

By Lemma 5.30, the functionals

Λ_{N}

are uniformly bounded in

B_{tree, σ}^{*}

. Hence they are weak* relatively compact. By the hypothesis that the orbit generates a nontrivial limit, there exist a subsequence

(N_{j})

and a nonzero weak* limit

Φ

. This proves (i).

Invariance. For each

f \in B_{tree, σ}

,

Λ_{N} (P f) = \frac{1}{N} \sum_{k = 0}^{N - 1} (P f) (T^{k} n_{0}) = \frac{1}{N} \sum_{k = 0}^{N - 1} f (T^{k + 1} n_{0}) = Λ_{N} (f) - \frac{f (n_{0}) - f (T^{N} n_{0})}{N} .

Hence

∥ Λ_{N} \circ P - Λ_{N} ∥ \leq \frac{2 ∥ δ_{n_{0}} ∥}{N} \underset{N \to \infty}{\to} 0 .

Passing to the weak* limit along the subsequence

(N_{j})

gives

Φ \circ P = Φ

, proving (ii).

Support on the orbit. If f vanishes on

O^{+} (n_{0})

, then

f (T^{k} n_{0}) = 0

for all k, hence

Λ_{N} (f) = 0

for all N. Taking weak* limits yields

Φ (f) = 0

, proving (iii). □

Theorem 5.32

(Exclusion of zero-density infinite trajectories). Assume that the backward Collatz operator P acts on

B_{tree, σ}

as a positive, quasi–compact operator with a spectral gap, and that the spectrum on

| z | = 1

consists only of the simple eigenvalue 1. Let

h \in B_{tree, σ}

and

ϕ \in B_{tree, σ}^{*}

denote the normalized principal eigenpair,

P h = h, ϕ \circ P = ϕ, ϕ (h) = 1,

with

h > 0

and

ϕ > 0

on the positive cone.

Assume, in addition, that every infinite forward Collatz orbit

{T^{k} n_{0}}_{k \geq 0}

generates a nontrivial invariant functional

Φ \in B_{tree, σ}^{*}

for the dual operator

P^{*}

, for example as a weak* limit of the Cesàro averages

\frac{1}{N} \sum_{k = 0}^{N - 1} {(P^{*})}^{k} δ_{n_{0}}

.

Then no forward Collatz trajectory can be infinite. Equivalently, every trajectory eventually enters the 1–2 cycle.

Proof.

Assume, for contradiction, that

n_{0}

has an infinite forward orbit

{T^{k} n_{0}}_{k \geq 0}

which never enters

{1, 2}

.

(1) Construction of an invariant functional from the orbit. For

f \in B_{tree, σ}

set

Λ_{N} (f) = \frac{1}{N} \sum_{k = 0}^{N - 1} f (T^{k} n_{0}) .

By Lemma 5.30, the functionals

Λ_{N}

are uniformly bounded in

B_{tree, σ}^{*}

. Hence they admit weak* limit points. By the additional hypothesis, we may choose a nontrivial limit

Φ

satisfying

P^{*} Φ = Φ

. Since

h > 0

on

N

, we may normalize

Φ

so that

Φ (h) = 1 .

(144)

The

P^{*}

–invariance follows from the standard telescoping identity:

∥ Λ_{N} \circ P - Λ_{N} ∥ \leq \frac{2 ∥ δ_{n_{0}} ∥}{N} ⟶ 0,

so any weak* limit

Φ

satisfies

Φ \circ P = Φ

.

(2) Spectral convergence of

P^{k}

. By quasi-compactness with spectral gap, there exist constants

C > 0

and

ρ \in (0, 1)

such that

{∥ P^{k} f - ϕ (f) h ∥}_{B_{tree, σ}} \leq C ρ^{k} {∥ f ∥}_{B_{tree, σ}} .

(145)

In particular,

P^{k} f \to ϕ (f) h

exponentially fast.

(3) Test function supported on the 1–2 cycle. Let

Λ = 1_{{1, 2}}

. Then

Λ \in B_{tree, σ}

, and since

h > 0

everywhere,

ϕ (Λ) = h (1) + h (2) > 0 .

But the forward orbit of

n_{0}

never hits 1 or 2, so

Λ_{N} (Λ) = 0 for all N .

Thus

Φ (Λ) = 0 .

(146)

(4) Invariance + spectral convergence give a contradiction. Using

P^{*} Φ = Φ

and (145),

Φ (Λ) = Φ (P^{k} Λ) = Φ (ϕ (Λ) h + (P^{k} Λ - ϕ (Λ) h)) = ϕ (Λ) Φ (h) + Φ (P^{k} Λ - ϕ (Λ) h) .

As

k \to \infty

, the last term converges to 0 by (145) and boundedness of

Φ

. Hence

Φ (Λ) = ϕ (Λ) Φ (h) .

By (144),

Φ (h) = 1

, so the right-hand side equals

ϕ (Λ) > 0

. But (146) states that

Φ (Λ) = 0

. This is impossible. □
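
The mechanism behind steps (2) and (4), exponential convergence of the iterates to the rank-one projection, can be seen in a finite-dimensional toy model. The sketch below is not the Collatz operator: the 2×2 matrix `M`, the eigenvector `h`, and the functional `phi` are hypothetical stand-ins chosen so that M h = h, phi M = phi, phi(h) = 1, and the remaining eigenvalue is 1/4.

```python
from fractions import Fraction as F

# Hypothetical positive matrix with simple eigenvalue 1 and second eigenvalue 1/4.
M = [[F(1, 2), F(1, 2)],
     [F(1, 4), F(3, 4)]]
h = [F(1), F(1)]            # right eigenvector: M h = h
phi = [F(1, 3), F(2, 3)]    # left eigenvector: phi M = phi, normalized so phi(h) = 1

def apply(matrix, v):
    """Matrix-vector product."""
    return [sum(row[i] * v[i] for i in range(len(v))) for row in matrix]

def pair(functional, v):
    """Evaluate a linear functional (row vector) on a vector."""
    return sum(a * b for a, b in zip(functional, v))

def remainder_after(f, k):
    """Return M^k f - phi(f) h, the analogue of the error term in (145)."""
    v = list(f)
    for _ in range(k):
        v = apply(M, v)
    c = pair(phi, f)
    return [v[i] - c * h[i] for i in range(2)]

f = [F(3), F(-1)]
for k in range(5):
    print(k, remainder_after(f, k))   # shrinks by the factor 1/4 at every step
```

In this toy model the remainder is exactly (1/4)^k (f − phi(f) h), which is the finite-dimensional counterpart of the bound (145) with ρ = 1/4.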

Remark 5.33

(Scope of the dynamical consequences). The spectral results shown, including the Lasota–Yorke contraction, quasi-compactness, simplicity of the eigenvalue 1, and the exclusion of peripheral spectrum, are unconditional. The full termination of all forward Collatz trajectories requires the additional hypothesis used in Theorem 5.32, namely that every infinite forward orbit generates a nontrivial

P^{*}

-invariant functional in

B_{tree, σ}^{*}

. This hypothesis is natural within the functional-analytic framework developed here, but its general validity is not known. Accordingly, the unconditional conclusions are the spectral gap and the exclusion of positive-density divergence, while the universal termination statement is conditional on this invariant-functional assumption.

5.5. Positivity, Dual Invariants, and Support Properties

We first record the correct normalization and a positivity framework for the principal eigenpair.

Definition 5.34

(Principal eigenpair and normalization). Let P act on the Banach lattice

B_{tree, σ}

with positive cone

B_{tree, σ}^{+} = {f \in B_{tree, σ} : f \geq 0}

. Assume P is quasi–compact with spectral gap and the spectrum on

| z | = 1

reduces to the simple eigenvalue 1. Then there exist

h \in B_{tree, σ}^{+} ∖ {0}

and

ϕ \in {(B_{tree, σ})}^{*}

,

ϕ \geq 0

, such that

P h = h, ϕ \circ P = ϕ,

and we fix the normalization

ϕ (h) = 1

.

Remark 5.35

(Positivity and logarithmic mass). The transfer operator P is positive: if

f \geq 0

then

P f \geq 0

. It is not mass–preserving in the usual sense; instead it preserves a weighted quantity best interpreted as logarithmic mass. For finitely supported f one has the exact identity

\sum_{n \geq 1} (P f) (n) = \sum_{m \geq 1} \frac{f (m)}{m},

so the natural invariant weight is

1 / m

rather than 1. Consequently the constant function

1

cannot be an eigenfunction of P, and any fixed point must decay along the Collatz tree.
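
The logarithmic mass identity can be confirmed in exact arithmetic for finitely supported functions. The following is a minimal sketch, assuming the formula for P used in Lemma 5.37, (P f)(n) = f(2n)/(2n) + 1_{n ≡ 4 (6)} f((n−1)/3)/((n−1)/3), which is equivalent to sending the weight f(m)/m from each m to T(m); the dictionary representation and the name `backward_operator` are hypothetical.

```python
from fractions import Fraction

def backward_operator(f):
    """Apply P to a finitely supported f given as {m: f(m)}; return {n: (P f)(n)}.

    Each m contributes f(m)/m to its image n = T(m): for even m this is the
    term f(2n)/(2n), and for odd m (so n = 3m+1 is 4 mod 6) it is the term
    f((n-1)/3) / ((n-1)/3).
    """
    out = {}
    for m, value in f.items():
        n = m // 2 if m % 2 == 0 else 3 * m + 1
        out[n] = out.get(n, Fraction(0)) + Fraction(value) / m
    return out

# Hypothetical finitely supported test function.
f = {1: 2, 2: -3, 3: 5, 7: 1, 10: 4, 20: 6}
Pf = backward_operator(f)

total_mass_after = sum(Pf.values())
log_mass_before = sum(Fraction(v) / m for m, v in f.items())
print(total_mass_after == log_mass_before)
```

Because the weights 1/m appear on the right-hand side, the function 1 is not fixed by P, in line with the remark above.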

The block recursion derived from

P h = h

shows that the block averages

c_{j} = \frac{1}{6^{j}} \sum_{n \in I_{j}} h (n)

satisfy a rigid two–scale relation with exponentially small error, and hence

c_{j} \sim C 6^{- j}

as

j \to \infty

. This corresponds to the asymptotic profile

h (n) \propto 1 / n

when n ranges over the block

I_{j}

, with vanishing block–internal oscillation in the strong seminorm. Thus logarithmic mass preservation forces the Perron–Frobenius eigenfunction to exhibit this averaged

1 / n

decay.
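
The correspondence between the profile 1/n and block averages of order 6^{-j} can be checked numerically. The sketch below uses the model function n ↦ 1/n in place of the eigenfunction h (an assumption made purely for illustration) and shows that 6^j c_j approaches log 2 on the blocks I_j = [6^j, 2·6^j); the name `block_average` is hypothetical.

```python
import math

def block_average(h, j):
    """c_j = 6^{-j} * sum of h(n) over n in I_j = [6^j, 2 * 6^j)."""
    size = 6 ** j
    return sum(h(n) for n in range(size, 2 * size)) / size

reciprocal = lambda n: 1.0 / n   # model profile standing in for h

for j in range(6):
    rescaled = block_average(reciprocal, j) * 6 ** j
    print(j, rescaled, rescaled - math.log(2))
```

For this model profile the constant C in c_j ∼ C 6^{-j} equals log 2; the constant for the actual eigenfunction is not determined by this computation.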

Because of this distortion of mass, all spectral decompositions and projections must be formulated relative to the principal invariant pair

(h, ϕ)

:

Π f = ϕ (f) h,

where ϕ is the dual eigenfunctional satisfying

ϕ \circ P = ϕ

and

ϕ (h) = 1

.

Definition 5.36

(Invariant ideals and zero-sets). A closed ideal

I \subset B_{tree, σ}

is a closed subspace such that

f \in I

and

| g | \leq | f |

imply

g \in I

. Equivalently, there exists a subset

S \subset N

(the zero-set of

I

) with

I = {f \in B_{tree, σ} : f |_{S} = 0} .

We call

I

(or S) P-invariant if

P I \subset I

.

Lemma 5.37

(Zero–set characterization). Let

I \subset B_{tree, σ}

be a closed ideal, and let

S = {n \in N : f (n) = 0 for all f \in I}

be its zero-set. Then

P I \subset I

if and only if the zero-set S is closed under the preimage relations of the Collatz map T; that is, for every

n \in S

,

2 n \in S, and if n \equiv 4 (mod 6), then \frac{n - 1}{3} \in S .

Proof. (⇒) Assume

P I \subset I

and let

n \in S

. Then

f (n) = 0

for all

f \in I

, and hence

(P f) (n) = 0 for all f \in I .

But

(P f) (n) = \frac{f (2 n)}{2 n} + 1_{{n \equiv 4 (6)}} \frac{f ((n - 1) / 3)}{(n - 1) / 3} .

Even preimage. If

f (2 n) \neq 0

for some

f \in I

, then

(P f) (n) \neq 0

, contradicting

(P f) (n) = 0

. Thus

f (2 n) = 0

for all

f \in I

, so

2 n \in S

.

Odd preimage. If

n \equiv 4 (mod 6)

and there exists

f \in I

with

f ((n - 1) / 3) \neq 0

, then

(P f) (n) \neq 0

, again contradicting

(P f) (n) = 0

. Hence

f ((n - 1) / 3) = 0

for all

f \in I

, so

(n - 1) / 3 \in S

.

Thus S is closed under both preimage rules.

(⇐) Assume now that S is closed under the Collatz preimages. Let

f \in I

. We must show

P f \in I

, i.e.

P f

vanishes on S.

Let

n \in S

. By hypothesis,

2 n \in S

, and if

n \equiv 4 (mod 6)

then

(n - 1) / 3 \in S

. Since

f \in I

vanishes on S, it follows that

f (2 n) = 0 and, when n \equiv 4 (6), f (\frac{n - 1}{3}) = 0 .

Hence

(P f) (n) = \frac{f (2 n)}{2 n} + 1_{{n \equiv 4 (6)}} \frac{f ((n - 1) / 3)}{(n - 1) / 3} = 0 .

Since

P f

vanishes on S and

I

is exactly the set of functions vanishing on S, we conclude

P f \in I

.

This completes the proof. □
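
The closure condition of Lemma 5.37 is easy to test on finite truncations. The sketch below lists the backward Collatz moves from n and computes the smallest preimage-closed subset of [1, bound] containing given seeds; truncation at `bound` is an assumption of the sketch, so ancestors that are only reachable through integers above the bound are missed. The function names are hypothetical.

```python
def preimages(n):
    """Backward Collatz moves from n: always 2n, and (n-1)/3 when n = 4 (mod 6)."""
    result = [2 * n]
    if n % 6 == 4:
        result.append((n - 1) // 3)
    return result

def is_preimage_closed(S, bound):
    """Check closure of S under both preimage rules, ignoring preimages above bound."""
    members = set(S)
    return all(m in members
               for n in members
               for m in preimages(n) if m <= bound)

def backward_closure(seeds, bound):
    """Smallest subset of [1, bound] containing the seeds and closed under preimages."""
    closed, stack = set(), [s for s in seeds if s <= bound]
    while stack:
        n = stack.pop()
        if n in closed:
            continue
        closed.add(n)
        stack.extend(m for m in preimages(n) if m <= bound)
    return closed

print(sorted(backward_closure([16], 40)))
print(is_preimage_closed({4}, 100))   # {4} alone is not closed: 8 and 1 are missing
```

By the lemma, the functions vanishing on such a closed set form a P-invariant ideal, up to the truncation caveat stated above.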

Lemma 5.38

(Invariant ideals and nontrivial examples). Let

B_{tree, σ}

be the multiscale tree space, and let

P : B_{tree, σ} \to B_{tree, σ}

be the backward Collatz operator. Then every closed ideal

I \subset B_{tree, σ}

that is P–invariant is of the form

I_{S} : = {f \in B_{tree, σ} : f (n) = 0 for all n \in S}

(147)

for a subset

S \subset N

that is closed under the backward Collatz preimages:

n \in S \Rightarrow 2 n \in S, n \equiv 4 (mod 6) \Rightarrow \frac{n - 1}{3} \in S .

(148)

Conversely, for any

S \subset N

satisfying these closure rules,

I_{S}

is a closed P–invariant ideal. In particular, there exist nontrivial closed P–invariant ideals. For example, the set

S_{3} : = {n \in N : 3 ∣ n}

(149)

is closed under the preimage rules, so

I_{3} : = {f \in B_{tree, σ} : f (n) = 0 for all 3 ∣ n}

(150)

is a proper nonzero closed P–invariant ideal.

Proof.

By Definition 5.36, any closed ideal

I \subset B_{tree, σ}

is of the form

I = {f \in B_{tree, σ} : f (n) = 0 for all n \in S}

for a uniquely determined zero-set

S = {n \in N : f (n) = 0 \forall f \in I} \subset N .

Lemma 5.37 shows that

P I \subset I

is equivalent to S being closed under the two backward Collatz preimage moves:

n \in S \Rightarrow 2 n \in S, n \equiv 4 (mod 6) \Rightarrow \frac{n - 1}{3} \in S .

(151)

This proves the first part of the statement.

Conversely, if

S \subset N

obeys these closure rules and we set

I_{S} : = {f \in B_{tree, σ} : f |_{S} = 0},

then

I_{S}

is a closed ideal by construction, and the same argument in Lemma 5.37 shows that

P I_{S} \subset I_{S}

.

To see that nontrivial examples exist, let

S_{3} : = {n \in N : 3 ∣ n} .

If

n \in S_{3}

, then

2 n

is again a multiple of 3, so

2 n \in S_{3}

. Moreover, if

n \equiv 4 (mod 6)

, then

n \equiv 1 (mod 3)

, so n cannot be a multiple of 3. Hence the implication

n \equiv 4 (mod 6) \Rightarrow \frac{n - 1}{3} \in S_{3}

holds vacuously on

S_{3}

, and

S_{3}

satisfies the closure rules. Therefore

I_{3} : = {f \in B_{tree, σ} : f (n) = 0 for all 3 ∣ n}

is a closed ideal, is P–invariant, and is neither

{0}

nor

B_{tree, σ}

. This shows that P is not ideal–irreducible in the strong sense that only

{0}

and

B_{tree, σ}

can occur, and completes the proof. □
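
The invariance of the ideal I_3 can be observed directly on finitely supported functions. The sketch below assumes the same formula for P as in Lemma 5.37 and represents a function by a dictionary of its nonzero values; the names `backward_operator` and `vanishes_on_multiples_of_3` and the sample function are hypothetical.

```python
from fractions import Fraction

def backward_operator(f):
    """Apply P to a finitely supported f = {m: f(m)}: each m sends f(m)/m to T(m)."""
    out = {}
    for m, value in f.items():
        n = m // 2 if m % 2 == 0 else 3 * m + 1
        out[n] = out.get(n, Fraction(0)) + Fraction(value) / m
    return out

def vanishes_on_multiples_of_3(f):
    """True when f(n) = 0 for every n divisible by 3, i.e. f lies in I_3."""
    return all(value == 0 for n, value in f.items() if n % 3 == 0)

# Hypothetical element of I_3: supported on integers not divisible by 3.
g = {m: Fraction(1, m) for m in range(1, 200) if m % 3 != 0}
stays_in_ideal = []
for _ in range(5):
    g = backward_operator(g)
    stays_in_ideal.append(vanishes_on_multiples_of_3(g))
print(stays_in_ideal)
```

The reason is visible in the code: if 3 does not divide m, then 3 divides neither m/2 nor 3m+1, so no weight is ever moved onto a multiple of 3.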

Proposition 5.39

(Full support of h and strict positivity of

ϕ

). Assume that

P : B_{tree, σ} \to B_{tree, σ}

is a positive, quasi–compact operator with a simple eigenvalue 1 at the spectral radius and that P is ideal–irreducible in the sense that the only closed P–invariant ideals are

{0}

and

B_{tree, σ}

. Let

h \in B_{tree, σ}

and

ϕ \in B_{tree, σ}^{*}

be the principal eigenvectors satisfying

P h = h, ϕ \circ P = ϕ, ϕ (h) = 1 .

Then

h (n) > 0

for every

n \geq 1

, and ϕ is strictly positive on the cone of nonnegative nonzero functions:

f \in B_{tree, σ}, f \geq 0, f \not\equiv 0 ⟹ ϕ (f) > 0 .

Proof.

We first prove that h has full support.

(1) h is everywhere positive. Suppose, for contradiction, that

h (n_{0}) = 0

for some

n_{0} \geq 1

. Since

h \geq 0

and

P h = h

, positivity of P implies

0 = h (n_{0}) = (P h) (n_{0}) = \sum_{m : T (m) = n_{0}} c (m, n_{0}) h (m),

where the coefficients

c (m, n_{0}) \geq 0

encode the backward Collatz weights. Because every summand is nonnegative, each term must vanish, hence

T (m) = n_{0} ⟹ h (m) = 0 .

Iterating this argument along all backward preimages of

n_{0}

shows that h vanishes on every backward Collatz ancestor of

n_{0}

. Let

S : = {n \geq 1 : h (n) = 0}

be the zero–set of h. By the previous observation, S is closed under both backward Collatz preimage rules (if

h (n) = 0

then all predecessors have

h = 0

), so by Lemma 5.37 the ideal

I_{S} : = {f \in B_{tree, σ} : f (n) = 0 for all n \in S}

is a closed P–invariant ideal. Since

h \not\equiv 0

(it spans the eigenspace at eigenvalue 1), we have

S \neq N

, hence

I_{S} \neq {0}

. On the other hand, h vanishes on S by definition, so

h \in I_{S}

. Thus

I_{S}

is a nonzero proper closed P–invariant ideal, contradicting ideal–irreducibility. Therefore our assumption was false, and

h (n) > 0 for all n \geq 1 .

(2) Strict positivity of ϕ. Let

f \in B_{tree, σ}

satisfy

f \geq 0

and

f \not\equiv 0

. Assume for contradiction that

ϕ (f) = 0

. Define the closed ideal generated by the forward orbit of f by

J : = \overline{span} \{g \in B_{tree, σ} : 0 \leq g \leq \sum_{k = 0}^{N} α_{k} P^{k} f for some N and α_{k} \geq 0\},

that is, the smallest closed ideal containing

{P^{k} f : k \geq 0}

. By construction,

J

is a closed ideal, nonzero because

f \in J

, and P–invariant because

P (P^{k} f) = P^{k + 1} f

and P is positive. For every

k \geq 0

we have

ϕ (P^{k} f) = (ϕ \circ P^{k}) (f) = ϕ (f) = 0 .

If u is any finite nonnegative linear combination

u = \sum_{k = 0}^{N} α_{k} P^{k} f, α_{k} \geq 0,

then positivity and linearity of

ϕ

give

ϕ (u) = \sum_{k = 0}^{N} α_{k} ϕ (P^{k} f) = \sum_{k = 0}^{N} α_{k} \cdot 0 = 0 .

By continuity of

ϕ

and density of such u in the positive cone of

J

, it follows that

g \in J, g \geq 0 ⟹ ϕ (g) = 0 .

For a general

g \in J

we decompose

g = g^{+} - g^{-}

with

g^{\pm} \geq 0

and

g^{\pm} \in J

(ideal property), hence

ϕ (g) = ϕ (g^{+}) - ϕ (g^{-}) = 0 - 0 = 0 .

Thus

ϕ

vanishes identically on

J

:

{ϕ |}_{J} \equiv 0 .

Since

f \neq 0

, the ideal

J

is nonzero and P–invariant. By ideal–irreducibility, we must have

J = B_{tree, σ}

. In particular

h \in J

, so

ϕ (h) = 0

, contradicting the normalization

ϕ (h) = 1

. Therefore no nonzero

f \geq 0

can satisfy

ϕ (f) = 0

, which proves

f \in B_{tree, σ}, f \geq 0, f \not\equiv 0 ⟹ ϕ (f) > 0 .

This establishes both full support of h and strict positivity of

ϕ

. □

Corollary 5.40

(Positivity on cycle tests). Let

Λ = 1_{{1, 2}}

. Then

ϕ (Λ) > 0

.

Proof.

By Proposition 5.39,

h (1), h (2) > 0

and

ϕ

is strictly positive on every nonzero

f \in B_{tree, σ}

with

f \geq 0

. Since

Λ \geq 0

and

Λ \not\equiv 0

, strict positivity yields

ϕ (Λ) > 0

. □

5.6. Spectral Gap and Operator-Theoretic Consequences for P

By Proposition 4.11, the Lasota–Yorke constant at

(α, ϑ) = (\frac{1}{2}, \frac{1}{20})

satisfies

λ_{LY} < 1

, so P is quasi–compact on

B_{tree, σ}

with a uniform spectral gap in the strong seminorm.

The analytic chain is now closed: the explicit computation of

C_{1 / 2}

guarantees the contraction, the Lasota–Yorke framework enforces quasi-compactness, and the spectral reduction links the resulting gap to Collatz termination under the invariant-functional hypothesis of Theorems 5.29 and 5.32. The following theorem summarizes the result.

Theorem 5.41

(Spectral gap and conditional consequences for Collatz). Let P be the backward transfer operator associated with the Collatz map (1), acting on the multiscale Banach space

B_{tree, σ}

with parameters

(α, ϑ) = (\frac{1}{2}, \frac{1}{20})

. Then:

(1): The explicit branch estimates give a Lasota–Yorke inequality on $B_{tree, σ}$ with contraction constant

$λ_{LY} : = max {λ_{even} (α, ϑ), λ_{odd} (α, ϑ)} < 1 .$

Hence P is quasi-compact on $B_{tree, σ}$ with $ρ_{ess} (P) \leq λ_{LY} < 1$ .
(2): The eigenvalue $λ = 1$ is algebraically simple. There exist a unique positive eigenvector $h \in B_{tree, σ}$ and a unique positive invariant functional $ϕ \in B_{tree, σ}^{*}$ such that

$P h = h, ϕ \circ P = ϕ, ϕ (h) = 1 .$

The spectral projector is $Π f = ϕ (f) h$ , and the complementary part $N : = P - Π$ satisfies $ρ (N) < 1$ .
(3): By the block recursion of Section 5.2 and the multiscale oscillation bounds on h, any eigenfunction corresponding to an eigenvalue with $| λ | = 1$ must be asymptotically block-constant. The weighted $ℓ_{σ}^{1}$ contraction then forces such an eigenfunction to vanish unless it is proportional to h. Thus h spans the entire peripheral spectrum. This is precisely the content of Theorem 5.28.
(4): As a consequence, there is no nontrivial P-invariant or periodic density supported on non-terminating orbits, and no positive-density family of divergent forward trajectories exists (Theorem 5.28). If, in addition, every infinite forward Collatz orbit generates a nontrivial $P^{*}$ –invariant functional $Λ \in B_{tree, σ}^{*}$ (the invariant-functional hypothesis of Theorems 5.29 and 5.32), then no infinite forward Collatz orbit can exist. Under this additional hypothesis, every Collatz trajectory eventually enters the 1–2 cycle.

Proof.

Fix

(α, ϑ) = (\frac{1}{2}, \frac{1}{20})

and

σ > 1

. We verify the four claims.

(1) Lasota–Yorke inequality and quasi-compactness. By Proposition 4.11 there exist constants

0 < λ_{LY} < 1

and

C_{LY} > 0

such that for all

f \in B_{tree, σ}

,

{[P f]}_{tree} \leq λ_{LY} {[f]}_{tree} + C_{LY} {∥ f ∥}_{σ} .

(152)

Iterating gives

{[P^{n} f]}_{tree} \leq λ_{LY}^{n} {[f]}_{tree} + C_{LY} \sum_{k = 0}^{n - 1} λ_{LY}^{n - 1 - k} {∥ P^{k} f ∥}_{σ} .

Since

B_{tree, σ} ↪ ℓ_{σ}^{1}

is compact, the Ionescu–Tulcea–Marinescu/Hennion theorem implies

ρ_{ess} (P) \leq λ_{LY} < 1,

(153)

so P is quasi-compact.

(2) Perron–Frobenius pair and rank-one projector. Positivity of P and ideal-irreducibility (Lemma 5.38) imply that the peripheral spectrum is

{1}

and that the eigenvalue

λ = 1

is simple. Hence there exist unique positive elements

h \in B_{tree, σ}, ϕ \in B_{tree, σ}^{*},

such that

P h = h, ϕ \circ P = ϕ, ϕ (h) = 1 .

(154)

The corresponding rank-one projector is

Π f = ϕ (f) h .

(155)

Let

N : = P - Π

. Then

Π N = N Π = 0

and by (153),

ρ (N) < 1 .

Consequently,

P^{n} f = ϕ (f) h + N^{n} f, {∥ N^{n} f ∥}_{tree} \leq C λ_{LY}^{n} ({[f]}_{tree} + {∥ f ∥}_{σ}),

(156)

so

P^{n} f \to ϕ (f) h

exponentially fast.
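The decomposition (156) has a familiar finite-dimensional analogue, sketched below as a toy illustration only. The 2×2 column-stochastic matrix `M` is a hypothetical example and is not the operator P of this paper. Its invariant functional is `phi = (1, 1)`, its invariant vector `h = (2/3, 1/3)` is normalized so that phi(h) = 1, and its second eigenvalue is 0.7, so the distance between M^n f and phi(f) h should shrink like 0.7^n.

```python
# Toy 2x2 analogue of P^n f = phi(f) h + N^n f. The matrix M is a hypothetical
# example chosen for illustration; it is NOT the Collatz transfer operator.
M = [[0.9, 0.2],
     [0.1, 0.8]]            # columns sum to 1, so phi = (1, 1) satisfies phi M = phi
h = [2.0 / 3.0, 1.0 / 3.0]  # M h = h, normalized so that phi(h) = 1


def apply(matrix, vec):
    """Return the matrix-vector product for a 2x2 matrix."""
    return [sum(matrix[i][j] * vec[j] for j in range(2)) for i in range(2)]


def phi(vec):
    """Invariant functional of the toy model: the sum of the entries."""
    return vec[0] + vec[1]


def error_after(f, n):
    """Sup-norm distance between M^n f and the rank-one limit phi(f) h."""
    g = list(f)
    for _ in range(n):
        g = apply(M, g)
    return max(abs(g[i] - phi(f) * h[i]) for i in range(2))


f = [1.0, 0.0]
errors = [error_after(f, n) for n in (0, 10, 20, 40)]
```

For this choice of f the difference f - phi(f) h is itself an eigenvector for 0.7, so the error equals 0.7^n / 3 up to rounding, which mirrors the exponential rate in (156).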

(3) Decay profile of h and exclusion of peripheral eigenfunctions. Let

c_{j}

denote the block averages of h. The effective block recursion (Proposition 5.14) yields

c_{j} = a c_{j + 1} + b c_{j - 1} + ε_{j}, a, b > 0, a + b = 1, \sum_{j \geq 1} ϑ^{j} | ε_{j} | < \infty .

The associated homogeneous recurrence has spectral radius

< 1

; hence any subexponentially bounded solution converges to a constant. Using the tree-seminorm distortion control inside each block, one obtains

h (n) \sim \frac{c}{n} (n \to \infty),

as in Proposition 5.13. This argument also shows that if

P h = λ h

with

| λ | = 1

, then the same block recursion forces h to be asymptotically constant. The weighted

ℓ_{σ}^{1}

contraction (Lemma 4.10) then forces

h \equiv 0

unless

λ = 1

. Thus the peripheral spectrum is

{1}

, as asserted in Theorem 5.28.

(4) Excluding divergent mass and infinite orbits. Suppose, contrary to the claim, that there exists either:

(i) a nontrivial P-invariant or P-periodic density

g \geq 0

supported on forward nonterminating trajectories, or

(ii) a set

S \subset N

of positive upper density whose elements generate only nonterminating forward orbits.

If (i) holds, write

g = ϕ (g) h + g_{0}

with

ϕ (g_{0}) = 0

. Then

P^{q} g = g

for some

q \geq 1

, and (156) gives

g - ϕ (g) h = N^{q m} g ⟶ 0 (m \to \infty),

forcing

g = ϕ (g) h

. But

h > 0

, while g is supported only on nonterminating orbits; this contradiction rules out (i).

If (ii) holds, the Krylov–Bogolyubov averages over

S \cap [1, N]

produce a weak^* accumulation point

μ

with

P^{*} μ = μ

, supported entirely on nonterminating values. By Theorem 5.28, every nontrivial

P^{*}

–invariant functional is a scalar multiple of

ϕ

. Since

ϕ

assigns positive mass to all sufficiently large integers (via the profile

h (n) \sim c / n

), such a

μ

cannot be supported exclusively on the nonterminating part of the tree. Hence (ii) is impossible.

Finally, if every infinite forward orbit generates a nontrivial

P^{*}

–invariant functional (the hypothesis of Theorems 5.29 and 5.32), then the same spectral argument forces each such functional to equal

ϕ

. Since

ϕ

charges all levels, it cannot arise from an orbit that eventually avoids the terminating region. Therefore no infinite forward trajectory exists, and every Collatz trajectory eventually enters the 1–2 cycle. □

Remark 5.42

(Conditional termination). The spectral conclusions of Theorem 5.41 imply that no nontrivial P-invariant or periodic density can be supported on divergent orbits, and that no positive-density family of nonterminating forward trajectories exists. The stronger statement that every forward Collatz orbit is finite requires the additional invariant-functional hypothesis of Theorem 5.32. Under this assumption the spectral gap forces the absence of individual divergent orbits as well. Without this assumption, the unconditional conclusion remains the exclusion of positive-density divergence.

6. Orbit Averages, Block Escape, and Forward Dynamics

We now turn to the proof of Theorem 6.2, using the analytic framework established in the preceding sections. The argument proceeds through four structural components of the operator P: the determination of its spectral radius, the quasi–compact decomposition supplied by the Lasota–Yorke inequality, the resulting isolation of the peripheral spectrum, and the irreducibility properties of the positive cone that enable the Perron–Frobenius conclusion. Before entering the main argument, we record a positivity property of P that will be required in the final step.

Proposition 6.1

(Strong positivity on the interior cone). Let

C^{+} = {f \in B_{tree, σ} : f (n) \geq 0 \forall n}, C^{+ +} = {f \in B_{tree, σ} : f (n) > 0 \forall n}

denote the positive cone and its algebraic interior. Let P be the backward Collatz transfer operator defined by

(P f) (n) : = \sum_{m : T (m) = n} \frac{f (m)}{m} = \frac{f (2 n)}{2 n} + 1_{{n \equiv 4 (6)}} \frac{f (\frac{n - 1}{3})}{(n - 1) / 3} .

(157)

Then:

(1)

P (C^{+ +}) \subset C^{+ +}

, i.e. P maps strictly positive functions to strictly positive functions.

(2)

Let

f_{1} \in C^{+ +}

and

f_{2} \in C^{+} ∖ {0}

. Let

f_{2}^{*} \in B_{tree, σ}^{*}

be a functional such that

(a): $f_{2}^{*} (g) \geq 0$ for all $g \in C^{+}$ , and
(b): $f_{2}^{*} (g) > 0$ for every nonzero $g \in C^{+}$ supported on the same dyadic blocks as $f_{2}$ .

Then

〈 P^{k} f_{1}, f_{2}^{*} 〉 > 0 for every integer k \geq 0 .

Proof.

We prove (1) and (2) separately.

Proof of (1): P preserves the interior

C^{+ +}

. Let

f \in C^{+ +}

, so

f (n) > 0

for every

n \in N

. From (157) we have, for each

n \in N

,

(P f) (n) = \frac{f (2 n)}{2 n} + 1_{{n \equiv 4 (6)}} \frac{f (\frac{n - 1}{3})}{(n - 1) / 3} .

Since

2 n \in N

and

f (2 n) > 0

, the first term satisfies

\frac{f (2 n)}{2 n} > 0 for all n \in N,

because

2 n > 0

. The second term is nonnegative:

1_{{n \equiv 4 (6)}} \frac{f (\frac{n - 1}{3})}{(n - 1) / 3} \geq 0,

since the indicator is 0 or 1,

(n - 1) / 3 > 0

whenever it appears, and

f ((n - 1) / 3) > 0

.

Therefore

(P f) (n) \geq \frac{f (2 n)}{2 n} > 0 for every n \in N,

so

P f \in C^{+ +}

and

P (C^{+ +}) \subset C^{+ +}

. By induction this extends to all iterates:

P^{k} (C^{+ +}) \subset C^{+ +} for all k \geq 0 .

Proof of (2): strict positivity of pairings with

P^{k} f_{1}

. Fix

f_{1} \in C^{+ +}

,

f_{2} \in C^{+} ∖ {0}

, and a dual functional

f_{2}^{*} \in B_{tree, σ}^{*}

satisfying (a) and (b).

Since

f_{2} \neq 0

, there exists at least one dyadic block

I_{J}

such that

max_{n \in I_{J}} f_{2} (n) > 0 .

Define

S_{J} : = {n \in I_{J} : f_{2} (n) > 0} .

Then

S_{J} \neq \emptyset

, and

f_{2}

is strictly positive on

S_{J}

.

Now fix any

k \geq 0

. By part (1),

P^{k} f_{1} \in C^{+ +}

, so

P^{k} f_{1} (n) > 0 for every n \in N,

and in particular

P^{k} f_{1} (n) > 0 for all n \in S_{J} .

Decompose

P^{k} f_{1}

into two nonnegative parts according to the support of

f_{2}

. Define

g_{k} (n) : = \begin{cases} P^{k} f_{1} (n), & n \in supp (f_{2}), \\ 0, & otherwise, \end{cases} \qquad h_{k} : = P^{k} f_{1} - g_{k} .

Then

g_{k}, h_{k} \in C^{+}

,

g_{k}

is supported on the same dyadic blocks as

f_{2}

, and

g_{k} \not\equiv 0

because

P^{k} f_{1} (n) > 0

on

S_{J} \subset supp (f_{2})

.

By assumption (b) we have

f_{2}^{*} (g_{k}) > 0,

and by assumption (a) we have

f_{2}^{*} (h_{k}) \geq 0 .

Therefore,

〈 P^{k} f_{1}, f_{2}^{*} 〉 = f_{2}^{*} (P^{k} f_{1}) = f_{2}^{*} (g_{k} + h_{k}) = f_{2}^{*} (g_{k}) + f_{2}^{*} (h_{k}) > 0 .

This holds for every integer

k \geq 0

, which proves (2) and completes the proof. □
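Part (1) can be checked pointwise on a computer. The following is a minimal sketch that evaluates formula (157) exactly with rational arithmetic. The function names `transfer` and `iterate` and the test function `one` (the constant function 1) are illustrative choices, and the sketch only evaluates pointwise values; it does not verify membership in the space B_{tree, σ} or any norm estimate.

```python
from fractions import Fraction


def transfer(f):
    """Return P f for a function f on the positive integers, following (157)."""
    def Pf(n):
        value = Fraction(f(2 * n), 2 * n)   # branch through the preimage m = 2n
        if n % 6 == 4:                       # indicator 1_{n = 4 mod 6}
            m = (n - 1) // 3                 # second preimage m = (n - 1) / 3
            value += Fraction(f(m), m)
        return value
    return Pf


def iterate(f, k):
    """Return P^k f by applying `transfer` k times."""
    for _ in range(k):
        f = transfer(f)
    return f


def one(n):
    """A strictly positive test function (hypothetical choice): f(n) = 1."""
    return Fraction(1)


P_one = transfer(one)
P3_one = iterate(one, 3)
all_positive = all(P3_one(n) > 0 for n in range(1, 200))
```

For instance, `P_one(4)` combines the two branches 1/8 and 1/1, while `P_one(1)` only sees the first branch; in both cases the value is strictly positive because the first term never vanishes, which is the mechanism used in the proof.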

Theorem 6.2

(Peripheral Spectral Classification). Let P be the backward Collatz transfer operator acting on the multiscale tree Banach space

B_{tree, σ}

. Then

P is quasi-compact, spec (P) \cap {| z | = 1} = {1},

and the eigenvalue 1 is algebraically and geometrically simple.

Proof.

We use the analytic structure developed previously: the two–norm Lasota–Yorke inequality on

B_{tree, σ}

, the compact embedding into

ℓ_{σ}^{1}

, the existence of an invariant density, the block recursion for eigenfunctions, the Dirichlet classification of peripheral eigenvalues, and the strong positivity of P on the interior cone.

(1) Quasi–compactness. By Proposition 4.11 there exist constants

0 < λ_{LY} < 1

and

C_{LY} > 0

such that for all

f \in B_{tree, σ}

,

{[P f]}_{tree} \leq λ_{LY} {[f]}_{tree} + C_{LY} {∥ f ∥}_{σ} .

(158)

The inclusion

B_{tree, σ} ↪ ℓ_{σ}^{1}

is compact by construction of the multiscale tree norm. Consequently, the Ionescu–Tulcea–Marinescu–Hennion theorem applies to (158) and yields a decomposition

P = K + R, ∥ R ∥ \leq λ_{LY} < 1,

with K compact on

B_{tree, σ}

. Thus P is quasi–compact and

ρ_{ess} (P) \leq λ_{LY} < 1 .

(2) Existence of a positive eigenfunction at eigenvalue 1. Section 5.2 constructs a strictly positive invariant density

h \in B_{tree, σ}, P h = h,

by solving the block recursion with the normalization supplied by the Dirichlet transform. In particular,

1 \in spec (P)

, so

ρ (P) \geq 1 .

(We do not use any

ℓ_{σ}^{1}

mass-preservation; the existence of the eigenpair

(1, h)

is established directly via the multiscale recursion and Dirichlet representation.)

(3) Classification of the peripheral spectrum. Let

z \in spec (P)

with

| z | = 1

, and suppose

f \neq 0

satisfies

P f = z f .

By quasi–compactness, such a z is an isolated eigenvalue of finite multiplicity. The block–average recursion of Proposition 5.14 applies to every eigenfunction with

| z | = 1

and implies that its block averages satisfy the same second–order homogeneous relation as those of the principal eigenfunction h. In particular, if

c_{j} (f)

denotes the block averages of f, then

c_{j} (f) \sim C 6^{- j}, j \to \infty,

for some constant

C \neq 0

. Combined with the tree–seminorm distortion bounds within each block, this shows that every peripheral eigenfunction is asymptotically block–constant with the same decay profile as h, in the sense that its mass distribution on

I_{j}

is proportional to

6^{- j}

to leading order.

Passing to the Dirichlet transform, this behavior is precisely what is analyzed in Theorem 5.28: the associated Dirichlet series of a peripheral eigenfunction extends holomorphically to the half–plane

ℜ (s) \geq σ

except for the simple pole at

s = 1

coming from the invariant density. The residue comparison in Theorem 5.28 shows that any eigenfunction with

| z | = 1

must therefore be proportional to the invariant density h. In particular, there is no eigenvalue on the unit circle other than 1, and

spec (P) \cap {| z | = 1} = {1},

with h spanning the peripheral eigenspace.

(4) Perron–Frobenius simplicity via strong positivity. We now invoke the cone structure. Let

C^{+} = {f \in B_{tree, σ} : f (n) \geq 0 \forall n}, C^{+ +} = {f \in B_{tree, σ} : f (n) > 0 \forall n}

denote the positive cone and its algebraic interior. By definition of P every coefficient in (10) is nonnegative, so P is positive:

P (C^{+}) \subset C^{+}

. Proposition 6.1 strengthens this to

P (C^{+ +}) \subset C^{+ +},

so P is strongly positive on the interior of the cone.

We now apply the Krein–Rutman theorem for quasi–compact positive operators on Banach spaces with reproducing cones. Since

ρ_{ess} (P) < 1

,

1 \in spec (P)

with a strictly positive eigenfunction

h \in C^{+ +}

, and P acts strongly positively on

C^{+ +}

, Krein–Rutman implies that:

ker (P - I) is one-dimensional, ker {(P - I)}^{k} is one-dimensional for all k \geq 1 .

Thus the eigenvalue 1 is both geometrically and algebraically simple.

Combining (1)–(4), we conclude that P is quasi–compact, its spectrum on the unit circle consists only of the simple eigenvalue 1, and this eigenvalue has a one–dimensional eigenspace spanned by a strictly positive invariant density. This proves the theorem. □

6.1. Orbit Averages and P*-Invariant Functionals

The spectral analysis developed in Section 4, Section 5 and Section 6 yields a complete resolution of the backward Collatz dynamics. In particular, Theorem 6.2 establishes that the backward transfer operator P acting on the multiscale Banach space

B_{tree, σ}

is quasi–compact with a simple, isolated eigenvalue at 1, and that all other spectral values satisfy

| z | < 1

. This provides a full Perron–Frobenius description of the invariant density h and of the asymptotic behavior of the iterates

P^{k}

.

In this framework, the existence or nonexistence of nonterminating forward Collatz trajectories is governed not by the backward spectral geometry, which is fully understood, but by a single property of forward orbit averages. We now isolate this property as a conjectural principle.

Conjecture 6.3

(Orbit–Averaging Conjecture). Let

n_{0} \in N

, and let

O^{+} (n_{0}) = {T^{k} n_{0}}_{k \geq 0}

denote its forward Collatz orbit. For each

N \geq 1

, consider the Cesàro orbit functional (130)

Λ_{N} (f) = \frac{1}{N} \sum_{k = 0}^{N - 1} f (T^{k} n_{0}), f \in B_{tree, σ} .

(159)

Assume that the forward orbit

O^{+} (n_{0})

is infinite. Then there exists a subsequence

N_{j} \to \infty

such that

Λ_{N_{j}} \overset{w^{*}}{⟶} Φ, Φ \neq 0,

(160)

and the limiting functional Φ is invariant under the dual operator:

P^{*} Φ = Φ .

(161)

Equivalently, every infinite forward orbit produces a nonzero

P^{*}

–invariant linear functional supported entirely on the orbit

O^{+} (n_{0})

.

Discussion and equivalent forms.

Lemma 5.30 shows that the Cesàro averages

{(Λ_{N})}_{N \geq 1}

form a uniformly bounded family in

B_{tree, σ}^{*}

, hence are weak^* relatively compact. Weak^* limit points therefore exist for every forward orbit. The conjecture asserts that at least one such limit point is nonzero. By Proposition 5.31, every nontrivial weak^* limit is automatically

P^{*}

–invariant and supported entirely on the forward orbit

O^{+} (n_{0})

.
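The Cesàro functional (159) is straightforward to evaluate for finite N. The sketch below does so for a terminating orbit, since no infinite orbit is known. It assumes the Collatz step in the form used in the proof of Lemma 6.7 (n/2 for even n, (3n + 1)/2 for odd n), and it uses the indicator of {1, 2} as a test function; the names `cesaro` and `cycle_indicator` are illustrative.

```python
from fractions import Fraction


def T(n):
    """Collatz step in the form used in the proof of Lemma 6.7 (assumed here)."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def cesaro(f, n0, N):
    """Return Lambda_N(f) = (1/N) * sum_{k < N} f(T^k n0) as an exact fraction."""
    total, n = Fraction(0), n0
    for _ in range(N):
        total += f(n)
        n = T(n)
    return total / N


def cycle_indicator(n):
    """Indicator function of the set {1, 2}."""
    return 1 if n in (1, 2) else 0


value_short = cesaro(cycle_indicator, 3, 6)     # orbit starts 3, 5, 8, 4, 2, 1
value_long = cesaro(cycle_indicator, 3, 6000)
```

Here the orbit of 3 spends its first four steps outside {1, 2} and stays in the cycle afterwards, so the averages tend to 1. For a hypothetical infinite orbit the same averages would tend to 0 for every finitely supported test function, which is why nontriviality of the limit is the delicate point.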

The Orbit–Averaging Conjecture thus asserts that an infinite Collatz trajectory cannot be spectrally invisible in the backward geometry: it must leave a genuine trace in the dual space

B_{tree, σ}^{*}

in the form of a nonzero invariant functional supported on that orbit.

Theorem 6.2 shows that the only

P^{*}

–invariant functional (up to scaling) is the Perron–Frobenius functional

φ

, which is strictly positive on all of

N

. If Conjecture 6.3 holds, any invariant functional

Φ

generated as a weak^* limit of the orbit averages must satisfy

Φ = c φ

for some

c \neq 0

. However, such a

Φ

is supported solely on

O^{+} (n_{0})

, whereas

φ

assigns strictly positive mass to every integer. These properties cannot both hold unless

O^{+} (n_{0})

is finite.

Consequently,

Theorem 6.2 + Conjecture 6.3 ⟹ every Collatz trajectory is finite .

In this sense, Conjecture 6.3 isolates the only remaining forward–dynamical ingredient needed to complete the Collatz conjecture once the backward spectral picture has been fully resolved.

Partial Proof of the Orbit–Averaging Conjecture.

Let

n_{0} \in N

have infinite forward orbit

O^{+} (n_{0}) = {T^{k} n_{0}}_{k \geq 0} .

Define the Cesàro orbit functionals

Λ_{N} (f) = \frac{1}{N} \sum_{k = 0}^{N - 1} f (T^{k} n_{0}), f \in B_{tree, σ} .

(1) Uniform boundedness. For

f \in B_{tree, σ}

with

{∥ f ∥}_{B_{tree, σ}} \leq 1

, we estimate:

| Λ_{N} (f) | \leq \frac{1}{N} \sum_{k = 0}^{N - 1} | f (T^{k} n_{0}) | .

Each point

m \in N

appears at most once in the forward orbit, so

| f (T^{k} n_{0}) | \leq {∥ f ∥}_{ℓ_{σ}^{1}} {(T^{k} n_{0})}^{σ} .

Since

{∥ f ∥}_{ℓ_{σ}^{1}} \leq {∥ f ∥}_{B_{tree, σ}} \leq 1

, we obtain

| Λ_{N} (f) | \leq \frac{1}{N} \sum_{k = 0}^{N - 1} {(T^{k} n_{0})}^{σ} .

Lemma 5.30 proves that this quantity is uniformly bounded (independently of N), because

sup_{N \geq 1} \frac{1}{N} \sum_{k = 0}^{N - 1} {(T^{k} n_{0})}^{σ} < \infty .

Hence

sup_{N \geq 1} {∥ Λ_{N} ∥}_{B_{tree, σ}^{*}} \leq C < \infty .

(2) Existence of weak^* limits (Banach–Alaoglu). The ball

{Φ \in B_{tree, σ}^{*} : ∥ Φ ∥ \leq C}

is weak^* compact. Thus there exist

N_{j} \to \infty

and a functional

Φ \in B_{tree, σ}^{*}

such that

Λ_{N_{j}} \overset{w^{*}}{⟶} Φ .

(162)

(3)

P^{*}

–invariance of weak^* limits. For

f \in B_{tree, σ}

,

Λ_{N} (f \circ T) = \frac{1}{N} \sum_{k = 0}^{N - 1} f (T^{k + 1} n_{0}) = \frac{1}{N} \sum_{k = 1}^{N} f (T^{k} n_{0}) = Λ_{N} (f) + \frac{1}{N} (f (T^{N} n_{0}) - f (n_{0})) .

Because

f \in B_{tree, σ}

, the term

\frac{1}{N} (f (T^{N} n_{0}) - f (n_{0})) = O (1 / N) .

Therefore

lim_{N \to \infty} (Λ_{N} (f \circ T) - Λ_{N} (f)) = 0 .

Passing to the weak^* limit (162) gives

Φ (f \circ T) = Φ (f), f \in B_{tree, σ} .

By definition of

P^{*}

, this means

P^{*} Φ = Φ .

(4) Structure of

P^{*}

–invariant functionals (spectral theorem). Theorem 6.2 shows that

P^{*}

has a one-dimensional invariant subspace:

ker (P^{*} - I) = span {φ},

where

φ

is strictly positive on

N

. Hence every nonzero invariant functional has the form

Φ = c φ .

(5) The remaining issue: nontriviality. Nothing in Steps 1–4 guarantees that the limit

Φ

is nonzero. The Orbit–Averaging Conjecture asserts exactly that:

For every infinite orbit

O^{+} (n_{0})

, at least one weak^* limit of the Cesàro functionals

Λ_{N}

is nonzero.

If this nontriviality holds, then

Φ = c φ

with

c \neq 0

. Since

φ

is positive on all of

N

while

Φ

is supported on the single orbit

O^{+} (n_{0})

, a contradiction follows unless the orbit is finite. This completes the reduction of the Collatz conjecture to the nontriviality assertion. □

6.2. Block-Structured Implications of the Orbit-Averaging Framework

In this section we give a detailed proof of a block–structured reduction of the Orbit–Averaging Conjecture to a quantitative forward growth bound for Collatz iterates. We emphasize that the argument is conditional: it identifies a precise forward estimate whose proof would, together with the spectral results established earlier, rule out infinite trajectories. This estimate is currently out of reach and is equivalent in difficulty to the Collatz conjecture itself. Recall the block decomposition

I_{j} : = [6^{j}, 6^{j + 1}), j \geq 0,

(163)

and, for a given forward orbit

O^{+} (n_{0}) = {T^{k} n_{0}}_{k \geq 0},

define the block index

J (k) : = the unique integer j \geq 0 with T^{k} n_{0} \in I_{j} .

(164)

Then

6^{J (k)} \leq T^{k} n_{0} < 6^{J (k) + 1} and J (k) \leq \frac{log T^{k} n_{0}}{log 6} < J (k) + 1 .

(165)
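The block index (164) can be computed with integer arithmetic alone, which avoids floating-point errors in the logarithm of (165). The following sketch is illustrative: the helper names are hypothetical, and the Collatz step is assumed to be the one used in the proof of Lemma 6.7 (n/2 for even n, (3n + 1)/2 for odd n).

```python
def block_index(n):
    """Return the unique j >= 0 with 6**j <= n < 6**(j + 1), using integers only."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    j, power = 0, 6
    while power <= n:
        power *= 6
        j += 1
    return j


def T(n):
    """Collatz step as in the proof of Lemma 6.7 (assumed form of the map)."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def block_indices(n0, steps):
    """Return [J(0), ..., J(steps - 1)] along the forward orbit of n0."""
    out, n = [], n0
    for _ in range(steps):
        out.append(block_index(n))
        n = T(n)
    return out


indices = block_indices(27, 10)
```

Because the blocks have ratio 6 while a single step changes the iterate by a factor of at most 2 in either direction, consecutive block indices along an orbit differ by at most 1.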

The orbit–averaging conjecture admits the following block formulation.

Conjecture 6.4

(Block–Orbit–Averaging). Let

n_{0} \in N

have an infinite forward Collatz orbit

O^{+} (n_{0}) = {T^{k} n_{0}}_{k \geq 0}

, and let

J (k)

denote the block index of

T^{k} n_{0}

, so that

T^{k} n_{0} \in I_{J (k)}

. Then there exist integers

J \geq 0

and a constant

δ > 0

such that

\underset{N \to \infty}{lim inf} \frac{1}{N} \sum_{k = 0}^{N - 1} 1_{{J (k) \leq J}} \geq δ .

(166)

Equivalently, every infinite forward orbit spends a positive proportion of its time inside the finite union of low blocks

⋃_{j \leq J} I_{j}

.

Given Conjecture 6.4, the original Orbit–Averaging Conjecture follows immediately: choose a nontrivial positive test function

f \in B_{tree, σ}

supported in

⋃_{j \leq J} I_{j}

, so that there exists

c > 0

with

f (n) \geq c

for all

n \in ⋃_{j \leq J} I_{j}

. Then

\frac{1}{N} \sum_{k = 0}^{N - 1} f (T^{k} n_{0}) \geq c \frac{1}{N} \sum_{k = 0}^{N - 1} 1_{{J (k) \leq J}},

and (166) implies

\underset{N \to \infty}{lim inf} \frac{1}{N} \sum_{k = 0}^{N - 1} f (T^{k} n_{0}) \geq c δ > 0 .

Any weak^* limit of the associated Cesàro functionals is therefore a nontrivial

P^{*}

–invariant functional, and the Perron–Frobenius argument from Section 5 then rules out the existence of such an infinite orbit.

6.2.1. Contrapositive Route to the Block-Escape Property

We now give a detailed contrapositive argument: we assume that the block–averaging statement (166) fails for some infinite orbit and deduce a strong lower bound on the growth rate of that orbit, under an additional quantitative hypothesis.

Definition 6.5

(Block-escape). We say that an infinite orbit

O^{+} (n_{0})

escapes all finite block unions if for every

J \geq 0

,

lim_{N \to \infty} \frac{1}{N} \sum_{k = 0}^{N - 1} 1_{{J (k) \leq J}} = 0 .

(167)

Equivalently, for each fixed J the asymptotic frequency of visits to

⋃_{j \leq J} I_{j}

is zero.

Remark 6.6 (Block-escape Property (BEP)). The definition above expresses block–escape in a soft, density–based form: for every fixed J, the orbit visits the finite union

⋃_{j \leq J} I_{j}

with zero asymptotic frequency. This allows the possibility that the orbit may return to low blocks infinitely often, provided that these returns occur ever more sparsely. By contrast, the formulation

j (n_{k}) \to \infty

describes a stronger notion of escape in which the block index eventually exceeds every finite threshold and never returns below it. We will call this stronger notion the Block-escape Property (BEP).

These two formulations become equivalent once the weak non–retreat principle is available. Weak non–retreat asserts that, at sufficiently large scales, the block index

j (n_{k})

cannot remain near any fixed height for long intervals: whenever the orbit is in a high block, it must make a definite upward gain within uniformly bounded time. Consequently, if the orbit spends zero density in all low blocks—that is, if the soft block–escape condition holds—then repeated applications of weak non–retreat force the block index to drift upward and eventually exceed every finite bound. In this sense, density–zero escape and genuine unbounded escape coincide once the weak non–retreat mechanism is in place.

Note that (167) certainly implies

J (k) \to \infty

as

k \to \infty

, since otherwise infinitely many iterates would lie in some fixed finite union of blocks, which is impossible because an infinite orbit visits each integer at most once. However, it does not force

J (k)

to grow linearly in k: sequences such as

J (k) = ⌊ {log}_{2} k ⌋

also satisfy the property that for each fixed J,

\frac{1}{N} |{0 \leq k < N : J (k) \leq J}| ⟶ 0,

while

J (k) / k \to 0

. Thus the block–escape condition by itself yields only that the orbit visits higher and higher blocks, without imposing a quantitative linear growth rate on

J (k)

.
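The slow-escape example can be checked numerically. The sketch below treats J(k) = ⌊log_2 k⌋ as an abstract model sequence, not as the block index of an actual Collatz orbit, and starts at k = 1 because the logarithm is undefined at k = 0. The helper names are illustrative.

```python
def slow_index(k):
    """Model block index J(k) = floor(log2 k) for k >= 1 (not a Collatz orbit)."""
    return k.bit_length() - 1


def low_block_frequency(N, J):
    """Fraction of 1 <= k <= N with slow_index(k) <= J."""
    return sum(1 for k in range(1, N + 1) if slow_index(k) <= J) / N


J = 5
# Frequency of visits to the low blocks {J(k) <= 5} for growing N.
freqs = [low_block_frequency(2 ** e, J) for e in (8, 12, 16)]
# The ratio J(k) / k at k = 2**8, 2**12, 2**16.
ratios = [slow_index(2 ** e) / 2 ** e for e in (8, 12, 16)]
```

Only the 63 indices k = 1, ..., 63 satisfy J(k) ≤ 5, so the frequencies are 63/N and tend to 0, while J(k)/k also tends to 0. Both sequences decay, which is the sense in which block–escape carries no linear growth information.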

6.2.2. Quantitative Forward Valuation Growth

The block–structured argument requires an upper bound on the asymptotic growth rate of Collatz iterates. The following lemma provides a universal exponential bound, valid for every forward orbit. Its proof is elementary.

Lemma 6.7

(Universal exponential growth bound). For every

n_{0} \in N

, the forward orbit of the Collatz map satisfies

T^{k} (n_{0}) \leq 2^{k} n_{0}, k \geq 0 .

(168)

Consequently,

\underset{k \to \infty}{lim sup} \frac{1}{k} log T^{k} (n_{0}) \leq log 2 .

(169)

Proof.

For every

n \in N

the Collatz map satisfies

T (n) \leq 2 n

:

If n is even, then $T (n) = n / 2 \leq 2 n$ .
If n is odd, then

$T (n) = \frac{3 n + 1}{2} \leq \frac{3 n + n}{2} = 2 n,$

since $1 \leq n$ .

Thus

T (n) \leq 2 n

for all n. By induction,

T^{k} (n_{0}) \leq 2^{k} n_{0}, k \geq 0 .

Taking logarithms and dividing by k gives

\frac{1}{k} log T^{k} (n_{0}) \leq \frac{log n_{0}}{k} + log 2,

and letting

k \to \infty

yields (169). □
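The one-step inequality and the iterated bound (168) are easy to confirm on a finite range. The following sketch is a sanity check under the form of the map used in the proof above, not a substitute for the induction; the ranges 10 000, 500 and 200 are arbitrary illustrative choices.

```python
def T(n):
    """Collatz step in the form used in the proof above."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def bound_holds(n0, steps):
    """Check T^k(n0) <= 2**k * n0 for every 0 <= k <= steps."""
    n = n0
    for k in range(steps + 1):
        if n > (2 ** k) * n0:
            return False
        n = T(n)
    return True


# One-step bound T(n) <= 2n on a finite range.
one_step_ok = all(T(n) <= 2 * n for n in range(1, 10_000))
# Iterated bound (168) for small starting values.
orbit_ok = all(bound_holds(n0, 200) for n0 in range(1, 500))
```

Equality in the one-step bound occurs only at n = 1, where T(1) = 2, which is consistent with the step 1 ≤ n in the odd case of the proof.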

Since

log 2 < log 6

, Lemma 6.7 provides an explicit constant

γ = log 2

satisfying

γ < log 6

. We now show that if an infinite orbit were to “escape” all finite collections of blocks, this would force its exponential growth rate to exceed

log 6

, contradicting Lemma 6.7.

6.3. From Block Escape to Linear Block Growth

In this subsection we separate the two ingredients that enter the contradiction with the universal growth bound: the block–escape property and the existence of a linearly growing subsequence of block indices. The latter is precisely the additional combinatorial hypothesis isolated in Conjecture 6.10.

Proposition 6.8

(Block–escape and linear block growth contradict Lemma 6.7). Let

O^{+} (n_{0}) = {T^{k} (n_{0})}_{k \geq 0}

be an infinite forward Collatz orbit, and let

J (k)

denote the associated block index with

T^{k} (n_{0}) \in I_{J (k)} = [6^{J (k)}, 6^{J (k) + 1})

. Assume that:

The orbit satisfies the block–escape condition

$\forall J_{0} \geq 0, lim_{N \to \infty} \frac{1}{N} \sum_{k = 0}^{N - 1} 1_{{J (k) \leq J_{0}}} = 0 .$

(170)
There exist $α > {log}_{6} 2$ and a strictly increasing sequence $k_{ℓ} \to \infty$ such that

$J (k_{ℓ}) \geq α k_{ℓ} for all ℓ .$

(171)

Then the orbit

O^{+} (n_{0})

violates the universal growth bound of Lemma 6.7. In particular, no orbit can satisfy both (170) and (171).

Proof.

By definition of the blocks,

6^{J (k)} \leq T^{k} (n_{0}) < 6^{J (k) + 1},

so

J (k) log 6 \leq log T^{k} (n_{0}) < (J (k) + 1) log 6 .

Along the subsequence

(k_{ℓ})

from (171) we have

\frac{1}{k_{ℓ}} log T^{k_{ℓ}} (n_{0}) \geq \frac{J (k_{ℓ})}{k_{ℓ}} log 6 \geq α log 6 .

Taking

{lim inf}_{ℓ \to \infty}

yields

\underset{ℓ \to \infty}{lim inf} \frac{1}{k_{ℓ}} log T^{k_{ℓ}} (n_{0}) \geq α log 6 .

Since

α > {log}_{6} 2

, we have

α log 6 > log 2

, hence

\underset{k \to \infty}{lim sup} \frac{1}{k} log T^{k} (n_{0}) \geq \underset{ℓ \to \infty}{lim inf} \frac{1}{k_{ℓ}} log T^{k_{ℓ}} (n_{0}) \geq α log 6 > log 2 .

This contradicts Lemma 6.7, which asserts

\underset{k \to \infty}{lim sup} \frac{1}{k} log T^{k} (n_{0}) \leq log 2 .

Hence no infinite orbit can satisfy both the block–escape condition (170) and the supercritical linear growth property (171). □

Remark 6.9.

The block–escape property controls only the asymptotic frequency with which an orbit returns to any fixed collection of low blocks. It does not, by itself, impose any lower bound on the rate at which the block index

J (k)

must diverge. In particular, block–escape permits arbitrarily slow divergence such as

J (k) \sim log k

. The linear growth condition (171) is therefore an additional combinatorial hypothesis, made explicit in Conjecture 6.10, and is not a consequence of block–escape alone.

Combining Proposition 6.8 with the block formulation of orbit averages, we conclude that the Orbit–Averaging Conjecture holds provided that every infinite orbit with the block–escape property also satisfies the supercritical linear growth condition (171). Under that proviso block–escape cannot occur, so the forward orbit visits some finite union of low blocks with positive upper density, which yields a positive Cesàro average along a subsequence for an appropriate test function.

6.4. The Linear Block Growth Conjecture and Valuation Drift

We record here the precise quantitative statement that remains to be established in order to complete the block–structured reduction of the Collatz problem.

Conjecture 6.10

(Collatz Block–Escape Implies Supercritical Linear Block Growth). Let

O^{+} (n_{0}) = {T^{k} (n_{0})}_{k \geq 0}

be an infinite forward Collatz orbit, and let

J (k)

denote its block index determined by

T^{k} (n_{0}) \in [6^{J (k)}, 6^{J (k) + 1}) .

Assume the orbit satisfies the Block–Escape Property

\forall J_{0} \geq 0, lim_{N \to \infty} \frac{1}{N} \sum_{k = 0}^{N - 1} 1_{{J (k) \leq J_{0}}} = 0,

so that the orbit spends zero density in low blocks.

Define the critical drift threshold

α_{*} : = \frac{{log}_{2} 3 - 1}{{log}_{2} 6} .

Then the Collatz dynamics forces a supercritical linear rate of block growth: there exists a constant

α > α_{*}

and an increasing infinite subsequence

k_{1} < k_{2} < \dots

such that

J (k_{ℓ}) \geq α k_{ℓ} for all ℓ .

Equivalently,

T^{k_{ℓ}} (n_{0}) \geq 6^{α k_{ℓ}},

so the orbit exhibits genuine exponential growth along a subsequence at a rate strictly larger than the critical drift exponent

α_{*}

. Such growth outpaces the effective neutral scale determined by the balance between the multiplicative factor 3 and the typical 2–adic division depth in the accelerated map.

Even without Conjecture 6.10, the Block–Escape Property already implies a coarse divergence estimate. Specifically:

lim_{N \to \infty} \frac{1}{N} \sum_{k = 0}^{N - 1} J (k) = \infty .

(172)

Indeed, fix

M \geq 0

. Since the set

{k : J (k) < M}

has vanishing density by BEP, the contribution from such indices is negligible. On the other hand, for all k with

J (k) \geq M

, the summand contributes at least M. Taking the limit inferior and letting

M \to \infty

yields (172).

Divergence of the average is insufficient.

The growth asserted in Conjecture 6.10 is far stronger than the mere divergence of the block–index average

\frac{1}{N} \sum_{k < N} J (k) \to \infty .

Indeed, a slowly escaping sequence such as

J (k) = ⌊ log k ⌋

satisfies

\frac{1}{N} \sum_{k < N} J (k) \sim log N \to \infty, \frac{J (k)}{k} \to 0,

and also has vanishing density in every finite block window. Thus the Block–Escape Property (BEP) is compatible with arbitrarily slow sublinear escape, and the divergence of the Cesàro average of

J (k)

does not imply the existence of a subsequence with linear growth.

Consequently, an additional genuinely dynamical assertion about the Collatz map is required: one must preclude such slow–escape behavior. In minimal form, the missing implication is

(BEP) ⟹ \underset{k \to \infty}{lim sup} \frac{J (k)}{k} > α_{*}, α_{*} : = \frac{{log}_{2} 3 - 1}{{log}_{2} 6} .

This is precisely the content of Conjecture 6.10: within genuine Collatz dynamics, the block–escape mechanism is expected to force a supercritical linear rate of escape along an infinite subsequence, with slope strictly exceeding the critical drift exponent

α_{*}

. In particular, Conjecture 6.10 asserts that Collatz orbits cannot realize the extremely slow, logarithmic escape patterns that are still compatible with BEP itself.
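The gap between divergence of the Cesàro average and linear escape can be illustrated numerically. The following is a minimal sketch for the model sequence J(k) = ⌊log k⌋ discussed above; the function name `slow_escape_stats` and the choice to start at k = 1 are assumptions of this illustration, not part of the argument.

```python
import math

def slow_escape_stats(N):
    """Cesaro average and final ratio J(N)/N for the model J(k) = floor(log k)."""
    # The logarithm is undefined at k = 0, so this sketch starts at k = 1.
    J = [math.floor(math.log(k)) for k in range(1, N + 1)]
    average = sum(J) / N
    ratio = J[-1] / N
    return average, ratio

# Critical drift exponent alpha_* = (log2(3) - 1) / log2(6).
alpha_star = (math.log2(3) - 1) / math.log2(6)

for N in (10**2, 10**4, 10**6):
    avg, ratio = slow_escape_stats(N)
    # The average keeps growing while J(N)/N stays far below alpha_*.
    print(N, round(avg, 3), ratio, ratio > alpha_star)
```

The printed averages grow with N while the ratio J(N)/N shrinks, which is the behaviour that Conjecture 6.10 is meant to exclude for genuine Collatz orbits.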

6.5. Dangerous Residues and Valuation Statistics

The block analysis developed in the preceding subsections ultimately reduces the forward Collatz problem to a quantitative question about the frequency and distribution of the 2–adic valuations

ν_{2} (3 n + 1)

along odd–to–odd iterates. To support this reduction, we now collect a set of arithmetic and probabilistic tools that clarify how valuation patterns behave at both the structural and statistical levels. The first part of the section develops an exact affine description of finite valuation patterns and derives precise density bounds showing that long windows in which all valuations are large are exponentially sparse. These results are unconditional and reflect the intrinsic congruential rigidity of the recurrence

n \mapsto (3 n + 1) / 2^{ν_{2} (3 n + 1)}

. The second part introduces the standard randomness heuristics for the sequence

3 n + 1

, leading to the expectation that valuation deficits occur with positive frequency. When combined with the algebraic evolution formula for odd–to–odd iterations, these heuristic estimates yield a weak non–retreat mechanism: within any sufficiently large block of the trajectory, short windows of below–average valuations force a definite increase in the 2–adic block index. Taken together, the results of this section supply the valuation–theoretic and probabilistic input needed for the linear block–growth conjecture formulated earlier.

Lemma 6.11

(Arithmetic description of finite valuation patterns). Let

t \geq 1

and fix a sequence of positive integers

(a_{0}, a_{1}, \dots, a_{t - 1}) \in N^{t} .

Consider the accelerated odd–to–odd Collatz evolution

n_{i + 1} = \frac{3 n_{i} + 1}{2^{a_{i}}}, i = 0, \dots, t - 1,

with all

n_{i}

odd. Let

S_{t} : = a_{0} + \dots + a_{t - 1}

. Then:

(1) For each t and $(a_{0}, \dots, a_{t - 1})$, there exists an integer $C_{t}$ such that

$n_{t} = \frac{3^{t}}{2^{S_{t}}} n_{0} + \frac{C_{t}}{2^{S_{t}}} .$

(173)

(2) Conversely, $n_{0}$ yields an integer odd trajectory ${(n_{i})}_{i = 0}^{t}$ with this valuation pattern if and only if

$3^{t} n_{0} + C_{t} \equiv 0 (mod 2^{S_{t}})$

(174)

and the intermediate values $n_{i}$ satisfy the oddness and valuation constraints $ν_{2} (3 n_{i} + 1) = a_{i}$, which can be enforced by finitely many additional congruence conditions modulo powers of 2.

(3) In particular, the set of odd $n_{0}$ producing this pattern is a finite union of arithmetic progressions modulo a modulus of the form

$M_{t} = 2^{S_{t} + O (t)} 3^{t} .$

Proof.

We first prove the affine relation (173) by induction and obtain an explicit formula for

C_{t}

. For

t = 1

we have

n_{1} = \frac{3 n_{0} + 1}{2^{a_{0}}} = \frac{3}{2^{a_{0}}} n_{0} + \frac{1}{2^{a_{0}}},

so

S_{1} = a_{0}

and (173) holds with

C_{1} = 1

. Now suppose that for some

t \geq 1

we have

n_{t} = \frac{3^{t}}{2^{S_{t}}} n_{0} + \frac{C_{t}}{2^{S_{t}}}

for an integer

C_{t}

and

S_{t} = a_{0} + \dots + a_{t - 1}

. Then

n_{t + 1} = \frac{3 n_{t} + 1}{2^{a_{t}}} = \frac{3}{2^{a_{t}}} (\frac{3^{t}}{2^{S_{t}}} n_{0} + \frac{C_{t}}{2^{S_{t}}}) + \frac{1}{2^{a_{t}}} = \frac{3^{t + 1}}{2^{S_{t + 1}}} n_{0} + \frac{3 C_{t} + 2^{S_{t}}}{2^{S_{t + 1}}},

where

S_{t + 1} = S_{t} + a_{t}

. Thus (173) holds for

t + 1

with

C_{t + 1} : = 3 C_{t} + 2^{S_{t}},

and by induction we obtain an integer

C_{t}

for every

t \geq 1

. Unwinding the recurrence gives the explicit expression

C_{t} = \sum_{j = 0}^{t - 1} 3^{t - 1 - j} 2^{S_{j}}, S_{j} : = a_{0} + \dots + a_{j - 1} (S_{0} : = 0),

(175)

though we will not need this closed form in what follows.
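The affine relation (173), the recurrence for C_t, and the closed form (175) can be checked on a concrete orbit. The sketch below is illustrative only; the helper names (`nu2`, `odd_orbit`, `C_recursive`, `C_closed`) and the starting value 27 are assumptions chosen for the example.

```python
def nu2(m):
    """2-adic valuation of a positive integer m."""
    return (m & -m).bit_length() - 1

def odd_orbit(n0, t):
    """First t accelerated odd-to-odd steps from odd n0: (orbit, valuations)."""
    orbit, vals = [n0], []
    for _ in range(t):
        m = 3 * orbit[-1] + 1
        a = nu2(m)
        vals.append(a)
        orbit.append(m >> a)
    return orbit, vals

def C_recursive(vals):
    """C_t from the recurrence C_{t+1} = 3 C_t + 2^{S_t}."""
    C, S = 0, 0  # starting from C_0 = 0, S_0 = 0 gives C_1 = 1 after one step
    for a in vals:
        C = 3 * C + 2**S
        S += a
    return C

def C_closed(vals):
    """Closed form (175): sum over j of 3^(t-1-j) * 2^(S_j)."""
    t = len(vals)
    partial = [sum(vals[:j]) for j in range(t)]  # S_0, ..., S_{t-1}
    return sum(3**(t - 1 - j) * 2**partial[j] for j in range(t))

n0, t = 27, 8
orbit, vals = odd_orbit(n0, t)
S_t, C_t = sum(vals), C_recursive(vals)
# (173) cleared of denominators: 2^{S_t} n_t = 3^t n_0 + C_t
print(vals, C_t == C_closed(vals), 2**S_t * orbit[-1] == 3**t * n0 + C_t)
```

Working with the identity cleared of denominators keeps everything in exact integer arithmetic.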

This proves (1). We now turn to (2). Suppose first that

{(n_{i})}_{i = 0}^{t}

is an integer sequence of odd numbers satisfying

n_{i + 1} = \frac{3 n_{i} + 1}{2^{a_{i}}}, i = 0, \dots, t - 1 .

Then the inductive calculation above is valid with integer

n_{i}

at each step, so (173) holds with the same

C_{t}

and

S_{t}

. Since

n_{t}

is an integer, we must have

2^{S_{t}} | (3^{t} n_{0} + C_{t}),

which is exactly the congruence (174). In addition, the identities

n_{i + 1} = \frac{3 n_{i} + 1}{2^{a_{i}}}

imply that

3 n_{i} + 1

is divisible by

2^{a_{i}}

at each step, and the requirement that

n_{i + 1}

be odd forces

ν_{2} (3 n_{i} + 1) = a_{i}

; these conditions can be written as congruence conditions modulo suitable powers of 2, which we describe more explicitly below.

Conversely, suppose that

n_{0}

is an integer such that (174) holds and such that the congruence conditions enforcing

n_{i}

odd and

ν_{2} (3 n_{i} + 1) = a_{i}

are satisfied for

i = 0, \dots, t - 1

. We define

n_{1}, \dots, n_{t}

forward by

n_{i + 1} : = \frac{3 n_{i} + 1}{2^{a_{i}}}, i = 0, \dots, t - 1 .

By construction of the congruence conditions,

3 n_{i} + 1

is divisible by

2^{a_{i}}

for each i and the quotient

n_{i + 1}

is an odd integer. Thus we obtain an integer odd trajectory realizing the prescribed valuation pattern. Repeating the inductive calculation of (173) shows that the resulting

n_{t}

is given by (173), so (174) is necessary and sufficient for integrality of

n_{t}

once the intermediate valuation constraints are enforced. This proves the equivalence in (2).

It remains to justify the congruence description in (3) and bound the modulus

M_{t}

. We first note that for each fixed

a \geq 1

, the condition

n odd and ν_{2} (3 n + 1) = a

is a purely 2–adic condition on n and therefore corresponds to a finite union of residue classes modulo

2^{a + 1}

. Indeed,

ν_{2} (3 n + 1) \geq a

is equivalent to

3 n + 1 \equiv 0 (mod 2^{a}),

and the additional requirement

ν_{2} (3 n + 1) = a

is equivalent to

3 n + 1 \equiv 2^{a} u (mod 2^{a + 1})

for some odd u, which is again a finite union of congruence classes modulo

2^{a + 1}

. Intersecting with the parity condition

n \equiv 1 (mod 2)

still yields a finite union of classes modulo

2^{a + 1}

.
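The single-step residue description can be made concrete by brute force. The following sketch enumerates the odd residues modulo 2^{a+1} with valuation exactly a; the function name `residues_with_valuation` is hypothetical. Checking a representative r suffices here because the condition depends only on 3n + 1 modulo 2^{a+1}.

```python
def nu2(m):
    """2-adic valuation of a positive integer m."""
    return (m & -m).bit_length() - 1

def residues_with_valuation(a):
    """Odd residues r modulo 2^(a+1) with nu_2(3r + 1) = a."""
    modulus = 2 ** (a + 1)
    return [r for r in range(1, modulus, 2) if nu2(3 * r + 1) == a]

for a in range(1, 6):
    print(a, 2 ** (a + 1), residues_with_valuation(a))
```

For each value of a tried in this range the list contains a single residue, which is consistent with the condition 3n + 1 ≡ 2^a (mod 2^{a+1}) having a unique solution because 3 is invertible modulo powers of 2.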

Now fix

i \in {0, \dots, t - 1}

. By iterating the affine relation (173) up to time i we have

n_{i} = \frac{3^{i}}{2^{S_{i}}} n_{0} + \frac{C_{i}}{2^{S_{i}}},

with

S_{i} = a_{0} + \dots + a_{i - 1}

. Let

E_{i}

be the set of odd integers m with

ν_{2} (3 m + 1) = a_{i}

; as just observed,

E_{i}

is a finite union of residue classes modulo

2^{a_{i} + 1}

. The requirement that

n_{i} \in E_{i}

is therefore equivalent to

\frac{3^{i}}{2^{S_{i}}} n_{0} + \frac{C_{i}}{2^{S_{i}}} \equiv r (mod 2^{a_{i} + 1})

for some residue r (depending on the chosen class in

E_{i}

). Multiplying through by

2^{S_{i}}

gives

3^{i} n_{0} + C_{i} \equiv 2^{S_{i}} r (mod 2^{S_{i} + a_{i} + 1}) .

Thus, for each admissible residue class r modulo

2^{a_{i} + 1}

, the corresponding set of

n_{0}

satisfying the valuation condition at time i is a (possibly empty) congruence class modulo

2^{S_{i} + a_{i} + 1} 3^{i}

: we may regard

3^{i}

as a unit modulo powers of 2, but if we wish to solve for

n_{0}

as a congruence in

Z

rather than in

Z / 2^{S_{i} + a_{i} + 1} Z

, we can encode the factor

3^{i}

in the modulus, taking the modulus to be

M_{i} : = 2^{S_{i} + a_{i} + 1} 3^{i} .

In particular, for each fixed i the set of

n_{0}

for which the ith valuation condition holds is a finite union of arithmetic progressions modulo

M_{i}

.

The global condition (174) for

n_{t}

to be an integer imposes the additional congruence

3^{t} n_{0} + C_{t} \equiv 0 (mod 2^{S_{t}}),

which can be absorbed into the same framework by viewing it as a congruence modulo

2^{S_{t}} 3^{t}

. The set of

n_{0}

giving rise to an odd trajectory with the prescribed valuation pattern is therefore the intersection of finitely many sets, each of which is a finite union of arithmetic progressions modulo some modulus of the form

M_{i} = 2^{S_{i} + a_{i} + 1} 3^{i}

or

2^{S_{t}} 3^{t}

. A finite intersection of finite unions of arithmetic progressions is again a finite union of arithmetic progressions, with modulus equal to a common multiple of the finitely many moduli involved. Thus the full set of admissible

n_{0}

is a finite union of arithmetic progressions modulo

M_{t} : = lcm (2^{S_{t}} 3^{t}, 2^{S_{i} + a_{i} + 1} 3^{i} : 0 \leq i \leq t - 1) .

Finally, we bound the size of

M_{t}

. Since

S_{i} + a_{i} = S_{i + 1} \leq S_{t}

for each

0 \leq i \leq t - 1

, we have

S_{i} + a_{i} + 1 \leq S_{t} + 1 \leq S_{t} + t + 1 for all 0 \leq i \leq t - 1 .

Hence the exponent of 2 in each modulus

2^{S_{i} + a_{i} + 1}

or

2^{S_{t}}

is at most

S_{t} + t + 1

, while the exponent of 3 in each

3^{i}

or

3^{t}

is at most t. Therefore the least common multiple

M_{t}

divides

2^{S_{t} + t + 1} 3^{t}

, and in particular is of the claimed form

M_{t} = 2^{S_{t} + O (t)} 3^{t} .

This establishes (3) and completes the proof. □

Corollary 6.12

(Density bound for windows with all valuations large). Fix integers

t \geq 1

and

L \geq 1

, and consider odd

n_{0}

whose accelerated odd–to–odd Collatz evolution has

a_{i} : = ν_{2} (3 n_{i} + 1) \geq L for i = 0, 1, \dots, t - 1 .

Then the set of such

n_{0}

has natural density at most

\sum_{(a_{0}, \dots, a_{t - 1}) \in {[L, \infty)}^{t}} O (2^{- S_{t}} 3^{- t}), S_{t} = a_{0} + \dots + a_{t - 1} .

In particular, if L is fixed and t grows, this density is bounded by

≪ {(2^{- L})}^{t},

i.e. it decays exponentially in t.

Proof.

Fix

t \geq 1

and a pattern

(a_{0}, \dots, a_{t - 1})

with

a_{i} \geq L

for all i. By Lemma 6.11, the set of odd

n_{0}

whose accelerated trajectory

{(n_{i})}_{i = 0}^{t}

realizes this precise valuation pattern

(a_{0}, \dots, a_{t - 1})

is a finite union of arithmetic progressions modulo a modulus of the form

M_{t} = 2^{S_{t} + O (t)} 3^{t}, S_{t} = a_{0} + \dots + a_{t - 1} .

More precisely, Lemma 6.11 shows that the admissible

n_{0}

form a finite union of congruence classes modulo

M_{t}

, and the number of such classes depends only on t (and not on the specific values of

a_{i}

). For fixed t we may therefore bound the number of progressions by a constant

C_{t}

independent of the pattern

(a_{0}, \dots, a_{t - 1})

.

Each arithmetic progression modulo

M_{t}

has natural density

1 / M_{t}

, and a finite union of

C_{t}

such progressions has natural density

C_{t} / M_{t}

. Using the bound

M_{t} \geq c_{t} 2^{S_{t}} 3^{t}

for some constant

c_{t} > 0

depending only on t, we obtain

dens \{n_{0} : (a_{0}, \dots, a_{t - 1}) occurs\} \leq \frac{C_{t}}{M_{t}} ≪_{t} \frac{1}{2^{S_{t}} 3^{t}},

(176)

where dens denotes natural density and the implied constant depends only on t.

Now consider the full set

N_{t, L}

of odd

n_{0}

whose first t accelerated steps satisfy

a_{i} \geq L

for all

0 \leq i \leq t - 1

. Partitioning by valuation patterns, we can write

N_{t, L}

as the disjoint union

N_{t, L} = ⨆_{(a_{0}, \dots, a_{t - 1}) \in {[L, \infty)}^{t}} N (a_{0}, \dots, a_{t - 1}),

where

N (a_{0}, \dots, a_{t - 1})

denotes the set of

n_{0}

realizing that specific pattern. Natural density is countably subadditive, so

dens (N_{t, L}) \leq \sum_{(a_{0}, \dots, a_{t - 1}) \in {[L, \infty)}^{t}} dens (N (a_{0}, \dots, a_{t - 1})) .

Applying (176) to each term yields

dens (N_{t, L}) ≪_{t} \sum_{(a_{0}, \dots, a_{t - 1}) \in {[L, \infty)}^{t}} \frac{1}{2^{S_{t}} 3^{t}} = 3^{- t} \sum_{(a_{0}, \dots, a_{t - 1}) \in {[L, \infty)}^{t}} 2^{- (a_{0} + \dots + a_{t - 1})} .

The sum over patterns factors as a product:

\sum_{(a_{0}, \dots, a_{t - 1}) \in {[L, \infty)}^{t}} 2^{- (a_{0} + \dots + a_{t - 1})} = {(\sum_{a \geq L} 2^{- a})}^{t} .

The inner geometric series is

\sum_{a \geq L} 2^{- a} = 2^{- L} \sum_{j \geq 0} 2^{- j} = 2^{- L} \cdot \frac{1}{1 - 1 / 2} = 2^{- (L - 1)} .

Hence

dens (N_{t, L}) ≪_{t} 3^{- t} {(2^{- (L - 1)})}^{t} = {(3^{- 1} 2^{- (L - 1)})}^{t} .

For each fixed

L \geq 1

the factor

c_{L} : = 3^{- 1} 2^{- (L - 1)}

satisfies

0 < c_{L} < 1

, so

dens (N_{t, L}) ≪_{L} c_{L}^{t}

decays exponentially in t. Since

c_{L} = \frac{2}{3} \cdot 2^{- L} \leq 2^{- L}

for all

L \geq 1

, we obtain the simpler bound

dens (N_{t, L}) ≪ {(2^{- L})}^{t},

as claimed. □
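How often long windows with all valuations large occur can be explored empirically. The sketch below counts odd starting values below a finite cutoff and prints the observed frequency next to the t-th power of the geometric sum \sum_{a \geq L} 2^{-a} = 2^{-(L-1)} from the proof. The cutoff and the function names are assumptions of this illustration, the factor 3^{-t} and the implied constants are not modelled, and a finite count only approximates a natural density.

```python
def nu2(m):
    """2-adic valuation of a positive integer m."""
    return (m & -m).bit_length() - 1

def all_valuations_large(n0, t, L):
    """True if the first t odd-to-odd steps from odd n0 all have valuation >= L."""
    n = n0
    for _ in range(t):
        m = 3 * n + 1
        a = nu2(m)
        if a < L:
            return False
        n = m >> a
    return True

def empirical_frequency(t, L, cutoff):
    """Fraction of odd n0 < cutoff whose first t valuations are all >= L."""
    odds = range(1, cutoff, 2)
    hits = sum(all_valuations_large(n0, t, L) for n0 in odds)
    return hits / len(odds)

cutoff = 2**16  # arbitrary finite cutoff, chosen only for illustration
L = 2
for t in (1, 2, 3, 4):
    print(t, empirical_frequency(t, L, cutoff), (2.0 ** -(L - 1)) ** t)
```

The frequency is taken among odd starting values only, so it differs from a density over all integers by a factor of 2.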

Theorem 6.13

(Forward drift from a low–valuation window). Let

{(n_{k})}_{k \geq 0}

be a Collatz orbit under the accelerated odd map

T (n) = \frac{3 n + 1}{2^{ν_{2} (3 n + 1)}}, n odd,

(177)

and write

J (k) : = ⌊ {log}_{2} n_{k} ⌋ .

(178)

For

k \geq 0

and

t \geq 1

define the valuation window

W (k, t) : = {(ν_{2} (3 n_{k + i} + 1))}_{i = 0}^{t - 1},

(179)

and its average

A (W (k, t)) : = \frac{1}{t} \sum_{i = 0}^{t - 1} ν_{2} (3 n_{k + i} + 1) .

(180)

Fix

ε > 0

. If for some

k \geq 0

and

t \geq 1

one has

A (W (k, t)) \leq {log}_{2} 3 - ε,

(181)

then

J (k + t) \geq J (k) + ε t - 1 .

(182)

In particular, for any fixed

t_{max} \geq 1

there exist constants

c > 0

and

C_{0} < \infty

, depending only on ε and

t_{max}

, such that whenever

1 \leq t \leq t_{max}

and (181) holds, one has

J (k + t) \geq J (k) + c t - C_{0} .

(183)

Proof.

Write

a_{i} : = ν_{2} (3 n_{k + i} + 1)

for

0 \leq i \leq t - 1

, and let

S_{t} : = \sum_{i = 0}^{t - 1} a_{i} .

By definition of the accelerated map,

n_{k + i + 1} = \frac{3 n_{k + i} + 1}{2^{a_{i}}}, 0 \leq i \leq t - 1 .

Iterating this relation shows that

n_{k + t} = \frac{3^{t} n_{k} + R_{t}}{2^{S_{t}}},

(184)

where

R_{t}

is a positive integer (the quantity C_{t} of (175), a sum of powers of 3 times powers of 2) that depends only on the valuation pattern

(a_{0}, \dots, a_{t - 1})

. In particular

3^{t} n_{k} \leq 3^{t} n_{k} + R_{t}

, so (184) yields the crude lower bound

n_{k + t} \geq \frac{3^{t} n_{k}}{2^{S_{t}}} .

(185)

Taking base–2 logarithms and using (185) gives

{log}_{2} n_{k + t} \geq {log}_{2} n_{k} + t {log}_{2} 3 - S_{t} .

The average condition (181) says

\frac{S_{t}}{t} = A (W (k, t)) \leq {log}_{2} 3 - ε,

so

S_{t} \leq t ({log}_{2} 3 - ε)

, and therefore

{log}_{2} n_{k + t} \geq {log}_{2} n_{k} + t {log}_{2} 3 - t ({log}_{2} 3 - ε) = {log}_{2} n_{k} + ε t .

(186)

Passing to integer parts,

J (k + t) = ⌊{log}_{2} n_{k + t}⌋ \geq ⌊{log}_{2} n_{k} + ε t⌋ .

For any real numbers

x, y

one has

⌊ x + y ⌋ \geq ⌊ x ⌋ + y - 1,

so from (186) we obtain

J (k + t) \geq ⌊ {log}_{2} n_{k} ⌋ + ε t - 1 = J (k) + ε t - 1,

which is (182).

Finally, given

t_{max} \geq 1

, take for instance

c : = \frac{ε}{2} and C_{0} : = 1 + \frac{ε}{2} t_{max} .

For every

1 \leq t \leq t_{max}

, the inequality

ε t - 1 \geq \frac{ε}{2} t - C_{0}

holds by construction, so (182) implies (183). □
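The drift inequality (182) can be sanity-checked along an explicit orbit. The sketch below scans all length-t windows of a finite orbit segment and, whenever hypothesis (181) holds, asserts the conclusion (182). The function name `check_drift`, the starting value 27, and the parameters are illustrative assumptions; J is computed exactly from the bit length.

```python
import math

def nu2(m):
    """2-adic valuation of a positive integer m."""
    return (m & -m).bit_length() - 1

def T(n):
    """Accelerated odd-to-odd Collatz map (177)."""
    m = 3 * n + 1
    return m >> nu2(m)

def J(n):
    """Block index floor(log2 n), computed exactly via bit length."""
    return n.bit_length() - 1

def check_drift(n0, steps, t, eps):
    """Check (182) on every length-t window satisfying (181); return how many were checked."""
    orbit = [n0]
    for _ in range(steps + t):
        orbit.append(T(orbit[-1]))
    checked = 0
    for k in range(steps):
        S = sum(nu2(3 * orbit[k + i] + 1) for i in range(t))
        if S / t <= math.log2(3) - eps:                          # hypothesis (181)
            assert J(orbit[k + t]) >= J(orbit[k]) + eps * t - 1  # conclusion (182)
            checked += 1
    return checked

print(check_drift(27, 30, 4, 0.25))
```

The return value only reports how many windows met the hypothesis; a failed assertion would indicate a violation of (182), which the theorem rules out.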

7. Completion of the Spectral–Dynamical Implication Chain

Let

{(n_{k})}_{k \geq 0}

be an infinite forward Collatz orbit under T, and let

J (n)

denote the block index of n.

Lemma 7.1

(Orbit Cesàro limit functionals). Let

(n_{k})

be an infinite forward Collatz orbit and define, for

N \geq 1

,

Λ_{N} (f) : = \frac{1}{N} \sum_{k = 0}^{N - 1} f (n_{k}), f \in B_{tree, σ} .

Then:

(1) The family ${(Λ_{N})}_{N \geq 1}$ is norm–bounded in $B_{tree, σ}^{*}$, hence every sequence $N_{r} \to \infty$ admits a further subsequence (still denoted $N_{r}$) and a functional $Λ \in B_{tree, σ}^{*}$ such that

$Λ_{N_{r}} \overset{*}{⟶} Λ .$

Moreover, for any such limit Λ and any $f \geq 0$ one has $Λ (f) \geq 0$ (positivity).

(2) Let $E \subset N$ be a set that is visited with positive upper density along the orbit $(n_{k})$. Then there exist a subsequence $N_{r} \to \infty$ and a weak–* limit point Λ of $(Λ_{N_{r}})$ such that

$Λ (1_{E}) > 0 .$

In particular, the subsequence and the limit functional can be chosen so that E is detected with strictly positive weight, but this Λ may depend on E.

Proof.

Since

B_{tree, σ} ↪ ℓ^{\infty} (N)

continuously, there exists

C > 0

such that

{∥ f ∥}_{\infty} \leq C {∥ f ∥}_{B_{tree, σ}} for all f \in B_{tree, σ} .

Hence, for every

N \geq 1

and every

f \in B_{tree, σ}

,

| Λ_{N} (f) | = |\frac{1}{N} \sum_{k = 0}^{N - 1} f (n_{k})| \leq \frac{1}{N} \sum_{k = 0}^{N - 1} {∥ f ∥}_{\infty} \leq C {∥ f ∥}_{B_{tree, σ}} .

Thus

{(Λ_{N})}_{N \geq 1}

is norm–bounded in

B_{tree, σ}^{*}

. By Banach–Alaoglu, every sequence

N_{r} \to \infty

admits a further subsequence (still denoted

N_{r}

) and a functional

Λ \in B_{tree, σ}^{*}

such that

Λ_{N_{r}} \overset{*}{⟶} Λ .

If

f \geq 0

, then each

Λ_{N} (f) \geq 0

, since it is an average of nonnegative values. Passing to the weak–* limit along any convergent subsequence

N_{r}

gives

Λ (f) = lim_{r \to \infty} Λ_{N_{r}} (f) \geq 0,

which proves positivity of every weak–* limit.

For (2), fix a set

E \subset N

and assume that it is visited with positive upper density along the orbit

(n_{k})

, i.e.

{\bar{d}}_{E} : = \underset{N \to \infty}{lim sup} \frac{1}{N} # {0 \leq k < N : n_{k} \in E} > 0 .

Define

f = 1_{E}

. Then

Λ_{N} (f) = \frac{1}{N} # {0 \leq k < N : n_{k} \in E} .

By the definition of the limsup, there exists a subsequence

N_{r} \to \infty

such that

Λ_{N_{r}} (f) ⟶ {\bar{d}}_{E} with {\bar{d}}_{E} > 0 .

Since the family

(Λ_{N_{r}})

is still norm–bounded, Banach–Alaoglu provides a further subsequence (which we again denote

N_{r}

) and a functional

Λ \in B_{tree, σ}^{*}

such that

Λ_{N_{r}} \overset{*}{⟶} Λ .

Weak–* convergence means pointwise convergence on

B_{tree, σ}

, so in particular

Λ (f) = Λ (1_{E}) = lim_{r \to \infty} Λ_{N_{r}} (1_{E}) = {\bar{d}}_{E} > 0 .

Thus for this chosen subsequence and limit functional

Λ

, the indicator of E is evaluated with strictly positive weight. Note that the construction of

N_{r}

and

Λ

depends on E: for a different set

E^{'}

with positive upper density one may need to choose a different subsequence to realize the corresponding limsup. Hence no claim is made that a single functional

Λ

detects all such sets simultaneously. □

Lemma 7.2

(Shift invariance). Let Λ be any weak–* limit of the Cesàro functionals along an orbit. Let U denote composition by the forward map,

(U f) (n) = f (T (n))

. Then

Λ (U f) = Λ (f) for all f \in B_{tree, σ} .

Proof.

We compute:

Λ_{N} (U f) = \frac{1}{N} \sum_{k = 0}^{N - 1} f (T (n_{k})) = \frac{1}{N} \sum_{k = 0}^{N - 1} f (n_{k + 1}) = Λ_{N} (f) + \frac{f (n_{N}) - f (n_{0})}{N} .

Since f is bounded on

N

, the final term tends to 0 as

N \to \infty

. Passing to the subsequence

N_{r}

along which

Λ_{N_{r}} \to Λ

gives

Λ (U f) = Λ (f)

. □
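The telescoping identity behind the shift invariance can be verified in exact arithmetic on a finite orbit segment. In the sketch below the test function f(n) = 1/n is a hypothetical bounded choice made only for illustration (no claim is made here that it lies in the tree space), and the names `cesaro` and `f` are assumptions of the example.

```python
from fractions import Fraction

def nu2(m):
    """2-adic valuation of a positive integer m."""
    return (m & -m).bit_length() - 1

def T(n):
    """Accelerated odd-to-odd Collatz map."""
    m = 3 * n + 1
    return m >> nu2(m)

def cesaro(g, orbit, N):
    """Lambda_N(g) = (1/N) * sum_{k<N} g(n_k), computed exactly."""
    return sum(g(orbit[k]) for k in range(N)) / Fraction(N)

def f(n):
    # Hypothetical bounded test function; any bounded f works for the identity.
    return Fraction(1, n)

n0, N = 27, 20
orbit = [n0]
for _ in range(N):
    orbit.append(T(orbit[-1]))

lhs = cesaro(lambda n: f(T(n)), orbit, N)                      # Lambda_N(U f)
rhs = cesaro(f, orbit, N) + (f(orbit[N]) - f(orbit[0])) / N    # Lambda_N(f) + boundary term
print(lhs == rhs)
```

Using `Fraction` avoids floating-point error, so the equality is checked exactly rather than up to a tolerance.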

7.1. Discrepancy Decay and P*-Invariance

Definition 7.3

(Discrepancy operator). For any function

f \in B_{tree, σ}

we define its discrepancy by

D (f) : = P f - f \circ T,

(187)

where P is the backward Collatz transfer operator (10) and T is the forward Collatz map (1).

Lemma 7.4

(Equivalence of

P^{*}

–invariance and discrepancy averages). Let

{(n_{k})}_{k \geq 0}

be a forward Collatz orbit and let

Λ_{N} (f) : = \frac{1}{N} \sum_{k = 0}^{N - 1} f (n_{k}), f \in B_{tree, σ} .

Suppose

Λ_{N_{r}} \overset{*}{⟶} Λ

in

B_{tree, σ}^{*}

along some subsequence

(N_{r})

. Then for any

f \in B_{tree, σ}

,

Λ (P f) = Λ (f) ⟺ lim_{r \to \infty} \frac{1}{N_{r}} \sum_{k = 0}^{N_{r} - 1} D (f) (n_{k}) = 0,

where

D (f) (n) : = (P f) (n) - f (T (n))

.

Proof.

Fix

f \in B_{tree, σ}

and write

Λ_{N} (P f) - Λ_{N} (f) = \frac{1}{N} \sum_{k = 0}^{N - 1} (P f (n_{k}) - f (n_{k})) .

Using the definition

D (f) (n) = P f (n) - f (T (n))

and the fact that

T (n_{k}) = n_{k + 1}

along the orbit, we decompose

P f (n_{k}) - f (n_{k}) = (P f (n_{k}) - f (T (n_{k}))) + (f (T (n_{k})) - f (n_{k})) = D (f) (n_{k}) + f (n_{k + 1}) - f (n_{k}) .

Hence

\begin{matrix} Λ_{N} (P f) - Λ_{N} (f) & = \frac{1}{N} \sum_{k = 0}^{N - 1} D (f) (n_{k}) + \frac{1}{N} \sum_{k = 0}^{N - 1} (f (n_{k + 1}) - f (n_{k})) \\ & = \frac{1}{N} \sum_{k = 0}^{N - 1} D (f) (n_{k}) + \frac{f (n_{N}) - f (n_{0})}{N} . \end{matrix}

Since f is bounded on

N

, the telescoping term satisfies

\frac{f (n_{N}) - f (n_{0})}{N} ⟶ 0 as N \to \infty .

Thus along the subsequence

(N_{r})

we have

Λ_{N_{r}} (P f) - Λ_{N_{r}} (f) = \frac{1}{N_{r}} \sum_{k = 0}^{N_{r} - 1} D (f) (n_{k}) + o (1),

(188)

where

o (1) \to 0

as

r \to \infty

.

Now use weak–* convergence. Since

Λ_{N_{r}} \overset{*}{⟶} Λ

, we have

Λ (P f) - Λ (f) = lim_{r \to \infty} (Λ_{N_{r}} (P f) - Λ_{N_{r}} (f)) .

Combining this with (188) yields

Λ (P f) - Λ (f) = lim_{r \to \infty} \frac{1}{N_{r}} \sum_{k = 0}^{N_{r} - 1} D (f) (n_{k}) .

Therefore

Λ (P f) = Λ (f) ⟺ lim_{r \to \infty} \frac{1}{N_{r}} \sum_{k = 0}^{N_{r} - 1} D (f) (n_{k}) = 0,

which is the desired equivalence. □

7.2. Spectral Obstructions and Exclusion of Low-Block Invariant Functionals

Proposition 7.5

(No finite–block positive invariant functionals). Assume the spectral hypotheses for P: positivity, quasi–compactness,

ρ (P) = 1

, peripheral spectrum

{1}

, and algebraic and geometric simplicity of the eigenvalue 1 with strictly positive eigenfunction h and dual strictly positive eigenfunctional φ. Then there is no nonzero positive

P^{*}

–invariant functional supported on a finite union of blocks

⋃_{j \leq J_{0}} I_{j}

.

Proof.

Under the spectral hypotheses, the eigenspace of

P^{*}

for eigenvalue 1 is one–dimensional and spanned by

φ

, and

φ (f_{j}) > 0

for every block indicator

f_{j}

.

If

Λ

were a nonzero positive

P^{*}

–invariant functional supported on

⋃_{j \leq J_{0}} I_{j}

, then

Λ

must be a scalar multiple of

φ

. But

φ

assigns strictly positive mass to every block, contradicting the assumed vanishing of

Λ

on all

I_{j}

with

j > J_{0}

. □

Remark 7.6

(spectral hypotheses). The operator P satisfies all of the spectral hypotheses required for the Perron–Frobenius theory on the Banach tree space

B_{tree, σ}

. Positivity is established in Proposition 5.39. The Lasota–Yorke inequality of Proposition 4.11 implies quasi–compactness (Theorem 4.17), and the normalization condition

ρ (P) = 1

follows from Theorem 5.1. The peripheral spectrum is shown to consist solely of the eigenvalue

{1}

in Theorem 6.2, and the Generalized Perron–Frobenius theorem combined with cone–irreducibility (Proposition 6.1) yields algebraic and geometric simplicity of the eigenvalue 1 (Theorem 6.2). Moreover, the corresponding eigenfunction h is strictly positive, and the dual eigenfunctional φ is likewise strictly positive (Proposition 5.39).

The following statement isolates exactly the dynamical input needed to deduce the Block–Escape Property from the spectral theory of P.

Conjecture 7.7

(Orbitwise discrepancy vanishing). Let

(n_{k})

be any infinite forward Collatz orbit, and Λ any weak–* limit of its Cesàro averages. There exists a dense subspace

A \subset B_{tree, σ}

such that for every

f \in A

,

lim_{r \to \infty} \frac{1}{N_{r}} \sum_{k = 0}^{N_{r} - 1} D (f) (n_{k}) = 0,

where

(N_{r})

is the subsequence along which

Λ_{N_{r}} \to Λ

. In particular Λ is

P^{*}

–invariant on

A

.

Theorem 7.8

(Conditional Block–Escape). Assume the spectral hypotheses for P stated in Proposition 7.5 (see also Remark 7.6). If Conjecture 7.7 holds, then every infinite forward Collatz orbit satisfies the Block–Escape Property.

Proof.

Suppose an infinite orbit fails BEP. Then for some

J_{0}

, its upper density of visits to

⋃_{j \leq J_{0}} I_{j}

is positive. By Lemma 7.1(2), there exist a subsequence and a weak–* limit functional

Λ

along it with

Λ (\sum_{j \leq J_{0}} f_{j}) > 0 .

If Conjecture 7.7 holds, then Lemma 7.4 implies that

Λ

is

P^{*}

–invariant on a dense subspace, hence on all of

B_{tree, σ}

by continuity. Thus

Λ

is a nonzero positive

P^{*}

–invariant functional supported on

⋃_{j \leq J_{0}} I_{j}

. This contradicts Proposition 7.5. Therefore no infinite orbit can fail BEP. □

7.3. A Structured Program Toward the Weak Non-Retreat Principle

In this subsection we formalize the analytical strategy for establishing Statement (6) of Theorem 9.1. The aim is to show that every nonperiodic accelerated odd Collatz orbit must admit infinitely many short windows of valuation deficit, from which the weak non–retreat inequalities follow by Theorem 6.13. We present the plan as a sequence of definitions, lemmas, and a culminating theorem.

Definition 7.9

(Valuation Window and Average Valuation). For an accelerated odd orbit

{(n_{k})}_{k \geq 0}

, a window of length

t \geq 1

at position k is the finite sequence

W (k, t) : = {(ν_{2} (3 n_{k + i} + 1))}_{i = 0}^{t - 1} .

Its average valuation is defined by

A (W (k, t)) : = \frac{1}{t} \sum_{i = 0}^{t - 1} ν_{2} (3 n_{k + i} + 1) .

Definition 7.10

(Dangerous Window). Fix

t_{max} \in N

and

ε > 0

. A window

W (k, t)

with

1 \leq t \leq t_{max}

is called dangerous if

A (W (k, t)) > {log}_{2} 3 - ε .

Definition 7.11

(Catalogue of Dangerous Patterns). For a fixed valuation cutoff

L \geq 1

, define

A_{t} (L) : = \{(a_{0}, \dots, a_{t - 1}) \in {1, \dots, L}^{t} : \frac{1}{t} \sum_{i = 0}^{t - 1} a_{i} > {log}_{2} 3 - ε\},

and set

A_{\leq t_{max}} (L) : = ⋃_{t = 1}^{t_{max}} A_{t} (L) .

Elements of

A_{\leq t_{max}} (L)

are the Type I dangerous patterns.
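For small parameters the catalogue of Definition 7.11 can be listed explicitly. The following is a minimal sketch; the function names `dangerous_patterns` and `catalogue` and the sample parameters are assumptions made for illustration.

```python
import math
from itertools import product

def dangerous_patterns(t, L, eps):
    """A_t(L): tuples in {1, ..., L}^t whose average strictly exceeds log2(3) - eps."""
    threshold = math.log2(3) - eps
    return [a for a in product(range(1, L + 1), repeat=t) if sum(a) / t > threshold]

def catalogue(t_max, L, eps):
    """Union of A_t(L) over 1 <= t <= t_max."""
    return [a for t in range(1, t_max + 1) for a in dangerous_patterns(t, L, eps)]

print(dangerous_patterns(2, 2, 0.1))
print(len(catalogue(3, 3, 0.1)))
```

The catalogue has at most L + L^2 + ... + L^{t_max} entries, so direct enumeration is only practical for small L and t_max.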

Lemma 7.12

(Congruence Realization of Dangerous Patterns). Let

a = (a_{0}, \dots, a_{t - 1})

be any valuation pattern with

1 \leq t \leq t_{max}

and

1 \leq a_{i} \leq L

. There exists a modulus

M (a) = 2^{K (a)} 3^{t},

where

K (a) \leq C \sum_{i = 0}^{t - 1} (a_{i} + 1)

for an absolute constant C, and a finite nonempty set of odd residue classes

R (a) \subset {(Z / M (a) Z)}^{\times}

such that an odd integer

n_{k}

satisfies

ν_{2} (3 n_{k + i} + 1) = a_{i}, i = 0, \dots, t - 1,

if and only if

n_{k} mod M (a) \in R (a) .

Proof.

We prove this by induction along the window using the explicit odd–to–odd transition formula and Lemma 6.11.

(1) The congruence condition for a single valuation. Fix an integer

a \geq 1

. Lemma 6.11 shows that the condition

n odd, ν_{2} (3 n + 1) = a

is equivalent to a finite disjunction of residue conditions of the form

n \equiv r (mod 2^{a + 1}),

where r ranges over a finite subset of odd residues modulo

2^{a + 1}

. More precisely, the proof of Lemma 6.11 shows that

ν_{2} (3 n + 1) = a ⟺ 3 n + 1 \equiv 0 (mod 2^{a}), 3 n + 1 \not\equiv 0 (mod 2^{a + 1}),

and both congruences are linear conditions on n modulo

2^{a + 1}

. Thus there is a finite set

R (a) \subset {(Z / 2^{a + 1} Z)}^{\times}

such that n satisfies

ν_{2} (3 n + 1) = a

if and only if

n mod 2^{a + 1} \in R (a)

.

(2) Odd-to-odd transitions. For an odd integer n with

ν_{2} (3 n + 1) = a

, the next odd iterate is

T (n) = \frac{3 n + 1}{2^{a}} .

Since

3 \equiv 0 (mod 3)

and

+ 1

ensures

3 n + 1 \equiv 1 (mod 3)

, the map

n \mapsto T (n)

induces a bijection between odd residues modulo

3 \cdot 2^{a + 1}

and odd residues modulo

3 \cdot 2^{a}

. Thus the congruence condition for n modulo

2^{a + 1}

lifts to an equivalent condition modulo

3 \cdot 2^{a + 1}

, ensuring that the residue class of

T (n)

is well–defined modulo

3 \cdot 2^{a}

.

(3) Inductive construction of the residue conditions along the window. Let

n_{0}

denote the starting odd integer at the beginning of the window. We must encode the conditions

ν_{2} (3 n_{i} + 1) = a_{i}, n_{i + 1} = T (n_{i}) = \frac{3 n_{i} + 1}{2^{a_{i}}}, i = 0, \dots, t - 1 .

Assume inductively that for some

i \geq 0

there exists a modulus

M_{i}

and a finite set of odd residue classes

R_{i} \subset {(Z / M_{i} Z)}^{\times}

such that

n_{0} mod M_{i} \in R_{i} ⟺ ν_{2} (3 n_{j} + 1) = a_{j} for j = 0, \dots, i - 1 .

We now add the condition

ν_{2} (3 n_{i} + 1) = a_{i}

.

From Step 1, this condition is equivalent to

n_{i} mod 2^{a_{i} + 1} \in R (a_{i}) .

But

n_{i}

is an affine linear function of

n_{0}

: by iterating

n_{j + 1} = \frac{3 n_{j} + 1}{2^{a_{j}}},

we obtain (as in the derivation of formula (166)) an explicit expression

n_{i} = \frac{3^{i}}{2^{S_{i}}} n_{0} + \frac{C_{i}}{2^{S_{i}}}, S_{i} = \sum_{j = 0}^{i - 1} a_{j},

with

C_{i}

an integer depending only on the pattern

(a_{0}, \dots, a_{i - 1})

. Thus the condition

n_{i} mod 2^{a_{i} + 1} \in R (a_{i})

becomes a linear congruence condition on

n_{0}

modulo

2^{a_{i} + 1 + S_{i}}

.

Therefore, setting

M_{i + 1} : = lcm (M_{i}, 2^{a_{i} + 1 + S_{i}}, 3^{i + 1}),

the set of

n_{0}

satisfying conditions up to index i is a finite union of residue classes modulo

M_{i + 1}

. We denote this finite set by

R_{i + 1} \subset {(Z / M_{i + 1} Z)}^{\times}

.

After t steps, we obtain a modulus

M (a) = M_{t} = 2^{K (a)} 3^{t},

where

K (a) = \max_{0 \leq i \leq t - 1} (a_{i} + 1 + S_{i}) = S_{t} + 1

(the least common multiple of powers of 2 is the largest of them), which is at most

\sum_{i} (a_{i} + 1)

. The corresponding residue set

R (a) = R_{t}

satisfies the desired equivalence:

n_{0} mod M (a) \in R (a) ⟺ ν_{2} (3 n_{i} + 1) = a_{i} (0 \leq i < t) .

Renaming

n_{0}

as

n_{k}

completes the proof. □
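The residue description of a valuation pattern can be explored by brute force for short windows. The sketch below is a simplification and not the construction of the lemma: it works only with the 2-part of the modulus, taking 2^{S_t + 1} as an assumed working modulus and ignoring the factor 3^t. The names `realizes` and `residue_set` are hypothetical.

```python
def nu2(m):
    """2-adic valuation of a positive integer m."""
    return (m & -m).bit_length() - 1

def realizes(n, pattern):
    """True if the odd integer n has nu_2(3 n_i + 1) = a_i along the whole window."""
    for a in pattern:
        m = 3 * n + 1
        if nu2(m) != a:
            return False
        n = m >> a
    return True

def residue_set(pattern):
    """Odd residues r modulo 2^(S_t + 1) realizing the pattern (2-part of the modulus only)."""
    modulus = 2 ** (sum(pattern) + 1)
    return modulus, [r for r in range(1, modulus, 2) if realizes(r, pattern)]

modulus, R = residue_set((1, 2))
print(modulus, R)
# Compare the direct valuation test with residue membership on a range of odd n.
print(all(realizes(n, (1, 2)) == (n % modulus in R) for n in range(1, 2000, 2)))
```

For the pattern (1, 2) the direct test and the residue test agree on the whole sampled range, as the equivalence in the lemma predicts.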

Definition 7.13

(Dangerous Residue Set). Fix

L_{0} \geq 1

. Let M be a common multiple of the moduli

M (a)

over all

a \in A_{\leq t_{max}} (L_{0})

. Define the dangerous residue set modulo M by

V_{danger} (L_{0}) : = ⋃_{a \in A_{\leq t_{max}} (L_{0})} \{r \in Z / M Z : r mod M (a) \in R (a)\} .

Definition 7.14

(Finite Residue–Valuation Graph). Fix

L_{0}

. Let M be as above. Consider the accelerated odd map

T (n) = \frac{3 n + 1}{2^{ν_{2} (3 n + 1)}}

on the odd residue classes modulo M. Define a finite directed graph

G_{L_{0}} (M)

whose vertices are odd residues

r \in Z / M Z

and with an edge

r \to s

if

T (r) \equiv s (mod M)

. Each vertex carries the label

a (r) : = ν_{2} (3 r + 1) .

Let

V_{\geq L_{0} + 1} : = {r : a (r) \geq L_{0} + 1} .

Lemma 7.15

(Tail Constraint Under Negation of Weak Non–Retreat). Fix

t_{max} \in N

and

ε > 0

, and let

L_{0} \geq 1

be a valuation cutoff. Suppose a nonperiodic accelerated odd orbit

{(n_{k})}_{k \geq 0}

violates the weak non–retreat condition in the sense that there exists

k_{0}

such that for every

k \geq k_{0}

and every

1 \leq t \leq t_{max}

the valuation window

W (k, t) : = {(ν_{2} (3 n_{k + i} + 1))}_{i = 0}^{t - 1}

is dangerous, i.e.

\frac{1}{t} \sum_{i = 0}^{t - 1} ν_{2} (3 n_{k + i} + 1) > {log}_{2} 3 - ε .

Let M be a modulus chosen so that the dangerous patterns of length

t = 1

with values in

{1, \dots, L_{0}}

are realized by a finite set of residue classes

V_{danger} (L_{0}) \subset {(Z / M Z)}^{\times}

as in Lemma 7.12, and let

V_{\geq L_{0} + 1} : = {r \in {(Z / M Z)}^{\times} : ν_{2} (3 r + 1) \geq L_{0} + 1}

be the set of residues whose one–step valuation is at least

L_{0} + 1

. Then for all sufficiently large k,

n_{k} mod M \in V_{danger} (L_{0}) \cup V_{\geq L_{0} + 1} .

In particular, the tail of

(n_{k})

defines an infinite directed path inside the induced subgraph of

G_{L_{0}} (M)

on

V_{danger} (L_{0}) \cup V_{\geq L_{0} + 1}

.

Proof.

By assumption, there exists

k_{0}

such that for every

k \geq k_{0}

and every

1 \leq t \leq t_{max}

, the window

W (k, t)

is dangerous:

\frac{1}{t} \sum_{i = 0}^{t - 1} ν_{2} (3 n_{k + i} + 1) > {log}_{2} 3 - ε .

Fix

k \geq k_{0}

and consider the window of length

t = 1

starting at k:

W (k, 1) = (a_{0}), a_{0} : = ν_{2} (3 n_{k} + 1) .

Since

W (k, 1)

is dangerous by hypothesis, its average valuation equals

a_{0}

and satisfies

a_{0} = \frac{1}{1} ν_{2} (3 n_{k} + 1) > {log}_{2} 3 - ε .

Moreover, since

n_{k}

is odd, the integer

3 n_{k} + 1

is even, so

a_{0} \geq 1

.

We now distinguish two cases.

Case 1:

1 \leq a_{0} \leq L_{0}

. Then the length–one pattern

(a_{0})

belongs to the catalogue

A_{\leq t_{max}} (L_{0})

of dangerous patterns with entries bounded by

L_{0}

. By Lemma 7.12 applied with

t = 1

and this pattern, there exists a modulus

M (a_{0})

and a set of residue classes

R (a_{0}) \subset {(Z / M (a_{0}) Z)}^{\times}

such that

ν_{2} (3 n_{k} + 1) = a_{0} ⟺ n_{k} mod M (a_{0}) \in R (a_{0}) .

By construction of the global modulus M and the set

V_{danger} (L_{0})

(defined as the union of all such residue classes over dangerous

a_{0} \leq L_{0}

, lifted to modulus M), this implies

n_{k} mod M \in V_{danger} (L_{0}) .

Case 2:

a_{0} \geq L_{0} + 1

. In this case the one–step valuation of

n_{k}

is already large:

ν_{2} (3 n_{k} + 1) = a_{0} \geq L_{0} + 1 .

By definition of

V_{\geq L_{0} + 1}

we then have

n_{k} mod M \in V_{\geq L_{0} + 1} .

In either case, for every

k \geq k_{0}

we obtain

n_{k} mod M \in V_{danger} (L_{0}) \cup V_{\geq L_{0} + 1},

which proves the first claim.
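The residue-class description used in Case 1 can be checked directly in the simplest setting. The following is a minimal sketch, not the construction of Lemma 7.12 itself: it assumes only the elementary fact that for odd n the condition ν_2(3n + 1) = a is equivalent to 3n + 1 ≡ 2^a (mod 2^{a+1}), so it depends only on n mod 2^{a+1}. The function names are illustrative and do not appear in the text.

```python
def nu2(x: int) -> int:
    """2-adic valuation of a positive integer x."""
    v = 0
    while x % 2 == 0:
        x //= 2
        v += 1
    return v


def residues_with_valuation(a: int) -> list[int]:
    """Odd residues r mod 2^(a+1) with nu2(3n + 1) == a for every n = r mod 2^(a+1)."""
    modulus = 2 ** (a + 1)
    return [r for r in range(1, modulus, 2)
            if (3 * r + 1) % modulus == 2 ** a]


def description_is_consistent(max_a: int, bound: int) -> bool:
    """Compare the residue description with direct computation for odd n < bound."""
    for a in range(1, max_a + 1):
        modulus = 2 ** (a + 1)
        classes = set(residues_with_valuation(a))
        for n in range(1, bound, 2):
            if (nu2(3 * n + 1) == a) != (n % modulus in classes):
                return False
    return True
```

For each fixed a the sketch returns a single odd residue class, which is the one-step analogue of the sets R(a_0) used above.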

For the second claim, recall that the finite directed graph

G_{L_{0}} (M)

has vertex set equal to the odd residues modulo M, with an edge

r \to s

whenever

T (r) \equiv s (mod M),

where T is the accelerated odd Collatz map. Since the orbit

(n_{k})

satisfies

n_{k + 1} = T (n_{k})

for all k, the sequence of residues

{(n_{k} mod M)}_{k \geq 0}

follows edges in

G_{L_{0}} (M)

.

We have shown that for all

k \geq k_{0}

, the residue

n_{k} mod M

lies in

V_{danger} (L_{0}) \cup V_{\geq L_{0} + 1}

, so the tail

{(n_{k})}_{k \geq k_{0}}

determines an infinite directed path inside the induced subgraph of

G_{L_{0}} (M)

on this vertex set. Discarding the finite initial segment

k < k_{0}

does not affect the existence of an infinite path, which completes the proof. □
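To make the objects in Lemma 7.15 concrete, the following sketch computes the accelerated odd map, a valuation window W(k, t), and the dangerous-window test. It is an illustration under the definitions stated above; the function names and the use of floating-point logarithms are assumptions of the sketch, not part of the paper.

```python
import math


def nu2(x: int) -> int:
    """2-adic valuation of a positive integer x."""
    v = 0
    while x % 2 == 0:
        x //= 2
        v += 1
    return v


def accel(n: int) -> int:
    """Accelerated odd Collatz map T(n) = (3n + 1) / 2^{nu2(3n + 1)} for odd n."""
    m = 3 * n + 1
    return m >> nu2(m)


def valuation_window(n_k: int, t: int) -> list[int]:
    """W(k, t): the valuations nu2(3 n_{k+i} + 1), i = 0..t-1, starting from n_k."""
    window = []
    n = n_k
    for _ in range(t):
        window.append(nu2(3 * n + 1))
        n = accel(n)
    return window


def is_dangerous(window: list[int], eps: float) -> bool:
    """True when the average valuation of the window exceeds log2(3) - eps."""
    return sum(window) / len(window) > math.log2(3) - eps
```

For example, starting from 7 the accelerated orbit is 7, 11, 17, 13, 5, 1, and the first two length-one windows have valuation 1, so they are not dangerous for any ε < log_2 3 − 1.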

Theorem 7.16

(Forward Non–Trapping in the Dangerous Regime). Fix

t_{max} \in N

and

0 < ε < {log}_{2} 3 - 1

. Then there exists

L_{0}

sufficiently large with the following property. Let M,

V_{danger} (L_{0})

and

V_{\geq L_{0} + 1}

be as in the construction of the residue–valuation graph

G_{L_{0}} (M)

.

Consider a nonperiodic accelerated odd Collatz orbit

{(n_{k})}_{k \geq 0}

in the dangerous–windows regime, meaning that there exists

k_{0}

such that for every

k \geq k_{0}

and every

1 \leq t \leq t_{max}

the valuation window

W (k, t)

is dangerous:

A (W (k, t)) = \frac{1}{t} \sum_{i = 0}^{t - 1} ν_{2} (3 n_{k + i} + 1) > {log}_{2} 3 - ε .

Then:

(1) High–valuation anti–trapping. The orbit cannot eventually remain entirely inside the high–valuation set $E_{\leq t_{max}} (L_{0} + 1)$ . Equivalently, the residue sequence $(n_{k} mod M)$ cannot be eventually contained in $V_{\geq L_{0} + 1}$ .
(2) Confinement to the dangerous residue set. If the residue sequence $(n_{k} mod M)$ visits $V_{danger} (L_{0})$ infinitely often and also visits $V_{\geq L_{0} + 1}$ infinitely often, then there exists $K \geq k_{0}$ such that

$n_{k} mod M \in V_{danger} (L_{0}) for all k \geq K .$

In particular, in the dangerous–windows regime no nonperiodic orbit can have its tail oscillate indefinitely between $V_{danger} (L_{0})$ and $V_{\geq L_{0} + 1}$ .

Proof.

We break the proof up into two parts, one for each statement.

Part 1: High–valuation anti–trapping

We first show that no nonperiodic orbit can eventually stay inside

V_{\geq L_{0} + 1}

once

L_{0}

is chosen large enough.

(1.1) Growth estimate for high–valuation steps. Let

a_{k} : = ν_{2} (3 n_{k} + 1)

. If

a_{k} \geq L_{0} + 1

, then

n_{k + 1} = \frac{3 n_{k} + 1}{2^{a_{k}}} \leq \frac{3 n_{k} + 1}{2^{L_{0} + 1}} = \frac{3}{2^{L_{0} + 1}} n_{k} + \frac{1}{2^{L_{0} + 1}} .

Define

λ : = \frac{3}{2^{L_{0} + 1}}, C_{0} : = \frac{1}{2^{L_{0} + 1}} .

If

L_{0} \geq 2

then

λ < 1

, and every high–valuation step satisfies

n_{k + 1} \leq λ n_{k} + C_{0} whenever a_{k} \geq L_{0} + 1 .

(189)

(1.2) Iterated contraction and boundedness. Assume there exists K such that

a_{k} \geq L_{0} + 1

for all

k \geq K

, i.e. the orbit is eventually contained in

V_{\geq L_{0} + 1}

. Then (189) holds for all

k \geq K

. Iterating gives, by a standard induction,

n_{K + t} \leq λ^{t} n_{K} + C_{0} \sum_{j = 0}^{t - 1} λ^{j} \leq λ^{t} n_{K} + \frac{C_{0}}{1 - λ} for all t \geq 0 .

Since

0 < λ < 1

, we have

λ^{t} n_{K} \leq n_{K}

for all

t \geq 0

, and hence

sup_{k \geq K} n_{k} \leq n_{K} + \frac{C_{0}}{1 - λ} < \infty .

Thus the accelerated odd orbit

(n_{k})

is bounded. Because the accelerated map is a self–map of the positive odd integers, any bounded orbit is eventually periodic: only finitely many odd integers are

\leq n_{K} + \frac{C_{0}}{1 - λ}

, so some value repeats and the map is deterministic thereafter.

This contradicts the hypothesis that the orbit is infinite and nonperiodic. Therefore no nonperiodic orbit can be eventually trapped in

V_{\geq L_{0} + 1}

.

(1.3) Interaction with the dangerous–windows hypothesis. The high–valuation anti–trapping conclusion does not require the dangerous–windows hypothesis. The latter is compatible with this conclusion provided we choose

L_{0}

large enough so that

L_{0} + 1 > {log}_{2} 3 - ε,

which ensures that every valuation in

V_{\geq L_{0} + 1}

is individually above the dangerous threshold. In particular, if an orbit were eventually in

V_{\geq L_{0} + 1}

, then every window (of any length) would automatically have average valuation

> {log}_{2} 3 - ε

, yet the orbit would still be forced to be bounded and hence eventually periodic. This contradiction establishes Part (1).

For any

L_{0} \geq 2

, an accelerated odd Collatz orbit that eventually stays in

V_{\geq L_{0} + 1}

must be bounded and hence eventually periodic. Thus no infinite nonperiodic orbit can be eventually contained in

V_{\geq L_{0} + 1}

.
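The iterated affine bound of Part 1 can be checked in exact arithmetic. The sketch below iterates the extremal recursion x ↦ λx + C_0 and compares it with the bound λ^t n_K + C_0/(1 − λ); the function names and the sample values of n_K and L_0 are illustrative assumptions.

```python
from fractions import Fraction


def contraction_constants(L0: int) -> tuple[Fraction, Fraction]:
    """Return (lam, C0) = (3 / 2^(L0+1), 1 / 2^(L0+1))."""
    return Fraction(3, 2 ** (L0 + 1)), Fraction(1, 2 ** (L0 + 1))


def affine_bound(n_K: int, L0: int, t: int) -> Fraction:
    """Upper bound lam^t * n_K + C0 / (1 - lam) after t high-valuation steps."""
    lam, C0 = contraction_constants(L0)
    return lam ** t * n_K + C0 / (1 - lam)


def iterate_worst_case(n_K: int, L0: int, t: int) -> Fraction:
    """Iterate x -> lam * x + C0 exactly t times (equality case of the step bound)."""
    lam, C0 = contraction_constants(L0)
    x = Fraction(n_K)
    for _ in range(t):
        x = lam * x + C0
    return x
```

With L_0 = 2 one has λ = 3/8 and C_0/(1 − λ) = 1/5, so even the extremal iteration started at n_K = 1000 falls below 1 after finitely many steps.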

Part 2: Confinement to the dangerous residue set

We now assume that the orbit visits both

V_{danger} (L_{0})

and

V_{\geq L_{0} + 1}

infinitely often and that all windows of length

\leq t_{max}

are dangerous. The goal is to show that the orbit must eventually remain in

V_{danger} (L_{0})

.

(2.1) Local bounds along danger and high–valuation segments. Write

a_{k} : = ν_{2} (3 n_{k} + 1)

.

For steps with

n_{k} \in V_{\geq L_{0} + 1}

(high valuations) we have, as in Part 1,

n_{k + 1} = \frac{3 n_{k} + 1}{2^{a_{k}}} \leq \frac{3}{2^{L_{0} + 1}} n_{k} + \frac{1}{2^{L_{0} + 1}} = : λ n_{k} + C_{0}, λ : = \frac{3}{2^{L_{0} + 1}} < 1 .

For steps with

n_{k} \in V_{danger} (L_{0})

we know from the dangerous windows condition (taking

t = 1

) that

{log}_{2} 3 - ε < a_{k} \leq L_{0},

and hence

n_{k + 1} = \frac{3 n_{k} + 1}{2^{a_{k}}} \leq \frac{3}{2^{a_{k}}} n_{k} + \frac{1}{2^{a_{k}}} \leq \frac{3}{2^{{log}_{2} 3 - ε}} n_{k} + 1 = 2^{ε} n_{k} + 1 .

Thus there exists

C_{1}

such that

n_{k + 1} \leq 2^{ε} n_{k} + C_{1} whenever n_{k} \in V_{danger} (L_{0}) .

(190)

Iterating (190) along a block of t consecutive indices with

n_{k}, \dots, n_{k + t - 1} \in V_{danger} (L_{0})

gives

n_{k + t} \leq 2^{ε t} n_{k} + C_{2} (t), C_{2} (t) \leq C_{1} \sum_{j = 0}^{t - 1} 2^{ε j} = C_{1} \frac{2^{ε t} - 1}{2^{ε} - 1} .

(191)

Similarly, iterating the high–valuation bound along a block of ℓ consecutive indices with

n_{k}, \dots, n_{k + ℓ - 1} \in V_{\geq L_{0} + 1}

yields

n_{k + ℓ} \leq λ^{ℓ} n_{k} + C_{3} (ℓ), C_{3} (ℓ) \leq \frac{C_{0}}{1 - λ} .

(192)

(2.2) Patterns and exact multiplicative factors. Consider a finite pattern

P = (t steps in V_{danger} (L_{0})) \circ (ℓ steps in V_{\geq L_{0} + 1}),

realized by the orbit. Let

a_{1}, \dots, a_{t}

be the valuations in the danger segment and

b_{1}, \dots, b_{ℓ}

those in the high–valuation segment. The exact multiplicative factor of

P

in the accelerated map is

F (P) : = \prod_{i = 1}^{t} \frac{3}{2^{a_{i}}} \cdot \prod_{j = 1}^{ℓ} \frac{3}{2^{b_{j}}} .

(193)

The additive contributions from the

+ 1

terms accumulate but do not affect the multiplicative factor.

On the danger segment the dangerous–windows condition enforces

\frac{1}{t} \sum_{i = 1}^{t} a_{i} > {log}_{2} 3 - ε ⟹ \sum_{i = 1}^{t} a_{i} > t ({log}_{2} 3 - ε),

so

\prod_{i = 1}^{t} \frac{3}{2^{a_{i}}} = \frac{3^{t}}{2^{\sum_{i = 1}^{t} a_{i}}} < \frac{3^{t}}{2^{t ({log}_{2} 3 - ε)}} = 2^{ε t} .

On the high–valuation segment

b_{j} \geq L_{0} + 1

, hence

\prod_{j = 1}^{ℓ} \frac{3}{2^{b_{j}}} \leq {(\frac{3}{2^{L_{0} + 1}})}^{ℓ} = λ^{ℓ} .

Combining these,

F (P) \leq 2^{ε t} λ^{ℓ} .

(194)

Thus

2^{ε t} λ^{ℓ}

is a worst–case upper bound: if

2^{ε t} λ^{ℓ} \leq 1

, then necessarily

F (P) \leq 1

.

Now decompose the tail of the orbit into alternating “danger blocks” and “high–valuation excursions”. Let

k_{j}

be the start of the j-th danger block, of length

t_{j}

, followed by an excursion of length

ℓ_{j}

:

n_{k_{j}} \in V_{danger} (L_{0}), n_{k_{j} + 1}, \dots, n_{k_{j} + t_{j} - 1} \in V_{danger} (L_{0}), n_{k_{j} + t_{j}}, \dots, n_{k_{j} + t_{j} + ℓ_{j} - 1} \in V_{\geq L_{0} + 1},

and assume

n_{k_{j} + t_{j} + ℓ_{j}} \in V_{danger} (L_{0})

again. For this realized pattern

P_{j}

we have an affine relation

n_{k_{j} + t_{j} + ℓ_{j}} = F (P_{j}) n_{k_{j}} + β_{j},

for some rational number

β_{j}

depending only on the residue pattern; in particular there is a constant

C_{4}

(independent of j) such that

n_{k_{j + 1}} \leq F (P_{j}) n_{k_{j}} + C_{4} for all j .

(195)

(2.3) Contraction along each finite pattern. Let

θ : = {log}_{2} 3 - ε .

By the dangerous–windows hypothesis for the full orbit, we in particular have

a_{k} = ν_{2} (3 n_{k} + 1) > θ for every k \geq k_{0},

since this is the case

t = 1

in the window condition.

Fix one of the patterns

P_{j}

from Step 2.2, of total length

L_{j} : = t_{j} + ℓ_{j} .

Let

a_{0}^{(j)}, \dots, a_{L_{j} - 1}^{(j)}

be the valuation sequence along this pattern, so

a_{m}^{(j)} = ν_{2} (3 n_{k_{j} + m} + 1), m = 0, \dots, L_{j} - 1 .

By the preceding remark,

a_{m}^{(j)} > θ for all m .

Define the excesses

δ_{m}^{(j)} : = a_{m}^{(j)} - θ, m = 0, \dots, L_{j} - 1 .

Since each

a_{m}^{(j)}

is an integer and

a_{m}^{(j)} > θ

, we have

a_{m}^{(j)} \geq ⌊ θ ⌋ + 1 .

By the standing assumption

0 < ε < {log}_{2} 3 - 1

we have

1 < θ = {log}_{2} 3 - ε < 2,

so

⌊ θ ⌋ = 1

and hence

a_{m}^{(j)} \geq 2

for all m. Thus

δ_{m}^{(j)} = a_{m}^{(j)} - θ \geq 2 - θ = 2 - ({log}_{2} 3 - ε) = 2 - {log}_{2} 3 + ε .

Set

η : = 2 - {log}_{2} 3 + ε .

Since

2 - {log}_{2} 3 > 0

, we have

η > ε, δ_{m}^{(j)} \geq η for all m .

The exact multiplicative factor of the pattern

P_{j}

is

F (P_{j}) = \prod_{m = 0}^{L_{j} - 1} \frac{3}{2^{a_{m}^{(j)}}} = \prod_{m = 0}^{L_{j} - 1} \frac{3}{2^{θ + δ_{m}^{(j)}}} = \prod_{m = 0}^{L_{j} - 1} (\frac{3}{2^{θ}}) 2^{- δ_{m}^{(j)}} .

As before,

\frac{3}{2^{θ}} = \frac{3}{2^{{log}_{2} 3 - ε}} = 2^{ε},

so

F (P_{j}) = 2^{L_{j} ε} 2^{- \sum_{m = 0}^{L_{j} - 1} δ_{m}^{(j)}} = 2^{L_{j} ε - \sum_{m = 0}^{L_{j} - 1} δ_{m}^{(j)}} .

Using

δ_{m}^{(j)} \geq η

,

\sum_{m = 0}^{L_{j} - 1} δ_{m}^{(j)} \geq L_{j} η,

so we obtain

F (P_{j}) \leq 2^{L_{j} ε - L_{j} η} = 2^{- L_{j} (η - ε)} .

Since

η - ε = 2 - {log}_{2} 3 > 0

is fixed, we can define

Λ : = 2^{- (η - ε)} = 2^{- (2 - {log}_{2} 3)} \in (0, 1),

and conclude

F (P_{j}) \leq Λ^{L_{j}} \leq Λ for every j .

Thus each danger–excursion pattern

P_{j}

is uniformly contracting in its multiplicative factor, with a contraction constant

Λ = 3 / 4 < 1

that is absolute: it does not depend on

ε

, on j, or on the particular pattern.

Crucially, this argument uses only the pointwise inequality

a_{m}^{(j)} > θ

(the case

t = 1

of the dangerous–windows condition) plus integrality, and applies directly to the finite sequence along

P_{j}

. No completion to a closed walk in the residue graph is required.
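The contraction in Step 2.3 can be verified exhaustively for short patterns. The sketch below computes the exact multiplicative factor of a valuation pattern and checks F(P) ≤ Λ^L whenever every valuation is at least 2, using the identity 2^{-(2 − log_2 3)} = 3/4. The helper names and the finite search ranges are assumptions of the illustration.

```python
from fractions import Fraction
from itertools import product

# Lambda = 2^{-(2 - log2 3)} = 3 / 2^2
LAMBDA = Fraction(3, 4)


def pattern_factor(valuations) -> Fraction:
    """Exact multiplicative factor prod_m 3 / 2^{a_m} of a valuation pattern."""
    L = len(valuations)
    return Fraction(3 ** L, 2 ** sum(valuations))


def contraction_holds(max_length: int, max_valuation: int) -> bool:
    """Check F(P) <= LAMBDA^L for all patterns with entries in 2..max_valuation."""
    for L in range(1, max_length + 1):
        for pattern in product(range(2, max_valuation + 1), repeat=L):
            if pattern_factor(pattern) > LAMBDA ** L:
                return False
    return True
```

The bound is attained by the constant pattern (2, ..., 2), while a pattern such as (1, 1) has factor 9/4 and is excluded only because the t = 1 window condition forbids valuation 1.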

(2.4) Boundedness and periodicity. From (195) and the uniform contraction bound we obtain

n_{k_{j + 1}} \leq Λ n_{k_{j}} + C_{4} for all j,

with

0 < Λ < 1

. Iterating this affine inequality yields

n_{k_{j}} \leq Λ^{j} n_{k_{0}} + \frac{C_{4}}{1 - Λ},

so the subsequence

(n_{k_{j}})

is bounded.

The intermediate values inside each danger block and each high–valuation excursion are then bounded using (191) and (192): the constants

C_{2} (t)

and

C_{3} (ℓ)

are uniformly bounded in

t \leq t_{max}

and

ℓ \geq 0

, and each block has finite length. Consequently the entire tail of the orbit is bounded.

Any bounded accelerated odd Collatz orbit is eventually periodic, so we obtain a contradiction with the assumption that the orbit is infinite and nonperiodic while visiting both

V_{danger} (L_{0})

and

V_{\geq L_{0} + 1}

infinitely often. Therefore, under the dangerous–windows hypothesis, an infinite nonperiodic orbit cannot oscillate indefinitely between the two sets: it must eventually remain in

V_{danger} (L_{0})

.

For Part (2), the only ingredient from the dangerous–windows hypothesis needed for the contraction argument is the pointwise bound

ν_{2} (3 n_{k} + 1) > {log}_{2} 3 - ε for all sufficiently large k

on the valuations along the orbit tail. Together with integrality, this forces every danger–excursion pattern

P_{j}

to have multiplicative factor

F (P_{j}) \leq Λ < 1

uniformly in j, yielding a global contraction inequality

n_{k_{j + 1}} \leq Λ n_{k_{j}} + C_{4}

along the subsequence of entries into

V_{danger} (L_{0})

. This implies boundedness and eventual periodicity of the orbit, contradicting the hypothesis of an infinite nonperiodic orbit that visits both

V_{danger} (L_{0})

and

V_{\geq L_{0} + 1}

infinitely often. Hence any infinite nonperiodic orbit satisfying the dangerous–windows condition must eventually remain in the dangerous residue set, completing the proof of the theorem. □

Theorem 7.17 (Non–Realizability of Dangerous Residue Cycles). Let H be the induced subgraph of the residue–valuation graph

G_{L_{0}} (M)

on the dangerous residue set

V_{danger} (L_{0})

. Let

C = (v_{0} \to v_{1} \to \dots \to v_{p - 1} \to v_{0})

be any directed cycle in H, where

v_{i} \in V_{danger} (L_{0})

and each

v_{i} \to v_{i + 1}

(indices taken modulo p) is an edge in

G_{L_{0}} (M)

. Then no infinite accelerated odd Collatz orbit

{(n_{k})}_{k \geq 0}

can eventually realize C modulo M in the following sense: there do not exist integers

i \geq 0

and

p \geq 1

such that

n_{k} \equiv v_{(k - i) mod p} (mod M) for all k \geq i .

Equivalently, no nonperiodic accelerated odd orbit can have its residue sequence

(n_{k} mod M)

eventually trapped on a directed cycle inside the subgraph H.

Proof. We prove that no infinite nonperiodic accelerated odd Collatz orbit can eventually follow a directed cycle in the dangerous residue graph H.

(1) Structure of a dangerous cycle. Let

C = (v_{0} \to v_{1} \to \dots \to v_{p - 1} \to v_{0})

be a directed cycle in H of length

p \geq 1

. For each i set

a_{i} : = ν_{2} (3 v_{i} + 1) .

Since

v_{i} \in V_{danger} (L_{0})

, each

a_{i}

satisfies

1 \leq a_{i} \leq L_{0} .

Define

S_{p} : = \sum_{i = 0}^{p - 1} a_{i}, α : = \frac{3^{p}}{2^{S_{p}}} .

Then

S_{p} \geq p \geq 1

, so

2^{S_{p}} > 1

and

α

is a nontrivial rational number. In lowest terms we may write

α = \frac{A}{B}, A = 3^{p} odd, B = 2^{S_{p}} > 1 .

In particular

α \neq 1

since

3^{p} \neq 2^{S_{p}}

.

(2) Lifting the cycle to an integer orbit. Assume for contradiction that there exists an infinite nonperiodic accelerated odd Collatz orbit

{(n_{k})}_{k \geq 0}

that eventually follows C modulo M. Thus there is

i_{0} \geq 0

such that for all

k \geq i_{0}

,

n_{k} \equiv v_{(k - i_{0}) mod p} (mod M) .

For

k = i_{0} + m p + r

with

0 \leq r < p

, iterating the accelerated map

T (x) = \frac{3 x + 1}{2^{a (x)}}

along one period of the cycle yields an affine relation of the form

n_{k + p} = \frac{3^{p}}{2^{S_{p}}} n_{k} + \frac{B_{r}}{2^{S_{p}}} = α n_{k} + β_{r},

(196)

where

B_{r} \in Z

depends only on the residue class

v_{r}

, and

β_{r} : = B_{r} / B

.
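The affine relation obtained by composing p accelerated steps can be written out explicitly. The sketch below uses the standard expansion n_p = (3^p n_0 + Σ_{i=0}^{p-1} 3^{p-1-i} 2^{S_i}) / 2^{S_p}, where S_i = a_0 + ... + a_{i-1}, and checks it against direct iteration. The function names are illustrative, and the additive numerator is computed from the valuation sequence rather than from a residue class.

```python
from fractions import Fraction


def nu2(x: int) -> int:
    """2-adic valuation of a positive integer x."""
    v = 0
    while x % 2 == 0:
        x //= 2
        v += 1
    return v


def orbit_valuations(n0: int, p: int) -> tuple[list[int], int]:
    """Valuations of the first p accelerated steps from odd n0, and the endpoint n_p."""
    vals, n = [], n0
    for _ in range(p):
        m = 3 * n + 1
        a = nu2(m)
        vals.append(a)
        n = m >> a
    return vals, n


def affine_coefficients(valuations) -> tuple[Fraction, Fraction]:
    """Return (alpha, beta) with n_p = alpha * n_0 + beta along the given valuations."""
    p = len(valuations)
    partial = 0      # S_i = a_0 + ... + a_{i-1}
    numerator = 0    # sum of 3^(p-1-i) * 2^(S_i)
    for i, a in enumerate(valuations):
        numerator += 3 ** (p - 1 - i) * 2 ** partial
        partial += a
    return Fraction(3 ** p, 2 ** partial), Fraction(numerator, 2 ** partial)
```

For n_0 = 7 and p = 3 the valuations are (1, 1, 2), giving α = 27/16 and β = 19/16, and indeed (27 · 7 + 19)/16 = 13.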

(3) Recurrence and explicit solution. Fix

r \in {0, \dots, p - 1}

and define the subsequence

x_{m} : = n_{i_{0} + m p + r}, m \geq 0 .

Then (196) becomes

x_{m + 1} = α x_{m} + β_{r}, α = \frac{A}{B}, β_{r} = \frac{B_{r}}{B} .

Since

α \neq 1

, the standard solution of this affine recurrence gives

x_{m} = α^{m} x_{0} + β_{r} \frac{α^{m} - 1}{α - 1} = α^{m} C_{r} + D_{r},

where

C_{r} : = x_{0} - \frac{β_{r}}{1 - α}, D_{r} : = \frac{β_{r}}{1 - α} .

Because

x_{m} \in Z

for all m, we have

α^{m} C_{r} \in Z - D_{r} for all m \geq 0 .

(197)

In particular

α^{m} C_{r}

is rational for all m.
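The closed form of the affine recurrence can be confirmed in exact rational arithmetic. In the sketch below D denotes the fixed point of x ↦ αx + β and C = x_0 − D, so that x_m = α^m C + D; the function names and sample coefficients are illustrative assumptions.

```python
from fractions import Fraction


def iterate_affine(alpha: Fraction, beta: Fraction, x0: Fraction, m: int) -> Fraction:
    """Apply x -> alpha * x + beta exactly m times."""
    x = x0
    for _ in range(m):
        x = alpha * x + beta
    return x


def closed_form(alpha: Fraction, beta: Fraction, x0: Fraction, m: int) -> Fraction:
    """Evaluate x_m = alpha^m * C + D, valid whenever alpha != 1."""
    D = beta / (1 - alpha)   # fixed point of the affine map
    C = x0 - D
    return alpha ** m * C + D
```

With α = 27/16, β = 19/16 and x_0 = 7 both functions return 13 at m = 1; when x_0 equals the fixed point, C vanishes and the sequence is constant.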

(4) Normal form for

C_{r}

and a divisibility condition. Write

C_{r}

in lowest terms as

C_{r} = \frac{c}{2^{t} d},

where

c, d \in Z

are odd and

t \geq 0

. Using

α = A / B = 3^{p} / 2^{S_{p}}

we obtain

α^{m} C_{r} = {(\frac{3^{p}}{2^{S_{p}}})}^{m} \frac{c}{2^{t} d} = \frac{3^{p m} c}{2^{m S_{p} + t} d} .

By (197) the quantity

α^{m} C_{r}

must be an integer for all

m \geq 0

, hence

2^{m S_{p} + t} d ∣ 3^{p m} c for all m \geq 0 .

(198)

(5) Denominator–growth forcing

C_{r} = 0

. Because d and

3^{p m}

are odd, the right–hand side of (198), namely

3^{p m} c

, is an odd integer when

c \neq 0

. From (198) we deduce that for all

m \geq 0

,

2^{m S_{p} + t} ∣ 3^{p m} c .

Since

S_{p} \geq 1

, we have

m S_{p} + t \geq S_{p} + t \geq 1

for

m \geq 1

, so

2^{m S_{p} + t} \geq 2

. If

c \neq 0

, then

3^{p m} c

is a nonzero odd integer that is divisible by 2, which is impossible. Therefore

c = 0

, and hence

C_{r} = 0 .

(6) Eventual periodicity. With

C_{r} = 0

, the explicit formula reduces to

x_{m} = D_{r} for all m \geq 0 .

Thus each residue–class subsequence

{(n_{i_{0} + m p + r})}_{m \geq 0}

is constant. In particular the tail

{(n_{k})}_{k \geq i_{0}}

takes values in the finite set

{n_{i_{0}}, n_{i_{0} + 1}, \dots, n_{i_{0} + p - 1}}

in a periodic pattern of period dividing p. Hence the orbit

(n_{k})

is eventually periodic.

This contradicts the assumption that the orbit is infinite and nonperiodic. Therefore no infinite nonperiodic accelerated odd Collatz orbit can eventually follow a directed cycle in the dangerous residue graph H, which proves the theorem. □

Lemma 7.18 (Finite Graph Periodicity). Let H be the induced subgraph of

G_{L_{0}} (M)

on

V_{danger} (L_{0})

. Then every infinite directed path in H is eventually periodic. That is, there exist integers

i \geq 0

and

p \geq 1

such that the vertex sequence

{(v_{k})}_{k \geq 0}

satisfies

v_{k + p} = v_{k} for all k \geq i .

Proof. Since H has finitely many vertices, say N, the sequence

(v_{k})

takes values in a finite set of size N. Hence there exist indices

0 \leq i < j

with

v_{i} = v_{j}

; choose such a pair with

j - i

minimal, and set

p : = j - i \geq 1

. The segment

v_{i} \to v_{i + 1} \to \dots \to v_{j - 1} \to v_{j} = v_{i}

is a directed cycle of length p in H. Because the path

(v_{k})

is directed, once it reaches

v_{i}

again, all future steps must follow the outgoing edges of this cycle. Thus

v_{k + p} = v_{k} for all k \geq i,

and the path is eventually periodic. □
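The pigeonhole argument can be turned into a short procedure for locating the preperiod i and the period p. The sketch below assumes the path is generated by a deterministic successor map on a finite vertex set, that is, each vertex on the path has a single outgoing edge that the path follows; the function name and the toy successor map are hypothetical.

```python
def find_preperiod_and_period(start, step):
    """Return (i, p) with v_{k+p} = v_k for all k >= i, where v_{k+1} = step(v_k).

    Terminates whenever the sequence takes values in a finite set.
    """
    first_seen = {}
    v, k = start, 0
    while v not in first_seen:
        first_seen[v] = k
        v = step(v)
        k += 1
    i = first_seen[v]
    return i, k - i


def toy_step(v: int) -> int:
    """Hypothetical successor map on residues modulo 11, used only as an example."""
    return (v * v + 1) % 11
```

Starting from 0, the toy map produces 0, 1, 2, 5, 4, 6, 4, ..., so the procedure returns preperiod 4 and period 2.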

Lemma 7.19 (Finite Graph Obstruction). Assume Theorem 7.17. Then no infinite directed path in H can arise as the sequence of residues

n_{k} mod M

for a valid infinite nonperiodic accelerated odd Collatz orbit

{(n_{k})}_{k \geq 0}

.

Proof. Let

{(n_{k})}_{k \geq 0}

be an infinite nonperiodic accelerated odd Collatz orbit, and set

v_{k} : = n_{k} mod M

. Suppose that

v_{k} \in V_{danger} (L_{0})

for all sufficiently large k, so that

{(v_{k})}_{k \geq K}

is an infinite directed path in H for some K. By Lemma 7.18, this path is eventually periodic: there exist integers

i \geq K

and

p \geq 1

such that

v_{k + p} = v_{k} for all k \geq i .

In particular, the finite list

C = (v_{i}, v_{i + 1}, \dots, v_{i + p - 1})

forms a directed cycle in H, and the residue sequence

(v_{k})

follows this cycle from time i onward:

v_{k} = v_{(k - i) mod p + i} for all k \geq i .

Equivalently,

n_{k} \equiv v_{(k - i) mod p + i} (mod M) for all k \geq i .

This is precisely excluded by Theorem 7.17, which asserts that no infinite accelerated odd Collatz orbit can eventually follow a cycle in H modulo M. Hence, under Theorem 7.17, no such infinite path in H can be realized by the residue sequence of a valid infinite nonperiodic orbit. □

Theorem 7.20 (Weak Non–Retreat). Assume Lemmas 7.12, 7.15, 7.18, and 7.19, together with Theorem 7.16. Then every nonperiodic accelerated odd Collatz orbit admits infinitely many windows of length

1 \leq t \leq t_{max}

whose average valuation satisfies

A (W (k, t)) \leq {log}_{2} 3 - ε .

Consequently, by Theorem 6.13 there exist constants

c > 0

and

C_{0}

such that

J (k + t) \geq J (k) + c t - C_{0}

for infinitely many pairs

(k, t)

, and the weak non–retreat property of Theorem 1.1 (6) holds under these assumptions.

Proof. Let

{(n_{k})}_{k \geq 0}

be a nonperiodic accelerated odd Collatz orbit, and suppose toward contradiction that only finitely many pairs

(k, t)

with

1 \leq t \leq t_{max}

satisfy

A (W (k, t)) \leq {log}_{2} 3 - ε .

Then there exists

k_{0}

such that for all

k \geq k_{0}

and all

1 \leq t \leq t_{max}

, the window

W (k, t)

is dangerous. This places us in the dangerous–windows regime of Theorem 7.16. Fix

L_{0}

and the associated modulus M and residue sets

V_{danger} (L_{0})

and

V_{\geq L_{0} + 1}

. By Lemma 7.15 there exists

k_{1} \geq k_{0}

such that for all

k \geq k_{1}

,

n_{k} mod M \in V_{danger} (L_{0}) \cup V_{\geq L_{0} + 1} .

By Theorem 7.16 (1), for

L_{0}

chosen large enough no nonperiodic orbit can eventually remain entirely inside the high–valuation set

E_{\leq t_{max}} (L_{0} + 1)

, hence the residue sequence cannot be eventually contained in

V_{\geq L_{0} + 1}

. There are therefore infinitely many k with

k \geq k_{1}

and

n_{k} mod M \in V_{danger} (L_{0})

. If the sequence

(n_{k} mod M)

visits

V_{\geq L_{0} + 1}

only finitely many times, then there exists

K \geq k_{1}

with

n_{k} mod M \in V_{danger} (L_{0}) for all k \geq K .

If, on the other hand, it visits

V_{\geq L_{0} + 1}

infinitely often, then Theorem 7.16 (2) applies in the dangerous–windows regime and again yields an index

K \geq k_{1}

such that

n_{k} mod M \in V_{danger} (L_{0}) for all k \geq K .

In either case, the tail

{(n_{k} mod M)}_{k \geq K}

defines an infinite directed path inside the induced subgraph H of

G_{L_{0}} (M)

on

V_{danger} (L_{0})

.

By Lemma 7.19, such an infinite path cannot arise from a valid infinite nonperiodic Collatz orbit. This contradicts the assumed nonperiodicity of

(n_{k})

and the existence of only finitely many valuation–deficit windows.

Therefore every nonperiodic accelerated odd orbit must admit infinitely many pairs

(k, t)

with

1 \leq t \leq t_{max}

and

A (W (k, t)) \leq {log}_{2} 3 - ε .

For each such window, Theorem 6.13 gives a forward drift estimate

J (k + t) \geq J (k) + c t - C_{0}

with

c > 0

and

C_{0}

depending only on the global parameters. Since there are infinitely many such windows, these inequalities hold along an infinite subsequence of times, which is precisely the weak non–retreat property. □

Logical Closure of the Spectral-Dynamical Framework

In this section we state and prove the propositions that establish the implications of the Dynamical Forms Theorem 1.1.

(3) ⇒ (2)

Proposition 7.21 (Orbit–supported invariant functionals are impossible). Let P be the backward Collatz transfer operator on

B_{tree, σ}

and assume the Peripheral Spectral Classification Theorem 6.2. Then there is no nonzero

P^{*}

–invariant functional in

B_{tree, σ}^{*}

whose support is contained in a single forward Collatz orbit.

In particular, statement (3) of Theorem 1.1 (existence of a nonzero orbit–supported invariant functional for every infinite orbit) cannot hold unless there are no infinite forward orbits at all. Hence (3) implies (2).

Proof. By Theorem 6.2 and positivity of P, there exist

h > 0

and

ϕ > 0

with

P h = h

,

P^{*} ϕ = ϕ

, and the eigenspace for eigenvalue 1 of

P^{*}

is one–dimensional and spanned by

ϕ

. Thus if

Ψ \in B_{tree, σ}^{*}

satisfies

P^{*} Ψ = Ψ

, then

Ψ = c ϕ

for some scalar c.

Fix a forward orbit

O^{+} (n_{0}) = {T^{k} (n_{0}) : k \geq 0}

and suppose there exists a nonzero

P^{*}

–invariant functional

Ψ

supported on this orbit, in the sense that

Ψ (f) = 0

whenever f vanishes on

O^{+} (n_{0})

. Then

Ψ = c ϕ

with

c \neq 0

.

Consider the nonnegative test function

f (n) : = 1_{N ∖ O^{+} (n_{0})} (n) .

By support of

Ψ

on the orbit we have

Ψ (f) = 0

. On the other hand,

f \geq 0

and

f \not\equiv 0

, so strict positivity of

ϕ

gives

ϕ (f) > 0

and hence

Ψ (f) = c ϕ (f) \neq 0,

a contradiction. Therefore no nonzero

P^{*}

–invariant functional can be supported on a single forward orbit. If statement (3) of Theorem 1.1 holds, then every infinite forward orbit would support such a functional, which we have just ruled out. Hence there are no infinite forward orbits, and in particular every orbit has bounded block index:

{sup}_{k \geq 0} j (n_{k}) < \infty

, which is statement (2). □

(3) + (8) ⇒ (1)

Proposition 7.22 (Strong Collatz from orbit averages and residue graphs). Assume Conjecture 6.3 (statement (3) of Theorem 1.1) together with Theorems 7.16 and 7.17. Then statement (1) of Theorem 1.1 (Strong Collatz Conjecture) holds.

Proof.

(1) Exclusion of nontrivial cycles. By Theorems 7.17 and 7.16, no nontrivial accelerated odd Collatz cycle can persist inside the dangerous residue graph, and every nonperiodic accelerated odd orbit eventually escapes it. The argument given previously in the proof of Theorem 7.20 therefore excludes all nontrivial cycles; we do not repeat it here.

(2) Exclusion of infinite orbits via orbit–averaging and spectral rigidity. Suppose for contradiction that there exists an infinite forward Collatz orbit

{(n_{k})}_{k \geq 0}

with base point

n_{0}

. By statement (3) (Orbit–Averaging Conjecture), this orbit produces a nonzero

P^{*}

–invariant functional

Λ \in B_{tree, σ}^{*}, P^{*} Λ = Λ,

which is obtained as an orbit–average along

{(n_{k})}_{k \geq 0}

. In particular, if a test function

f \in B_{tree, σ}

satisfies

f (n_{k}) = 0

for all sufficiently large k, then the orbit–average of f along

(n_{k})

vanishes and hence

Λ (f) = 0 .

(199)

On the other hand, the spectral classification theorem (uniqueness of the positive eigenfunctional, Theorem 5.1) asserts that every

P^{*}

–invariant functional is a scalar multiple of the distinguished Perron–Frobenius eigenfunctional

ϕ

:

Λ = α ϕ, α \neq 0 .

We now construct a smooth cutoff that lies in

B_{tree, σ}

and vanishes along the tail of the orbit. By the nontrapping and escape statement (8), the forward orbit

(n_{k})

cannot remain indefinitely in any fixed low region of the tree. Hence we may choose

m \in N

so large that

n_{k} > m for all k \geq K

for some

K \geq 0

. Define a test function

h_{m} : N \to [0, \infty)

by

h_{m} (n) : = m^{- 2 σ} 1_{{n \leq m}} n^{- σ} .

We first verify that

h_{m} \in B_{tree, σ}

. The weighted

ℓ^{1}

norm is finite:

∥ h_{m} ∥_{ℓ_{σ}^{1}} = \sum_{n \geq 1} h_{m} (n) n^{- σ} = \sum_{n \leq m} m^{- 2 σ} n^{- 2 σ} \leq m^{- 2 σ} \cdot m \cdot 1^{- 2 σ} = m^{1 - 2 σ} < \infty

since

σ > 1

. Moreover

h_{m}

is supported on the finite set

{1, \dots, m}

, so only finitely many tree blocks

I_{j}

intersect its support. In each such block, the contribution to the block seminorm comes from finitely many points with uniformly bounded size, multiplied by the decaying prefactor

m^{- 2 σ}

. Thus the tree seminorm of

h_{m}

is finite and

h_{m} \in B_{tree, σ}

.
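The weighted norm estimate for the cutoff h_m is easy to confirm numerically. The sketch below evaluates Σ_{n ≤ m} m^{-2σ} n^{-2σ} in floating point and compares it with m^{1-2σ}; only the weighted ℓ^1 part of the norm is covered, and the function names and tolerance are assumptions of the illustration.

```python
def h_m(n: int, m: int, sigma: float) -> float:
    """Cutoff h_m(n) = m^{-2 sigma} * n^{-sigma} for n <= m, and 0 otherwise."""
    return m ** (-2 * sigma) * n ** (-sigma) if n <= m else 0.0


def weighted_l1_norm(m: int, sigma: float) -> float:
    """Weighted l^1 norm: sum over n <= m of h_m(n) * n^{-sigma}."""
    return sum(h_m(n, m, sigma) * n ** (-sigma) for n in range(1, m + 1))


def norm_bound_holds(m: int, sigma: float, tol: float = 1e-12) -> bool:
    """Check the estimate ||h_m|| <= m^{1 - 2 sigma} up to rounding error."""
    return weighted_l1_norm(m, sigma) <= m ** (1 - 2 * sigma) + tol
```

Since h_m vanishes for n > m, any orbit tail with n_k > m is invisible to it, which is the property used in the argument.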

By construction, we have

h_{m} (n_{k}) = 0

for all

k \geq K

, since

n_{k} > m

implies

h_{m} (n_{k}) = 0

. Therefore

h_{m}

vanishes along the tail of the orbit, and the defining orbit–averaging property of

Λ

yields

Λ (h_{m}) = 0

via (199). On the other hand, write

ϕ (f) = \sum_{n \geq 1} f (n) h (n)

with h the strictly positive Perron–Frobenius eigenfunction satisfying

P h = h

. Then

ϕ (h_{m}) = \sum_{n \leq m} h_{m} (n) h (n) = m^{- 2 σ} \sum_{n \leq m} n^{- σ} h (n) > 0,

because

h (n) > 0

for all

n \geq 1

and the sum is finite and nonempty. Combining the two expressions for

Λ (h_{m})

gives

0 = Λ (h_{m}) = α ϕ (h_{m}),

which contradicts

α \neq 0

and

ϕ (h_{m}) > 0

. This contradiction shows that no infinite forward Collatz orbit can exist.

We have ruled out both nontrivial cycles (Step 1) and infinite orbits (Step 2). Hence every forward Collatz orbit is finite, and the only cycle is the trivial

1 \to 4 \to 2 \to 1

cycle. This is precisely statement (1) of Theorem 1.1. □

(4) ⇒ (3)

Proposition 7.23 (Block orbit averages imply the Orbit–Averaging Conjecture). Assume statement (4) of Theorem 1.1 (the Block–Orbit–Averaging Conjecture). Then statement (3) of Theorem 1.1 (the Orbit–Averaging Conjecture, Conjecture 6.3) holds. In particular, (4) implies (3) in Theorem 1.1.

Proof. Let

{(n_{k})}_{k \geq 0}

be an infinite forward Collatz orbit with base point

n_{0}

. For each

N \geq 1

consider the orbit Cesàro functional

Φ_{N} (f) : = \frac{1}{N} \sum_{k = 0}^{N - 1} f (n_{k}), f \in B_{tree, σ} .

By Lemma 5.30, the family

{(Φ_{N})}_{N \geq 1}

is uniformly bounded in

B_{tree, σ}^{*}

, hence weak^* relatively compact.

Statement (4) asserts, for this orbit, the existence of a nontrivial block–orbit average. Concretely, it guarantees that there exists a sequence

N_{j} \to \infty

and a nonzero functional

Λ \in B_{tree, σ}^{*} ∖ {0}

such that

Φ_{N_{j}} \overset{{weak}^{*}}{⟶} Λ .

Thus

Λ

is a nonzero weak^* limit point of the orbit Cesàro functionals for

(n_{k})

. By Proposition 5.31, every such weak^* limit of

(Φ_{N})

along a forward Collatz orbit satisfies two key properties:

P^{*} Λ = Λ,

and

Λ

is supported entirely on the orbit

O^{+} (n_{0})

in the sense that

f (n_{k}) = 0 \forall k \geq 0 ⟹ Λ (f) = 0 .

Since

Λ \neq 0

, this gives exactly the conclusion of statement (3) for the orbit

O^{+} (n_{0})

: it produces a nonzero

P^{*}

–invariant linear functional supported on that orbit.

Because the choice of the infinite orbit

(n_{k})

was arbitrary, the same argument applies to every infinite forward orbit. Hence the Orbit–Averaging Conjecture (3) holds whenever the Block–Orbit–Averaging statement (4) holds, and we have shown (4) ⇒ (3) in Theorem 1.1. □
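The orbit Cesàro functional Φ_N is straightforward to evaluate on a concrete orbit. The sketch below assumes the standard forward map n ↦ n/2 for even n and n ↦ 3n + 1 for odd n; the normalization of T used in the paper may differ, and the function names are illustrative.

```python
from fractions import Fraction


def collatz_step(n: int) -> int:
    """Standard forward Collatz map (an assumed normalization of T)."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def cesaro_average(f, n0: int, N: int):
    """Phi_N(f) = (1/N) * sum_{k=0}^{N-1} f(n_k) along the forward orbit of n0."""
    total = Fraction(0)
    n = n0
    for _ in range(N):
        total += f(n)
        n = collatz_step(n)
    return total / N
```

On the trivial cycle 1, 4, 2 the averages converge to the uniform average over the cycle, which illustrates why nontrivial limits of Φ_N require the orbit to spend a positive proportion of time in a bounded region.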

(5) ⇒ (4)

Proposition 7.24 (Implication from supercritical linear block growth to BOA). Assume Statement (5) (Block–Escape Implies Supercritical Linear Block Growth, Conjecture 6.10). Then Statement (4) (Block–Orbit–Averaging, Conjecture 6.4) holds.

Proof. Suppose for contradiction that Statement (4) fails. Then there exists an infinite forward Collatz orbit

O^{+} (n_{0}) = {T^{k} (n_{0})}_{k \geq 0}

such that for every integer

J \geq 0

the orbit does not spend a positive proportion of time in the finite union of low blocks

⋃_{j \leq J} I_{j}

. In terms of the block index

J (k) : = the unique j \geq 0 with T^{k} (n_{0}) \in I_{j},

this means that for every

J \geq 0

the lower asymptotic frequency of visits to

⋃_{j \leq J} I_{j}

is zero. Equivalently, the orbit satisfies the block–escape condition of Definition 6.5, namely

lim_{N \to \infty} \frac{1}{N} \sum_{k = 0}^{N - 1} 1_{{J (k) \leq J}} = 0 for every J \geq 0,

(200)

that is, the Block–Escape Property (BEP). By Statement (5) (Conjecture 6.10), every infinite orbit satisfying BEP must exhibit supercritical linear block growth. Concretely, there exist a constant

α > \frac{log 2}{log 6}

and a strictly increasing subsequence

{(k_{ℓ})}_{ℓ \geq 1}

with

k_{ℓ} \to \infty

such that

J (k_{ℓ}) \geq α k_{ℓ} for all ℓ \geq 1 .

(201)

Thus the orbit

O^{+} (n_{0})

satisfies both the block–escape condition (200) and the linear growth condition (201).

We now invoke Proposition 7.5. That result states that if an infinite orbit satisfies the block–escape condition

\forall J_{0} \geq 0 : lim_{N \to \infty} \frac{1}{N} \sum_{k = 0}^{N - 1} 1_{{J (k) \leq J_{0}}} = 0

and there exists

β > 0

and a subsequence

(k_{ℓ})

with

J (k_{ℓ}) \geq β k_{ℓ} for all ℓ,

then the orbit violates the universal exponential growth bound of Lemma 6.7. In the proof of Proposition 7.5 one obtains

\underset{ℓ \to \infty}{lim sup} \frac{1}{k_{ℓ}} log T^{k_{ℓ}} (n_{0}) \geq β log 6 .

Taking

β = α

and using

α > log 2 / log 6

, this yields

\underset{ℓ \to \infty}{lim sup} \frac{1}{k_{ℓ}} log T^{k_{ℓ}} (n_{0}) \geq α log 6 > log 2,

which contradicts Lemma 6.7, since that lemma gives the universal upper bound

\underset{k \to \infty}{lim sup} \frac{1}{k} log T^{k} (n_{0}) \leq log 2 .

Therefore no infinite orbit can simultaneously satisfy BEP and the supercritical linear growth condition (201). In particular, our initial assumption that there exists an infinite orbit for which Statement (4) fails leads to a contradiction, because failure of (4) was exactly what forced BEP in (200).
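The exponent comparison used in this step can be checked numerically. The sketch below assumes the accelerated Collatz map T(n) = n/2 for even n and T(n) = (3n+1)/2 for odd n; this section does not restate T, so that normalisation is an assumption. For this map T(n) ≤ 2n, hence (1/k) log T^k(n_0) ≤ log 2 + (log n_0)/k. The helper names are illustrative only.

```python
import math

def collatz_step(n: int) -> int:
    """Accelerated Collatz map (assumed normalisation): n/2 or (3n+1)/2."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

def growth_rates(n0: int, steps: int) -> list[float]:
    """Return (1/k) * log T^k(n0) for k = 1..steps."""
    rates, n = [], n0
    for k in range(1, steps + 1):
        n = collatz_step(n)
        rates.append(math.log(n) / k)
    return rates

# Critical slope: alpha * log 6 > log 2 exactly when alpha > log 2 / log 6.
critical_alpha = math.log(2) / math.log(6)

n0 = 27
rates = growth_rates(n0, 60)
# Each step satisfies T(n) <= 2n, so T^k(n0) <= 2^k * n0 for every k.
upper_ok = all(
    r <= math.log(2) + math.log(n0) / k + 1e-12
    for k, r in enumerate(rates, start=1)
)
```

Any slope α above `critical_alpha` would force the exponential rate above log 2, which is the incompatibility the proof exploits.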

Taking the contrapositive, every infinite orbit must fail BEP. Thus for each infinite orbit

O^{+} (n_{0})

there exists some

J_{0} \geq 0

such that the lower asymptotic frequency of visits to

⋃_{j \leq J_{0}} I_{j}

is strictly positive:

\underset{N \to \infty}{lim inf} \frac{1}{N} \sum_{k = 0}^{N - 1} 1_{{J (k) \leq J_{0}}} > 0 .

This is precisely the Block–Orbit–Averaging statement (4). Hence, under Statement (5), Statement (4) holds. □
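The block index J(k) and the visit frequencies appearing in (200) are easy to compute along a finite orbit segment. The following is a minimal sketch under two assumptions not restated in this section: the blocks are taken to be I_j = [6^j, 6^{j+1}), and T is the accelerated Collatz map. Function names are hypothetical.

```python
def collatz_step(n: int) -> int:
    """Accelerated Collatz map (assumed normalisation)."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

def block_index(n: int) -> int:
    """Index j with 6**j <= n < 6**(j+1) (assumed block convention)."""
    j, power = 0, 6
    while power <= n:
        power *= 6
        j += 1
    return j

def low_block_frequency(n0: int, N: int, J: int) -> float:
    """(1/N) * #{0 <= k < N : J(k) <= J} along the orbit of n0."""
    hits, n = 0, n0
    for _ in range(N):
        if block_index(n) <= J:
            hits += 1
        n = collatz_step(n)
    return hits / N
```

For a finite N this only evaluates one term of the sequence whose lower limit appears in Statement (4); it cannot by itself decide whether that lower limit is positive.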

(6) ⇒ (5)

Lemma 7.25 (Flexible choice of drift constants). Let

(n_{k})

be a forward orbit and suppose there exist

ε > 0

,

t_{max} \in ℕ

, and

K_{0} \geq 0

such that for all

k \geq K_{0}

and all

1 \leq t \leq t_{max}

one has

j (n_{k + t}) \geq j (n_{k}) + ε t - 1 .

(202)

Then for any

δ \in (0, ε)

the choice

c : = ε - δ, C_{0} : = 1

also yields a valid weak non–retreat inequality:

j (n_{k + t}) \geq j (n_{k}) + c t - C_{0}, 1 \leq t \leq t_{max}, k \geq K_{0} .

(203)

Moreover, for these constants one has

\frac{c t_{max} - C_{0}}{t_{max}} = (ε - δ) - \frac{1}{t_{max}} .

(204)

Proof. Fix

δ \in (0, ε)

and set

c : = ε - δ

and

C_{0} : = 1

. Then for each

1 \leq t \leq t_{max}

we compute

(ε t - 1) - (c t - C_{0}) = (ε t - 1) - ((ε - δ) t - 1) = δ t \geq δ > 0 .

Thus the right–hand side of (202) is strictly larger than

c t - C_{0}

for every

t \in [1, t_{max}]

, and hence (202) implies (203) for all

k \geq K_{0}

. The identity (204) is an immediate algebraic rearrangement:

\frac{c t_{max} - C_{0}}{t_{max}} = \frac{(ε - δ) t_{max} - 1}{t_{max}} = (ε - δ) - \frac{1}{t_{max}} .

This proves the lemma. □
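The algebra of the lemma can be verified with exact rational arithmetic. The sketch below uses illustrative values of ε, δ and t_max (they are not taken from the paper) and checks both the gap identity from the proof and the rearrangement (204).

```python
from fractions import Fraction

def drift_constants(eps: Fraction, delta: Fraction, t_max: int):
    """Return (c, C0, slope) with c = eps - delta and C0 = 1, as in the lemma."""
    if not (0 < delta < eps):
        raise ValueError("need 0 < delta < eps")
    c, C0 = eps - delta, Fraction(1)
    slope = (c * t_max - C0) / t_max
    return c, C0, slope

def gap(eps: Fraction, delta: Fraction, t: int) -> Fraction:
    """(eps*t - 1) - (c*t - C0) with c = eps - delta, C0 = 1."""
    return (eps * t - 1) - ((eps - delta) * t - 1)

# Illustrative values only.
eps, delta, t_max = Fraction(1, 2), Fraction(1, 10), 5
c, C0, slope = drift_constants(eps, delta, t_max)
```

With these values the slope equals 1/5, which agrees with (ε − δ) − 1/t_max = 2/5 − 1/5.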

Corollary 7.26 (Consistency of quantitative deficiency with valuation drift). Let

α_{*} = ({log}_{2} 3 - 1) / {log}_{2} 6

. Suppose that for some choice of

L_{0}

and

t_{max}

the residue–graph analysis and Theorem 7.20 produce a deficit parameter

ε = ε (L_{0}, t_{max})

such that

ε (L_{0}, t_{max}) > α_{*} + η + \frac{1}{t_{max}}

(205)

for a given

η > 0

. Then there exist constants

c > 0

,

C_{0} \geq 0

, and the same

t_{max}

such that

\frac{c t_{max} - C_{0}}{t_{max}} > α_{*} + η .

Proof. Apply Theorem 6.13 with the deficit parameter

ε = ε (L_{0}, t_{max})

to obtain an inequality of the form (202) for some

K_{0}

. Fix

η > 0

and choose

δ \in (0, ε (L_{0}, t_{max}))

so small that

ε (L_{0}, t_{max}) - δ - \frac{1}{t_{max}} > α_{*} + η .

(This is possible precisely because of the strict inequality (205).) With this choice of

δ

, define

c : = ε - δ

and

C_{0} : = 1

as in Lemma 7.25. Then (203) holds for all

1 \leq t \leq t_{max}

and all

k \geq K_{0}

, and by (204) we have

\frac{c t_{max} - C_{0}}{t_{max}} = (ε - δ) - \frac{1}{t_{max}} > α_{*} + η .

□
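The choice of δ in this proof can be made explicit. The sketch below computes α_* numerically and, for a hypothetical deficit parameter ε (the values used are placeholders, not outputs of Theorem 7.20), returns one admissible δ whenever the strict inequality (205) holds.

```python
import math

# alpha_* = (log_2 3 - 1) / log_2 6, as defined in the corollary.
ALPHA_STAR = (math.log2(3) - 1) / math.log2(6)

def choose_delta(eps: float, eta: float, t_max: int):
    """Return a delta in (0, eps) with eps - delta - 1/t_max > ALPHA_STAR + eta.

    Returns None when the strict inequality (205) fails, since then no
    admissible delta exists.
    """
    margin = eps - 1.0 / t_max - (ALPHA_STAR + eta)
    if margin <= 0:
        return None
    # Half of the available margin keeps the target inequality strict.
    return margin / 2
```

Taking half of the margin is only one convenient choice; any δ strictly between 0 and the margin works equally well.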

Proposition 7.27 (Quantitative weak non–retreat implies supercritical linear block growth). Let

(c, C_{0}, t_{max}, J_{0})

be constants for which the Quantitative Weak Non–Retreat Principle (6) holds for every infinite forward orbit. Suppose these constants arise from the valuation–drift analysis (Theorem 6.13) and satisfy

\frac{c t_{max} - C_{0}}{t_{max}} > α_{*}, α_{*} : = \frac{{log}_{2} 3 - 1}{{log}_{2} 6} .

In particular

c t_{max} - C_{0} > 0

. Then for every infinite orbit satisfying the Block–Escape Property there exists a constant

α > α_{*}

and a subsequence

k_{ℓ} \to \infty

such that

j (n_{k_{ℓ}}) \geq α k_{ℓ}

for all sufficiently large ℓ. Consequently, under the above condition on

(c, C_{0}, t_{max})

, statement (6) implies the supercritical linear block growth statement (5) of Theorem 1.1.

Proof. Let

{(n_{k})}_{k \geq 0}

be an infinite forward orbit satisfying the Block–Escape Property. By Theorem 7.20 and the residue–graph construction, there exist parameters

L_{0}

and

t_{max}

and a deficit parameter

ε = ε (L_{0}, t_{max}) > 0

such that the valuation drift estimate of Theorem 6.13 holds in the form

j (n_{k + t}) \geq j (n_{k}) + ε t - 1, 1 \leq t \leq t_{max},

(206)

for all sufficiently large k. Assume that for some

η > 0

the lower bound

ε (L_{0}, t_{max}) > α_{*} + η + \frac{1}{t_{max}}, α_{*} : = \frac{{log}_{2} 3 - 1}{{log}_{2} 6},

(207)

holds. By Lemma 7.25, for any

δ \in (0, ε)

we may define

c : = ε - δ, C_{0} : = 1,

and then (206) implies the weak non–retreat inequality

j (n_{k + t}) \geq j (n_{k}) + c t - C_{0} for all 1 \leq t \leq t_{max}

(208)

for all sufficiently large k. Moreover,

\frac{c t_{max} - C_{0}}{t_{max}} = (ε - δ) - \frac{1}{t_{max}} .

Using the strict inequality (207), one may choose

δ > 0

small enough so that

(ε - δ) - \frac{1}{t_{max}} > α_{*} .

Thus there exist constants

c > 0

,

C_{0} \geq 0

, and the same

t_{max}

with

\frac{c t_{max} - C_{0}}{t_{max}} > α_{*},

for which (208) holds for all sufficiently large k.

(1) Constructing the subsequence. Choose

k_{0}

large enough that

j (n_{k_{0}}) \geq J_{0}

and that (208) holds for all

k \geq k_{0}

with

j (n_{k}) \geq J_{0}

. Define an increasing sequence

(k_{ℓ})

by

k_{ℓ + 1} : = k_{ℓ} + t_{ℓ},

where

t_{ℓ} \in [1, t_{max}]

is chosen so that (208) holds at

k = k_{ℓ}

. Induction shows that

j (n_{k_{ℓ}}) \geq J_{0}

for all ℓ, and hence the construction continues indefinitely with

k_{ℓ} \to \infty

.

(2) Accumulated drift. From (208), for each ℓ,

j (n_{k_{ℓ + 1}}) \geq j (n_{k_{ℓ}}) + c t_{ℓ} - C_{0} .

Summing for

ℓ = 0, \dots, m - 1

and using

k_{m} = k_{0} + \sum_{ℓ = 0}^{m - 1} t_{ℓ}

gives

j (n_{k_{m}}) \geq j (n_{k_{0}}) + c (k_{m} - k_{0}) - m C_{0} .

Since (208) holds for every

1 \leq t \leq t_{max}

, we may take

t_{ℓ} = t_{max}

for every ℓ, so that

m = \frac{k_{m} - k_{0}}{t_{max}},

and therefore

j (n_{k_{m}}) \geq j (n_{k_{0}}) + c (k_{m} - k_{0}) - \frac{C_{0}}{t_{max}} (k_{m} - k_{0}) .

Let

γ : = \frac{c t_{max} - C_{0}}{t_{max}} .

Then

j (n_{k_{m}}) \geq γ k_{m} - C^{'}, C^{'} : = γ k_{0} - j (n_{k_{0}}) .

(3) Supercritical slope. By assumption,

γ = \frac{c t_{max} - C_{0}}{t_{max}} > α_{*} .

Hence

\underset{m \to \infty}{lim inf} \frac{j (n_{k_{m}})}{k_{m}} \geq γ .

Choose any

α

with

α_{*} < α < γ

; for all sufficiently large m,

j (n_{k_{m}}) \geq α k_{m} .

Renaming the tail of

(k_{m})

as

(k_{ℓ})

yields the desired subsequence.

Since the orbit satisfies the Block–Escape Property, this linear lower growth of

j (n_{k})

along a subsequence is exactly the supercritical linear block growth required in statement (5) of Theorem 1.1. The implication (6) ⇒ (5) follows. □
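The accumulated-drift bound of step (2) can be illustrated on a synthetic sequence. The sketch below takes every step length equal to t_max (one admissible choice of t_ℓ) and assumes the extremal case in which (208) holds with equality; all numerical values are illustrative, and no actual Collatz orbit is involved.

```python
from fractions import Fraction

def worst_case_block_indices(j0, c, C0, t_max, m):
    """Values j(n_{k_l}), l = 0..m, when (208) holds with equality at t = t_max."""
    js = [Fraction(j0)]
    for _ in range(m):
        js.append(js[-1] + c * t_max - C0)
    return js

# Illustrative constants only.
c, C0, t_max = Fraction(2, 5), Fraction(1), 10
k0, j0, m = 3, 2, 50

gamma = (c * t_max - C0) / t_max          # slope gamma from step (2)
C_prime = gamma * k0 - j0                  # constant C' from step (2)
js = worst_case_block_indices(j0, c, C0, t_max, m)
ks = [k0 + l * t_max for l in range(m + 1)]

# The linear lower bound j(n_{k_m}) >= gamma * k_m - C'.
bound_holds = all(j >= gamma * k - C_prime for j, k in zip(js, ks))
```

In this extremal case the bound is attained with equality at every index, so γ cannot be improved without further information on the orbit.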

(8) ⇒ (6)

This is Theorem 7.20.

Proposition 7.28 (Spectral hypotheses imply orbitwise discrepancy vanishing). Assume the spectral hypotheses of Remark 7.6 for the backward Collatz operator P acting on

B_{tree, σ}

, together with the forward growth bounds used in Lemma 5.24 to control Cesàro averages along forward orbits. Let

{(n_{k})}_{k \geq 0}

be any infinite forward Collatz orbit and, for

N \geq 1

, define the Cesàro functionals

Λ_{N} (f) : = \frac{1}{N} \sum_{k = 0}^{N - 1} f (n_{k}), f \in B_{tree, σ} .

Then along any sequence

N_{r} \to \infty

there is a subsequence

N_{r_{j}} \to \infty

and a functional

Λ \in B_{tree, σ}^{*}

such that

Λ_{N_{r_{j}}} \overset{w^{*}}{⟶} Λ

, and every such weak–* limit satisfies:

(i) Λ is $P^{*}$–invariant, that is, $Λ (P f) = Λ (f)$ for all $f \in B_{tree, σ}$;
(ii) the discrepancy averages vanish along the orbit, in the sense that

$lim_{j \to \infty} \frac{1}{N_{r_{j}}} \sum_{k = 0}^{N_{r_{j}} - 1} D (f) (n_{k}) = 0, D (f) : = P f - f \circ T,$

for every $f \in B_{tree, σ}$.

In particular, the Orbitwise Discrepancy Vanishing statement (7) holds with

A = B_{tree, σ}

.

Proof. By Lemma 5.24, which uses the forward growth bounds for Collatz orbits together with the definition of the

B_{tree, σ}

–dual norm, the family

{(Λ_{N})}_{N \geq 1}

is uniformly bounded in

B_{tree, σ}^{*}

:

sup_{N \geq 1} {∥ Λ_{N} ∥}_{B_{tree, σ}^{*}} < \infty .

Hence, by the Banach–Alaoglu theorem, any sequence

N_{r} \to \infty

admits a weak–* convergent subsequence

Λ_{N_{r_{j}}} \to Λ

for some

Λ \in B_{tree, σ}^{*}

.

We now invoke Lemma 7.4 with

A = B_{tree, σ}

, applied to the fixed forward orbit

(n_{k})

and the subsequence

(N_{r_{j}})

. That lemma is a purely dynamical statement about the discrepancy operator

D (f) = P f - f \circ T

and the Cesàro functionals

Λ_{N}

: it asserts that for any weak–* limit

Λ

of

(Λ_{N_{r}})

along a subsequence, one has

(i) $Λ (P f) = Λ (f)$ for all $f \in B_{tree, σ}$, and
(ii) $lim_{j \to \infty} \frac{1}{N_{r_{j}}} \sum_{k = 0}^{N_{r_{j}} - 1} D (f) (n_{k}) = 0$ for all $f \in B_{tree, σ}$.

In particular, every such weak–* limit is

P^{*}

–invariant and satisfies orbitwise discrepancy vanishing for all

f \in B_{tree, σ}

, which is exactly statement (7) with

A = B_{tree, σ}

. This proves the proposition. □

Proposition 7.29 (Spectral rigidity forbids finite–block trapping). Assume the spectral hypotheses of Remark 7.6 and the spectral rigidity statement of Proposition 7.5, which forbids nonzero

P^{*}

–invariant functionals supported on a finite union of blocks. Then there is no infinite forward Collatz orbit

{(n_{k})}_{k \geq 0}

whose trajectory is contained in a finite union of blocks

⋃_{j \leq J} I_{j}

.

Proof. Suppose, for contradiction, that there exists an infinite forward orbit

{(n_{k})}_{k \geq 0}

and an index

J \geq 0

such that

n_{k} \in B_{J} : = ⋃_{j = 0}^{J} I_{j} for all k \geq 0 .

Define the Cesàro functionals

Λ_{N} (f) : = \frac{1}{N} \sum_{k = 0}^{N - 1} f (n_{k}), f \in B_{tree, σ} .

By Proposition 7.28, along some subsequence

N_{r_{j}} \to \infty

the functionals

Λ_{N_{r_{j}}}

converge weakly–* to a nonzero functional

Λ \in B_{tree, σ}^{*}

that is

P^{*}

–invariant:

Λ (P f) = Λ (f) for all f \in B_{tree, σ} .

On the other hand, the assumption

n_{k} \in B_{J}

for all k forces

Λ

to be supported on

B_{J}

. Indeed, let

f \in B_{tree, σ}

be supported in the complement

B_{J}^{c} = ⋃_{j > J} I_{j}

. Then

f (n_{k}) = 0

for all

k \geq 0

, so

Λ_{N} (f) = \frac{1}{N} \sum_{k = 0}^{N - 1} f (n_{k}) = 0 for all N \geq 1 .

Passing to the subsequence

N_{r_{j}}

and using weak–* convergence, we obtain

Λ (f) = 0

for every f supported on

B_{J}^{c}

, hence

Λ

is supported entirely on

B_{J}

.

Thus

Λ

is a nonzero

P^{*}

–invariant functional supported inside the finite union of blocks

B_{J}

, contradicting the rigidity statement of Proposition 7.5. This contradiction shows that no infinite forward orbit can be contained in a finite union of blocks. □

8. Toward a Spectral Calculus for Arithmetic Dynamical Systems

The analytic framework developed here for the backward Collatz operator highlights a broader spectral calculus for discrete arithmetic maps. For any map

T : ℕ \to ℕ

with finitely many inverse branches, one may associate the transfer operator

(P f) (n) = \sum_{m : T (m) = n} \frac{f (m)}{w (m)},

whose spectral behavior reflects the combinatorial and arithmetic structure of T.
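For the Collatz map the preimage sum can be written out concretely. The sketch below assumes the accelerated normalisation T(m) = m/2 for even m and T(m) = (3m+1)/2 for odd m, and treats the weight w as a user-supplied function, since its definition is not restated in this section. The names `preimages` and `transfer` are hypothetical.

```python
from fractions import Fraction

def preimages(n: int) -> list[int]:
    """All m >= 1 with T(m) = n for the accelerated map T (assumed normalisation)."""
    pre = [2 * n]                        # even branch: T(2n) = n
    if n % 3 == 2:                       # odd branch: (3m + 1)/2 = n  =>  m = (2n - 1)/3
        pre.append((2 * n - 1) // 3)     # automatically odd, since 2n - 1 is odd
    return pre

def transfer(f, w, n: int) -> Fraction:
    """(P f)(n) = sum over T(m) = n of f(m) / w(m); w is a placeholder weight."""
    return sum((Fraction(f(m)) / w(m) for m in preimages(n)), Fraction(0))
```

For example, with f ≡ 1 and the placeholder weight w(m) = m, the value at n = 2 collects the two preimages 4 and 1 and equals 1/4 + 1 = 5/4.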

When P acts on weighted sequence spaces such as

ℓ_{σ}^{1}

or on the multiscale tree space

B_{tree, σ}

, its Dirichlet transform intertwines

D (P f) (s) = L_{s} D (f) (s), D (f) (s) = \sum_{n \geq 1} f (n) n^{- s},

so that spectral information for P passes directly to the analytic continuation and pole structure of the complex family

L_{s}

. In this duality, the arithmetic operator P and its analytic realization

L_{s}

represent two facets of a single dynamical mechanism: backward iteration in arithmetic space mirrored by analytic continuation in Dirichlet space.

For quasi–compact operators satisfying Lasota–Yorke inequalities on

B_{tree, σ}

, one obtains the spectral decomposition

P = \sum_{| λ_{i} | > ρ_{ess} (P)} λ_{i} Π_{i} + N, ρ_{ess} (P) < 1,

together with the associated dynamical zeta function

ζ_{P} (s) = det {(I - s P)}^{- 1} = exp (\sum_{k \geq 1} \frac{s^{k}}{k} Tr (P^{k})),

whose poles coincide with eigenvalues of P outside the essential spectrum and with the resonant singularities of

L_{s}

. This creates a coherent analytic picture in which resolvents, spectral projections, Dirichlet envelopes, and dynamical determinants arise as aspects of the same operator geometry.
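The determinant formula has an elementary finite-dimensional analogue that can be checked directly. In the sketch below a hypothetical 2×2 matrix A stands in for P; for |s| smaller than the reciprocal of the spectral radius of A, the truncated trace series should agree with 1/det(I − sA). This illustrates the identity only and makes no claim about the infinite-dimensional operator.

```python
import math

def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def zeta_via_traces(A, s: float, terms: int = 200) -> float:
    """exp(sum_{k=1}^{terms} s^k / k * Tr(A^k)) for a 2x2 matrix A."""
    total, power = 0.0, [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms + 1):
        power = mat_mul(power, A)
        total += s ** k / k * (power[0][0] + power[1][1])
    return math.exp(total)

def zeta_via_det(A, s: float) -> float:
    """1 / det(I - sA), using det(I - sA) = 1 - s Tr(A) + s^2 det(A) in dimension 2."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return 1.0 / (1.0 - s * tr + s * s * det)

A = [[0.5, 0.25], [0.125, 0.25]]   # illustrative matrix, not derived from P
s = 0.5
```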

Beyond the Collatz operator, analogous structures arise for general affine–congruence systems

n ⟼ a_{j} n + b_{j}, a_{j}, b_{j} \in ℕ,

for which

(P f) (m) = \sum_{j} 1_{{m \equiv b_{j} (mod a_{j})}} f (\frac{m - b_{j}}{a_{j}}) .

The corresponding Dirichlet transforms

L_{s}

act by weighted composition on generating series. A unified spectral calculus would classify such arithmetic systems according to whether their backward operators are quasi–compact, admit meromorphic decompositions, or exhibit a spectral gap on suitable Banach geometries. This analytic classification parallels the dynamical trichotomy into terminating, periodic, and divergent regimes.
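The affine–congruence operator can be evaluated pointwise from its list of branches. The sketch below is a direct transcription of the displayed formula, with one added assumption: arguments are required to lie in {1, 2, ...}, so a branch contributes only when (m − b_j)/a_j ≥ 1. The function name and the sample branches are hypothetical.

```python
def affine_transfer(f, branches, m: int):
    """(P f)(m) = sum over branches (a, b) with m ≡ b (mod a) of f((m - b) / a)."""
    total = 0
    for a, b in branches:
        # Congruence condition, plus the assumption that arguments stay >= 1.
        if (m - b) % a == 0 and m - b >= a:
            total += f((m - b) // a)
    return total

# Illustrative system with branches n -> 2n and n -> 3n + 1.
branches = [(2, 0), (3, 1)]
identity = lambda n: n
```

With these branches, `affine_transfer(identity, branches, 4)` adds the contributions f(2) and f(1), one from each branch.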

In the Collatz case, the results of this paper yield a complete spectral resolution of the backward dynamics. The operator P and its Dirichlet realization

L_{s}

together provide a model of an arithmetic transfer operator in which analytic continuation, spectral gaps, and decay of correlations follow from explicit Lasota–Yorke estimates on

B_{tree, σ}

. The contraction of

L_{s}

for

ℜ (s) > 1

, combined with the bound

λ_{LY} < 1

on

B_{tree, σ}

, ensures that P is quasi–compact with a strict spectral gap. Consequently, the associated dynamical Dirichlet series admit uniform pole–remainder decompositions, and the invariant density exhibits an averaged

1 / n

law: its block averages satisfy

c_{j} = \frac{1}{6^{j}} \sum_{n \in I_{j}} h (n) = \frac{c}{6^{j}} + o (6^{- j}),

which corresponds to the mass distribution behaving like

c / n

to leading order on each block

I_{j}

.
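The link between a 1/n profile and block averages of order 6^{-j} can be seen numerically. The sketch below uses the model profile h(n) = 1/n (not the actual invariant density) and assumes the block convention I_j = [6^j, 6^{j+1}); under these assumptions the normalised averages 6^j c_j approach log 6.

```python
import math

def block_average(h, j: int) -> float:
    """c_j = 6^{-j} * sum of h(n) over n in I_j = [6^j, 6^{j+1}) (assumed blocks)."""
    lo, hi = 6 ** j, 6 ** (j + 1)
    return sum(h(n) for n in range(lo, hi)) / 6 ** j

# Normalised block averages 6^j * c_j for the model profile h(n) = 1/n.
normalised = [6 ** j * block_average(lambda n: 1.0 / n, j) for j in range(5)]
limit = math.log(6)
```

For this model profile the constant c in the displayed asymptotic is therefore log 6; the corresponding constant for the true density h is not determined by this computation.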

Boundary spectral geometry and parameter optimization

Theorems 4.17 and 4.1 show that the Lasota–Yorke inequality on

B_{tree}

yields a strict spectral gap at the boundary

σ = 1

. A natural next step is to optimize the parameters

(α, ϑ)

defining the tree seminorm and to determine whether

B_{tree}

is minimal or universal among Banach geometries admitting contraction. A quantitative study of

{∥ P f ∥}_{tree} \leq C_{P} (λ {| f |}_{tree} + {∥ f ∥}_{1})

may reveal how

λ

depends on

ϑ

and how this dependence reflects asymmetries in the Collatz preimage tree. Showing that

λ (ϑ) \to 0

as

ϑ \to 0

would relate analytic contraction rates to the combinatorial entropy of inverse trajectories.

Residues, duality, and forward–backward correspondence

The residue coefficients

A_{k} (1)

, which decay geometrically as

λ^{k}

, represent spectral invariants of the pole part of the dynamical Dirichlet zeta function. On the forward side, the heuristic contraction

{(3 / 4)}^{k}

describes the average shrinkage of integers under iteration. A precise duality between these quantities would connect analytic and probabilistic aspects of the dynamics, expressing average stopping times and fluctuations in terms of the spectral radius of a normalized backward operator. Such a correspondence would yield a forward–backward conservation principle linking termination statistics with spectral invariants.
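One standard way to read the (3/4)^k heuristic is through the odd-to-odd (Syracuse) normalisation, in which an odd n is sent to (3n+1)/2^a with a = v_2(3n+1), so that one step multiplies n by roughly 3/2^a. The sketch below, which assumes that normalisation, estimates the mean valuation empirically; a mean close to 2 gives the geometric-mean multiplier 3/2^2 = 3/4.

```python
def v2(n: int) -> int:
    """2-adic valuation of a positive integer n."""
    v = 0
    while n % 2 == 0:
        n //= 2
        v += 1
    return v

def mean_valuation(limit: int) -> float:
    """Average of v2(3n + 1) over odd n in [1, limit)."""
    odds = range(1, limit, 2)
    return sum(v2(3 * n + 1) for n in odds) / len(odds)

mean_a = mean_valuation(2 ** 16)
# Geometric-mean multiplier of one odd-to-odd step under this heuristic.
heuristic_multiplier = 3 / 2 ** mean_a
```

This is a statement about averages over initial values, not about any individual orbit.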

Extensions and universality

The multiscale tree space equipped with a hybrid

ℓ^{1}

–oscillation norm provides a flexible analytic environment for nonlinear integer maps. Future work may examine metric entropy, measure concentration, and universality phenomena induced by the tree geometry, seeking optimal weights or identifying extremal systems among those with

λ < 1

. Such analysis would illuminate how nonlinear arithmetic recursions embed naturally into Banach geometries enforcing global contraction.

Dynamical Dirichlet zeta functions

The functions

ζ_{C} (s, k) = \sum_{n \geq 1} \frac{1}{{(C^{k} (n))}^{s}}

belong to a broader class of dynamical Dirichlet zeta functions

ζ_{T} (s, k)

associated with iterates of arithmetic maps with finitely many inverse branches. Spectral gaps govern their meromorphic structure, and residues of their poles capture dynamical invariants. Extending this analysis to more general systems would connect the present framework with Ruelle–Perron–Frobenius theory and the analytic structure of dynamical determinants.
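Partial sums of these series are straightforward to compute. The sketch below truncates the sum at n ≤ M and takes C to be the accelerated Collatz map, which is an assumption about the normalisation; for k = 0 it reduces to a partial sum of the Riemann zeta function. Convergence of the full series for k ≥ 1 is not addressed by this computation.

```python
def collatz_step(n: int) -> int:
    """Accelerated Collatz map (assumed normalisation of C)."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

def iterate(n: int, k: int) -> int:
    """C^k(n), the k-fold iterate."""
    for _ in range(k):
        n = collatz_step(n)
    return n

def zeta_partial(s: float, k: int, M: int) -> float:
    """Truncation of zeta_C(s, k): sum over 1 <= n <= M of C^k(n)^(-s)."""
    return sum(iterate(n, k) ** (-s) for n in range(1, M + 1))
```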

Broader outlook

The spectral resolution of the Collatz dynamics developed here suggests a general spectral calculus for arithmetic dynamics in which termination, recurrence, and periodicity correspond to spectral features of noninvertible operators on Banach spaces of arithmetic functions. Future work should clarify how universal the Lasota–Yorke mechanism is for nonlinear arithmetic recursions, how arithmetic symmetries influence spectral gaps, and how probabilistic models of integer iteration emerge as weak limits of deterministic transfer operators. The Collatz operator studied here provides a detailed example in which a complete spectral description is obtained through explicit Lasota–Yorke estimates on a multiscale Banach space.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
