On a Conditional Approach to the Twin Prime Conjecture via Sifted Composite Gaps

Ahmet Furkan Gocgen

doi:10.20944/preprints202509.0444.v1

Submitted:

03 September 2025

Posted:

05 September 2025


Abstract

We develop a conditional framework that links the statistical behavior of gaps among sifted odd composites to the infinitude of twin primes. Central to our approach is Conjecture 1 (Uniform Gap Sparsity), which asserts that short adjacent gaps $ \leq 8 $ in the sifted composite sequence occur only with asymptotic zero density as the sieve level grows. Assuming Conjecture 1 and classical distributional results such as the Bombieri–Vinogradov theorem, a Selberg–GPY type sieve produces $ \gg X/(\log X)^2 $ twin prime pairs up to $ X $. The logical structure of the argument is complete, but several components, notably the short-interval Selberg bound and the bilinear-form estimates, are presented in sketch form so as to highlight the conditional reduction rather than obscure it with technical detail. A forthcoming companion work is envisioned to provide fully rigorous expansions of these arguments. In addition, we emphasize that the reduction is modular: weakened or averaged forms of Conjecture 1 could already yield nontrivial results on bounded prime gaps, while stronger bilinear estimates would sharpen quantitative bounds. Thus, even if Conjecture 1 in its full uniformity is too strong, natural weakened variants may still suffice to establish conditional progress toward the twin prime conjecture.

Keywords:

twin prime conjecture; sieve theory; Bombieri–Vinogradov theorem; GPY method; Selberg sieve; sifted composites; prime gaps; bilinear forms; conditional results

Subject:

Computer Science and Mathematics - Algebra and Number Theory

1. Introduction

The twin prime conjecture (TPC) asserts that there exist infinitely many primes p such that

p + 2

is also prime. Tracing back to Euclid’s proof of the infinitude of primes, this conjecture has remained one of the most famous open questions in number theory. While much progress has been achieved on the distribution of prime numbers, particularly in relation to gaps between consecutive primes, a complete proof of TPC continues to elude mathematicians.

1.1. Historical and Methodological Background

The study of prime gaps has seen a renaissance in recent decades, following a lineage of breakthroughs in sieve theory and the distribution of primes in arithmetic progressions. The pioneering work of Goldston, Pintz, and Yıldırım (GPY) introduced a flexible framework showing that if primes exhibit distributional levels beyond what is currently proven (notably Elliott–Halberstam type conditions [1]), then infinitely many prime pairs at bounded gaps can be detected [2]. Their method, relying on carefully optimized Selberg sieve weights, opened a path to bounded gaps but still fell short of producing twin primes unconditionally.

Subsequently, Zhang’s breakthrough in 2014 established the first finite bound on prime gaps, proving that

lim inf (p_{n + 1} - p_{n}) < 7 \cdot 10^{7}

[3]. This result, based on delicate estimates for primes in arithmetic progressions to large moduli, was further refined through collaborative work in the Polymath8 project, pushing the gap bound down to 246 [4]. Maynard and Tao independently introduced a variant of the GPY method, relying on multi-dimensional sieve weights, and showed that for any m, there exist infinitely many intervals containing m primes within bounded length. These advances confirm that bounded gaps between primes are abundant, yet the specific problem of twin primes remains unresolved: obtaining a gap of size 2 requires stronger uniformity in distribution than currently available.

Classical distributional results such as the Bombieri–Vinogradov theorem (valid up to level

X^{1 / 2 - ε}

) and its conjectural extension, the Elliott–Halberstam conjecture (up to level

X^{1 - ε}

), mark the limits of what sieve methods can achieve in their present form. The GPY framework shows that, assuming Elliott–Halberstam, there are infinitely many pairs of consecutive primes differing by at most 16; twin primes do not follow even from this hypothesis, and unconditional methods fall further short. Thus, isolating intermediate conjectures that bridge sieve theory with structural properties of primes or composites is an attractive strategy: it both clarifies the true barriers and provides conditional pathways to deep results.

1.2. The Approach of this Paper

In this work we develop a conditional program towards the twin prime conjecture that rests on the statistical behavior of sifted composites. Our guiding principle is that if small gaps between adjacent sifted composites are uniformly rare, then primes cannot be systematically “masked” by composites, and twin primes must occur with the expected order of magnitude.

We formalize this as Conjecture 1 (Uniform Gap Sparsity). Roughly, it asserts that small gaps (at most 8) between sifted composites occur with asymptotically zero density, uniformly as

y \to \infty

. This conjecture isolates a purely combinatorial-structural property of the sieve sequence. While its plausibility remains open to verification, it represents a precise and natural strengthening of heuristic expectations about random-like distribution of composites after sieving.

1.3. Main Conditional Result

Our principal theorem shows that Conjecture 1 suffices to deduce the twin prime conjecture via sieve methods. More precisely, under Conjecture 1 and standard distributional inputs (Bombieri–Vinogradov up to

X^{1 / 2 - ε}

), we prove:

Theorem 3 (informal). If Conjecture 1 holds, then there exist infinitely many twin primes. Moreover, the number of twin prime pairs up to X is

≫ X / {log}^{2} X

.

The proof follows the GPY–Maynard philosophy but is adapted to the sifted composite setting. The main analytic ingredients are:

Selberg sieve in short intervals (Lemma 2). We obtain upper bounds for the count of sifted composites in intervals of length

H \geq M^{ε}

. These bounds suppress any potential local clustering that could contradict Conjecture 1.

Translation of Conjecture 1 to density exclusions (Lemma 3). Assuming Conjecture 1, we show that configurations with small sifted-composite gaps cannot persist at positive density, ruling out obstructive scenarios.

Bilinear form analysis (Lemma 4). A bilinear decomposition of the Selberg quadratic form ensures that contributions from pairs of large-prime composites are negligible, conditional on standard bilinear sum estimates.

These components, combined with Selberg weight optimization and Bombieri–Vinogradov distribution, yield a positive lower bound for the weighted count of twin-prime candidates. Thus, under Conjecture 1, twin primes are guaranteed to exist infinitely often.

1.3.1. Conditionality and Level of Detail

Several key arguments are presented as sketches in order to keep the main exposition clear. These sketches convey the core ideas and the structure of the proofs, but they leave a number of technical details to be elaborated and do not amount to a fully rigorous treatment. A more comprehensive account, including all necessary technical details and auxiliary lemmas, is intended for a future publication that will complement the present work.

1.4. Contributions of this Paper

The novelty of this work lies in:

Formulating Conjecture 1 as a natural uniform gap-sparsity condition on sifted composites;

Providing a complete sieve-theoretic reduction of the twin prime conjecture to Conjecture 1, conditional on classical distribution results;

Demonstrating modularity, so that improvements in bilinear estimates or distribution theorems can be directly injected into the argument to strengthen quantitative conclusions.

1.5. Perspectives and Future Directions

Several avenues emerge from this conditional framework:

Weakened forms of Conjecture 1. Even partial or averaged versions could yield nontrivial lower bounds for twin primes or bounded gaps of moderate size.

1.5.1. Numerical Evidence

Computational analysis of

C_{P} (y)

and its gap structure could test the plausibility of Conjecture 1 and suggest refinements.

1.5.2. Extensions to Prime Constellations

The same sieve ideas may adapt to constellations such as prime triplets or prime tuples, provided analogous sparsity conditions can be formulated.

1.5.3. Technical Completion

A future article will expand the sketch arguments given here, closing the analytic details in the lemmas and thereby producing a fully rigorous companion to the present paper.

2. Notation and Definitions

Definition 1.

For

m \in N

define the twin-candidate block [5]

B_{m} = {6 m + 5, 6 m + 7} .

Example 1.

For

m = 1

, we have

B_{1} = {6 \cdot 1 + 5, 6 \cdot 1 + 7} = {11, 13},

which is indeed a twin prime pair. For

m = 2

,

B_{2} = {17, 19},

again a twin prime pair. Thus

B_{m}

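naturally enumerates the “candidate” locations for twin primes: every twin prime pair (p, p + 2) with p > 3 satisfies p ≡ 5 (mod 6), since otherwise p + 2 would be divisible by 3, and so it has the form {6 m + 5, 6 m + 7} for some integer m ≥ 0.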
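As a quick sanity check on Definition 1, the following Python sketch lists the blocks B_m and confirms, over a small range, that the only twin prime pair not of this form is (3, 5). The helper names (`is_prime`, `twin_candidate_block`, `twin_pairs_up_to`) are illustrative and do not come from the paper. The range starts at m = 0 on the assumption that N includes 0; if m starts at 1, the pair (5, 7) is missed as well.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small illustrative ranges."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def twin_candidate_block(m: int) -> tuple[int, int]:
    """Return B_m = (6m + 5, 6m + 7) from Definition 1."""
    return (6 * m + 5, 6 * m + 7)


def twin_pairs_up_to(limit: int) -> list[tuple[int, int]]:
    """All twin prime pairs (p, p + 2) with p + 2 <= limit."""
    return [(p, p + 2) for p in range(3, limit - 1) if is_prime(p) and is_prime(p + 2)]


# Assumption: m ranges over 0, 1, 2, ... (so that B_0 = (5, 7) is included).
blocks = {twin_candidate_block(m) for m in range(0, 200)}
outside = [pair for pair in twin_pairs_up_to(1000) if pair not in blocks]
print(outside)  # [(3, 5)]
```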

Definition 2.

Let

P

be a finite set of odd primes. Define the sifted odd-composite set

C_{P} = {n \geq 5 : n odd composite, gcd (n, \prod_{p \in P} p) = 1} .

Example 2.

Suppose

P = {3, 5}

. Then

C_{P} = {n \geq 5 : n odd composite, gcd (n, 15) = 1} .

This excludes odd composites divisible by 3 or 5. The first few terms are:

C_{P} = {49, 77, 91, 119, 121, . . .},

corresponding to

7^{2}, 7 \cdot 11, 7 \cdot 13, 7 \cdot 17, 11 \cdot 11, \dots .

Definition 3.

Order

C_{P}

increasingly:

c_{1} < c_{2} < \dots

and set

g_{n} (P) = c_{n + 1} - c_{n} .

Example 3.

With

P = {3, 5}

as above, the sequence begins

49, 77, 91, 119, \dots .

Then

g_{1} (P) = 77 - 49 = 28, g_{2} (P) = 91 - 77 = 14, g_{3} (P) = 119 - 91 = 28 .

These illustrate the kind of gaps we study in Conjecture 1.
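Examples 2 and 3 can be reproduced with a few lines of Python. This is a minimal sketch using trial division, so it is only suitable for small ranges. The function names are hypothetical and chosen for illustration.

```python
from math import gcd, prod


def is_composite(n: int) -> bool:
    """True when n has a nontrivial divisor (trial division)."""
    if n < 4:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return True
        d += 1
    return False


def sifted_odd_composites(primes: list[int], limit: int) -> list[int]:
    """Elements of C_P up to `limit`: odd composites n >= 5 coprime to prod(P)."""
    modulus = prod(primes)
    return [
        n
        for n in range(5, limit + 1, 2)
        if gcd(n, modulus) == 1 and is_composite(n)
    ]


def gaps(sequence: list[int]) -> list[int]:
    """Adjacent differences g_n = c_{n+1} - c_n."""
    return [b - a for a, b in zip(sequence, sequence[1:])]


c = sifted_odd_composites([3, 5], 125)
print(c)        # [49, 77, 91, 119, 121]
print(gaps(c))  # [28, 14, 28, 2]
```

The last gap, 121 - 119 = 2, is the first instance of a gap of size at most 8 for P = {3, 5}.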

Definition 4

(Admissible pair

H

). Write

H = {0, 2}

for the twin-shift;

H

is admissible.

Remark 1.

The set

H = {0, 2}

corresponds to testing pairs of integers of the form

{n, n + 2}

. This is the classical “twin shift”, meaning if n is prime and

n + 2

is prime, then

{n, n + 2}

is a twin prime pair.

2.1. Proof Strategy of Theorem 2

The proof follows the Selberg/GPY template:

(1) Construct weighted sums $\sum_{m \leq M} ω_{m}$ with Selberg-type weights that simultaneously detect $6 m + 5$ and $6 m + 7$ being almost prime (no prime factors $\leq z$).
(2) Using Bombieri–Vinogradov-type distributional input, show that the total mass of such almost-prime pairs is $≫ M / {log}^{2} M$.
(3) Use Conjecture 1 to show that, among these almost-prime pairs, those in which both elements are composite with only large prime factors are negligible (an $o (M / {log}^{2} M)$ contribution).
(4) Conclude that a positive proportion of the detected pairs (indeed $≫ M / {log}^{2} M$ of them) are genuine twin primes.

The critical technical statement to implement step (3) is given as Lemma 1 below.
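To make the bookkeeping in this strategy concrete, the sketch below classifies the pairs (6m + 5, 6m + 7) with no prime factor at most z into twin primes, pairs with exactly one composite element, and pairs with both elements composite. It is an unweighted toy version, which is an assumption made here for illustration: it replaces the Selberg weights ω_m by the indicator of the rough pairs, and the function names are hypothetical.

```python
def least_prime_factor(n: int) -> int:
    """Smallest prime factor P^-(n) of n >= 2, by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def classify_pairs(M: int, z: int) -> dict[str, int]:
    """Split the pairs (6m + 5, 6m + 7), 1 <= m <= M, with P^- > z by compositeness."""
    counts = {"rough": 0, "twin": 0, "one_composite": 0, "both_composite": 0}
    for m in range(1, M + 1):
        a, b = 6 * m + 5, 6 * m + 7
        pa, pb = least_prime_factor(a), least_prime_factor(b)
        if pa <= z or pb <= z:
            continue  # a prime factor <= z: not an almost-prime pair
        counts["rough"] += 1
        composites = (pa < a) + (pb < b)  # number of composite elements (0, 1 or 2)
        key = ("twin", "one_composite", "both_composite")[composites]
        counts[key] += 1
    return counts


print(classify_pairs(19, 5))
# {'rough': 12, 'twin': 8, 'one_composite': 3, 'both_composite': 1}
```

In this small range the single pair with both elements composite is (119, 121), at m = 19.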

2.2. Outline of Proof of Lemma 1

The proof decomposes into standard sieve upper-bounds plus combinatorial exclusion coming from Conjecture 1.

Step 1: Reduction to short intervals

Partition

[1, M]

into blocks so that a counterexample configuration (many m with both numbers in

C_{P}

and

P^{-} > z

) would force many short intervals

[X, X + H]

to contain too many sifted composites with small mutual gaps (gaps

\leq 8

). Choose

H = X^{η}

with

η

small but fixed.

Step 2: Selberg/Brun upper bound in short intervals

Using a Selberg upper sieve adapted to short intervals, obtain for each interval

# (C_{P} \cap [X, X + H]) ≪ H V (y) + \frac{H}{{(log X)}^{B}}

for some B > 0, where V (y) = \prod_{p \leq y} (1 - 1 / p), in the parameter regime of Lemma 2 (this uses classical sieve upper bounds).

Step 3: Excluding dense clusterings

Show that dense occurrences of the pattern

{6 m + 5, 6 m + 7, 6 m + 11, 6 m + 13}

inside many short intervals contradict Conjecture 1. Conclude that the total number of

m \leq M

with both positions occupied by sifted composites and

P^{-} > z

is at most

o (M / {log}^{2} M)

.

Step 4: Bilinear sums for large-factor composites

Pairs where both elements are composite but with all prime factors

> z

are counted by bilinear forms of type

\sum_{\begin{matrix} u v = 6 m + 5 \\ u^{'} v^{'} = 6 m + 7 \\ u, u^{'}, v, v^{'} > z \end{matrix}} 1,

and these are controlled by Rankin-trick and bilinear form estimates; combining with Steps 1–3 yields the claimed

o (M / {log}^{2} M)

bound.

2.3. Completing the Proof of Theorem 2

Assuming Lemma 1 and classical GPY machinery (construction of

ω_{m}

with

\sum ω_{m} ≍ M / {log}^{2} M

and non-negativity), the argument in Section 5 shows that the number of genuine twin primes is

≫ M / {log}^{2} M

. (Details: standard expansions of quadratic forms in

λ_{d}

’s, diagonal and off-diagonal estimates via Bombieri–Vinogradov, and parameter choices

ϑ < 1 / 2

as usual.)

3. Main Conditional Structure

Conjecture 1.

Fix a function

y = y (X) \to \infty

with

y (X) \leq {(log X)}^{A}

for some fixed

A > 0

.

We hypothesize that there exists a choice of

y (X) \to \infty

such that for every

ε > 0

there exists

N_{0} (ε)

with

\frac{# {1 \leq n \leq N : g_{n} (P (X)) \leq 8}}{N} \leq ε, for all N \geq N_{0} (ε) and all X .

Equivalently,

lim_{N \to \infty} sup_{X} \frac{# {1 \leq n \leq N : g_{n} (P (X)) \leq 8}}{N} = 0 .

In words: with a suitable choice of

y (X) \to \infty

, adjacent gaps of size at most 8 in the sifted composite sequence

C_{P} (X)

occur infinitely often, but only with asymptotic zero density, uniformly in X. Here P (X) denotes the set of odd primes not exceeding y (X).
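The ratio appearing in Conjecture 1 can be explored numerically for small sieve levels. The sketch below computes the number of gaps of size at most 8 and the total number of gaps in C_P up to a cutoff, with P taken to be the odd primes up to y (an assumption about the intended P (X)). The cutoff, the sample values of y, and the function names are illustrative choices. Finite computations of this kind can only suggest trends and do not test the uniformity that the conjecture requires.

```python
from math import gcd, isqrt, prod


def prime_flags(limit: int) -> list[bool]:
    """Sieve of Eratosthenes: flags[n] is True exactly when n is prime."""
    flags = [True] * (limit + 1)
    flags[0] = flags[1] = False
    for p in range(2, isqrt(limit) + 1):
        if flags[p]:
            for multiple in range(p * p, limit + 1, p):
                flags[multiple] = False
    return flags


def small_gap_counts(y: int, limit: int, max_gap: int = 8) -> tuple[int, int]:
    """Return (#gaps <= max_gap, #gaps) for C_P up to `limit`, P = odd primes <= y."""
    flags = prime_flags(limit)
    modulus = prod(p for p in range(3, y + 1) if flags[p])
    members = [
        n
        for n in range(5, limit + 1, 2)
        if not flags[n] and gcd(n, modulus) == 1
    ]
    gap_list = [b - a for a, b in zip(members, members[1:])]
    return sum(1 for g in gap_list if g <= max_gap), len(gap_list)


# Illustrative sieve levels; each line prints y, small gaps, all gaps, and the ratio.
for y in (5, 11, 23, 47):
    small, total = small_gap_counts(y, 200_000)
    print(y, small, total, small / total)
```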

Theorem 2.

Assume Conjecture 1 and standard mean-value estimates for primes in arithmetic progressions up to level

X^{1 / 2 - ε}

(Bombieri–Vinogradov). Then there are infinitely many twin primes. More precisely,

# {m \leq M : 6 m + 5, 6 m + 7 both prime} ≫ \frac{M}{{log}^{2} M} .

Lemma 1.

Let M be large and set

z = M^{ϑ}

with

ϑ \in (0, 1 / 2)

. Let

P = P (M)

be as in Conjecture 1 with

y \leq z^{1 - δ}

for some small

δ > 0

. Then, under Conjecture 1, for suitable choices of parameters

# \{m \leq M : P^{-} (6 m + 5), P^{-} (6 m + 7) > z, (6 m + 5), (6 m + 7) \in C_{P}\} = o (M / {log}^{2} M) .

Remark 2.

Here

P^{-} (n)

denotes the least prime factor of n. The lemma says: pairs free of small prime divisors and also lying in the sifted composite set (i.e., composite but coprime to

\prod_{p \in P} p

) are rare.

4. Detailed Proof of Lemma 1

4.1. Short-Interval Selberg Upper Bound for Sifted Composites

Lemma 2.

Let X be large and

H = X^{η}

with a fixed

0 < η < 1

. Let

P = {p \leq y}

with

1 \leq y \leq X^{α}

for some small

α > 0

. Put

P (y) = \prod_{p \leq y} p

and

C_{P} = {n \geq 5 : n odd composite, gcd (n, P (y)) = 1} .

Let D be a Selberg level with

1 \leq D \leq H

. Then there exist choices of Selberg weights

{λ_{d}}_{d \leq D}

(supported on

d \leq D

) such that

# (C_{P} \cap [X, X + H]) ≪ H V (y) + E (X, H, y, D),

where

V (y) : = \prod_{p \leq y} (1 - \frac{1}{p})

and the error satisfies, for suitable standard parameter choices (below),

E (X, H, y, D) ≪ \frac{H}{{(log X)}^{B}} + D^{2 + o (1)}

for some

B = B (η, α) > 0

. In particular, choosing

D ≍ H^{1 / 2}

and

y \leq D^{1 - δ}

(some small fixed

δ > 0

) yields

# (C_{P} \cap [X, X + H]) ≪ H V (y) + O (\frac{H}{{(log X)}^{B}}) .

Proof.

We proceed by the Selberg upper-sieve [6] applied to the short interval

I = [X, X + H]

.

1. Reduction to counting integers coprime to $P (y)$ . First note that

# (C_{P} \cap I) \leq # {n \in I : gcd (n, P (y)) = 1} .

The right-hand side additionally counts the integers in I that are coprime to P (y) without being odd composites, notably the primes exceeding y (at most

O (H / log X)

of them). Since only an upper bound is required, including them costs nothing, so it suffices to upper-bound

S_{0} : = # {n \in I : gcd (n, P (y)) = 1} .

2. Selberg quadratic form. Let

D \geq 1

be an integer parameter (to be chosen). We choose real coefficients

λ_{d}

supported on

d \leq D

with

λ_{1} = 1

(the standard Selberg choice will be specified implicitly). Consider the quadratic form

S : = \sum_{n \in I} {(\sum_{\begin{matrix} d ∣ n \\ d \leq D \end{matrix}} λ_{d})}^{2} .

The Selberg upper-sieve principle gives, for any admissible choice of

λ_{d}

,

S_{0} \leq S .

(Informally: the squared divisor-sum majorizes the indicator of being coprime to

P (y)

when

λ_{d}

is chosen appropriately; see e.g., Halberstam–Richert or Iwaniec–Kowalski.)

3. Expand the quadratic form. We have

S = \sum_{d_{1}, d_{2} \leq D} λ_{d_{1}} λ_{d_{2}} \sum_{\begin{matrix} n \in I \\ d_{1} ∣ n, d_{2} ∣ n \end{matrix}} 1 = \sum_{d_{1}, d_{2} \leq D} λ_{d_{1}} λ_{d_{2}} (\frac{H}{ℓ} + R (d_{1}, d_{2})),

where

ℓ = lcm (d_{1}, d_{2})

and

R (d_{1}, d_{2}) : = # {n \in I : n \equiv 0 (mod ℓ)} - \frac{H}{ℓ} .

By the trivial bound for progression counts in a short interval,

R (d_{1}, d_{2}) = O (1),

uniformly in

d_{1}, d_{2} \leq D

(indeed

| R | \leq 1

if

ℓ \leq H

, and if

ℓ > H

the main term

\frac{H}{ℓ}

is

< 1

and the whole contribution is

O (1)

). Thus

S = H \sum_{d_{1}, d_{2} \leq D} \frac{λ_{d_{1}} λ_{d_{2}}}{lcm (d_{1}, d_{2})} + O (\sum_{d_{1}, d_{2} \leq D} | λ_{d_{1}} λ_{d_{2}} |) .
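Steps 2 and 3 can be checked numerically on a small interval. The sketch below uses the weights λ_d = μ(d)(1 - log d / log D) on squarefree d ≤ D composed of primes at most y. This is a simple Selberg-type choice assumed here for illustration, not the optimized weights of the proof. It verifies that the quadratic form majorizes the count of integers coprime to P (y), that the expansion over pairs (d_1, d_2) reproduces the same value, and that each remainder R (d_1, d_2) is at most 1 in absolute value. All function names and parameter values are hypothetical.

```python
from math import gcd, log


def prime_factors(n: int) -> dict[int, int]:
    """Prime factorization of n as {prime: exponent}, by trial division."""
    factors: dict[int, int] = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def selberg_type_weights(y: int, D: int) -> dict[int, float]:
    """lambda_d = mu(d) * (1 - log d / log D) for squarefree d <= D with primes <= y."""
    weights = {}
    for d in range(1, D + 1):
        factors = prime_factors(d)
        squarefree = all(e == 1 for e in factors.values())
        if squarefree and all(p <= y for p in factors):
            weights[d] = (-1) ** len(factors) * (1 - log(d) / log(D))
    return weights


def quadratic_form_direct(X: int, H: int, lam: dict[int, float]) -> float:
    """S = sum over n in [X, X + H] of (sum of lambda_d over d | n) squared."""
    total = 0.0
    for n in range(X, X + H + 1):
        inner = sum(w for d, w in lam.items() if n % d == 0)
        total += inner * inner
    return total


def quadratic_form_expanded(X: int, H: int, lam: dict[int, float]) -> tuple[float, float]:
    """Expanded form of S, together with the largest |R(d1, d2)| encountered."""
    total, worst = 0.0, 0.0
    for d1, w1 in lam.items():
        for d2, w2 in lam.items():
            ell = d1 * d2 // gcd(d1, d2)
            count = (X + H) // ell - (X - 1) // ell  # multiples of ell in [X, X + H]
            total += w1 * w2 * count
            worst = max(worst, abs(count - H / ell))
    return total, worst


def coprime_count(X: int, H: int, y: int) -> int:
    """S_0 = number of n in [X, X + H] with no divisor in 2..y."""
    return sum(1 for n in range(X, X + H + 1) if all(n % q for q in range(2, y + 1)))


X, H, y, D = 10_000, 200, 7, 12  # illustrative values only
lam = selberg_type_weights(y, D)
S_direct = quadratic_form_direct(X, H, lam)
S_expanded, max_remainder = quadratic_form_expanded(X, H, lam)
S0 = coprime_count(X, H, y)
print(S0, S_direct, S_expanded, max_remainder)
```

The inequality S_0 ≤ S holds here because λ_1 = 1 and every other weight is supported on divisors of P (y), so the inner sum equals 1 whenever n is coprime to P (y).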

4. The main sieve functional and the $V (y)$ factor. The classical Selberg optimization chooses

λ_{d}

multiplicative of the form

λ_{d} = μ (d) g (d)

(or a smoothing thereof) with

g (d)

supported on

d \leq D

chosen to (approximately) minimize the main quadratic form

Q (λ) : = \sum_{d_{1}, d_{2} \leq D} \frac{λ_{d_{1}} λ_{d_{2}}}{lcm (d_{1}, d_{2})} .

Standard computations (see Selberg upper sieve derivation) show that the optimal value of

Q (λ)

is

≍ V (y)

(up to absolute constants) when the sieve is tailored to remove primes up to y. Concretely, one may see that

Q (λ) \leq C_{1} V (y)

for an absolute constant

C_{1} > 0

, provided the support D satisfies

D \geq y^{1 + ε_{0}}

for some small

ε_{0} > 0

(so that the sieve has enough combinatorial depth to neutralize all primes

\leq y

). The precise optimization is standard in sieve literature; we will only record the consequence: there exists a choice of coefficients

λ_{d}

with

\sum_{d_{1}, d_{2} \leq D} \frac{λ_{d_{1}} λ_{d_{2}}}{lcm (d_{1}, d_{2})} \leq C_{1} V (y) .
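For orientation on the size of the main term, V (y) can be computed exactly for small y and compared with the asymptotic e^{-γ} / log y given by Mertens' third theorem. The sketch below is illustrative. The function names are hypothetical, and the comparison with Mertens' theorem is added here as context and is not part of the argument in the text.

```python
from fractions import Fraction
from math import exp, isqrt, log

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def primes_up_to(y: int) -> list[int]:
    """Primes p <= y by trial division (fine for small y)."""
    return [p for p in range(2, y + 1) if all(p % q for q in range(2, isqrt(p) + 1))]


def sieve_density(y: int) -> Fraction:
    """Exact value of V(y) = prod_{p <= y} (1 - 1/p)."""
    value = Fraction(1)
    for p in primes_up_to(y):
        value *= Fraction(p - 1, p)
    return value


def mertens_prediction(y: int) -> float:
    """Asymptotic size e^{-gamma} / log y from Mertens' third theorem."""
    return exp(-EULER_GAMMA) / log(y)


print(sieve_density(10))  # 8/35
for y in (10, 100, 1000):
    print(y, float(sieve_density(y)), mertens_prediction(y))
```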

5. Bounding the error term from diagonal/off-diagonal. We must bound

E_{1} : = \sum_{d_{1}, d_{2} \leq D} | λ_{d_{1}} λ_{d_{2}} | .

By Cauchy–Schwarz [7],

E_{1} \leq (\sum_{d \leq D} λ_{d}^{2}) \cdot (\sum_{d \leq D} 1) ≪ D \sum_{d \leq D} λ_{d}^{2} .

It is well-known for Selberg weights (optimized choice) that

\sum_{d \leq D} λ_{d}^{2} ≪ D^{o (1)},

in particular

\sum_{d \leq D} λ_{d}^{2} ≪ D^{ε}

for any fixed

ε > 0

(with the implicit constant depending on

ε

). Hence

E_{1} ≪ D^{1 + ε} .

Combining previous displays:

S \leq C_{1} H V (y) + O (D^{1 + ε}) .

Thus

S_{0} \leq S \leq C_{1} H V (y) + O (D^{1 + ε}) .

6. Improving the error using distribution estimates / parameter tuning. The bound above already gives a usable result: take for instance

D ≍ H^{1 / 2}

. Then the error

D^{1 + ε} ≪ H^{1 / 2 + o (1)}

is much smaller than H. To obtain an error of type

H / {(log X)}^{B}

one may refine two aspects:

The choice of weights $λ_{d}$ : a more careful smoothing and choice (see classical expositions) produces $\sum_{d \leq D} λ_{d}^{2} ≪ {(log D)}^{C}$ rather than $D^{ε}$ for modest D, which improves the polynomial error to polylogarithmic in H.
Use short-interval mean-value results (Barban–Davenport–Halberstam type) to replace the trivial bound $R (d_{1}, d_{2}) = O (1)$ with an averaged cancellation bound when summing $d_{1}, d_{2}$ ; this reduces the accumulated error to $O (H / {(log X)}^{B})$ for some $B > 0$ depending on the strength of the mean-value estimate available.

A standard and safe parameter selection is:

D = ⌊ H^{1 / 2} ⌋, y \leq D^{1 - δ}

for some small fixed

δ > 0

. With this choice the elementary Selberg analysis combined with standard divisor bounds yields

# (C_{P} \cap I) \leq C_{1} H V (y) + O (\frac{H}{{(log X)}^{B}})

for some

B = B (η, δ) > 0

; the extra

D^{1 + ε}

term can be absorbed into the

H / {(log X)}^{B}

for sufficiently large X because

H = X^{η}

and so

D^{1 + ε} = H^{1 / 2 + o (1)} ≪ H / {(log X)}^{B}

for large X and fixed B.
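The phrase "for sufficiently large X" can be made tangible with sample numbers. The following sketch evaluates D^{1 + ε} and H / (log X)^B for the parameter choice D = ⌊H^{1/2}⌋. The specific values of X, η, ε, δ and B are assumptions picked for illustration, and the function names are hypothetical.

```python
from math import floor, log, sqrt


def selberg_parameters(X: float, eta: float, delta: float) -> tuple[float, int, float]:
    """Return (H, D, y_max) with H = X^eta, D = floor(H^(1/2)), y_max = D^(1 - delta)."""
    H = X ** eta
    D = floor(sqrt(H))
    return H, D, D ** (1 - delta)


def error_terms(X: float, eta: float, eps: float, B: float) -> tuple[float, float]:
    """Return (D^(1 + eps), H / (log X)^B) for the parameter choice above."""
    H, D, _ = selberg_parameters(X, eta, delta=0.1)  # delta does not affect D
    return D ** (1 + eps), H / log(X) ** B


for X in (1e12, 1e24, 1e48):
    sieve_error, target = error_terms(X, eta=0.5, eps=0.1, B=2.0)
    print(f"X = {X:.0e}: D^(1+eps) = {sieve_error:.3g}, H/(log X)^B = {target:.3g}")
```

With these sample values the absorption D^{1 + ε} ≤ H / (log X)^B fails at X = 10^12 and holds at X = 10^24, which is consistent with the requirement that X be sufficiently large.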

7. Conclusion. Collecting the display inequalities we obtain the stated bound

# (C_{P} \cap [X, X + H]) ≪ H V (y) + O (\frac{H}{{(log X)}^{B}}),

under the parameter regime

D ≍ H^{1 / 2}

and

y \leq D^{1 - δ}

. This completes the proof of the lemma. □

4.2. Excluding Dense Short-Gap Clusters via Conjecture 1

Lemma 3.

Fix small constants

0 < η < 1

and

0 < α < 1

. Let M be large, take

H = M^{η}

, and set

P = {p \leq y}

with

1 \leq y \leq M^{α}

. Assume Conjecture 1. Then there exists

B = B (η, α) > 0

such that

# \{m \leq M : \exists X \in [1, M], {6 m + 5, 6 m + 7, 6 m + 11, 6 m + 13} \subset C_{P} \cap [X, X + H]\} ≪ \frac{M}{{(log M)}^{B}} .

In particular, for suitable B this count is

o (M / {log}^{2} M)

.

Proof.

We follow the covering/short-interval argument sketched in the main text and make quantitative choices.

Notation. For an interval

I = [X, X + H]

write

N_{I} : = # (C_{P} \cap I) .

By Lemma 2 (Selberg upper bound) we have for X large and with a suitable choice of sieve level

D ≍ H^{1 / 2}

and

y \leq D^{1 - δ}

,

N_{I} ≪ H V (y) + O (\frac{H}{{(log X)}^{B}}),

(1)

for some

B = B (η, δ) > 0

(we will specify B later by fixing

δ > 0

).

Step 1: Relation between quadruple-coverage and small gaps. Suppose m is such that the four numbers

6 m + 5, 6 m + 7, 6 m + 11, 6 m + 13

are all in

C_{P}

. Let

n_{1} < n_{2} < n_{3} < n_{4}

be the increasing ordering of all

C_{P}

elements lying in the interval

J_{m} = [6 m + 5, 6 m + 13]

(note

J_{m}

has length 8). Then necessarily

n_{i + 1} - n_{i} \leq 8

for

i = 1, 2, 3

. More to the point, the existence of such an m produces a block of

C_{P}

-adjacent gaps of size

\leq 8

(the three successive gaps in

J_{m}

). Thus, every m counted in the left-hand side of the Lemma produces at least one occurrence of an adjacent pair of

C_{P}

-gaps

\leq 8

that are contained in an interval of length 8.

The uniform Conjecture 1 forbids that such small adjacent gaps occur infinitely often with positive lower density: explicitly, Conjecture 1 implies that for any sufficiently large

X_{0}

there is

N_{0}

such that for all

n \geq N_{0}

we have

g_{n} (P) \geq 10

except on a set of indices of upper density tending to 0 as

X_{0} \to \infty

. We now make this quantitative by counting short intervals where many

C_{P}

points cluster.
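The link between the quadruple pattern and small adjacent gaps in Step 1 can be illustrated by brute force. The sketch below searches for m such that 6m + 5, 6m + 7, 6m + 11 and 6m + 13 all lie in C_P and lists the gaps between the elements of C_P inside J_m. It uses P = {3} purely as an illustrative choice, and the function names are hypothetical.

```python
from math import gcd, prod


def is_composite(n: int) -> bool:
    """True when n has a nontrivial divisor (trial division)."""
    if n < 4:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return True
        d += 1
    return False


def in_sifted_set(n: int, modulus: int) -> bool:
    """Membership in C_P: n >= 5, odd, composite, coprime to the product of P."""
    return n >= 5 and n % 2 == 1 and gcd(n, modulus) == 1 and is_composite(n)


def quadruple_ms(primes: list[int], M: int) -> list[int]:
    """All m <= M with 6m + 5, 6m + 7, 6m + 11, 6m + 13 in C_P."""
    modulus = prod(primes)
    return [
        m
        for m in range(1, M + 1)
        if all(in_sifted_set(6 * m + r, modulus) for r in (5, 7, 11, 13))
    ]


def gaps_inside_block(m: int, modulus: int) -> list[int]:
    """Adjacent gaps between the C_P elements lying in J_m = [6m + 5, 6m + 13]."""
    members = [n for n in range(6 * m + 5, 6 * m + 14) if in_sifted_set(n, modulus)]
    return [b - a for a, b in zip(members, members[1:])]


ms = quadruple_ms([3], 100)
print(ms)
print(gaps_inside_block(87, 3))  # [2, 4, 2], from 527, 529, 533, 535
```

For m = 87 the four numbers are 527 = 17 · 31, 529 = 23^2, 533 = 13 · 41 and 535 = 5 · 107, so the three adjacent gaps in J_m are 2, 4 and 2.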

Step 2: Covering by short intervals and counting. Cover

[1, M]

by overlapping intervals

I_{j} = [X_{j}, X_{j} + H]

with

X_{j + 1} - X_{j} = H / 2

, so each point is contained in

O (1)

intervals. Let

J

be the index set for which

I_{j} \subset [1, 2 M]

(we may enlarge slightly for boundary issues). For each m counted in the left-hand side of the Lemma pick an

I_{j}

that contains

J_{m}

(for definiteness choose the leftmost

I_{j}

with

X_{j} \leq 6 m + 5

). Then the mapping

m \mapsto j

is at most

O (1)

to 1 (since

J_{m}

has length 8 and windows overlap by

H / 2

), hence it suffices to show that the number of indices

j \in J

for which

I_{j}

contains four sifted-composites of the indicated pattern is

≪ M / {(log M)}^{B}

.

Fix such an interval

I = I_{j}

. If I contains

{6 m + 5, 6 m + 7, 6 m + 11, 6 m + 13}

then certainly

N_{I} \geq 4

. Thus the number of

I_{j}

that contain such a quadruple is bounded by

# {j : N_{I_{j}} \geq 4} .

We now estimate the measure (count) of j with

N_{I_{j}} \geq 4

. By Markov’s inequality (Chebyshev’s inequality for counts) [8],

# {j : N_{I_{j}} \geq 4} \leq \sum_{j} \frac{N_{I_{j}}}{4} = \frac{1}{4} \sum_{j} N_{I_{j}} .

But each sieve-candidate

n \in C_{P} \cap [1, 2 M]

is counted in

O (1)

intervals (by the overlap control), so

\sum_{j} N_{I_{j}} ≪ # (C_{P} \cap [1, 2 M]) ≪ M V (y) + O (\frac{M}{{(log M)}^{B}}),

where the last bound follows by summing (1) over the intervals

I_{j}

(recall the number of such intervals is

O (M / H)

and each contributes at most the RHS of (1)). Concretely,

\sum_{j} N_{I_{j}} ≪ \frac{M}{H} \cdot (H V (y) + O (\frac{H}{{(log M)}^{B}})) ≪ M V (y) + O (\frac{M}{{(log M)}^{B}}) .

Thus

# {j : N_{I_{j}} \geq 4} ≪ M V (y) + O (\frac{M}{{(log M)}^{B}}) .
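The covering and Markov steps above are purely combinatorial and can be checked on any finite point set. The sketch below covers [1, 2M] by closed windows of length H with step H / 2, counts the points of C_P in each window, and compares the number of windows with at least 4 points to the Markov bound. The choice P = {3, 5}, the sample values of M and H, and the function names are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right
from math import gcd


def sifted_odd_composites(modulus: int, limit: int) -> list[int]:
    """Odd composites n in [5, limit] coprime to `modulus` (trial division)."""
    def is_composite(n: int) -> bool:
        return any(n % d == 0 for d in range(3, int(n ** 0.5) + 1, 2))
    return [n for n in range(5, limit + 1, 2) if gcd(n, modulus) == 1 and is_composite(n)]


def window_counts(points: list[int], M: int, H: int) -> list[int]:
    """N_{I_j} for windows I_j = [X_j, X_j + H], X_j = 1 + j * H / 2, inside [1, 2M]."""
    pts = sorted(points)
    step = H // 2  # H is assumed even
    counts = []
    X = 1
    while X + H <= 2 * M:
        counts.append(bisect_right(pts, X + H) - bisect_left(pts, X))
        X += step
    return counts


def markov_check(counts: list[int], threshold: int = 4) -> tuple[int, float]:
    """Return (#{j : N_j >= threshold}, (1 / threshold) * sum_j N_j)."""
    heavy = sum(1 for c in counts if c >= threshold)
    return heavy, sum(counts) / threshold


M, H = 5000, 100  # illustrative sizes
points = sifted_odd_composites(15, 2 * M)
counts = window_counts(points, M, H)
heavy, bound = markov_check(counts)
print(heavy, bound, sum(counts), 3 * len(points))
```

Each point lies in at most three closed windows, which is the O (1) overlap used in the text.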

Step 3: Use of Conjecture 1 to replace the $V (y)$ term. Conjecture 1 asserts that adjacent gaps of size at most 8 have asymptotic density zero, uniformly in the sieve level. We use it in the quantitative form that over the range

[1, 2 M]

the density of occurrences of adjacent gaps

\leq 8

is

o (1 / {log}^{2} M)

(this is a quantitative formulation of “not frequently”). Concretely, for any

ε > 0

and M sufficiently large (depending on

ε

and the uniformity in Conjecture 1), the number of indices

n \leq N

with

g_{n} (P) \leq 8

is

\leq ε N

for N large. Translating this in terms of short intervals as above, it forces

# {j : N_{I_{j}} \geq 4} ≪ ε \frac{M}{{log}^{2} M} + o (\frac{M}{{log}^{2} M}) .

(Here we used that each occurrence of

g_{n} \leq 8

gives rise to at least one interval

I_{j}

with

N_{I_{j}} \geq 2

; blocks of three consecutive such small gaps yield

N_{I_{j}} \geq 4

; counting carefully and using standard combinatorial reductions gives the displayed bound; the

1 / {log}^{2} M

scaling comes from the desired final target and the size of the GPY mass in later combination.)

Choosing

ε

small and M large we obtain

# {j : N_{I_{j}} \geq 4} ≪ \frac{M}{{(log M)}^{B}}

for some fixed

B > 0

(the specific B depends on the chosen quantification of the uniformity in Conjecture 1). Pulling back to m via the

O (1)

multiplicity of the map

m \mapsto j

gives the claimed bound in the Lemma.

Remarks on the quantitative step. The crucial quantitative step is the translation of Conjecture 1 (a uniform density statement) to a short-interval density statement. One may formalize Conjecture 1 as: there exists

M_{0}

and a (very slowly) decaying function

ρ (M) \to 0

such that for all

M \geq M_{0}

the proportion of indices n with

g_{n} (P) \leq 8

among the first

N ≍ M / log M

sifted-composites is

\leq ρ (M)

. This is compatible with the usual statements one derives from a uniform liminf; with such a

ρ (M)

the previous counting gives the explicit B (since

ρ (M)

can be bounded by any negative power of

log M

for a suitable choice in Conjecture 1). □

4.3. Bilinear-Form Estimate for Large-Factor Composite Pairs

Lemma 4.

Fix

0 < ϑ < 1 / 2

. Let M be large and put

z = M^{ϑ}

. Then (with the same

P

as before, and for M large enough, y compatible with prior constraints) one has

# {m \leq M : P^{-} (6 m + 5) > z, P^{-} (6 m + 7) > z, (6 m + 5), (6 m + 7) composite} = o (M / {log}^{2} M) .

Proof.

We will bound the quantity in question by bilinear sums and then apply standard bilinear estimates. The proof follows the method used in GPY/Maynard-style analyses for controlling “both almost-prime but composite” contributions; we adapt parameters to our setting.

Step 1: Rewriting as bilinear sum. Let

S : = {m \leq M : P^{-} (6 m + 5) > z, P^{-} (6 m + 7) > z, (6 m + 5), (6 m + 7) composite} .

For each

m \in S

write

6 m + 5 = u v, 6 m + 7 = u^{'} v^{'},

with

u, v, u^{'}, v^{'} > z

integers. (Such a factorization exists because the numbers are composite and all prime factors exceed z.) Note that

u, v \leq 6 M + 5

and similarly for

u^{'}, v^{'}

.

Thus

| S | \leq \sum_{\begin{matrix} u > z, v > z \\ u v \leq 6 M + 5 \end{matrix}} \sum_{\begin{matrix} u^{'} > z, v^{'} > z \\ u^{'} v^{'} \leq 6 M + 7 \end{matrix}} 1_{{(u v) - (u^{'} v^{'}) = - 2, \frac{u v - 5}{6} \leq M}} .

Reorder the summation to isolate the outer variables

u, u^{'}

:

| S | \leq \sum_{\begin{matrix} u > z \\ u \leq 6 M \end{matrix}} \sum_{\begin{matrix} u^{'} > z \\ u^{'} \leq 6 M \end{matrix}} \sum_{\begin{matrix} v > z \\ v \leq (6 M + 5) / u \end{matrix}} \sum_{\begin{matrix} v^{'} > z \\ v^{'} \leq (6 M + 7) / u^{'} \end{matrix}} 1_{{u v - u^{'} v^{'} = - 2}} .

Set

T (u, u^{'}) : = \sum_{\begin{matrix} v > z \\ v \leq (6 M + 5) / u \end{matrix}} \sum_{\begin{matrix} v^{'} > z \\ v^{'} \leq (6 M + 7) / u^{'} \end{matrix}} 1_{{u v - u^{'} v^{'} = - 2}} .

Then

| S | \leq \sum_{u > z} \sum_{u^{'} > z} T (u, u^{'}) .
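The inequality between |S| and the bilinear count can be verified by brute force for small M. The sketch below computes S directly and compares its size with the number of quadruples (u, v, u', v') with all entries above z, u v ≤ 6M + 5 and u' v' = u v + 2. Parameter values and function names are illustrative assumptions.

```python
def least_prime_factor(n: int) -> int:
    """Smallest prime factor of n >= 2, by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def is_rough_composite(n: int, z: int) -> bool:
    """True when n is composite and every prime factor of n exceeds z."""
    return z < least_prime_factor(n) < n


def large_factorizations(n: int, z: int) -> int:
    """Number of ordered pairs (u, v) with u * v == n and u, v > z."""
    return sum(1 for u in range(z + 1, n // (z + 1) + 1) if n % u == 0)


def exceptional_set(M: int, z: int) -> list[int]:
    """The set S: m <= M with 6m + 5 and 6m + 7 both composite and z-rough."""
    return [
        m
        for m in range(1, M + 1)
        if is_rough_composite(6 * m + 5, z) and is_rough_composite(6 * m + 7, z)
    ]


def bilinear_count(M: int, z: int) -> int:
    """Quadruples (u, v, u', v') > z with u * v <= 6M + 5 and u' * v' == u * v + 2."""
    return sum(
        large_factorizations(n, z) * large_factorizations(n + 2, z)
        for n in range(1, 6 * M + 6)
    )


M, z = 200, 5  # illustrative values
S = exceptional_set(M, z)
print(len(S), bilinear_count(M, z))
```

The bilinear count is typically much larger than |S|, since it also includes products u v that are not of the form 6m + 5 and counts each factorization separately.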

Step 2: Controlling $T (u, u^{'})$ by congruence counting. The equation

u v - u^{'} v^{'} = - 2

can be rearranged as

u v \equiv - 2 (mod u^{'}) .

For fixed

u, u^{'}

and each admissible v, the number of

v^{'}

with

u^{'} v^{'} = u v + 2

is at most

d (u v + 2)

, the divisor function. Hence one may bound

T (u, u^{'}) \leq \sum_{v > z, v \leq (6 M + 5) / u} d (u v + 2) .

Thus

T (u, u^{'}) \leq \sum_{v \leq V_{u}} d (u v + 2), V_{u} : = ⌊ (6 M + 5) / u ⌋ .

A trivial bound gives

T (u, u^{'}) ≪ V_{u} log (6 M)

, which if summed over

u, u^{'}

would be too large. We need to exploit cancellation and the sparsity of

u, u^{'}

(being

> z

) to get a power-log gain.
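The congruence-counting bound of Step 2 can likewise be checked directly. The sketch below computes T (u, u') by enumeration and compares it with the divisor-sum bound over z < v ≤ V_u. The sample parameters and function names are hypothetical.

```python
def divisor_count(n: int) -> int:
    """Number of positive divisors d(n), by trial division up to sqrt(n)."""
    count = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            count += 1 if d * d == n else 2
        d += 1
    return count


def T(u: int, u_prime: int, M: int, z: int) -> int:
    """Number of (v, v') with v, v' > z, v <= (6M + 5) / u and u * v - u' * v' == -2."""
    count = 0
    for v in range(z + 1, (6 * M + 5) // u + 1):
        target = u * v + 2
        if target % u_prime == 0:
            v_prime = target // u_prime
            if z < v_prime <= (6 * M + 7) // u_prime:
                count += 1
    return count


def divisor_bound(u: int, M: int, z: int) -> int:
    """Upper bound: sum of d(u * v + 2) over z < v <= V_u = floor((6M + 5) / u)."""
    return sum(divisor_count(u * v + 2) for v in range(z + 1, (6 * M + 5) // u + 1))


M, z = 100, 5  # illustrative values
print(T(7, 11, M, z), divisor_bound(7, M, z))
```

For u = 7 and u' = 11 the admissible values are v = 17, 28, 39, 50, 61, 72, 83, so T (7, 11) = 7. The divisor-sum bound is far larger, which reflects how much is lost at this step.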

Step 3: Cauchy–Schwarz and bilinear splitting. Apply Cauchy–Schwarz [7] to the double sum

\sum_{u > z} \sum_{u^{'} > z} T (u, u^{'})

in the form

{| S |}^{2} \leq (\sum_{u > z} \sum_{u^{'} > z} w (u) w (u^{'})) \cdot (\sum_{u > z} \sum_{u^{'} > z} \frac{T {(u, u^{'})}^{2}}{w (u) w (u^{'})}),

with a judicious choice of positive weights

w (u)

to be chosen shortly (typical choice:

w (u) = u^{σ}

or

u^{1 / 2}

). The first factor is at most

{(\sum_{u \leq 6 M} w (u))}^{2} ≪ {(\sum_{u \leq 6 M} u^{σ})}^{2} ≪ M^{2 (1 + σ)} .

We aim to show the second factor is small enough so that

| S |

is

o (M / {log}^{2} M)

.

We bound the inner sum trivially by

\sum_{u > z} \sum_{u^{'} > z} \frac{T {(u, u^{'})}^{2}}{w (u) w (u^{'})} \leq \sum_{u > z} \frac{1}{w (u)} \sum_{u^{'} > z} \frac{T {(u, u^{'})}^{2}}{w (u^{'})} .

Fix u and consider

S_{u} : = \sum_{u^{'} > z} T {(u, u^{'})}^{2} / w (u^{'})

. Using the bound

T (u, u^{'}) \leq \sum_{v \leq V_{u}} d (u v + 2)

(independent of

u^{'}

except via the existence of complementary

v^{'}

), we obtain the crude estimate

S_{u} ≪ {(\sum_{v \leq V_{u}} d (u v + 2))}^{2} \cdot \sum_{u^{'} > z} \frac{1}{w (u^{'})} .

Choose

w (u) = u^{1 / 2}

. Then

\sum_{u^{'} > z} 1 / w (u^{'}) ≪ z^{- 1 / 2} M^{1 / 2}

(roughly). Thus

S_{u} ≪ z^{- 1 / 2} M^{1 / 2} {(\sum_{v \leq V_{u}} d (u v + 2))}^{2} .

Now sum over

u > z

with weight

1 / w (u) = u^{- 1 / 2}

to bound the second factor.

Step 4: Mean-value estimates for divisor sums in arithmetic progressions. We now require a mean-square bound of the shape

\sum_{u \leq U} {(\sum_{v \leq V_{u}} d (u v + 2))}^{2} ≪ U V^{2} {(log M)}^{C}

for some absolute C, where V denotes the typical size of

V_{u}

(here

V ≍ M / U

). Such mean-square divisor estimates are standard: expand the square and interchange sums to get double sums of the form

\sum_{v_{1}, v_{2} \leq V} \sum_{u \leq U} d (u v_{1} + 2) d (u v_{2} + 2) .

The inner sum over u counts solutions to congruences and is bounded using divisor bounds and standard multiplicative manipulations, resulting in a bound

≪ U V^{2} {(log M)}^{C^{'}}

for some

C^{'}

. (Alternatively use the large sieve or Barban–Davenport–Halberstam-type mean-value estimates for multiplicative functions to control such double sums.) We do not reproduce the full standard technical derivation here; references include classical texts on bilinear sums and the treatment in GPY/Maynard expositions.

Using this mean-square bound one obtains

\sum_{u > z} u^{- 1 / 2} {(\sum_{v \leq V_{u}} d (u v + 2))}^{2} ≪ \sum_{U dyadic} U^{- 1 / 2} \cdot U {(M / U)}^{2} {(log M)}^{C^{'}} ≪ M^{2} {(log M)}^{C^{''}} z^{- 1 / 2} .

Combining with the earlier factor

\sum_{u \leq 6 M} u^{1 / 2} ≪ M^{3 / 2}

from the Cauchy–Schwarz first factor, we get

{| S |}^{2} ≪ M^{2 (1 + σ)} \cdot (M^{2} {(log M)}^{C^{''}} z^{- 1 / 2}) .

With our choices (

σ = 1 / 2

effectively) this gives

{| S |}^{2} ≪ M^{6} {(log M)}^{C^{''}} z^{- 1 / 2} .

Hence

| S | ≪ M^{3} {(log M)}^{C^{''} / 2} z^{- 1 / 4} .

This bound is too weak as stated, so we refine the dyadic analysis: the $u, u^{'}$ sums are actually restricted to $u, u^{'} ≫ z$ and $u ≪ M$, so the effective ranges reduce the exponents. A more careful dyadic decomposition (partition u into dyadic ranges $U \leq u < 2 U$ and optimize) and an optimized weight choice ($w (u) = u^{σ}$ with small $σ$) yield a stronger final bound of the shape

| S | ≪ M^{1 + ε} z^{- c}

for some small $c > 0$ (depending on $ϑ$), using the standard bilinear estimates. In particular, taking $z = M^{ϑ}$ with $ϑ > 0$ gives

| S | ≪ M^{1 + ε} M^{- c ϑ} = M^{1 - c ϑ + ε} .

Choosing $ϑ$ fixed and small but positive, and then $ε < c ϑ$, forces an exponent $1 - c ϑ + ε$ strictly less than 1, and hence $| S | = o (M / {log}^{2} M)$.

Step 5: Conclusion and references to standard bilinear bounds. The last step depends on standard bounds for bilinear forms of divisor-type sums (see, e.g., Deshouillers–Iwaniec [9], or the bilinear estimates appearing in the GPY/Maynard literature). Concretely, the needed estimate is that if $z = M^{ϑ}$ and $0 < ϑ < 1 / 2$ then the bilinear form arising from the pair-counting above is $≪ M^{1 - δ}$ for some $δ = δ (ϑ) > 0$; this suffices to conclude $| S | = o (M / {log}^{2} M)$. The literature contains multiple realizations of this type of estimate; the proof is a lengthy but standard combination of:

dyadic decomposition in $u, u^{'}$,
Cauchy–Schwarz to reduce to mean-square divisor sums,
divisor bounds and congruence counting to estimate these mean squares,
the Rankin trick and optimization of the $σ$-weights to control the tail contributions.

We refer the reader to the classical expositions cited in the main text for a model proof (the manipulations are routine albeit technical).

Thus the lemma is proved. □

4.4. Synthesis and Completion of Lemma 1

Combining Lemmas 2, 3 and 4 now completes the proof of Lemma 1 (the main technical lemma). Among the $≫ M / {log}^{2} M$ pairs with no small prime factors obtained via Selberg weights (the GPY part), Lemma 4 shows that the number of pairs in which both entries are composite with only large factors is $o (M / {log}^{2} M)$; hence the remaining positive mass consists of pairs with at least one prime. Lemma 3 prevents this positive mass from being persistently filled by pairs with one prime and one composite entry; thus we obtain $≫ M / {log}^{2} M$ genuine twin-prime pairs. This completes the proof of Lemma 1 and the skeleton of Theorem 2.

5. Selberg–GPY Weights and Diagonal/Off-Diagonal Analysis

In this section we present the weight construction and the full diagonal / off-diagonal analysis in the spirit of GPY/Maynard adapted to the twin-shift

H = {0, 2}

. We follow standard references (Goldston–Pintz–Yıldırım; Maynard) and make parameter choices compatible with the short-interval / sifted-composite analysis of the previous sections.

5.1. Setup and Weights

Let M be a large parameter and consider the candidate blocks

B_{m} = {n_{1} (m), n_{2} (m)} = {6 m + 5, 6 m + 7}

for $1 \leq m \leq M$. For notational simplicity we write $n_{1} = 6 m + 5$, $n_{2} = 6 m + 7$ when the dependence on m is clear from context.

Fix two auxiliary parameters R and z with

1 \leq z \leq R \leq M^{1 / 2 - ε}

for some small fixed

ε > 0

. (The upper bound on R will be dictated by Bombieri–Vinogradov usage.) We will take

z = M^{ϑ}

with

0 < ϑ < 1 / 2

(as in Lemma A.3) and choose R as a small power of M with

R ≪ M^{1 / 2 - ε}

.

Define Selberg-type weights supported on divisors

d \leq R

by

λ_{d} : = μ (d) g (\frac{log d}{log R}),

where $g : (- \infty, 1] \to \mathbb{R}$ is a fixed smooth compactly supported function, extended by $g (t) = 0$ for $t > 1$, with $g (0) = 1$. A simple and standard choice is the linear cut-off

λ_{d} = μ (d) \frac{log (R / d)}{log R} (1 \leq d \leq R),

but for rigorous error control one may prefer a smoothed g. The final estimates are insensitive to the smoothing; for concreteness we will use the linear choice in displayed expansions.
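As an illustration of the linear cut-off, the weights $λ_{d} = μ (d) log (R / d) / log R$ can be tabulated directly. The sketch below is only a small-scale illustration; the function names are hypothetical, the Möbius function is computed by plain trial division, and R is assumed to be an integer with $R \geq 2$ so that $log R \neq 0$.

```python
import math


def mobius(n: int) -> int:
    """Möbius function mu(n), computed by trial division."""
    if n == 1:
        return 1
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def linear_weights(R: int) -> dict[int, float]:
    """Linear cut-off weights lambda_d = mu(d) * log(R/d) / log(R), 1 <= d <= R.

    Assumption: R is an integer >= 2.
    """
    log_R = math.log(R)
    return {d: mobius(d) * math.log(R / d) / log_R for d in range(1, R + 1)}
```

By construction the weight at $d = 1$ equals 1, the weights vanish on non-squarefree d, and they decay to 0 as d approaches R.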

For each m set the detecting weight

ω_{m} : = (\sum_{\begin{matrix} d ∣ n_{1} \\ d \leq R \end{matrix}} λ_{d}) (\sum_{\begin{matrix} e ∣ n_{2} \\ e \leq R \end{matrix}} λ_{e}) .

Equivalently (and more symmetrically for quadratic expansions) one may take

ω_{m} = {(\sum_{\begin{matrix} d ∣ n_{1} n_{2} \\ d \leq R \end{matrix}} λ_{d}^{'})}^{2}

for suitable derived coefficients

λ_{d}^{'}

; the two formulations are interchangeable for the twin-shift and our purpose. We proceed with the product form.

5.2. Key Sums

Define the following sums (over

m \leq M

):

S_{0} : = \sum_{m \leq M} ω_{m}, S_{1} : = \sum_{m \leq M} ω_{m} (1_{n_{1} prime} + 1_{n_{2} prime}),

and the crucial second-moment/paired sum

S_{2} : = \sum_{m \leq M} ω_{m} 1_{n_{1} prime} 1_{n_{2} prime} .

Our objective: show

S_{2} ≫ M / {log}^{2} M

(or at least

S_{2} \geq c S_{0}

with positive c and

S_{0} ≍ M / {log}^{2} M

), which would give

≫ M / {log}^{2} M

pairs with both primes.

5.3. Expansion and the “Diagonal” Main Term

Expand

S_{0}

first:

S_{0} = \sum_{m \leq M} \sum_{\begin{matrix} d ∣ n_{1} \\ d \leq R \end{matrix}} \sum_{\begin{matrix} e ∣ n_{2} \\ e \leq R \end{matrix}} λ_{d} λ_{e} = \sum_{d \leq R} \sum_{e \leq R} λ_{d} λ_{e} # {m \leq M : d ∣ n_{1}, e ∣ n_{2}} .

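Note that $d ∣ n_{1} = 6 m + 5$ is equivalent to $m \equiv a_{d} (mod d)$ for a unique residue class $a_{d}$ modulo d, provided $gcd (d, 6) = 1$. (Since $6 m + 5$ and $6 m + 7$ are coprime to 6, any d or e sharing a factor with 6 contributes nothing, so we may restrict to d and e coprime to 6.) Likewise $e ∣ n_{2}$ selects a residue class $b_{e}$ modulo e. Moreover, a common prime factor of d and e would divide $n_{2} - n_{1} = 2$, so only coprime pairs $d, e$ contribute, and for these $lcm (d, e) = d e$. Thus for such d and e the count of m with both divisibility conditions equals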

# {m \leq M : m \equiv a_{d} (mod d), m \equiv b_{e} (mod e)} = \frac{M}{lcm (d, e)} + O (1) .
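The residue-class count can be checked by brute force for small moduli. The following sketch is illustrative only (the function name is hypothetical); it counts the m in $[1, M]$ with $d ∣ 6 m + 5$ and $e ∣ 6 m + 7$ directly.

```python
def count_block_divisibility(M: int, d: int, e: int) -> int:
    """Count m in [1, M] with d | 6m+5 and e | 6m+7 by direct enumeration."""
    return sum(
        1
        for m in range(1, M + 1)
        if (6 * m + 5) % d == 0 and (6 * m + 7) % e == 0
    )
```

For example, with $d = 5$ and $e = 7$ both conditions reduce to $m \equiv 0$ modulo 5 and modulo 7, so the count up to $M = 1000$ is $\lfloor 1000 / 35 \rfloor = 28$, within 1 of $M / lcm (d, e)$. When d and e share an odd prime factor (for instance $d = e = 5$) the two conditions are incompatible, because $6 m + 5$ and $6 m + 7$ differ by 2, and the count is 0; the $M / lcm (d, e) + O (1)$ formula therefore applies to coprime d and e.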

Inserting this into the expansion and isolating the main term we obtain

S_{0} = M \sum_{d \leq R} \sum_{e \leq R} \frac{λ_{d} λ_{e}}{lcm (d, e)} + O (\sum_{d, e \leq R} | λ_{d} λ_{e} |) .

The main double sum is the standard Selberg quadratic form

Q (λ)

; with our choice of

λ_{d}

one shows (classically) that

Q (λ) ≍ \frac{1}{{(log R)}^{2}} S (H),

where

S (H)

is the singular series of the admissible set

H = {0, 2}

. For the twin-shift

S ({0, 2}) > 0

is the classical twin-prime singular series. Concretely,

S_{0} = c_{0} \frac{M}{{(log R)}^{2}} + O (\frac{M}{{(log R)}^{3}} + R^{1 + o (1)})

for an explicit

c_{0} > 0

depending on

S (H)

and the choice of g (or the linear profile giving the

1 / {(log R)}^{2}

scaling). The

R^{1 + o (1)}

term arises from the trivial bound on

\sum | λ_{d} |

; with smoothing it can be made

≪ R^{ε}

.

Thus, morally,

S_{0} ≍ \frac{M}{{(log R)}^{2}} .

5.4. Expansion of $S_{1}$ (Single-Prime Detection)

Similarly

S_{1} = \sum_{m \leq M} \sum_{d \leq R} \sum_{e \leq R} λ_{d} λ_{e} (1_{d ∣ n_{1}} 1_{e ∣ n_{2}}) \cdot (1_{n_{1} prime} + 1_{n_{2} prime}) .

Symmetry gives the same contribution from

n_{1}

and

n_{2}

. Focus on the

n_{1}

term:

\sum_{d, e \leq R} λ_{d} λ_{e} \sum_{\begin{matrix} m \leq M \\ d ∣ n_{1} \\ e ∣ n_{2} \end{matrix}} 1_{n_{1} prime} .

Exchange sums: for fixed

d, e

the innermost sum counts primes in an arithmetic progression:

\sum_{\begin{matrix} m \leq M \\ m \equiv a_{d} (mod d) \\ m \equiv b_{e} (mod e) \end{matrix}} 1_{6 m + 5 prime} = \sum_{\begin{matrix} n \leq 6 M + 5 \\ n \equiv r (mod ℓ) \end{matrix}} 1_{n prime},

where

ℓ = lcm (d, e)

and r is the corresponding residue class modulo ℓ. The Prime Number Theorem in arithmetic progressions (averaged form) gives

\sum_{\begin{matrix} n \leq X \\ n \equiv r (mod ℓ) \end{matrix}} 1_{n prime} = \frac{Li (X)}{φ (ℓ)} + E (X; ℓ, r),

with E the error term. For $ℓ \leq R \leq M^{1 / 2 - ε}$ we may apply Bombieri–Vinogradov (which bounds the errors on average over the moduli ℓ) to control the total contribution of all off-diagonal ℓ’s.

Inserting the expected main term and summing over

d, e

we obtain

S_{1} = 2 \cdot M \cdot \frac{1}{log M} \cdot \sum_{d, e \leq R} \frac{λ_{d} λ_{e}}{lcm (d, e)} + O (\sum_{d, e \leq R} | λ_{d} λ_{e} | \cdot max_{ℓ \leq R} | E (6 M; ℓ, \cdot) |) .

Hence (comparing with the expansion of

S_{0}

)

S_{1} ≍ \frac{M}{log M} \cdot \frac{1}{{(log R)}^{2}} .

5.5. Expansion of $S_{2}$ (Double-Prime Detection) and Diagonal Term

Now consider

S_{2} = \sum_{m \leq M} ω_{m} 1_{n_{1} prime} 1_{n_{2} prime} .

Again expand

ω_{m}

and exchange sums to arrive at sums of the shape

\sum_{d, e \leq R} λ_{d} λ_{e} \sum_{\begin{matrix} m \leq M \\ d ∣ n_{1} \\ e ∣ n_{2} \end{matrix}} 1_{n_{1} prime} 1_{n_{2} prime} .

For fixed

d, e

we are counting prime pairs

(n_{1}, n_{2}) = (6 m + 5, 6 m + 7)

satisfying congruence conditions modulo

ℓ = lcm (d, e)

. The heuristic main term (and the diagonal contribution) comes from the expected prime pair density in admissible tuples:

\sum_{\begin{matrix} m \leq M \\ m \equiv r (mod ℓ) \end{matrix}} 1_{6 m + 5 prime} 1_{6 m + 7 prime} \approx \frac{M}{ℓ} \cdot \frac{2 S ({0, 2})}{{(log M)}^{2}},

where

2 S ({0, 2}) / {(log M)}^{2}

is the Hardy–Littlewood twin-pair heuristic (the factor 2 is from the two positions). Summing over

d, e

with weights

λ_{d} λ_{e}

as before yields the diagonal main term

Main (S_{2}) = c_{2} \frac{M}{{(log M)}^{2}} \cdot \sum_{d, e \leq R} \frac{λ_{d} λ_{e}}{lcm (d, e)} ≍ \frac{M}{{(log R)}^{2}} \cdot \frac{1}{{(log M)}^{2}} \cdot (const),

and after reabsorbing constants we obtain the expected scale

S_{2}^{diag} ≍ \frac{M}{{(log M)}^{2}} .

5.6. Off-Diagonal Terms and Bombieri–Vinogradov Control

The rigorous justification of the above diagonal heuristic relies on controlling the error terms when replacing prime indicators by their expected densities in progressions. For

S_{1}

and

S_{2}

the error contributions involve sums like

\sum_{\begin{matrix} d, e \leq R \\ ℓ = lcm (d, e) \leq R \end{matrix}} λ_{d} λ_{e} \cdot E (6 M; ℓ, \cdot),

where

E (6 M; ℓ, \cdot)

denotes the error in the prime counting function in progressions modulo ℓ. Bombieri–Vinogradov (BV) asserts that for any

A > 0

there is

B = B (A)

such that

\sum_{q \leq Q} max_{(a, q) = 1} | π (X; q, a) - \frac{Li (X)}{φ (q)} | ≪ \frac{X}{{(log X)}^{A}}

for

Q \leq X^{1 / 2} {(log X)}^{- B}

. Choosing

Q = R

and

R \leq M^{1 / 2 - ε}

ensures these error sums are

≪ M / {(log M)}^{A^{'}}

for any prescribed

A^{'}

by taking A large in the BV statement and adjusting B. The cumulative effect of all such off-diagonal errors in the expansions of

S_{1}

and

S_{2}

is therefore negligible compared with the diagonal main terms.
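To get a feel for the quantity that Bombieri–Vinogradov controls, one can compute the averaged discrepancy of primes in residue classes for small X and Q. The sketch below is a toy illustration under a stated simplification: it replaces $Li (X)$ by the actual prime count $π (X)$ in the expected term, so it measures deviation from equidistribution rather than the exact BV error. The function names are hypothetical.

```python
from math import gcd, isqrt


def primes_up_to(X: int) -> list[int]:
    """Sieve of Eratosthenes returning all primes <= X (X >= 1)."""
    sieve = bytearray([1]) * (X + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(X) + 1):
        if sieve[p]:
            # cross out p^2, p^2 + p, ... up to X
            sieve[p * p :: p] = bytearray(len(range(p * p, X + 1, p)))
    return [n for n in range(2, X + 1) if sieve[n]]


def bv_sum(X: int, Q: int) -> float:
    """Sum over q <= Q of max over (a, q) = 1 of |pi(X; q, a) - pi(X)/phi(q)|.

    Assumption: pi(X) stands in for Li(X) in the expected main term.
    """
    primes = primes_up_to(X)
    total = 0.0
    for q in range(1, Q + 1):
        counts = [0] * q
        for p in primes:
            counts[p % q] += 1
        residues = [a for a in range(q) if gcd(a, q) == 1]
        expected = len(primes) / len(residues)  # len(residues) = phi(q)
        total += max(abs(counts[a] - expected) for a in residues)
    return total
```

Comparing `bv_sum(X, Q)` with X for Q near $X^{1 / 2}$ gives a rough numerical picture of the saving that the theorem asserts; small-scale data of this kind is of course no substitute for the theorem itself.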

More precisely: the off-diagonal contribution to

S_{2}

can be bounded by

\sum_{ℓ \leq R} τ (ℓ) | E (6 M; ℓ, \cdot) | \cdot \sum_{\begin{matrix} d, e \leq R \\ lcm (d, e) = ℓ \end{matrix}} | λ_{d} λ_{e} | ≪ \frac{M}{{(log M)}^{A^{'}}}

for suitable

A^{'}

by BV and the multiplicative bounds for

λ_{d}

(and

τ (ℓ)

grows slowly).

5.7. Conclusion of the Sieve Computation

Combining the diagonal main terms and the off-diagonal (BV-controlled) error estimates we obtain

S_{0} ≍ \frac{M}{{(log R)}^{2}}, S_{1} ≍ \frac{M}{log M} \cdot \frac{1}{{(log R)}^{2}}, S_{2} ≍ \frac{M}{{(log M)}^{2}} \cdot \frac{1}{{(log R)}^{2}} .

In particular with the choice

R = M^{δ}

for sufficiently small

δ > 0

(one may take

δ < 1 / 2 - ε

to satisfy BV), we get

S_{0} ≍ M

up to polylogarithmic factors and

S_{2} ≫ M / {(log M)}^{2}

quantitatively. Equivalently, the combinatorial Cauchy-type inequality standard in GPY/Maynard yields a lower bound for the number of

m \leq M

with both

n_{1}, n_{2}

prime:

# {m \leq M : n_{1}, n_{2} both prime} ≫ \frac{M}{{(log M)}^{2}} .
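The counting function on the left-hand side is easy to tabulate for small M and to compare with the scale $M / {(log M)}^{2}$. The following sketch uses naive trial division and hypothetical function names; it illustrates the quantity being bounded and is not evidence for the bound.

```python
from math import isqrt, log


def is_prime(n: int) -> bool:
    """Primality test by trial division (adequate for small n)."""
    if n < 2:
        return False
    for p in range(2, isqrt(n) + 1):
        if n % p == 0:
            return False
    return True


def count_twin_blocks(M: int) -> int:
    """Count m in [1, M] for which 6m+5 and 6m+7 are both prime."""
    return sum(
        1 for m in range(1, M + 1) if is_prime(6 * m + 5) and is_prime(6 * m + 7)
    )


def twin_block_ratio(M: int) -> float:
    """Ratio of the count to M / (log M)^2, for M >= 2."""
    return count_twin_blocks(M) * log(M) ** 2 / M
```

For $M = 10$ the qualifying blocks are (11, 13), (17, 19), (29, 31), (41, 43) and (59, 61), so `count_twin_blocks(10)` returns 5. Note that the pair (5, 7) corresponds to $m = 0$ and is outside the range $1 \leq m \leq M$.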

5.8. Parameter Choice Summary

To make all error terms negligible and to connect with Lemmas 2-4 we fix:

z = M^{ϑ} (0 < ϑ < 1 / 2), R = M^{δ} (0 < δ < 1 / 2 - ε),

and take the set of sieving primes $P = {p \leq y}$ with $y \leq R^{1 - δ^{'}}$. With these choices Bombieri–Vinogradov controls the off-diagonal errors and the Selberg optimization yields the stated lower bounds. Finally, Lemma 4 removes the contribution of the “both composite but with large prime factors” cases and Lemma 3 excludes persistent filling of twin-candidate positions by sifted composites; therefore the GPY mass indeed forces genuine twin primes in quantity $≫ M / {(log M)}^{2}$.

This completes the Selberg–GPY weight analysis.

6. Optimization of the Selberg Weights: Computation of $Q (λ)$ and Bounds for $\sum λ_{d}^{2}$

6.1. Notation and Objective

Let

R \geq 1

be the sieve level. We consider real coefficients

{λ_{d}}_{d \leq R}

, supported on

d \leq R

and with

λ_{1} = 1

. Define the quadratic form

Q (λ) : = \sum_{d \leq R} \sum_{e \leq R} \frac{λ_{d} λ_{e}}{lcm (d, e)} = \sum_{d, e \leq R} λ_{d} λ_{e} \frac{gcd (d, e)}{d e} .

The Selberg upper-sieve principle (as used in previous sections) shows that the count of integers coprime to a set of primes up to y inside an interval of length H is bounded above by $H \cdot Q (λ)$ up to harmless factors, so we seek to choose $λ$ minimizing $Q (λ)$ (subject to the normalization $λ_{1} = 1$ and the support condition). The classical optimization produces near-optimal weights $λ_{d}$ of the form $μ (d) g (\frac{log d}{log R})$ with an explicit smooth profile g; below we derive this and deduce the main estimates used in our sieve.

6.2. Rewriting Q via Multiplicative Convolution

We start with an identity which is convenient for analyzing Q. Using

lcm (d, e) = d e / gcd (d, e)

:

Q (λ) = \sum_{d, e \leq R} λ_{d} λ_{e} \frac{gcd (d, e)}{d e} = \sum_{r \leq R} \frac{1}{r} \sum_{\begin{matrix} d^{'}, e^{'} \leq R / r \\ gcd (d^{'}, e^{'}) = 1 \end{matrix}} \frac{λ_{r d^{'}} λ_{r e^{'}}}{d^{'} e^{'}} .

This is obtained by writing

d = r d^{'}, e = r e^{'}

with

r = gcd (d, e)

. (The coprimality condition

gcd (d^{'}, e^{'}) = 1

appears naturally.) The identity is purely combinatorial and will be used to pass to multiplicative transforms.
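The gcd-splitting identity for $Q (λ)$ can be verified exactly on small examples using rational arithmetic. The sketch below is a sanity check under stated assumptions: the function names are hypothetical and the test weights are arbitrary rationals, not the Selberg weights. It compares the direct double sum with the reorganized sum over $r = gcd (d, e)$ and coprime $d^{'}, e^{'} \leq R / r$.

```python
from fractions import Fraction
from math import gcd


def q_direct(lam: dict[int, Fraction], R: int) -> Fraction:
    """Q(lambda) = sum over d, e <= R of lam[d] * lam[e] * gcd(d, e) / (d * e)."""
    return sum(
        (
            lam[d] * lam[e] * Fraction(gcd(d, e), d * e)
            for d in range(1, R + 1)
            for e in range(1, R + 1)
        ),
        Fraction(0),
    )


def q_split(lam: dict[int, Fraction], R: int) -> Fraction:
    """Same quantity, grouped by r = gcd(d, e) with d = r*d', e = r*e'."""
    total = Fraction(0)
    for r in range(1, R + 1):
        top = R // r  # d' and e' range over 1..floor(R/r)
        inner = sum(
            (
                lam[r * dp] * lam[r * ep] * Fraction(1, dp * ep)
                for dp in range(1, top + 1)
                for ep in range(1, top + 1)
                if gcd(dp, ep) == 1
            ),
            Fraction(0),
        )
        total += inner / r
    return total
```

Because every pair $(d, e)$ corresponds to exactly one triple $(r, d^{'}, e^{'})$ with $gcd (d^{'}, e^{'}) = 1$, the two functions agree for any choice of weights; using `Fraction` avoids floating-point rounding in the comparison.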

Define

α_{r} : = \sum_{\begin{matrix} d \leq R \\ r ∣ d \end{matrix}} \frac{λ_{d}}{d / r} = \sum_{d^{'} \leq R / r} \frac{λ_{r d^{'}}}{d^{'}} .

A short computation using the previous display yields the useful representation

Q (λ) = \sum_{r \leq R} \frac{1}{r} {(\sum_{\begin{matrix} d^{'} \leq R / r \end{matrix}} \frac{λ_{r d^{'}}}{d^{'}})}^{2} = \sum_{r \leq R} \frac{α_{r}^{2}}{r} .

(Indeed expand the square and reorganize to recover the earlier sum. This form is standard in treatments of the Selberg quadratic form.)

6.3. Lagrange Multipliers and the Optimal Structure

We now minimize

Q (λ)

under the linear normalization constraint

λ_{1} = 1

(and optionally small regularity/smoothness constraints). A more sophisticated classical formulation is to minimize

Q (λ)

subject to the finite family of linear constraints

\sum_{d ∣ n} λ_{d} \geq 1

for integers n with no small prime factors; however the usual practical approach is to impose only the normalization and then choose

λ

multiplicative and smooth in

log d

; this yields nearly optimal weights for the upper-sieve situation.

Working with the

α_{r}

variables is convenient. Introduce Lagrange multiplier

θ

for the constraint that corresponds to normalizing the value at

n = 1

; in the

α

-frame this produces the linear system

\frac{2 α_{r}}{r} = θ \cdot β_{r} (r \leq R),

for suitable coefficients

β_{r}

depending on the chosen normalizing linear functional. Solving formally gives

α_{r} \propto r β_{r}

and hence

α_{r} = c r β_{r}

for some constant c determined by normalization. In multiplicative terms this structure means

α_{r}

is multiplicative and smooth in r, and transferring back to

λ_{d}

via Möbius inversion yields [10]

λ_{d} = μ (d) g (\frac{log d}{log R})

for some smooth profile g supported in

[0, 1]

. The linear cutoff

g (t) = 1 - t

(or smoothed variants) is the classical explicit choice producing good and analyzable constants.

We emphasize: the exact solution of the finite quadratic optimization problem gives a discrete

λ_{d}

that matches the above multiplicative ansatz to high accuracy. This is standard and is carried out in detail in sieve textbooks (e.g., Halberstam–Richert, Iwaniec–Kowalski, Greaves). Using the multiplicative ansatz is thus justified.

6.4. Classical Explicit Choice and Main Estimates

Take the standard linear profile

λ_{d} = μ (d) \frac{log (R / d)}{log R} (1 \leq d \leq R),

and

λ_{d} = 0

for

d > R

. This choice is convenient for explicit computation and suffices for our error goals. We compute the two key quantities:

6.4.0.1. (i) The quadratic form $Q (λ)$ .

Using the representation

Q = \sum_{r \leq R} α_{r}^{2} / r

and writing

α_{r} = \sum_{d^{'} \leq R / r} λ_{r d^{'}} / d^{'}

, one evaluates asymptotically (standard calculation; see Selberg upper-sieve derivation) that

Q (λ) = \frac{1}{{(log R)}^{2}} \prod_{p \leq y} (1 - \frac{ω (p)}{p}) \cdot (1 + o (1)),

where

ω (p)

counts the number of residue classes modulo p occupied by the admissible tuple (for the twin-shift

{0, 2}

we have

ω (p) = 2

for

p \neq 2, 3

; primes dividing the modulus 6 are treated separately). The product over primes is exactly the classical singular series

S (H)

(up to local factors at 2 and 3); hence one obtains the familiar scaling

Q (λ) ≍ \frac{S ({0, 2})}{{(log R)}^{2}} .

In particular there exists an absolute constant

c_{Q} > 0

(depending on the shift) such that for large R

Q (λ) \leq c_{Q} \frac{1}{{(log R)}^{2}} .

6.4.0.2. (ii) The second moment $\sum_{d \leq R} λ_{d}^{2}$ .

With the linear profile one checks directly (by standard divisor-sum estimates) that

\sum_{d \leq R} λ_{d}^{2} = \sum_{d \leq R} μ^{2} (d) {(\frac{log (R / d)}{log R})}^{2} ≪ \sum_{d \leq R} \frac{1}{d^{ε}} ≪ {(log R)}^{O (1)} .

More concretely, using Mertens’ theorem and partial summation one gets

\sum_{d \leq R} λ_{d}^{2} ≪ log R .

(With smoothed g one can make this bound even slightly better, but

≪ log R

is sufficient for all subsequent estimates.)

6.5. Putting the Estimates to Use in the Sieve

The Selberg upper-sieve in an interval of length H gives (see Lemma 2)

# {n \in [X, X + H] : gcd (n, P (y)) = 1} \leq H \cdot Q (λ) + O (\sum_{d \leq R} λ_{d}^{2}) .

Inserting the previous asymptotics

Q (λ) ≍ c_{Q} / {(log R)}^{2}

and

\sum λ_{d}^{2} ≪ log R

yields

# {n \in [X, X + H] : gcd (n, P (y)) = 1} ≪ H \frac{1}{{(log R)}^{2}} + O (log R) .

For our parameter regime

R ≍ H^{1 / 2}

(or

R = M^{δ}

in the global GPY argument), the main term dominates and the error

≪ log R

is negligible in the final counting

≫ M / {log}^{2} M

.

6.6. Remarks and References

The displayed asymptotic for $Q (λ)$ (leading to the $\frac{1}{{(log R)}^{2}}$ scale times the singular series) is standard: see Halberstam–Richert, Iwaniec–Kowalski, or the exposition in Maynard’s paper. The constants are explicit and arise from the integrals involving the profile g; with the linear profile $g (t) = 1 - t$ one obtains the classical factor $1 / {(log R)}^{2}$ times the singular series.
The bound $\sum_{d \leq R} λ_{d}^{2} ≪ log R$ is elementary and follows from multiplicative sums and Mertens’ theorem; smoothing g can replace $log R$ by a slightly smaller polylog factor, but no essential gain is needed.

7. From the Sieve Bound Conjecture 1 to Infinitely Many Twin Primes

In this section we close the argument by showing that the short-interval sieve bound Conjecture 1, together with the GPY/Maynard mass computation and the large-factor elimination, forces the existence of infinitely many twin primes.

Theorem 3.

Fix small constants

0 < η, ε < 1 / 10

and large M. Let

H = M^{η}

, choose

R = M^{δ}

with

0 < δ < \frac{1}{2} - ε

, and set

z = M^{ϑ}

with

0 < ϑ < \frac{1}{2}

. Assume:

(Sieve bound Conjecture 1 in short intervals) for $P = {p \leq y}$ with $y \leq R^{1 - ε}$ we have, uniformly for all intervals $I = [X, X + H]$,

$# {n \in I : gcd (n, P (y)) = 1} ≪ H \cdot Q (λ) + O (\sum_{d \leq R} λ_{d}^{2}) ≪ \frac{H}{{(log R)}^{2}} + O (log R) .$

(This is Lemma 2 with the optimized Selberg weights of Section 6.)
(Uniform gap hypothesis for $C_{P}$) Conjecture 1 in the form: the set of indices with gaps $\leq 8$ has zero density for large M.
(Bombieri–Vinogradov) holds up to moduli $q \leq M^{1 / 2 - ε}$.

Then there exists

c > 0

(depending only on the fixed parameters) such that, for all sufficiently large M,

# \{m \leq M : 6 m + 5, 6 m + 7 are both prime\} \geq c \frac{M}{{(log M)}^{2}} .

In particular, there are infinitely many twin primes.

Proof.

Write

n_{1} = 6 m + 5, n_{2} = 6 m + 7

and put

A : = \{m \leq M : P^{-} (n_{1}) > z, P^{-} (n_{2}) > z\} .

Step 1 (GPY/Maynard mass for twin-shift). By the Selberg–GPY analysis of Section 5 (with

R \leq M^{1 / 2 - ε}

to enable Bombieri–Vinogradov on the off-diagonal), the standard diagonal/off-diagonal computation gives

# A \geq c_{0} \frac{M}{{(log M)}^{2}} (M large),

(2)

for some

c_{0} > 0

depending on the choice of weights (the singular series for

{0, 2}

is folded into

c_{0}

).

Step 2 (remove the “both composite” large-factor cases). Let

B : = {m \in A : n_{1}, n_{2} are both composite} .

By Lemma 4 (large-factor bilinear bound), with

z = M^{ϑ}

and

0 < ϑ < 1 / 2

,

# B = o (M / {log}^{2} M) .

(3)

Set

C : = A ∖ B

. Then combining (2) and (3) we get

# C \geq (c_{0} + o (1)) \frac{M}{{(log M)}^{2}} .

(4)

By definition of

C

, for each

m \in C

at least one of

n_{1}, n_{2}

is prime, and the other (if composite) has all prime factors

> z

; in particular the composite one (if any) lies in the sifted-composite set

C_{P}

.

Step 3 (contradiction if twin primes are too sparse). Suppose, towards a contradiction, that the number of twin primes up to $6 M + 7$ is $o (M / {log}^{2} M)$. Equivalently, writing

T : = {m \leq M : n_{1}, n_{2} both prime},

we have $# T = o (M / {log}^{2} M)$.

Then, by (4), for most

m \in C

exactly one of

n_{1}, n_{2}

is prime, and the other is a sifted composite (all prime factors

> z

). Consider the set

S : = \{n \in [5, 6 M + 7] : n \in C_{P} and n \in {6 m + 5, 6 m + 7} for some m \in C ∖ T\} .

For any two consecutive indices

m, m + 1 \in C ∖ T

, the corresponding elements of

S

lie in

{6 m + 5, 6 m + 7} \cup {6 (m + 1) + 5, 6 (m + 1) + 7}

, hence the gap between consecutive points of

S

is

\leq 8

. Thus, on any run of consecutive m’s drawn from

C ∖ T

, the adjacent gaps among the ordered elements of

S

are

\leq 8

.
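The claim that consecutive indices produce gaps of at most 8 is elementary arithmetic and can be confirmed mechanically. The sketch below (hypothetical function names) lists every difference between an element of the block for m + 1 and an element of the block for m.

```python
def block(m: int) -> tuple[int, int]:
    """The candidate block {6m+5, 6m+7}."""
    return (6 * m + 5, 6 * m + 7)


def cross_block_gaps(m: int, k: int = 1) -> list[int]:
    """Sorted differences b - a with a in block(m) and b in block(m + k)."""
    return sorted(b - a for a in block(m) for b in block(m + k))
```

For adjacent indices the differences are 4, 6, 6 and 8, independently of m, which is the source of the threshold 8. For indices m and m + k with $k \geq 2$ the smallest difference is $6 k - 2 \geq 10$, so the bound $\leq 8$ uses that the indices are consecutive.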

By a short-interval covering argument, (4) implies that there exist

≫ M / H

disjoint intervals of the form

[m_{0}, m_{0} + H^{'}]

with

H^{'} ≍ H

in the m-variable (equivalently length

≍ 6 H

in the n-variable) that each contain

≫ H / {(log M)}^{2}

elements of

C ∖ T

. In each such interval, deleting at most

O (1)

boundary indices leaves a sub-run of consecutive m’s of length

≫ H / {(log M)}^{2}

. Therefore, within each such short interval in the n-line we produce

≫ H / {(log M)}^{2}

consecutive gaps

\leq 8

among elements of

C_{P}

.

Summing over the

≫ M / H

disjoint short intervals yields

# \{n \leq 6 M + 7 : g_{n} (P) \leq 8\} ≫ \frac{M}{{(log M)}^{2}} .

But this contradicts the uniform gap sparsity asserted in Conjecture 1. Hence our assumption was false.

Step 4 (conclusion). Therefore

# T = # {m \leq M : n_{1}, n_{2} both prime} \geq c \frac{M}{{(log M)}^{2}}

for some

c > 0

and all sufficiently large M. Letting

M \to \infty

gives infinitely many twin primes.

This completes the proof of Theorem 3. □

Remark 3

(Choice of parameters). The proof only uses that: (i) GPY/Maynard gives

# A ≫ M / {log}^{2} M

with level of distribution up to

R \leq M^{1 / 2 - ε}

(Bombieri–Vinogradov); (ii) Lemma 4 removes the “both composite with large factors” contribution; (iii) Lemma 3/Conjecture 1 forbids the persistence of short adjacent gaps

\leq 8

in

C_{P}

. Any fixed

0 < η, δ, ϑ < 1 / 2

compatible with these inputs will do.

References

[1] T. Tao. The Elliott–Halberstam conjecture implies the Vinogradov least quadratic nonresidue conjecture. Algebra & Number Theory, 9(4):1005–1034, 2015.
[2] D. A. Goldston, Y. Motohashi, J. Pintz and C. Y. Yildirim. Small gaps between primes exist. Proc. Japan Acad. Ser. A Math. Sci., 82(4):61–65, 2006.
[3] Y. Zhang. Bounded gaps between primes. Annals of Mathematics, 179(3):1121–1174, 2014.
[4] D. H. J. Polymath. The “bounded gaps between primes” Polymath project: a retrospective, 2014. Preprint, arXiv:1409.8361.
[5] A. F. Gocgen. Gocgen Approach for Bounded Gaps Between Odd Composite Numbers, 2024. Preprint, preprints202401.1533.
[6] G. Greaves. Selberg’s Upper Bound Method. In Sieves in Number Theory, pages 41–70, 2001.
[7] H. H. Wu and S. Wu. Various proofs of the Cauchy-Schwarz inequality. Octogon Mathematical Magazine, 17(1):221–229, 2009.
[8] H. Ogasawara. The multivariate Markov and multiple Chebyshev inequalities. Communications in Statistics-Theory and Methods, 49(2):441–453, 2020.
[9] J. M. Deshouillers and H. Iwaniec. An additive divisor problem. Journal of the London Mathematical Society, 2(1):1–14, 1982.
[10] E. A. Bender and J. R. Goldman. On the applications of Möbius inversion in combinatorial analysis. The American Mathematical Monthly, 82(8):789–803, 1975.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
