Preprint
Article

This version is not peer-reviewed.

Proof of the Riemann Hypothesis via the Chebyshev Function and the Integral Convergence

Submitted:

22 May 2026

Posted:

22 May 2026


Abstract

In this article, we offer a self-contained and entirely elementary proof of a mean-square estimate for the Chebyshev function. From it we deduce the convergence of a certain integral, which is a sufficient condition for the Riemann hypothesis. The proof primarily employs elementary estimates of the Chebyshev function, the Cauchy-Schwarz inequality, and a dyadic decomposition (with Abel summation applied in the appendix); the results are already optimal within the elementary framework and suffice to derive the convergence of the required integral. The appendix provides a theoretical complement linking integral convergence, pointwise bounds, and analyticity, and concludes that the well-known $o$-bound is valid, thereby reconfirming the validity of the Riemann hypothesis. More precisely, we give a self-contained elementary proof of the mean-square estimate $\displaystyle\int_2^{X} \bigl(\psi(t)-t\bigr)^{2}\,dt = O(X^{2}\log^{2} X),$ where $\psi(x)$ is the Chebyshev function. From this we deduce that $\displaystyle\int_{1}^{\infty}\frac{|\psi(x)-x|}{x^{\frac{3}{2}+\varepsilon}}\,dx < \infty$ holds for every $\varepsilon>0,$ so the integral $\displaystyle \int_1^{\infty} \frac{\psi(x)-x}{x^{\frac{3}{2}+\varepsilon}}\,dx$ converges absolutely, and hence also converges conditionally, for every $\varepsilon>0.$ Since the conditional convergence of this integral for every $\varepsilon>0$ is equivalent to RH, the Riemann hypothesis is true.
In particular, the absolute convergence of the integral is equivalent to its conditional convergence, and either is equivalent to the $o$-bound $|\psi(x)-x| = o(x^{\frac{1}{2}+\varepsilon});$ all of them imply the $O$-bound $\psi(x)-x= O(x^{\frac{1}{2}+\varepsilon})$ for every $\varepsilon>0,$ thus reconfirming the validity of the Riemann hypothesis.


1. Introduction

The Chebyshev function $\psi(x)=\sum_{n\le x}\Lambda(n)$ plays a central role in the study of the distribution of prime numbers. The prime number theorem (PNT), proved independently by Hadamard and de la Vallée Poussin in 1896 using complex analysis, states that $\psi(x)\sim x$; see Edwards [6].
In fact, the prime number theorem is equivalent not only to the asymptotic relation $\psi(x)\sim x$, but also to the statement that the error term $\psi(x)-x$ satisfies $\psi(x)-x=o(x)$ as $x\to\infty$. A major breakthrough in the twentieth century was the discovery of elementary proofs of the prime number theorem by Selberg [3] and Erdős [4] in 1949, avoiding complex analysis, but the error term remained rather weak. In this paper, we will deduce the stronger error term $\psi(x)-x=o(x^{\frac12+\varepsilon})$ for every $\varepsilon>0$.
In this paper, we focus on the mean-square estimate of $\psi(x)-x$. Our goal is to prove, by purely elementary means, that
$$\int_2^{X}\bigl(\psi(t)-t\bigr)^{2}\,dt=O(X^{2}\log^{2}X). \tag{1}$$
This estimate is unconditional and, as we shall see, already suffices to prove the convergence of $\int_1^{\infty}|\psi(x)-x|\,x^{-\frac32-\varepsilon}\,dx$ for every $\varepsilon>0$.
The proof mainly uses elementary estimates of the Chebyshev function and the Cauchy-Schwarz inequality. In particular, we avoid any complex analysis or unproven hypotheses.
The paper is organized as follows.
Section 2 recalls the necessary notation and preliminary estimates. Section 3 proves an upper bound for $U(N)=\sum_{k=1}^{N}k\,A(N/k)^{2}$, where $A(x)=\psi(x)-\lfloor x\rfloor$. Section 4 expands $U(N)/N$. Section 5 establishes a pointwise lower bound for the weights in $U(N)/N$. Section 6 uses this lower bound to obtain the core estimate $\sum_{m=1}^{N}\frac{A(m)^{2}}{m}=O(N\log^{2}N)$. Section 7 converts this discrete estimate into the continuous mean-square bound (1). Section 8 proves the convergence of $\int_1^{\infty}|\psi(x)-x|\,x^{-\frac32-\varepsilon}\,dx$ for every $\varepsilon>0$. Section 9 concludes with a summary: we obtain $\int_1^{\infty}\frac{|\psi(x)-x|}{x^{\frac32+\varepsilon}}\,dx<\infty$ for every $\varepsilon>0$, so the integral $\int_1^{\infty}\frac{\psi(x)-x}{x^{\frac32+\varepsilon}}\,dx$ converges absolutely, and hence conditionally, for every $\varepsilon>0$; since the conditional convergence of this integral is equivalent to RH, the Riemann hypothesis is true. Finally, an appendix provides a theoretical complement linking integral convergence, pointwise bounds and analyticity: the absolute convergence of the integral is equivalent to its conditional convergence, either is equivalent to the $o$-bound $|\psi(x)-x|=o(x^{\frac12+\varepsilon})$, and all of them imply the $O$-bound $\psi(x)-x=O(x^{\frac12+\varepsilon})$ for every $\varepsilon>0$, thus reconfirming the validity of the Riemann hypothesis.

2. Preliminary

2.1. Notation Conventions and Logical Symbols

We use the following notation throughout the paper:
$\mathbb{N}=\{1,2,3,\dots\}$.
$p$ always denotes a prime.
$\log x$ is the natural logarithm.
For a real number $x$, $\lfloor x\rfloor$ is the greatest integer $\le x$; the fractional part is $\{x\}=x-\lfloor x\rfloor$, which satisfies $0\le\{x\}<1$, and $\{x\}=0$ if and only if $x$ is an integer.
$f(x)=O(g(x))$ or $f(x)\ll g(x)$ means there exists a constant $C>0$ such that $|f(x)|\le C\,g(x)$ for all sufficiently large $x$. Moreover, if $g(x)$ is positive on the whole domain and $f(x)/g(x)$ is bounded on every bounded part of the domain, then the constant can be enlarged to an absolute constant $C_0>0$ for which the estimate holds for all $x$ in the domain, not only for sufficiently large $x$.
$f(x)=o(g(x))$ means $\lim_{x\to\infty}f(x)/g(x)=0$. Note that the relation $f(x)=o(g(x))$ implies, and is stronger than, the relation $f(x)=O(g(x))$.
$f(x)\sim g(x)$ means $\lim_{x\to\infty}f(x)/g(x)=1$.
Summations $\sum_{n\le x}$ are over integers $n$ with $1\le n\le x$.
$A(N/n)^{2}$ means $\bigl(A(N/n)\bigr)^{2}$.
For the logical relations in theorems and lemmas we employ the standard symbols:
⇒: implies, if … then
⇐: is implied by
⇔: if and only if (iff)
The following terminology is used:
Sufficient condition: $a\Rightarrow b$ means that if $a$ is true, then $b$ is true; $a$ guarantees $b$.
Necessary condition: $a\Leftarrow b$ means that if $b$ is true, then $a$ must be true; without $a$ we cannot have $b$, i.e., if $a$ is not true, then $b$ is not true.
Necessary and sufficient condition (equivalence): $a\Leftrightarrow b$ means $a\Rightarrow b$ and $b\Rightarrow a$; $a$ and $b$ are either both true or both false.

2.2. Basic Definitions

The Chebyshev $\psi$-function is defined by
$$\psi(x)=\sum_{p^{k}\le x}\log p,$$
where the sum runs over all prime powers $p^{k}$ ($p$ prime, and integer $k\ge 1$).
The Chebyshev $\vartheta$-function is defined by $\vartheta(x)=\sum_{p\le x}\log p$, where the sum runs over primes $p\le x$.
The von Mangoldt function $\Lambda(n)$ is defined by
$$\Lambda(n)=\begin{cases}\log p & \text{if } n=p^{k} \text{ for some prime } p \text{ and integer } k\ge 1,\\ 0 & \text{otherwise}.\end{cases}$$
A well-known identity connects it with the Chebyshev function: $\psi(x)=\sum_{n\le x}\Lambda(n)$.
Define $a_n=\Lambda(n)-1$ and $A(x)=\sum_{n\le x}a_n$. Then
$$A(x)=\sum_{n\le x}a_n=\sum_{n\le x}\Lambda(n)-\sum_{n\le x}1=\psi(x)-\lfloor x\rfloor=\psi(x)-x+\{x\}$$
with fractional part $0\le\{x\}<1$. Thus, $\psi(x)=A(x)+\lfloor x\rfloor=A(x)+x-\{x\}$, and $\psi(x)-x=A(x)-\{x\}$. For integer $n$, $\lfloor n\rfloor=n$, so $A(n)=\psi(n)-n$.
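The definitions above can be tried out numerically. The following is a minimal Python sketch, not part of the paper's argument; the function names `von_mangoldt`, `psi` and `A` are hypothetical helpers chosen for this illustration, and trial division is used only because it is short, not because it is efficient.

```python
from math import floor, log


def von_mangoldt(n):
    """Lambda(n): log p if n = p**k for a prime p and k >= 1, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            # p is the smallest prime factor of n; n is a prime power
            # exactly when dividing out p repeatedly leaves 1.
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)  # no factor up to sqrt(n), so n itself is prime


def psi(x):
    """Chebyshev function psi(x) = sum of Lambda(n) over 1 <= n <= x."""
    return sum(von_mangoldt(n) for n in range(1, floor(x) + 1))


def A(x):
    """A(x) = sum of a_n = Lambda(n) - 1 over 1 <= n <= x."""
    return sum(von_mangoldt(n) - 1 for n in range(1, floor(x) + 1))


x = 10.5
# Identity from the text: A(x) = psi(x) - floor(x).
print(A(x), psi(x) - floor(x))
```

The two printed values agree up to floating-point rounding, which illustrates the identity $A(x)=\psi(x)-\lfloor x\rfloor$ for one sample value of $x$.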

2.3. The Prime Number Theorem

Let $\pi(x)$ denote the number of primes not exceeding $x$ $(x>0)$, which is the familiar prime-counting function.
The classical prime number theorem asserts that $\psi(x)\sim x$ as $x\to\infty$, or equivalently $\pi(x)\sim\frac{x}{\log x}$. While this result is not needed in our derivations (all estimates we employ are elementary or established results analogous to the PNT), it provides the historical context and motivation for studying the error term $\psi(x)-x$.

2.4. Non-trivial Zeros of Riemann’s Zeta Function and the Riemann Hypothesis

The Riemann zeta function is defined for $\operatorname{Re}(s)>1$ by $\zeta(s)=\sum_{n=1}^{\infty}\frac{1}{n^{s}}$; it extends meromorphically to the whole complex plane with a simple pole at $s=1$ with residue $1$.
Non-trivial zeros of $\zeta(s)$ are the zeros of $\zeta(s)$ in the critical strip $0<\operatorname{Re}(s)<1$.
As is well known, the Riemann hypothesis (RH) states that all non-trivial zeros lie on the line $\operatorname{Re}(s)=\frac12$.
In addition, we have the following facts.
Theorem 1.
(cf. [2], p. 17.) If $\operatorname{Re}(s)>1$, then
$$\log\zeta(s)=-\sum_{p}\log\bigl(1-p^{-s}\bigr)=\sum_{p,m}\frac{p^{-ms}}{m},$$
where $p$ runs through all primes and $m$ through all positive integers.
Theorem 2.
(cf. [2], pp. 17-18.) For $\operatorname{Re}(s)>1$,
$$-\frac{\zeta'(s)}{\zeta(s)}=\sum_{p}\frac{\log p}{p^{s}-1}=\sum_{p,m}(\log p)\,p^{-ms}=\sum_{n=1}^{\infty}\frac{\Lambda(n)}{n^{s}}=s\int_1^{\infty}\frac{\psi(x)}{x^{s+1}}\,dx,$$
where $p$ runs through all primes and $m$ through all positive integers.
We see that $-\frac{\zeta'(s)}{\zeta(s)}-\frac{1}{s-1}=1+s\int_1^{\infty}\frac{\psi(x)-x}{x^{s+1}}\,dx$ holds for $\operatorname{Re}(s)>1$. This well-known formula extends meromorphically to the half-plane $\operatorname{Re}(s)>0$, with poles at the non-trivial zeros of $\zeta(s)$ and no other poles in the region $\operatorname{Re}(s)>0$.

2.5. Well-known Equivalent Forms of RH in Terms of the Chebyshev Function

There are well-known equivalent forms of the Riemann hypothesis in terms of the Chebyshev function $\psi(x)$:
(1). Standard arithmetical form: $\psi(x)-x=O\bigl(x^{\frac12}\log^{2}x\bigr)\iff\text{RH}$.
(2). Another arithmetical form: For every $\varepsilon>0$,
$$\psi(x)-x=O\bigl(x^{\frac12+\varepsilon}\bigr)\iff\text{RH}.$$
(3). Integral form (analyticity):
$$\int_1^{\infty}\frac{\psi(x)-x}{x^{s+1}}\,dx \text{ is analytic for } \operatorname{Re}(s)>\tfrac12\iff\text{RH}.$$
(4). Integral form (conditional convergence): For every $\varepsilon>0$,
$$\int_1^{\infty}\frac{\psi(x)-x}{x^{\frac32+\varepsilon}}\,dx \text{ converges conditionally}\iff\text{RH},$$
since $f(s):=\int_1^{\infty}\frac{\psi(x)-x}{x^{s+1}}\,dx$ is analytic for $\operatorname{Re}(s)>\frac12$ $\iff$ $\int_1^{\infty}\frac{\psi(x)-x}{x^{\frac32+\varepsilon}}\,dx$ converges conditionally for every $\varepsilon>0$.
Remark 1.
These celebrated equivalent formulations of the Riemann hypothesis can be found, for instance, in the paper by von Koch [1] and on pages 83–85 of the monograph by Ingham [2], which offer rich details and comprehensive perspectives.

2.6. Basic Analytical Tools

We shall employ the following standard results.
(i) Cauchy-Schwarz inequality.
For any real numbers $u_1,\dots,u_N$ and $v_1,\dots,v_N$,
$$\Bigl(\sum_{n=1}^{N}u_n v_n\Bigr)^{2}\le\Bigl(\sum_{n=1}^{N}u_n^{2}\Bigr)\Bigl(\sum_{n=1}^{N}v_n^{2}\Bigr).$$
We also use its integral form: for functions $f,g$ on an interval $I$,
$$\Bigl(\int_I f(x)g(x)\,dx\Bigr)^{2}\le\Bigl(\int_I f(x)^{2}\,dx\Bigr)\Bigl(\int_I g(x)^{2}\,dx\Bigr).$$
For a reference, see G. H. Hardy, J. E. Littlewood & G. Pólya [5], Theorems 7 and 181.
(ii) A lemma on the O-notation.
Lemma 1.
(Bidirectional transfer). Let $f(N)$, $g(N)$, $h(N)$ be functions defined on positive integers and suppose
$$f(N)=g(N)+O\bigl(h(N)\bigr). \tag{4}$$
Then
$$g(N)=O\bigl(h(N)\bigr) \text{ if and only if } f(N)=O\bigl(h(N)\bigr).$$
Proof. 
By (4), there exist constants $C_0>0$ and $N_0$ such that
$$|f(N)-g(N)|\le C_0|h(N)| \text{ for all } N\ge N_0.$$
Sufficiency: Assume $g(N)=O(h(N))$. Then there exist constants $C_1>0$ and $N_1$ with $|g(N)|\le C_1|h(N)|$ for $N\ge N_1$. For $N\ge\max(N_0,N_1)$, we have
$$|f(N)|\le|f(N)-g(N)|+|g(N)|\le(C_0+C_1)|h(N)|,$$
hence $f(N)=O(h(N))$.
Necessity: Assume $f(N)=O(h(N))$. Then there exist constants $C_2>0$ and $N_2$ with $|f(N)|\le C_2|h(N)|$ for $N\ge N_2$. For $N\ge\max(N_0,N_2)$, we have
$$|g(N)|\le|g(N)-f(N)|+|f(N)|\le(C_0+C_2)|h(N)|,$$
hence $g(N)=O(h(N))$.  □
Corollary 1.
(Absorption). If $h(N)=O(k(N))$, then
$$f(N)=g(N)+O\bigl(h(N)\bigr)\ \Rightarrow\ f(N)=g(N)+O\bigl(k(N)\bigr).$$
Proof. 
Since $h(N)=O(k(N))$, there exist constants $C>0$ and $N_0$ such that $|h(N)|\le C|k(N)|$ for all $N\ge N_0$. Hence any function that is $O(h(N))$ is also $O(k(N))$. Therefore the error term $O(h(N))$ can be replaced by $O(k(N))$.  □
(iii) Two lemmas on the partition by floor values.
Lemma 2.
(Partition by Floor Values). For a fixed integer $N\ge 1$, define for each integer $m\ge 1$ the set
$$J_m=\Bigl\{k\in\{1,\dots,N\}:\Bigl\lfloor\frac{N}{k}\Bigr\rfloor=m\Bigr\}.$$
Then the collection $\{J_m: m=1,2,\dots,N\}$ forms a partition of $\{1,2,\dots,N\}$. Furthermore, an equivalent description is
$$J_m=\Bigl\{k\in\mathbb{N}:\frac{N}{m+1}<k\le\frac{N}{m}\Bigr\}.$$
Proof. 
The proof of the lemma consists of several steps.
Step 1. The values of $m$ range from $1$ to $N$. Since $1\le k\le N$, $\lfloor N/k\rfloor$ is an integer between $\lfloor N/N\rfloor=1$ and $\lfloor N/1\rfloor=N$. Hence $m=\lfloor N/k\rfloor$ takes values in $\{1,2,\dots,N\}$.
Step 2. Disjointness. If $k\in J_m$ and also $k\in J_{m'}$ with $m\ne m'$, then $\lfloor N/k\rfloor$ would equal two different integers, which is impossible. So the sets $J_m$ are pairwise disjoint.
Step 3. Covering. For any $k\in\{1,\dots,N\}$, let $m=\lfloor N/k\rfloor$. Then $1\le m\le N$ and by definition $k\in J_m$. Hence every element of $\{1,\dots,N\}$ belongs to some $J_m$. Thus $\{J_m\}_{m=1}^{N}$ is a partition of $\{1,\dots,N\}$.
Step 4. Equivalent description. We show $J_m=\bigl\{k\in\mathbb{N}:\frac{N}{m+1}<k\le\frac{N}{m}\bigr\}$. From $\lfloor N/k\rfloor=m$ we have $m\le\frac{N}{k}<m+1$. Inverting (all terms positive) gives $\frac{N}{m+1}<k\le\frac{N}{m}$. Conversely, if $k$ satisfies the inequality $\frac{N}{m+1}<k\le\frac{N}{m}$, then $m\le\frac{N}{k}<m+1$, so $\lfloor N/k\rfloor=m$. The condition $k\in\mathbb{N}$ together with $k\le N/m\le N$ automatically implies $1\le k\le N$. Hence the two descriptions coincide. Therefore, the lemma is proved. □
Lemma 3.
For $J_m=\bigl\{k\in\mathbb{N}:\frac{N}{m+1}<k\le\frac{N}{m}\bigr\}$, we have $\sum_{k\in J_m}k=O\bigl(\frac{N^{2}}{m^{2}}\bigr)$.
Proof. 
The proof of the lemma consists of several steps.
Step 1. Every $k\in J_m$ satisfies $k\le N/m$. Hence $\sum_{k\in J_m}k\le\frac{N}{m}\cdot|J_m|$.
Step 2. The length of the interval is $\frac{N}{m}-\frac{N}{m+1}=\frac{N}{m(m+1)}$. Therefore the number of integers in it is at most $|J_m|\le\frac{N}{m(m+1)}+1$.
Step 3. Substituting, $\sum_{k\in J_m}k\le\frac{N}{m}\Bigl(\frac{N}{m(m+1)}+1\Bigr)=\frac{N^{2}}{m^{2}(m+1)}+\frac{N}{m}$.
Step 4. Bound each term: $\frac{N^{2}}{m^{2}(m+1)}\le\frac{N^{2}}{m^{3}}$, and $\frac{N}{m}=\frac{N^{2}}{m^{2}}\cdot\frac{m}{N}\le\frac{N^{2}}{m^{2}}$, where $m\le N$.
Step 5. Thus $\sum_{k\in J_m}k\le\frac{N^{2}}{m^{3}}+\frac{N^{2}}{m^{2}}\le\frac{2N^{2}}{m^{2}}$, which proves $\sum_{k\in J_m}k=O(N^{2}/m^{2})$. Therefore, the lemma is proved. □
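The two lemmas on floor values can be checked by brute force for a small $N$. The sketch below is only an illustration under stated assumptions: the helper names `floor_classes` and `interval_description` are hypothetical, and the choice $N=30$ is arbitrary. It groups $k$ by $\lfloor N/k\rfloor$, compares each class with the interval description, and tests the bound $\sum_{k\in J_m}k\le 2N^{2}/m^{2}$ from Step 5.

```python
def floor_classes(N):
    """Group k = 1..N by m = floor(N / k); returns a dict {m: [k, ...]}."""
    classes = {}
    for k in range(1, N + 1):
        classes.setdefault(N // k, []).append(k)
    return classes


def interval_description(N, m):
    """All k with N/(m+1) < k <= N/m, tested in integers to avoid rounding."""
    return [k for k in range(1, N + 1) if k * (m + 1) > N and k * m <= N]


N = 30
classes = floor_classes(N)

# The classes cover {1, ..., N}, each element exactly once.
assert sorted(k for ks in classes.values() for k in ks) == list(range(1, N + 1))

for m in range(1, N + 1):
    J_m = classes.get(m, [])  # some classes contain no k at all
    assert J_m == interval_description(N, m)
    assert sum(J_m) * m * m <= 2 * N * N  # bound from Step 5 of Lemma 3

print({m: ks for m, ks in sorted(classes.items())})
```

Working with the integer conditions `k * (m + 1) > N` and `k * m <= N` avoids floating-point division, so the comparison with the floor-based grouping is exact.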
(iv) Dyadic decomposition. For sums over integers n 2 , we can decompose the range using powers of two.
Lemma 4.
Let $N\ge 2$ be an integer and set $L=\lfloor\log_2 N\rfloor$, so that
$$2^{L}\le N<2^{L+1}.$$
Then the following identity holds:
$$\sum_{n=2}^{N}f(n)=\sum_{k=1}^{L+1}\ \sum_{2^{k-1}<n\le\min(N,2^{k})}f(n). \tag{6}$$
Proof. 
For $k=1,2,\dots,L$, we have $2^{k}\le 2^{L}\le N$, hence $\min(N,2^{k})=2^{k}$. The inner sum becomes
$$\sum_{2^{k-1}<n\le 2^{k}}f(n),$$
which covers the integers $2^{k-1}+1,\dots,2^{k}$. For $k=L+1$, since $N<2^{L+1}$, we have $\min(N,2^{L+1})=N$, and the inner sum becomes $\sum_{2^{L}<n\le N}f(n)$, covering the remaining integers $2^{L}+1,\dots,N$ (an empty range when $N=2^{L}$). The intervals for $k=1,\dots,L$ together with the last one partition the set $\{2,3,\dots,N\}$ without overlap. Thus the equality (6) holds. □
Remark 2.
Note that when $N\to\infty$, we have $\lfloor\log_2 N\rfloor\to\infty$, so the number of dyadic intervals grows without bound; this property is essential for the convergence arguments in Section 8. If the sum includes $n=1$, we treat it separately:
$$\sum_{n=1}^{N}f(n)=f(1)+\sum_{n=2}^{N}f(n).$$
This decomposition is used in Section 8 to handle integrals via dyadic intervals.
(v) Abel summation (summation by parts), which will be applied in the appendix.
Lemma 5.
Let $(a_n)_{n\ge 1}$ and $(b_n)_{n\ge 1}$ be sequences of real numbers. Define $A_0=0$ and $A_n=\sum_{k=1}^{n}a_k$ for $n\ge 1$. Then for any $N\ge 1$, we have
$$\sum_{n=1}^{N}a_n b_n=A_N b_N+\sum_{n=1}^{N-1}A_n(b_n-b_{n+1})=A_N b_N-\sum_{n=1}^{N-1}A_n(b_{n+1}-b_n). \tag{7}$$
Proof. 
Write $a_n=A_n-A_{n-1}$ (with $A_0=0$). Then
$$\sum_{n=1}^{N}a_n b_n=\sum_{n=1}^{N}(A_n-A_{n-1})b_n=\sum_{n=1}^{N}A_n b_n-\sum_{n=1}^{N}A_{n-1}b_n.$$
Shift the index in the second sum:
$$\sum_{n=1}^{N}A_{n-1}b_n=\sum_{k=0}^{N-1}A_k b_{k+1}=\sum_{k=1}^{N-1}A_k b_{k+1}$$
(since $A_0=0$). Thus
$$\sum_{n=1}^{N}a_n b_n=\sum_{n=1}^{N}A_n b_n-\sum_{n=1}^{N-1}A_n b_{n+1}=A_N b_N+\sum_{n=1}^{N-1}A_n(b_n-b_{n+1}),$$
which is (7). □
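The summation-by-parts identity of Lemma 5 can be verified exactly on sample data. This is a small Python sketch using rational arithmetic; the helper names `direct_sum` and `abel_rhs` and the sample sequences are assumptions made for the illustration only.

```python
from fractions import Fraction


def direct_sum(a, b):
    """Left-hand side: sum of a_n * b_n."""
    return sum(x * y for x, y in zip(a, b))


def abel_rhs(a, b):
    """Right-hand side: A_N b_N + sum_{n=1}^{N-1} A_n (b_n - b_{n+1}).

    The lists are 0-based, so partial[i] holds A_{i+1}.
    """
    partial = []
    total = Fraction(0)
    for x in a:
        total += x
        partial.append(total)
    n_terms = len(a)
    tail = sum(partial[i] * (b[i] - b[i + 1]) for i in range(n_terms - 1))
    return partial[-1] * b[-1] + tail


# Sample data (arbitrary): a_n = n^2 - 3, b_n = 1/n for n = 1..8.
a = [Fraction(n * n - 3) for n in range(1, 9)]
b = [Fraction(1, n) for n in range(1, 9)]
assert direct_sum(a, b) == abel_rhs(a, b)
```

Because `Fraction` arithmetic is exact, the assertion tests the identity itself rather than a floating-point approximation of it.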

2.7. Elementary Estimates

We shall apply the following well-known elementary facts.
Lemma 6.
(Asymptotic expansion of harmonic numbers). For $n\to\infty$,
$$\sum_{k=1}^{n}\frac{1}{k}=\log n+\gamma+O\Bigl(\frac{1}{n}\Bigr),$$
where $\gamma$ is the Euler-Mascheroni constant.
Proof. 
See Apostol [7], Chapter 3, Theorems 3.1 and 3.2. □
Lemma 7.
We have the elementary estimates
$$\vartheta(x)=O(x),\qquad\psi(x)=O(x),\qquad A(x)=O(x).$$
The two Chebyshev functions $\psi(x)$ and $\vartheta(x)$ satisfy $\psi(x)=\vartheta(x)+O(\sqrt{x})$.
The details of the proof are presented below, consisting of three steps.
Proof. 
Step 1. Proof of $\vartheta(x)=O(x)$ (Chebyshev upper bound)
Consider the binomial coefficient $\binom{2n}{n}$.
Upper bound: $\binom{2n}{n}\le(1+1)^{2n}=4^{n}$.
Lower bound via primes: every prime $p$ with $n<p\le 2n$ divides $\binom{2n}{n}$. Hence
$$\prod_{n<p\le 2n}p\le\binom{2n}{n}\le 4^{n}.$$
Taking logarithms: $\vartheta(2n)-\vartheta(n)\le n\log 4$. Apply this repeatedly for $n=2^{k}$, $k=0,1,\dots,m-1$, and sum: $\vartheta(2^{m})\le 2^{m}\log 4$. For general $x$, choose $m$ such that $2^{m}\le x<2^{m+1}$. Then
$$\vartheta(x)\le\vartheta(2^{m+1})\le 2^{m+1}\log 4\le 4x\log 2=O(x).$$
Thus there exists $C>0$ such that $\vartheta(x)\le Cx$ for all $x\ge 1$.
Step 2. Proof of $\psi(x)=\vartheta(x)+O(\sqrt{x})$
Decompose $\psi(x)$ by prime power exponents:
$$\psi(x)=\sum_{k\ge 1}\sum_{p^{k}\le x}\log p=\vartheta(x)+\vartheta(\sqrt{x})+\vartheta(\sqrt[3]{x})+\cdots,$$
where only terms with $x^{1/k}\ge 2$ (i.e., $k\le\log_2 x$) are non-zero.
Using $\vartheta(y)\le Cy$ from Step 1,
$$\psi(x)-\vartheta(x)=\sum_{k=2}^{\lfloor\log_2 x\rfloor}\vartheta(x^{1/k})\le C\sum_{k=2}^{\lfloor\log_2 x\rfloor}x^{1/k}.$$
Split the sum:
For $k=2$: the term is $\le C\sqrt{x}$.
For $k\ge 3$: $x^{1/k}\le x^{1/3}$, and there are at most $\log_2 x$ such terms. Hence
$$\sum_{k=3}^{\lfloor\log_2 x\rfloor}x^{1/k}\le x^{1/3}\cdot\log_2 x=o(\sqrt{x})\quad(x\to\infty).$$
This part is $O(x^{1/3}\log x)$, which is certainly $O(\sqrt{x})$. Therefore
$$\psi(x)-\vartheta(x)=O(\sqrt{x}).$$
(One often writes $O(\sqrt{x}\log x)$, but the above shows $O(\sqrt{x})$ suffices because $x^{1/3}\log x=o(\sqrt{x})$.)
Step 3. From $\vartheta(x)=O(x)$ and $\psi(x)=\vartheta(x)+O(\sqrt{x})$ we immediately obtain $\psi(x)=O(x)$. Hence $|A(x)|=|\psi(x)-\lfloor x\rfloor|\le\psi(x)+\lfloor x\rfloor=O(x)$, since $\lfloor x\rfloor\le x$. Therefore, the lemma is proved. □
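As a numerical illustration of Lemma 7 (not a proof), the sketch below tabulates $\vartheta$ and $\psi$ for integers up to a small limit and checks the explicit bound $\vartheta(x)\le 4x\log 2$ obtained in Step 1. The names `primes_up_to`, `theta`, `psi`, `LIMIT` and `worst` are hypothetical choices for this example, and the limit of 2000 is arbitrary.

```python
from math import log, sqrt


def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes p <= n (intended for n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def theta(x, primes):
    """theta(x) = sum of log p over primes p <= x."""
    return sum(log(p) for p in primes if p <= x)


def psi(x, primes):
    """psi(x) = sum of log p over prime powers p^k <= x (primes sorted)."""
    total = 0.0
    for p in primes:
        if p > x:
            break
        pk = p
        while pk <= x:  # one contribution of log p per power p^k <= x
            total += log(p)
            pk *= p
    return total


LIMIT = 2000
PRIMES = primes_up_to(LIMIT)
worst = 0.0
for x in range(2, LIMIT + 1):
    th, ps = theta(x, PRIMES), psi(x, PRIMES)
    assert th <= 4 * x * log(2)  # explicit bound from Step 1
    worst = max(worst, (ps - th) / sqrt(x))

print(f"largest (psi - theta)/sqrt(x) for 2 <= x <= {LIMIT}: {worst:.3f}")
```

The printed ratio stays bounded on this range, in line with $\psi(x)-\vartheta(x)=O(\sqrt{x})$; a finite check of this kind says nothing about the size of the implied constant for large $x$.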

3. Estimating U(N)

Lemma 8.
Let $U(N)=\sum_{k=1}^{N}k\,A(N/k)^{2}$. Then
$$U(N)=O(N^{2}\log N).$$
Proof. 
Chebyshev’s bound gives $|A(x)|=|\psi(x)-\lfloor x\rfloor|\le\psi(x)+\lfloor x\rfloor\le Cx$ for some constant $C>0$ and all $x\ge 1$, since $\psi(x)=O(x)$ and $\lfloor x\rfloor\le x$. Then
$$|U(N)|=\sum_{k=1}^{N}k\,A(N/k)^{2}\le\sum_{k=1}^{N}k\cdot\frac{C^{2}N^{2}}{k^{2}}=C^{2}N^{2}\sum_{k=1}^{N}\frac{1}{k}=O(N^{2}\log N).$$
Therefore, the lemma is proved. □
Corollary 2.
$$\frac{U(N)}{N}=O(N\log N). \tag{10}$$
Proof. 
The proof follows immediately from Lemma 8. □

4. Expanding U(N)/N

For each $k=1,\dots,N$ define $m=\lfloor N/k\rfloor$. According to Lemma 2 (Partition by Floor Values), the sets
$$J_m=\Bigl\{k\in\{1,\dots,N\}:\Bigl\lfloor\frac{N}{k}\Bigr\rfloor=m\Bigr\}=\Bigl\{k\in\mathbb{N}:\frac{N}{m+1}<k\le\frac{N}{m}\Bigr\}$$
form a partition of $\{1,\dots,N\}$. Hence
$$U(N)=\sum_{k=1}^{N}k\,A(N/k)^{2}=\sum_{m=1}^{N}\sum_{k\in J_m}k\,A(N/k)^{2}.$$
For $k\in J_m$ we write $A(N/k)=A(m)+\delta_{m,k}$. To bound $\delta_{m,k}$, note that
$$\delta_{m,k}=A(N/k)-A(m)=\bigl(\psi(N/k)-\lfloor N/k\rfloor\bigr)-\bigl(\psi(m)-\lfloor m\rfloor\bigr)=\psi(N/k)-\psi(m),$$
because $\lfloor N/k\rfloor=m$ and $\lfloor m\rfloor=m$. Since $m\le N/k<m+1$, the interval $(m,N/k]$ has length $\frac{N}{k}-m<(m+1)-m=1$, hence it contains at most one integer. Consequently, $\psi(N/k)-\psi(m)=\sum_{m<n\le N/k}\Lambda(n)$ is either $0$ or $\Lambda(n_0)$ for a single integer $n_0$ (if it exists). In either case,
$$|\psi(N/k)-\psi(m)|\le\log N,$$
because $\Lambda(n_0)\le\log n_0\le\log N$ when $n_0$ exists. Therefore $|\delta_{m,k}|\le\log N$.
Expanding the square, we have
$$U(N)=\sum_{m=1}^{N}\sum_{k\in J_m}k\,A(N/k)^{2}=\sum_{m=1}^{N}A(m)^{2}\sum_{k\in J_m}k+2\sum_{m=1}^{N}A(m)\sum_{k\in J_m}k\,\delta_{m,k}+\sum_{m=1}^{N}\sum_{k\in J_m}k\,\delta_{m,k}^{2}.$$
In this expansion of $U(N)$, the cross-term and the quadratic term each contribute at most $O(N^{2}\log^{2}N)$.
(1). Cross-term:
$$\Bigl|\sum_{m=1}^{N}A(m)\sum_{k\in J_m}k\,\delta_{m,k}\Bigr|\le\log N\sum_{m=1}^{N}|A(m)|\sum_{k\in J_m}k.$$
Since $\sum_{k\in J_m}k=O(N^{2}/m^{2})$ and $|A(m)|=O(m)$, we have
$$\sum_{m=1}^{N}|A(m)|\sum_{k\in J_m}k\ll N^{2}\sum_{m=1}^{N}\frac{1}{m}=O(N^{2}\log N).$$
Hence the cross-term is $O(N^{2}\log^{2}N)$.
(2). Quadratic term:
$$\sum_{m=1}^{N}\sum_{k\in J_m}k\,\delta_{m,k}^{2}\le\log^{2}N\sum_{m=1}^{N}\sum_{k\in J_m}k=\log^{2}N\sum_{k=1}^{N}k=O(N^{2}\log^{2}N).$$
Dividing by $N$, we obtain
$$\frac{U(N)}{N}=\sum_{m=1}^{N}A(m)^{2}\,\frac{\sum_{k\in J_m}k}{N}+O(N\log^{2}N). \tag{12}$$

5. A Pointwise Lower Bound for the Weights

Lemma 9.
For all $m\ge 1$ and $N\ge 2$, we have $\frac{1}{N}\sum_{k\in J_m}k\ge\frac{1}{2m}$.
Proof. 
From the explicit description $J_m=\{k: N/(m+1)<k\le N/m\}$, let $k_{\min}$ be the smallest element of $J_m$ (the argument presupposes that $J_m$ is non-empty). Then $k_{\min}>N/(m+1)$. Hence
$$\frac{1}{N}\sum_{k\in J_m}k\ge\frac{k_{\min}}{N}>\frac{1}{m+1}\ge\frac{1}{2m},$$
because $2m\ge m+1$ for all $m\ge 1$.  □

6. The Core Estimate

Since $A(m)^{2}\ge 0$, multiplying the inequality $\frac{1}{N}\sum_{k\in J_m}k\ge\frac{1}{2m}$ of Lemma 9 by $A(m)^{2}$ and summing over $m$ yields $\sum_{m=1}^{N}A(m)^{2}\,\frac{\sum_{k\in J_m}k}{N}\ge\frac12\sum_{m=1}^{N}\frac{A(m)^{2}}{m}$. Insert this into (12):
$$\frac{U(N)}{N}\ge\frac12\sum_{m=1}^{N}\frac{A(m)^{2}}{m}+O(N\log^{2}N).$$
By (10), we have $\frac{U(N)}{N}=O(N\log N)$, and therefore $\frac{U(N)}{N}=O(N\log^{2}N)$. Hence
$$\frac12\sum_{m=1}^{N}\frac{A(m)^{2}}{m}\le\frac{U(N)}{N}+O(N\log^{2}N)=O(N\log^{2}N).$$
Thus
$$\sum_{m=1}^{N}\frac{A(m)^{2}}{m}=O(N\log^{2}N). \tag{14}$$
This is the central estimate of the paper.

7. From the Core Estimate to an Upper Bound for the Mean-Square Integral

For $t\in[n,n+1)$, we have $\psi(t)=\psi(n)$, hence
$$\psi(t)-t=\psi(n)-t=\bigl(\psi(n)-n\bigr)+(n-t)=A(n)+(n-t).$$
Let $u=t-n\in[0,1)$. Then
$$\int_n^{n+1}\bigl(\psi(t)-t\bigr)^{2}\,dt=\int_0^{1}\bigl(A(n)-u\bigr)^{2}\,du=A(n)^{2}-A(n)+\frac13.$$
Summing from $n=2$ to $X$ yields
$$\int_2^{X}\bigl(\psi(t)-t\bigr)^{2}\,dt=\sum_{n\le X}A(n)^{2}-\sum_{n\le X}A(n)+O(X).$$
By (14), we have $\sum_{n=1}^{N}\frac{A(n)^{2}}{n}=O(N\log^{2}N)$, and using the obvious inequality $1\le n\le N$, we can bound $\sum_{n=1}^{N}A(n)^{2}$:
$$\sum_{n=1}^{N}A(n)^{2}=\sum_{n=1}^{N}n\cdot\frac{A(n)^{2}}{n}\le N\sum_{n=1}^{N}\frac{A(n)^{2}}{n}=O(N^{2}\log^{2}N).$$
So, we have $\sum_{n\le X}A(n)^{2}=O(X^{2}\log^{2}X)$. By Cauchy-Schwarz, we get
$$\Bigl|\sum_{n\le X}A(n)\Bigr|\le\Bigl(\sum_{n\le X}1\Bigr)^{1/2}\Bigl(\sum_{n\le X}A(n)^{2}\Bigr)^{1/2}=O(X^{1/2}\cdot X\log X)=O(X^{3/2}\log X),$$
which is negligible compared with $O(X^{2}\log^{2}X)$. Therefore
$$\int_2^{X}\bigl(\psi(t)-t\bigr)^{2}\,dt=O(X^{2}\log^{2}X). \tag{17}$$
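The unit-interval computation used in this section, $\int_0^{1}(A-u)^{2}\,du=A^{2}-A+\frac13$, can be double-checked with exact arithmetic. The sketch below is an illustration only: it relies on the standard fact that Simpson's rule is exact for polynomials of degree at most three, and the function names `simpson_unit` and `closed_form` are hypothetical.

```python
from fractions import Fraction


def simpson_unit(a):
    """Simpson's rule for u -> (a - u)^2 on [0, 1].

    The integrand is a quadratic polynomial in u, so the rule returns
    the exact value of the integral.
    """
    def f(u):
        return (a - u) ** 2

    return (f(Fraction(0)) + 4 * f(Fraction(1, 2)) + f(Fraction(1))) / 6


def closed_form(a):
    """The value a^2 - a + 1/3 stated in the text."""
    return a * a - a + Fraction(1, 3)


# Arbitrary rational sample values standing in for A(n).
for a in [Fraction(0), Fraction(1), Fraction(-7, 3), Fraction(22, 5)]:
    assert simpson_unit(a) == closed_form(a)
```

Rational test values are used so that the comparison is exact; since both sides are polynomials in $A$, agreement at these points is consistent with the identity for real $A(n)$ as well.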

8. Convergence of the Integral

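Theorem 3.
For every $\varepsilon>0$, we have $\int_1^{\infty}\frac{|\psi(x)-x|}{x^{\frac32+\varepsilon}}\,dx<\infty$, i.e., the integral $\int_1^{\infty}\frac{\psi(x)-x}{x^{\frac32+\varepsilon}}\,dx$ converges absolutely.
Proof. 
For any $\varepsilon>0$, we have
$$\int_1^{\infty}\frac{|\psi(x)-x|}{x^{\frac32+\varepsilon}}\,dx\le\int_1^{2}\frac{|\psi(x)-x|}{x^{\frac32+\varepsilon}}\,dx+\sum_{n=1}^{\infty}\int_{2^{n}}^{2^{n+1}}\frac{|\psi(x)-x|}{x^{\frac32+\varepsilon}}\,dx.$$
Let us examine the integral with weight $x^{-\frac32-\varepsilon}$ on the dyadic interval $[2^{n},2^{n+1}]$. Using (17), we have
$$\int_{2^{n}}^{2^{n+1}}\bigl(\psi(x)-x\bigr)^{2}\,dx=O\bigl(2^{2(n+1)}\log^{2}(2^{n+1})\bigr)=O\bigl(2^{2n}\log^{2}(2^{n+1})\bigr),$$
and by Cauchy-Schwarz, we get
$$\int_{2^{n}}^{2^{n+1}}\frac{|\psi(x)-x|}{x^{\frac32+\varepsilon}}\,dx\le(2^{n})^{-\frac32-\varepsilon}\Bigl(\int_{2^{n}}^{2^{n+1}}1^{2}\,dx\Bigr)^{1/2}\Bigl(\int_{2^{n}}^{2^{n+1}}\bigl(\psi(x)-x\bigr)^{2}\,dx\Bigr)^{1/2}\ll(2^{n})^{-\frac32-\varepsilon}\cdot(2^{n})^{1/2}\cdot 2^{n}\log(2^{n+1})=(n+1)\,2^{-n\varepsilon}\log 2,$$
and the right-hand side is $\ll(n+1)\,2^{-n\varepsilon}$. Hence $\int_{2^{n}}^{2^{n+1}}\frac{|\psi(x)-x|}{x^{\frac32+\varepsilon}}\,dx\ll(n+1)\,2^{-n\varepsilon}$, and the series $\sum_{n=1}^{\infty}(n+1)\,2^{-n\varepsilon}$ converges. Adding the finite contribution from $[1,2]$, we obtain $\int_1^{\infty}\frac{|\psi(x)-x|}{x^{\frac32+\varepsilon}}\,dx<\infty$. This completes the proof. □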
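The convergence of the majorant series $\sum_{n\ge 1}(n+1)\,2^{-n\varepsilon}$ used in the last step can be made concrete. With $r=2^{-\varepsilon}\in(0,1)$, the standard identity $\sum_{n\ge 0}(n+1)r^{n}=(1-r)^{-2}$ gives the value $(1-r)^{-2}-1$. The Python sketch below compares partial sums with this closed form; the function names and the sample value $\varepsilon=0.5$ are assumptions for the illustration.

```python
def dyadic_series_partial(eps, terms):
    """Partial sum of (n + 1) * 2^(-n * eps) for n = 1..terms."""
    r = 2.0 ** (-eps)
    return sum((n + 1) * r ** n for n in range(1, terms + 1))


def dyadic_series_limit(eps):
    """Closed form: sum_{n>=0} (n+1) r^n = 1/(1-r)^2, minus the n = 0 term."""
    r = 2.0 ** (-eps)
    return 1.0 / (1.0 - r) ** 2 - 1.0


eps = 0.5
for terms in (10, 50, 400):
    print(terms, dyadic_series_partial(eps, terms), dyadic_series_limit(eps))
```

The partial sums increase towards the closed-form value; for smaller $\varepsilon$ the limit grows like $(\varepsilon\log 2)^{-2}$, so more terms are needed before the two numbers agree closely.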

9. Conclusions

We have shown that elementary estimates of the Chebyshev function lead to $U(N)=O(N^{2}\log N)$, where $U(N)=\sum_{k=1}^{N}k\,A(N/k)^{2}$ and $A(x)=\psi(x)-\lfloor x\rfloor$. From this we derive the core estimate $\sum_{m=1}^{N}\frac{A(m)^{2}}{m}=O(N\log^{2}N)$. Using only this core estimate and the Cauchy-Schwarz inequality, we obtain
$$\int_2^{X}\bigl(\psi(t)-t\bigr)^{2}\,dt=O(X^{2}\log^{2}X).$$
Consequently, $\int_1^{\infty}\frac{|\psi(x)-x|}{x^{\frac32+\varepsilon}}\,dx<\infty$ holds, so the integral $\int_1^{\infty}\frac{\psi(x)-x}{x^{\frac32+\varepsilon}}\,dx$ converges absolutely for every $\varepsilon>0$. All arguments are purely elementary, avoiding complex analysis and unproven hypotheses. The estimate $O(X^{2}\log^{2}X)$ is unconditional and best possible with elementary methods. Since the integral $\int_1^{\infty}\frac{\psi(x)-x}{x^{\frac32+\varepsilon}}\,dx$ converges absolutely for every $\varepsilon>0$, it also converges conditionally for every $\varepsilon>0$; and since conditional convergence of this integral for every $\varepsilon>0$ is equivalent to RH, the Riemann hypothesis is true.

Appendix A. A Theorem on Integral Convergence, Pointwise Bound and Analyticity

Theorem A1.
Let $\psi(x)=\sum_{n\le x}\Lambda(n)$ be the Chebyshev function. Define
$$a_n=\Lambda(n)-1,\qquad A(x)=\sum_{n\le x}a_n=\psi(x)-\lfloor x\rfloor.$$
For every $\varepsilon>0$ define
$$I_{\mathrm{abs}}(\varepsilon)=\int_1^{\infty}\frac{|\psi(x)-x|}{x^{\frac32+\varepsilon}}\,dx,\qquad I_{\mathrm{cond}}(\varepsilon)=\int_1^{\infty}\frac{\psi(x)-x}{x^{\frac32+\varepsilon}}\,dx.$$
Note that $\psi(x)-x=A(x)-\{x\}$, where $\{x\}=x-\lfloor x\rfloor$ is the fractional part, bounded by $1$. Hence the convergence of $I_{\mathrm{cond}}(\varepsilon)$ is equivalent to the convergence of $\int_1^{\infty}A(x)\,x^{-\frac32-\varepsilon}\,dx$ (since the integral of $\{x\}\,x^{-\frac32-\varepsilon}$ converges absolutely). Therefore we may work with $A(x)$ instead of $\psi(x)-x$.
Then the following statements are equivalent:
(1). Absolute convergence: $I_{\mathrm{abs}}(\varepsilon)<\infty$ for all $\varepsilon>0$.
(2). Conditional convergence: $I_{\mathrm{cond}}(\varepsilon)$ converges for all $\varepsilon>0$.
(3). Pointwise $o$-bound: $|\psi(x)-x|=o(x^{1/2+\varepsilon})$ for all $\varepsilon>0$ (hence also $O(x^{1/2+\varepsilon})$).
(4). Analyticity: The functions
$$f(s)=\int_1^{\infty}\frac{|\psi(x)-x|}{x^{s+1}}\,dx,\qquad g(s)=\int_1^{\infty}\frac{\psi(x)-x}{x^{s+1}}\,dx$$
are analytic in the half-plane $\operatorname{Re}(s)>\frac12$.
The details of the proof are presented below, consisting of two parts: A.1. Local bounded variation estimate, A.2. Proof of the equivalences.

Appendix A.1. Local Bounded Variation Estimate

Lemma A1.
There exists an absolute constant $C>0$ such that for all $x\ge 2$ and all $h$ with $1\le h\le x$,
$$|\psi(x+h)-\psi(x)|\le C\,h\log x. \tag{A1}$$
Proof. 
The interval $(x,x+h]$ contains at most $\lfloor h\rfloor+1\le h+1$ integers. For any integer $n$ in this interval, if $n=p^{k}$ is a prime power, then
$$\Lambda(n)=\log p\le\log n\le\log(x+h);$$
otherwise $\Lambda(n)=0$. Moreover,
$$\psi(x+h)-\psi(x)=\sum_{x<n\le x+h}\Lambda(n).$$
Hence
$$|\psi(x+h)-\psi(x)|\le(h+1)\log(x+h).$$
Because $h\le x$, we have $x+h\le 2x$ and $\log(x+h)\le\log(2x)=\log 2+\log x$. For $h\ge 1$ we also have $h+1\le 2h$. Thus $|\psi(x+h)-\psi(x)|\le 2h(\log 2+\log x)$. Since $x\ge 2$ gives $\log 2\le\log x$, the right-hand side is at most $4h\log x$, so choosing $C=4$ gives the desired inequality. □
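The local estimate of Lemma A1 can be spot-checked on integers. The sketch below is a finite sanity check under stated assumptions, not a proof: it tests only integer $x$ and $h$ with $2\le x\le 300$ and $1\le h\le x$, it uses the constant $C=4$ as an assumption of this example, and the names `psi_prefix`, `X_MAX`, `PSI` and `violations` are hypothetical.

```python
from math import log


def psi_prefix(n_max):
    """Return a list t with t[n] = psi(n) for 0 <= n <= n_max."""
    lam = [0.0] * (n_max + 1)
    composite = bytearray(n_max + 1)
    for p in range(2, n_max + 1):
        if composite[p]:
            continue
        for q in range(p * p, n_max + 1, p):
            composite[q] = 1
        pk = p
        while pk <= n_max:  # Lambda(p^k) = log p for every power p^k
            lam[pk] = log(p)
            pk *= p
    table = [0.0] * (n_max + 1)
    for n in range(1, n_max + 1):
        table[n] = table[n - 1] + lam[n]
    return table


X_MAX = 300
PSI = psi_prefix(2 * X_MAX)
C = 4.0  # assumed constant for this check

violations = [
    (x, h)
    for x in range(2, X_MAX + 1)
    for h in range(1, x + 1)
    if PSI[x + h] - PSI[x] > C * h * log(x) + 1e-9
]
print("pairs (x, h) violating the bound:", violations)
```

A prefix table is used so that each difference $\psi(x+h)-\psi(x)$ costs a single subtraction; the small tolerance `1e-9` only guards against floating-point rounding.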

Appendix A.2. Proof of the Equivalences

We prove the chain (1) ⇒ (3) ⇒ (1) and (2) ⇒ (3) ⇒ (2). Furthermore, (1) ⇒ (4) and (4) ⇒ (1). Together these form an equivalence chain.
Proof of (1) ⇒ (3). Assume $I_{\mathrm{abs}}(\varepsilon)<\infty$ for every $\varepsilon>0$. Fix $\varepsilon>0$. If the $o$-bound were false, then there would exist $\delta>0$ and a sequence $x_n\to\infty$ such that
$$|\psi(x_n)-x_n|\ge\delta\,x_n^{1/2+\varepsilon}.$$
Set $h_n=x_n^{1/2}$. For sufficiently large $n$ we have $h_n\ge 1$ and $h_n\le x_n$; thus (A1) applies. For any $t\in[x_n,x_n+h_n]$,
$$|\psi(t)-\psi(x_n)|\le C\,h_n\log x_n.$$
Then
$$|\psi(t)-t|\ge|\psi(x_n)-x_n|-|\psi(t)-\psi(x_n)|-|t-x_n|\ge\delta\,x_n^{1/2+\varepsilon}-C\,x_n^{1/2}\log x_n-x_n^{1/2}.$$
Since $x_n^{\varepsilon}\to\infty$, for large $n$ we have $\delta\,x_n^{\varepsilon}\ge 2(C\log x_n+1)$, whence
$$|\psi(t)-t|\ge\frac{\delta}{2}\,x_n^{1/2+\varepsilon}\qquad(t\in[x_n,x_n+h_n]). \tag{A4}$$
Now estimate the integral over $[x_n,x_n+h_n]$. We have $t\le x_n+h_n\le 2x_n$ (as $h_n\le x_n$), so $t^{\frac32+\varepsilon}\le(2x_n)^{\frac32+\varepsilon}=2^{3/2+\varepsilon}x_n^{3/2+\varepsilon}$. Using (A4) we obtain $\frac{|\psi(t)-t|}{t^{\frac32+\varepsilon}}\ge\frac{\delta}{2^{\frac52+\varepsilon}}\,x_n^{-1}$. Hence
$$\int_{x_n}^{x_n+h_n}\frac{|\psi(t)-t|}{t^{3/2+\varepsilon}}\,dt\ge\frac{\delta}{2^{5/2+\varepsilon}}\,x_n^{-1}\cdot h_n=\frac{\delta}{2^{5/2+\varepsilon}}\,x_n^{-1/2}. \tag{A5}$$
Choose a subsequence (still denoted by $x_n$) such that the intervals $[x_n,x_n+h_n]$ are pairwise disjoint and satisfy $x_n\le n^{2}$ (this is always possible by thinning the sequence if necessary). Then $\sum_n x_n^{-1/2}\ge\sum_n 1/n=\infty$. Because the intervals are disjoint, (A5) gives
$$I_{\mathrm{abs}}(\varepsilon)\ge\frac{\delta}{2^{5/2+\varepsilon}}\sum_n x_n^{-1/2}=\infty,$$
contradicting the convergence of $I_{\mathrm{abs}}(\varepsilon)$. Hence $|\psi(x)-x|=o(x^{\frac12+\varepsilon})$.
Proof of (3) ⇒ (1). If $|\psi(x)-x|=o(x^{\frac12+\varepsilon})$ for all $\varepsilon>0$, then for any fixed $\varepsilon>0$ there exists $X_0$ such that for all $x\ge X_0$, we have $|\psi(x)-x|\le x^{\frac12+\frac{\varepsilon}{2}}$. Thus $\frac{|\psi(x)-x|}{x^{\frac32+\varepsilon}}\le x^{-1-\frac{\varepsilon}{2}}$, which is integrable on $[1,\infty)$. Hence $I_{\mathrm{abs}}(\varepsilon)<\infty$.
Proof of (2) ⇒ (3). Assume that $I_{\mathrm{cond}}(\varepsilon)$ converges conditionally for every $\varepsilon>0$. Since $\psi(x)-x=A(x)-\{x\}$ and the integral of $\{x\}\,x^{-\frac32-\varepsilon}$ converges absolutely, the conditional convergence of $I_{\mathrm{cond}}(\varepsilon)$ implies the convergence of $\int_1^{\infty}\frac{A(x)}{x^{\frac32+\varepsilon}}\,dx$. Now write $A(x)=\sum_{n\le x}a_n$. For $\operatorname{Re}(s)>1$, integration by parts gives
$$\int_1^{\infty}A(x)\,x^{-s-1}\,dx=\frac{1}{s}\sum_{n=1}^{\infty}\frac{a_n}{n^{s}}.$$
The right-hand side is analytic for $\operatorname{Re}(s)>1$. We now invoke the following standard result on Dirichlet series:
Lemma A2.
If a Dirichlet series $D(s)=\sum_{n=1}^{\infty}\frac{a_n}{n^{s}}$ converges at a point $s=\sigma_0$ with $\sigma_0>0$, then its coefficient partial sums satisfy $A(N):=\sum_{n\le N}a_n=O(N^{\sigma_0})$.
A short proof of $A(N)=O(N^{\sigma_0})$ using Abel summation is as follows. Let $H(N)=\sum_{n=1}^{N}\frac{a_n}{n^{\sigma_0}}$. Since the sequence $H(N)$ converges, it is bounded, say $|H(N)|\le M$. Then by partial summation, we have
$$A(N)=\sum_{n=1}^{N}a_n=\sum_{n=1}^{N}\frac{a_n}{n^{\sigma_0}}\cdot n^{\sigma_0}=H(N)\,N^{\sigma_0}-\sum_{k=1}^{N-1}H(k)\bigl((k+1)^{\sigma_0}-k^{\sigma_0}\bigr).$$
By the telescoping sum, we have $\sum_{k=1}^{N-1}\bigl((k+1)^{\sigma_0}-k^{\sigma_0}\bigr)=N^{\sigma_0}-1$. Therefore,
$$\Bigl|\sum_{k=1}^{N-1}H(k)\bigl((k+1)^{\sigma_0}-k^{\sigma_0}\bigr)\Bigr|\le M\,(N^{\sigma_0}-1)=O(N^{\sigma_0}).$$
Thus, $A(N)=O(N^{\sigma_0})$. That is to say, if the integral converges at a point $s=\sigma_0$ (which is equivalent to the convergence of the Dirichlet series at $s=\sigma_0$), then the partial sums of the coefficients satisfy
$$\sum_{n\le N}a_n=O(N^{\sigma_0}). \tag{A7}$$
Here the convergence of the integral at $s=\frac12+\varepsilon$ (for every $\varepsilon>0$) implies that the Dirichlet series $\sum_{n=1}^{\infty}\frac{a_n}{n^{s}}$ converges at $s=\frac12+\varepsilon$. Therefore (A7) holds with $\sigma_0=\frac12+\varepsilon$, giving
$$\psi(N)-N=A(N)=O(N^{\frac12+\varepsilon}).$$
Extending the bound to real $x$. Let $x\ge 2$ and set $N=\lfloor x\rfloor$. Since $\psi$ is constant on $[N,N+1)$, we have $\psi(x)=\psi(N)$. Hence
$$|\psi(x)-x|=|\psi(N)-x|\le|\psi(N)-N|+|N-x|\le C\,N^{\frac12+\varepsilon}+1.$$
Because $N\le x$ and $1=O(x^{\frac12+\varepsilon})$, we conclude $|\psi(x)-x|=O(x^{\frac12+\varepsilon})$ for every $\varepsilon>0$, i.e., for every $\delta>0$ there exists a constant $C=C(\delta)>0$ such that $|\psi(x)-x|\le C\,x^{\frac12+\delta}$ for all sufficiently large $x$. This actually implies the stronger $o$-bound: given any $\varepsilon>0$, choose $\delta=\varepsilon/2$; then $|\psi(x)-x|\le C\,x^{\frac12+\delta}=o(x^{\frac12+\varepsilon})$.
Proof of (3) ⇒ (2). If $|\psi(x)-x|=o(x^{\frac12+\varepsilon})$ for all $\varepsilon>0$, then for any fixed $\varepsilon>0$ there exists $X_0$ such that for $x\ge X_0$, we have $|\psi(x)-x|\le x^{\frac12+\frac{\varepsilon}{2}}$. Hence
$$\Bigl|\frac{\psi(x)-x}{x^{\frac32+\varepsilon}}\Bigr|\le x^{-1-\frac{\varepsilon}{2}},$$
which is absolutely integrable. Therefore $I_{\mathrm{cond}}(\varepsilon)$ converges absolutely (and hence conditionally).
Thus (1), (2), (3) are all equivalent. The equivalence with analyticity of $f(s)$ and $g(s)$ follows from standard properties of Dirichlet integrals: absolute convergence in a half-plane implies analyticity, and analyticity implies convergence on the boundary. That is to say, (4) is equivalent to (1) and (2).
Proof of (1) ⇒ (4). Assume $I_{\mathrm{abs}}(\varepsilon)<\infty$ for all $\varepsilon>0$. Let $\sigma_0>1/2$ be arbitrary and set $\varepsilon=(\sigma_0-\frac12)/2$. Then there exists a constant $C>0$ such that
$$|\psi(x)-x|\le C\,x^{\frac12+\varepsilon}$$
for all large $x$. For $\operatorname{Re}(s)\ge\sigma_0$, we have
$$\Bigl|\frac{\psi(x)-x}{x^{s+1}}\Bigr|\le C\,x^{\frac12+\varepsilon-\sigma_0-1}=C\,x^{-1-\varepsilon},$$
and $\int_1^{\infty}x^{-1-\varepsilon}\,dx<\infty$. Hence the integral defining $f(s)$ converges uniformly on any half-plane $\operatorname{Re}(s)\ge\sigma_0$. The integrand is analytic in $s$, so by the Weierstrass theorem for parameter-dependent integrals, $f(s)$ is analytic in $\operatorname{Re}(s)>\frac12$. The same estimate applies to $g(s)$ because $|g(s)|\le f\bigl(\operatorname{Re}(s)\bigr)$. Thus (4) holds.
Proof of (4) ⇒ (1). If $f(s)$ is analytic for $\operatorname{Re}(s)>1/2$, then in particular for any $\varepsilon>0$, the point $s=\frac12+\varepsilon$ lies in the domain of analyticity, so the integral $f(\frac12+\varepsilon)=\int_1^{\infty}|\psi(x)-x|\,x^{-\frac32-\varepsilon}\,dx$ converges, i.e., $I_{\mathrm{abs}}(\varepsilon)<\infty$. The same reasoning applied to $g(s)$ shows that $I_{\mathrm{cond}}(\varepsilon)$ converges, but (1) follows directly. Hence (1) and (4) are equivalent. By the equivalence of (1), (2), and (3), statement (4) is also equivalent to (2) and to (3). Thus (1), (2), (3), and (4) are all equivalent. This completes the proof of Theorem A1. □

Appendix A.3. Remarks

For the Chebyshev function $\psi(x)$, the following are equivalent: the absolute convergence of the integral $\int_1^{\infty}|\psi(x)-x|\,x^{-\frac32-\varepsilon}\,dx$ for all $\varepsilon>0$; the conditional convergence of the integral $\int_1^{\infty}\bigl(\psi(x)-x\bigr)\,x^{-\frac32-\varepsilon}\,dx$ for all $\varepsilon>0$; the pointwise estimate $|\psi(x)-x|=o(x^{\frac12+\varepsilon})$ for all $\varepsilon>0$; and the analyticity of the Dirichlet-type integrals $f(s)$ and $g(s)$ in $\operatorname{Re}(s)>1/2$. These equivalences rely on the local bounded variation of $\psi$ and standard Dirichlet series theory. They are included as a theoretical complement to the main paper, where we have directly proved the absolute convergence of the integral $\int_1^{\infty}\bigl(\psi(x)-x\bigr)\,x^{-\frac32-\varepsilon}\,dx$ via the mean-square estimate $\int_2^{X}\bigl(\psi(t)-t\bigr)^{2}\,dt=O(X^{2}\log^{2}X)$ and a dyadic decomposition. In short, the main proof does not rely on this appendix; the appendix is only a theoretical complement that shows the equivalence of the integral convergence, the pointwise $o$-bound, and the analyticity of $f(s)$, as well as the consequence of the analyticity of $g(s)$. It provides additional insight into the relationships between these properties for the Chebyshev function.

References

  1. von Koch, H. Sur la distribution des nombres premiers. Acta Math. 1901, 24, 159–182.
  2. Ingham, A. E. The Distribution of Prime Numbers; Cambridge Tracts in Mathematics and Mathematical Physics, No. 30; Cambridge University Press: Cambridge, 1932 (reprinted 1990, Cambridge Mathematical Library).
  3. Selberg, A. An elementary proof of the prime-number theorem. Ann. Math. 1949, 50(2), 305–313.
  4. Erdős, P. On a new method in elementary number theory which leads to an elementary proof of the prime number theorem. Proc. Natl. Acad. Sci. USA 1949, 35(7), 374–384.
  5. Hardy, G. H.; Littlewood, J. E.; Pólya, G. Inequalities, 2nd ed.; Cambridge University Press, 1952.
  6. Edwards, Harold M. Riemann’s Zeta Function; Academic Press: New York, 1974.
  7. Apostol, Tom M. Introduction to Analytic Number Theory; Springer-Verlag: New York; Heidelberg/Berlin, 1976.
  8. Hardy, G. H.; Wright, E. M. An Introduction to the Theory of Numbers, 5th ed.; Posts & Telecom Press under licence from Oxford University Press: Beijing, 2007.
  9. Tenenbaum, G. Introduction to Analytic and Probabilistic Number Theory; American Mathematical Society: Providence, RI, 2015.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
