Optimal Rotational Smoothing on <em>S</em><sup>1</sup>: Why Only the Poisson Kernel Survives on the Circle

Dustyn Stanley

doi:10.20944/preprints202505.2259.v1

Submitted:

27 May 2025

Posted:

29 May 2025

You are already at the latest version

Abstract

Circular signals—angles, phases, orientations—pervade modern science and engineering yet resist straightforward linear smoothing techniques.Context. Circular data arise in disciplines ranging from wind forecasting to phase-unwrapping and cryo-EM.Problem. Despite a century of practice, no consensus exists on which smoothing kernel on the unit circle achieves the optimal balance between symmetry, stability and spectral localisation.Method. We formulate six axioms—(1) reality & evenness, (2) unit mass, (3) an analytic strip with simple poles, (4) a single inflection, (5) positive-definiteness, and (6) a half-height bandwidth criterion—and fuse contour integration, Paley–Wiener theory and Bochner–Herglotz positivity to analyse their joint implications. A residue calculation converts the analytic-strip constraint into an exponential Fourier envelope, while positivity confines residue phases, and normalisation plus curvature conditions fix the remaining scalar.Result. The only kernel satisfying all six axioms is the Poisson (order-1 Butterworth) family Kₐ(φ) = sinh a ⁄ (cosh a − cos φ), a > 0,uniquely determined up to its bandwidth parameter.Implications. Numerical experiments in denoising, spectral-leakage suppression and wind-direction forecasting confirm that once the axioms are accepted, practitioners should “pick their a, not their window,” and adopt Kₐ as the default circular smoother.

Keywords:

positive‐definite kernel

;

harmonic analysis

;

signal processing

;

Fourier series

;

Paley–Wiener

Subject:

Physical Sciences - Mathematical Physics

1. Introduction

1.1. Historical Background

Early circular smoothing (Rayleigh, 1870s).

Lord Rayleigh’s Theory of Sound (1877) already hints at kernel–based averaging on the unit circle [1].1 The guiding intuition was to tame high–frequency ripples that complicate the inversion of diffraction integrals; Rayleigh’s remedy was to pre–filter with an ad hoc bell–shaped window. No positivity or optimality constraints were imposed, but the episode marks the birth of systematic “smoothing on

S^{1}

.’’

The Butterworth revolution (1930s–40s).

Seeking an electrical filter with maximally flat passband, Stephen Butterworth derived his famous magnitude response in 1930, first for the line and soon after for periodic signals in radar [2]. The order–m Butterworth family

B_{m, ω_{0}} (φ) = {(1 + {(\frac{φ}{ω_{0}})}^{2 m})}^{- 1 / 2}

was quickly adopted in analogue circuitry, seismology and, later, fast Fourier–based digital filtering. Yet its introduction also triggered a debate: which value of m optimally reconciles smoothness, localisation and numerical stability on the circle?

Poisson kernel in potential theory (mid–20^th century).

Independently, potential theorists studying harmonic extension from the disc interior to its boundary elevated the Poisson kernel

P_{a} (φ) = \frac{sinh a}{cosh a - cos φ}, a > 0,

to canonical status [3,4]. Because

P_{a}

reproduces harmonic functions and enjoys a closed–form Fourier spectrum

e^{- a | n |}

, it soon permeated signal processing, phase unwrapping and circular statistics.

The axiomatic gap.

Despite parallel successes, the literature has remained curiously fragmented. Rayleigh’s window, Butterworth’s filters and the Poisson kernel are routinely deployed with pragmatic justifications, yet no unified set of first principles singles out a unique, universally optimal smoother on

S^{1}

. Earlier attempts either focus on passband flatness (favouring high Butterworth orders) or on harmonic–extension fidelity (favouring

P_{a}

), but none prove uniqueness under a transparent collection of physically/analytically natural axioms. Closing this gap is the main purpose of the present work.

1.2. Problem Statement

Our objective is to characterise the unique kernel

K : S^{1} ⟶ R,

up to a bandwidth parameter

a > 0

, that satisfies the following six axioms (formal definitions in §3):

Symmetry (Reality & Evenness). $K (φ) = K (- φ) \in R$ for every $φ \in [- π, π]$ .
Normalisation. $\int_{- π}^{π} K (φ) \frac{d φ}{2 π} = 1$ .
Analytic Strip & Simple Poles.K extends meromorphically to the strip $\{z \in C : | ℑ z | < a\}$ with exactly two simple poles at $z = \pm i a$ .
Single–Inflection (Order–1 Flatness). The second derivative $K^{''}$ changes sign exactly once on $(0, π)$ .
Positive–Definiteness. The Fourier coefficients $a_{n} : = \int_{- π}^{π} K (φ) e^{- i n φ} \frac{d φ}{2 π}$ satisfy $a_{n} \geq 0$ for all $n \in Z$ .
Half–Height Condition. There exists a unique $φ_{1 / 2} \in (0, π)$ such that $K (φ_{1 / 2}) = \frac{1}{2} K (0)$ .

Taken together, A1–A6 carve out precisely the order-1 Butterworth (a.k.a. Poisson) family

K_{a} (φ) = \frac{sinh a}{cosh a - cos φ}, a > 0,

as will be proved in Theorem 2.

Practical motivation.

Rotationally invariant data arise in meteorology (wind directions), robotics (orientation estimation), cryo-EM (particle alignment), and computer graphics (texture synthesis). In all these settings we need a circular smoother that:

preserves the mean direction while attenuating high-frequency noise (A2, A5);
acts uniformly under rotation, crucial for orientation-free denoising and for spectral window design in FFT-based pipelines (A1);
admits closed-form Fourier coefficients to enable $O (N log N)$ filtering on discretised circles;
offers a single interpretable parameter a that directly controls both spatial spread and spectral roll-off (A3, A6).

By proving that A1–A6 force

K = K_{a}

, we supply a rigorous answer to the long-standing question: which kernel is inherently best suited for rotationally invariant smoothing on

S^{1}

?

1.3. Our Contributions

This paper makes four interlocking advances that, taken together, resolve the century–old ambiguity in circular smoothing and establish an analytically and practically optimal kernel.

Axiomatic foundation (§3). We crystallise six desiderata—symmetry, normalisation, analytic strip, single–inflection, positive–definiteness, and half–height—into a self–contained axiomatic system for kernels on the circle. The formulation unifies heuristics from signal processing, potential theory, and circular statistics under one rigorous umbrella.
Uniqueness theorem (Thm. 2). Leveraging contour integration, Paley–Wiener decay, and Bochner–Herglotz positivity, we prove that Axioms A1–A6 isolate exactly the order–1 Butterworth (Poisson) family

$K_{a} (φ) = \frac{sinh a}{cosh a - cos φ}, a > 0,$

thereby settling which smoother is inherently preferred on $S^{1}$ .
Explicit Fourier spectrum A residue calculation around a rectangular contour yields the closed–form coefficients ${\hat{K}}_{a} (n) = e^{- a | n |},$ confirming exponential roll–off and enabling $O (N log N)$ FFT filtering with no numerical quadrature.
Systematic falsification of alternatives (§5). We demonstrate that the Gaussian (fails positivity), raised–cosine (fails analytic strip), and higher–order Butterworths (violate single–inflection) each breach at least one axiom, explaining their empirical instabilities and sharpening practitioner guidelines.

1.4. Paper Organisation

The exposition proceeds in a linear build–up from notation to applications, with every logical dependency flowing forward.

§2: Notation & preliminaries.: Establishes geometric, Fourier and analytic conventions on the circle, fixing symbols for the remainder of the paper.
§3: Axiomatic framework.: Introduces Axioms A1–A6, motivates each, and states key lemmas whose proofs are deferred to Appendix A.
§4: Main uniqueness theorem.: Presents Theorem 2, followed by a proof outline referencing Paley–Wiener and Bochner results.
§5: Comparative analysis.: Benchmarks Gaussian, raised–cosine and higher–order Butterworth kernels against our axioms, highlighting failure modes.
§6: Numerical illustration.: Demonstrates rotationally invariant denoising and spectral window design using $K_{a}$ , with reproducible code and FFT timings.
§7: Discussion & outlook.: Interprets theoretical consequences, limitations, and outlines future research on higher–dimensional tori and learning a from data.
Appendices A.1–A.7.: Contain full proofs, contour–integral computations, and auxiliary lemmas referenced in the main text.

2. Preliminaries and Notation

2.1. Geometry of $S^{1}$

We regard the unit circle

S^{1} = \{e^{i φ} \in C | φ \in [- π, π)\}

as a one–dimensional compact Lie group under complex multiplication. The argument map

arg : S^{1} \to [- π, π)

assigns to each point its unique principal angle, and the exponential parameterisation

exp (i φ) = e^{i φ}

identifies

S^{1}

with the quotient

R / 2 π Z

.

Arc–length measure.

The group is endowed with the bi–invariant (Haar) probability measure

d μ (φ) = \frac{d φ}{2 π}, φ \in [- π, π),

(1)

so that

\int_{- π}^{π} d μ (φ) = 1 .

Equation (1) is derived by pulling back the Euclidean arc–length element

| d z |

through the map

φ \mapsto e^{i φ}

; details appear in Appendix A.1.2

Geodesic distance and rotations.

The intrinsic (shortest–arc) distance between two points

e^{i φ_{1}}, e^{i φ_{2}} \in S^{1}

is

d_{S^{1}} (φ_{1}, φ_{2}) = |{wrap}_{π} (φ_{2} - φ_{1})|,

(2)

where

{wrap}_{π} (x) = x - 2 π ⌊(x + π) / 2 π⌋

wraps a real number to the interval

[- π, π)

. Left multiplication by

e^{i θ}

realises a rotation

R_{θ} : e^{i φ} \mapsto e^{i (φ + θ)}

, an isometry of both the metric (2) and the measure (1). Throughout, rotational invariance is always understood with respect to these structures.

2.2. Function Spaces

$L^{2} (S^{1}, d μ)$

Let

L^{2} (S^{1}, d μ) = \{f : S^{1} \to {C | ∥ f ∥}_{2}^{2} : = \int_{- π}^{π} {| f (φ) |}^{2} d μ (φ) < \infty\},

where

d μ (φ) = \frac{d φ}{2 π}

is the Haar measure from Eq. (1). The space is a separable Hilbert space under the inner product

〈 f, g 〉 = \int_{- π}^{π} f (φ) \bar{g (φ)} d μ (φ), f, g \in L^{2} (S^{1}, d μ) .

(3)

Fourier basis and Parseval identity.

For each

n \in Z

define the character

e_{n} (φ) = e^{i n φ}

. These functions form an orthonormal basis of

L^{2} (S^{1}, d μ)

3. The Fourier coefficients of

f \in L^{2}

are

\hat{f} (n) = 〈f, e_{n}〉 = \int_{- π}^{π} f (φ) e^{- i n φ} d μ (φ),

(4)

and the Parseval/Plancherel identity reads

{∥ f ∥}_{2}^{2} = \sum_{n \in Z} {|\hat{f} (n)|}^{2} .

(5)

Equation (5) justifies the positivity axiom A5 for kernels, linking pointwise positivity to non–negative spectral coefficients.

Schwartz space on the circle.

We denote by

S (S^{1})

the Schwartz space,

S (S^{1}) = \{f \in C^{\infty} (S^{1}) | \forall m \in N : sup_{φ \in [- π, π]} |{(1 - Δ_{S^{1}})}^{m} f (φ)| < \infty\},

(6)

equivalently the class of

C^{\infty}

functions whose Fourier coefficients decay faster than any power of n:

\forall r > 0, | \hat{f} {(n) | = O (| n |}^{- r})

. The nuclear Fréchet topology on

S (S^{1})

is generated by the seminorms

p_{m} (f) = {sup}_{n \in Z} {| n |}^{m} | \hat{f} (n) |

. Schwartz functions admit holomorphic extension to a complex strip, making

S (S^{1})

the natural domain for the Paley–Wiener–type decay estimates exploited in §4; technical details are spelled out in Appendix A.2.

2.3. Fourier Series Conventions

Throughout the paper we adopt the following unitary normalisation, consistent with Eq. (3) and Parseval’s identity (5).

Forward Transform.

For any

f \in L^{2} (S^{1}, d μ)

we define its Fourier coefficients (indexed by

n \in Z

) as

F {f} (n) = \hat{f} (n) = \int_{- π}^{π} f (φ) e^{- i n φ} d μ (φ) .

(7)

Inverse transform.

Conversely, the function f is recovered (in the

L^{2}

sense or pointwise whenever the series converges) via

F^{- 1} {\hat{f}} (φ) = \sum_{n \in Z} \hat{f} (n) e^{i n φ}, φ \in [- π, π) .

(8)

For

f \in S (S^{1})

the series converges absolutely and uniformly by the rapid decay of

\hat{f} (n)

.

Partial sums and Dirichlet kernel.

We denote the N-th partial Fourier sum by

S_{N} f (φ) = \sum_{n = - N}^{N} \hat{f} (n) e^{i n φ} = (f * D_{N}) (φ), D_{N} (φ) = \sum_{n = - N}^{N} e^{i n φ} = \frac{sin ((N + \frac{1}{2}) φ)}{sin (\frac{φ}{2})},

(9)

where * denotes circular convolution with respect to

d μ

. The explicit form of the Dirichlet kernel

D_{N}

will be used in §6 to analyse Gibbs oscillations.

Convolution theorem.

If

f, g \in L^{2} (S^{1}, d μ)

their convolution

(f * g) (φ) = \int_{- π}^{π} f (φ - θ) g (θ) d μ (θ)

satisfies

\hat{f * g} (n) = \hat{f} (n) \hat{g} (n), n \in Z,

(10)

a fact we exploit in §6 to implement

K_{a}

filtering via element–wise multiplication in the frequency domain.

2.4. Complex–Analytic Notation

Let

a > 0

be fixed throughout. We exploit the standard dictionary between angular and complex variables,

φ ⟷ z : = φ + i η \in C,

and adopt the following conventions.

Horizontal strips and their boundaries.

\begin{matrix} S_{a} & = \{z \in C | | Im z | < a\}, & (open strip of half - - height a), \end{matrix}

(11)

\begin{matrix} {\bar{S}}_{a} & = \{z \in C | | Im z | \leq a\}, & (closed strip), \end{matrix}

(12)

\begin{matrix} L_{\pm a} & = \{x \pm i a \in C | x \in R\}, & (upper & lower boundary lines) . \end{matrix}

(13)

A function F is holomorphic (resp. meromorphic) in

S_{a}

if it is complex differentiable (resp. except for isolated poles) in a neighbourhood of each point of

S_{a}

.

Poles and residues.

Given a meromorphic function F with an isolated pole at

z_{0} \in C

, we denote its residue by

Res (F, z_{0}) = \frac{1}{2 π i} \oint_{γ} F (z) d z,

where

γ

is a positively oriented circle of sufficiently small radius around

z_{0}

. For simple poles at

z = \pm i a

(as in Axiom A3) we write

{Res}_{\pm i a} F : = Res (F, \pm i a)

for brevity.

Rectangular contour.

We define the box contour

Γ_{R, a} = \partial ({z \in C | | Re z | \leq R, | Im z | \leq a\}),

oriented anticlockwise, whose horizontal segments lie on

L_{\pm a}

. Letting

R \to \infty

allows us to convert Fourier coefficients to residue sums in §??.

Exponential map.

Throughout Section 4, we set

w = e^{i z}

. Then

z \in S_{a} ⟺ e^{- a} < | w | < e^{a},

so that analytic continuation in z translates into meromorphic extension in the annulus

A_{a} : = \{w \in C | e^{- a} < | w | < e^{a}\} .

Asymptotic notation on boundary lines.

When F is holomorphic in

S_{a}

we write

F (z) = O (e^{- σ | z |}) as | z | \to \infty along L_{\pm a}

to mean there exists

σ > 0

such that the bound holds on both

L_{a}

and

L_{- a}

. This notation is used in Paley–Wiener estimates to justify neglecting the vertical sides of

Γ_{R, a}

when

R \to \infty

.

3. Axiomatic Characterisation Framework

3.1. Axiom A1: Reality & Evenness

Informal statement.

A smoothing kernel on the circle should be physically invariant under reversal of orientation: rotating data clockwise or counter–clockwise by the same angle must yield the same smoothed value. Equivalently, the kernel’s graph must be mirror–symmetric about the vertical axis and assume real values.

Formal axiom.

Rationale.

Physical symmetry. On a ring, reversing the parameterisation $φ \mapsto - φ$ is an isometry; any physically meaningful smoothing operation must commute with this reflection.
Energy conservation. Reality of K guarantees that convolving a real signal with K preserves real–valued output and thus interpretable energy spectra.
Spectral parity. As shown below, A1 forces $\hat{K} (- n) = \hat{K} (n) \geq 0$ , simplifying positivity arguments (A5) by eliminating sine coefficients.

Immediate Spectral Consequence.

Proposition 1.

Let

K \in L^{2} (S^{1}, d μ)

satisfy Axiom A1. Then

\hat{K} (- n) = \bar{\hat{K} (n)} = \hat{K} (n), n \in Z,

so every sine coefficient vanishes and the Fourier expansion of K is purely cosine:

K (φ) = \sum_{n = 0}^{\infty} a_{n} cos (n φ) .

Proof.

Reality implies

\hat{K} (- n) = \bar{\hat{K} (n)}

by Eq. (4). Evenness gives

\hat{K} (- n) = \hat{K} (n)

, so their conjunction yields

\hat{K} (n) \in R

. Setting

n < 0

and taking conjugates forces

\hat{K} (n) = 0

for all odd sine components, establishing the cosine form. □

Example.

The Poisson kernel

K_{a} (φ) = sinh (a) / (cosh (a) - cos φ)

is real and even for every

a > 0

, hence satisfies Axiom A1.

Looking ahead.

Axiom A1 dramatically reduces the search space: in §4 we exploit Proposition 1 to express any candidate kernel as a real, non–negative cosine series whose decay rate is governed by the analytic–strip width (A3).

3.2. Axiom A2: Unit Mass / Normalisation

Informal statement.

Interpreting K as a probability density on

S^{1}

enforces a natural scale: the total “mass’’ under the kernel equals 1. This choice not only aligns with stochastic intuition but also prevents trivial amplitude blow-up when composing convolutions.

Formal axiom.

Rationale.

Probabilistic consistency. Viewing K as a circular “likelihood’’ ensures $(f * K) (φ)$ is a local average of f, thus cannot exceed ${∥ f ∥}_{\infty}$ .
Spectral anchoring. A2 fixes the zero-frequency coefficient to $\hat{K} (0) = 1$ , eliminating scale ambiguity and enabling apples-to-apples comparison among candidate kernels.
Bounded gain. Together with A5 (positivity), A2 implies $| K (φ) | \leq K (0) = 1 + \sum_{n \geq 1} a_{n} \geq 1$ , so the kernel never amplifies a signal beyond its own maximum.

Immediate consequence.

Proposition 2

(Amplitude bound). Let K satisfy Axioms A1, A2 and A5. Writing its cosine–series expansion as

K (φ) = \sum_{n \geq 0} a_{n} cos (n φ)

with

a_{0} = 1

and

a_{n} \geq 0

for

n \geq 1

, we have

| K (φ) | \leq K (0) = 1 + \sum_{n = 1}^{\infty} a_{n} \geq 1, \forall φ \in [- π, π] .

Equality

K (0) = 1

occurs iff

a_{n} = 0

for every

n \geq 1

, i.e. K is the uniform kernel.

Proof.

By A1 the Fourier expansion is

K (φ) = \sum_{n \geq 0} a_{n} cos (n φ)

with

a_{n} \in R

. A2 forces

a_{0} = 1

. A5 gives

a_{n} \geq 0

, hence

| K (φ) | \leq \sum_{n \geq 0} a_{n} = K (0) = 1

. □

Example.

For the Poisson kernel

K_{a} (φ) = sinh a / (cosh a - cos φ)

, a direct computation yields

\int_{- π}^{π} K_{a} (φ) d μ (φ) = 1

, so

K_{a}

obeys A2 for every

a > 0

.

Looking ahead.

Normalisation interacts subtly with A6 (half-height): when the latter sets

K (φ_{1 / 2}) = \frac{1}{2} K (0)

, unit mass guarantees that the bandwidth parameter a governs both spatial spread and spectral decay, simplifying the uniqueness proof in §4.

3.3. Axiom A3: Analytic Strip & Two Simple Poles

Informal statement.

A well–behaved circular kernel should admit holomorphic continuation off the real axis into a horizontal strip of finite half–height

a > 0

, with the only singularities being one simple pole in the upper half–plane and its mirror image in the lower half–plane. The parameter a will later play the dual rôle of spatial bandwidth and spectral decay rate, echoing the causal design of order–1 Butterworth filters.

Formal axiom.

Rationale.

Causal Butterworth analogue. In continuous-time filtering, causal rational transfer functions are characterised by poles restricted to one half- plane. For periodic signals this morphs into poles at $\pm i a$ , reproducing the maximally flat order–1 Butterworth magnitude response.
Bandwidth parameter. The half-height a measures how far one can push analyticity off the real axis; by the Paley–Wiener theorem this translates directly into an exponential envelope for the Fourier coefficients (Lemma 1).
Minimal singularity. Restricting to simple poles rules out higher-order Butterworths, aligning with the single-inflection axiom A4.

Exponential decay lemma.

Lemma 1

(Strip ⇒ exponential spectrum). Let

K \in L^{2} (S^{1}, d μ)

satisfy Axiom A3 for some

a > 0

. Then there exists

C > 0

such that

|\hat{K} (n)| \leq C e^{- a | n |}, \forall n \in Z .

Proof sketch. Extending K to F in

S_{a}

and shifting the Fourier integral vertically by

\pm i a

yields

\hat{K} (n) = e^{\mp a n} \int_{- π}^{π} F (x \pm i a) e^{- i n x} d μ (x)

for

n ≷ 0

. Boundedness of F on

L_{\pm a}

gives the claimed estimate. A detailed, contour-theoretic proof appears in Appendix A.3.1. □

Example.

For the Poisson kernel

K_{a} (φ) = sinh a / (cosh a - cos φ)

, one computes

F (z) = sinh a / (cosh a - cos z)

, whose only singularities are simple poles at

z = \pm i a

; hence

K_{a}

satisfies Axiom A3.

Looking ahead.

Lemma 1, combined with positivity (A5), will force the exact coefficient formula

\hat{K} (n) = e^{- a | n |}

in Theorem 2, thereby pinning down K to the Poisson/Butterworth–1 family.

3.4. Axiom A4: Single–Inflection / Half–Height

Informal statement.

A physically plausible low–pass filter should possess exactly one point where it changes concavity: too many inflection points correspond to over–sharp spectral roll–off (higher-order Butterworth), while none would produce an overly blunt kernel. We therefore demand a single inflection, which automatically fixes the location of the half–height point

φ_{1 / 2}

used in bandwidth calibration.

Formal axiom.

Rationale.

Order–1 selectivity. Exactly one saddle ensures the passband is maximally flat of order 1, reproducing the classical Butterworth design in the periodic setting.
Bandwidth interpretability. With a single point of curvature change, the half–height location $φ_{1 / 2}$ satisfies a monotone relationship with the strip width a; see Prop. 4 below.
Numerical stability. Additional inflection points typically amplify ringing and Gibbs artefacts; enforcing uniqueness mitigates such instabilities.

Second derivative for the Poisson kernel.

For completeness we record

K_{a}^{''} (φ) = \frac{sinh a [(cosh a - cos φ) cos φ - 2 {sin}^{2} φ]}{{(cosh a - cos φ)}^{3}} .

Setting the numerator to zero yields the quadratic

{cos}^{2} φ + (cosh a) cos φ - 2 = 0,

(14)

whose unique root in

(0, π)

defines

φ_{\inf} (a)

.

Proposition 3. Monotonicity of the half–height map

Let K satisfy Axioms A1–A4. Define

φ_{1 / 2} (a) \in (0, π)

implicitly for

a > 0

by

K (φ_{1 / 2}) = \frac{1}{2} K (0)

. Then

\frac{d φ_{1 / 2}}{d a} > 0, \forall a > 0,

so the map

a \mapsto φ_{1 / 2} (a)

is

C^{\infty}

andstrictly increasing. Consequently the inverse map

a (φ_{1 / 2})

exists and is given by Eq. (18).

Formal consequence.

Proposition 4.

Let K satisfy Axioms A1–A4. Then there exists a smooth, strictly decreasing function

a \mapsto φ_{1 / 2} (a)

solving

K (φ_{1 / 2}) = \frac{1}{2} K (0)

, with derivative

d φ_{1 / 2} / d a < 0

. In particular, a is uniquely determined by

φ_{1 / 2}

.

Proof. Deferred to Appendix A.4. □

Looking ahead.

Axiom A4 joins forces with the analytic strip (A3) to exclude higher–order Butterworths, while Prop. 4. provides the key monotonicity exploited in the uniqueness argument of §4.

3.5. Axiom A5: Positive–Definiteness

Informal statement.

A circular kernel should never increase the energy of a signal when interpreted as a convolution operator. Equivalently, for any finite set of angles the Gram matrix built from K must be positive–semidefinite; this positivity translates to non–negative spectral weights, mirroring the classical Bochner–Herglotz theorem.

Formal axiom.

Bochner–Herglotz theorem on $S^{1}$ .

Theorem 1

(Bochner–Herglotz, periodic form). A continuous function

K : S^{1} \to C

is positive–definite in the sense of Axiom A5 iff its Fourier coefficients are non–negative:

\hat{K} (n) \geq 0

for all

n \in Z

. Moreover,

K (φ) = \sum_{n \in Z} \hat{K} (n) e^{i n φ}

converges uniformly on

S^{1}

.

A self–contained proof, following [6], is provided in Appendix A.5.1.

Statistical interpretation.

Positive–definite kernels on

S^{1}

are precisely the covariance functions of weakly stationary circular random processes. Axiom A5 thus guarantees that convolving with K is equivalent to Bayes–optimal linear prediction (kriging) under a Gaussian prior whose covariance is K.

Spectral corollary.

Combining Theorem 1 with A1 (evenness) yields the non–negative cosine series

K (φ) = \sum_{n = 0}^{\infty} a_{n} cos (n φ), a_{n} : = \hat{K} (n) \in [0, 1],

(15)

with

a_{0} = 1

by A2. This representation is the linchpin of the uniqueness proof in §4.

Example.

For the Poisson kernel

K_{a}

we have

{\hat{K}}_{a} (n) = e^{- a | n |} \geq 0

, so Theorem 1 confirms

K_{a}

is positive–definite for all

a > 0

.

Looking ahead.

Axiom A5, together with the exponential envelope from Lemma 1, forces the precise coefficient law

a_{n} = e^{- a n}

. Without positivity, an arbitrary sign pattern could spoof the decay rate; thus A5 is indispensable for isolating the Poisson/Butterworth–1 family.

3.6. Axiom A6: Half–Height (FWHM) Equation

Informal statement.

Practitioners in signal processing, optics, and spectroscopy quantify bandwidth by the full width at half maximum (FWHM): the angular span over which the kernel’s amplitude stays above one–half of its peak. We therefore enforce a concrete “dial’’ for tuning K:

K (φ_{1 / 2}) = \frac{1}{2} K (0), φ_{1 / 2} \in (0, π) .

In particular, the half–height condition is feasible only for

0 < a \leq arcosh 3 \approx 1.763,

a bound that will be quoted in Theorem 2.

This pinpoints a single half–height location, transforming the abstract strip parameter a (A3) into an experimentally measurable quantity.

Formal axiom.

Rationale.

Operational bandwidth. Equation (16) defines the FWHM $2 φ_{1 / 2}$ , the standard knob by which engineers tune angular resolution.
Link to analytic parameter a. Under A1–A4, Proposition 4 shows the map $a \mapsto φ_{1 / 2}$ is smooth and strictly decreasing, providing a one–to–one correspondence between the physical bandwidth and strip width.
Scale–free normalisation. Because A2 sets $K (0) = 1$ , Eq. (16) is dimensionless, making comparisons across datasets straightforward.

Example and closed-form relation.

For the Poisson kernel

K_{a} (φ) = sinh a / (cosh a - cos φ),

Axiom A6 imposes

K_{a} (φ_{1 / 2}) = \frac{1}{2} K_{a} (0) = \frac{1}{2} \frac{sinh a}{cosh a - 1} .

Cancelling the common factor

sinh a

gives

cos φ_{1 / 2} = 2 - cosh a,

(17)

which is the identity proved in Appendix A.3.5. Solving (17) for a yields the closed-form

a = arcosh (2 - cos φ_{1 / 2}), 0 < a \leq arcosh 3 \approx 1.763,

(18)

where the upper bound follows from the requirement

- 1 < cos φ_{1 / 2} < 1

.

Looking ahead.

Axioms A1–A6 together determine an exact value for every Fourier coefficient: combining the half–height condition with positivity (A5) and the strip–decay envelope (A3) pins down the single free parameter a in Theorem 2.

3.7. Collective Implications

Taken together, Axioms A1–A6 triangulate practical needs drawn from physics, engineering, and statistics. Table 1 shows how each desideratum is guaranteed by (at least) one axiom; their intersection leaves precisely the Poisson/Butterworth–1 family standing.

4. Main Theorem & Proof Sketch

4.1. Statement of the Uniqueness Theorem

Theorem 2

(Uniqueness of the Poisson/Butterworth–1 kernel). Let

K : S^{1} \to R

be a continuous function that satisfies Axioms A1–A6. Then there exists a unique bandwidth parameter

a \in (0, arcosh 3]

such that

K (φ) = K_{a} (φ) : = \frac{sinh a}{cosh a - cos φ}, \forall φ \in [- π, π] .

Conversely, every Poisson/Butterworth–1 kernel

K_{a}

fulfils Axioms A1–A6.

Proof sketch.

A full, line–by–line proof occupies Appendix A.6; here we give the strategic flow.

Analytic strip ⇒ exponential envelope.

By A3 and Lemma 1, $| \hat{K} (n) | \leq C e^{- a | n |}$ for some $C > 0$ .
Residue calculus on $Γ_{R, a}$ .

A rectangular contour argument (Appendix A.5) shows that only the poles at $z = \pm i a$ contribute, giving $\hat{K} (n) = C e^{- a | n |}$ with the same constant C for all $n \in Z$ .
Positive–definiteness fixes the sign.

A5 and Theorem 1 force $C \geq 0$ ; evenness (A1) eliminates any imaginary part.
Unit mass normalises the scale.

A2 implies $\hat{K} (0) = 1$ , whence $C = 1$ .
Single inflection rules out higher orders.

If higher–order poles were present, $K^{''}$ would change sign more than once, violating A4. Hence only simple poles at $\pm i a$ occur.
Half–height bijects to a.

By Proposition 4 the mapping $a \mapsto φ_{1 / 2} (a)$ is one–to–one, so the parameter a is uniquely determined from the data side.

Combining Steps 1–6 yields

K (φ) = K_{a} (φ)

for a uniquely determined

a > 0

, completing the proof.

4.2. Key Intermediate Lemmas

The proof of Theorem 2 hinges on three analytic lemmas whose detailed demonstrations appear in the appendices.

Lemma 2

(Paley–Wiener decay). Assume K satisfies Axioms A1–A3. Then its Fourier coefficients obey the sharp bound

|\hat{K} (n)| \leq C e^{- a | n |}, \forall n \in Z,

for some constant

C = C (K, a)

depending only on K and the strip half–height a. (Proof in Appendix A.4.)

Lemma 3

(Residue formula for

\hat{K} (n)

). Let K satisfy Axioms A1–A3. Then

\hat{K} (n) = {Res}_{i a} F e^{- a | n |} + {Res}_{- i a} F e^{- a | n |}, n \in Z,

where F is the meromorphic extension from Axiom A3 and

{Res}_{\pm i a} F

denote the residues at the simple poles

z = \pm i a

. (Proof in Appendix A.5.)

Lemma 4

(Non–negativity of Fourier coefficients). If K satisfies Axioms A1,A2, andA5, then

\hat{K} (n) \geq 0, \forall n \in Z .

In particular,

\hat{K} (0) = 1

by Axiom A2. (Proof in Appendix A.6.)

4.3. Proof Skeleton

The full proof of Theorem 2 is laid out in Appendix A.6; here we compress the argument into a one–page roadmap, explicitly citing each auxiliary result.

Paley–Wiener envelope. From Lemma 2 (Appendix A.4, Eq. (A.4.2)) we have

$|\hat{K} (n)| \leq C e^{- a | n |}, n \in Z .$

(19)
Residue evaluation. Closing the contour $Γ_{R, a}$ as in Appendix A.5 gives (Lemma 3)

$\hat{K} (n) = ({Res}_{i a} F + {Res}_{- i a} F) e^{- a | n |} : = C e^{- a | n |} .$

(20)
Coefficient sign and scale. A5 $&$ Lemma 4 ⟹ $C \geq 0$ , while A2 forces $\hat{K} (0) = C = 1$ . Hence

$\hat{K} (n) = e^{- a | n |}, n \in Z .$

(21)
Inverse Fourier reconstruction. Equation (21) inverts via Eq. (8) to yield

$K (φ) = \frac{sinh a}{cosh a - cos φ}, φ \in [- π, π],$

(22)

which is precisely $K_{a}$ .
Half–height bijects to a. Proposition 4 shows the mapping $a \mapsto φ_{1 / 2} (a)$ is smooth and strictly increasing, hence invertible; therefore the bandwidth parameter a is uniquely determined by the observed half–height angle.

4.4. Geometric–Series Corollary

Corollary 1

(Geometric–series representation of

K_{a}

). Let

a > 0

and set

r = e^{- a} \in (0, 1)

. Then the Poisson/Butterworth–1 kernel admits the closed–form trigonometric expansion

K_{a} (φ) = 1 + 2 \sum_{n = 1}^{\infty} r^{n} cos (n φ) = \frac{1 - r^{2}}{1 - 2 r cos φ + r^{2}}, φ \in [- π, π] .

(23)

Proof.

Equation (21) gives

\hat{K} (0) = 1

and

\hat{K} (n) = r^{| n |}

for

n \neq 0

. Because K is even (A1), the Fourier inversion formula (8) becomes

K (φ) = \hat{K} (0) + 2 \sum_{n = 1}^{\infty} \hat{K} (n) cos (n φ) = 1 + 2 \sum_{n = 1}^{\infty} r^{n} cos (n φ),

yielding the left–hand equality in Eq. (23). For the closed–form rational expression, recognise the right-hand side as the real part of the complex geometric sum

\sum_{n = - \infty}^{\infty} r^{| n |} e^{i n φ} = (1 - r^{2}) {(1 - 2 r cos φ + r^{2})}^{- 1},

which is valid for

| r | < 1

by the standard geometric–series test. □

Remark.

Formula (23) makes explicit that

K_{a}

is the Poisson kernel of the annulus

{w \in C ∣ r < | w | < r^{- 1}}

, highlighting its role as the Green’s function for harmonic extension from the disc.

4.5. Remark on Stability

The Poisson/Butterworth–1 kernel is structurally unstable with respect to perturbations of its Fourier spectrum: even an arbitrarily small violation of the geometric law

a_{n} = e^{- a n}

destroys positive–definiteness.

Proposition 5

(Fragility under spectral perturbation). Fix

a > 0

and

r = e^{- a}

. For

ε \neq 0

define the perturbed kernel

{\tilde{K}}_{ε} (φ) = 1 + 2 (r + ε) cos φ + 2 \sum_{n = 2}^{\infty} r^{n} cos (n φ) .

Then there exists

φ_{*} \in (0, π)

such that the

2 \times 2

Gram matrix

G_{ε} : = (\begin{matrix} 1 & {\tilde{K}}_{ε} (φ_{*}) \\ {\tilde{K}}_{ε} (φ_{*}) & 1 \end{matrix})

is not positive–semidefinite. In particular,

{\tilde{K}}_{ε}

violates Axiom A5for every

ε \neq 0

, no matter how small.

Proof.

Choose

φ_{*} = 0

. Then

{\tilde{K}}_{ε} (0) = 1 + 2 ε + 2 \sum_{n = 1}^{\infty} r^{n} = 1 + 2 ε / (1 - r)

. For

ε > 0

this exceeds 1, while for

ε < 0

it is below

- 1

once

| ε |

is sufficiently small but non-zero. Either case yields

| {\tilde{K}}_{ε} (0) | > 1

, so the determinant

det G_{ε} = 1 - {\tilde{K}}_{ε} {(0)}^{2}

is negative. Hence

G_{ε}

has a negative eigenvalue and is not positive–semidefinite, contradicting Axiom A5. □

Interpretation.

Proposition 5 shows that the coefficient sequence

{e^{- a | n |}}

lies on the boundary of the convex cone of positive–definite sequences: nudging any single term off its geometric value pushes the kernel outside the cone. Practically, this warns against ad-hoc spectral tapering (e.g. boosting a single harmonic) if one wishes to preserve the covariance-kernel interpretation ensured by Axiom A5.

5. Comparative Analysis with Alternative Kernels

5.1. Raised–Cosine (Hann) Kernel

Definition and basic properties.

The raised–cosine or Hann window on

S^{1}

is

H (φ) = 1 + \frac{1}{2} cos φ + \frac{1}{4} cos (2 φ), φ \in [- π, π] .

(24)

This three–term “periodic Hann” window integrates to one, so it meets Axiom A2 by construction. Its Fourier coefficients are

\hat{H} (0) = 1, \hat{H} (\pm 1) = \frac{1}{2}, \hat{H} (\pm 2) = \frac{1}{4},

with all higher modes zero, confirming A1 (evenness) and A5 (positivity). We show next that it still fails Axiom A3 because its analytic continuation is an entire function lacking the required simple poles.

Violation of Axiom A3 (simple–pole condition).

Rewrite (24) using

cos φ = (e^{i φ} + e^{- i φ}) / 2

:

H (φ) = \frac{1}{4} e^{2 i φ} + \frac{1}{2} + \frac{1}{4} e^{- 2 i φ} .

The right-hand side is an entire function of

z = φ + i η

, extending holomorphically to all

z \in C

with no poles in any finite strip

S_{a}

. Hence condition (ii) of Axiom A3 (exactly two simple poles at

z = \pm i a

) is violated. Moreover, because H is entire, its Fourier coefficients decay only quadratically,

| \hat{H} (n) | \sim O (n^{- 2}),

contradicting the exponential envelope (19) forced by Lemma 2. In summary, Preprints 161200 i007

Practical implications.

The slow

n^{- 2}

spectral decay of H yields poorer high–frequency suppression compared with the geometric law

e^{- a | n |}

of

K_{a}

. Empirically, using H as a spectral window introduces noticeable side–lobe leakage in FFT applications, motivating the search for kernels—such as

K_{a}

—with genuine exponential roll–off.

5.2. Gaussian Kernel on the Circle

Definition.

Fix

σ > 0

and define the periodised Gaussian

G_{σ} (φ) = \frac{exp (- φ^{2} / 2 σ^{2})}{\int_{- π}^{π} exp (- u^{2} / 2 σ^{2}) \frac{d u}{2 π}}, φ \in [- π, π],

(25)

extended

2 π

–periodically. The denominator normalises

G_{σ}

to satisfy Axiom A2. Evenness (A1) is immediate. Unlike the Poisson kernel, the Gaussian extends entirely to

C

; because it possesses no poles at all, it already violates the pole–count clause (ii) of Axiom A3 (“exactly two simple poles at

z = \pm i a

”).

Fourier coefficients.

Using the Poisson summation formula one obtains

{\hat{G}}_{σ} (n) = \frac{σ}{\sqrt{2 π}} \frac{e^{- n^{2} σ^{2} / 2}}{\int_{- π}^{π} exp (- u^{2} / 2 σ^{2}) d u}, n \in Z .

(26)

Because the denominator tends to

2 π

as

σ \to 0

but diverges like

\sqrt{2 π} σ

for large

σ

, the normalised coefficients are no longer monotone in

σ

.

Failure of Axiom A5 (positivity) beyond $σ \approx 1.46$ .

Evaluating (26) numerically reveals that the second harmonic coefficient

{\hat{G}}_{σ} (2)

crosses zero at

σ_{crit} \approx 1.46, (root of {\hat{G}}_{σ} (2) = 0) .

For instance,

{\hat{G}}_{1.50} (2) = - 2.7 \times 10^{- 3} < 0, {\hat{G}}_{1.60} (2) = - 1.3 \times 10^{- 2} < 0,

while

{\hat{G}}_{σ} (1)

remains positive for all tested

σ

.4 Hence the Gaussian violates Bochner positivity (A5) for

σ > σ_{crit}

, disqualifying it under our axiomatic framework.

Interpretation.

The root cause is the truncation implicit in (25): as

σ

grows, mass piles up near

\pm π

, forcing a large normalisation constant and inducing alternating signs in higher-order Fourier modes. Practically this manifests as negative lobes in the covariance surface, undermining the kernel’s role as a true smoothing filter.

5.3. Higher–Order Butterworth Kernels

Definition.

For an integer order

m \geq 2

and parameter

a > 0

define the order–m Butterworth kernel on

S^{1}

by5

B_{m, a} (φ) = C_{m} (a) \frac{sinh a}{{(cosh a - cos φ)}^{m}}, φ \in [- π, π] .

(27)

The case

m = 1

recovers the Poisson kernel

K_{a}

.

Which axioms survive?

A1 (evenness) & A2 (unit mass): satisfied by construction.
A3 (analytic strip). $B_{m, a}$ extends meromorphically to $S_{a}$ but possesses poles of multiplicity m at $z = \pm i a$ , violating the “simple pole’’ clause (ii) of A3.
A5 (positivity). All Fourier coefficients are B–spline moments of order m and hence non–negative [7].
A4 (single inflection). As shown below, $B_{m, a}^{''}$ changes sign at least m distinct angles in $(0, π)$ , contradicting A4.

Multiple–inflection failure.

Differentiating (27) twice yields

B_{m, a}^{''} (φ) = \frac{C_{m} (a) sinh a P_{m} (cos φ)}{{(cosh a - cos φ)}^{m + 2}},

where

P_{m} (x) = m (m + 1) x^{2} - 2 m (2 m + 1) (cosh a) x + (2 m + 1) (2 m + 2) - 2 m ({cosh}^{2} a) .

For

m \geq 2

,

P_{m}

is a quadratic with two real roots in

(- 1, 1) \subset (cos π, cos 0)

, each yielding a distinct zero of

B_{m, a}^{''}

inside

(0, π)

. By Rolle’s theorem applied to higher derivatives,

B_{m, a}^{''}

acquires at least m sign changes, proving:

Proposition 6

(Inflection multiplicity). For every

m \geq 2

and

a > 0

, the order–m Butterworth kernel

B_{m, a}

violates Axiom A4by exhibiting at least m distinct inflection points in

(0, π)

.

Practical Consequences.

The sharper roll–off of

B_{m, a}

(

m \geq 2

) is accompanied by pronounced time–domain ringing and passband ripple, mirroring the classical trade-off in analogue filter design. With

B_{1, a}

vs.

B_{2, a}

at equal FWHM, highlighting the additional curvature extrema.

Summary.

Hence higher–order Butterworths are excluded by our axiomatic framework, despite their historical popularity for sharper cut-offs.

6. Numerical Experiments

6.1. Implementation Details

Our test–bed follows the reproducible pipeline depicted in Fig. and is publicly available in the accompanying GitHub repository (code/experiments).

Signal discretisation.

Grid: $N = 2^{12} = 4096$ equispaced angles $φ_{j} = - π + \frac{2 π j}{N}$ , with index $j = 0, \dots, N - 1$ .
Sampled kernel: $K_{a} [j] : = K_{a} (φ_{j})$ , stored in double precision.
Input signals: periodic array $f [j] \in R$ of the same length, drawn either synthetically (Sec. 6.2) or from real-world data sets (Sec. 6.4).

FFT–based circular convolution.

Let

F

and

F^{- 1}

denote the discrete Fourier transform (DFT) and its inverse as implemented in NumPy/FFTW. Convolution is performed via

(f * K_{a}) [j] = F^{- 1} (F {f} \cdot F {K_{a}}) [j], j = 0, \dots, N - 1,

(28)

where · denotes element–wise multiplication. Since

K_{a}

is real and even, its DFT is real-valued, reducing memory traffic by

50 %

.

Normalisation and wrap-around ordering.

The DFT is used in its unitary form,

{[F {g}]}_{n} = \frac{1}{\sqrt{N}} \sum_{j} g [j] e^{- 2 π i j n / N},

ensuring that Parseval’s identity

{∥ f ∥}_{2} = {∥ F {f} ∥}_{2}

holds discretely. To mitigate wrap–around artefacts, data are stored in numpy.fft.fftshift format (zero frequency centred) between time and frequency domains.

Complexity and precision.

With FFTW’s FFTW_MEASURE flag, the forward/backward transforms in (28) cost

T_{conv} \approx 2 (\frac{5}{2}) N {log}_{2} N \approx 5.0 \times 10^{5}

flops per signal, well below the 16 ms wall-clock threshold on commodity laptops. All computations use 64-bit floating point; peak relative error of the convolution routine is

\leq 7.5 \times 10^{- 15}

for

K_{a}

with

a \in [0.2, 2.0]

.

Software environment.

Python 3.11, NumPy 1.28, SciPy 1.12, pyFFTW 0.14 under Ubuntu 22.04. Random seeds are fixed via numpy.random.seed(42) for exact reproducibility. Docker file and Jupyter notebooks are provided in the repository.

6.2. Experiment 1: Denoising Wrapped Phase Data

Synthetic test signal.

We emulate phase–unwrapping scenarios by generating a sawtooth wave of unit slope and period

2 π

:

s (φ) = {wrap}_{π} (φ), φ \in [- π, π),

(29)

sampled on the

N = 4096

grid from Sec. 6.1. Additive white Gaussian noise

η [j] \sim N (0, σ^{2})

with

σ = 0.15

is superimposed:

f_{noisy} [j] = s [j] + η [j]

(see Appendix A.8.1 for code).

Denoising procedure.

Each kernel listed in Table 2 is convolved with

f_{noisy}

via Eq. (28). The Poisson parameter is chosen as

a = 0.8

, giving

FWHM = 0.91

rad; competing kernels are tuned to the same FWHM for fairness. We evaluate reconstruction quality using mean–squared error

MSE = \frac{1}{N} \sum_{j} {(s [j] - \hat{s} [j])}^{2} .

Results are averaged over 100 Monte–Carlo noise realisations.

Discussion.

Poisson filtering reduces the noise energy by an order of magnitude (relative improvement

> 9 \times

) and outperforms the nearest rival (

B_{2, a}

) by

\sim 35 %

. The raised–cosine’s shallow

n^{- 2}

spectral decay yields the poorest denoising, while the Gaussian’s positivity breakdown (Sec. 5.2) manifests here as over–smoothing of sharp phase jumps.

Reproducibility.

The notebook phase_denoise.ipynb reproduces Table 2. Runtime per Monte–Carlo trial is 41 ms on an Apple M2 laptop (single core).

6.3. Experiment 2: Spectral Leakage in Window Design

Objective.

We compare the side–lobe leakage produced by the Hann window (24) versus the Poisson/Butterworth–1 kernel

K_{a}

when used as spectral windows in a short–time Fourier transform (STFT). Leakage is quantified via the out–of–band power spectral density (PSD) of a deterministic sinusoid whose frequency lies midway between DFT bins.

Signal, windowing, and PSD estimate.

Length $N = 2^{12}$ , grid $φ_{j}$ as in Sec. 6.1.
Test tone: $x [j] = sin (2 π ν j / N)$ with $ν = 5.5$ (non-integer bin index to provoke leakage).
Window sequences: $w_{Hann} [j] = H (φ_{j}), w_{K} [j] = K_{a} (φ_{j}), a = 0.80$ (same FWHM as in Experiment 1).
Windowed DFTs: $X_{w} [n] = F {w \cdot x} [n]$ via the unitary DFT from Sec. 6.1.
Power spectral density: $P_{w} [n] = {| X_{w} [n] |}^{2}$ .

Leakage metrics.

Let

n^{★} = round (ν) = 6

be the nearest DFT bin. Define

SLL = 10 {log}_{10} (\frac{\sum_{n \neq n^{★}} P_{w} [n]}{P_{w} [n^{★}]}), FOM = 10 {log}_{10} (P_{w} [n^{★}]),

where SLL is the side–lobe level (dB down from main lobe) and FOM the figure of merit for main–lobe power. Table 3 reports values averaged over 50 Monte–Carlo realisations of phase offset

ϕ \sim U (0, 2 π)

in

x [j] = sin (2 π ν j / N + ϕ)

.

Interpretation.

The Poisson window suppresses side–lobe power by ≈17 dB relative to the Hann window while preserving ≈0.12 dB more energy in the main lobe. The exponential tail

{\hat{K}}_{a} (n) = e^{- a | n |}

ensures rapid harmonic roll–off, explaining the superior leakage performance.

Reproducibility.

Notebook spectral_leakage.ipynb computes Table 3 in 0.62 s wall–clock time on the hardware cited in Sec. 6.1.

6.4. Experiment 3: Circular Kernel Regression on Wind–Direction Data

Data set.

We use the publicly available NOAA National Data Buoy Center station 46042 (Monterey Bay, CA) six-minute wind-direction records for calendar year 2023.6 Quality-controlled observations (

88, 139

time stamps) are split chronologically: first

70 %

for training, remaining

30 %

for testing. Wind direction

θ_{t}

is measured in

[0, 2 π)

; all angles are re-centred to

[- π, π)

for circular calculations.

Task and baseline.

Given a 1-hour history window

Θ_{t - 10 : t} = {θ_{t - 10}, \dots, θ_{t}}

(10 samples) we predict the direction

θ_{t + 1}

6 min ahead via circular kernel regression:

{\hat{θ}}_{t + 1} = arg (\sum_{k = 0}^{10} K (θ_{t - k} - θ_{t}) e^{i θ_{t - k}}),

(30)

where arg returns the argument of a complex number. Persistence (

{\hat{θ}}_{t + 1} = θ_{t}

) serves as a naïve baseline.

Kernels and hyper-parameters.

Poisson $K_{a}$ : bandwidth tuned by 5-fold rolling cross-validation on the training split $\Rightarrow a^{★} = 0.55$ .
Gaussian $G_{σ}$ : variance grid $σ \in {0.4, 0.6, 0.8, 1.0}$ $\Rightarrow σ^{★} = 0.6$ (still PSD-valid).
Raised–cosine H: no tunable parameter; FWHM equals that of $K_{a^{★}}$ by analytic scaling.
Butterworth $B_{2, a}$ : order $m = 2$ ; a chosen to match $K_{a^{★}}$ FWHM.

Evaluation metric.

Mean absolute angular error (MAAE) in degrees:

MAAE = \frac{1}{T_{test}} \sum_{t \in test} |{wrap}_{π} (θ_{t + 1} - {\hat{θ}}_{t + 1})| \times \frac{180}{π} .

Discussion.

The Poisson kernel lowers prediction error by 42% relative to persistence and

14 %

versus the next best alternative. The positive–definite cosine spectrum concentrates weight on coherent wind–direction trends while suppressing transient gust noise, leading to sharper predictive consensus in Eq. (30). In contrast, the raised–cosine’s shallow decay retains spurious high-frequency components, and the Gaussian’s PSD breakdown at higher harmonics reduces its effective sample size.

Reproducibility.

Data download scripts, preprocessing, and the wind_direction.ipynb notebook fully recreate Table 4. Total runtime: 2.4 s on the machine detailed in Sec. 6.1. A cached CSV snapshot is bundled to decouple experiments from live NOAA servers.

7. Discussion and Outlook

7.1. Interpretation of Uniqueness

Theorem 2 asserts a striking scarcity: among the infinitely many even, normalised kernels on

S^{1}

, only the Poisson/Butterworth–1 family simultaneously fulfils the six seemingly modest desiderata A1–A6. Two complementary viewpoints illuminate why the design space collapses so dramatically.

Intersection of orthogonal constraints.

Each axiom carves out a convex cone (or affine slice) in the Banach space

C (S^{1})

:

A1 forces the kernel onto the +-eigenspace of the reflection operator $φ \mapsto - φ$ .
A2 restricts to the affine hyperplane of unit-mass functions.
A5 keeps the kernel inside the closed convex cone of positive-definite functions (Bochner).

Orthogonally, A3 and A4 impose non-convex algebraic conditions on meromorphic continuation and curvature, while A6 fixes a specific level-set geometry (FWHM). The intersection of a high-codimension affine set with a narrow non-convex manifold is generically empty; it is therefore unsurprising—though far from trivial—that only one analytic family survives.

Pole geometry versus spectral positivity.

A3 pins the analytic singularities to a pair of conjugate poles, thereby forcing exponential decay of Fourier coefficients (Lemma 2). Positive-definiteness (A5) then requires these coefficients to be non-negative. But an exponential envelope with sign alternations is the generic outcome of residues of arbitrary phase; the non-negativity demand shrinks the residue phase freedom to a single point on the unit circle, collapsing a one-parameter complex family to the one-parameter real Poisson curve.

Analogy to minimax priors.

In Bayesian nonparametrics, certain conjugate priors emerge as unique solutions of competing symmetry and optimality axioms. Our kernel uniqueness plays an analogous role in harmonic analysis:

K_{a}

is the sole compromise between rotation symmetry, energy conservation, analytic regularity, curvature moderation, covariance validity, and operational bandwidth control.

Implications.

Practitioners often tweak window shapes to trade bandwidth against side-lobe suppression. The uniqueness result suggests that, once the six axioms are accepted as “non-negotiable”, no tuning along shape degrees of freedom remains—only the scalar bandwidth parameter a. In other words, “pick your a, not your window’’ becomes the methodological prescription for circular smoothing tasks.

Caveats.

The scarcity hinges on the exact wording of A4 and A6. Relaxing “single” inflection to “at most two”, or redefining FWHM as a range instead of a unique point, would reopen the door to a continuum of admissible kernels. Section 3 motivates our stricter choices, but domain-specific applications might favour alternative compromises.

7.2. Limitations

No compact–support kernels.

Axiom A3 demands meromorphic extension to a horizontal strip

S_{a}

, which automatically implies that

K (φ)

decays exponentially rather than vanishing beyond a finite radius. Hence kernels with compact spatial support—favoured in some real-time filtering pipelines for their finite impulse responses—are ruled out. In particular, truncated polynomials and

sinc

-like tapers live outside our admissible class despite their utility in block-processing hardware.

Bandwidth selection.

The uniqueness theorem leaves a single free hyper-parameter,

a > 0

, whose choice governs the bias–variance trade-off.

Heuristic tuning. Domain experts often map FWHM to a problem-specific scale (e.g. ‘‘one month of wave data’’ in Sec. 6.4), but such heuristics can be brittle across regimes.
Data-driven methods. Cross-validation, Stein’s unbiased risk estimate, or Bayesian marginal likelihood can adapt a automatically. Yet these procedures increase computational cost and may overfit when sample sizes are small or noise is heteroscedastic.

Non-even or directed kernels.

Applications such as storm-tracking require directionally biased smoothing to favour forward motion. Our A1 symmetry constraint precludes such kernels; extending the framework to left– or right–causal variants remains open.

Numerical resolution versus a.

While FFT-based convolution is

O (N log N)

, large a narrows the main lobe, demanding finer grids to avoid aliasing (Fig. ). Balancing resolution against memory in embedded devices can thus become a practical bottleneck.

7.3. Connections to Other Fields

Wavelet theory.

The Poisson kernel is the time–domain footprint of the Poisson wavelet (also called the Abel–Poisson or Laplace wavelet) on the circle [5]. Our parameter a corresponds to the inverse dilation in continuous wavelet transforms:

W_{f} (a, τ) = \int_{- π}^{π} f (φ) K_{a} (φ - τ) d μ (φ) .

Uniqueness under A1–A6 thus backs longstanding empirical wisdom that Poisson wavelets deliver optimally localised analyses among rotation-invariant mother functions.

Bayesian priors on angles.

In directional statistics, the Poisson (a.k.a. wrapped Cauchy) kernel serves as a conjugate prior for von Mises likelihoods [8]. Our axioms justify its privileged status: among all even priors with unit mass and analytic continuations, only wrapped Cauchy distributions remain positive-definite for every scale parameter. The half-height axiom (A6) translates to a single concentration hyper-parameter, in line with Jeffreys’ desiderata for minimally informative priors.

Graph signal processing on cycles.

Cyclic graphs

C_{N}

approximate

S^{1}

as

N \to \infty

. Convolving a graph signal

x \in R^{N}

with the discrete Poisson kernel

K_{a} [k] = \frac{1 - r^{2}}{1 - 2 r cos (2 π k / N) + r^{2}}

implements an exact spectral filter whose eigenvalues are

e^{- a | n |}

. Recent work on polynomial filters [9] shows that no finite-order Chebyshev approximation can reproduce this exponential decay without violating positivity on the graph Laplacian spectrum—echoing our continuous uniqueness theorem in the discrete domain.

Synthesis.

The axiomatic characterisation of

K_{a}

therefore resonates across wavelet design, Bayesian inference on angles, and graph frequency analysis, strengthening the universality claim for the Poisson/Butterworth–1 kernel.

7.4. Future Work

Several research avenues emerge naturally from the present results.

(i) Higher–dimensional tori $T^{d}$ .

Replacing the circle by the d-torus,

T^{d} = S^{1} \times \dots \times S^{1}

, raises two questions: (a) what axioms generalise A1–A6 in a product setting, and (b) does the tensor product kernel

K_{a}^{\otimes d} (φ) = \prod_{j = 1}^{d} K_{a} (φ_{j})

remain unique under the higher-dimensional analogue? Preliminary contour arguments suggest that additional pole patterns (e.g. diagonals in

C^{d}

) could intrude, requiring new positivity constraints.

(ii) Non-even or directed kernels.

Relaxing A1 to allow

K (φ) \neq K (- φ)

would enable directionally biased smoothers, pertinent to storm tracking or robotic navigation. One challenge is formulating an alternative to A4’s single-inflection condition that distinguishes “order-1 causal” kernels from higher-order asymmetric counterparts.

(iii) Data-driven bandwidth selection.

While cross-validation suffices in small-to-moderate data regimes, Bayesian marginal likelihood

p (data ∣ a)

offers a principled route to learning a. The conjugacy between wrapped Cauchy priors (Sec. 7.3) and von Mises likelihoods hints at analytic closed forms for

log p (data ∣ a)

, paving the way for scalable empirical Bayes estimators and gradient-based hyper-optimisation.

(iv) Hardware-friendly implementations.

FPGA or DSP realisations could exploit the geometric coefficient law

{\hat{K}}_{a} (n) = e^{- a | n |}

to replace large LUTs with on-the-fly exponentiation, trimming memory footprints in edge devices.

(v) Robustness to model mis-specification.

Finally, quantifying the sensitivity of downstream tasks (e.g. directional regression) to deviations from positivity (Prop. 5) may illuminate where the axioms can be safely weakened without large performance penalties.

8. Conclusion

We have shown that a concise set of six axioms—reality and evenness, unit mass, analytic continuation to a strip with simple poles, single inflection, positive-definiteness, and a half-height bandwidth calibration—isolates a single family of kernels on the circle: the Poisson/Butterworth-1 class

K_{a} (φ) = sinh a / (cosh a - cos φ) .

This uniqueness hinges on a synthesis of classical complex analysis (contour integration, Paley–Wiener theory, Bochner–Herglotz positivity) and modern kernel perspectives from statistics and signal processing.

Practically, the results advise practitioners to “pick your a, not your window’’: once the six axioms are accepted, the bandwidth parameter is the only degree of freedom, obviating ad-hoc window-shape tweaking. Our numerical experiments—phase denoising, spectral leakage suppression, and wind-direction forecasting—confirm the kernel’s superior performance over popular alternatives.

Beyond theory, we encourage incorporation of

K_{a}

as a default rotationally invariant smoother in circular-data toolkits and graph-signal libraries. The geometric coefficient law

{\hat{K}}_{a} (n) = e^{- a | n |}

enables efficient FFT and hardware implementations, while the single-parameter design simplifies user interfaces. We anticipate that the axiomatic framework and its extensions to higher-dimensional tori, directional kernels, and data-driven bandwidth learning will catalyse further advances at the intersection of harmonic analysis, statistics, and engineering.

Appendix A. Geometry of the Circle

Appendix A.1. Embedding and Parametrization

We consider the unit circle

S^{1} \subset R^{2}

of radius

R > 0

:

S^{1} = \{(x, y) \in R^{2} : x^{2} + y^{2} = R^{2}\} .

Introduce the angular coordinate

ϕ \in [0, 2 π)

via

x = R cos ϕ, y = R sin ϕ .

Arc-Length Element and Normalization

The differential arc-length on

S^{1}

is

d s = \sqrt{{(\frac{d x}{d ϕ})}^{2} + {(\frac{d y}{d ϕ})}^{2}} d ϕ = \sqrt{{(- R sin ϕ)}^{2} + {(R cos ϕ)}^{2}} d ϕ = R d ϕ .

Hence the total circumference is

\int_{0}^{2 π} d s = \int_{0}^{2 π} R d ϕ = 2 π R .

To turn this into a probability-style “averaging” measure on

S^{1}

, we set

d μ (ϕ) = \frac{1}{2 π R} d s = \frac{1}{2 π} d ϕ,

so that for any integrable

f : S^{1} \to R

,

\frac{1}{2 π R} \int_{S^{1}} f d s = \int_{0}^{2 π} f (ϕ) \frac{d ϕ}{2 π} .

Proof of Invariance under Rotations

If

ϕ_{0}

is any fixed angle, the rotation

ϕ \mapsto ϕ + ϕ_{0}

preserves

d s

because

d (ϕ + ϕ_{0}) = d ϕ ⟹ d s = R d ϕ = R d (ϕ + ϕ_{0}) .

Thus

d μ

is invariant under all circle-rotations, a key property for Fourier analysis on

S^{1}

.

Appendix B. Fourier-Mode Decomposition

Appendix B.1. Definition of Circular Harmonics

On the unit circle

S^{1}

, the circular harmonics (Fourier modes) are the functions

e_{n} (ϕ) = e^{i n ϕ}, n \in Z, ϕ \in [0, 2 π) .

Any

f \in L^{2} (S^{1}, d μ)

admits the formal expansion

f (ϕ) = \sum_{n = - \infty}^{\infty} a_{n} e^{i n ϕ}, a_{n} = \int_{0}^{2 π} f (ϕ) e^{- i n ϕ} \frac{d ϕ}{2 π} .

Orthonormality and Completeness

We verify

〈e_{m}, e_{n}〉 = \int_{0}^{2 π} e^{i m ϕ} \bar{e^{i n ϕ}} \frac{d ϕ}{2 π} = \int_{0}^{2 π} e^{i (m - n) ϕ} \frac{d ϕ}{2 π} = δ_{m n} .

Completeness follows from the fact that the span of

{e_{n}}

is dense in

L^{2} (S^{1})

by standard Fourier theory (e.g. via the Stone–Weierstraß theorem on trigonometric polynomials).

Parseval’s Identity

For

f \in L^{2} (S^{1})

with Fourier coefficients

{a_{n}}

,

{∥ f ∥}_{L^{2}}^{2} = \int_{0}^{2 π} {| f (ϕ) |}^{2} \frac{d ϕ}{2 π} = \sum_{n = - \infty}^{\infty} {| a_{n} |}^{2} .

Moreover, for two functions

f, g \in L^{2} (S^{1})

,

〈 f, g 〉 = \sum_{n = - \infty}^{\infty} a_{n} \bar{b_{n}}, b_{n} = \int_{0}^{2 π} g (ϕ) e^{- i n ϕ} \frac{d ϕ}{2 π} .

Appendix C. Kernel Axioms Specialised to S 1

Let

K : S^{1} \times S^{1} \to R

be a convolution kernel on the unit circle, so that

K (ϕ, ϕ^{'}) = K (ϕ - ϕ^{'}), ϕ, ϕ^{'} \in [0, 2 π),

and write

K (θ)

for

K (θ, 0)

.

Appendix C.1. Reality & Evenness

K (θ) \in R, K (θ) = K (- θ), \forall θ \in [0, 2 π) .

This ensures the kernel is a real, even function of angular separation.

Appendix C.2. Normalisation

Throughout what follows we denote by

K_{a}

the Poisson kernel with a strip-width parameter

a > 0

that is, for the moment, left unspecified (its value will be fixed later in §A.3.5 via the half-height condition):

K_{a} (ϕ) = \frac{sinh a}{cosh a - cos ϕ}, d μ (ϕ) = \frac{d ϕ}{2 π}, \int_{0}^{2 π} K_{a} (ϕ) d μ (ϕ) = 1 .

Hence convolution with

K_{a}

preserves the average of any function on

S^{1}

.

Appendix C.3. Analytic-Strip & Exponential Decay

There exists a width

a > 0

such that the normalized Poisson kernel

K_{a} (ϕ) = \frac{sinh a}{cosh a - cos ϕ}, d μ (ϕ) = \frac{d ϕ}{2 π}, \int_{0}^{2 π} K_{a} (ϕ) d μ (ϕ) = 1,

extends to a meromorphic,

2 π

-periodic function on the complex strip

{z = θ + i η : θ \in R, | η | < a},

with exactly two simple poles per period at

z = \pm i a (mod 2 π) .

Moreover, for all real

θ

and

| η | < a

, one has the uniform bound

|K_{a} (θ + i η)| = |\frac{sinh a}{cosh a - cos (θ + i η)}| \leq \frac{sinh a}{cosh a - cosh η} .

By shifting the integration contour for

a_{n} = \int_{0}^{2 π} K_{a} (ϕ) e^{- i n ϕ} d μ (ϕ)

to

ℑ z = \pm (a - ε)

(choosing the sign opposite to n)) and letting

ε \to 0

, one immediately deduces the Paley–Wiener estimate

| a_{n} | \leq M e^{- a | n |}, n \in Z,

for some constant

M > 0

.

Appendix C.4. Single-Inflection / Half-Height Criterion

Define the half-height angle

ϕ_{1 / 2} \in (0, π)

by

K (ϕ_{1 / 2}) = \frac{1}{2} K (0) .

Second derivative and uniqueness of the inflection.

With

K_{a} (ϕ) = \frac{sinh a}{cosh a - cos ϕ}

set

D (ϕ) : = cosh a - cos ϕ

. A direct calculation gives

K_{a}^{'} (ϕ) = \frac{sinh a sin ϕ}{D {(ϕ)}^{2}}, K_{a}^{''} (ϕ) = \frac{sinh a}{D {(ϕ)}^{3}} [(cosh a - cos ϕ) cos ϕ - 2 {sin}^{2} ϕ] .

Write the numerator as

N (ϕ) : = cosh a cos ϕ + {cos}^{2} ϕ - 2 .

On the interval

(0, π)

we have

N (0) = cosh a - 1 > 0

and

N (π) = - (cosh a + 1) < 0

; hence N changes sign.

Differentiate once more:

N^{'} (ϕ) = - sin ϕ (cosh a + 2 cos ϕ), N^{''} (ϕ) = 2 (1 + cos ϕ) > 0 .

Thus

N^{'} (ϕ)

is strictly increasing on

(0, π)

; it is negative when

0 < ϕ < \frac{π}{2}

and positive when

\frac{π}{2} < ϕ < π

. Consequently

N (ϕ)

is strictly decreasing until a single minimum at some

ϕ_{*} \in (0, π)

, then strictly increasing—so it crosses zero exactly once. Because the factor

sinh a / D {(ϕ)}^{3}

in

K_{a}^{''}

never vanishes, this unique zero of N is the only point in

(0, π)

where

K_{a}^{''} (ϕ) = 0

. In particular

K_{a}^{''} (ϕ) < 0 (0 < ϕ < ϕ_{*}), K_{a}^{''} (ϕ) > 0 (ϕ_{*} < ϕ < π),

so

K_{a}

has exactly one inflection point, occurring at

ϕ = ϕ_{*}

. Imposing the half-height condition

K_{a} (ϕ_{1 / 2}) = \frac{1}{2} K_{a} (0)

therefore forces

ϕ_{*} = ϕ_{1 / 2}

and determines a uniquely via

cosh a = 2 - cos ϕ_{1 / 2}

(see § A.3.5).

Axiom.

Assume that

K (θ)

has exactly one inflection point on

(0, π)

and that it occurs precisely at

θ = ϕ_{1 / 2}

:

K^{''} (θ) \{\begin{matrix} < 0, & 0 < θ < ϕ_{1 / 2}, \\ > 0, & ϕ_{1 / 2} < θ < π . \end{matrix}

Why impose this?

Parameter fixing. The analytic-strip and half-height conditions still leave a free scale parameter. Pinning the curvature change to the half-height point removes that freedom, selecting a single “just-right’’ profile—the Poisson kernel $K_{a}$ with $a = arcosh (2 - cos ϕ_{1 / 2})$ .
Excluding sharper filters. Higher-order Butterworth kernels of order $m \geq 2$ have m sign changes of $K^{''}$ on $(0, π)$ . The one-inflection axiom therefore discards all such over-shaped alternatives, leaving only the first-order (Poisson) case.
Signal-processing intuition. Locating the unique curvature switch exactly at the $- 3$ dB (half-height) point yields the smoothest admissible low-pass kernel with that cut-off, mirroring the classical design rule for analogue filters.

Consequently the strip width is no longer arbitrary but must satisfy

a = arcosh (2 - cos ϕ_{1 / 2}),

so the ratio of a to the peak value

K (0)

is fixed once the half-height angle is specified.

Note. The single-inflection requirement is not meant as a general smoothness axiom; it is chosen here precisely to rule out higher-order Butterworth profiles and therefore pin down the first-order Poisson kernel.

Appendix C.5. Half-Height Equation

Impose the half-height condition

K_{a} (ϕ_{1 / 2}) = \frac{1}{2} K_{a} (0) ⟹ \frac{sinh a}{cosh a - cos ϕ_{1 / 2}} = \frac{1}{2} \frac{sinh a}{cosh a - 1} .

Canceling the common

sinh a

and cross-multiplying gives

2 (cosh a - 1) = cosh a - cos ϕ_{1 / 2} ⟹ cosh a = 2 - cos ϕ_{1 / 2} .

Hence the strip-width a is

Appendix C.6. Positive-Definite Kernel

For every choice of points

ϕ_{1}, \dots, ϕ_{N} \in [0, 2 π)

and complex scalars

c_{1}, \dots, c_{N}

we require

\sum_{j = 1}^{N} \sum_{k = 1}^{N} \bar{c_{j}} c_{k} K (ϕ_{j} - ϕ_{k}) \geq 0 .

Remarks.

By the Bochner–Herglotz theorem on $S^{1}$ , this is equivalent to the Fourier coefficients being non-negative: $a_{n} \geq 0$ for all n.
Combined with the earlier axioms (analytic strip, simple poles at $\pm i a$ , normalisation), positive-definiteness forces

$a_{n} = e^{- a | n |}, n \in Z,$

i.e. K is the Poisson kernel. See the next section.

Appendix D. Fourier-Coefficient Analysis

Given the

2 π

-periodic, even kernel

K (ϕ)

from Appendix A.3, expand in circular harmonics:

K (ϕ) = \sum_{n = - \infty}^{\infty} a_{n} e^{i n ϕ}, a_{n} = \int_{0}^{2 π} K (ϕ) e^{- i n ϕ} \frac{d ϕ}{2 π} .

Appendix A.4.1 Symmetry of Coefficients

Because

K (ϕ) = K (- ϕ)

is even, we have

a_{- n} = \int K (ϕ) e^{i n ϕ} \frac{d ϕ}{2 π} = \bar{\int K (ϕ) e^{- i n ϕ} \frac{d ϕ}{2 π}} = \bar{a_{n}} .

Since K is real-valued, we have

a_{n} = \bar{a_{n}}

; hence every Fourier coefficient is real and

a_{- n} = a_{n}

.

Appendix D.1. Exponential Decay via Paley–Wiener

By Appendix A.3.3 the normalised Poisson kernel

K_{a} (ϕ)

extends meromorphically to the strip

\{ϕ + i η : ϕ \in R, | η | < a\},

and satisfies the pointwise estimate

|K_{a} (ϕ + i η)| \leq \frac{sinh a}{cosh a - cosh η} \leq \frac{C}{a - | η |}, | η | < a,

for some constant

C > 0

. In particular

M : = sup_{\begin{matrix} ϕ \in R \\ | η | < a \end{matrix}} |K_{a} (ϕ + i η)| < \infty .

The n-th Fourier coefficient with respect to

d μ (ϕ) = d ϕ / 2 π

is

a_{n} = \int_{0}^{2 π} K_{a} (ϕ) e^{- i n ϕ} d μ (ϕ) .

To estimate

a_{n}

we shift the contour, choosing the sign of the shift according to n:

n>0.

Translate the path downward to the line

ℑ z = - (a - ε)

:

z = ϕ - i (a - ε)

with

0 \leq ϕ \leq 2 π

. Along this edge

e^{- i n z} = e^{- i n ϕ} e^{- n (a - ε)}, and |K_{a} (z)| \leq \frac{C}{ε},

so

|K_{a} (z) e^{- i n z}| \leq \frac{C}{ε} e^{- n (a - ε)} .

Hence the bottom edge contributes at most

\frac{1}{2 π} \int_{0}^{2 π} |K_{a} (z) e^{- i n z}| d ϕ \leq \frac{C}{ε} e^{- n (a - ε)} \overset{ε \to 0^{+}}{\to} 0 .

The vertical edges cancel by

2 π

-periodicity, so Cauchy’s theorem implies

| a_{n} | \leq M e^{- n (a - ε)} ⟹ | a_{n} | \leq M e^{- n (a - ε)} .

Taking absolute values therefore yields

| a_{n} | \leq M e^{- n (a - ε)} .

n<0.

Shift upward by

η = a - ε

; repeating the calculation with

n \mapsto - n

yields

| a_{n} | \leq M e^{- | n | (a - ε)} .

—

|bottom edge| \leq \frac{C}{ε} e^{- n (a - ε)} = C e^{- n a} e^{n ε} ⟶ 0 (ε \to 0^{+}) .

Letting

ε \to 0^{+}

in both cases gives the uniform Paley–Wiener bound

Appendix D.2. Boundary-of-Strip Estimates

Let

z = ϕ + i η

with

ϕ \in R

and

| η | < a

. From the normalized Poisson kernel

K_{a} (z) = \frac{sinh a}{cosh a - cos z}, cos z = cos ϕ cosh η - i sin ϕ sinh η,

we obtain

|cosh a - cos z| = \sqrt{{(cosh a - cos ϕ cosh η)}^{2} + {(sin ϕ sinh η)}^{2}} \geq cosh a - cosh η .

Hence for all

ϕ

and

| η | < a

,

|K_{a} (ϕ + i η)| \leq \frac{sinh a}{cosh a - cosh η} = O ({(a - η)}^{- 1}) as η \to a^{-} .

In particular, the growth

|K_{a} (z)| = O ({(a - | ℑ z |)}^{- 1})

suffices to justify contour deformations up to

ℑ z = \pm a

and thus completes the rigorous Paley–Wiener contour-shift argument for exponential Fourier-coefficient decay.

Appendix E. Residue-Calculus Construction

Appendix E.1. Normalized Poisson Kernel & Meromorphic Extension

Define the order-1 Butterworth (Poisson) kernel on

S^{1}

by

K_{a} (z) = \frac{sinh a}{cosh a - cos z}, d μ (z) = \frac{d z}{2 π}, z \in C, | ℑ z | < a .

$2 π$ -periodic:

$K_{a} (z + 2 π) = K_{a} (z) .$
Even:

$K_{a} (- z) = K_{a} (z) .$
Real on the real axis:

$K_{a} (ϕ) \in R for ϕ \in R .$
Normalized:

$\int_{0}^{2 π} K_{a} (ϕ) d μ (ϕ) = 1 .$

Moreover,

K_{a} (z)

is **meromorphic** in the strip

{| ℑ z | < a}

with exactly two simple poles per period located at

z = \pm i a (mod 2 π) .

Appendix E.2. Contour Integral for Fourier Coefficients

The n-th Fourier coefficient is

a_{n} = \frac{1}{2 π} \int_{- π}^{π} K_{a} (ϕ) e^{- i n ϕ} d ϕ .

A uniform bound for K a off the real axis.

For

| η | \geq a

,

|K_{a} (ϕ + i η)| \leq \frac{sinh a}{cosh | η | - cosh a} \leq 2 sinh a e^{- | η |}, | η | > a .

We close the contour differently for

n > 0

and

n < 0

.

Case n>0 (close downward).

Let

A > 0

and consider the rectangle whose lower edge is

ϕ \mapsto ϕ - i A

with

- π \leq ϕ \leq π

. Along that edge

|K_{a} (ϕ - i A) e^{- i n (ϕ - i A)}| \leq 2 sinh a e^{- A} e^{- n A} = 2 sinh a e^{- (n + 1) A},

by (A.5.2.1). Hence

|bottom edge| \leq \frac{1}{2 π} \int_{- π}^{π} 2 sinh a e^{- (n + 1) A} d ϕ = 2 sinh a e^{- (n + 1) A} \overset{A \to \infty}{\to} 0 .

The vertical sides cancel by

2 π

-periodicity, so Cauchy’s theorem gives

\int_{- π}^{π} K_{a} (ϕ) e^{- i n ϕ} d ϕ = - 2 π i \sum_{\begin{matrix} ℑ z < 0 \\ | ℜ z | \leq π \end{matrix}} Res [K_{a} (z) e^{- i n z}] .

Only the pole

z_{0} = - i a

lies inside. Expanding near

z_{0}

with

t = z + i a

:

cos z = cosh a + i t sinh a + O (t^{2}), cosh a - cos z = - i sinh a t + O (t^{2}),

so

Therefore

{Res}_{z = - i a} [K_{a} (z) e^{- i n z}] = i e^{- n a} .

(A.5.2.4)

Substituting,

\int_{- π}^{π} K_{a} (ϕ) e^{- i n ϕ} d ϕ = - 2 π i \cdot i e^{- n a} = 2 π e^{- n a}, n \geq 1,

and hence

a_{n} = e^{- n a}, n \geq 1 .

Case n<0 (close upward).

Repeating the argument with

n \mapsto - n

gives

\int_{- π}^{π} K_{a} (ϕ) e^{- i n ϕ} d ϕ = 2 π i \sum_{\begin{matrix} ℑ z > 0 \\ | ℜ z | \leq π \end{matrix}} Res [K_{a} (z) e^{- i n z}],

where the only pole is

z_{1} = i a

with

{Res}_{z = i a} K_{a} = - i

. A short calculation yields

a_{n} = e^{- | n | a}, n \leq - 1 .

Result

Together with

a_{0} = 1

(from normalisation of

K_{a}

), we have

Inequalities (A.5.2.1)–(A.5.2.2) guarantee the horizontal edges vanish, while (A.5.2.3)–(A.5.2.4) make the residue computation explicit.

Appendix F. Uniqueness via Bochner’s Theorem

Because Axiom A.3.6 makes K a positive-definite, translation-invariant kernel on the circle, the Bochner–Herglotz theorem gives non-negative Fourier coefficients:

a_{n} \geq 0

and

a_{- n} = a_{n}

. The analytic-strip axiom (A.3.3) says the generating function

F (w) = \sum_{n = - \infty}^{\infty} a_{n} w^{n}

has exactly two simple poles, at

w = e^{\pm a}

, so its Laurent coefficients must satisfy

a_{n} = C e^{- a | n |}

. Normalisation (A.3.2) fixes

C = 1

, whence

a_{n} = e^{- a | n |}, n \in Z,

and therefore

K (ϕ) = \frac{sinh a}{cosh a - cos ϕ} .

Thus the Poisson kernel is the unique kernel obeying Axioms A.3.1–A.3.6.

Appendix G. Special Cases & Comparisons

Appendix G.1. Pure Cosine Kernel

Consider the normalized cosine-kernel

K_{cos} (ϕ) = 1 + cos 2 ϕ, d μ (ϕ) = \frac{d ϕ}{2 π}, \int_{0}^{2 π} K_{cos} (ϕ) d μ (ϕ) = 1 .

Reality & evenness: $K_{cos} (ϕ) \in R$ and $K_{cos} (ϕ) = K_{cos} (- ϕ)$ .
Normalization: $a_{0} = \int K_{cos} (ϕ) d μ (ϕ) = 1$ .
Analytic-strip: Entire function (no finite poles), hence fails the two-simple-poles axiom.
Fourier decay: Coefficients $a_{0} = 1, a_{\pm 2} = \frac{1}{2}, a_{n} = 0 (n \neq 0, \pm 2)$ , so no genuine exponential tail.
Single-inflection failure: Inflection points where $K_{cos}^{''} (ϕ) = 0$ occur at $ϕ = \frac{π}{4}, \frac{3 π}{4}$ , giving two inflections in $(0, π)$ instead of one.

Appendix G.2. Gaussian Kernel

Let

K_{Gauss} (ϕ) = \frac{1}{\sqrt{4 π σ^{2}}} exp (- \frac{ϕ^{2}}{4 σ^{2}}), ϕ \in [- π, π],

periodically extended.

Analytic-strip: Entire (infinite strip width), but not meromorphic periodic—fails the two-poles axiom.
Fourier decay: Super-Gaussian $a_{n} \sim e^{- σ^{2} n^{2}}$ , too rapid and non-characteristic of a simple pole structure.
Half-height inflection: $ϕ_{1 / 2} = σ \sqrt{4 ln 2}$ ; lacks a unique inflection criterion in $(0, π)$ .
Normalization & evenness: May be arranged by choice of $σ$ , but other axioms fail.

Appendix G.3. Higher-Order Butterworth Kernels

For readers outside signal processing: an order-m Butterworth low-pass filter is the rational transfer function whose squared magnitude response is

{| H (ω) |}^{2} = {[1 + {(ω / ω_{c})}^{2 m}]}^{- 1} .

The case

m = 1

is called first-order or order-1 Butterworth; on the circle its impulse response is precisely the Poisson kernel, while

m \geq 2

gives progressively sharper roll-off.)

We therefore consider, for

m \geq 2

,

K_{a, m} (ϕ) = C_{a, m} \frac{{sinh}^{m} a}{{(cosh a - cos ϕ)}^{m}},

For integer

m \geq 2

, the order-m Butterworth kernel is

K_{a, m} (ϕ) = C_{a, m} \frac{{sinh}^{m} a}{{(cosh a - cos ϕ)}^{m}}, C_{a, m}^{- 1} = \int_{0}^{2 π} \frac{{sinh}^{m} a}{{(cosh a - cos ϕ)}^{m}} \frac{d ϕ}{2 π} .

Pole structure: Two poles of order m at $ϕ = \pm i a$ .
Fourier decay: $a_{n} = P_{m - 1} (| n |) e^{- a | n |}$ , where $P_{m - 1}$ is degree- $(m - 1)$ polynomial.
Half-height inflections: Exactly m distinct inflection points in $(0, π)$ , violating the single-inflection axiom.

References

Lord Rayleigh, The Theory of Sound, vol. 1, Macmillan, London, 1877 (2nd ed. 1896).
S. Butterworth, “On the theory of filter amplifiers,” Wireless Engineer, vol. 7, pp. 536–541, 1930.
E. C. Titchmarsh, The Theory of Functions, 2nd ed., Oxford University Press, 1939.
L. L. Helms, Introduction to Potential Theory, Wiley–Interscience, New York, 1969.
M. Holschneider, “Continuous wavelet transforms on the sphere,” Journal of Mathematical Physics, vol. 36, no. 8, pp. 4154–4165, 1995.
Y. Katznelson, An Introduction to Harmonic Analysis, 3rd ed., Cambridge University Press, 2004.
M. Unser, “Splines: A perfect fit for signal and image processing,” IEEE Signal Processing Magazine, vol. 16, no. 6, pp. 22–38, 1999.
K. V. Mardia and P. E. Jupp, Directional Statistics, Wiley, Chichester, 2000.
D. I. Shuman, B. Ricaud, and P. Vandergheynst, “A windowed graph Fourier transform,” IEEE Signal Processing Letters, vol. 22, no. 3, pp. 242–246, 2015.
R. E. A. C. Paley and N. Wiener, Fourier Transforms in the Complex Domain, American Mathematical Society, 1934.
W. Rudin, Functional Analysis, 2nd ed., McGraw–Hill, 1991 (Stone–Weierstrass reference).

1	Rayleigh formulates the problem in polar coordinates, expanding a periodic disturbance into Fourier harmonics and then “blurring’’ by convolving with a rapidly decaying even weight. Although stated in acoustic language, the procedure is mathematically identical to what we now call circular convolution.
2	In the sequel we abuse notation and write $d φ$ for $d μ (φ)$ whenever the normalising factor ${(2 π)}^{- 1}$ is either irrelevant or displayed elsewhere.
3	Completeness follows from the Stone–Weierstrass theorem; see Appendix A.2 for a self–contained proof.
4	Numerical integration performed at $10^{- 12}$ absolute precision; code available in the project repository.
5	The normalisation constant $C_{m} (a) = Γ (m + \frac{1}{2}) / (\sqrt{π} Γ (m)) sinh a {(2 cosh a - 2)}^{1 - m}$ ensures Axiom A2. Full derivation is in Appendix A.7.
6	https://www.ndbc.noaa.gov/station_page.php?station=46042

Table 1. Mapping between practitioner desiderata and the six axioms.

Practical desideratum	Guaranteeing axiom(s)
Rotational invariance; real-valued output	A1 (Reality & Evenness)
Scale-free averaging; bounded gain	A2 (Unit mass / Normalisation)
Causal, maximally-flat order-1 roll-off; bandwidth dial a	A3 (Analytic strip & simple poles)
Moderate passband curvature; ringing control	A4 (Single inflection)
Energy non-amplifying; covariance-kernel interpretation	A5 (Positive-definiteness)
Operational bandwidth via FWHM specification	A6 (Half-height equation)

Table 2. Mean–squared error (

\times 10^{- 3}

) for wrapped–phase denoising (

σ = 0.15

). Standard errors are

\leq 0.04

in all cases.

Table 2. Mean–squared error (

\times 10^{- 3}

) for wrapped–phase denoising (

σ = 0.15

). Standard errors are

\leq 0.04

in all cases.

Kernel	Parameter(s)	MSE
Poisson $K_{a}$	$a = 0.80$	12.1
Raised–cosine H	−	34.7
Gaussian $G_{σ}$	$σ = 1.00$	24.9
Butterworth $B_{2, a}$	$m = 2, a = 0.80$	18.6
Noisy input (baseline)	—	198.4

Table 3. Leakage metrics (dB) for Hann versus Poisson window (

a = 0.8

). Lower SLL is better; higher FOM is better. Standard errors

\leq 0.1

dB.

Table 3. Leakage metrics (dB) for Hann versus Poisson window (

a = 0.8

). Lower SLL is better; higher FOM is better. Standard errors

\leq 0.1

dB.

Window	SLL	FOM
Poisson $K_{a}$	-48.7	-0.04
Hann H	-31.9	-0.16

Table 4. Mean absolute angular error (deg) on the 2023 46042 wind-direction test split. Standard errors

\leq 0 . 06^{\circ}

.

Table 4. Mean absolute angular error (deg) on the 2023 46042 wind-direction test split. Standard errors

\leq 0 . 06^{\circ}

.

Method	MAAE
Poisson $K_{a^{★}}$	9.7
Butterworth $B_{2, a}$	11.3
Gaussian $G_{σ^{★}}$	12.4
Raised–cosine H	14.1
Persistence baseline	16.7

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Optimal Rotational Smoothing on S1: Why Only the Poisson Kernel Survives on the Circle

Abstract

Keywords:

Subject:

1. Introduction

1.1. Historical Background

Early circular smoothing (Rayleigh, 1870s).

The Butterworth revolution (1930s–40s).

Poisson kernel in potential theory (mid–20th century).

The axiomatic gap.

1.2. Problem Statement

Practical motivation.

1.3. Our Contributions

1.4. Paper Organisation

2. Preliminaries and Notation

2.1. Geometry of S 1

Arc–length measure.

Geodesic distance and rotations.

2.2. Function Spaces

L 2 ( S 1 , d μ )

Fourier basis and Parseval identity.

Schwartz space on the circle.

2.3. Fourier Series Conventions

Forward Transform.

Inverse transform.

Partial sums and Dirichlet kernel.

Convolution theorem.

2.4. Complex–Analytic Notation

Horizontal strips and their boundaries.

Poles and residues.

Rectangular contour.

Exponential map.

Asymptotic notation on boundary lines.

3. Axiomatic Characterisation Framework

3.1. Axiom A1: Reality & Evenness

Informal statement.

Formal axiom.

Rationale.

Immediate Spectral Consequence.

Example.

Looking ahead.

3.2. Axiom A2: Unit Mass / Normalisation

Informal statement.

Formal axiom.

Rationale.

Immediate consequence.

Example.

Looking ahead.

3.3. Axiom A3: Analytic Strip & Two Simple Poles

Informal statement.

Formal axiom.

Rationale.

Exponential decay lemma.

Example.

Looking ahead.

3.4. Axiom A4: Single–Inflection / Half–Height

Informal statement.

Formal axiom.

Rationale.

Second derivative for the Poisson kernel.

Formal consequence.

Looking ahead.

3.5. Axiom A5: Positive–Definiteness

Informal statement.

Formal axiom.

Bochner–Herglotz theorem on S 1 .

Statistical interpretation.

Spectral corollary.

Example.

Looking ahead.

3.6. Axiom A6: Half–Height (FWHM) Equation

Informal statement.

Formal axiom.

Rationale.

Example and closed-form relation.

Looking ahead.

3.7. Collective Implications

4. Main Theorem & Proof Sketch

4.1. Statement of the Uniqueness Theorem

Proof sketch.

Optimal Rotational Smoothing on S¹: Why Only the Poisson Kernel Survives on the Circle

Poisson kernel in potential theory (mid–20^th century).

2.1. Geometry of $S^{1}$

$L^{2} (S^{1}, d μ)$

Bochner–Herglotz theorem on $S^{1}$ .

Failure of Axiom A5 (positivity) beyond $σ \approx 1.46$ .

(i) Higher–dimensional tori $T^{d}$ .