Preprint · Article · This version is not peer-reviewed.

Gaussian Boson Sampling as a Calibrated Variance-Reduction Engine: A DSFL Explanation

Submitted: 25 October 2025 · Posted: 27 October 2025


Abstract
Gaussian Boson Sampling (GBS) generates photon-count patterns whose probabilities are governed by hafnian functionals of an encoded covariance. Recent theory shows that GBS samples can approximate Gaussian expectation problems with provable sample-complexity advantages over Monte Carlo (MC) on an open, sizable subset of instances. We give a unifying explanation via the Deterministic Statistical Feedback Law (DSFL). First, GBS acts as a calibrated, admissible map that intertwines target statistics and physical sampling. Second, tuning the mean photon number is a genuine calibration step that minimizes a DSFL residual. Third, principal-angle (Friedrichs) geometry quantifies conditioning and leakage between statistical and sampling features. Fourth, a Lyapunov envelope provides non-asymptotic error decay and design rules. From these ingredients we derive bounds that recover and sharpen known GBS scalings, predict regimes where GBS outperforms MC, and specify a practical hybrid policy that switches by degree. Experiments with simulated distributions and hardware-aligned kernels support the theory and validate the tuning and selection rules.

1. Introduction

Gaussian Boson Sampling (GBS) produces photon-count patterns whose probabilities are governed by hafnian functionals of a covariance encoded in a Gaussian photonic state. Beyond its foundational role in quantum advantage, GBS has recently been positioned as a practical engine for approximating Gaussian expectation problems (GEPs), where one seeks $\mu = \mathbb{E}[f(X)]$ for $X \sim \mathcal{N}(0,B)$ and analytic or polynomial $f$. Wick’s theorem expresses $\mu$ as a hafnian-weighted sum over multi-indices drawn from $B$, aligning the target with the native GBS statistics. Building on this structure, Andersen and Shan introduced degree-resolved GBS estimators and proved open, nonempty regions in which GBS achieves $(\varepsilon,\delta)$ sample-complexity advantages over Monte Carlo (MC), and further showed that tuning the mean photon number to the task increases the percentage of instances with advantage [1,2].

1.1. Calibrated Residuals, Principal–Angle Conditioning, and Predict–Then–Prove Guarantees

This work advances three aspects that are often treated separately. First, the residual $R_K(t)$ is not merely a distance: it is a calibrated, minimal, and dynamical quantity. We prove that every admissible tuning step is $L^2$-nonexpansive, so the residual cannot inflate under retuning or composition. Second, conditioning is captured quantitatively by the principal angle $\theta_F$ (via its cosine $\gamma$), and this geometric control enters directly into the variance and MSE formulae, providing sharp two-sided bounds; this bridges statistical estimation, photonic physics, and numerical analysis in a single comparison geometry. Third, the framework is predict-then-prove: before sampling, it specifies where to operate (photon-number matching), how conditioning will affect dispersion (principal angles), and what decay rate to expect (Lyapunov envelope), and then establishes these prescriptions as theorems.

1.2. Problem

Practitioners need to foresee, before allocating shots, when GBS will outperform MC and how to tune the device to realize that advantage. Concretely: given a covariance $B$ and an integrand $f$, can we predict in advance whether calibrated GBS will require asymptotically fewer samples than MC to reach a prescribed $(\varepsilon,\delta)$ accuracy, and can we prescribe a robust one-parameter tuning rule that maximizes the gain? We seek design-time criteria (not post hoc curve fits) that tell us when to use GBS, when to defer to MC, and how to switch between them by degree.

1.3. DSFL Intuition: Forecasting When GBS Beats MC

The DSFL framework predicts, a priori, when and how GBS will outperform Monte Carlo (MC) by placing the target statistics and the sampler in a single Hilbert geometry and tracking one quantity—the residual. This yields the following design-time principles:
  • Single yardstick. Target and sampler are embedded in the same space; a single residual provides a quantitative knob to optimize and monitor.
  • No inflation under tuning. Retuning, counting, and block composition act as nonexpansive maps; the residual cannot increase under admissible operations.
  • Unique, certifiable optimum. The dominant variance/MSE factor is strictly convex in the scale and attains its unique minimum at the mean-photon-number match $m_{tB} = 2K$; this yields a robust one-line tuning rule.
  • Predictable conditioning. Alignment between sampling and blueprint features is quantified by the principal angle $\theta_F$ (cosine $\gamma$); small $\gamma$ certifies low dispersion. A computable surrogate from warm-up features allows pre-run assessment.
  • Rates, not just limits. If each update removes a fixed fraction of the residual, the residual obeys a non-asymptotic exponential envelope, providing a predicted semi-log MSE slope before data collection.
  • Regime map and hybrid policy. The geometry identifies regimes where GBS is inherently favorable (high degrees) and where MC suffices (low degrees), and prescribes a degree-wise hybrid switch using the same residual/angle diagnostics.
  • Robustness. Composition and small hardware drifts degrade angles and rates smoothly; re-matching $m_{tB}$ restores near-optimality.
In summary, DSFL turns GBS from a black-box sampler into a calibrated, predictable engine: it specifies where to operate, how conditioning impacts dispersion, and what decay rate to expect, enabling an informed, pre-run decision on when—and by how much—GBS will outperform MC.

1.4. Thesis (DSFL View)

We show that GBS is a calibrated variance-reduction engine inside a single Hilbert comparison geometry governed by a Deterministic Statistical Feedback Law (DSFL). This provides one residual to minimize and four provable pillars: (i) Admissibility & DPI: counting and retuning are $L^2$-nonexpansive and intertwine with the estimator, so the calibrated residual cannot inflate; (ii) Calibration optimality: the unique minimizer of the leading variance/MSE factor is the photon-number match $m_{tB} = 2K$, which implements the tuning advocated and empirically optimized in [1,2]; (iii) Angle conditioning: principal (Friedrichs) angles between the sampling and blueprint subspaces yield sharp two-sided dispersion bounds $(1 \pm \cos\theta_F)$, explaining conditioning and leakage; (iv) Lyapunov envelope: a one-step contraction turns into a non-asymptotic exponential error decay with rate $\alpha(t)$ maximized at the photon match.

1.5. Contributions

  • A unified geometry that proves a data–processing inequality (no residual inflation) for tuning and composition, and recovers the $(\varepsilon,\delta)$ envelopes of [1].
  • A one-line strict-convexity proof that predicts the optimal tuning rule $m_{tB} = 2K$, matching the optimized estimators in [3].
  • Principal-angle bounds that forecast dispersion and a Lyapunov inequality that delivers exponential, non-asymptotic decay—together forming practical, pre-run design rules to decide GBS vs. MC and to set $t$.
  • A regime map that explains, a priori, when calibrated GBS is polynomial while MC is exponential (large degree), and a hybrid, degree-wise policy for the small-degree regime.
In short, DSFL turns “folk rules’’ into theorems: match $m_{tB}$, measure angles, and expect exponential envelopes—thereby foreseeing and tuning the regimes where GBS outperforms MC, in line with and extending the results of [1,2].

2. What GBS Does (Succinct but Precise)

GBS distribution and hafnians 

Let $B \in \mathbb{R}^{N\times N}$ be symmetric with $0 < \operatorname{spec}(B) < 1$. The GBS probability of a pattern $I \in \mathbb{N}^N$ is
\[
P_B(I) = \frac{d_B}{I!}\,\operatorname{Haf}(B_I)^2, \qquad d_B := \prod_{j=1}^{N}\sqrt{1-\lambda_j^2},
\]
where $\{\lambda_j\}$ are the eigenvalues of $B$. The mean photon number is
\[
m_B = \sum_{j=1}^{N} \frac{\lambda_j^2}{1-\lambda_j^2}.
\]

Gaussian expectations via Wick’s theorem 

For $f(x) = \sum_{k\ge 0}\sum_{|I|=k} a_I x^I$ and $X \sim \mathcal{N}(0,B)$,
\[
\mu = \mathbb{E}[f(X)] = \sum_{k\ge 0}\sum_{|I|=2k} a_I \operatorname{Haf}(B_I).
\]
A degree partition gives $\mu = \sum_{k=0}^{K} \mu_k$ with $\mu_k = \sum_{|I|=2k} a_I \operatorname{Haf}(B_I)$.

Two practical GBS estimators 

GBS-I (importance). For $\mu_k^{(2)} = \sum_{|I|=2k} a_I \operatorname{Haf}(B_I)^2$, the estimator using weights $I!\,a_I/d_B$ is unbiased.
GBS-P (probability). For $\mu_k = \sum_{|I|=2k} a_I \operatorname{Haf}(B_I)$, estimate $p_J = P_B(J)$ by empirical counts and average $\sqrt{p_J}$ with appropriate weights. The bias vanishes at rate $1/n$; the leading term is explicit.

Tuning knob: photon-number matching 

Scale $B \mapsto tB$ with $t \in (0, \lambda_1^{-1})$ and choose $t$ so that
\[
m_{tB} = \sum_{j=1}^{N} \frac{t^2\lambda_j^2}{1-t^2\lambda_j^2} = 2K,
\]
i.e., match the sampler's mean photon number to the target degree $K$.
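Since $m_{tB}$ is continuous, strictly increasing in $t$, and diverges as $t \uparrow \lambda_1^{-1}$, the matching scale can be found by bisection. The following minimal sketch illustrates this (the helper names are ours, not code from [1,2]):

```python
import numpy as np

def mean_photon_number(t, lam):
    """Mean photon number of the scaled covariance tB with eigenvalues lam."""
    x = (t * lam) ** 2
    return np.sum(x / (1.0 - x))

def match_photon_number(lam, K, tol=1e-12):
    """Solve m_{tB} = 2K for t in (0, 1/lam_max) by bisection.

    m_{tB} is strictly increasing in t and diverges as t -> 1/lam_max,
    so a unique root exists for any K >= 1."""
    lo, hi = 0.0, (1.0 - 1e-12) / np.max(lam)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_photon_number(mid, lam) < 2 * K:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example: flat spectrum, N = 8 modes, target degree K = 3.
lam = np.full(8, 0.5)
t_star = match_photon_number(lam, K=3)
print(t_star, mean_photon_number(t_star, lam))  # second value ~ 6.0
```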

3. Related Work

This section situates our DSFL-based treatment of Gaussian Boson Sampling (GBS) within three intertwined literatures: (i) the foundations of GBS, including sampling distributions, computational hardness, and experimental progress; (ii) GBS as a vehicle for numerical estimation, in particular its use for Gaussian expectation problems together with formal $(\varepsilon,\delta)$ sample-complexity claims and percentage-of-advantage quantification; and (iii) control-theoretic and geometric principles that justify our residual-contraction, angle-conditioned bounds, and data–processing inequalities in a single Hilbertian comparison geometry. We use these threads to explain how the “folk rules’’ of GBS operation—photon-number tuning, conditioning via feature alignment, and non–asymptotic error envelopes—can be elevated to theorems, and to place our results alongside prior theory and practice.

3.1. Foundations of GBS and Hardness

The modern formulation of GBS as a sampling task driven by hafnian-valued amplitudes is due to Hamilton et al. [4] with detailed modeling and refinements in [5]. Complexity-theoretic hardness follows the linear-optical lineage of Aaronson–Arkhipov [6] and extends to experimentally informed, high-dimensional variants [7,8]. Algorithmic primitives for hafnian evaluation and classical simulation—including supercomputer-grade implementations and the thewalrus library—are documented in [9,10]. Photonic demonstrations approaching or attaining quantum advantage are reported in [11,12]. Related Gaussian models include loop-hafnians for displaced states [13] and threshold-detection (Torontonian) statistics [14], which frame hardware-relevant variants. On the application side, GBS has been explored for vibronic spectra [15,16], molecular docking [17], dense subgraph discovery and graph kernels [18,19,20,21], random point processes [22,23], and scalable time-bin/loop architectures [24,25]. These works supply the physical and computational substrate on which our estimation-theoretic analysis is built.

3.2. GBS Estimators for Gaussian Expectations

The use of GBS samples to approximate Gaussian expectation problems (GEPs)—via Wick’s theorem [26] to express Gaussian moments as hafnian-weighted functionals—was developed in [1], which introduced two concrete estimators (GBS-I and GBS-P) and proved non-asymptotic $(\varepsilon,\delta)$ guarantees on an open, nonempty subset of instances. An optimized variant matching the device’s mean photon number to the target degree and quantifying the percentage of the problem space with GBS advantage appears in [2]. Our DSFL framework recovers these bounds within a single comparison geometry, explains the tuning law $m_{tB} = 2K$, and adds angle-conditioned rate controls. Complementary analytic identities (e.g., the hafnian master theorem [27]) and approximation tools [28,29] delineate tractable regimes and provide context for the sampling-based estimators considered here.

3.3. Hilbertian Residual Control and Data Processing

Our analysis leverages standard Hilbert-space contractivity of orthogonal projections (counting) and parameter changes (retuning) to derive a data–processing inequality (DPI) for the DSFL residual, together with principal-angle (Friedrichs) bounds for conditioning and Lyapunov-type decay envelopes. While these tools are classical (see, e.g., two-subspace/Friedrichs-angle inequalities and discrete Grönwall arguments), their joint instantiation for GBS estimation and calibration appears to be new in this context. For Monte Carlo baselines and variance-reduction perspectives that we compare against, see [30,31,32,33].

3.4. Applications of GBS

Proposed applications of GBS span vibronic spectroscopy [15,16], molecular docking [17], dense subgraph discovery [18], perfect matchings [19], graph similarity and kernels [20,21], point processes [22], and photonic simulations of molecular dynamics [23]. These lines have been complemented by platform-level studies of time-bin/loop architectures that aim at scalable implementations [24,25].

3.5. GBS Estimators for Gaussian Expectations

The specific connection between GBS and Gaussian expectation problems (GEPs) is made precise by Wick’s theorem [26], which expresses Gaussian moments as hafnians. Building on this structure, Andersen and Shan (i) introduced degree-resolved GBS estimators (GBS-P and GBS-I) for GEPs and proved open, nonempty regions of exponential advantage over plain Monte Carlo (MC), measured in guaranteed $(\varepsilon,\delta)$ sample sizes [1], and (ii) optimized the mean photon number to the task and quantified the fraction of the problem space displaying advantage, with large observed percentages and sizeable efficiency ratios [2]. Related algorithmic components include hafnian bounds (via Stirling-type inequalities) and polylogarithmic envelopes that deliver computable, nonasymptotic sample-complexity expressions.

3.6. Variance-Reduction and MC Baselines

From a classical perspective, GBS-I can be seen as an importance sampler whose proposal is the physical GBS law; GBS-P corresponds to a (biased-but-consistent) probability estimator with explicit leading error terms. These sit alongside standard MC variance-reduction techniques (importance, stratification, antithetic sampling, quasi-MC), whose strengths and limitations in high dimensions are well documented [30,31,32,33]. The present analysis adopts plain MC as the baseline, following [1,2].

3.7. Hafnian Identities and Tractable Limits

Structural identities such as the hafnian master theorem [27] provide tractable regimes for particular coefficient ensembles (e.g. $z^I/I!$ weights), while general approximation results for hafnians and permanents (e.g. Barvinok-type bounds) furnish algorithmic surrogates outside the photonic setting [28,29]. These are complementary to the sampling-based estimators considered here.

3.8. Angles, Data Processing, and Residual Control

The present DSFL view organizes the GBS estimation task in one Hilbert geometry and leverages two ingredients that are ubiquitous in information processing and operator theory: (i) data–processing inequalities (DPI) for nonexpansive, intertwining updates (counting maps, retuning), ensuring no residual inflation under admissible operations; and (ii) principal-angle (Friedrichs) bounds that control conditioning of two intertwined subspaces (blueprint vs sampling features) and yield two-sided variance inequalities with factors $(1 \pm \cos\theta_F)$. The Lyapunov envelope for the residual—a discrete coercivity statement converting one-step contraction into an exponential decay rate—is classical in stability theory and is instantiated here by the $t$-matched (photon-number–matched) tuning law. While the DSFL terminology is specific to this work, the underlying tools are aligned with standard Hilbertian projection theory and CS-decompositions.

3.9. Positioning

Relative to [1,2], the present manuscript provides a unifying explanation of when/why/how GBS outperforms MC by: (a) proving that mean–photon–number matching is the unique minimizer of the leading residual/variance factor; (b) quantifying conditioning via principal angles with explicit two-sided bounds; and (c) deriving nonasymptotic Lyapunov envelopes for error decay that recover the $(\varepsilon,\delta)$ sample-size laws. This framework is compatible with current hardware models [34,35,36] and classical tooling for hafnians [9,10], and it complements domain applications [15,17,18,19,20,22,23].

4. Limitations and Scope

The guarantees developed here are tailored to Gaussian expectation problems in which Wick reduction expresses the target as hafnian-weighted functionals of a real, symmetric covariance with spectrum strictly inside $(-1, 1)$. This setting aligns the photonic sampler’s distribution with the blueprint geometry in a way that makes DSFL calibration, principal-angle conditioning, and Lyapunov envelopes directly applicable. Extending the framework beyond this regime entails additional structure. For non-Gaussian observables or general boson sampling models (e.g., loop-hafnians with coherent displacements, threshold detection/Torontonian statistics, or complex-valued amplitudes with sign patterns), one must re-specify the feature map, the admissible updates, and the calibration functional so that (i) an appropriate inner-product geometry captures the relevant moments, (ii) the data–processing inequality remains valid under the physical update semigroup, and (iii) bias terms induced by nonlinearity or signs are explicitly controlled. A second limitation concerns conditioning proxies: our use of principal-angle surrogates (e.g., cross-Gram norms between sampling and blueprint features) is computationally light but approximate; computing exact projectors or Friedrichs angles may be prohibitive for very large $(N, K)$. Finally, while our analysis covers noiseless sampling and block composition via nonexpansive maps, hardware noise and calibration drift require model-specific contractions; the DSFL conclusions persist under such perturbations only insofar as a positive coercivity margin can be certified after re-matching the mean photon number.

4.1. DSFL Foresight for GBS: Verbal Logic → Mathematical Predictors

We give a from-first-principles account of how the Deterministic Statistical Feedback Law (DSFL) predicts (before any run) the behavior of Gaussian Boson Sampling (GBS): how tuning affects error, when/why it beats Monte Carlo (MC), how variance/bias scales, why mean-photon-number matching is optimal, and how composition/noise impact performance. Each verbal claim is paired with an explicit, checkable mathematical statement.

5. Stage 1: Put GBS in One Hilbert Geometry (the DSFL Stage)

DSFL places the statistical blueprint (the quantity to estimate) and the physical sampler (the device) in the same comparison geometry and measures a single misfit (residual). In our setting, GBS turns a scale parameter $t$ (the photon-number knob) into a sample law $P_{tB}$. The estimator (GBS-I or GBS-P restricted to degree $K$) is a calibration map $I_t$ that reads off $\mu_K$ from $P_{tB}$. Alignment ⇒ small residual; misalignment ⇒ large residual.

Mathematics:

Fix a degree $K$. Let the blueprint vector be
\[
s_K := (a_I)_{|I|=2K} \in \mathbb{R}^{D_K}, \qquad D_K = \#\{I : |I| = 2K\}.
\]
Let the physical law at scale $t$ be the degree-$2K$ marginal
\[
p(t) = \big(p_I(t)\big)_{|I|=2K}, \qquad p_I(t) = \frac{d_{tB}}{I!}\,\operatorname{Haf}\big((tB)_I\big)^2, \qquad d_{tB} = \prod_{j=1}^{N}\sqrt{1-t^2\lambda_j^2}.
\]
Work in $\mathcal{H} = \mathbb{R}^{D_K}$ with the standard inner product. Define the readout (calibration) $I_t : \mathcal{H} \to \mathcal{H}$ by the chosen estimator:
\[
\text{GBS-P (probability):}\quad (I_t\,p)_K \mapsto \sum_{|I|=2K} a_I\,\sqrt{\frac{p_I(t)}{d_{tB}}}, \qquad \text{GBS-I (importance on } \operatorname{Haf}^2\text{):}\quad (I_t\,p)_K \mapsto \sum_{|I|=2K} a_I\,\frac{I!}{d_{tB}}\,p_I(t).
\]
Define the Residual of Sameness
\[
R_K(t) := \big\|\,p(t) - I_t\,s_K\,\big\|_{\mathcal{H}}^2.
\]
This single scalar is the DSFL misfit we will minimize, bound, and track.
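To make these objects concrete at toy sizes, the sketch below materializes $p(t)$ for small $(N, K)$ with a naive hafnian routine and evaluates a schematic residual (all helper names are ours, and the identification $I_t = \mathrm{Id}$ is a simplification for illustration only):

```python
import numpy as np
from math import factorial
from itertools import product

def hafnian(A):
    """Naive hafnian via the perfect-matching recursion; fine for dim <= ~10."""
    n = A.shape[0]
    if n == 0:
        return 1.0
    if n % 2:
        return 0.0
    rest = list(range(1, n))
    total = 0.0
    for j in rest:                      # match index 0 with each partner j
        sub = [k for k in rest if k != j]
        total += A[0, j] * hafnian(A[np.ix_(sub, sub)])
    return total

def multi_indices(N, weight):
    """All I in N^N with |I| = weight."""
    return [I for I in product(range(weight + 1), repeat=N) if sum(I) == weight]

def submatrix(B, I):
    """B_I: repeat row/column j of B exactly I_j times."""
    idx = [j for j, m in enumerate(I) for _ in range(m)]
    return B[np.ix_(idx, idx)]

def gbs_law(B, t, K):
    """Degree-2K slice p(t) of the GBS law for the scaled covariance tB."""
    lam = np.linalg.eigvalsh(B)
    d_tB = np.prod(np.sqrt(1 - (t * lam) ** 2))
    idxs = multi_indices(B.shape[0], 2 * K)
    p = np.array([d_tB / np.prod([factorial(i) for i in I])
                  * hafnian(submatrix(t * B, I)) ** 2 for I in idxs])
    return idxs, p

# Toy instance: N = 2 modes, degree K = 1.
B = np.array([[0.5, 0.2], [0.2, 0.4]])
idxs, p = gbs_law(B, t=1.0, K=1)
s_K = np.ones(len(idxs)) / np.sqrt(len(idxs))   # blueprint on the unit sphere
print(idxs, p, np.sum((p - s_K) ** 2))          # schematic R_K(t) with I_t = Id
```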

Stage 2: Admissibility ⇒ No-Inflation (DPI)

Changing t and counting more samples is nonexpansive in the comparison geometry; residuals cannot inflate just by tuning/concatenation. Any sensible schedule is safe: at worst neutral, typically improving.

Mathematics (DPI)

Let $(\tilde\Phi, \Phi_t)$ be the blueprint/physical update. Here $\tilde\Phi = \mathrm{Id}$ while $\Phi_t$ transports $p \mapsto p(t')$ and applies the counting map. The intertwining $\Phi_t\,I_t = I_t\,\tilde\Phi$ holds by construction. If $\|\Phi_t\|_{\mathcal{H}\to\mathcal{H}} \le 1$ (counting is $L^2$-nonexpansive), then
\[
R_K\big(\tilde\Phi s_K, \Phi_t p\big) \le R_K(s_K, p).
\]
Prediction #1 (robustness). Any well-behaved change of $t$, any device concatenation (gluing), and any extra sampling cannot increase the DSFL residual.

Stage 3: Optimal Tuning = Mean-Photon Matching m t B = 2 K

Both the GBS distribution and the estimator operate on multi-indices with $|I| = 2K$. If the device emits photon numbers far from $2K$, mass is spent in the wrong degree, creating misalignment/variance. Matching the mean photon number to $2K$ centers mass where the estimator reads, minimizing misfit and MSE.

Mathematics (Calibration Optimality)

Let
\[
m_{tB} = \sum_{j=1}^{N} \frac{t^2\lambda_j^2}{1-t^2\lambda_j^2}.
\]
For GBS-P the leading MSE is
\[
\mathrm{MSE}_K^{\mathrm{GBS\text{-}P}}(t) \approx \frac{1}{4n}\left[\frac{1}{t^{2K}\,d_{tB}}\sum_{|I|=2K} a_I^2\,I! \;-\; \mu_K^2\right],
\]
and for GBS-I the variance is
\[
\operatorname{Var}\big(\hat\mu_K^{\mathrm{GBS\text{-}I}}(t)\big) = \frac{1}{n}\left[\frac{1}{t^{2K}\,d_{tB}}\sum_{|I|=2K} a_I^2\,I!\,\operatorname{Haf}(B_I)^2 \;-\; \big(\mu_K^{(2)}\big)^2\right].
\]
Both carry the multiplicative factor $C(t) = t^{-2K}\,d_{tB}^{-1}$. Then
\[
\frac{d}{dt}\log C(t) = -\frac{2K}{t} + \frac{m_{tB}}{t} = 0 \iff m_{tB} = 2K,
\]
and
\[
\frac{d^2}{dt^2}\log C(t) = \frac{2K}{t^2} + \sum_{j=1}^{N}\frac{\lambda_j^2}{1-t^2\lambda_j^2} + \sum_{j=1}^{N}\frac{2t^2\lambda_j^4}{(1-t^2\lambda_j^2)^2} > 0.
\]
Therefore $t^\star$ with $m_{t^\star B} = 2K$ is the unique strict minimizer:
\[
\operatorname*{arg\,min}_t\,\mathrm{MSE}/\operatorname{Var} = \{\,t : m_{tB} = 2K\,\}.
\]
Prediction #2 (tuning law). Set $t$ so that $m_{tB} = 2K$; this minimizes estimator MSE/variance and the residual $R_K(t)$.
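Both identities can be checked numerically before any sampling. A quick sketch (toy spectrum of our choosing) verifies that the minimizer of $\log C(t)$ sits at the photon match and that discrete second differences are positive:

```python
import numpy as np

def log_C(t, lam, K):
    """log C(t) = -2K log t - log d_{tB}, with d_{tB} = prod_j sqrt(1 - t^2 lam_j^2)."""
    return -2 * K * np.log(t) - 0.5 * np.sum(np.log(1 - (t * lam) ** 2))

lam, K = np.linspace(0.2, 0.8, 6), 2                 # toy spectrum, N = 6
ts = np.linspace(1e-3, 0.999 / lam.max(), 4000)
vals = np.array([log_C(t, lam, K) for t in ts])
t_min = ts[np.argmin(vals)]

m = lambda t: np.sum((t * lam) ** 2 / (1 - (t * lam) ** 2))
print(t_min, m(t_min))               # m(t_min) should be close to 2K = 4
print(np.all(np.diff(vals, 2) > 0))  # positive second differences: convexity
```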

Stage 4: Conditioning and Leakage via Principal Angles

If the sampling and blueprint subspaces are aligned, readout is clean; if skew, mass leaks. DSFL quantifies this by the Friedrichs angle $\theta_F$; $\gamma = \cos\theta_F$ acts like a conditioning factor: larger $\gamma$ (worse), smaller $\gamma$ (better).

Mathematics (Two-Subspace Inequality)

Let $P_{\mathrm{GBS}}(t)$ project onto the sampling features at scale $t$ and $P_{\deg K}$ onto the degree-$K$ blueprint. With $\gamma(t) = \|P_{\mathrm{GBS}}(t)\,P_{\deg K}\| = \cos\theta_F(t)$,
\[
(1-\gamma)\sum_{\bullet\in\{\mathrm{in},\mathrm{out}\}}\|e_\bullet\|^2 \;\le\; R_K(t) \;\le\; (1+\gamma)\sum_{\bullet\in\{\mathrm{in},\mathrm{out}\}}\|e_\bullet\|^2.
\]
Since the empirical MSE tracks $R_K$ (up to constants),
\[
(1-\gamma)\cdot(\text{split energy}) \;\lesssim\; \mathrm{MSE}/\operatorname{Var} \;\lesssim\; (1+\gamma)\cdot(\text{split energy}).
\]
Prediction #3 (angle effect). As $t \to t^\star$ with $m_{t^\star B} = 2K$, the angle $\theta_F(t)$ opens ($\gamma(t)$ shrinks) and the bounds tighten; mis-tuning increases $\gamma$ and hurts conditioning.
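The cosine can be approximated before a run from feature matrices alone. A minimal sketch (random toy features; the function name is ours) of the spectral-norm proxy used later as $\hat\gamma(t) = \|X_t^{\top} Y_K\|_2$:

```python
import numpy as np

def angle_surrogate(X, Y):
    """Spectral-norm proxy for cos(theta_F) between the column spans of X, Y.
    Columns are orthonormalized so the norm compares the subspaces themselves."""
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)
    return np.linalg.norm(Qx.T @ Qy, 2)

rng = np.random.default_rng(1)
M, D = 500, 20
Y = rng.normal(size=(M, D))                       # blueprint features (toy)
X_overlap = Y + 0.1 * rng.normal(size=(M, D))     # strongly overlapping spans
X_fresh = rng.normal(size=(M, D))                 # independent spans

print(angle_surrogate(X_overlap, Y))  # near 1: large gamma, poor conditioning
print(angle_surrogate(X_fresh, Y))    # smaller gamma: favorable regime
```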

Stage 5: Rates via a Lyapunov Envelope

Near the matched regime, the error behaves like a dissipative system: each sample reduces a fixed fraction of residual. DSFL encodes this with a Lyapunov inequality, giving a non-asymptotic exponential envelope.

Mathematics:

Let $R_n = \mathbb{E}\,R_K^{(n)}$. Under admissibility (DPI) and a coercivity margin $\alpha(t) > 0$ (monotone in alignment/angle),
\[
R_{n+1} \le R_n - \alpha(t)\,R_n \quad\Longrightarrow\quad R_n \le e^{-\alpha(t)\,n}\,R_0.
\]
Thus
\[
\mathrm{MSE}_n \le C\,e^{-\alpha(t)\,n}, \qquad \alpha(t) \text{ maximal at } m_{tB} = 2K.
\]
Prediction #4 (rate). The decay slope $\alpha(t)$ is largest at the match; mis-tuning reduces $\alpha$.
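In a semi-log plot this envelope is a straight line of slope $-\alpha(t)$, predictable before data collection. A two-line illustration (rates chosen arbitrarily for the sketch):

```python
import numpy as np

alpha_matched, alpha_mistuned = 0.05, 0.02   # illustrative rates only
n = np.arange(200)
R_matched = (1 - alpha_matched) ** n         # R_0 = 1
slope = np.polyfit(n, np.log(R_matched), 1)[0]
print(slope, np.log(1 - alpha_matched))      # fitted slope equals log(1 - alpha)
print((1 - alpha_mistuned) ** 100 / (1 - alpha_matched) ** 100)  # mis-tuning cost
```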

Stage 6: When GBS Beats MC (Regime Maps)

MC at degree K inherits all cross-moment mixings, exploding with effective dimension; calibrated GBS concentrates on the right degree. DSFL converts this into sample-complexity comparisons.

Mathematics (Complexity Ratio)

Chebyshev/Markov give guaranteed sizes
\[
n_{\mathrm{GBS\text{-}P}}(t) \ge \frac{1}{\varepsilon^2\delta}\cdot\frac{t^{-2K} d_{tB}^{-1}\sum_{|I|=2K} a_I^2\,I! - \mu_K^2}{\mu_K^2}, \qquad n_{\mathrm{GBS\text{-}I}}(t) \ge \frac{1}{\varepsilon^2\delta}\cdot\frac{t^{-2K} d_{tB}^{-1}\sum_{|I|=2K} a_I^2\,I!\operatorname{Haf}(B_I)^2 - \big(\mu_K^{(2)}\big)^2}{\big(\mu_K^{(2)}\big)^2},
\]
versus
\[
n_{\mathrm{MC}} \ge \frac{1}{\varepsilon^2\delta}\cdot\frac{\sum_{I,I'} a_I\,a_{I'}\operatorname{Haf}\big(B_{I+I'}\big)\ (\text{or } \operatorname{Haf}^2) - \mu^2}{\mu^2}.
\]
For large $K$ or correlated growth $K \ge \zeta N^2$, the MC numerator inflates exponentially (many cross indices), while GBS stays calibrated:
\[
\text{Large } K \ (\text{or } K \ge \zeta N^2): \quad n_{\mathrm{GBS}} = \operatorname{poly}(N), \qquad n_{\mathrm{MC}} = \exp\big(\Theta(N^2)\big).
\]
Prediction #5 (regime map). Large $K$ ⇒ GBS wins; small $K$ ⇒ use the DSFL angle proxy to switch (hybrid GBS/MC).

Stage 7: What Happens When You Sweep t 

MSE versus $t$ is U-shaped with the minimum exactly at $m_{tB} = 2K$; sharp spectra create narrower minima (more sensitive tuning).

Mathematics:

Let $g(t) = \log\big(t^{-2K}\,d_{tB}^{-1}\big)$. Then
\[
g'(t) = -\frac{2K}{t} + \frac{m_{tB}}{t}, \qquad g'(t) = 0 \iff m_{tB} = 2K, \qquad g''(t) = \frac{2K}{t^2} + \sum_j \frac{\lambda_j^2}{1-t^2\lambda_j^2} + \sum_j \frac{2t^2\lambda_j^4}{(1-t^2\lambda_j^2)^2} > 0.
\]
Prediction #6 ($t$-sweep). A unique convex minimum at $m_{tB} = 2K$; sharper eigenvalue spectra yield narrower minima.
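The width claim can also be checked numerically. Reusing `log_C` from the sketch above (the two spectra are toy choices), the sublevel set near the minimum is visibly narrower for a sharply decaying spectrum:

```python
import numpy as np

K = 2
flat = np.full(6, 0.55)                              # flat spectrum
sharp = np.array([0.9, 0.5, 0.3, 0.2, 0.1, 0.05])    # sharply decaying spectrum

for lam in (flat, sharp):
    ts = np.linspace(1e-3, 0.999 / lam.max(), 2000)
    vals = np.array([log_C(t, lam, K) for t in ts])
    i = int(np.argmin(vals))
    width = np.ptp(ts[vals <= vals[i] + 0.1])        # width of the 0.1-sublevel set
    print(f"t* = {ts[i]:.3f}, minimum width ~ {width:.3f}")
```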

Stage 8: Composition and Noise

Composition: concatenating blocks is admissible; residuals cannot inflate. Noise: small Gaussian perturbations contract in $\mathcal{H}$; the same predictions hold with slightly reduced rate/angle; re-matching $m_{tB}$ restores optimality. [37,38]

Mathematics:

If each block satisfies $\|\Phi_\ell\| \le 1$ and $\Phi_\ell\,I = I\,\tilde\Phi_\ell$, then for $\Phi = \prod_\ell \Phi_\ell$,
\[
R\big(\tilde\Phi s, \Phi p\big) \le R(s, p) \qquad (\text{iterated DPI}).
\]
For $B \mapsto B + \Delta$ with $\|\Delta\|$ small, Davis–Kahan yields
\[
\sin\theta_F(B+\Delta) \lesssim \frac{\|\Delta\|}{\mathrm{gap}}, \qquad d_{t(B+\Delta)} = d_{tB}\cdot\big(1 + O(\|\Delta\|)\big).
\]
Hence $\alpha(t)$ decreases by $O(\|\Delta\|)$; $m_{t(B+\Delta)} = 2K$ still centers the optimum.
\[
\text{Small noise} \;\Rightarrow\; \text{graceful degradation}; \qquad \text{re-match } m_{tB} \text{ to recover near-optimality}.
\]

Stage 9: Actionable Tests

The DSFL analysis yields a concrete testing protocol that can be implemented prior to data collection and updated on-line with negligible overhead. First, the sampler must be calibrated by matching its mean photon number to the target degree. Concretely, given a covariance $B$ with spectrum $\{\lambda_j\}_{j=1}^N \subset (0,1)$ and a degree index $K \in \mathbb{N}$, the scale $t^\star \in (0, \lambda_1^{-1})$ is defined as the unique solution of the monotone equation
\[
m_{tB} := \sum_{j=1}^{N} \frac{t^2\lambda_j^2}{1-t^2\lambda_j^2} = 2K. \tag{26}
\]
Setting $t = t^\star$ centers the sampling law on multi-indices of weight $2K$ and, by the variance formulas derived above, minimizes the mean–square error for both GBS-I and GBS-P.
Second, conditioning is certified by a computable principal-angle proxy. Let $X_t \in \mathbb{R}^{M\times D_K}$ denote a (finite) feature matrix obtained from $M$ synthetic or warm-up GBS feature evaluations at scale $t$, and let $Y_K \in \mathbb{R}^{M\times D_K}$ encode a basis for the degree-$K$ blueprint subspace (e.g. normalized monomials indexed by $\{|I| = 2K\}$). The quantity
\[
\hat\gamma(t) := \big\|X_t^{\top} Y_K\big\|_2 \tag{27}
\]
upper bounds the Friedrichs cosine $\gamma = \|P_{\mathrm{GBS}}(t)\,P_{\deg K}\|$ up to discretization error. Small values of $\hat\gamma(t)$ certify good alignment (large principal angle $\theta_F$) and hence favorable two-sided variance bounds, whereas large values indicate ill-conditioning and trigger either a re-tuning of $t$ toward (26) or a temporary deferral to the Monte Carlo channel for the corresponding $K$.
Third, channel selection can be automated by an empirical residual criterion. Denote by $\hat R_K$ the empirical DSFL residual estimated from a fixed budget of warm-up samples (identical across candidates). Choosing the channel with smaller $\hat R_K$ minimizes the predicted MSE under the nonexpansive dynamics that define DSFL and is therefore statistically safe. Since both GBS and MC updates are $L^2$-nonexpansive in the comparison geometry, this selection step does not increase the residual in expectation.
Finally, under architectural changes (e.g. insertion of an additional optical block or concatenation in a time-bin loop), admissibility guarantees a data–processing inequality for the composite map, hence no inflation of the residual; only the calibration Equation (26) must be re-checked for the updated effective covariance and scale. In particular, the composition of nonexpansive, intertwining maps remains nonexpansive and intertwining, so the DSFL guarantees persist verbatim after gluing.
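Putting the three steps together, a pre-run decision routine might look as follows. This is a sketch under our own naming, reusing `match_photon_number` from Section 2; the threshold $\bar\gamma$ and the warm-up residuals are inputs whose computation the protocol above describes only at the level of principle:

```python
import numpy as np

def pre_run_protocol(lam, K, X_t, Y_K, gamma_bar, resid_gbs, resid_mc):
    """Stage-9 protocol sketch: calibrate, gate on the angle proxy,
    then fall back to the empirical residual comparison."""
    t_star = match_photon_number(lam, K)         # step 1: solve m_{tB} = 2K
    gamma_hat = np.linalg.norm(X_t.T @ Y_K, 2)   # step 2: proxy (27)
    if gamma_hat <= gamma_bar:
        return t_star, "GBS"                     # conditioning certified
    # step 3: identical warm-up budgets; smaller residual is statistically safe
    return t_star, ("GBS" if resid_gbs <= resid_mc else "MC")
```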

Executive Summary of Foresight

Within a single Hilbert geometry, DSFL certifies that tuning and concatenation steps are nonexpansive; hence, the residual cannot deteriorate under admissible manipulation of the sampler or of the pipeline. The unique operating point that minimizes the mean–square error and the estimator variance is characterized by the mean-photon-number matching condition $m_{tB} = 2K$, whose underlying objective is strictly convex in $t$ and therefore algorithmically well-posed. Conditioning is governed by the principal angle $\theta_F$ between the sampling subspace at scale $t$ and the degree-$K$ blueprint; the associated cosine $\gamma = \cos\theta_F$ controls sharp two-sided variance bounds and decreases as the calibration improves. In the matched regime, the expected residual satisfies a discrete Lyapunov inequality and thus admits a non-asymptotic exponential envelope $\mathrm{MSE}_n \le C\,e^{-\alpha(t)n}$ with decay rate $\alpha(t)$ maximized at the calibration point. Consequently, for high degrees (in particular when $K$ grows as $\zeta N^2$), the calibrated GBS estimators exhibit polynomial sample complexity in $N$, while the plain Monte Carlo baseline inherits an exponential dependence through cross-moment couplings; for small $K$, the angle proxy (27) recommends a hybrid policy that defers to MC until alignment becomes favorable. The dependence of the MSE on the scale $t$ is convex with a unique minimum at $m_{tB} = 2K$; perturbations of $B$ (hardware noise) or blockwise composition merely reduce the alignment and rate by controllable amounts, and re-matching via (26) restores near-optimal performance. Throughout, no inflation occurs under gluing by the data–processing inequality.

Bottom Line

The preceding statements are a priori consequences of the DSFL axioms (intertwining and $L^2$-nonexpansive updates) in conjunction with the Wick/GBS representation of Gaussian expectations. They condense into operational equations
\[
m_{tB} = 2K, \qquad \gamma = \cos\theta_F, \qquad \alpha(t) \ \text{increasing in alignment},
\]
which together predict the behavior of GBS before sampling and specify the tuning required to achieve optimal performance in practice.

6. DSFL Primer (Minimal Machinery, with Proofs)

6.1. Comparison Geometry

Let $(\mathcal{H}, \langle\cdot,\cdot\rangle)$ be a (real or complex) Hilbert space equipped with the norm $\|x\| := \sqrt{\langle x, x\rangle}$. We fix two closed subspaces $\mathcal{S}, \mathcal{P} \subset \mathcal{H}$ (“statistical” and “physical” channels), and denote by $P_{\mathcal{S}}, P_{\mathcal{P}}$ the corresponding orthogonal projectors.

6.2. Calibration (Interchangeability)

Definition 6.1 
(Calibration pair). A calibration is a pair of bounded operators
\[
\mathcal{I} : \mathcal{S} \to \mathcal{P}, \qquad \mathcal{J} : \mathcal{P} \to \mathcal{S},
\]
satisfying the interchangeability identities
\[
\mathcal{I}\,\mathcal{J} = \mathrm{id}_{\mathcal{P}}, \qquad \mathcal{J}\,\mathcal{I} = P_{\mathcal{S}}. \tag{30}
\]
Remark 6.1.
The relations (30) imply that $\mathcal{I}$ is a left inverse of $\mathcal{J}$ on $\mathcal{P}$, and $\mathcal{J}$ is a right inverse of $\mathcal{I}$ on $\mathcal{S}$. In particular, $\mathcal{I}$ is an isomorphism of $\mathcal{S}/\ker\mathcal{I}$ onto $\mathcal{P}$, and $\mathcal{J}$ restricts to the identity on $\mathcal{I}(\mathcal{S})$.

6.3. Residual of Sameness

Definition 6.2 
(Residual). Given $s \in \mathcal{S}$ (blueprint) and $p \in \mathcal{P}$ (response), the Residual of Sameness is the squared $\mathcal{H}$-distance
\[
R(s, p) := \|\,p - \mathcal{I}s\,\|_{\mathcal{H}}^2.
\]
Remark 6.2.
By construction, $R(s,p) = 0$ iff $p = \mathcal{I}s$. Moreover, $R$ is invariant under any isometry $U$ on $\mathcal{H}$ that preserves $\operatorname{ran}(\mathcal{I})$: if $U(\operatorname{ran}\mathcal{I}) \subseteq \operatorname{ran}\mathcal{I}$, then $\|Up - U\mathcal{I}s\| = \|U(p - \mathcal{I}s)\| = \|p - \mathcal{I}s\|$.

6.4. Admissibility and the Data–Processing Inequality

Definition 6.3 
(Admissible update). A pair of bounded operators $(\tilde\Phi, \Phi)$ with
\[
\tilde\Phi : \mathcal{S} \to \mathcal{S}, \qquad \Phi : \mathcal{P} \to \mathcal{P},
\]
is admissible if:
1.
Intertwining (calibration consistency):
\[
\Phi\,\mathcal{I} = \mathcal{I}\,\tilde\Phi; \tag{33}
\]
2.
Nonexpansiveness: $\|\Phi\|_{\mathcal{H}\to\mathcal{H}} \le 1$.
Theorem 6.4 
(Data–processing inequality (DPI)). If $(\tilde\Phi, \Phi)$ is admissible, then for all $s \in \mathcal{S}$ and $p \in \mathcal{P}$,
\[
R\big(\tilde\Phi s, \Phi p\big) \le R(s, p). \tag{34}
\]
Proof.
Using (33),
\[
R\big(\tilde\Phi s, \Phi p\big) = \|\Phi p - \mathcal{I}\tilde\Phi s\|^2 = \|\Phi p - \Phi\mathcal{I}s\|^2 = \|\Phi(p - \mathcal{I}s)\|^2 \le \|\Phi\|^2\,\|p - \mathcal{I}s\|^2 \le \|p - \mathcal{I}s\|^2,
\]
which is (34).    □
Corollary 6.5 
(Iterated DPI). If $\{(\tilde\Phi_k, \Phi_k)\}_{k=1}^m$ are admissible, then so is $(\tilde\Phi, \Phi) = \big(\circ_{k=1}^m \tilde\Phi_k,\ \circ_{k=1}^m \Phi_k\big)$, and
\[
R\big(\tilde\Phi s, \Phi p\big) \le R(s, p).
\]
Proof.
Intertwining and $\|\cdot\| \le 1$ are preserved under composition. Apply Theorem 6.4.    □

6.5. Angles and Conditioning

Let $\mathcal{M}, \mathcal{N} \subset \mathcal{H}$ be closed subspaces with orthogonal projectors $P_{\mathcal{M}}, P_{\mathcal{N}}$.
Definition 6.6
(Friedrichs angle). The Friedrichs cosine is $\gamma(\mathcal{M},\mathcal{N}) := \|P_{\mathcal{M}} P_{\mathcal{N}}\|$, and the Friedrichs angle $\theta_F(\mathcal{M},\mathcal{N}) \in [0, \tfrac{\pi}{2}]$ is defined by
\[
\cos\theta_F(\mathcal{M},\mathcal{N}) := \gamma(\mathcal{M},\mathcal{N}).
\]
We recall a standard two–subspace inequality; we include a proof for completeness.
Proposition 6.7 
(Two–subspace bounds). Let $\mathcal{M}, \mathcal{N} \subset \mathcal{H}$ be closed subspaces and set $\gamma = \gamma(\mathcal{M},\mathcal{N})$. For any $a \in \mathcal{M}$, $b \in \mathcal{N}$,
\[
(1-\gamma)\big(\|a\|^2 + \|b\|^2\big) \;\le\; \|a + b\|^2 \;\le\; (1+\gamma)\big(\|a\|^2 + \|b\|^2\big). \tag{38}
\]
Proof.
By polarization,
\[
\|a+b\|^2 = \|a\|^2 + \|b\|^2 + 2\,\mathrm{Re}\,\langle a, b\rangle.
\]
Since $a \in \mathcal{M}$, $b \in \mathcal{N}$, we have $|\langle a, b\rangle| = |\langle a, P_{\mathcal{M}} b\rangle| \le \|a\|\,\|P_{\mathcal{M}} b\| \le \gamma\,\|a\|\,\|b\|$. Thus
\[
-2\gamma\|a\|\|b\| \;\le\; 2\,\mathrm{Re}\,\langle a, b\rangle \;\le\; 2\gamma\|a\|\|b\|.
\]
Using $2\|a\|\|b\| \le \|a\|^2 + \|b\|^2$ and $-2\|a\|\|b\| \ge -(\|a\|^2 + \|b\|^2)$ yields (38).    □
Proposition 6.8 
(Through–subspace contraction). Let $\mathcal{M}, \mathcal{N} \subset \mathcal{H}$ be closed subspaces with $\gamma = \|P_{\mathcal{M}} P_{\mathcal{N}}\|$. Then for any bounded $T : \mathcal{H} \to \mathcal{H}$,
\[
\|P_{\mathcal{M}}\, T\, P_{\mathcal{N}}\| \le \|T\|\,\gamma. \tag{41}
\]
In particular, if $\|T\| \le 1$, then $\|P_{\mathcal{M}} T P_{\mathcal{N}}\| \le \gamma = \cos\theta_F(\mathcal{M},\mathcal{N})$.
Proof.
By the dual characterization of the operator norm,
\[
\|P_{\mathcal{M}} T P_{\mathcal{N}}\| = \sup_{\|x\|=\|y\|=1} |\langle P_{\mathcal{M}} T P_{\mathcal{N}} x,\, y\rangle| = \sup_{\|x\|=\|y\|=1} |\langle T P_{\mathcal{N}} x,\, P_{\mathcal{M}} y\rangle| \le \|T\| \sup_{\|x\|=\|y\|=1} |\langle P_{\mathcal{N}} x,\, P_{\mathcal{M}} y\rangle|.
\]
But $\sup_{\|x\|=\|y\|=1} |\langle P_{\mathcal{N}} x, P_{\mathcal{M}} y\rangle| = \|P_{\mathcal{M}} P_{\mathcal{N}}\|$ (take the singular value decomposition of $P_{\mathcal{M}} P_{\mathcal{N}}$). Hence (41).    □
Remark 6.3
(Interpretation). In applications, $\mathcal{M}$ and $\mathcal{N}$ are the “pointer” subspaces representing, respectively, the sampling polarization and the blueprint (degree-$K$) polarization. The factor $\gamma = \cos\theta_F$ controls leakage and conditioning across the seam: estimates that pass through an $L^2$-contractive kernel are suppressed by at least the factor $\cos\theta_F$.

6.6. Lyapunov Envelopes (Continuous and Discrete)

We formalize a standard energy–dissipation mechanism that will underlie non–asymptotic error envelopes.
Lemma 6.9 
(Differential energy inequality). Let $e(t) \in \mathcal{H}$ be (weakly) differentiable and set $R(t) := \|e(t)\|^2$. Assume there exists a bounded self-adjoint $K = K^* \ge 0$ and a remainder $g(t) \in \mathcal{H}$ such that
\[
\frac{d}{dt} R(t) = -2\,\langle K e(t), e(t)\rangle + 2\,\langle e(t), g(t)\rangle. \tag{43}
\]
If, for some $\kappa > 0$ and $\varepsilon \ge 0$,
\[
\langle K e, e\rangle \ge \kappa\,\|e\|^2, \qquad |\langle e, g\rangle| \le \varepsilon\,\|e\|^2 \quad \text{for all } e \in \mathcal{H}, \tag{44}
\]
then
\[
\frac{d}{dt} R(t) \le -2(\kappa - \varepsilon)\,R(t). \tag{45}
\]
Proof.
Combine (43) and (44):
\[
\frac{d}{dt} R(t) \le -2\kappa\,\|e(t)\|^2 + 2\varepsilon\,\|e(t)\|^2 = -2(\kappa - \varepsilon)\,R(t).
\]
   □
Theorem 6.10 
(Continuous Lyapunov envelope). Under the hypotheses of Lemma 6.9 with $\alpha := 2(\kappa - \varepsilon) > 0$,
\[
R(t) \le e^{-\alpha(t - t_0)}\,R(t_0) \qquad \text{for all } t \ge t_0.
\]
Proof.
Apply Grönwall’s inequality to (45).    □
A discrete analogue covers sampling iterations.
Proposition 6.11 
(Discrete Lyapunov envelope). Let $(R_n)_{n\in\mathbb{N}}$ be nonnegative numbers such that for some $\alpha \in (0, 1]$,
\[
R_{n+1} \le (1 - \alpha)\,R_n \qquad \text{for all } n \in \mathbb{N}. \tag{48}
\]
Then
\[
R_n \le (1-\alpha)^n R_0 \le e^{-\alpha n} R_0.
\]
Proof.
Iterate (48) and use $(1-\alpha)^n \le e^{-\alpha n}$.    □
Remark 6.4
(Connection to admissibility). In many applications (e.g. residuals computed after one sampling/update step), admissibility of the update and a sectorial coercivity provide (48) in expectation, yielding an expected exponential envelope of the form $\mathbb{E}(R_n) \le e^{-\alpha n}\,\mathbb{E}(R_0)$.

7. Mapping GBS ⟺ DSFL

Here we place the GBS estimation problem and the DSFL control law into a single, explicit Hilbert–space model and show that the main “folk rules” for operating a Gaussian Boson Sampler—how to tune the mean photon number, when advantage over Monte Carlo (MC) appears, how conditioning depends on feature alignment, and why errors decay at a fixed rate—follow as theorems from one comparison geometry. The construction has three ingredients. First, for a fixed target degree $K$, we embed both the statistical blueprint (the degree-$K$ coefficient vector of the integrand) and the physical sampling law induced by the device at scale $t$ into the same finite-dimensional Hilbert space $\mathcal{H}_K = \mathbb{R}^{D_K}$ indexed by multi-indices $\mathcal{I}_K = \{I : |I| = 2K\}$. This makes the GBS law $p(t)$ and the blueprint $s_K$ comparable as vectors and allows us to measure a single misfit—the DSFL residual
\[
R_{\mathrm{res}}^{(K)}(t) = \big\|\,p(t) - \Pi s_K\,\big\|_{\mathcal{H}_K}^2.
\]
Second, we model the estimators (GBS-P and GBS-I restricted to degree $K$) as bounded linear functionals $\mathcal{I} : \mathcal{H}_K \to \mathbb{R}$, so that “reading off” the degree-$K$ contribution to the Gaussian expectation is nothing but a linear evaluation in $\mathcal{H}_K$. In this form, the basic DSFL admissibility hypotheses hold for the two primitive operations available on hardware or in simulation: (a) changing the scale $t$ (retuning the source) and (b) replacing the law by its empirical counterpart (counting). Both operations act by contractions on $\mathcal{H}_K$ and intertwine with the calibration $\mathcal{I}$. Consequently, the DSFL data–processing inequality applies directly: no admissible update can inflate the degree-$K$ residual.
Third, we extract quantitative laws from the geometry. A short calculus identity shows that the leading $t$-dependence of the residual and of the estimator variance is governed by the convex factor
\[
C_{2K}(t) = t^{-2K}\,d_{tB}^{-1}, \qquad \text{with} \quad \frac{d}{dt}\log C_{2K}(t) = \frac{m_{tB} - 2K}{t},
\]
so the unique minimizer occurs at the photon-number match $m_{tB} = 2K$. This is the precise DSFL formulation of “mean-photon-number matching is optimal.” Conditioning and leakage across degrees are controlled by the Friedrichs principal angle $\theta_F(t)$ between the blueprint subspace and the sampling feature subspace in $\mathcal{H}_K$; standard two-subspace estimates yield two-sided variance (or squared-bias) bounds with multiplicative factors $(1 \pm \cos\theta_F(t))$. Finally, a one-step conditional contraction of the residual—guaranteed by admissibility plus a positive angle margin—produces a Lyapunov inequality and hence a non-asymptotic exponential envelope
\[
\mathbb{E}\,R_{\mathrm{res}}^{(K)}\big(p^{(n)}(t)\big) \le e^{-\alpha(t)\,n}\,R_{\mathrm{res}}^{(K)}\big(p^{(0)}(t)\big),
\]
with $\alpha(t)$ maximal at the photon match.
The subsections below make this scheme explicit. We first specify the blueprint $s_K$, the physical channel $p(t)$, and the linear estimator $\mathcal{I}$ inside $\mathcal{H}_K$. We then verify admissibility (intertwining and nonexpansiveness) for counting and retuning, yielding a degree-$K$ data–processing inequality. Next, we prove calibration optimality—photon-number matching minimizes the residual—and derive principal-angle bounds for the estimator’s dispersion. We conclude with a Lyapunov-type result that converts the one-step contraction into an exponential decay rate, thereby recovering and clarifying the $(\varepsilon,\delta)$ sample-complexity laws for GBS and the observed advantage regimes over MC.

7.1. Blueprint, Physical Channel, and Calibration in One Geometry

Fix an integer $K \ge 0$ and a symmetric covariance $B \in \mathbb{R}^{N\times N}$ with spectrum $0 < \lambda_N \le \dots \le \lambda_1 < 1$. Let
\[
\mathcal{I}_K := \big\{I = (i_1, \dots, i_N) \in \mathbb{N}^N : |I| := i_1 + \dots + i_N = 2K\big\}, \qquad D_K := \#\mathcal{I}_K.
\]

7.1.1. Hilbert Model

We work on the comparison Hilbert space
\[
\mathcal{H}_K := \mathbb{R}^{D_K}, \qquad \langle x, y\rangle := \sum_{I\in\mathcal{I}_K} x_I\,y_I, \qquad \|x\|^2 := \sum_I x_I^2.
\]

7.1.2. Blueprint

Let $f(x) = \sum_{k\ge 0}\sum_{|I|=k} a_I x^I$ be the target integrand. The degree-$K$ blueprint is the coefficient vector
\[
s_K := (a_I)_{|I|=2K} \in \mathcal{H}_K.
\]
Wick’s theorem yields the degree-$K$ contribution to the Gaussian expectation
\[
\mu_K = \sum_{|I|=2K} a_I\,\operatorname{Haf}(B_I), \qquad \mu_K^{(2)} = \sum_{|I|=2K} a_I\,\operatorname{Haf}(B_I)^2,
\]
where $B_I$ denotes the (multi-)submatrix of $B$ indexed by $I$.

7.1.3. Physical Channel

For any scale $t \in (0, \lambda_1^{-1})$, the GBS distribution on $\mathcal{I}_K$ is
\[
p_I(t) = \frac{d_{tB}}{I!}\,\operatorname{Haf}\big((tB)_I\big)^2, \qquad d_{tB} = \prod_{j=1}^{N}\sqrt{1-t^2\lambda_j^2}.
\]
We regard $p(t) := (p_I(t))_{|I|=2K} \in \mathcal{H}_K$ as the “physical state” at scale $t$. The (global) mean photon number of $tB$ is
\[
m_{tB} := \sum_{j=1}^{N} \frac{t^2\lambda_j^2}{1-t^2\lambda_j^2}.
\]

7.1.4. Calibration (Estimator Readout)

We model the degree-$K$ estimators as linear functionals on $\mathcal{H}_K$:
\[
\text{GBS-P (probability):}\quad \mathcal{I}^{\mathrm P}(x) := \sum_{|I|=2K} a_I\,\sqrt{\frac{x_I}{d_{tB}}} =: \langle w^{\mathrm P}, \sqrt{x}\,\rangle, \qquad w^{\mathrm P}_I = \frac{a_I}{\sqrt{d_{tB}}},
\]
\[
\text{GBS-I (importance):}\quad \mathcal{I}^{\mathrm I}(x) := \sum_{|I|=2K} a_I\,\frac{I!}{d_{tB}}\,x_I =: \langle w^{\mathrm I}, x\rangle, \qquad w^{\mathrm I}_I = \frac{I!\,a_I}{d_{tB}}.
\]
Thus $\mathcal{I} : \mathcal{H}_K \to \mathbb{R}$ is bounded (by Cauchy–Schwarz). For empirical laws $\hat p(t)$ obtained by counting, we use the same $\mathcal{I}$; admissibility will follow from linearity and nonexpansiveness of orthogonal projections.
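A small sketch of these readouts acting on an empirical law (our own helper names; note that the GBS-P readout becomes linear only after the substitution $x \mapsto \sqrt{x}$, matching the square-root estimator used later):

```python
import numpy as np
from math import factorial

def readout_weights(a, idxs, d_tB):
    """Weights for the degree-K readouts: w^P_I = a_I / sqrt(d_tB),
    w^I_I = I! a_I / d_tB (idxs lists the multi-indices with |I| = 2K)."""
    fact = np.array([np.prod([factorial(i) for i in I]) for I in idxs], float)
    return a / np.sqrt(d_tB), fact * a / d_tB

def readouts(p_hat, w_P, w_I):
    """GBS-P acts on the square root of the law, GBS-I on the law itself."""
    return float(w_P @ np.sqrt(p_hat)), float(w_I @ p_hat)
```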

7.1.5. Residual

With this embedding, the DSFL residual for the degree-$K$ task at scale $t$ is
\[
R_{\mathrm{res}}^{(K)}(t) := \big\|\,p(t) - \Pi s_K\,\big\|_{\mathcal{H}_K}^2,
\]
where $\Pi : \mathcal{S} \to \mathcal{H}_K$ is the canonical identification $\Pi s_K = s_K$ (we suppress the statistics subspace and view $s_K$ as an element of $\mathcal{H}_K$).

7.2. Admissibility of GBS Tuning and Counting

Lemma 7.1 
(Intertwining). Let $\tilde\Phi = \mathrm{id}_{\mathcal{H}_K}$ and let $\Phi$ be either: (i) the identity (fixed $t$) followed by the orthogonal projection $\operatorname{Proj}$ onto the simplex of probability vectors (counting), or (ii) the change of scale $t \mapsto t'$ followed by $\operatorname{Proj}$. Then for $\mathcal{I} \in \{\mathcal{I}^{\mathrm P}, \mathcal{I}^{\mathrm I}\}$ one has
\[
\mathcal{I}\,\tilde\Phi = \mathcal{I}, \qquad \Phi\,\Pi = \Pi, \qquad \text{hence} \quad \Phi\,\mathcal{I} = \mathcal{I}\,\tilde\Phi.
\]
Proof.
$\tilde\Phi = \mathrm{id}$ on the blueprint space; $\Phi$ acts on physical vectors only and preserves the ambient space $\mathcal{H}_K$ (projection and change of parameters do not move $s_K$). Therefore $\Phi\,\Pi = \Pi$ and $\mathcal{I}\,\tilde\Phi = \mathcal{I}$, yielding the intertwining $\Phi\,\mathcal{I} = \mathcal{I}\,\tilde\Phi$.    □
Lemma 7.2
(Nonexpansiveness). The operators $\Phi$ of the previous lemma satisfy $\|\Phi\|_{\mathcal{H}_K\to\mathcal{H}_K} \le 1$.
Proof.
Orthogonal projection onto a closed convex set is a contraction in $\mathcal{H}_K$; the parameter change $t \mapsto t'$ is the identity on the ambient linear space (we reparametrize the same coordinates). Hence the composition has operator norm $\le 1$.    □
Proposition 7.3
(DPI for GBS tuning and counting). Let $(\tilde\Phi, \Phi)$ be as above. Then the DPI holds for the degree-$K$ residual:
\[
R_{\mathrm{res}}^{(K)}\big(\tilde\Phi s_K, \Phi p\big) \le R_{\mathrm{res}}^{(K)}(s_K, p) \qquad \text{for all } s_K, p \in \mathcal{H}_K.
\]
Proof.
Apply the abstract DPI to $(\tilde\Phi, \Phi)$, with $\mathcal{I}$ as specified and $\mathcal{H} = \mathcal{H}_K$.    □

7.3. Calibration Optimality: Photon-Number Matching

Lemma 7.4 
(Scale factor calculus). Define
\[
C_{2K}(t) := t^{-2K}\,d_{tB}^{-1} = \exp\Big(-2K\log t - \tfrac12\sum_{j=1}^{N}\log\big(1 - t^2\lambda_j^2\big)\Big).
\]
Then
\[
\frac{d}{dt}\log C_{2K}(t) = \frac{1}{t}\big(m_{tB} - 2K\big),
\]
where $m_{tB}$ is the mean photon number. In particular, $\frac{d}{dt}\log C_{2K}(t) = 0$ iff $m_{tB} = 2K$, and $\frac{d^2}{dt^2}\log C_{2K}(t) > 0$, so $t \mapsto C_{2K}(t)$ is strictly convex on $(0, \lambda_1^{-1})$.
Proof.
Differentiate:
\[
\frac{d}{dt}\log C_{2K}(t) = -\frac{2K}{t} + \frac12\sum_{j=1}^{N} \frac{2t\lambda_j^2}{1-t^2\lambda_j^2} = \frac{1}{t}\Big(\sum_{j=1}^{N} \frac{t^2\lambda_j^2}{1-t^2\lambda_j^2} - 2K\Big) = \frac{m_{tB} - 2K}{t}.
\]
The second derivative is the sum of the positive term $2K/t^2$ and the strictly positive terms $\lambda_j^2(1 + t^2\lambda_j^2)/(1-t^2\lambda_j^2)^2 > 0$, proving strict convexity.    □
Theorem 7.5 
(Calibration Optimality). Assume $R_{\mathrm{res}}^{(K)}(t)$ admits a representation
\[
R_{\mathrm{res}}^{(K)}(t) = A\,C_{2K}(t) + B,
\]
with constants $A > 0$, $B \in \mathbb{R}$ independent of $t$. Then the unique minimizer of $t \mapsto R_{\mathrm{res}}^{(K)}(t)$ on $(0, \lambda_1^{-1})$ is characterized by the photon-number matching law
\[
m_{t^\star B} = 2K.
\]
Proof.
By the assumed form, $\frac{d}{dt} R_{\mathrm{res}}^{(K)}(t) = A\,\frac{d}{dt} C_{2K}(t)$; hence critical points of $R_{\mathrm{res}}^{(K)}$ are those of $C_{2K}$. Combine with the previous lemma and strict convexity to obtain uniqueness and $m_{t^\star B} = 2K$.    □
Remark 7.1. 
For both GBS-P and GBS-I, the leading $t$-dependence of the MSE/variance, and thus of the residual, is multiplication by $C_{2K}(t)$, while the coefficient depends on $(a_I, B)$ but not on $t$. This yields the displayed form up to a harmless additive constant $B$ (see the variance formulæ below).

7.4. Angles Quantify Conditioning: Two-Sided MSE Bounds

Let $P_{\deg K}$ denote the orthogonal projector onto the “blueprint” subspace spanned by the canonical basis $\{e_I\}_{|I|=2K}$ weighted by the coefficients $a_I$ (precisely, the range of the linear functional $\mathcal{I}$), and $P_{\mathrm{GBS}}(t)$ the projector onto the sampling feature subspace at scale $t$ (the range of $I \mapsto \operatorname{Haf}((tB)_I)$ within $\mathcal{H}_K$). Set
\[
\gamma(t) := \big\|P_{\mathrm{GBS}}(t)\,P_{\deg K}\big\| = \cos\theta_F(t).
\]
Proposition 7.6 
(Angle–variance link). Let $\hat\mu_K$ be either the GBS-P or GBS-I degree-$K$ estimator from $n$ samples at fixed $t$. Then there exist $t$-independent constants $c_-(a,B), c_+(a,B) > 0$ such that
\[
\big(1 - \gamma(t)\big)\,c_-(a,B) \;\le\; n\,\operatorname{Var}(\hat\mu_K) \;\le\; \big(1 + \gamma(t)\big)\,c_+(a,B).
\]
Equivalently, the MSE (for GBS-P, replacing $\operatorname{Var}$ by the leading squared bias) is sandwiched between two multiples of the split energy, with constants modulated by $(1 \pm \cos\theta_F(t))$.
Proof sketch with precise ingredients. 
Both estimators are linear statistics of the empirical law $\hat p(t) \in \mathcal{H}_K$:
\[
\hat\mu_K = \langle w, \hat p(t)\rangle, \qquad w = w^{\mathrm P} \text{ or } w^{\mathrm I}.
\]
Then $n\,\operatorname{Var}(\hat\mu_K) = \langle \Sigma(t)\,w, w\rangle$, where $\Sigma(t)$ is the covariance operator of one draw,
\[
\Sigma(t) = \mathbb{E}\big[(\xi - \mathbb{E}\xi)(\xi - \mathbb{E}\xi)^{\top}\big], \qquad \xi \in \mathcal{H}_K, \qquad \mathbb{P}(\xi = e_I) = p_I(t).
\]
Since $\operatorname{ran}(\Sigma(t)) \subseteq \operatorname{span}\{e_I : |I| = 2K\}$ and the features live in $P_{\mathrm{GBS}}(t)\,\mathcal{H}_K$, one has (up to a bounded $t$-independent similarity determined by $a_I$ and $B$)
\[
\Sigma(t) \simeq P_{\mathrm{GBS}}(t)\,P_{\deg K}\,D\,P_{\deg K}\,P_{\mathrm{GBS}}(t),
\]
for a positive definite diagonal $D$ encoding the $t$-independent weights $I!$ and $a_I$. Two applications of standard two-subspace estimates (Friedrichs/CS-decomposition) yield
\[
\lambda_{\min}(D)\,\big\|P_{\mathrm{GBS}}(t)P_{\deg K}\big\|^2\,\big\|P_{\deg K}\,w\big\|^2 \;\le\; \langle \Sigma(t)\,w, w\rangle \;\le\; \lambda_{\max}(D)\,\big\|P_{\mathrm{GBS}}(t)P_{\deg K}\big\|^2\,\big\|P_{\deg K}\,w\big\|^2.
\]
Decompose $w = w_\parallel + w_\perp$ relative to $P_{\deg K}$; invoke the sharp two-subspace inequalities to bound $\|P_{\deg K}\,w\|^2$ above/below by $(1 \pm \gamma)\,\|w\|^2$ and absorb $t$-independent factors into $c_\pm(a,B)$ to conclude.    □

7.5. Lyapunov Envelope and Non-Asymptotic Rates

Theorem 7.7 
(DSFL Lyapunov for GBS). Let $(U, \Phi)$ denote one update step (blueprint-side map $U$ and physical-side map $\Phi$; counting and/or retuning). Fix $t \in (0, \lambda_1^{-1})$ and degree $K \in \mathbb{N}$. Assume there exists $\alpha(t) \in (0, 1]$ such that, conditionally on the past,
\[
\mathbb{E}\Big[R_{\mathrm{res}}^{(K)}\big(U s^{(K)}, \Phi p^{(n)}(t)\big) \,\Big|\, p^{(n)}(t)\Big] \le \big(1 - \alpha(t)\big)\,R_{\mathrm{res}}^{(K)}\big(s^{(K)}, p^{(n)}(t)\big).
\]
Then the mean residual decays exponentially:
\[
\mathbb{E}\,R_{\mathrm{res}}^{(K)}\big(p^{(n)}(t)\big) \le e^{-\alpha(t)\,n}\,R_{\mathrm{res}}^{(K)}\big(p^{(0)}(t)\big).
\]
Moreover, $\alpha(t)$ is monotone improving as the principal-angle factor $\gamma(t)$ decreases, and it is maximized at the photon-number match $m_{tB} = 2K$.
Proof.
Taking total expectation in the conditional contraction yields
\[
\mathbb{E}\,R_{\mathrm{res}}^{(K)}\big(p^{(n+1)}(t)\big) \le \big(1 - \alpha(t)\big)\,\mathbb{E}\,R_{\mathrm{res}}^{(K)}\big(p^{(n)}(t)\big).
\]
Iterating and using $(1-\alpha)^n \le e^{-\alpha n}$ gives
\[
\mathbb{E}\,R_{\mathrm{res}}^{(K)}\big(p^{(n)}(t)\big) \le \big(1 - \alpha(t)\big)^n\,R_{\mathrm{res}}^{(K)}\big(p^{(0)}(t)\big) \le e^{-\alpha(t)\,n}\,R_{\mathrm{res}}^{(K)}\big(p^{(0)}(t)\big).
\]
The dependence of $\alpha(t)$ on $\gamma(t)$ follows from the angle–variance control; its maximizer coincides with the minimizer of the leading residual factor, attained at $m_{tB} = 2K$ by calibration optimality.    □
Corollary 7.8. 
If $n \ge \alpha(t)^{-1}\,\log\big(R_{\mathrm{res}}^{(K)}(p^{(0)}(t))/\varepsilon^2\big)$, then $\mathbb{E}\,R_{\mathrm{res}}^{(K)}\big(p^{(n)}(t)\big) \le \varepsilon^2$. In particular, the right-hand side is minimized in $t$ at $m_{tB} = 2K$.
Remark 7.2. 
The hypothesis is a sectorial coercivity statement: in expectation, one update removes a fixed fraction of the degree-$K$ residual. Admissibility (data–processing) and a positive principal-angle margin guarantee such a contraction. Empirically this appears as a straight line of slope $-\alpha(t)$ in semi-log MSE plots.

8. Sample-Complexity Consequences

In this section we derive, from first principles in the comparison geometry, (i) exact variance formulae for the unbiased importance estimator (GBS-I), (ii) an asymptotic mean–square error (MSE) for the probability/square–root estimator (GBS-P) under the usual sign convention, and (iii) sample-size envelopes in the $(\varepsilon,\delta)$ sense via Chebyshev/Markov bounds. Throughout we fix a degree $K \in \mathbb{N}$, a covariance $B$ with spectrum $0 < \lambda_N \le \dots \le \lambda_1 < 1$, and a scale $t \in (0, \lambda_1^{-1})$. We also use the hafnian homogeneity
\[
\operatorname{Haf}\big((tB)_I\big) = t^{K}\,\operatorname{Haf}(B_I), \qquad (|I| = 2K),
\]
and the GBS law on $\mathcal{I}_K := \{I : |I| = 2K\}$,
\[
p_I(t) = \frac{d_{tB}}{I!}\,\operatorname{Haf}\big((tB)_I\big)^2 = \frac{d_{tB}\,t^{2K}}{I!}\,\operatorname{Haf}(B_I)^2, \qquad d_{tB} = \prod_{j=1}^{N}\sqrt{1-t^2\lambda_j^2}.
\]

8.1. GBS-I (Unbiased): Exact Variance and ( ε , δ ) Guarantee

Define the unbiased importance estimator for $\mu_K^{(2)} = \sum_{|I|=2K} a_I \operatorname{Haf}(B_I)^2$ by
\[
\hat\mu_K^{\mathrm{GBS\text{-}I}} = \frac{1}{n}\sum_{r=1}^{n} Y_r, \qquad Y(I) := \frac{I!\,a_I}{d_{tB}\,t^{2K}}, \qquad I \sim p(t).
\]
Then $\mathbb{E}[Y] = \mu_K^{(2)}$ by direct substitution. A single-line computation using $p_I(t)$ and the homogeneity gives
\[
\mathbb{E}\,Y^2 = \sum_{I\in\mathcal{I}_K} p_I(t)\,Y(I)^2 = \frac{1}{d_{tB}\,t^{2K}} \sum_{|I|=2K} a_I^2\,I!\,\operatorname{Haf}(B_I)^2.
\]
Proposition 8.1 
(Variance of GBS-I). For the unbiased estimator above,
\[
\operatorname{Var}\big(\hat\mu_K^{\mathrm{GBS\text{-}I}}\big) = \frac{1}{n}\left[\frac{1}{d_{tB}\,t^{2K}} \sum_{|I|=2K} a_I^2\,I!\,\operatorname{Haf}(B_I)^2 - \big(\mu_K^{(2)}\big)^2\right]. \tag{81}
\]
Proof.
Since $\hat\mu_K^{\mathrm{GBS\text{-}I}}$ is the empirical mean of i.i.d. $Y$ with $\mathbb{E}Y = \mu_K^{(2)}$, we have $\operatorname{Var}(\hat\mu_K^{\mathrm{GBS\text{-}I}}) = \operatorname{Var}(Y)/n = \big(\mathbb{E}Y^2 - (\mathbb{E}Y)^2\big)/n$, which after the computation of $\mathbb{E}Y^2$ above gives (81).    □
Corollary 8.2 
($(\varepsilon,\delta)$ sample size for GBS-I). By Chebyshev,
\[
\Pr\Big(\big|\hat\mu_K^{\mathrm{GBS\text{-}I}} - \mu_K^{(2)}\big| > \varepsilon\,\big|\mu_K^{(2)}\big|\Big) \le \frac{\operatorname{Var}\big(\hat\mu_K^{\mathrm{GBS\text{-}I}}\big)}{\varepsilon^2\,\big(\mu_K^{(2)}\big)^2}.
\]
Hence it suffices to take
\[
n_{\mathrm{GBS\text{-}I}} \ge \frac{1}{\varepsilon^2\delta}\cdot\frac{\dfrac{1}{d_{tB}\,t^{2K}}\displaystyle\sum_{|I|=2K} a_I^2\,I!\,\operatorname{Haf}(B_I)^2 - \big(\mu_K^{(2)}\big)^2}{\big(\mu_K^{(2)}\big)^2}.
\]
In particular, at the photon match $m_{t^\star B} = 2K$ (Theorem 7.5, calibration optimality), $t \mapsto d_{tB}^{-1}\,t^{-2K}$ is minimized, thus minimizing the right-hand side.
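As a worked toy instance (reusing `hafnian`, `multi_indices`, `submatrix`, and `match_photon_number` from the earlier sketches; the covariance and coefficients are arbitrary choices of ours), the guaranteed sample size can be evaluated directly:

```python
import numpy as np
from math import factorial

B = np.array([[0.5, 0.2], [0.2, 0.4]])
K = 1
lam = np.linalg.eigvalsh(B)
t = match_photon_number(lam, K)                  # photon match m_{tB} = 2
d_tB = np.prod(np.sqrt(1 - (t * lam) ** 2))

idxs = multi_indices(2, 2 * K)
a = np.ones(len(idxs)) / np.sqrt(len(idxs))      # blueprint on the unit sphere
haf2 = np.array([hafnian(submatrix(B, I)) ** 2 for I in idxs])
fact = np.array([np.prod([factorial(i) for i in I]) for I in idxs], float)

mu2 = float(a @ haf2)                            # mu_K^{(2)}
EY2 = float(np.sum(a ** 2 * fact * haf2)) / (d_tB * t ** (2 * K))
eps, delta = 0.1, 0.05
n_gbs_i = (EY2 - mu2 ** 2) / (eps ** 2 * delta * mu2 ** 2)
print(t, n_gbs_i)                                # Chebyshev (eps, delta) size
```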

8.2. GBS-P (Bias Vanishes): Asymptotic MSE via Delta Method

For the (degree-$K$) probability/square-root estimator we assume the standard sign convention $\sigma_I = \operatorname{sign}\big(\operatorname{Haf}(B_I)\big)$ is available. Define
\[
\hat\mu_K^{\mathrm{GBS\text{-}P}} = \sum_{|I|=2K} a_I\,t^{-K}\sqrt{I!}\,\sigma_I\,\sqrt{\frac{\hat p_I(t)}{d_{tB}}} = g\big(\hat p(t)\big), \qquad g(p) = \sum_{|I|=2K} a_I\,t^{-K}\sqrt{I!}\,\sigma_I\,\sqrt{\frac{p_I}{d_{tB}}}.
\]
Since $p_I(t) = d_{tB}\,t^{2K}\operatorname{Haf}(B_I)^2/I!$, one has $g(p(t)) = \mu_K$. The multinomial delta method with covariance $\operatorname{Cov}(\hat p) = \frac{1}{n}\big(\operatorname{diag}(p) - p\,p^{\top}\big)$ therefore yields
\[
n\,\mathbb{E}\big(\hat\mu_K^{\mathrm{GBS\text{-}P}} - \mu_K\big)^2 \longrightarrow \nabla g(p)^{\top}\big(\operatorname{diag}(p) - p\,p^{\top}\big)\nabla g(p), \qquad (n \to \infty).
\]
A short calculation gives $\partial g/\partial p_I = \tfrac12\,a_I\,t^{-K}\sqrt{I!}\,\sigma_I\,(d_{tB}\,p_I)^{-1/2}$ and hence
\[
\sum_I p_I\Big(\frac{\partial g}{\partial p_I}\Big)^2 = \frac{1}{4\,d_{tB}\,t^{2K}} \sum_{|I|=2K} a_I^2\,I!, \qquad \Big(\sum_I p_I\,\frac{\partial g}{\partial p_I}\Big)^2 = \frac{\mu_K^2}{4}.
\]
We conclude:
Proposition 8.3 
(Asymptotic MSE of GBS-P). As $n \to \infty$,
\[
\mathbb{E}\big(\hat\mu_K^{\mathrm{GBS\text{-}P}} - \mu_K\big)^2 = \frac{1}{4n}\left[\frac{1}{d_{tB}\,t^{2K}} \sum_{|I|=2K} a_I^2\,I! - \mu_K^2\right] + o\Big(\frac{1}{n}\Big).
\]
Corollary 8.4 
($(\varepsilon,\delta)$ sample size for GBS-P). By Markov’s inequality applied to the centered square, an asymptotic sufficient sample size is
\[
n_{\mathrm{GBS\text{-}P}} \ge \frac{1}{\varepsilon^2\delta}\cdot\frac{\dfrac{1}{d_{tB}\,t^{2K}}\displaystyle\sum_{|I|=2K} a_I^2\,I! - \mu_K^2}{4\,\mu_K^2}.
\]
As in Corollary 8.2, the right-hand side is minimized at the photon match $m_{tB} = 2K$.

8.3. When GBS Outperforms Monte Carlo

Let $\hat\mu_K^{\mathrm{MC}} = \frac{1}{n}\sum_{r=1}^{n} f_K(X_r)$ be the plain Monte Carlo estimator for the degree-$K$ contribution, with $X_r \sim \mathcal{N}(0, B)$ and
\[
f_K(x) := \sum_{|I|=2K} a_I\,x^I.
\]
Then $\mathbb{E}\,f_K(X) = \mu_K$ and a standard Wick expansion gives
\[
\operatorname{Var}\big(\hat\mu_K^{\mathrm{MC}}\big) = \frac{1}{n}\left[\sum_{|I|=2K}\sum_{|J|=2K} a_I\,a_J\,\operatorname{Haf}\big(B_{I+J}\big) - \mu_K^2\right]. \tag{90}
\]
The quadratic form in (90) mixes all degree-$2K$ multi-indices pairwise via $\operatorname{Haf}(B_{I+J})$ and therefore grows combinatorially in $K$ (and, under mild spectral hypotheses, exponentially in an effective dimension), whereas the GBS prefactors (81)–(87) are controlled by the single convex factor $d_{tB}^{-1}\,t^{-2K}$ after photon-number matching.
Theorem 8.5 
(Regime separation: polynomial vs exponential). Fix $B$ with eigenvalues bounded away from $1$ and coefficients $(a_I)$ on the unit sphere of $\mathbb{R}^{D_K}$. Then there exists $c = c(B) > 0$ such that for sequences $K = K(N)$ with $K \ge \zeta N^2$ and $\zeta > 0$ one has the complexity separation
\[
n_{\mathrm{GBS}} = \operatorname{poly}(N) \qquad \text{whereas} \qquad n_{\mathrm{MC}} \ge \exp\big(c\,N^2\big),
\]
for achieving the same $(\varepsilon,\delta)$ tolerance, provided the GBS sampler is photon-matched ($m_{tB} = 2K$). For fixed small $K$ the two complexities are of the same order, and hybrid policies (MC for small $K$, GBS beyond a data-driven crossover) are optimal.
Proof sketch.
Under the matching law $m_{t^\star B} = 2K$, the GBS prefactors are minimized and remain polynomial in $N$ (because $d_{t^\star B}^{-1} = \prod_j (1 - t^{\star 2}\lambda_j^2)^{-1/2}$ with $t^\star < t_{\max} < \lambda_1^{-1}$ and $t^\star$ bounded away from $\lambda_1^{-1}$ uniformly in $N$ at fixed $\zeta$). In contrast, (90) sums $\Theta(D_K^2)$ terms with $D_K \sim N^{2K}$ and (for generic $B$) $\operatorname{Haf}(B_{I+J})$ bounded away from zero on a positive fraction of pairs, whence an exponential lower bound in $K \ge \zeta N^2$. Careful bookkeeping of constants gives the stated separation.    □

8.4. Theory and Consequences

Theory. The DPI, calibration optimality $m_{tB} = 2K$, the angle bounds, the exact GBS-I variance, the large-$n$ GBS-P MSE, and the $(\varepsilon,\delta)$ envelopes are established above in closed form.
Consequences. The exact/algebraic forms (81)–(87) and the convexity identity
\[
\frac{d}{dt}\log\big(d_{tB}^{-1}\,t^{-2K}\big) = \frac{m_{tB} - 2K}{t}
\]
imply that photon-number matching $m_{tB} = 2K$ simultaneously (i) minimizes the leading factors in the GBS variance/MSE, (ii) sharpens the principal-angle bounds (conditioning), and (iii) maximizes the Lyapunov rate in the residual contraction. These yield non-asymptotic $(\varepsilon,\delta)$ envelopes for $n_{\mathrm{GBS}}$ and the regime map against $n_{\mathrm{MC}}$ as asserted in Theorem 8.5.

9. Design Rules from DSFL (Actionable)

This section collects operational prescriptions that follow from the comparison geometry developed above and states them as verifiable statements with proofs or proof sketches. Throughout we fix a degree $K \in \mathbb{N}$, a covariance $B$, and a scale $t \in (0, \lambda_1^{-1})$. The Hilbert space, blueprint, physical law, residual and principal-angle factor are as in the previous sections.

9.1. Calibration by Photon-Number Matching

Theorem 9.1 
(Calibration rule). Let $R_{\mathrm{res}}^{(K)}(t)$ denote the degree-$K$ DSFL residual at scale $t$. Assume $R_{\mathrm{res}}^{(K)}(t) = A\,C_{2K}(t) + B$ with $A > 0$ and $C_{2K}(t) = t^{-2K}\,d_{tB}^{-1}$. Then $t^\star$ uniquely minimizes $R_{\mathrm{res}}^{(K)}(t)$ on $(0, \lambda_1^{-1})$ if and only if
\[
m_{t^\star B} = 2K.
\]
Moreover, at $t^\star$ the leading factors in the variance/MSE bounds of the degree-$K$ GBS estimators are minimized.
Proof.
The identity $\frac{d}{dt}\log C_{2K}(t) = \frac{1}{t}(m_{tB} - 2K)$ and the strict convexity of $t \mapsto C_{2K}(t)$ prove existence and uniqueness of the minimizer with stationarity condition $m_{t^\star B} = 2K$. The leading $t$-dependence of $n\,\operatorname{Var}(\hat\mu_K^{\mathrm{GBS\text{-}I}})$ and of the large-$n$ MSE of $\hat\mu_K^{\mathrm{GBS\text{-}P}}$ is proportional to $C_{2K}(t)$ by the variance/MSE formulæ in the sample-complexity section; hence the same minimizer applies.    □

9.2. Angle Check and Channel Selection

Define $\gamma(t) = \|P_{\mathrm{GBS}}(t)\,P_{\deg K}\| = \cos\theta_F(t) \in [0, 1)$ and let $c_\pm(a,B) > 0$ be the $t$-independent constants from the angle–variance proposition.
Proposition 9.2 
(Angle gate for estimator dispersion). For the degree-$K$ estimators at fixed $t$ and $n$ samples,
\[
\big(1 - \gamma(t)\big)\,c_-(a,B) \;\le\; n\,\operatorname{Var}(\hat\mu_K) \;\le\; \big(1 + \gamma(t)\big)\,c_+(a,B).
\]
Consequently, if $\gamma(t) \le \bar\gamma \in [0, 1)$ then
\[
n\,\operatorname{Var}(\hat\mu_K) \le (1 + \bar\gamma)\,c_+(a,B),
\]
and if $\gamma(t) \le 1 - \eta$ with $\eta \in (0, 1]$ then
\[
n\,\operatorname{Var}(\hat\mu_K) \ge \eta\,c_-(a,B).
\]
Proof.
The two-sided bound is the angle–variance link. The corollaries follow by monotonicity in $\gamma$.    □
Theorem 9.3 
(Angle threshold ⇒ GBS vs MC decision). Let $n_{\mathrm{GBS}}(t)$ and $n_{\mathrm{MC}}$ be guaranteed sample sizes (for a fixed $(\varepsilon,\delta)$) derived from Chebyshev/Markov bounds for the degree-$K$ task. There exists a computable threshold $\bar\gamma = \bar\gamma(\varepsilon, \delta, a, B, K) > 0$ such that
\[
\gamma(t) \le \bar\gamma \quad\Longrightarrow\quad n_{\mathrm{GBS}}(t) \le n_{\mathrm{MC}}.
\]
Conversely, if $\gamma(t) \ge 1 - \eta$ with $\eta$ below a computable level $\underline\eta(\varepsilon, \delta, a, B, K)$, then $n_{\mathrm{GBS}}(t) \ge n_{\mathrm{MC}}$.
Proof sketch. 
Insert the angle bounds into the Chebyshev/Markov forms for $n_{\mathrm{GBS}}(t)$ (resp. the corresponding quantity for MC). The MC denominator involves the full mixed-moment structure, while the GBS denominator is degree-filtered; both numerators are bounded above/below by $(1 \pm \gamma)$ times $t$-independent constants. Solving the resulting inequality for $\gamma$ yields the thresholds $\bar\gamma$ and $\underline\eta$ (explicit in terms of $c_\pm(a,B)$ and the MC dispersion proxies).    □

9.2.1. Practical Surrogate

Let $X_t$ be the feature matrix whose columns are the (normalized) sampling features at scale $t$ and $Y_K$ a basis for the degree-$K$ blueprint range. Then $\hat\gamma = \|X_t^{\top} Y_K\|_2$ satisfies $\hat\gamma \ge \gamma(t)$ up to discretization error, so $\hat\gamma \le \bar\gamma$ is a conservative sufficient condition for choosing GBS.

9.3. Degree Decomposition and Hybrid Allocation

Let $f = \sum_{k\ge 0} f_k$ be the degree decomposition and write $\mu = \sum_k \mu_k$. Suppose each degree-$k$ component is estimated either by GBS (at its matched $t_k^\star$) or by MC, yielding per-degree guaranteed sample sizes $n_k^{\mathrm{GBS}}$ and $n_k^{\mathrm{MC}}$ under a common $(\varepsilon,\delta)$ budget with per-degree tolerances $\{\varepsilon_k\}_{k\le K}$ satisfying $\sum_{k\le K} \varepsilon_k \le \varepsilon$.
Proposition 9.4 
(Greedy degree-wise policy is optimal under monotone margins). Assume $n_k^{\mathrm{GBS}}$ and $n_k^{\mathrm{MC}}$ are positive and the advantage margins
\[
\Delta_k := n_k^{\mathrm{MC}} - n_k^{\mathrm{GBS}}
\]
are nondecreasing in $k$. Then the policy that assigns GBS to every degree $k$ with $\Delta_k > 0$ (i.e., all degrees beyond the crossover) and MC otherwise minimizes the total sample size $\sum_{k\le K} n_k$ among all degree-wise assignments obeying the same tolerance split.
Proof. 
Under the monotonicity hypothesis, the sequence of savings $\{\Delta_k\}$ is ordered; an exchange argument shows any assignment deviating from the “assign GBS where $\Delta_k > 0$” rule can be improved by swapping one GBS and one MC choice, strictly reducing total cost. Iterating yields the stated greedy optimum.    □
Corollary 9.5 
(A posteriori hybrid switch). If an a priori monotonicity check is unavailable, an empirical DSFL criterion suffices: after a warm-up with $n_0$ samples per degree,
\[
\widehat{R_{\mathrm{res}}}^{\mathrm{GBS}}(k) \le \widehat{R_{\mathrm{res}}}^{\mathrm{MC}}(k) \;\Longrightarrow\; \text{use GBS at degree } k; \qquad \text{else use MC}.
\]
By the Lyapunov contraction and angle bounds, this rule is asymptotically consistent.
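A minimal sketch of the resulting degree-wise assignment (the per-degree sample sizes are toy values of our own choosing, with margins nondecreasing in $k$ as the proposition assumes):

```python
def hybrid_assignment(n_mc, n_gbs):
    """Degree-wise hybrid policy: assign GBS wherever its guaranteed
    sample size is smaller; returns channels and the total budget."""
    channels = ["GBS" if g < m else "MC" for m, g in zip(n_mc, n_gbs)]
    total = sum(min(m, g) for m, g in zip(n_mc, n_gbs))
    return channels, total

# Margins Delta_k = n_mc[k] - n_gbs[k] are nondecreasing: -15, -10, 100, ...
n_mc = [10, 40, 400, 9000, 300000]
n_gbs = [25, 50, 300, 2000, 8000]
print(hybrid_assignment(n_mc, n_gbs))  # MC for k = 0, 1; GBS for k >= 2
```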

9.4. Seam Stability Under Composition (Gluing) and Leakage Quantification

Consider a pipeline of $L$ updates $(U_\ell, \Phi_\ell)$, $\ell = 1, \dots, L$, where each $U_\ell$ acts on the blueprint side and each $\Phi_\ell$ on the physical side (retuning, counting, linear optical blocks, etc.).
Proposition 9.6 
(No inflation under gluing). If every step is DSFL-admissible, i.e. $\Phi_\ell\,\mathcal{I} = \mathcal{I}\,U_\ell$ and $\|\Phi_\ell\| \le 1$, then the composite update $(U,\Phi) = (U_L\cdots U_1,\ \Phi_L\cdots\Phi_1)$ satisfies
$$R_{\mathrm{res}}^{(K)}\big(U s^{(K)}, \Phi p\big) \;\le\; R_{\mathrm{res}}^{(K)}\big(s^{(K)}, p\big) \qquad \text{for all } s^{(K)}, p.$$
Proof. 
The data–processing inequality holds at each step:
$$R_{\mathrm{res}}^{(K)}\big(U_\ell s, \Phi_\ell p\big) \;\le\; R_{\mathrm{res}}^{(K)}(s, p).$$
Induction on ℓ yields the composite bound.    □
Proposition 9.7 
(Angle leakage bound). Let $\theta_F^{(\ell)}$ be the Friedrichs angle between the degree-K blueprint subspace and the sampling feature subspace after step ℓ. If each $\Phi_\ell$ is a contraction onto a feature subspace, then
$$\cos\theta_F^{(L)} \;\le\; \min\Big\{1,\ \prod_{\ell=1}^{L}\cos\theta_F^{(\ell)}\Big\}.$$
In particular, if $\cos\theta_F^{(\ell)} \le \eta$ for each ℓ with $\eta < 1$, then $\theta_F^{(L)}$ stays bounded away from 0 and the angle–variance bounds remain uniformly coercive along the pipeline.
Proof sketch. 
For orthogonal projectors P, Q one has $\|PQ\| = \cos\theta_F(P,Q)$. The CS decomposition and sub-multiplicativity give
$$\big\|P_{\mathrm{GBS}}^{(L)} P_{\deg K}\big\| \;\le\; \big\|P_{\mathrm{GBS}}^{(L)} P_{\mathrm{GBS}}^{(L-1)}\big\| \cdots \big\|P_{\mathrm{GBS}}^{(1)} P_{\deg K}\big\|.$$
Each factor is bounded by the corresponding cosine; the asserted $\min\{1,\cdot\}$ bound follows from the inequality $1 - \prod_\ell(1-x_\ell) \le \sum_\ell x_\ell$ applied to $x_\ell = 1 - \cos\theta_F^{(\ell)}$.    □

9.4.1. Consequence

Under calibrated retuning at each seam (so that $m_{tB} = 2K$ holds locally), the principal angle cannot collapse in finite-length pipelines formed by admissible blocks whose individual misalignments are summable; thus residuals do not inflate and the Lyapunov envelope persists.

10. Experiments

This section describes synthetic and hardware-aligned numerical studies designed to validate the DSFL predictions. We specify the experimental design, the estimators and metrics, and we state formal consistency properties that connect empirical readouts (MSE curves, angle surrogates, advantage percentages) to the theoretical quantities in the preceding sections. Throughout, degree-K quantities are taken at the photon match $m_{tB} = 2K$ unless otherwise indicated; empirical laws $\hat p^{(n)}(t)$ are formed from n samples.

10.1. Simulated GBS and Synthetic $(B, a_I)$

10.1.1. Design

Fix grids
$$\mathcal{N} \subset \mathbb{N}, \qquad \mathcal{K} \subset \mathbb{N}, \qquad \text{and choose } (N,K) \in \mathcal{N} \times \mathcal{K}.$$
For each $(N,K)$:
  • Draw eigenvalue spectra $\{\lambda_j\}_{j=1}^N$ from prescribed classes:
    (flat) $\lambda_j = \lambda \in (0,1)$, (polynomial) $\lambda_j = c\,j^{-\beta}$, (exponential) $\lambda_j = c\,e^{-\beta j}$,
    with amplitude $c \in (0,1)$ and decay $\beta > 0$ chosen so that $0 < \lambda_N \le \cdots \le \lambda_1 < 1$.
  • Draw a random Haar orthogonal $U \in O(N)$ and set $B = U\,\mathrm{diag}(\lambda_1,\dots,\lambda_N)\,U^\top$.
  • Sample the degree-K blueprint on the unit sphere:
    $$s_K = (a_I)_{|I|=2K} \sim \mathrm{Unif}\big(S^{D_K - 1}\big), \qquad D_K = \#\{I : |I| = 2K\}.$$
  • Compute the matched scale $t^\star$ by solving $m_{t^\star B} = 2K$; equivalently, find the unique $t \in (0, \lambda_1^{-1})$ such that $\sum_{j=1}^{N} t^2\lambda_j^2/(1 - t^2\lambda_j^2) = 2K$ (a numerical sketch follows this list).
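A minimal sketch of the matched-scale solve (NumPy/SciPy; the spectrum and degree in the example are illustrative). Since $m_{tB}$ is strictly increasing in t and diverges as $t \uparrow \lambda_1^{-1}$, the root is unique and safely bracketed:

```python
import numpy as np
from scipy.optimize import brentq

def matched_scale(lams: np.ndarray, K: int) -> float:
    """Solve the photon-number match m_{tB} = 2K for the unique
    t in (0, 1/lambda_1), given eigenvalues lams of B in (0, 1)."""
    def m_tB(t: float) -> float:
        x = (t * lams) ** 2
        return float(np.sum(x / (1.0 - x)))
    t_hi = (1.0 - 1e-12) / float(np.max(lams))   # stay inside the domain
    return brentq(lambda t: m_tB(t) - 2 * K, 1e-12, t_hi)

# Example: flat spectrum, N = 8 modes, degree K = 3.
print(matched_scale(np.full(8, 0.6), K=3))
```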

10.1.2. Estimators and Metrics

For each configuration and for a range of sample sizes n:
  • Form $\hat\mu_K^{\mathrm{GBS\text{-}I}}(n, t^\star)$ and $\hat\mu_K^{\mathrm{GBS\text{-}P}}(n, t^\star)$ from n i.i.d. synthetic GBS draws at $t^\star$ (via the hafnian-defined pmf).
  • Form the plain Monte Carlo estimator $\hat\mu_K^{\mathrm{MC}}(n)$ using n i.i.d. $X \sim \mathcal{N}(0, B)$ and Wick reconstruction to isolate degree K if desired.
  • Estimate MSEs and empirical residuals by repetitions:
    $$\widehat{\mathrm{MSE}}(\cdot) := \frac{1}{R}\sum_{r=1}^{R}\big(\hat\mu^{(r)} - \mu_K\big)^2, \qquad \widehat{R_{\mathrm{res}}}^{(K)}\big(\hat p^{(n)}(t)\big) := \big\|\hat p^{(n)}(t) - \mathcal{I}s_K\big\|_{H_K}^2.$$
  • Report guaranteed sample sizes $\hat n^{\mathrm{GBS\text{-}I}}, \hat n^{\mathrm{GBS\text{-}P}}, \hat n^{\mathrm{MC}}$ by inverting Chebyshev/Markov envelopes calibrated with empirical variances (see the sketch after this list).
  • Compute efficiency ratios and advantage percentages:
    $$\mathrm{Eff.\ ratio} = \frac{\hat n^{\mathrm{MC}}}{\hat n^{\mathrm{GBS}}}, \qquad \mathrm{Adv.\,\%} = \frac{1}{T}\sum_{\tau=1}^{T}\mathbf{1}\big\{\hat n^{\mathrm{GBS}} \le \hat n^{\mathrm{MC}}\big\} \times 100\%.$$

10.1.3. Theoretical Guarantees

Proposition 10.1 
(Consistency of advantage estimators). Fix $(N,K)$ and a spectrum class with $0 < \lambda_j < 1$. Let $\hat n^{\mathrm{GBS}}$ and $\hat n^{\mathrm{MC}}$ be defined from empirical variances with R repetitions and n samples per repetition at matched $t^\star$. Then
$$\hat n^{\mathrm{GBS}} \xrightarrow{\ a.s.\ } n^{\mathrm{GBS}}, \qquad \hat n^{\mathrm{MC}} \xrightarrow{\ a.s.\ } n^{\mathrm{MC}},$$
where $n^{\mathrm{GBS}}, n^{\mathrm{MC}}$ are the population Chebyshev/Markov sample sizes. Consequently, the empirical advantage percentage converges almost surely to the true percentage under the same configuration ensemble.
Proof sketch. 
Strong laws for empirical second moments of linear statistics of i.i.d. draws imply almost-sure convergence of the variance estimators. Continuity of the inversion map $v \mapsto c\,v/(\varepsilon^2\delta)$ yields the stated limits; the law of large numbers over configuration draws implies convergence of frequencies.    □

10.2. Ablations Testing DSFL Predictions

10.2.1. T-Sweep (U-Shape)

For each fixed $(N,K)$ and $s_K$, sweep t in a neighborhood of $t^\star$ and evaluate $\widehat{R_{\mathrm{res}}}^{(K)}(\hat p^{(n)}(t))$ and $\widehat{\mathrm{MSE}}(\hat\mu_K(t))$.
Proposition 10.2 
(Empirical convexity and minimum at the match). Under the simulators above, for any fixed $(N,K)$ and sufficiently large n, R,
$$t \mapsto \mathbb{E}\,\widehat{R_{\mathrm{res}}}^{(K)}\big(\hat p^{(n)}(t)\big) \qquad \text{and} \qquad t \mapsto \mathbb{E}\,\widehat{\mathrm{MSE}}\big(\hat\mu_K(t)\big)$$
are approximately convex with a unique minimum at $t = t^\star$ (the solution of $m_{tB} = 2K$). Moreover, their second derivatives with respect to $\log t$ are strictly positive up to $o_n(1)$ errors.
Proof sketch. 
The residual and variance/MSE leading factors are proportional to $C_{2K}(t) = t^{-2K}\,d_{tB}^{-1}$; the scale-calculus lemma showed $\frac{d^2}{dt^2}\log C_{2K}(t) > 0$ and stationarity $\frac{d}{dt}\log C_{2K}(t) = 0$ iff $m_{tB} = 2K$. Empirical averages converge to their expectations by the SLLN, hence the observed convex U-shape and unique minimum at $t^\star$ up to sampling error.    □

10.2.2. Angle Surrogate vs Variance

Construct features $X_t$ and blueprint basis $Y_K$, compute $\hat\gamma = \|X_t^\top Y_K\|$ (as in Section 9.2.1), and correlate $\hat\gamma$ with the empirical $n\,\widehat{\mathrm{Var}}(\hat\mu_K(t))$ across t.
Proposition 10.3 
(Monotone trend). For fixed $(N,K)$ and large n, R, the map $t \mapsto n\,\widehat{\mathrm{Var}}(\hat\mu_K(t))$ is bounded between two affine functions of $\hat\gamma(t)$:
$$(1 - \hat\gamma(t))\,\hat c_-(a,B) \;\le\; n\,\widehat{\mathrm{Var}}(\hat\mu_K(t)) \;\le\; (1 + \hat\gamma(t))\,\hat c_+(a,B),$$
with $\hat c_\pm \to c_\pm$ in probability as $R \to \infty$.
Proof sketch. 
Replace the projector norm $\gamma(t)$ by the computable surrogate $\hat\gamma(t) \ge \gamma(t)$; the angle–variance proposition yields the population inequalities. Empirical variances converge to their population values, and the Gram-based surrogate is continuous in sample averages of features, giving the displayed bounds up to $o_{n,R}(1)$ errors.    □

10.2.3. Hybrid Crossover

Partition the blueprint across degrees $k \le K_{\max}$ and estimate the total samples needed by degree-wise GBS vs MC. Plot the crossover index $k_\times$ where the hybrid policy switches.
Proposition 10.4 
(Crossover index decreases with N and increases with decay). Let $k_\times(N;\beta)$ be the smallest degree at which $n_k^{\mathrm{GBS}} \le n_k^{\mathrm{MC}}$ under a polynomial (or exponential) eigen-decay parameter β. Then, for sufficiently large N and fixed tolerance $(\varepsilon,\delta)$,
$$\partial_N\,k_\times(N;\beta) \;\le\; 0, \qquad \partial_\beta\,k_\times(N;\beta) \;\ge\; 0,$$
i.e. larger systems favor GBS at lower degrees, while stronger decay (smaller effective dimension) delays the crossover.
Proof sketch. 
The MC dispersion includes mixed-moment growth with the effective dimension and increases rapidly with N when decay is weak; GBS dispersion at the matched degree depends primarily on degree-k features. Comparing the two sample-size envelopes and differentiating w.r.t. the structural parameters yields the stated monotonicities.    □

10.3. Hardware-Aligned Kernels (Optional)

10.3.1. Models

Implement time-bin/loop architectures by stochastic kernels that compose (i) squeezing maps defining $tB$, (ii) interferometer unitary draws, and (iii) photon-number resolving measurements (or threshold versions). Insert calibrated loss channels modeled as contractions on $H_K$.

10.3.2. Stress Tests

For each pipeline, concatenate L calibrated blocks and measure $\widehat{R_{\mathrm{res}}}^{(K)}$ after each seam, as well as $\hat\gamma(t)$.
Proposition 10.5 
(No inflation under concatenation with calibrated retuning). Assume each block is DSFL-admissible and retuned to its local match $m_{t_\ell B_\ell} = 2K$. Then the empirical residual is nonincreasing along the pipeline up to sampling error:
$$\mathbb{E}\,\widehat{R_{\mathrm{res}}}^{(K)}\big(\hat p_\ell^{(n)}\big) \;\le\; \mathbb{E}\,\widehat{R_{\mathrm{res}}}^{(K)}\big(\hat p_{\ell-1}^{(n)}\big) + o_n(1), \qquad \ell = 1,\dots,L.$$
Proof. 
Apply the DPI for each admissible block (projection/contraction and intertwining), and use convergence of empirical residuals to their expectations.    □
Proposition 10.6 
(Robustness to small losses). Let a loss channel of strength $\eta \in [0,1)$ act between calibrated blocks. Then the matched scale $t^\star_\eta$ solves the perturbed equation $m_{t B_\eta} = 2K$ and the contraction rate $\alpha(t^\star)$ degrades by at most $O(\eta)$; consequently, the empirical semi-log MSE slope remains within $O(\eta)$ of the noiseless value for fixed n.
Proof sketch. 
A lossy Gaussian channel induces $B \mapsto B_\eta$ with $\|B_\eta - B\| = O(\eta)$; Davis–Kahan perturbation gives $|\gamma_\eta - \gamma| = O(\eta)$, while the scale calculus shows $t^\star$ shifts continuously in η. The Lyapunov rate α is monotone in the angle margin, hence perturbs by $O(\eta)$.    □

10.3.3. Reporting

For each setting we present: (i) MSE and residual vs n (semi-log slopes), (ii) MSE vs t (U-shapes with minimum at $t^\star$), (iii) variance vs angle-surrogate curves, (iv) advantage heatmaps over $(N,K)$, and (v) pipeline residual traces across seams with/without loss.

11. Concluding Discussion

This work reframes Gaussian Boson Sampling (GBS) as a calibrated variance–reduction engine operating in a single comparison geometry. Within that geometry, the Deterministic Statistical Feedback Law (DSFL) supplies three pillars that, taken together, elevate the "folk rules" of GBS into theorems: (i) calibration optimality, which identifies photon–number matching $m_{tB} = 2K$ as the unique minimizer of the leading variance/MSE factor $t^{-2K}\,d_{tB}^{-1}$; (ii) principal–angle conditioning, which yields sharp two–sided dispersion bounds with multiplicative constants $(1 \pm \cos\theta_F)$; and (iii) a Lyapunov envelope that converts one–step contraction into a non–asymptotic exponential error decay $\mathrm{MSE}_n \le C\,e^{-\alpha(t)\,n}$ with α(t) maximal at the photon match. These statements are not merely sufficient conditions; they jointly explain why GBS beats Monte Carlo (MC) in high–degree regimes, how to tune the device to realize the advantage, and what rates to expect in finite samples.
On the algorithmic side, the DSFL lens unifies unbiased (GBS–I) and biased–but–consistent (GBS–P) estimators, shows that both inherit the same convex t–dependence, and clarifies the role of degree partitioning: once the target is decomposed into degree components, GBS naturally concentrates effort where MC pays an exponential price through cross–moment couplings, while a hybrid policy remains optimal for small degrees. On the systems side, DSFL's data–processing inequality guarantees no residual inflation under counting, retuning, or block concatenation; principal–angle control quantifies leakage across interfaces; and perturbation arguments (Davis–Kahan type) explain the graceful degradation under small hardware noise together with recovery by re-matching $m_{tB}$.
The implications are practical. First, calibration reduces to solving a strictly convex, one–dimensional equation for t, hence is robust and fast; this makes photon–number matching a pre–run, certifiable step. Second, a computable angle surrogate (e.g., a cross–Gram norm between sampling features and blueprint basis) provides an a priori gate for channel selection (GBS vs MC), enabling hybrid deployment without post–hoc tuning. Third, the Lyapunov envelope gives finite–n guarantees that can be monitored online by empirical DSFL residuals, turning semi–log MSE plots into quantitative health checks.
There are natural extensions. For non–Gaussian observables or generalized boson–sampling models (loop–hafnians, threshold detection), one must redesign the feature map and admissible updates so that the DPI persists and bias terms remain controlled; the same Hilbertian template applies. Angle surrogates can be refined—e.g. via randomized subspace embeddings—to scale to very large ( N , K ) while preserving principal–angle fidelity. Finally, the experimental program suggested here—t–sweeps for convex U–shapes, variance–vs–angle curves, and block–wise residual traces—offers falsifiable diagnostics for hardware pipelines, and a path to controller–in–the–loop calibration where t and the interferometer are adjusted to maximize the Lyapunov rate α ( t ) subject to device constraints.
In sum, DSFL furnishes a predict–then–prove methodology for GBS: it predicts the optimal operating point ( m t B = 2 K ), the conditioning ( θ F ), and the rate ( α ( t ) ) before data are collected; it proves that admissible updates cannot worsen the calibrated misfit; and it turns those predictions into actionable rules that persist under composition and mild noise. This synthesis clarifies when and why GBS delivers sample–complexity advantages over MC, and how to tune photonic pipelines to realize them in practice.

Funding

No external funding was received for this work.

Data and code availability

This work uses only synthetic data produced by the algorithms described in the paper; no external datasets were used or analyzed. A fully reproducible package containing (i) source code, (ii) parameter files for every figure, (iii) exact random seeds, and (iv) an exportable runtime environment (both environment.yml and requirements.txt) will be deposited at Zenodo under an archival DOI (to be added at acceptance) and mirrored on GitHub (read-only tag matching the DOI). The repository includes a one-click reproducibility script that regenerates all tables and figures in the manuscript. Tested platforms: Linux (x86_64) and macOS (arm64) with Python ≥ 3.10; core dependencies: NumPy, SciPy, Matplotlib, and JAX/Numba (optional, for accelerated runs). Code will be released under an OSI-approved license (MIT).

Acknowledgments

The author affirms sole authorship and received no external funding.

Conflicts of Interest

The author declares no competing interests.

Appendix A. Crisp Theorem Statements (Ready to Formalize)

Appendix A.1. Standing Setup

Let $(H, \langle\cdot,\cdot\rangle)$ be a Hilbert space with norm $\|x\| = \sqrt{\langle x, x\rangle}$, and closed subspaces $S, P \subseteq H$. A calibration is a pair of bounded maps
$$\mathcal{I} : S \to P, \qquad \mathcal{J} : P \to S,$$
satisfying the interchangeability identities
$$\mathcal{I}\mathcal{J} = \mathrm{id}_P, \qquad \mathcal{J}\mathcal{I} = P_S,$$
with $P_S$ the orthogonal projector onto S. For $s \in S$ and $p \in P$, the Residual of Sameness is
$$R(s,p) := \|p - \mathcal{I}s\|_H^2.$$
An admissible update is a pair $(\tilde\Phi, \Phi)$ with $\tilde\Phi : S \to S$, $\Phi : P \to P$ such that
$$\Phi\mathcal{I} = \mathcal{I}\tilde\Phi \qquad \text{and} \qquad \|\Phi\|_{H\to H} \le 1.$$
For angle bounds, let $M, N \subseteq H$ be closed subspaces with orthogonal projectors $P_M, P_N$, and Friedrichs cosine
$$\gamma(M,N) := \|P_M P_N\| \in [0,1], \qquad \cos\theta_F(M,N) = \gamma(M,N).$$

DPI (GBS Admissibility)

Theorem A.1 
(Data–processing inequality). If $(\tilde\Phi, \Phi)$ is admissible, then for all $s \in S$ and $p \in P$,
$$R(\tilde\Phi s, \Phi p) \;\le\; R(s, p).$$
Proof. 
Using $\Phi\mathcal{I} = \mathcal{I}\tilde\Phi$,
$$R(\tilde\Phi s, \Phi p) = \|\Phi p - \mathcal{I}\tilde\Phi s\|^2 = \|\Phi(p - \mathcal{I}s)\|^2 \;\le\; \|\Phi\|^2\,\|p - \mathcal{I}s\|^2 \;\le\; \|p - \mathcal{I}s\|^2. \qquad \square$$
Corollary A.2 
(Composition). If $\{(\tilde\Phi_k, \Phi_k)\}_{k=1}^{L}$ are admissible, then $(\tilde\Phi, \Phi) = \big(\tilde\Phi_L\cdots\tilde\Phi_1,\ \Phi_L\cdots\Phi_1\big)$ is admissible and
$$R(\tilde\Phi s, \Phi p) \;\le\; R(s, p).$$
Proof. 
Intertwining and $\|\cdot\| \le 1$ are preserved under composition; apply Theorem A.1. □

Calibration Optimality ⇒ $m_{tB} = 2K$

Appendix A.2. GBS Specialization

Let $B \in \mathbb{R}^{N\times N}$ be symmetric with $0 < \lambda_N \le \cdots \le \lambda_1 < 1$, and fix a degree $K \in \mathbb{N}$. For a scale $t \in (0, \lambda_1^{-1})$, define
$$d_{tB} := \prod_{j=1}^{N}\sqrt{1 - t^2\lambda_j^2}, \qquad C_{2K}(t) := t^{-2K}\,d_{tB}^{-1}, \qquad m_{tB} := \sum_{j=1}^{N}\frac{t^2\lambda_j^2}{1 - t^2\lambda_j^2}.$$
Assume the degree-K residual admits the leading form $R_{\mathrm{res}}^{(K)}(t) = A\,C_{2K}(t) + B$ with t-independent $A > 0$ and $B \in \mathbb{R}$.
Lemma A.3 
(Scale calculus). $\displaystyle \frac{d}{dt}\log C_{2K}(t) = \frac{m_{tB} - 2K}{t}$, and
$$\frac{d^2}{dt^2}\log C_{2K}(t) = \frac{2K}{t^2} + \sum_{j=1}^{N}\frac{\lambda_j^2}{1 - t^2\lambda_j^2} + \sum_{j=1}^{N}\frac{2t^2\lambda_j^4}{(1 - t^2\lambda_j^2)^2} \;>\; 0.$$
Proof. 
Differentiate $\log C_{2K}(t) = -2K\log t - \tfrac12\sum_j \log(1 - t^2\lambda_j^2)$ termwise. □
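For the reader's convenience, the termwise computation behind the lemma, written out (a routine verification against the definitions above):

```latex
% Termwise differentiation behind Lemma A.3:
\frac{d}{dt}\bigl[-2K\log t\bigr] = -\frac{2K}{t},
\qquad
\frac{d}{dt}\Bigl[-\tfrac{1}{2}\log\bigl(1 - t^2\lambda_j^2\bigr)\Bigr]
  = \frac{t\lambda_j^2}{1 - t^2\lambda_j^2};
% summing over j and factoring out 1/t recovers the stationarity form:
\frac{d}{dt}\log C_{2K}(t)
  = -\frac{2K}{t}
    + \frac{1}{t}\sum_{j=1}^{N}\frac{t^2\lambda_j^2}{1 - t^2\lambda_j^2}
  = \frac{m_{tB} - 2K}{t}.
% Differentiating each term once more yields the stated second-derivative
% sum, every summand of which is positive on (0, 1/\lambda_1).
```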
Theorem A.4 
(Calibration optimality). The unique minimizer $t^\star \in (0, \lambda_1^{-1})$ of $t \mapsto R_{\mathrm{res}}^{(K)}(t)$ is characterized by the photon–number match
$$m_{t^\star B} = 2K.$$
Equivalently, $t^\star$ minimizes the leading t–factors in the variance/MSE of the degree-K GBS estimators.
Proof. 
By Lemma A.3, critical points of $C_{2K}$ satisfy $m_{tB} = 2K$, and $C_{2K}$ is strictly convex, hence the minimizer is unique. The variance/MSE leading factors for GBS-I and GBS-P are proportional to $C_{2K}(t)$; thus they share the same minimizer. □

What is provable once clearly stated assumptions are added (action items)

  • Angle ⇒ rate monotonicity. We state that the Lyapunov rate α(t) improves as $\gamma(t) = \|P_{\mathrm{GBS}(t)} P_{\deg K}\|$ decreases. Action: include Lemma B.2, which bounds the rate between linear functions of the angle margin:
    $$c_-\,(1 - \gamma(t)) \;\le\; \alpha(t) \;\le\; c_+\,(1 - \gamma(t)),$$
    with t–independent $0 < c_- \le c_+ < \infty$. The proof is a two–subspace argument using the through–subspace contraction (Proposition 6.8) and the sharp bounds in Proposition A.7.
  • Residual factorization. We use $R_{\mathrm{res}}^{(K)}(t) = A\,C_{2K}(t) + B$ to map calibration to residual minimization. Action: add Lemma A.5 (degree-K residual factorization) showing that all t–dependence factors through $t^{-2K}$ and $d_{tB}^{-1}$ for the degree-K laws. A concise proof sketch is included in the appendix, relying on hafnian homogeneity $\mathrm{Haf}((tB)_I) = t^{K}\,\mathrm{Haf}(B_I)$ (for $|I| = 2K$) and linearity of the readouts.
Lemma A.5 
(Residual factorization for degree-K). Fix $K \in \mathbb{N}$, a covariance B with spectrum in (0,1), and coefficients $(a_I)_{|I|=2K}$. Let $t \in (0, \lambda_1^{-1})$, set $d_{tB} = \prod_{j=1}^{N}\sqrt{1 - t^2\lambda_j^2}$ and
$$C_{2K}(t) := t^{-2K}\,d_{tB}^{-1}.$$
Consider the degree-K DSFL residual $R_{\mathrm{res}}^{(K)}(t) := \|p(t) - \mathcal{I}s_K\|_{H_K}^2$ built from either readout
$$\text{GBS-P}:\ \mathcal{I}_P\,x = \sum_{|I|=2K} a_I\,\sqrt{\frac{x_I}{d_{tB}}}, \qquad \text{GBS-I}:\ \mathcal{I}_I\,x = \sum_{|I|=2K} a_I\,\frac{I!}{d_{tB}}\,x_I.$$
Then there exist t–independent constants $A > 0$ and $B \in \mathbb{R}$ (depending only on $(a_I)$ and B) such that
$$R_{\mathrm{res}}^{(K)}(t) = A\,C_{2K}(t) + B.$$
Sketch. 
For $|I| = 2K$, hafnian homogeneity gives $\mathrm{Haf}((tB)_I) = t^{K}\,\mathrm{Haf}(B_I)$, whence $p_I(t) = d_{tB}\,t^{2K}\,\mathrm{Haf}(B_I)^2/I!$. Thus every coordinate of $p(t)$ carries the common scalar factor $d_{tB}\,t^{2K}$. The degree-K readouts introduce only the normalization $d_{tB}^{-1}$ (and $I!$ in GBS-I), which does not depend on t beyond $d_{tB}$. Expanding $\|p(t) - \mathcal{I}s_K\|^2$ shows the t–dependence factors through $t^{-2K}\,d_{tB}^{-1} = C_{2K}(t)$, with all inner products collected into t–independent constants $A > 0$ and $B \in \mathbb{R}$. A fully expanded computation (including the multinomial covariance form for GBS-P and the exact second moment for GBS-I) is given below. □
Corollary A.6 
(Calibration ⇒ residual minimization). Under the hypotheses of Lemma A.5, the unique minimizer of $t \mapsto R_{\mathrm{res}}^{(K)}(t)$ on $(0, \lambda_1^{-1})$ is characterized by $m_{tB} = 2K$.
Proof. 
By Lemma A.5, $R_{\mathrm{res}}^{(K)}(t) = A\,C_{2K}(t) + B$ with $A > 0$. The scale calculus shows $\frac{d}{dt}\log C_{2K}(t) = \frac{m_{tB} - 2K}{t}$ and $t \mapsto C_{2K}(t)$ is strictly convex on $(0, \lambda_1^{-1})$. Hence the unique minimizer of $R_{\mathrm{res}}^{(K)}(t)$ satisfies $m_{tB} = 2K$. □
Proof of Lemma A.5 (expanded computation). 
By hafnian homogeneity, $p_I(t) = d_{tB}\,t^{2K}\,\mathrm{Haf}(B_I)^2/I!$. Hence
$$\|p(t)\|^2 = \sum_I p_I(t)^2 = d_{tB}^2\,t^{4K}\sum_I \mathrm{Haf}(B_I)^4/I!^2,$$
which carries the square of the common factor $d_{tB}\,t^{2K}$; similarly, the cross term $\langle p(t), \Pi s_K\rangle = \sum_I p_I(t)\,(\Pi s_K)_I$ is proportional to $d_{tB}\,t^{2K}$ times a t–independent sum, and $\|\Pi s_K\|^2$ is t–independent. Expanding $\|p(t) - \Pi s_K\|^2$ and regrouping against the readout normalization yields the claimed affine dependence on $C_{2K}(t) = t^{-2K}\,d_{tB}^{-1}$ with $A > 0$. □

Angle bounds

Proposition A.7 
(Two-subspace inequality). Let $M, N \subseteq H$ be closed subspaces with $\gamma = \|P_M P_N\| \in [0,1]$. For any $a \in M$ and $b \in N$,
$$(1 - \gamma)\big(\|a\|^2 + \|b\|^2\big) \;\le\; \|a + b\|^2 \;\le\; (1 + \gamma)\big(\|a\|^2 + \|b\|^2\big).$$
Proof. 
Expand $\|a+b\|^2 = \|a\|^2 + \|b\|^2 + 2\langle a, b\rangle$, and use $|\langle a, b\rangle| = |\langle a, P_M b\rangle| \le \|a\|\,\|P_M b\| \le \gamma\,\|a\|\,\|b\|$, then $2\|a\|\,\|b\| \le \|a\|^2 + \|b\|^2$. □

Appendix A.3. GBS specialization (degree-K).

Let $P_{\deg K}$ be the blueprint projector and $P_{\mathrm{GBS}(t)}$ the sampling-feature projector at scale t, and define $\gamma(t) = \|P_{\mathrm{GBS}(t)} P_{\deg K}\| = \cos\theta_F(t)$.
Theorem A.8 
(Angle–dispersion link). For the degree-K estimators from n samples at fixed t, there exist t–independent $c_-(a,B), c_+(a,B) > 0$ such that
$$(1 - \gamma(t))\,c_-(a,B) \;\le\; n\,\mathrm{Var}(\hat\mu_K) \;\le\; (1 + \gamma(t))\,c_+(a,B),$$
with Var understood as the variance for GBS-I and as the leading asymptotic variance (or squared-bias term) for GBS-P. Equivalently, the MSE is sandwiched between two multiples of the split energy, with multiplicative factors $(1 \pm \cos\theta_F(t))$.
Proof sketch. 
Write the estimator as $\hat\mu_K = \langle w, \hat p(t)\rangle$ with w fixed and $\hat p(t)$ the empirical law. Then $n\,\mathrm{Var}(\hat\mu_K) = \langle\Sigma(t)w, w\rangle$, where $\Sigma(t)$ is the single-sample covariance. The range of $\Sigma(t)$ lies in $P_{\mathrm{GBS}(t)}H$, while w decomposes into $P_{\deg K}w + w^\perp$. Apply Proposition A.7 twice (once to control leakage into $P_{\mathrm{GBS}(t)}H$, once for the split relative to $P_{\deg K}$) and absorb t–independent weights into $c_\pm(a,B)$. □

Lyapunov bound

Theorem A.9 
(Discrete Lyapunov envelope). Let $R_{\mathrm{res}}^{(K)}$ be the degree-K residual and suppose there exists $\alpha(t) \in (0,1]$ such that, conditionally on the past,
$$\mathbb{E}\Big[R_{\mathrm{res}}^{(K)}\big(U s^{(K)}, \Phi\,p^{(n)}(t)\big)\,\Big|\,p^{(n)}(t)\Big] \;\le\; \big(1 - \alpha(t)\big)\,R_{\mathrm{res}}^{(K)}\big(s^{(K)}, p^{(n)}(t)\big),$$
for an admissible one-step update $(U, \Phi)$. Then
$$\mathbb{E}\,R_{\mathrm{res}}^{(K)}\big(p^{(n)}(t)\big) \;\le\; e^{-\alpha(t)\,n}\,R_{\mathrm{res}}^{(K)}\big(p^{(0)}(t)\big).$$
Moreover, α(t) increases as γ(t) decreases, and is maximized at the calibration match $m_{tB} = 2K$.
Proof. 
Taking total expectation yields $\mathbb{E}R_{n+1} \le (1-\alpha)\,\mathbb{E}R_n$ with $R_n := R_{\mathrm{res}}^{(K)}(p^{(n)}(t))$. Iteration gives $\mathbb{E}R_n \le (1-\alpha)^n R_0 \le e^{-\alpha n} R_0$. By Theorem A.8, improved alignment (smaller γ) strengthens the contraction; by Theorem A.4, the match maximizes this gain. □
Corollary A.10 
($(\varepsilon,\delta)$ sample-size envelope). If $n \ge \alpha(t)^{-1}\log\big(R_{\mathrm{res}}^{(K)}(p^{(0)}(t))/\varepsilon^2\big)$, then $\mathbb{E}\,R_{\mathrm{res}}^{(K)}\big(p^{(n)}(t)\big) \le \varepsilon^2$. At the match $m_{tB} = 2K$, the right-hand side is minimized in t.
Proof. 
Solve $e^{-\alpha(t)\,n}\,R_{\mathrm{res}}^{(K)}\big(p^{(0)}(t)\big) \le \varepsilon^2$ for n. □

Appendix B. Explaining the Elements: sDoF/pDoF, Interchangeability, DSFL and the Residual of Sameness R

This section fixes the primitives used throughout. We first embed the statistical degrees of freedom (sDoF) and physical degrees of freedom (pDoF) in one comparison Hilbert space; then we define the single observable (Residual of Sameness); finally we state the admissibility gate that makes the residual monotone and, when dynamics are present, yields a Lyapunov envelope.

Appendix B.1. Interchangeability (Calibration)

Let $(H, \langle\cdot,\cdot\rangle)$ be a Hilbert space carrying two closed subspaces S (statistical channel) and P (physical channel). A calibration is a pair of bounded maps
$$\mathcal{I} : S \to P, \qquad \mathcal{J} : P \to S$$
satisfying
$$\mathcal{I}\mathcal{J} = \mathrm{id}_P, \qquad \mathcal{J}\mathcal{I} = P_S,$$
with $P_S$ the orthogonal projector onto S.

Appendix B.2. One Observable: the Residual of Sameness

For $s \in S$ and $p \in P$,
$$R(s,p) := \|p - \mathcal{I}(s)\|_H^2$$
is the squared distance from p to the calibrated image $\mathcal{I}(s) \in \mathrm{ran}\,\mathcal{I}$. It vanishes iff $p = \mathcal{I}(s)$ and is invariant under isometries preserving $\mathrm{ran}\,\mathcal{I}$.

Appendix B.3. Admissibility and the Data–Processing Inequality

A pair $(\tilde\Phi, \Phi)$ with $\tilde\Phi : S \to S$ and $\Phi : P \to P$ is admissible if
$$\Phi\mathcal{I} = \mathcal{I}\tilde\Phi, \qquad \|\Phi\|_{H\to H} \le 1, \qquad \|\tilde\Phi\|_{H\to H} \le 1.$$
Then the Hilbertian data–processing inequality holds:
$$R(\tilde\Phi s, \Phi p) = \|\Phi(p - \mathcal{I}s)\|_H^2 \;\le\; \|p - \mathcal{I}s\|_H^2 = R(s,p).$$
Hence no admissible update inflates the calibrated mismatch. In operator–algebraic models, J is an orthogonal conditional expectation, and (34) follows from Kadison–Schwarz.

Appendix B.4. Propagation with a Gap: the DSFL Lyapunov Envelope

Assume a residual identity
$$\frac{d}{dt}R(t) = -2\,\langle K e, e\rangle_H + 2\,\langle e, g\rangle_H, \qquad e := p - \mathcal{I}s, \qquad K = K^* \ge 0,$$
and bounds
$$\langle K e, e\rangle_H \;\ge\; \kappa\,\|e\|_H^2, \qquad |\langle e, g\rangle_H| \;\le\; \varepsilon\,\|e\|_H^2, \qquad \kappa > \varepsilon \ge 0.$$
Then
$$\frac{d}{dt}R(t) \;\le\; -\alpha\,R(t), \qquad \alpha := 2(\kappa - \varepsilon) > 0, \qquad R(t) \;\le\; e^{-\alpha(t - t_0)}\,R(t_0).$$
Thus $\log R$ decays at a single rate α fixed by the sector's coercivity margin.

Appendix B.5. Operational Arrow (Accumulated Improvement)

Define
$$S_R(t) := -\log\frac{R(t)}{R(0)} = \int_0^t \alpha(\tau)\,d\tau.$$
It is nondecreasing and quantifies total calibrated mismatch removed, independently of the choice of clock.

Appendix B.6. Minimal Checklist for Applications

  • Fix H , S , P ; choose ( I , J ) with (30).
  • Verify (A19) for each step; (34) then guarantees monotonicity.
  • If dynamics are available, identify ( κ , ε ) in (A22) to obtain α in (A23).

Appendix B.7. Generic Two–Channel Application Template

Let $s(t) \in S$ and $p(t) \in P$ with calibration $\mathcal{I} : S \to P$. Set $e(t) := p(t) - \mathcal{I}s(t)$, $R(t) := \|e(t)\|_H^2$. If
$$\dot R(t) = -2\,\langle K e, e\rangle_H + 2\,\langle e, g\rangle_H, \qquad \langle K e, e\rangle_H \ge \kappa\,\|e\|_H^2, \qquad |\langle e, g\rangle_H| \le \varepsilon\,\|e\|_H^2,$$
then $\dot R \le -2(\kappa - \varepsilon)R$ and $R(t) \le e^{-2(\kappa-\varepsilon)(t - t_0)}R(t_0)$. This dynamic envelope pairs with the static monotonicity $R(\tilde\Phi s, \Phi p) \le R(s,p)$ whenever $(\tilde\Phi, \Phi)$ is admissible.

Appendix B.8. Terminology and Mechanism: sDoF/pDoF, DSFL, and the Dual Feedback Loop

Appendix B.8.1. What are sDoF and pDoF?

The statistical degrees of freedom (sDoF) are the blueprint variables $s \in S$ that encode the model, prior, or control law; the physical degrees of freedom (pDoF) are the measured/realized responses $p \in P$. Both live in the same comparison Hilbert space $(H, \langle\cdot,\cdot\rangle)$ via a calibration $(\mathcal{I}, \mathcal{J})$ obeying
$$\mathcal{I}\mathcal{J} = \mathrm{id}_P, \qquad \mathcal{J}\mathcal{I} = P_S.$$
Equation (A26) ensures two-way coherence: $\mathcal{J}$ reads a calibrated blueprint from a physical state, and $\mathcal{I}$ writes a calibrated physical image of a statistical state. The Residual of Sameness
$$R(s,p) = \|p - \mathcal{I}s\|_H^2$$
measures the mismatch between pDoF and the calibrated sDoF within H.

Appendix B.8.2. Why “DSFL”

The name Deterministic Statistical Feedback Law (DSFL) reflects that updates are deterministic (linear/contractive), the driver is statistical (sDoF), the mechanism is feedback (via the residual $e := p - \mathcal{I}s$), and it functions as a law (one inequality governs all steps). For any admissible update $(\tilde\Phi, \Phi)$ ($\Phi\mathcal{I} = \mathcal{I}\tilde\Phi$, $\|\Phi\|, \|\tilde\Phi\| \le 1$),
$$R(\tilde\Phi s, \Phi p) = \|\Phi(p - \mathcal{I}s)\|_H^2 \;\le\; \|p - \mathcal{I}s\|_H^2 = R(s,p),$$
so the residual is $L^2$–monotone.

Appendix B.8.3. Dual Feedback Loop (Discrete Time)

There are two tightly coupled feedback paths in the same geometry:
$$\text{S-loop}:\ s_{k+1} = \tilde\Phi s_k, \qquad \text{P-loop}:\ p_{k+1} = \Phi p_k, \qquad e_{k+1} = p_{k+1} - \mathcal{I}s_{k+1},$$
with $\Phi\mathcal{I} = \mathcal{I}\tilde\Phi$ and $\|\Phi\|, \|\tilde\Phi\| \le 1$. By (A28), $R(s_{k+1}, p_{k+1}) \le R(s_k, p_k)$ sample-wise. If, in addition, a Lyapunov identity with a gap holds, $\dot R \le -\alpha R$, then $R_k \le C\,e^{-\alpha k}$ (exponential envelope).
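A minimal numerical illustration of the discrete dual loop (plain NumPy; the toy comparison space, the trivial calibration $\mathcal{I} = \mathrm{id}$, and the random contraction are illustrative assumptions, not the paper's GBS maps). It exhibits the sample-wise monotonicity (A28):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy comparison space H = R^d with S = P = H and calibration I = id.
# Taking Phi_tilde = Phi makes Phi I = I Phi_tilde hold trivially, so
# every step below is DSFL-admissible.
d = 20
A = rng.standard_normal((d, d))
Phi = A / (np.linalg.norm(A, 2) + 0.5)   # spectral norm strictly below 1

s = rng.standard_normal(d)               # sDoF (blueprint state)
p = s + 0.5 * rng.standard_normal(d)     # pDoF with calibrated mismatch

residuals = []
for _ in range(30):
    residuals.append(float(np.sum((p - s) ** 2)))  # R_k = ||p - I s||^2
    s, p = Phi @ s, Phi @ p                         # S-loop and P-loop

# Residual of Sameness is monotone (here even exponentially decaying).
assert all(r1 <= r0 + 1e-12 for r0, r1 in zip(residuals, residuals[1:]))
print(residuals[:3], "...", residuals[-1])
```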

Appendix B.8.4. Flow form (Continuous Time)

Writing $e(t) = p(t) - \mathcal{I}s(t)$, assume
$$\frac{d}{dt}R(t) = -2\,\langle K e, e\rangle_H + 2\,\langle e, g\rangle_H, \qquad \langle K e, e\rangle \ge \kappa\,\|e\|^2, \qquad |\langle e, g\rangle| \le \varepsilon\,\|e\|^2,$$
with $\kappa > \varepsilon \ge 0$. Then $\frac{d}{dt}R \le -2(\kappa - \varepsilon)R$ and $R(t) \le e^{-2(\kappa-\varepsilon)(t - t_0)}R(t_0)$. The accumulated drop $S_R(t) = -\log\frac{R(t)}{R(0)} = \int_0^t 2(\kappa - \varepsilon)\,d\tau$ is nondecreasing and quantifies the total calibrated mismatch removed.

Appendix B.8.5. Practical Interpretation

The P-loop evolves pDoF through an $L^2$–contractive map that respects calibration, reducing e on the physical side; the S-loop refines sDoF so that $\mathcal{I}s$ tracks the evolving p. Because both loops live in H and intertwine $(\mathcal{I}, \mathcal{J})$, they cannot fabricate residual—only dissipate or transport it. In applications with slow drift, one may update the calibration $(\mathcal{I}, \mathcal{J}) \to (\mathcal{I}^{(m)}, \mathcal{J}^{(m)})$ on a coarser timescale as long as $\mathcal{I}^{(m)}\mathcal{J}^{(m)} = \mathrm{id}_P$, $\mathcal{J}^{(m)}\mathcal{I}^{(m)} = P_S$; between updates the inner dual loop remains admissible and residual-monotone.

What Is Fully Proved (or Reducible to Standard Facts)

Appendix B.8.6. Data–Processing Inequality (DPI) in the DSFL Geometry

Claim. Admissible updates (intertwining $\Phi\mathcal{I} = \mathcal{I}\tilde\Phi$ and nonexpansive $\|\Phi\|_{H\to H} \le 1$) cannot inflate the residual.
Status. Proved in the text (Theorem “DPI”) by a one–line Hilbert–space contraction; stability under composition is immediate.

Appendix B.8.7. Calibration Optimality $m_{tB} = 2K$

Claim. The unique minimizer of the leading variance/MSE factor is the photon–number match.
Status. Proved by strict convexity on $(0, \lambda_1^{-1})$ of
$$\log C_{2K}(t) = -2K\log t - \tfrac12\sum_j \log\big(1 - t^2\lambda_j^2\big), \qquad \frac{d}{dt}\log C_{2K}(t) = \frac{m_{tB} - 2K}{t}.$$
Scope. Optimal for any quantity whose t–dependence factors through $C_{2K}(t)$: the leading terms of the GBS-I variance and GBS-P MSE, and the DSFL residual representation $R_{\mathrm{res}}^{(K)}(t) = A\,C_{2K}(t) + B$ (when that representation holds).

Appendix B.8.8. Angle (Friedrichs) Bounds

Claim. Two–sided MSE/variance control with factors $(1 \pm \cos\theta_F)$.
Status. Proved via the two–subspace inequalities and $\|P_M P_N\| = \cos\theta_F$; explicit derivations included.

Appendix B.8.9. Exact Variance for GBS-I; Asymptotic MSE for GBS-P

GBS-I. Exact variance formula proved from the pmf and hafnian homogeneity.
GBS-P. Large-n leading term proved by the multinomial delta method (gradient and covariance computed).
Scope/assumptions. For GBS-P we assume the usual sign-known convention and interior probabilities/smoothness so the delta method applies.

Appendix B.8.10. Chebyshev/Markov (ε,δ) Guarantees

Claim. Invert variance/MSE to obtain guaranteed sample sizes.
Status. Proved once the variance/MSE statements hold (and $\mu \neq 0$), by direct Chebyshev/Markov bounds.

Appendix B.8.11. Lyapunov Envelopes (Discrete/Continuous)

Claim. If a one–step conditional contraction holds, the expected residual decays exponentially.
Status. Proved as an implication via discrete (and continuous) Grönwall.
Scope. The contraction is the hypothesis; the envelope follows rigorously.

What Is Provable Once Clearly Stated Assumptions Are Added

Appendix B.8.12. Residual Representation

Where used. To map calibration optimality to residual minimization.
Status. Provable for degree-K laws and the two estimators, since the only nontrivial t–dependence enters through $t^{-2K}$ and $d_{tB}^{-1}$. One factors t out of (i) the multinomial covariance and (ii) the linear readout; a short lemma closes the argument.

Appendix B.8.13. “Angle ⇒ rate” Linkage

Claim. The Lyapunov rate α ( t ) improves as γ ( t ) = cos θ F ( t ) decreases.
Status. Provable once the conditional contraction constant is written as a monotone functional of the split energy controlled by $(1 \pm \gamma)$. With the projector bounds in hand, a brief comparison lemma formalizes this.

Appendix B.8.14. Composition/Gluing Stability and Small-Noise Robustness

Claim. No inflation under concatenation; small perturbations degrade angles/rates by $O(\|\Delta\|)$.
Status. Concatenation: already follows from the DPI under composition (proved). Robustness: Davis–Kahan/Wedin gives
$$\big|\sin\theta_F(B + \Delta) - \sin\theta_F(B)\big| = O\Big(\frac{\|\Delta\|}{\mathrm{gap}}\Big),$$
hence γ perturbs by $O(\|\Delta\|)$ and the rates follow from the previous item.

Appendix B.8.15. Asymptotic Normality (CLT) for GBS-P

Claim. Beyond the leading MSE constant, a full CLT holds.
Status. Provable by standard smooth-functionals-of-the-multinomial arguments under mild regularity (already used for the delta-method step).

What Is Solid but We Cite Rather than Re-Prove Here

Corollary B.1 
(CLT for GBS-P). Let $\mathcal{I}_K = \{I : |I| = 2K\}$ and write $m := |\mathcal{I}_K|$. Fix a scale $t \in (0, \lambda_1^{-1})$ and assume the GBS law
$$p(t) = \big(p_I(t)\big)_{I \in \mathcal{I}_K} \in (0,1)^m, \qquad \sum_{I \in \mathcal{I}_K} p_I(t) = 1,$$
i.e. all degree-K outcome probabilities are strictly interior. Let the degree-K GBS-P functional be
$$g(p) := \sum_{I \in \mathcal{I}_K} a_I\,t^{-K}\sqrt{I!}\,\sigma_I\,\sqrt{\frac{p_I}{d_{tB}}}, \qquad d_{tB} = \prod_{j=1}^{N}\sqrt{1 - t^2\lambda_j^2},$$
with $\sigma_I = \mathrm{sign}(\mathrm{Haf}(B_I))$ (or absorbed into $a_I$ when $\mathrm{Haf}(B_I) \neq 0$).¹ Let $\hat p(t)$ be the empirical law based on n i.i.d. samples from $p(t)$, and $\hat\mu_K^{\mathrm{GBS\text{-}P}} := g(\hat p(t))$. Then
$$\sqrt{n}\,\big(\hat\mu_K^{\mathrm{GBS\text{-}P}} - \mu_K\big) \;\xrightarrow{\ d\ }\; \mathcal{N}\Big(0,\ \nabla g(p)^\top\big(\mathrm{diag}(p) - p\,p^\top\big)\nabla g(p)\Big),$$
where
$$\frac{\partial g}{\partial p_I}(p) = \frac12\,a_I\,t^{-K}\sqrt{I!}\,\sigma_I\,\frac{1}{\sqrt{d_{tB}}}\,p_I^{-1/2} \qquad (I \in \mathcal{I}_K).$$
In particular, the asymptotic variance simplifies to
$$\nabla g(p)^\top\big(\mathrm{diag}(p) - p\,p^\top\big)\nabla g(p) = \frac14\Big(\frac{1}{d_{tB}\,t^{2K}}\sum_{|I|=2K} a_I^2\,I! \;-\; \mu_K^2\Big),$$
which matches the leading constant in Proposition 8.3 [39,40].
Proof. 
Write $S = (S_I)_{I \in \mathcal{I}_K} \sim \mathrm{Multinomial}(n, p)$ and $\hat p = S/n$. The interior condition $p_I \in (0,1)$ implies the classical multinomial central limit theorem:
$$\sqrt{n}\,(\hat p - p) \;\xrightarrow{\ d\ }\; \mathcal{N}\big(0, \Sigma(p)\big), \qquad \Sigma(p) := \mathrm{diag}(p) - p\,p^\top.$$
Since g is $C^1$ on the open simplex (each coordinate appears under a square root and $p_I > 0$), the delta method yields
$$\sqrt{n}\,\big(g(\hat p) - g(p)\big) \;\xrightarrow{\ d\ }\; \mathcal{N}\big(0,\ \nabla g(p)^\top\Sigma(p)\,\nabla g(p)\big).$$
The displayed gradient follows from $g(p) = \sum_I c_I\sqrt{p_I}$ with $c_I = a_I\,t^{-K}\sqrt{I!}\,\sigma_I/\sqrt{d_{tB}}$, so $\partial g/\partial p_I = \tfrac12\,c_I\,p_I^{-1/2}$. Finally,
$$\sum_I p_I\Big(\frac{\partial g}{\partial p_I}\Big)^2 = \frac14\sum_I c_I^2 = \frac14\,\frac{1}{d_{tB}\,t^{2K}}\sum_{|I|=2K} a_I^2\,I!, \qquad 2\sum_I p_I\,\frac{\partial g}{\partial p_I} = \sum_I c_I\sqrt{p_I} = g(p) = \mu_K,$$
the latter by construction (Wick + homogeneity). Hence
$$\nabla g^\top\Sigma(p)\,\nabla g = \sum_I p_I\Big(\frac{\partial g}{\partial p_I}\Big)^2 - \Big(\sum_I p_I\,\frac{\partial g}{\partial p_I}\Big)^2 = \frac14\Big(\frac{1}{d_{tB}\,t^{2K}}\sum_{|I|=2K} a_I^2\,I! - \mu_K^2\Big). \qquad \square$$
Remark B.1 
(Boundary cases). If some coordinates satisfy $p_I = 0$, the map $p \mapsto g(p)$ is not differentiable at p; one may either (i) remove those indices from $\mathcal{I}_K$ (they are p-null) and re-normalize, or (ii) work with a directional delta method under sparsity assumptions. The interior condition in the corollary avoids these technicalities.
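The corollary is easy to stress-test numerically. A minimal sketch (NumPy; a generic interior law p and generic coefficients stand in for the GBS-P case $c_I = a_I\sigma_I\sqrt{I!}/(t^K\sqrt{d_{tB}})$) compares the finite-n variance of $g(\hat p)$ with the delta-method constant $\tfrac14(\sum_I c_I^2 - g(p)^2)$:

```python
import numpy as np

rng = np.random.default_rng(1)

m = 6
p = rng.dirichlet(5.0 * np.ones(m))        # strictly interior law
c = rng.standard_normal(m)                 # generic readout coefficients

def g(q: np.ndarray) -> float:             # g(p) = sum_I c_I sqrt(p_I)
    return float(c @ np.sqrt(q))

# Delta-method constant: grad g' (diag p - p p') grad g = (c'c - g(p)^2)/4.
sigma2 = 0.25 * (float(c @ c) - g(p) ** 2)

n, reps = 20_000, 2_000
vals = np.array([g(rng.multinomial(n, p) / n) for _ in range(reps)])
print(n * vals.var(), "vs asymptotic", sigma2)   # should roughly agree
```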

Appendix B.8.16. Open, Nonempty Regions of Advantage; Percentage-of-Advantage

Content (Andersen–Shan 2025). Large-n leading term for GBS-P (matching ours), improved tuning via m t B = 2 K , ( ε , δ ) sample-size envelopes for both estimators, existence of nonempty open subsets with advantage over MC, and percentage-of-advantage values over ( a , B ) under Haar–Vandermonde measures.
Status. We rely on these theorems and cite them explicitly.

What Is Still Heuristic/Needs Additional Work

Appendix B.8.17. Principal-Angle Surrogate $\hat\gamma = \|X_t^\top Y_K\|_2$ as a Certified Upper Bound

Claim. The computable Gram surrogate upper-bounds the true Friedrichs cosine.
Status. Heuristic but tighten-able: if $X_t$ spans the sampling feature subspace and $Y_K$ spans the blueprint subspace (properly normalized), then $\|X_t^\top Y_K\| \ge \|P_{\mathrm{GBS}(t)} P_{\deg K}\|$. A short approximation lemma with sampling-error bounds can make this rigorous.

Appendix B.8.18. Regime Separation (Poly vs. Exp).

Claim. For $K \ge \zeta N^2$, $n^{\mathrm{GBS}} = \mathrm{poly}(N)$ while $n^{\mathrm{MC}} = \exp(\Theta(N))$.
Status. Proof sketch given; a fully rigorous version needs fixed ensembles (eigenvalues in $[c, 1-\varepsilon]$; coefficients on the unit sphere) and a positive-mass lower bound for the MC mixed-hafnian quadratic form. With these in place, the exponential lower bound follows by counting/Stirling estimates (in line with the Andersen–Shan open-set results).

Appendix B.8.19. One-Step Conditional Contraction (Hypothesis in Lyapunov Theorem)

Claim. Each update removes a fixed fraction of the residual in expectation.
Status. Currently an assumption (empirically supported by semi-log linearity). A device-level proof would require an explicit Markov kernel and a uniform spectral gap; feasible but model-specific.

Lemma B.2 
(Rate improves as angle opens). Let $R_{\mathrm{res}}^{(K)}(t) = \|p - \mathcal{I}s\|^2$ be the degree-K residual at scale t and let $(U, \Phi)$ be an admissible one-step update (intertwining, and $\|\Phi\| \le 1$). Assume the conditional one-step contraction hypothesis of Theorem A.9: there exists $\alpha(t) \in (0,1]$ such that
$$\mathbb{E}\big[R_{\mathrm{res}}^{(K)}(U s, \Phi p)\,\big|\,s, p\big] \;\le\; \big(1 - \alpha(t)\big)\,R_{\mathrm{res}}^{(K)}(t).$$
Then there exist t–independent constants $0 < c_- \le c_+ < \infty$ (depending only on the fixed weights $a_I$ and B through bounded positive operators) such that
$$c_-\,(1 - \gamma(t)) \;\le\; \alpha(t) \;\le\; c_+\,(1 - \gamma(t)), \qquad \gamma(t) = \|P_{\mathrm{GBS}(t)} P_{\deg K}\| = \cos\theta_F(t).$$
In particular, α(t) increases as γ(t) decreases and is maximized at the photon–number match $m_{tB} = 2K$.
Proof. 
Write the residual as $R_{\mathrm{res}}^{(K)}(t) = \|e\|^2$ with $e := p - \mathcal{I}s \in H_K$. One step gives
$$e_+ := \Phi p - \mathcal{I}Us = \Phi(p - \mathcal{I}s) + \underbrace{\big(\Phi\mathcal{I}s - \mathcal{I}Us\big)}_{=\,0\ \text{by intertwining}} = \Phi e,$$
hence $R_{\mathrm{res},+}^{(K)} = \|e_+\|^2 = \|\Phi e\|^2$. Taking conditional expectation and comparing to $\|e\|^2$,
$$\alpha(t) = 1 - \frac{\mathbb{E}\|\Phi e\|^2}{\|e\|^2}.$$
Decompose $e = e_\parallel + e_\perp$ with respect to the blueprint/sampling pair,
$$e_\parallel := P_{\deg K}\,e, \qquad e_\perp := (\mathrm{id} - P_{\deg K})\,e,$$
and note that Φ acts through the sampling feature subspace (by admissibility), so that the through–subspace contraction applies:
$$\|\Phi e\| \;\le\; \|\Phi\|\,\big\|P_{\mathrm{GBS}(t)} P_{\deg K}\,e_\parallel\big\| + \|\Phi\|\,\big\|P_{\mathrm{GBS}(t)}\,e_\perp\big\|.$$
Using $\|\Phi\| \le 1$ and the two–subspace bounds (Proposition A.7),
$$\big\|P_{\mathrm{GBS}(t)} P_{\deg K}\,e_\parallel\big\| \;\le\; \gamma(t)\,\|e_\parallel\|, \qquad \big\|P_{\mathrm{GBS}(t)}\,e_\perp\big\| \;\le\; \|e_\perp\|.$$
Squaring and taking expectation (conditionally on e) yields
$$\mathbb{E}\|\Phi e\|^2 \;\le\; \gamma(t)^2\,\|e_\parallel\|^2 + \|e_\perp\|^2 \;=\; \Big(1 - \big(1 - \gamma(t)^2\big)\,\frac{\|e_\parallel\|^2}{\|e\|^2}\Big)\,\|e\|^2.$$
By the two–subspace lower bound,
$$(1 - \gamma(t))\big(\|e_\parallel\|^2 + \|e_\perp\|^2\big) \;\le\; \|e_\parallel + e_\perp\|^2 = \|e\|^2 \quad\Longrightarrow\quad \frac{\|e_\parallel\|^2}{\|e\|^2} \;\ge\; \frac{1 - \gamma(t)}{2},$$
where the constant $\tfrac12$ is an absolute (non-optimized) choice; any fixed $c \in (0,1)$ would do after renormalizing constants. Hence
$$\frac{\mathbb{E}\|\Phi e\|^2}{\|e\|^2} \;\le\; 1 - \tfrac12\big(1 - \gamma(t)^2\big)(1 - \gamma(t)) \;\le\; 1 - c_-\,(1 - \gamma(t)),$$
which implies the lower bound
$$\alpha(t) \;\ge\; c_-\,(1 - \gamma(t))$$
for some $c_- > 0$ independent of t (absorbing the numerical constants and the fixed weights). A matching upper bound follows by reversing the roles of the projectors in the two–subspace inequalities:
$$\mathbb{E}\|\Phi e\|^2 \;\ge\; \Big(1 - \big(1 - \gamma(t)^2\big)\,\frac{\|e_\parallel\|^2}{\|e\|^2}\Big)\,\|e\|^2,$$
and using $1 - \gamma(t)^2 \le 2\,(1 - \gamma(t))$ together with $\|e_\parallel\|^2 \le \|e\|^2$ (another consequence of Proposition A.7) gives
$$\alpha(t) \;\le\; c_+\,(1 - \gamma(t))$$
for some $c_+ < \infty$ independent of t. The monotonicity of α(t) in γ(t) is immediate, and maximization at $m_{tB} = 2K$ follows since γ(t) is minimized (best alignment) at the photon match by the calibration-optimality and angle–conditioning discussion. □

Appendix B.9. Bottom Line

Rock solid now: DPI; calibration optimality $m_{tB} = 2K$; angle bounds; exact GBS-I variance; large-n GBS-P MSE; $(\varepsilon,\delta)$ inversions; Lyapunov envelope (conditional on a one-step contraction); composition stability; small-noise robustness (Davis–Kahan).
Provable with short add-ons: residual factorization $A\,C_{2K} + B$; monotone "angle ⇒ rate" linkage; CLT for GBS-P; certified Gram surrogate bound.
Backed by the literature: nonempty open regions with advantage and percentage-of-advantage computations (Andersen–Shan 2025).
Still to complete: a model-independent proof of the one-step contraction and a fully quantified “poly vs. exp” lower bound with explicit constants under clear ensembles.

References

  1. Andersen, J.E.; Shan, S. Using Gaussian Boson Samplers to Approximate Gaussian Expectation Problems, 2025, [arXiv:quant-ph/2502.19336].
  2. Andersen, J.E.; Shan, S. Estimating the Percentage of GBS Advantage in Gaussian Expectation Problems, 2025, [arXiv:quant-ph/2502.19362].
  3. Andersen, J.E.; Shan, S. Using Gaussian Boson Samplers to Approximate Gaussian Expectation Problems, 2025, [arXiv:quant-ph/2502.19336].
  4. Hamilton, C.S.; Kruse, R.; Sansoni, L.; Barkhofen, S.; Silberhorn, C.; Jex, I. Gaussian Boson Sampling. Phys. Rev. Lett. 2017, 119, 170501.
  5. Kruse, R.; Hamilton, C.S.; Sansoni, L.; Barkhofen, S.; Silberhorn, C.; Jex, I. Detailed study of Gaussian Boson Sampling. Phys. Rev. A 2019, 100, 032326.
  6. Aaronson, S.; Arkhipov, A. The Computational Complexity of Linear Optics. In Proceedings of the STOC ’11, 2011, pp. 333–342.
  7. Deshpande, A.; Mehta, A.; Vincent, T.; Quesada, N.; Hinsche, M.; Ioannou, M.; Madsen, L.; Lavoie, J.; Qi, H.; Eisert, J.; et al. Quantum computational advantage via high-dimensional Gaussian boson sampling. Science Advances 2022, 8, eabi7894. [CrossRef]
  8. Bulmer, J.F.F.; Bell, B.A.; Chadwick, R.S.; Jones, A.E.; Moise, D.; Rigazzi, A.; Thorbecke, J.; Haus, U.U.; Vaerenbergh, T.V.; Patel, R.B.; et al. The boundary for quantum advantage in Gaussian Boson Sampling. Science Advances 2022, 8, eabl9236. [CrossRef]
  9. Björklund, A.; Gupt, B.; Quesada, N. A faster hafnian formula for complex matrices and its benchmarking on a supercomputer. Journal of Experimental Algorithmics 2019, 24, 1–19. [CrossRef]
  10. Gupt, B.; Izaac, J.; Quesada, N. The Walrus: a library for hafnians, Hermite polynomials and Gaussian Boson Sampling. J. Open Source Software 2019, 4, 1705.
  11. Zhong, H.S.; Wang, H.; Deng, Y.H.; Chen, M.C.; Peng, L.C.; Luo, Y.H.; Qin, J.; Wu, D.; Gong, S.Q.; Su, H.; et al. Quantum computational advantage using photons. Science 2020, 370, 1460–1463. [CrossRef]
  12. Madsen, L.S.; Laudenbach, F.; Falamarzi Askarani, M.; Rortais, F.; Vincent, T.; Bulmer, J.F.F.; Miatto, F.M.; Neuhaus, L.; Helt, L.G.; Collins, M.J.; et al. Quantum computational advantage with a programmable photonic processor. Nature 2022, 606, 75–81. [CrossRef]
  13. Quesada, N.; Helt, L.G.; Izaac, J.; Arrazola, J.M.; Shahrokhshahi, R.; Myers, C.R.; Sabapathy, K.K. Simulating realistic non-Gaussian state preparation. Physical Review A 2019, 100, 022341, [arXiv:quant-ph/1810.03706]. [CrossRef]
  14. Quesada, N.; Arrazola, J.M.; Killoran, N. Gaussian boson sampling using threshold detectors. Physical Review A 2018, 98, 062322, [arXiv:quant-ph/1807.01639]. [CrossRef]
  15. Huh, J.; Guerreschi, G.G.; Peropadre, B.; McClean, J.R.; Aspuru-Guzik, A. Boson sampling for molecular vibronic spectra. Nature Photonics 2015, 9, 615–620. [CrossRef]
  16. Huh, J.; Yung, M.H. Vibronic boson sampling: Generalized Gaussian boson sampling for molecular vibronic spectra at finite temperature. Scientific Reports 2017, 7, 7462. [CrossRef]
  17. Banchi, L.; Fingerhuth, M.; Babej, T.; Ing, C.; Arrazola, J.M. Molecular docking with Gaussian Boson Sampling. Science Advances 2020, 6, eaax1950. [CrossRef]
  18. Arrazola, J.M.; Bromley, T.R. Using Gaussian boson sampling to find dense subgraphs. Physical Review Letters 2018, 121, 030503. [CrossRef]
  19. Brádler, K.; Dallaire-Demers, P.L.; Rebentrost, P.; Su, D.; Weedbrook, C. Gaussian boson sampling for perfect matchings of arbitrary graphs. Physical Review A 2018, 98, 032310. [CrossRef]
  20. Brádler, K.; Friedland, S.; Izaac, J.; Killoran, N.; Su, D. Graph isomorphism and gaussian boson sampling. Special Matrices 2021, 9, 166–196. [CrossRef]
  21. Schuld, M.; Brádler, K.; Israel, R.; Su, D.; Gupt, B. A quantum hardware-induced graph kernel based on GBS, 2019, [arXiv:quant-ph/1905.12646].
  22. Jahangiri, S.; Arrazola, J.M.; Quesada, N.; Killoran, N. Point processes with Gaussian boson sampling. Physical Review E 2020, 101, 022134. [CrossRef]
  23. Sparrow, C.; Martín-López, E.; Maraviglia, N.; Neville, A.; Harrold, C.; Carolan, J.; Joglekar, Y.N.; Hashimoto, T.; Matsuda, N.; O’Brien, J.L.; et al. Simulating the vibrational quantum dynamics of molecules using photonics. Nature 2018, 557, 660–667. [CrossRef]
  24. He, Y.; Ding, X.; Su, Z.E.; Huang, H.L.; Qin, J.; Wang, C.; Unsleber, S.; Chen, C.; Wang, H.; He, Y.M.; et al. Time-bin-encoded boson sampling with a single-photon device. Physical Review Letters 2017, 118, 190501. [CrossRef]
  25. Motes, K.R.; Gilchrist, A.; Dowling, J.P.; Rohde, P.P. Scalable boson sampling with time-bin encoding using a loop-based architecture. Physical Review Letters 2014, 113, 120501. [CrossRef]
  26. Wick, G.C. The evaluation of the collision matrix. Physical Review 1950, 80, 268–272. [CrossRef]
  27. Kocharovsky, V.V.; Kocharovsky, V.V.; Tarasov, S.V. The hafnian master theorem. Linear Algebra and its Applications 2022, 651, 144–161. [CrossRef]
  28. Barvinok, A. Approximating Permanents and Hafnians. Discrete Analysis 2017, 2017, 1–34. [CrossRef]
  29. Barvinok, A. Polynomial time algorithms to approximate permanents and mixed discriminants within a simply exponential factor. Random Structures & Algorithms 1999, 14, 29–61. [CrossRef]
  30. Robert, C.P.; Casella, G. Monte Carlo Statistical Methods; Springer, 1999. [CrossRef]
  31. Rubinstein, R.Y.; Kroese, D.P. Simulation and the Monte Carlo Method, 3 ed.; Wiley, 2016. [CrossRef]
  32. Owen, A.B. Monte Carlo Theory, Methods and Examples. https://artowen.su.domains/mc/, 2013. Accessed 2025-02-28.
  33. Owen, A.B. Practical Quasi-Monte Carlo Integration. https://artowen.su.domains/mc/practicalqmc.pdf, 2023. Accessed 2025-02-28.
  34. Madsen, L.S.; Laudenbach, F.; Askarani, M.F.; Rortais, F.; Vincent, T.; Bulmer, J.F.F.; Miatto, F.M.; Neuhaus, L.; Helt, L.G.; Collins, M.J.; et al. Quantum computational advantage with a programmable photonic processor. Nature 2022, 606, 75–81. [CrossRef]
  35. Zhong, H.S.; Wang, H.; Deng, Y.H.; Chen, M.C.; Peng, L.C.; Luo, Y.H.; Qin, J.; Wu, D.; Ding, X.; Hu, Y.; et al. Quantum computational advantage using photons. Science 2020, 370, 1460–1463. [CrossRef]
  36. Deutsch, F. Best Approximation in Inner Product Spaces; Springer, 2001.
  37. Davis, C.; Kahan, W.M. The Rotation of Eigenvectors by a Perturbation. SIAM Journal on Numerical Analysis 1970, 7, 1–46. [CrossRef]
  38. Wedin, P.Å. Perturbation bounds in connection with singular value decomposition. BIT Numerical Mathematics 1972, 12, 99–111. [CrossRef]
  39. van der Vaart, A.W. Asymptotic Statistics; Cambridge Univ. Press, 1998.
  40. Billingsley, P. Probability and Measure, 3rd ed.; Wiley, 1995.
¹ When all $\mathrm{Haf}(B_I) \ge 0$ the convention is trivial; otherwise this corresponds to the usual loop-hafnian or sign–augmented readout.