Submitted: 21 December 2025
Posted: 22 December 2025
Abstract
Keywords:
1. Introduction
1.1. Problem and Certificate Goal
- (i) a refinement-limit learner exists and is unique in a fixed, declared ruler, and
- (ii) the multi-ledger evolution admits a single certified contraction clock (a uniform exponential rate class) that does not degrade as the refinement index grows.
1.2. Context and Gap
Multiobjective learning (within a fixed representation).
Refinement limits (stability in a fixed norm).
1.3. State of the Art, Approach, and Contributions
State of the art (adjacent lines).
Approach (two theorem engines).
Engine A (positivity ⇒ one clock).
Engine B (one ruler + summable discrepancies ⇒ refinement limit).
Contributions (with pointers).
- C1: One-clock reduction for nonnegative ledgers. We develop a checker form of the Metzler/Hurwitz implication: a copositive witness yields exponential contraction of a declared scalar ledger, with an explicit two-ledger small-gain boundary and monotone design levers (Section 3).
- C2: Master Certificate across refinement ladders. We prove that four auditable programme lines in one ruler (tail-robust envelope, uniform margin, projective Cauchy tower, uniform dictionary) imply existence and uniqueness of a refinement-limit learner on the certified window, together with rate inheritance and readout transport (Section 4).
- C3: Instantiations. We give (i) a fully checkable toy ladder with explicit constants and (ii) a width-ladder protocol sketch indicating how to populate the certificate from logged traces (Section 5).
- C4: Scope wall and composability. The proved guarantees are training-time stability and refinement-limit existence in a declared ruler. Statistical generalization, out-of-distribution shift, and stochastic-optimizer noise are not asserted unless they are explicitly ledgerized or imported as separate modules (e.g. stability/PAC-Bayes and robustness/verification frameworks) [16,17,18,19,20,21,22].
Certificate interface.
2. Problem: Multi-Metric Learning on Refinement Ladders
2.1. Learner State, Dynamics, and a Single Ambient Ruler
2.1.1. Refinement-Indexed State Spaces and One Ruler
2.1.2. Training Dynamics: Discrete Updates and Continuous Flows
2.1.3. State Augmentation
2.2. Ledgers and Reported Readouts
2.2.1. Ledger Maps and Trajectories
2.2.2. Total Ledgers (Certificate-Compatible Scalarizations)
- (i)
- (ii)
2.2.3. Reported Readouts and Dictionary Conditioning
- Reported/readout metrics depending on a K-dependent apparatus (validation sets, simulator fidelity, feature maps). These require an explicit dictionary condition later (programme line (O4)) to prevent ill-conditioning from faking improvement.
2.2.4. A Minimal Regularity Interface
2.3. Cross-Level Maps and the “Same Task” Requirement
2.3.1. Projective Structure
2.4. What Will Be Certified Later
3. Certificate Layer: One-Clock Reduction for Nonnegative Ledgers
3.1. Why Positivity Is the Right Abstraction
3.1.1. Nonnegative Ledgers and Injections
3.1.2. Why Positive Comparison Is Canonical
3.1.3. How Metzler Bounds Arise in Practice (Derivation Template)
- (i) write a differential (or finite-difference) inequality for each nonnegative ledger;
- (ii) isolate a self-decay term (from dissipation, regularization, descent, contractive updates);
- (iii) bound cross-terms by nonnegative injections using operator norms or Lipschitz bounds and Young’s inequality, turning mixed products into sums of squares;
- (iv) collect coefficients into a Metzler matrix M, with the funded self-decay margins on the diagonal and the nonnegative injections off the diagonal.
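To make steps (i)–(iv) concrete, here is a hedged two-ledger sketch. The symbols $\ell_1,\ell_2$ (nonnegative ledgers), $a_1,a_2>0$ (self-decay constants), $c_{12},c_{21}\ge 0$ (coupling constants) and $\varepsilon_1,\varepsilon_2>0$ (Young parameters) are illustrative assumptions and are not notation taken from the paper.

```latex
% Steps (i)-(ii): levelwise inequalities with a self-decay term and a mixed product
\frac{d}{dt}\ell_1 \le -2a_1\,\ell_1 + 2c_{12}\sqrt{\ell_1\ell_2},
\qquad
\frac{d}{dt}\ell_2 \le -2a_2\,\ell_2 + 2c_{21}\sqrt{\ell_1\ell_2}.

% Step (iii): Young's inequality 2xy <= x^2 + y^2 applied to each mixed product
2c_{12}\sqrt{\ell_1\ell_2} \le \varepsilon_1\,\ell_1 + \frac{c_{12}^2}{\varepsilon_1}\,\ell_2,
\qquad
2c_{21}\sqrt{\ell_1\ell_2} \le \frac{c_{21}^2}{\varepsilon_2}\,\ell_1 + \varepsilon_2\,\ell_2.

% Step (iv): collect coefficients into a comparison matrix
\frac{d}{dt}\begin{pmatrix}\ell_1\\ \ell_2\end{pmatrix}
\le M \begin{pmatrix}\ell_1\\ \ell_2\end{pmatrix},
\qquad
M = \begin{pmatrix}
-(2a_1-\varepsilon_1) & c_{12}^2/\varepsilon_1\\[2pt]
c_{21}^2/\varepsilon_2 & -(2a_2-\varepsilon_2)
\end{pmatrix}.
```

The off-diagonal entries are nonnegative by construction, so $M$ is Metzler; the diagonal stays negative only if $\varepsilon_i < 2a_i$, which shows how the Young parameters trade self-decay against injection strength.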
3.2. Metzler Comparison Systems
3.2.1. Definition (Metzler Dominance)
3.2.2. Semantics: Funded Diagonals and Injections
3.2.3. Positivity of the Semigroup and Comparison
3.2.4. From Componentwise Bounds to Certified Scalar Ledgers
3.3. One-Clock Reduction
3.3.1. Spectral Abscissa and Effective Rate
3.3.2. Checker form: A Copositive Lyapunov Witness
3.3.3. From Hurwitzness to a Witness, and the Sharp Rate
3.3.4. Two-Ledger Closed Form (Explicit Small-Gain Boundary)
3.3.5. Monotone Engineering Levers: Funded Diagonals vs. Injections
3.3.6. A Checker View: What Must Be Provided and What Is Proved
4. Master Certificate for Learning Under Refinement
- One-ruler Cauchy geometry. Cross-level comparisons are performed in the same instrument norm via the ambient realization (e.g. by comparing to ). If the adjacent discrepancies form a summable tail in K, then the tower is Cauchy in the ruler and determines a unique refinement-limit trajectory on .
- Tail-robust contraction on . Each level admits an inhomogeneous decay inequality for with a margin bounded below uniformly in K and a pollution budget whose integrated size is summable along the ladder. This yields a common exponential envelope on , up to an explicit tail floor that vanishes with refinement.
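The two mechanisms above can be written out as a short worked sketch. The symbols are assumptions introduced for illustration only: $x^{(K)}$ is the level-$K$ state realized in the ambient space, $\|\cdot\|_W$ the ruler norm, $\delta_j$ the adjacent discrepancy between levels $j$ and $j+1$, $E_K$ the declared total ledger, $\lambda>0$ the uniform margin, and $\eta_K\ge 0$ the pollution term on a window $[0,T]$.

```latex
% Telescoping: summable adjacent discrepancies give a Cauchy tower in one ruler
\|x^{(L)} - x^{(K)}\|_W
\le \sum_{j=K}^{L-1} \|x^{(j+1)} - x^{(j)}\|_W
\le \sum_{j \ge K} \delta_j \xrightarrow[K\to\infty]{} 0
\qquad (L > K).

% Tail-robust envelope: inhomogeneous decay plus the variation-of-constants formula
\frac{d}{dt}E_K(t) \le -\lambda\,E_K(t) + \eta_K(t)
\;\Longrightarrow\;
E_K(t) \le e^{-\lambda t}E_K(0) + \int_0^t e^{-\lambda (t-s)}\eta_K(s)\,ds
\le e^{-\lambda t}E_K(0) + \int_0^T \eta_K(s)\,ds .
```

In this sketch the last integral plays the role of the tail floor: the exponential envelope is common to all levels because $\lambda$ does not depend on $K$, and the floor shrinks with refinement whenever the integrated pollution budgets vanish along the ladder.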
4.1. Ladder Geometry and Refinement Limits (Analytic Core)
4.1.1. Ambient One-Ruler Structure and Projective Maps
- an ambient real Hilbert space ,
- linear realization maps for each K,
- bounded projections for each K,
- a bounded, self-adjoint, strictly positive operator (the instrument),
- coarse-graining maps for each K,
- (i) Compatibility (coarse-graining is realized by projection). For all ,
- (ii) Single instrument (restriction of one ambient ruler). For define Equivalently, is the ambient ruler and is its restriction to .
4.1.2. Cross-Level Discrepancy and Telescoping
4.1.3. Existence and Uniqueness of a Refinement-Limit Object
4.2. Programme Lines (O1)–(O4) on
4.2.0.1. Certified time horizon.
4.2.1. Objects on the Window: Trajectories, Ruler, and a Declared Total Ledger
(O1) Tail-robust contraction envelope (summable pollution budget)
(O2) Uniform margin (one exponent class across refinement)
(O3) Projective Cauchy tower in one ruler (summable cross-level inconsistency)
(O4) Uniform dictionary / observability (reported readouts remain well-conditioned)
4.2.2. Checker Summary (What the Certificate Must Provide)
4.3. Master Certificate Theorem on
- (i) Tail-robust levelwise envelope. With λ as in (O2) and as in (O1), for all and all ,
- (ii) Existence and uniqueness of a refinement-limit trajectory. Define the geometric tower budget where are the sequences from (O3). Then , and there exists a unique trajectory such that in , and
- (iii) Readout transport (rate inheritance at each level). For every (from (O4)), for all and all ,
5. Instantiations
5.1. Setup: an Ladder with Diagonal Constraint Operator
5.2. Dynamics: Damped Primal–Dual Flow
5.3. Ledgers and Readouts
5.4. Metzler Comparison and a Checker-Friendly Hurwitz Witness
5.5. Programme Lines (O1)–(O4) (Vanishing Tails)
- (i) (O1) holds with and ;
- (ii) (O2) holds with uniform margin from (33);
- (iii) (O3) holds with and by Lemma 5;
- (iv) (O4) holds with for and .
5.6. Consequence: Refinement-Limit Existence and Inherited Clock
5.7. Numerical Sanity Check (Uniform in K)
5.8. Neural Instantiation (Protocol): A Width Ladder with Auditable Programme Lines
5.8.1. Setup: Width Ladder and a Declared Projection
5.8.2. Practical Ledgers (Loggable During Training)
5.8.3. Estimating Metzler Coefficients from Traces
5.8.4. Empirical Checks for (O1)–(O4)
- (O2) uniform margin: fit envelope rates for and test uniformity in K.
- (O3) projective Cauchy: measure across checkpoints.
- (O1) tail summability: quantify unmodeled remainder budgets and test .
- (O4) dictionary conditioning: bound conditioning constants uniformly in K.
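As a hedged illustration of the (O2) check, the sketch below fits an exponential envelope rate to a logged total-ledger trace at each level and tests whether the fitted rates stay above a common floor. The function names, the log-linear least-squares fit, the synthetic traces, and the floor value are all assumptions made for this example; they are not the paper's protocol, and a fitted rate is an empirical estimate rather than a certified margin.

```python
import math


def fit_decay_rate(times, values):
    """Least-squares slope of log(values) against times, returned as a positive rate.

    Assumes every logged value is strictly positive.
    """
    logs = [math.log(v) for v in values]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    return -sxy / sxx


def rates_are_uniform(rates, floor):
    """True if every fitted rate is at least the declared floor."""
    return min(rates) >= floor


# Synthetic traces (hypothetical data): same decay rate 0.8 at every level,
# with a level-dependent prefactor that shrinks as K grows.
times = [0.1 * i for i in range(50)]
traces = {
    K: [(1.0 + 1.0 / K) * math.exp(-0.8 * t) for t in times]
    for K in (10, 20, 40, 80)
}
rates = {K: fit_decay_rate(times, trace) for K, trace in traces.items()}
uniform = rates_are_uniform(list(rates.values()), floor=0.5)
```

On real traces the log-linear fit is sensitive to noise and to any floor the ledger settles on, so one would typically restrict the fit to the window where the trace is well above that floor.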
5.8.5. Interpretation
6. Concluding Discussion and Outlook
Notation
| Symbol | Type / Domain | Meaning / Assumptions |
|---|---|---|
| Ambient ruler and refinement ladder | ||
| Hilbert space | Ambient realization space used to enforce a single ruler (Definition 2) | |
| W | operator on | Bounded, self-adjoint, strictly positive instrument operator defining |
| inner product | Ambient Hilbert inner product on | |
| norm | Instrument norm: on | |
| set / space | State/parameter space at refinement level K (width, resolution, basis size, etc.) | |
| integer | Minimal refinement index considered; ladder runs over | |
| map | Realization/embedding of level-K states into the ambient ruler space | |
| projection on | Orthogonal projection onto | |
| map | Coarse-graining / restriction map between adjacent refinement levels | |
| map | Multi-step projection (when well-defined) | |
| Training dynamics (discrete and continuous) | ||
| curve in | Continuous-time training trajectory at refinement level K | |
| sequence in | Discrete-time iterates at refinement level K | |
| map on | Discrete update map: | |
| vector field | Continuous-time flow: | |
| t | Continuous time (or rescaled iteration-time) | |
| n | Discrete iteration index | |
| Ledgers and scalarizations | ||
| Nonnegative ledger (risk, constraint violation, robustness proxy, etc.) | ||
| Ledger vector | ||
| Level-K ledger vector along training at refinement K | ||
| Declared total ledger (scalarization) used for the certificate | ||
| w | Positive weight vector defining | |
| SPD matrix | Optional quadratic ledger ruler: | |
| finite set | Family of reported readout metrics | |
| scalars | Dictionary/observability constants in (line (O4)) | |
| Metzler comparison system and one-clock quantities | ||
| M | matrix | Metzler comparison matrix: for (Definition 1) |
| matrix | Level-K comparison matrix in | |
| scalar | Funded self-decay margin for ledger i when | |
| scalar | Injection strength from ledger j to ledger i when | |
| matrix | Matrix exponential (positive for Metzler M) | |
| set | Spectrum (eigenvalues) of M | |
| scalar | Spectral abscissa: | |
| Hurwitz | property | M Hurwitz |
| scalar | Effective one-clock rate: (when M is Hurwitz) | |
| scalar | Gain in the form | |
| Refinement programme-line budgets (Master Certificate) | ||
| T | Declared certification horizon: programme lines (O1)–(O4) are required on | |
| Tail disturbance in (O1) on : | ||
| Integrated pollution budget in (O1): with | ||
| Unresolved-tail budget in (O3): with | ||
| Projective mismatch budget in (O3): with | ||
Definitions
| Entry | Definition / Formula | Role in the paper |
|---|---|---|
| Ladder geometry (“one ruler”) | ||
| Projective consistency | Encodes “same task” across refinement levels | |
| One ruler (ambient instrument restriction) | There exist and realizations such that and | Forbids moving goalposts (Definition 2) |
| Instrument norm and ruler square root | Fixes the single measurement convention across all levels | |
| Instrument contractivity (projection stability) | on | Ensures coarse projections do not inflate the ruler norm (Assumption A1) |
| Non-expansiveness across levels | Basic stability of coarse-graining (Lemma 2) | |
| Cross-level discrepancy (ambient) | Canonical “apples-to-apples” cross-level distance | |
| Projective telescoping bound | Turns summable adjacent mismatches into a Cauchy tower (Lemma 3) | |
| Refinement-limit learner (projective limit) | with (compatible Cauchy tower) | Existence/uniqueness of a refinement-limit object (Theorem 2) |
| Ledgers, scalarizations, and readouts | ||
| Ledger vector | Collects multiple nonnegative training-time quantities | |
| Declared total ledger | with | Single scalar clock target for contraction |
| Quadratic ledger ruler (optional) | , | Alternative scalarization when a quadratic contract is preferred |
| Reported readouts / metrics | , with | External observables whose stability is transported |
| Dictionary (observability) line | uniformly in K | Transfers the certified clock to readouts ((O4)) |
| Readout Lipschitz transport (optional) | on a declared bounded set | Converts ruler convergence to readout convergence (optional strengthening) |
| Metzler comparison and one-clock reduction | ||
| Metzler matrix | with for | Positivity structure for ledger couplings |
| Metzler comparison inequality | componentwise | Auditable coupling model (Definition 1) |
| Semigroup positivity | M Metzler entrywise for all | Enables order-preserving comparison |
| Comparison principle (Duhamel form) | If y solves , and , then , | Correct proof mechanism for (Lemma 1) |
| Funding + injection parametrization | , | Interpretable design levers (fund diagonals, reduce injections) |
| Spectral abscissa and effective rate | , if | Defines the certified one-clock exponent class |
| One-clock certificate (witness form) | with | Core reduction theorem (Theorem 1) |
| Hurwitz ⇒ witness (sharp rate) | If and M is Metzler, with | Explains existence of witnesses / sharp clock (Proposition 1) |
| Two-ledger small-gain criterion | For funded+injection M, M Hurwitz | Exact design boundary (Proposition 2) |
| Master Certificate programme lines on | ||
| Certification horizon | (declared) | All programme lines are audited on |
| (O1) Tail-robust envelope | , , | Controls time-direction pollution on |
| (O2) Uniform margin | independent of K | One exponent class across refinement |
| (O3) Geometric tower budget | , | Cauchy tower in one ruler (no drift) on |
| (O4) Uniform dictionary on | uniformly in K and | Transfers rates to reported metrics |
| Master Certificate (learning version) | (O1)–(O4) ⇒ refinement-limit trajectory + rate inheritance on | Main soundness engine (Theorem 3) |
| Certificate artifact (what the prover ships) | ||
| Certificate artifact | Declared ; maps ; ledger definition ; budgets ; margin ; dictionary constants ; and (when applicable) a Metzler witness | Concrete verifier-facing interface (the “proof-carrying” object) |
| Witness-finding LP (optional recipe) | Find , s.t. , , | Makes the one-clock witness actionable for ML/CS audiences |
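The witness form of the one-clock certificate lends itself to a small verifier. The sketch below is a minimal illustration under stated assumptions: the witness condition is taken to be the standard one for positive systems (a strictly positive vector v and a rate lam > 0 with vᵀM ≤ −lam·vᵀ componentwise), and the two-ledger matrix is parametrized as [[-a1, b12], [b21, -a2]]. The function names, the parametrization symbols, and the numerical example are hypothetical and are not taken from the paper.

```python
import math


def is_metzler(M):
    """True if all off-diagonal entries of the square matrix M are nonnegative."""
    n = len(M)
    return all(M[i][j] >= 0 for i in range(n) for j in range(n) if i != j)


def check_witness(M, v, lam, tol=1e-12):
    """Check v > 0, lam > 0 and (v^T M)_j <= -lam * v_j for every column j.

    If this holds and y >= 0 satisfies y' <= M y, then V = v^T y obeys
    V' <= -lam * V, i.e. V decays at least like exp(-lam * t).
    """
    n = len(M)
    if not is_metzler(M) or lam <= 0 or any(x <= 0 for x in v):
        return False
    for j in range(n):
        col = sum(v[i] * M[i][j] for i in range(n))
        if col > -lam * v[j] + tol:
            return False
    return True


def two_ledger_abscissa(a1, a2, b12, b21):
    """Largest eigenvalue of [[-a1, b12], [b21, -a2]] for b12, b21 >= 0.

    The discriminant is nonnegative, so both eigenvalues are real.
    """
    return 0.5 * (-(a1 + a2) + math.sqrt((a1 - a2) ** 2 + 4.0 * b12 * b21))


# Hypothetical example: funded diagonals 2 and 1, injections 1 and 0.5.
M = [[-2.0, 1.0], [0.5, -1.0]]
ok = check_witness(M, v=[1.0, 1.5], lam=0.3)       # accepted
too_fast = check_witness(M, v=[1.0, 1.5], lam=0.7)  # rejected
alpha = two_ledger_abscissa(2.0, 1.0, 1.0, 0.5)     # negative, so M is Hurwitz
```

For positive a1 and a2 the abscissa is negative exactly when a1·a2 > b12·b21, which is the usual two-ledger small-gain condition; no witness can certify a rate larger than −alpha. The checker only verifies a supplied pair (v, lam); finding one is the job of the linear program sketched in the table row above.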
References
- Miettinen, K. Nonlinear Multiobjective Optimization. In International Series in Operations Research & Management Science; Kluwer Academic Publishers: Boston, MA, 1999; Vol. 12.
- Sener, O.; Koltun, V. Multi-Task Learning as Multi-Objective Optimization. Advances in Neural Information Processing Systems 2018, arXiv:cs, Vol. 31, 525–536.
- Kaplan, J.; McCandlish, S.; Henighan, T.; Brown, T.B.; Chess, B.; Child, R.; Gray, S.; Radford, A.; Wu, J.; Amodei, D. Scaling Laws for Neural Language Models. arXiv 2020, arXiv:2001.08361.
- Ciarlet, P.G. The Finite Element Method for Elliptic Problems. In Studies in Mathematics and Its Applications; North-Holland Publishing Company: Amsterdam, 1978; Vol. 4.
- Hackbusch, W. Multi-Grid Methods and Applications. In Springer Series in Computational Mathematics; Springer-Verlag: Berlin, 1985; Vol. 4.
- Brenner, S.C.; Scott, L.R. The Mathematical Theory of Finite Element Methods. In Texts in Applied Mathematics, 3rd ed.; Springer: New York, 2008; Vol. 15.
- Lax, P.D.; Richtmyer, R.D. Survey of the Stability of Linear Finite Difference Equations. Communications on Pure and Applied Mathematics 1956, 9, 267–293.
- Hille, E.; Phillips, R.S. Functional Analysis and Semi-Groups. In American Mathematical Society Colloquium Publications; American Mathematical Society: Providence, RI, 1957; Vol. 31.
- Trotter, H.F. Approximation of Semi-Groups of Operators. Pacific Journal of Mathematics 1958, 8, 887–919.
- Pazy, A. Semigroups of Linear Operators and Applications to Partial Differential Equations. In Applied Mathematical Sciences; Springer: New York, 1983; Vol. 44.
- Berman, A.; Plemmons, R.J. Nonnegative Matrices in the Mathematical Sciences; Academic Press, 1979.
- Farina, L.; Rinaldi, S. Positive Linear Systems: Theory and Applications. In Pure and Applied Mathematics; Wiley–Interscience: New York, 2000; Vol. 255.
- Smith, H.L. Monotone Dynamical Systems: An Introduction to the Theory of Competitive and Cooperative Systems. In Mathematical Surveys and Monographs; American Mathematical Society: Providence, RI, 1995; Vol. 41.
- Briat, C. Linear Parameter-Varying and Time-Delay Systems: Analysis, Observation, Filtering & Control. In Advances in Delays and Dynamics; Springer: Berlin, Heidelberg, 2015.
- Kreyszig, E. Introductory Functional Analysis with Applications. In Wiley Classics Library; Wiley, 1989; Vol. 17.
- Bousquet, O.; Elisseeff, A. Stability and Generalization. Journal of Machine Learning Research 2002, 2, 499–526.
- McAllester, D.A. PAC-Bayesian Model Averaging. In Proceedings of the Twelfth Annual Conference on Computational Learning Theory (COLT ’99), New York, NY, USA, 1999; pp. 164–170.
- Catoni, O. PAC-Bayesian Supervised Classification: The Thermodynamics of Statistical Learning. In Institute of Mathematical Statistics Lecture Notes–Monograph Series; Institute of Mathematical Statistics: Beachwood, OH, 2007; Vol. 56.
- Duchi, J.C.; Namkoong, H. Learning Models with Uniform Performance via Distributionally Robust Optimization. The Annals of Statistics 2021, 49, 1378–1406.
- Katz, G.; Barrett, C.; Dill, D.L.; Julian, K.; Kochenderfer, M.J. Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks. In Proceedings of the Computer Aided Verification (CAV 2017), Proceedings, Part I; Majumdar, R., Kuncak, V., Eds.; Lecture Notes in Computer Science; Springer: Cham, 2017; Vol. 10426, pp. 97–117.
- Gehr, T.; Mirman, M.; Drachsler-Cohen, D.; Tsankov, P.; Chaudhuri, S.; Vechev, M. AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation. In Proceedings of the 2018 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 2018; pp. 3–18.
- Seshia, S.A.; Sadigh, D.; Sastry, S.S. Toward Verified Artificial Intelligence. Communications of the ACM 2022, 65, 46–55. An earlier technical version appeared as arXiv:1606.08514.
- Necula, G.C. Proof-carrying code. In Proceedings of the 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’97), New York, NY, USA, 1997; pp. 106–119.
- Coddington, E.A.; Levinson, N. Theory of Ordinary Differential Equations; McGraw–Hill: New York, 1955.
- Khalil, H.K. Nonlinear Systems, 3rd ed.; Prentice Hall: Upper Saddle River, NJ, 2002.
- Grönwall, T.H. Note on the Derivatives with Respect to a Parameter of the Solutions of a System of Differential Equations. Annals of Mathematics (2) 1919, 20, 292–296.
- Robbins, H.; Monro, S. A Stochastic Approximation Method. The Annals of Mathematical Statistics 1951, 22, 400–407.
- Benaïm, M. Dynamics of Stochastic Approximation Algorithms. In Séminaire de Probabilités XXXIII; Azéma, J., Émery, M., Ledoux, M., Yor, M., Eds.; Lecture Notes in Mathematics; Springer: Berlin, Heidelberg, 1999; Vol. 1709, pp. 1–68.
- Kushner, H.J.; Yin, G.G. Stochastic Approximation and Recursive Algorithms and Applications. In Stochastic Modelling and Applied Probability, 2nd ed.; Springer: New York, 2003; Vol. 35.
- Bramble, J.H.; Pasciak, J.E.; Xu, J. Parallel Multilevel Preconditioners. Mathematics of Computation 1990, 55, 1–22.
- Thomée, V. Galerkin Finite Element Methods for Parabolic Problems. In Springer Series in Computational Mathematics, 2nd ed.; Springer: Berlin, Heidelberg, 2006.
- Emmrich, E. Discrete Versions of Gronwall’s Lemma and Their Application to the Numerical Analysis of Parabolic Problems. In Preprint Reihe Mathematik 637; Technische Universität Berlin, 1999.
- Desoer, C.A.; Vidyasagar, M. Feedback Systems: Input-Output Properties; Academic Press: New York, 1975.
- Jiang, Z.P.; Teel, A.R.; Praly, L. Small-Gain Theorem for ISS Systems and Applications. IEEE Transactions on Automatic Control 1994, 39, 1609–1619.
| K | ||||
|---|---|---|---|---|
| 10 | ||||
| 20 | ||||
| 40 | ||||
| 80 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
