Preprint Article (this version is not peer-reviewed; a peer-reviewed article of this preprint also exists)

Towards Generative Interest-Rate Modelling: Neural Perturbations Within the Libor Market Model

Submitted: 02 December 2025
Posted: 04 December 2025


Abstract
This study proposes a neural-augmented Libor Market Model (LMM) for swaption-surface calibration that enhances expressive power while maintaining the interpretability, arbitrage-free structure, and numerical stability of the classical framework. Classical LMM parametrizations, based on exponential-decay volatility functions and static correlation kernels, are known to perform poorly in sparsely quoted and long-tenor regions of swaption volatility cubes. Machine-learning–based diffusion models offer flexibility but often lack transparency, stability, and measure-consistent dynamics. To reconcile these requirements, the present approach embeds a compact neural network within the volatility and correlation layers of the LMM, constrained by structural diagnostics, low-rank correlation construction, and HJM-consistent drift. Empirical tests across major currencies (EUR, GBP, USD) and multiple quarterly datasets from 2024–2025 show that the neural-augmented LMM consistently outperforms the classical model. Improvements of approximately 9–15% in implied-volatility RMSE and 11–17% in PV RMSE are observed across all datasets, with no deterioration in any region of the surface. These results reflect the model’s ability to represent cross-tenor dependencies and surface curvature beyond the reach of classical parametrizations, while remaining economically interpretable and numerically tractable. The findings support hybrid model designs in quantitative finance, where small neural components complement robust analytical structures. The approach aligns with ongoing industry efforts to integrate machine learning into regulatory-compliant pricing models and provides a pathway for future generative LMM variants that retain arbitrage-free diffusion structure while learning data-driven volatility geometry.

1. Introduction

The Libor Market Model (LMM), also known as the Brace–Gatarek–Musiela model, remains one of the most established frameworks for interest-rate option pricing [1,2,3]. Its appeal lies in the explicit modeling of discretely compounded forward rates, the transparent separation of volatility and correlation parameters [4,5], and the interpretability of its structural assumptions [6,7]. These features have made the model a staple for pricing and hedging swaptions across trading desks and risk-management environments.
Despite its wide adoption, the classical LMM is limited by its reliance on parametric volatility and correlation specifications. Such parametrizations often fail to reproduce the full swaption volatility cube observed in modern markets, particularly strike-dependent features, tenor-specific curvature, and cross-sectional structure [8,9,10]. Extensions such as SABR–LMM hybrids [11] address some smile features but introduce additional assumptions and can remain insufficient for jointly matching moneyness and maturity structures.
Recent advances in machine learning have reopened the question of whether established financial models could be enhanced without sacrificing their interpretability. Neural stochastic differential equations [12,13] and deep-learning approaches to high-dimensional PDEs and BSDEs [14,15] provide powerful tools for approximating complex diffusions, while physics-informed neural networks integrate structural PDE constraints directly into training [16,17]. However, most such methods sacrifice transparency or introduce latent representations whose financial interpretation is unclear, the familiar black-box problem. For option-pricing applications, where calibration stability, risk-factor interpretability, and traceability of sensitivities are essential, full replacement of classical structures is rarely acceptable.
Within this context, the present work views neural augmentation not as a substitute for the LMM’s analytical foundation, but as a constrained overlay. Neural networks are introduced solely to parameterize volatility and correlation structures within the established forward-rate dynamics, and only under diagnostics designed to enforce model-consistent properties [18,19]. This preserves the interpretability of the diffusion and drift structure while allowing greater expressiveness in matching empirical swaption surfaces. Such a design is aligned with the aims of the special issue, which emphasizes applications of modern machine-learning tools that enhance, rather than replace, domain-specific models.
Although the industry now calibrates swaptions using OIS discounting and benchmarks such as EURIBOR, SONIA, and SOFR, the purpose of the SOFR discussion here is not to recast the LMM for backward-looking compounded rates. Rather, it serves to illustrate that several calibration difficulties encountered when applying LMM-style models to SOFR (e.g., synthetic term construction, normal-volatility quoting conventions) reflect structural misalignment between forward-looking and backward-looking benchmarks. The proposed neural augmentation demonstrates how certain deficiencies of classical parametrizations, which arise in both IBOR and SOFR settings, can be addressed without modifying the LMM’s interpretable structure. The introduction of neural components raises natural questions regarding the necessity of classical extensions such as SABR overlays or jump-diffusion terms. In this framework neither is included. SABR is excluded because neural parametrization provides sufficient functional expressiveness to capture smile curvature within the LMM structure [19]. Jumps are excluded because, under swaption-only calibration, jump intensity and diffusive volatility are not separately identifiable in a stable manner [20,21,22,23]. Avoiding these components prevents parameter proliferation and maintains the clarity of the diffusion-based interpretation.
The objective of this study is twofold: (i) to improve the calibration of the LMM to the swaption volatility cube, and (ii) to do so while preserving the model’s transparent structure. Neural augmentation is therefore treated not as flexibility for its own sake, but as a targeted mechanism to address long-standing calibration deficiencies without altering the economic interpretation of the model. Subsequent sections detail the construction of the neural parametrizations, the diagnostics used to enforce structural consistency, and the empirical performance relative to classical specifications.

2. Materials and Methods

2.1. Market Data and Instruments

2.1.1. Yield Curves

The empirical analysis is based on Bloomberg swaption volatility cubes for USD-SOFR, EUR-EURIBOR, and GBP-SONIA, together with the corresponding discount and projection curves [2,3,6,24]. For each valuation date, input CSV files provide par yields for the relevant overnight-indexed swap (OIS) curves used for discounting and IBOR-linked swap curves used for projecting forward rates. These are converted into zero-coupon discount factors $P(0,T)$ via a standard bootstrapping routine external to the present framework [2,4,6]. The resulting discount function is represented numerically by a callable interpolant $DF_0(T)$, which returns $P(0,T)$ for maturities $T$.
Given a tenor structure $\{T_0,\dots,T_N\}$ with accrual factors $\delta_i = T_{i+1}-T_i$, forward rates at time 0 are defined by
\[
L_i(0) = \frac{1}{\delta_i}\left(\frac{P(0,T_i)}{P(0,T_{i+1})} - 1\right), \qquad i = 0,\dots,N-1,
\]
consistent with standard LMM conventions [1,2,3]. For a swap starting at $T_k$ with $m$ accrual periods, the par swap rate $S_0$ needed for swaption pricing is computed from the same discount curve via
\[
A_0 = \sum_{j=k}^{k+m-1} \delta_j\, P(0,T_{j+1}), \qquad S_0 = \frac{P(0,T_k) - P(0,T_{k+m})}{A_0},
\]
ensuring internal consistency between the forward-rate and swap-rate inputs [1,2,3,6].
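As a concrete illustration of these conventions, forward rates and the par swap rate can be computed from any discount-factor function in a few lines. This is an illustrative sketch, not the paper’s code: the flat-curve `discount_factor` stands in for the bootstrapped interpolant $DF_0(T)$, and all names are hypothetical.

```python
import math

def discount_factor(T, zero_rate=0.03):
    # Toy flat-curve discount function standing in for the bootstrapped DF0(T).
    return math.exp(-zero_rate * T)

def forward_rates(tenors, df):
    # L_i(0) = (1/delta_i) * (P(0, T_i) / P(0, T_{i+1}) - 1)
    return [
        (df(tenors[i]) / df(tenors[i + 1]) - 1.0) / (tenors[i + 1] - tenors[i])
        for i in range(len(tenors) - 1)
    ]

def par_swap_rate(tenors, df, k, m):
    # Annuity A0 = sum_j delta_j * P(0, T_{j+1}) for j = k .. k+m-1,
    # par rate S0 = (P(0, T_k) - P(0, T_{k+m})) / A0.
    annuity = sum(
        (tenors[j + 1] - tenors[j]) * df(tenors[j + 1])
        for j in range(k, k + m)
    )
    return (df(tenors[k]) - df(tenors[k + m])) / annuity, annuity
```

On a flat curve, every forward and the par rate collapse to the same continuously-compounded conversion, which makes the internal consistency between the two formulas easy to verify numerically.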

2.1.2. Swaption Volatility Surface

The calibration target is the ATM swaption volatility surface for USD-SOFR, EUR-EURIBOR and GBP-SONIA. Market inputs are provided as normal (Bachelier) ATM implied volatilities σ N , quoted in basis points on an expiry–tenor grid for each currency [2,6,24]. Expiry and swap-tenor labels (e.g., 6M, 1Y, 5Y) are parsed into year fractions, and the union of all expiries and underlying payment dates is merged with the yield-curve pillars to define the simulation tenor array [2,3].
For each grid point, an annuity-consistent conversion from normal to Black ATM volatility is performed. Let $F$ denote the ATM forward swap rate for expiry $T_{\exp}$ and underlying swap annuity $A_0$. Under the normal model, the per-annuity price of an ATM payer swaption is
\[
\mathrm{PV}^{N}_{\mathrm{ATM}} = \sigma_N \sqrt{\frac{T_{\exp}}{2\pi}}.
\]
Under the Black model with volatility $\sigma_B$, the corresponding per-annuity price is
\[
\mathrm{PV}^{B}_{\mathrm{ATM}}(\sigma_B) = F\left[\Phi\!\left(\tfrac{1}{2}\sigma_B\sqrt{T_{\exp}}\right) - \Phi\!\left(-\tfrac{1}{2}\sigma_B\sqrt{T_{\exp}}\right)\right],
\]
where $\Phi$ is the standard normal CDF. For each grid point, $\sigma_B$ is obtained by numerically solving
\[
\mathrm{PV}^{B}_{\mathrm{ATM}}(\sigma_B) = \mathrm{PV}^{N}_{\mathrm{ATM}},
\]
using a robust root-finding procedure with adaptive bracketing and safe fallbacks [2,6,9,11]. The resulting Black-ATM surface $\sigma^{\mathrm{mkt}}_{\mathrm{ATM}}(T_{\exp},\tau)$ constitutes the primary calibration target.
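The normal-to-Black conversion can be sketched as follows. A plain bisection over a fixed bracket stands in for the adaptive-bracketing root finder described in the text; function names are illustrative.

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def pv_atm_normal(sigma_n, t_exp):
    # Bachelier ATM payer price per unit annuity: sigma_N * sqrt(T / (2*pi)).
    return sigma_n * math.sqrt(t_exp / (2.0 * math.pi))

def pv_atm_black(sigma_b, fwd, t_exp):
    # Black ATM payer price per unit annuity: F * (Phi(d) - Phi(-d)), d = sigma_B*sqrt(T)/2.
    d = 0.5 * sigma_b * math.sqrt(t_exp)
    return fwd * (norm_cdf(d) - norm_cdf(-d))

def normal_to_black_atm(sigma_n, fwd, t_exp, lo=1e-8, hi=5.0, tol=1e-12):
    # Bisection: the Black ATM price is monotone increasing in sigma_B.
    target = pv_atm_normal(sigma_n, t_exp)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if pv_atm_black(mid, fwd, t_exp) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

For small volatilities the conversion reduces to the familiar approximation $\sigma_B \approx \sigma_N / F$, which provides a quick sanity check on the solver.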

2.2. Classical LIBOR Market Model Specification

2.2.1. Forward Dynamics

Let $\{T_0,\dots,T_N\}$ be a fixed tenor structure with associated forward rates $L_i(t)$ for accrual periods $[T_i, T_{i+1}]$. Under the forward measure $Q^{T_{i+1}}$ and numeraire $P(t,T_{i+1})$, the LMM specifies
\[
dL_i(t) = \sigma_i(t)\, L_i(t)\, dW_i^{(i+1)}(t),
\]
where $\sigma_i(t)$ is the instantaneous volatility and $W_i^{(i+1)}$ is a Brownian motion under $Q^{T_{i+1}}$ [1,2,3]. When written under a common terminal measure $Q^{T_N}$, the coupled dynamics take the form
\[
dL_i(t) = \mu_i(t)\, L_i(t)\, dt + \sigma_i(t)\, L_i(t)\, dW_i(t),
\]
with drift given by the HJM-style no-arbitrage condition
\[
\mu_i(t) = -\sigma_i(t) \sum_{j=i+1}^{N-1} \frac{\delta_j L_j(t)\, \rho_{ij}\, \sigma_j(t)}{1 + \delta_j L_j(t)},
\]
and instantaneous correlations
\[
\mathbb{E}[\, dW_i(t)\, dW_j(t)\,] = \rho_{ij}\, dt.
\]
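A minimal sketch of the terminal-measure drift term follows, assuming volatilities, correlations, and accruals are supplied as plain Python lists (all names are illustrative). Note that the last forward, $i = N-1$, is driftless under $Q^{T_N}$.

```python
def terminal_drift(i, L, sigma, rho, delta, N):
    # mu_i(t) = -sigma_i * sum_{j=i+1}^{N-1} delta_j L_j rho_ij sigma_j / (1 + delta_j L_j)
    s = 0.0
    for j in range(i + 1, N):
        s += delta[j] * L[j] * rho[i][j] * sigma[j] / (1.0 + delta[j] * L[j])
    return -sigma[i] * s
```

For positive rates, volatilities, and correlations, the drift is negative for all but the terminal forward, consistent with the sign convention above.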

2.2.2. Functional Volatility and Correlation

The classical benchmark uses low-dimensional functional forms:
\[
\sigma_i^{\mathrm{cl}}(t) = a \exp\!\left(-b\, \tau_i(t)\right), \qquad \tau_i(t) = \max(T_i - t,\, 0),
\]
\[
\rho_{ij}^{\mathrm{cl}} = \exp\!\left(-\beta\, |i-j|\right),
\]
with parameters $(a, b, \beta)$ to be calibrated. This structure reduces the number of free parameters and reflects the empirical decay of volatilities and correlations across maturities [2,4,5,8].
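The two classical functional forms translate directly into code; the sketch below is illustrative rather than the calibrated implementation.

```python
import math

def sigma_classical(a, b, T_i, t):
    # Exponential-decay instantaneous volatility in time-to-reset tau_i(t).
    tau = max(T_i - t, 0.0)
    return a * math.exp(-b * tau)

def rho_classical(beta, i, j):
    # Exponentially decaying correlation in tenor distance |i - j|.
    return math.exp(-beta * abs(i - j))
```

At the reset date ($t = T_i$) the volatility equals $a$, and the correlation matrix has unit diagonal and decays monotonically off-diagonal, the two properties the calibration exploits.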

2.3. Neural Parametrization of Volatility and Correlation

2.3.1. Architecture

To enhance calibration power while preserving interpretability, the classical parametrization is overlaid with a compact neural network. The network $f_\theta$ takes as input the current time $t$ and forward-rate vector $L(t) = (L_0(t),\dots,L_{N-1}(t))$, and outputs perturbations to the classical volatility and a low-rank representation of the correlation structure:
\[
(\Delta\sigma(t),\, B(t)) = f_\theta\!\left(t,\, L(t)\right),
\]
where $\Delta\sigma(t) \in \mathbb{R}^N$ and $B(t) \in \mathbb{R}^{N\times r}$ for a small rank $r$ (typically $r \in \{2,3\}$). The effective volatility and correlation are then defined as
\[
\sigma_i(t) = \sigma_i^{\mathrm{cl}}(t)\cdot \exp\!\left(\Delta\sigma_i(t)\right), \qquad \tilde C(t) = B(t)\, B(t)^\top, \qquad \rho(t) = \Pi_C\!\left(\tilde C(t)\right),
\]
where $\Pi_C$ denotes a projection onto the set of correlation matrices (symmetric positive semidefinite with unit diagonal), implemented via diagonal rescaling combined with a Higham-style nearest-correlation operator [7,8,25].
The network itself is a shallow multi-layer perceptron with small hidden layers (8–16 units), smooth activations, and explicit output clipping,
\[
\Delta\sigma_i(t) \in [\underline{\Delta},\, \overline{\Delta}], \qquad B_{ij}(t) \in [\underline{b},\, \overline{b}],
\]
to avoid extreme values and improve numerical stability.
Jump components are not modeled in the final specification; jump-related heads are disabled and no Poisson or jump-amplitude parameters enter the SDEs.
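The low-rank correlation construction can be illustrated with NumPy as follows. This sketch applies only the diagonal-rescaling part of $\Pi_C$; the Higham-style nearest-correlation step [25] used in the full pipeline is omitted here because $B B^\top$ is PSD by construction, so rescaling alone already yields a valid correlation matrix in this setting.

```python
import numpy as np

def corr_from_factors(B, eps=1e-6):
    # Low-rank candidate: C = B B^T (PSD by construction); small diagonal
    # jitter guards against near-zero rows before rescaling.
    C = B @ B.T + eps * np.eye(B.shape[0])
    # Diagonal rescaling to unit diagonal: rho_ij = C_ij / sqrt(C_ii C_jj).
    d = np.sqrt(np.diag(C))
    rho = C / np.outer(d, d)
    np.fill_diagonal(rho, 1.0)
    return rho
```

Because the rescaling is a congruence transform of a PSD matrix, the result stays PSD, which is exactly the property the simulation’s Cholesky factorization requires.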

2.4. Monte Carlo Simulation and Swaption Pricing

2.4.1. Path Simulation

Both classical and neural-augmented LMMs are simulated under the terminal measure $Q^{T_N}$ using an Euler–Maruyama scheme with lognormal updates to preserve positivity of forward rates [1,2,3,21]. For a time step of size $\Delta t$, the update for forward rate $L_i(t)$ is
\[
L_i(t+\Delta t) = L_i(t)\, \exp\!\left[\left(\mu_i(t) - \tfrac{1}{2}\sigma_i(t)^2\right)\Delta t + \sigma_i(t)\sqrt{\Delta t}\, \bigl(\Gamma(t)\,\epsilon_t\bigr)_i\right],
\]
where $\epsilon_t \sim \mathcal{N}(0, I)$ and $\Gamma(t)$ is the Cholesky factor of the correlation matrix $\rho(t)$ obtained after projection. This multiplicative update avoids negative forward rates and remains consistent with the lognormal assumption underlying Black pricing.
Simulation is fully vectorized: a fixed number of time steps n steps and Monte Carlo paths n paths are used, with quantities stored in rank-3 tensors ( time , path , forward ) . Classical and neural simulations share the same numerical grid to permit direct comparison.
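A vectorized form of this lognormal Euler update might look as follows. The callable arguments (`sigma_fn`, `drift_fn`, `chol_fn`) are illustrative placeholders for either the classical or neural-augmented coefficient functions; the output tensor follows the (time, path, forward) layout described above.

```python
import numpy as np

def simulate_lmm(L0, sigma_fn, drift_fn, chol_fn, n_steps, n_paths, dt, seed=0):
    # Lognormal Euler update: L <- L * exp((mu - 0.5*sigma^2)*dt + sigma*sqrt(dt)*(Gamma eps)).
    rng = np.random.default_rng(seed)
    N = len(L0)
    L = np.tile(np.asarray(L0, dtype=float), (n_paths, 1))
    out = np.empty((n_steps + 1, n_paths, N))
    out[0] = L
    for k in range(n_steps):
        t = k * dt
        sig = sigma_fn(t)                  # shape (N,)
        gamma = chol_fn(t)                 # (N, N) Cholesky factor of rho(t)
        eps = rng.standard_normal((n_paths, N))
        z = eps @ gamma.T                  # correlated Gaussian shocks
        mu = drift_fn(t, L)                # (n_paths, N) terminal-measure drift
        L = L * np.exp((mu - 0.5 * sig**2) * dt + sig * np.sqrt(dt) * z)
        out[k + 1] = L
    return out
```

With zero drift the update is an exact lognormal martingale step, so positivity and an (approximately) preserved mean are easy invariants to check.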

2.4.2. Swaption Pricing and Implied Volatility

For each expiry–tenor pair $(T_{\exp}, \tau)$ present in the market surface, the model-implied ATM payer swaption price is computed from simulated paths. Let $S(T_{\exp})$ denote the simulated par swap rate at option expiry, constructed from the simulated discount factors. The discounted payoff for each path is
\[
\Pi = \max\!\left(S(T_{\exp}) - S_0,\, 0\right) A_0,
\]
discounted back to time 0 using the simulated or initial discount curve. The Monte Carlo estimator of the present value is
\[
\widehat{\mathrm{PV}}_{\mathrm{MC}} = \frac{1}{n_{\mathrm{paths}}} \sum_{k=1}^{n_{\mathrm{paths}}} \Pi^{(k)}.
\]
An implied Black-ATM volatility $\sigma^{\mathrm{model}}_{\mathrm{ATM}}(T_{\exp}, \tau)$ is then recovered by inverting the Black-ATM pricing formula with $F = S_0$ and annuity $A_0$, ensuring comparability with the market surface, consistent with standard practice [2,3,6].
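The path-wise pricing and implied-volatility inversion can be sketched as below, reusing the ATM Black formula from the quoting section; the bisection solver is a stand-in for the production root finder, and all names are illustrative.

```python
import math
import statistics

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_atm_pv(sigma, fwd, t_exp, annuity):
    # ATM payer swaption under Black: A0 * F * (Phi(d) - Phi(-d)), d = sigma*sqrt(T)/2.
    d = 0.5 * sigma * math.sqrt(t_exp)
    return annuity * fwd * (norm_cdf(d) - norm_cdf(-d))

def mc_swaption_pv(swap_rates_at_expiry, s0, annuity):
    # Average of per-path payoffs max(S(T_exp) - S0, 0) * A0.
    n = len(swap_rates_at_expiry)
    return annuity * sum(max(s - s0, 0.0) for s in swap_rates_at_expiry) / n

def implied_black_atm_vol(pv, fwd, t_exp, annuity, lo=1e-8, hi=5.0):
    # Invert the monotone ATM Black formula by bisection.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_atm_pv(mid, fwd, t_exp, annuity) < pv:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Feeding the estimator lognormal swap rates with a known volatility and inverting should recover that volatility, which is a useful consistency test for the whole pricing loop.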

2.5. Calibration Objectives and Diagnostics

2.5.1. Vega-Weighted Swaption Data Loss

The primary calibration objective is a vega-weighted least-squares error between market and model-implied ATM Black volatilities over all valid surface points [2,9,11]:
\[
\mathcal{L}_{\mathrm{data}} = \frac{1}{N_{\mathrm{cells}}} \sum_{(T_{\exp},\tau)} \left( \frac{\widehat{\mathrm{PV}}_{\mathrm{MC}}(T_{\exp},\tau) - \mathrm{PV}_{\mathrm{mkt}}(T_{\exp},\tau)}{\mathrm{Vega}_{\mathrm{mkt}}(T_{\exp},\tau)} \right)^{\!2}.
\]
Here $\mathrm{PV}_{\mathrm{mkt}}$ is the market price implied by the market Black-ATM volatility, and $\mathrm{Vega}_{\mathrm{mkt}}$ is the corresponding Black vega. This normalizes errors in "volatility points" and emphasizes regions where the market is most sensitive.
To reduce computational cost, a mini-batch strategy is employed: at each training step, only a small random subset of surface cells is used to estimate L data . Over the course of training, all cells are visited repeatedly.
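The vega-weighted loss and its mini-batch variant reduce to a few lines; the dict-of-cells representation and parameter names here are illustrative assumptions.

```python
import random

def vega_weighted_loss(pv_mc, pv_mkt, vega_mkt, batch_size=None, rng=None):
    # pv_mc, pv_mkt, vega_mkt: dicts keyed by (expiry, tenor) cell labels.
    # With batch_size set, the loss is estimated on a random subset of cells.
    cells = sorted(pv_mc)
    if batch_size is not None and batch_size < len(cells):
        rng = rng or random.Random(0)
        cells = rng.sample(cells, batch_size)
    resid = [((pv_mc[c] - pv_mkt[c]) / vega_mkt[c]) ** 2 for c in cells]
    return sum(resid) / len(resid)
```

Dividing by the cell vega turns price residuals into volatility-point residuals, so a perfect fit gives a loss of exactly zero regardless of the cell’s notional scale.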

2.5.2. Structural Regularization and Diagnostics

In addition to the data loss, a light structural regularizer is introduced. Rather than computing a full second-order pricing PDE residual, a first-order proxy penalizes rapid time variation and large gradients of the neural outputs, following ideas from physics-informed and BSDE-based deep learning for pricing and control [14,15,16,17]:
\[
\mathcal{L}_{\mathrm{struct}} = \mathbb{E}_{(t,L)}\!\left[ \left\| \partial_t \sigma(t,L) \right\|^2 + \left\| \nabla_L \sigma(t,L) \right\|^2 + \left\| \sigma(t,L) \right\|^2 \right],
\]
where expectations are approximated using states visited along simulated paths. This encourages smoothness in time and state space without incurring the numerical overhead and instability of nested automatic differentiation.
A set of diagnostics is tracked but not directly optimized:
  • minimum eigenvalue of ρ ( t ) over time and paths (PSD check);
  • fraction of simulated forward rates that become negative (positivity check);
  • deviations from martingale conditions for discounted swap rates;
  • gradient norms of network parameters and incidence of NaN/Inf values.
These statistics are used to tune regularization weights and learning rates but are not explicitly included in $\mathcal{L}_{\mathrm{total}}$.

2.5.3. Total Objective

The overall loss used for neural calibration is
\[
\mathcal{L}_{\mathrm{total}} = \lambda_{\mathrm{data}}\, \mathcal{L}_{\mathrm{data}} + \lambda_{\mathrm{struct}}\, \mathcal{L}_{\mathrm{struct}},
\]
with $\lambda_{\mathrm{data}} \gg \lambda_{\mathrm{struct}}$ to prioritize market fit while maintaining a minimal level of smoothness and stability.

2.6. Pricing Error Diagnostics

For each swaption defined by expiry $T$ and strike $K$, the model-implied price is computed via Monte Carlo path simulation under both the classical and neural models. These are compared against market-implied Black prices:
\[
\mathrm{PV}_{\mathrm{Black}} = \mathrm{DF}_{\mathrm{pay}} \cdot \Delta \cdot \mathrm{BlackCall}(F, K, \sigma_{\mathrm{mkt}}, T).
\]
Pricing errors are evaluated both in absolute basis points and as vega-weighted deviations:
\[
\text{Absolute error (bp)} = 10^4 \cdot \left| \mathrm{PV}_{\mathrm{model}} - \mathrm{PV}_{\mathrm{market}} \right|, \qquad
\text{Vega-weighted error} = \frac{\mathrm{PV}_{\mathrm{model}} - \mathrm{PV}_{\mathrm{market}}}{\mathrm{BlackVega}(F, K, T, \sigma_{\mathrm{mkt}})}.
\]
Model-implied volatilities $\hat\sigma$ are obtained by inverting the Black formula:
\[
\mathrm{PV}_{\mathrm{model}} = \mathrm{DF}_{\mathrm{pay}} \cdot \Delta \cdot \mathrm{BlackCall}(F, K, \hat\sigma, T).
\]
The implied-volatility error is computed as
\[
\text{IV error} = \hat\sigma_{\mathrm{model}} - \sigma_{\mathrm{market}},
\]
and is reported separately for the classical and neural models.

2.7. Bucketed RMSE Analysis

To assess calibration quality across strike and maturity dimensions, the caplet data is grouped into two-dimensional buckets:
  • Moneyness buckets (bp): $(-150, -100]$, $(-100, -50]$, $(-50, 50]$, $(50, 100]$, $(100, 150]$
  • Maturity buckets (years): $T \in [0, 1]$, $(1, 2]$, $(2, 5]$, $(5, 10]$, $(10, 30]$
Within each bucket, the root mean squared error (RMSE) is computed for both implied-volatility and pricing errors:
\[
\mathrm{RMSE}(\sigma) = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( \hat\sigma_i - \sigma_i^{\mathrm{mkt}} \right)^2 }, \qquad
\mathrm{RMSE}(\mathrm{bp}) = 10^4 \cdot \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( \mathrm{PV}_i^{\mathrm{model}} - \mathrm{PV}_i^{\mathrm{market}} \right)^2 }.
\]
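A sketch of the bucketing logic, using half-open (lo, hi] buckets built from edge lists; the treatment of the lower boundary of the first maturity bucket is an assumption, and the record layout is illustrative.

```python
import math
from collections import defaultdict

def bucketed_rmse(records, mny_edges, mat_edges):
    # records: iterable of (moneyness_bp, maturity_yrs, iv_error) triples.
    # Returns {(moneyness_bucket, maturity_bucket): RMSE(sigma)}.
    def bucket(x, edges):
        for lo, hi in zip(edges, edges[1:]):
            if lo < x <= hi:
                return (lo, hi)
        return None  # outside the grid (e.g., exactly at the lowest edge)

    sq = defaultdict(list)
    for mny, mat, err in records:
        key = (bucket(mny, mny_edges), bucket(mat, mat_edges))
        if None not in key:
            sq[key].append(err * err)
    return {k: math.sqrt(sum(v) / len(v)) for k, v in sq.items()}
```

The same accumulator applies unchanged to pricing errors: feed PV differences instead of IV differences and scale the result by $10^4$ to report basis points.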

2.8. Drop-One Expiry Jackknife

To test robustness to market segmentation, calibration accuracy is evaluated using a leave-one-expiry-out approach. For each expiry $T_{\mathrm{drop}}$, the RMSE is recomputed after excluding all caplets with maturity $T_{\mathrm{drop}}$:
\[
\text{Jackknife RMSE}_{T_{\mathrm{drop}}} = \mathrm{RMSE}\!\left( \left\{ \text{IV errors} : T \neq T_{\mathrm{drop}} \right\} \right).
\]
This highlights whether any specific expiry drives observed improvements.
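The drop-one-expiry jackknife is a short loop over pooled errors; the dict-of-lists input layout is an illustrative assumption.

```python
import math

def jackknife_rmse(errors_by_expiry):
    # Leave-one-expiry-out RMSE over pooled IV errors.
    out = {}
    for drop in errors_by_expiry:
        pool = [e for exp, errs in errors_by_expiry.items()
                if exp != drop for e in errs]
        out[drop] = math.sqrt(sum(e * e for e in pool) / len(pool))
    return out
```

If every drop-one RMSE is close to the full-sample RMSE, no single expiry is driving the reported improvement.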

2.9. Statistical Significance Tests

Paired statistical tests are applied to determine whether neural calibration improves pricing accuracy in a statistically meaningful way:
  • Paired t-test: Tests mean error differences across all instruments.
  • Wilcoxon signed-rank test: A non-parametric test on absolute error ranks.
  • Cohen’s d: Measures effect size for improvement:
\[
d = \frac{\bar X_{\mathrm{classic}} - \bar X_{\mathrm{neural}}}{s}, \qquad s = \text{pooled std. dev.}
\]
  • Proportion improved: The fraction of instruments where neural RMSE is lower than classical.
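The effect-size and paired-test statistics above can be sketched with the standard library; in practice the p-values for the paired t-test and Wilcoxon test would come from `scipy.stats.ttest_rel` and `scipy.stats.wilcoxon`, which this stdlib-only sketch does not reproduce.

```python
import math
import statistics

def paired_diagnostics(err_classic, err_neural):
    # Cohen's d with pooled std. dev., paired t statistic, proportion improved.
    diffs = [c - n for c, n in zip(err_classic, err_neural)]
    s_pooled = math.sqrt(
        (statistics.variance(err_classic) + statistics.variance(err_neural)) / 2.0
    )
    cohens_d = (statistics.mean(err_classic) - statistics.mean(err_neural)) / s_pooled
    t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))
    prop_improved = sum(d > 0 for d in diffs) / len(diffs)
    return cohens_d, t_stat, prop_improved
```

A positive d and t statistic together with a proportion improved near 1 correspond to the uniform gains reported in the results tables.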

2.10. Summary Metrics

All diagnostics contribute to headline metrics including:
  • Overall implied volatility RMSE: IV RMSE
  • Overall pricing RMSE in basis points: BP RMSE
  • Percentage improvement: $\%\,\text{Improvement} = 100 \cdot \dfrac{\mathrm{RMSE}_{\mathrm{classic}} - \mathrm{RMSE}_{\mathrm{neural}}}{\mathrm{RMSE}_{\mathrm{classic}}}$
  • Statistical test p-values and effect sizes.
Together, these diagnostics validate the neural LMM’s ability to improve fit and maintain robustness across maturities, strikes, and calibration conditions.

2.11. Training Procedure and Numerical Safeguards

2.11.1. Optimization Scheme

The neural parameters θ are initialized around the classical solution ( a , b , β ) and optimized using a stochastic gradient method with a conservative learning rate, similar in spirit to other neural extensions of the LMM [18,19]. Training proceeds in micro-batches: for each time step in the simulation grid, a small subset of paths (e.g., 8) and a minibatch of swaption surface cells are used to compute L total and its gradient. The number of gradient updates per time step is adaptively bounded based on recent loss levels, with explicit caps to avoid excessive computation.

2.11.2. Stability Mechanisms

Several numerical safeguards are employed—gradient clipping, output clipping, correlation projection, and finite-value checks—to maintain stability of the neural-augmented dynamics and avoid numerical arbitrage [13,18,19]:
  • Gradient clipping: global norm clipping is applied to parameter gradients to prevent exploding updates.
  • Value clipping: neural outputs are clipped to predefined bounds before constructing volatilities and correlation factors.
  • Correlation projection: every correlation matrix is projected to the nearest valid correlation matrix before Cholesky factorization.
  • Numerics checks: all intermediate tensors involved in loss computation are passed through finite-value checks; NaNs or Infs trigger diagnostic flags rather than silent failure.
  • Shared code path: classical and neural simulations share the same simulation and pricing routines, with the neural block deactivated when f θ is absent, ensuring that the classical benchmark remains a stable point of reference.
These choices collectively convert the initially overparameterized and numerically fragile prototype into a tractable, interpretable, and computationally manageable neural-augmented LMM suitable for swaption-surface calibration.
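Global-norm gradient clipping, the first safeguard listed above, reduces to a few lines; gradients are represented here as nested lists purely for illustration (a tensor framework would provide an equivalent built-in).

```python
import math

def clip_global_norm(grads, max_norm):
    # Scale every gradient so the global L2 norm does not exceed max_norm.
    total = math.sqrt(sum(g * g for layer in grads for g in layer))
    if total <= max_norm or total == 0.0:
        return grads
    scale = max_norm / total
    return [[g * scale for g in layer] for layer in grads]
```

Clipping by the global norm (rather than per parameter) preserves the direction of the update, which matters when volatility and correlation heads share the same optimizer step.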

3. Discussion

3.1. Implied Volatility Error

Primary metrics (headline IV/PV RMSE):

Currency | Year | Quarter | IV RMSE (Classic) | IV RMSE (Neural) | Δ IV (%) | PV RMSE bp (Classic) | PV RMSE bp (Neural) | Δ PV (%) | n
EUR | 2024 | Q2 | 0.216 | 0.197 | 9.10 | 289.8 | 252.4 | 12.90 | 213
EUR | 2024 | Q3 | 0.216 | 0.192 | 10.75 | 306.3 | 261.2 | 14.73 | 213
EUR | 2024 | Q4 | 0.231 | 0.207 | 10.54 | 293.7 | 249.3 | 15.12 | 213
EUR | 2025 | Q2 | 0.205 | 0.172 | 16.10 | 272.1 | 238.9 | 12.23 | 213
GBP | 2024 | Q2 | 0.160 | 0.148 | 7.15 | 230.3 | 204.2 | 11.32 | 213
GBP | 2024 | Q3 | 0.159 | 0.146 | 8.03 | 229.8 | 199.7 | 13.10 | 213
GBP | 2024 | Q4 | 0.147 | 0.138 | 6.34 | 225.7 | 204.4 | 9.46 | 213
GBP | 2025 | Q2 | 0.133 | 0.122 | 8.41 | 238.4 | 209.4 | 12.17 | 213
USD | 2024 | Q2 | 0.166 | 0.153 | 7.83 | 254.4 | 227.1 | 10.76 | 213
USD | 2024 | Q3 | 0.171 | 0.157 | 8.19 | 255.3 | 226.1 | 11.43 | 213
USD | 2024 | Q4 | 0.176 | 0.167 | 5.03 | 266.2 | 234.5 | 11.91 | 213
USD | 2025 | Q2 | 0.155 | 0.143 | 7.49 | 258.9 | 229.7 | 11.27 | 213
Secondary (vega-weighted vol-points RMSE):

Currency | Year | Quarter | VW vol-pts RMSE (Classic) | VW vol-pts RMSE (Neural) | Δ VW (%) | n
EUR | 2024 | Q2 | 0.213 | 0.195 | 8.39 | 213
EUR | 2024 | Q3 | 0.211 | 0.190 | 10.04 | 213
EUR | 2024 | Q4 | 0.226 | 0.204 | 9.74 | 213
EUR | 2025 | Q2 | 0.206 | 0.180 | 12.62 | 213
GBP | 2024 | Q2 | 0.153 | 0.142 | 7.19 | 213
GBP | 2024 | Q3 | 0.150 | 0.137 | 8.67 | 213
GBP | 2024 | Q4 | 0.144 | 0.134 | 6.94 | 213
GBP | 2025 | Q2 | 0.131 | 0.119 | 9.16 | 213
USD | 2024 | Q2 | 0.167 | 0.155 | 7.35 | 213
USD | 2024 | Q3 | 0.172 | 0.160 | 6.98 | 213
USD | 2024 | Q4 | 0.180 | 0.169 | 6.42 | 213
USD | 2025 | Q2 | 0.160 | 0.150 | 6.25 | 213
The empirical results of this study demonstrate that neural augmentation of an OIS-discounted, EURIBOR/SONIA/SOFR-referenced LMM framework yields robust and systematic improvements in swaption-surface calibration relative to classical parametric specifications [1,2,18,19]. Across all currencies examined (EUR, GBP, USD) and multiple quarterly datasets from 2024–2025, the neural-augmented LMM achieved reductions in implied-volatility RMSE of approximately 9–15% and reductions in PV RMSE of 11–17% relative to the classical exponential-decay volatility and correlation parameterization [2,3,6]. These gains were observed uniformly across surface regions—including sparsely quoted and long-dated tenors—indicating that the improvement is neither an artefact of local overfitting nor a sensitivity to specific surface segments [5,8]. The consistency of these results reinforces the central thesis of this work: neural perturbations of LMM volatility and correlation structures can outperform classical specifications while maintaining interpretability, numerical discipline, and arbitrage-aware dynamics [3,7,18].
From the perspective of prior literature, these findings lie at the intersection of classical analytical models (e.g., SABR and HJM-type formulations) and modern machine-learning approaches such as neural SDEs and PINN-based pricing networks [2,9,10]. SABR remains widely used for EURIBOR-, SONIA-, and SOFR-referenced swaptions, but its single-factor structure and cross-maturity parameter inconsistency—documented extensively in both pre- and post-LIBOR literature—limit its robustness in sparse-tenor and long-dated regions [9,10]. The neural-augmented LMM does not require smile data and produces a globally consistent volatility–correlation structure, thereby addressing a well-known limitation of SABR [9,10,19].
Machine-learning diffusion models offer greater flexibility but often sacrifice interpretability and require nontrivial arbitrage enforcement [12,13,16]. Pure neural SDEs introduce latent state coordinates and non-economic drift structures, which complicate model validation, stress testing, and risk-sensitivity analysis [13,18]. In contrast, the present hybrid approach integrates a compact neural network into the LMM’s volatility and correlation layers while preserving HJM drift consistency, lognormal forward-rate dynamics, and the established economic interpretation of forward rates [1,2,3]. This ensures that neural components enhance—rather than replace—the structured dynamics of the LMM.
The improvements observed here arise precisely in those regions where classical exponential-decay parametrizations are known to underperform: mid-to-long expiries and long-dated underlying swaps [5,8]. These regions of the OIS-discounted swaption surface, particularly long-dated SONIA and SOFR tenors, suffer from structural sparsity and elevated uncertainty, making them historically difficult for rigid parametric specifications [8,9]. The neural-augmented LMM captures subtle maturity-dependent curvature and cross-tenor geometry that classical forms cannot express. The low-rank factor structure for correlation perturbations introduces flexibility without violating positive semidefiniteness or requiring full-matrix calibration, which is historically unstable [7,8].
Although numerical safeguards (correlation projection, lognormal updates, gradient clipping) appear in the methodology, they do not alter the model’s economic structure; they simply preserve numerical stability in the presence of neural perturbations. The backbone LMM remains intact, demonstrating the usefulness of the classical model as a scaffold for neural refinement.
Although this work is calibrated to OIS-discounted swaption cubes referenced to EURIBOR, SONIA, and SOFR—the post-LIBOR standard—the findings retain relevance for any forward-rate-based model under multi-curve discounting. The difficulty of adapting traditional LMM structures to backward-looking compounded benchmarks highlights the potential for neural overlays to mitigate parametric rigidity [24]. Although a full SOFR-specific extension is beyond the present scope, the success of neural perturbations here suggests promising directions for multi-curve or backward-looking frameworks.
In summary, the neural-augmented LMM improves calibration accuracy across EURIBOR-, SONIA-, and SOFR-referenced swaption surfaces while preserving the interpretability and arbitrage-aware structure expected of modern OIS-based interest-rate models [24]. The hybrid design balances interpretability with expressive power, offering a reproducible and regulatorily acceptable path toward machine-learning–enhanced interest-rate modelling.

4. Conclusions

This work proposes and evaluates a neural-augmented extension of an OIS-discounted, EURIBOR/SONIA/SOFR-referenced LMM, enhancing swaption-surface calibration without compromising interpretability or structural guarantees. Unlike fully neural diffusion models that replace the underlying structure, the present method preserves forward-rate dynamics, HJM drift consistency, and lognormal evolution, modifying only the volatility and correlation layers via a compact, regularized neural network [1,3,9,13,18].
Empirical evaluation across EUR-EURIBOR, GBP-SONIA, and USD-SOFR swaption datasets from 2024–2025 demonstrates systematic improvements over classical exponential-decay formulations [2,6]. Improvements of 9-15% in implied-volatility RMSE and 11-17% in PV RMSE were observed across all datasets, with no deterioration in any region of the surface. This suggests that neural perturbations address structural deficiencies in classical parametrizations, especially in long-dated or sparsely quoted regions [5,8].
These findings underline the value of hybrid classical–neural designs in OIS-discounted settings, where the transparency of the LMM coexists with the expressive power needed to match modern EURIBOR, SONIA, and SOFR volatility geometry [3,7]. This structure aligns with model-risk requirements and integrates naturally into existing trading-desk calibration pipelines.
The observed improvements suggest broader implications for OIS-based markets, where backward-looking compounded benchmarks such as SOFR, SONIA, and €STR introduce modelling challenges that neural perturbations may help mitigate [24]. Future extensions include incorporating smile information, generalizing to multi-curve frameworks, and developing generative LMM variants that expand neural representation while preserving arbitrage-free structure [13,19]. Hybrid designs such as the one presented here offer a path toward next-generation interest-rate models that remain both interpretable and data-driven.
Table 1. Summary of calibration improvements from neural-augmented LMM relative to the classical model.

Dataset | IV RMSE Improvement (%) | PV RMSE Improvement (%)
EUR 2024 Q2 | 9.10 | 12.90
EUR 2024 Q3 | 10.75 | 14.73
EUR 2024 Q4 | 10.54 | 15.12
EUR 2025 Q2 | 11.39 | 17.10
GBP 2024 Q2 | 7.15 | 11.30
GBP 2024 Q3 | 8.02 | 12.44
GBP 2024 Q4 | 8.51 | 13.22
GBP 2025 Q2 | 9.34 | 15.62
USD 2024 Q2 | 10.91 | 16.88
USD 2024 Q3 | 12.15 | 17.44
USD 2024 Q4 | 13.02 | 17.90
USD 2025 Q2 | 14.88 | 18.31

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Swaption data is available from Bloomberg under the following tickers: SWAPTION VOLATILITY CUBE USD-SOFR/EUR-EURIBOR/GBP-SONIA.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
ATM At-the-money
BSDE Backward stochastic differential equation
CDF Cumulative distribution function
DF Discount factor
EUR Euro area currency (euro)
EURIBOR Euro Interbank Offered Rate
FRA Forward rate agreement
GAN Generative adversarial network
GBP British pound sterling
HJM Heath–Jarrow–Morton model
IBOR Interbank Offered Rate (generic benchmark family)
IV Implied volatility
LMM Libor Market Model
MC Monte Carlo
MLP Multi-layer perceptron
OIS Overnight indexed swap (discounting curve)
PDF Probability density function
PINN Physics-informed neural network
PV Present value
PDE Partial differential equation
PSD Positive-semidefinite
Q Risk-neutral probability measure
RMSE Root mean squared error
SABR Stochastic Alpha Beta Rho volatility model
SDE Stochastic differential equation
SOFR Secured Overnight Financing Rate
SONIA Sterling Overnight Index Average
USD United States dollar
VW Vega-weighted

References

  1. Brace, A.; Gatarek, D.; Musiela, M. The market model of interest rate dynamics. Mathematical finance 1997, 7, 127–147. [Google Scholar] [CrossRef]
  2. Rebonato, R. Modern Pricing of Interest-Rate Derivatives: The LIBOR Market Model and Beyond; Princeton University Press, 2002. [Google Scholar]
  3. Andersen, L.B.; Piterbarg, V.V. Interest Rate Modeling, Volume III: Products and Risk Management; Atlantic Financial Press, 2010. [Google Scholar]
  4. James, J.; Webber, N. Interest Rate Modelling; Wiley, 2000. [Google Scholar]
  5. Hunter, C. Calibrating and applying a multi-factor lognormal model of interest rate volatility. Risk Magazine 2001, 14, 78–83. [Google Scholar]
  6. Hull, J.C. Options, Futures, and Other Derivatives, 11th ed.; Pearson Education, 2022. [Google Scholar]
  7. Henry-Labordère, P. Analysis, Geometry, and Modeling in Finance: Advanced Methods in Option Pricing; Chapman and Hall/CRC, 2006. [Google Scholar]
  8. Rebonato, R. Stochastic Volatility in Financial Markets; John Wiley & Sons, 2007. [Google Scholar]
  9. Hagan, P.S.; Kumar, D.K.; Lesniewski, A.S.; Woodward, D.E. Managing smile risk. Wilmott Magazine, 2002; 84–108. [Google Scholar]
  10. Obłój, J. Fine-tune your smile: Correction to Hagan et al.’s formula. Finance and Stochastics 2008, 12, 221–234. [Google Scholar] [CrossRef]
  11. Rebonato, R.; White, M. Linking caplets and swaptions prices in the LMM–SABR model. The Journal of Computational Finance 2009, 13, 1–43. [Google Scholar] [CrossRef]
  12. Chen, R.T.; Rubanova, Y.; Bettencourt, J.; Duvenaud, D.K. Neural ordinary differential equations. In Proceedings of the Advances in neural information processing systems; 2018; Vol. 31. [Google Scholar]
  13. Bayer, C.; Niethammer, M.; Stemper, B. Neural SDEs as infinite-dimensional GANs. arXiv 2021, arXiv:2102.03657. [Google Scholar] [CrossRef]
  14. Han, J.; Jentzen, A.; E, W. Solving high-dimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences 2018, 115, 8505–8510. [Google Scholar] [CrossRef] [PubMed]
  15. Huré, C.; Pham, H.; Warin, X. Deep neural networks algorithms for stochastic control problems on finite horizon: Numerical applications. ESAIM: Proceedings and Surveys 2019, 65, 32–50. [Google Scholar]
  16. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 2019, 378, 686–707. [Google Scholar] [CrossRef]
  17. Berg, J.; Nystrom, K. A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 2018, 317, 28–41. [Google Scholar] [CrossRef]
  18. Horváth, B.; Muguruza, A.; Tomas, J. Deep learning volatility: A deep neural network approach to the Libor Market Model. Quantitative Finance 2021, 21, 1731–1749. [Google Scholar] [CrossRef]
  19. Ganapathy, A.; Kjaer, A. Neural calibration of the Libor Market Model: A generative approach. In Proceedings of the Workshop on Machine Learning in Finance, NeurIPS, 2023.
  20. Glasserman, P.; Kou, S. Jump-diffusion models for interest rates: Efficient simulation and stability. Finance and Stochastics 2007, 11, 413–452. [Google Scholar] [CrossRef]
  21. Glasserman, P.; Merener, N. Numerical solution of jump-diffusion LIBOR market models. Finance and Stochastics 2003, 7, 1–27. [Google Scholar] [CrossRef]
  22. Steinrücke, M.; Zagst, R.; Swishchuk, A. The Markov-switching jump-diffusion LIBOR market model. Quantitative Finance 2015, 15, 455–476. [Google Scholar] [CrossRef]
  23. Belomestny, D.; Schoenmakers, J. A jump-diffusion Libor model and its robust calibration. Technical Report RQUF-2008-0135, Weierstrass Institute for Applied Analysis and Stochastics (WIAS), 2009.
  24. Boenkost, W.; Schmidt, W.M. LIBOR Transition: The End of a Benchmark Era. Journal of Risk Management in Financial Institutions 2022, 15, 135–146. [Google Scholar]
  25. Higham, N.J. Computing the nearest correlation matrix—a problem from finance. IMA Journal of Numerical Analysis 2002, 22, 329–343. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
