Regime-Dependent Performance of Deterministic and Stochastic Interest Rate Models in Annuity Valuation: Cross-Country Evidence from the U.S., Italy, and India

Aidana Mashrapova; Viktoriya Nurlanova; Dongming Wei

doi:10.20944/preprints202606.0341.v1

Submitted: 02 June 2026

Posted: 04 June 2026


Abstract

This paper examines how interest rate model specification affects annuity present value estimation. Using annual real interest rate data for the United States, Italy, and India from 1991 to 2021, we compare nine deterministic and stochastic interest rate models within a unified discrete-time valuation framework: the historical path, constant mean, piecewise constant, piecewise linear, cubic polynomial, piecewise cubic, ARIMA, Vasicek, and Cox–Ingersoll–Ross (CIR) specifications. Structural breakpoints are identified using Bai–Perron tests, and annuity values are computed under the portfolio rate method. The principal contribution is a cross-country out-of-sample evaluation design that spans all nine specifications across three qualitatively distinct interest rate regimes, generating regime-contingent guidance for model selection. As part of this design, we apply and evaluate piecewise linear and piecewise cubic temporal interest-rate representations within the valuation framework, enabling structural regime shifts and within-regime dynamics to be captured simultaneously. Results show that model performance is strongly regime-dependent. In the United States, piecewise constant and piecewise linear models provide the best out-of-sample performance. In Italy, ARIMA outperforms all alternatives, with Diebold–Mariano tests confirming its statistically significant advantage. In India’s volatile, non-trending environment, trend-extrapolating deterministic models fail, while the constant mean and Vasicek models generalize better. A mortality-adjusted robustness check confirms that interest rate model risk remains material after survival probabilities are incorporated. Present value errors reach approximately 17%, demonstrating the practical importance of model selection for annuity pricing, insurance reserving, and pension liability management under Solvency II and IFRS 17.

Keywords: annuity valuation; interest rate risk; model risk; regime-dependent modeling; stochastic interest rates; Vasicek model; CIR model; ARIMA; piecewise regression; actuarial reserving; structural breaks; Bai–Perron test

Subject: Computer Science and Mathematics - Other

1. Introduction

The valuation of annuities depends critically on interest rate modeling, as it determines the discounting of future cash flows. Traditional actuarial models often use a constant or predetermined interest rate (Kellison 2009), which simplifies calculations but fails to account for the temporal fluctuations observed in real financial markets. Stochastic interest rate models, such as those proposed by Vasicek (1977) and Cox et al. (1985), capture market variability but introduce significant computational complexity and reduced transparency. A key open question is whether the added complexity of stochastic models translates into meaningfully better annuity valuations—or whether well-specified deterministic alternatives can match or exceed their performance at lower cost.

This paper investigates that question by comparing nine interest rate model specifications—including two flexible deterministic specifications not commonly used in standard annuity valuation practice—within a unified discrete-time framework. While existing deterministic approaches generally rely on constant or piecewise constant interest rates, these models overlook within-period variations. To address this gap, we apply and evaluate piecewise linear and piecewise cubic representations of the interest rate process, estimated on Bai–Perron-identified regime boundaries. These specifications enable simultaneous capture of structural level shifts and smooth within-regime dynamics, combining the interpretability of deterministic models with greater temporal flexibility. The spirit of this approach is related to the parsimonious representation philosophy of Nelson and Siegel (1987), who demonstrated that a small number of parameters can compactly describe the shape of interest rate curves; our contribution applies analogous parsimony reasoning to the temporal, rather than the maturity, dimension of interest rates. Crucially, we evaluate all models not only on in-sample fit but also on out-of-sample forecasting accuracy and annuity present value errors—the metrics that matter most for actuarial practice. The importance of out-of-sample evaluation is underscored by Diebold and Li (2006), who show that models with strong in-sample fit do not necessarily forecast well, a finding our cross-country results confirm in the annuity valuation context.

Interest rate fluctuations, driven by macroeconomic and financial factors, are critical for accurate annuity pricing. Since discounting is multiplicative, early interest rates affect all subsequent discount factors, making the temporal structure of rates essential for long-term contracts. Ignoring this structure can lead to substantial valuation errors, particularly for long-term liabilities.

Our empirical analysis uses real interest rate data from 1991 to 2021 for three countries—the United States, Italy, and India—drawn from the World Bank database (World Bank 2024). This cross-country design is chosen to span qualitatively distinct interest rate regimes: a gradual, sustained decline (U.S.), a rapid and pronounced structural decline (Italy), and a high-volatility, non-trending environment (India). We construct nine model specifications—historical path, constant mean, piecewise constant, piecewise linear, cubic polynomial, piecewise cubic, ARIMA, Vasicek, and CIR—and evaluate each within the same discrete-time framework using the portfolio rate method as the benchmark. The CIR model is estimated using the exact non-central chi-squared maximum likelihood estimator (Cox et al. 1985), placing it on the same rigorous statistical footing as the Vasicek model.

The findings are strongly regime-dependent. In the United States, piecewise constant and piecewise linear models generalize best out-of-sample, outperforming all stochastic alternatives. In Italy, ARIMA outperforms all competitors out-of-sample—an advantage that is statistically confirmed by Diebold–Mariano tests (Harvey et al. 1997)—as its local mean-tracking prevents trend over-extrapolation. In India, trend-extrapolating deterministic models fail catastrophically out-of-sample, while the constant mean and Vasicek models generalize considerably better. These results suggest that the choice between deterministic and stochastic frameworks should be guided by the structural stability of the interest rate regime, rather than by a universal preference for either approach.

The findings carry direct implications for insurance reserving and pension fund management. The valuation errors documented across model specifications—ranging from approximately 9.6% overestimation under the constant mean model to approximately 2% under the Vasicek model (U.S.), and up to approximately 17% for Italy’s constant mean model—represent material model risk exposures, measured against the historical path benchmark. These results are reinforced by a mortality-adjusted robustness check, which confirms that interest rate model risk is not offset by incorporating survival probabilities—a finding consistent with Ngugnie Diffouo and Devolder (2020), who demonstrate that longevity and interest rate risks interact materially in determining insurer solvency capital. Our results underscore the importance of regime diagnosis as a precondition for model selection, and provide practitioners with a data-driven basis for choosing between deterministic and stochastic frameworks under regulatory requirements such as Solvency II and IFRS 17.

The principal contributions of this paper are twofold. First, we provide a cross-country out-of-sample evaluation framework that compares nine interest rate model specifications across three qualitatively distinct regimes, with formal statistical testing via the Diebold–Mariano procedure and a mortality-adjusted robustness check—producing regime-contingent guidance for model selection that is directly applicable to actuarial practice. Second, as part of this evaluation, we apply and assess piecewise linear and piecewise cubic temporal representations of the interest rate process within a unified discrete-time annuity valuation framework, combining Bai–Perron regime identification with within-regime polynomial approximation. The evaluation design and the cross-country regime comparison constitute the principal contributions to the existing literature.

The remainder of the paper is structured as follows. Section 2 reviews the related literature. Section 3 presents the materials and methods, comprising the data, the annuity valuation framework, the piecewise regression methodology, the discrete-time valuation framework for time-varying interest rates, and the alternative interest rate representations. Section 4 reports the empirical results, including in-sample fit, out-of-sample performance, and cross-country robustness. Section 5 concludes.

2. Literature Review

The valuation of fixed-income liabilities, particularly annuities, is closely tied to interest rate modeling. Early models assumed constant or predetermined interest rates (Kellison 2009), which offered analytical simplicity but failed to capture the dynamic nature of real-world interest rates. While deterministic models remain widely used due to their simplicity, they often lead to inaccurate valuations, particularly for long-term financial products.

The introduction of stochastic interest rate models by Vasicek (1977) and Cox et al. (1985) provided a significant advancement by incorporating mean-reverting processes, allowing for arbitrage-free valuation. These models improve realism by accounting for market uncertainty but are computationally complex, requiring evaluation of expectations over stochastic discount factors, which reduces transparency and complicates practical implementation. Hamilton (1989) further demonstrated that macroeconomic time series, including interest rates, are better characterized by discrete regime shifts than by linear stationary processes, providing formal motivation for structural segmentation of interest rate data. This insight has been applied directly to interest rate modeling, with empirical evidence confirming that structural change dynamics materially affect interest rate forecasts and bond pricing (Bai and Perron 1998, 2002). The regime-switching framework has since been extended to a range of financial applications: Milidonis and Chisholm (2024) develop a regime-switching generalization of the Merton structural default risk model and demonstrate, using both simulated and empirical data, that allowing the asset-return distribution to switch between states produces materially higher and more timely default probability estimates than the standard single-regime model in the period preceding credit rating downgrades.

A parallel literature has developed parsimonious parametric methods for representing the term structure of interest rates. Nelson and Siegel (1987) introduced a three-component exponential model for the forward rate curve—comprising level, slope, and curvature factors—that captures the monotonic, humped, and S-shaped forms typically observed in yield curves. Their model demonstrates that a small number of parameters suffices to describe the essential shape of the term structure at a given point in time. McCulloch (1971) and McCulloch (1975) had earlier introduced cubic spline regression to fit the discount function across maturities, showing that piecewise polynomial specifications outperform global polynomial alternatives; Fernández-Rodríguez (2006) extended this approach through free-knot spline methods for interest rate term-structure modeling. The term structure has also been studied under conditions of Knightian uncertainty: Romagnoli and Santoro (2017) embed ambiguity premia in a Heath–Jarrow–Morton framework and show that government-policy uncertainty materially shifts bond yields and zero-coupon bond prices, with empirically estimated ambiguity parameters tracking observed yield movements around events such as Brexit and Italian constitutional referenda. While this body of work applies flexible functional form methods across the maturity dimension, the present paper adapts analogous logic to the temporal dimension, using piecewise regression to capture structural shifts and within-regime dynamics in the time path of interest rates.

The out-of-sample forecasting performance of interest rate models has received increasing attention following the work of Diebold and Li (2006), who reinterpreted the Nelson–Siegel model as a dynamic factor framework with time-varying level, slope, and curvature factors governed by autoregressive processes. Their central finding—that models with strong in-sample fit do not necessarily forecast well out-of-sample, and that parsimonious specifications often outperform richer competitors at longer horizons—is directly relevant to the present paper. We find an analogous result in the annuity valuation context: the piecewise cubic model achieves superior in-sample fit but performs poorly out-of-sample across all three countries, while simpler piecewise constant and mean-reverting specifications generalize better. The insight that shrinkage and parsimony improve out-of-sample performance, emphasized by Diebold and Li (2006) in the yield curve forecasting context, thus extends to interest rate modeling for actuarial purposes.

Recent studies have extended stochastic models to annuity and insurance valuation, demonstrating their impact on financial outcomes. Fergusson et al. (2025) show that stochastic interest rate models can significantly affect reserve estimates in insurance, while Goudenege et al. (2025) explore their influence on contract valuations in Lévy models. The interaction between interest rate model risk and longevity risk in annuity reserving has been examined by Ngugnie Diffouo and Devolder (2020), who compute solvency capital requirements for lifetime, deferred, and term annuity products under the Solvency II framework using the Hull–White mortality model. Their findings confirm that the uncertain level of future liabilities—relative to the initially expected value—constitutes material model risk, and that the choice of product and modeling assumptions significantly affects the capital an insurer must hold. Our paper complements this line of work by isolating the contribution of interest rate model specification to present value errors and demonstrating that this source of model risk persists even after survival probabilities are incorporated. In discrete time, models such as Markov chains and jump processes (Li et al. 2017; Mo et al. 2023) capture non-linearities in interest rate movements, while empirical models such as ARIMA (Box et al. 2015; Tsay 2010) offer a more tractable approach to modeling temporal dependence.

Despite these advancements, deterministic models applied to annuity valuation remain largely restricted to constant or piecewise constant rates, which fail to capture within-period dynamics. Stochastic models, while more flexible, are computationally intensive and lack transparency, making them less accessible for routine actuarial applications. This creates a practical gap between modeling accuracy and implementation.

This paper bridges that gap by applying and evaluating piecewise linear and cubic representations of the interest rate process within a unified discrete-time annuity valuation framework, evaluated across three distinct interest rate regimes. The approach combines the structural break literature for regime identification with the spline literature for flexible polynomial approximation. Critically, all specifications are assessed on out-of-sample forecasting accuracy and annuity present value errors—the metrics most relevant for actuarial practice—to determine whether the appropriate modeling choice is universal or regime-dependent.

3. Materials and Methods

This section describes the data and the methodological framework used in the empirical analysis. Section 3.1 details the interest rate data used for the United States, Italy, and India. Section 3.2 presents the annuity valuation framework under both fixed and time-varying interest rates. Section 3.3 introduces the piecewise regression methodology. Section 3.4 develops the discrete-time valuation framework for annuities under time-varying interest rates. Section 3.5 introduces nine alternative interest rate representations — constant mean, historical path, piecewise constant, piecewise linear, cubic polynomial, piecewise cubic, ARIMA, Vasicek, and CIR — that are compared in the empirical analysis.

3.1. Data

The empirical analysis uses annual real interest rate data for the United States, Italy, and India over the period 1991–2021, obtained from the World Bank’s World Development Indicators database (World Bank 2024). Real interest rates are defined as the lending interest rate adjusted for inflation as measured by the GDP deflator. The three countries are selected to span a range of interest rate regimes: the United States and Italy provide examples of developed economies exhibiting a declining trend over the sample period, while India represents a high-volatility, non-trending emerging-market environment. This cross-country design allows the analysis to assess whether the relative performance of deterministic and stochastic interest rate models is sensitive to the structural stability of the underlying rate process. For each country, the sample is partitioned into an in-sample estimation period of 1991–2015 and an out-of-sample evaluation period of 2016–2021. The full data series is reported in Table A1.

We acknowledge that real lending rates—the World Bank series used here—may diverge from the risk-free or swap rates more commonly employed in insurance reserving practice. The World Bank series is chosen for its cross-country comparability and public availability over the full 1991–2021 window; equivalent term-structure data with consistent definitions across all three countries are not readily available for this period. The relative model rankings documented here reflect the structural properties of each rate series—its trend, volatility, and regime stability—and these properties are likely to apply to alternative rate series sharing similar characteristics within each country. Alternative rate series, including nominal lending rates or central bank policy rates, may nonetheless yield different absolute present-value levels and, in high-inflation periods, different relative model rankings; we return to this point in Section 5.

3.2. Annuities with Fixed and Variable Interest Rates

In this section, we examine the valuation of annuities under both constant and time-varying interest rates. While constant rates offer a simple analytical framework, more realistic scenarios require modeling time-varying interest rates.

3.2.1. Fixed Interest Rate

We consider an annuity-immediate with payments of 1 made at the end of each period for n periods. Let i denote the constant effective interest rate per period, and let the corresponding discount factor be

v = \frac{1}{1 + i}

(1)

The present value of the annuity is calculated by discounting each payment back to time 0:

a_{\bar{n} | i} = \sum_{k = 1}^{n} v^{k}

(2)

The notation $a_{\bar{n} | i}$ denotes the present value of an annuity-immediate, where payments are made at the end of each period. The sum in equation (2) is a geometric series and can be written in closed form:

a_{\bar{n} | i} = \frac{1 - v^{n}}{i}

(3)

Although this formulation is analytically convenient, it assumes a constant interest rate. In practice, interest rates fluctuate over time, necessitating more flexible valuation methods.
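As a quick numerical check of equations (2) and (3), the following minimal sketch evaluates both the term-by-term sum and the closed form. The function names and the values of i and n are illustrative assumptions, not taken from the paper.

```python
def annuity_pv_sum(i: float, n: int) -> float:
    """Equation (2): sum of v**k for k = 1..n, with v = 1 / (1 + i)."""
    v = 1.0 / (1.0 + i)
    return sum(v ** k for k in range(1, n + 1))


def annuity_pv_closed(i: float, n: int) -> float:
    """Equation (3): (1 - v**n) / i. Requires i != 0."""
    v = 1.0 / (1.0 + i)
    return (1.0 - v ** n) / i


# Illustrative inputs (assumed): a 10-year annuity-immediate at 5% per period.
print(annuity_pv_sum(0.05, 10))
print(annuity_pv_closed(0.05, 10))
```

The two functions should agree up to floating-point rounding for any nonzero rate; at i = 0 the closed form is undefined and the sum reduces to n.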

3.2.2. Time-Varying Interest Rate

When interest rates vary over time, the valuation approach must be adjusted. Let $i_{k}$ denote the interest rate applicable to period k. The present value of the annuity is computed by discounting each payment using the rates for each respective period.

The Yield Curve Method

Under the yield curve approach, each payment is discounted using the spot rate corresponding to its maturity:

a_{\bar{n} | i} = \sum_{k = 1}^{n} {(1 + i_{k})}^{- k}

(4)

Here, $i_{k}$ is the spot rate for maturity k, and it applies over the entire period from time 0 to time k.

The Portfolio Rate Method

In the portfolio rate method, payments are discounted sequentially using the one-period rates applicable to each period. The present value is:

a_{\bar{n} | i} = \sum_{k = 1}^{n} \prod_{s = 1}^{k} {(1 + i_{s})}^{- 1}

(5)

This formulation reflects the cumulative effect of the realized interest rates, discounting each payment step-by-step from time k back to time 0. The portfolio rate method is used as the benchmark in the empirical analysis, as it provides a more realistic representation of the discounting process for an annuity funded through a portfolio whose return evolves period by period.
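The difference between equations (4) and (5) can be sketched in a few lines. The function names and the two-period rate sequences below are hypothetical and serve only to show how the two discounting conventions diverge and why the ordering of rates matters under the portfolio rate method.

```python
def pv_yield_curve(spot_rates):
    """Equation (4): payment k is discounted at the spot rate for maturity k."""
    return sum((1.0 + i_k) ** (-k) for k, i_k in enumerate(spot_rates, start=1))


def pv_portfolio_rate(one_period_rates):
    """Equation (5): payment k is discounted by the running product of one-period factors."""
    pv, discount = 0.0, 1.0
    for i_s in one_period_rates:
        discount /= 1.0 + i_s  # cumulative discount factor from time k back to time 0
        pv += discount
    return pv


# Hypothetical rate sequences: the same two rates in opposite order.
low_first = [0.02, 0.06]
high_first = [0.06, 0.02]
print(pv_portfolio_rate(low_first), pv_portfolio_rate(high_first))
print(pv_yield_curve(low_first), pv_yield_curve(high_first))
```

Under the portfolio rate method the final discount factor is the same for both orderings, but the first payment is discounted at a different rate, so a low early rate yields the higher present value. With a constant rate, both functions reduce to the closed form in equation (3).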

3.3. Piecewise Interest Rate Models

To value annuities under time-varying interest rates, it is essential to model the temporal structure of the annual real interest rate process. In this paper, we use piecewise regression techniques, which offer flexibility by capturing both structural changes and within-regime dynamics.

3.3.1. Piecewise Regression Framework

Let $x_{0} < x_{1} < \dots < x_{m}$ denote a sequence of breakpoints that divide the time domain into m intervals. In the piecewise linear model, the interest rate function is defined as:

r (t) = a_{j} t + b_{j}, t \in (x_{j - 1}, x_{j}]

(6)

where $a_{j}$ and $b_{j}$ are interval-specific parameters. This formulation allows the slope of the interest rate to vary across segments, capturing local trends.

For greater flexibility, we also consider a piecewise cubic specification:

r (t) = a_{j} + b_{j} t + c_{j} t^{2} + d_{j} t^{3}, t \in (x_{j - 1}, x_{j}]

(7)

Here, $a_{j}, b_{j}, c_{j}, d_{j}$ are the coefficients for the j-th interval. The cubic model allows for nonlinear behavior and provides a smoother approximation of interest rate dynamics within each regime.

3.3.2. Empirical Specification

The time index is defined such that $t = 0$ corresponds to 1991, $t = 1$ to 1992, and so on. The Bai–Perron breakpoints divide the sample into regimes, denoted by $T_{j}$. Within each regime $T_{j}$, a separate linear or cubic polynomial is estimated. Thus, the parameters are regime-specific rather than year-specific. We use the index j for regimes and reserve the index k for annuity-payment periods.

Under the piecewise linear model, the interest rate is specified as:

r_{t} = a_{j} t + b_{j} + ε_{t}, t \in T_{j} .

(8)

For the piecewise cubic model, the interest rate is specified as:

r_{t} = c_{j, 0} + c_{j, 1} t + c_{j, 2} t^{2} + c_{j, 3} t^{3} + ε_{t}, t \in T_{j} .

(9)

Following a Bai–Perron structural break test, which identifies optimal breakpoints at 1994, 2000, and 2008 based on BIC minimization, the U.S. time path is divided into four regimes: 1991–1994, 1995–2000, 2001–2008, and 2009–2021. This data-driven segmentation provides formal justification for the piecewise structure. The estimated coefficients are presented in Section 3.5.
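The regime-wise estimation in equation (8) can be sketched as one ordinary least squares line per regime on the global time index. This is a minimal illustration, not the authors' code: the function names are hypothetical, the rate series is synthetic (generated from made-up regime lines), and only the break years 1994, 2000, and 2008 come from the text.

```python
def fit_line(ts, rs):
    """OLS slope a and intercept b for r = a * t + b."""
    n = len(ts)
    t_bar, r_bar = sum(ts) / n, sum(rs) / n
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (r - r_bar) for t, r in zip(ts, rs))
    a = sxy / sxx
    return a, r_bar - a * t_bar


def fit_piecewise_linear(years, rates, break_years, base_year=1991):
    """Fit one line per regime; each break year is treated as the last year of its regime."""
    edges = [base_year - 1] + list(break_years) + [max(years)]
    fits = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        pts = [(y - base_year, r) for y, r in zip(years, rates) if lo < y <= hi]
        ts, rs = zip(*pts)
        fits.append(((lo + 1, hi), fit_line(ts, rs)))
    return fits


# Synthetic data (assumed): exact regime lines in the global index t = year - 1991.
true_lines = {1994: (-0.5, 6.0), 2000: (0.2, 3.0), 2008: (-0.3, 7.0), 2021: (0.0, 1.5)}
years = list(range(1991, 2022))
rates = []
for y in years:
    a, b = next(v for end, v in sorted(true_lines.items()) if y <= end)
    rates.append(a * (y - 1991) + b)

for regime, (a_j, b_j) in fit_piecewise_linear(years, rates, [1994, 2000, 2008]):
    print(regime, round(a_j, 4), round(b_j, 4))
```

Because the synthetic series has no noise, the fit recovers the generating slopes and intercepts; with real data the residuals within each regime would feed the error term in equation (8).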

3.4. Annuity Valuation under Time-Varying Interest Rates

This section presents the valuation framework for annuities under time-varying interest rates. Since interest rate data is observed annually and annuity payments occur at the end of each year, the analysis is conducted in discrete time. This framework is used throughout the empirical analysis.

3.4.1. Discrete-Time Valuation

Let $i_{k}$ denote the effective interest rate for period k. The valuation formulas below restate equations (4) and (5) from Section 3.2 in the discrete-time context used throughout the empirical analysis. Under the yield curve approach, each payment is discounted using the spot rate corresponding to its maturity:

a_{\bar{n} | i} = \sum_{k = 1}^{n} {(1 + i_{k})}^{- k}

(10)

Alternatively, under the portfolio rate method, payments are discounted sequentially using one-period rates. The present value is:

a_{\bar{n} | i} = \sum_{k = 1}^{n} \prod_{s = 1}^{k} {(1 + i_{s})}^{- 1}

(11)

In the portfolio rate method, the discount factor for each payment at time k is the product of one-period discount factors, reflecting the cumulative effect of interest rate changes and maintaining the sequential compounding structure.

3.4.2. Benchmark Valuation Framework

Although both methods are reported for comparison, the portfolio rate method is adopted as the benchmark in the empirical analysis. It more accurately reflects the annuity valuation when interest rates vary over time, as it discounts each payment period-by-period using observed or estimated one-period rates.

3.4.3. Implementation with Alternative Interest Rate Models

To assess the sensitivity of annuity values to different interest rate models, we insert the alternative representations of the interest rate process into the portfolio rate method. Specifically, annuity present values are calculated using models such as the historical path, constant mean, piecewise constant, piecewise linear, cubic polynomial, piecewise cubic, ARIMA, Vasicek, and CIR.

For each model, the fitted annual interest rates are treated as the sequence $\{i_{k}\}_{k = 1}^{n}$, and the corresponding annuity present value is derived from equation (11). This ensures that any differences in valuation arise solely from the way the temporal structure of interest rates is modeled.

For stochastic models (ARIMA, Vasicek, CIR), annuity present values are computed from the sequence of conditional expected rates, that is, the expected trajectory $\{E [r_{k}]\}$ given estimated parameters and the observed starting value, rather than from the expected value of the stochastic discount factor $E \left[\prod_{s = 1}^{k} {(1 + r_{s})}^{- 1}\right]$. These two quantities differ in general due to Jensen’s inequality; the closed-form bond pricing formula of the Vasicek model, for instance, would yield the exact expected discount factor. We adopt the expected rate path approach for three reasons: (i) it places deterministic and stochastic models on the same computational footing, so that any differences in present values reflect genuine differences in projected rate trajectories rather than differences in computational convention; (ii) it corresponds to the “best-estimate” rate assumption used in standard actuarial reserving practice under Solvency II and IFRS 17; and (iii) for the short-to-medium horizons and low-volatility parameters obtained in this study, the Jensen convexity correction between the two approaches is small relative to cross-model differences in projected rate levels, and does not materially affect the comparative rankings. Therefore, the analysis should be interpreted as a best-estimate forecasting comparison for annuity valuation, not as a risk-neutral stochastic pricing exercise.
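The gap between the two conventions can be illustrated with a small simulation. The sketch below is an assumption-laden illustration rather than the paper's procedure: it uses a Vasicek process with hypothetical parameters (deliberately more volatile than the low-volatility estimates the text describes, so that the gap is visible), simulates annual rates from the exact Gaussian transition, and compares the present value from the expected rate path with the Monte Carlo average of path-wise present values.

```python
import math
import random


def pv_portfolio_rate(rates):
    """Equation (11): discount each unit payment by the running product of one-period factors."""
    pv, discount = 0.0, 1.0
    for r in rates:
        discount /= 1.0 + r
        pv += discount
    return pv


def vasicek_expected_path(r0, kappa, theta, n):
    """Conditional means E[r_k | r_0] = theta + (r0 - theta) * exp(-kappa * k), k = 1..n."""
    return [theta + (r0 - theta) * math.exp(-kappa * k) for k in range(1, n + 1)]


def simulate_vasicek_path(r0, kappa, theta, sigma, n, rng):
    """Exact annual-step draws from the Gaussian Vasicek transition density."""
    decay = math.exp(-kappa)
    step_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * kappa))
    path, r = [], r0
    for _ in range(n):
        r = theta + (r - theta) * decay + step_sd * rng.gauss(0.0, 1.0)
        path.append(r)
    return path


# Hypothetical parameters chosen to make the gap visible; not the paper's estimates.
r0, kappa, theta, sigma, n_years, n_paths = 0.03, 0.3, 0.02, 0.02, 20, 5000
rng = random.Random(42)

pv_expected_rates = pv_portfolio_rate(vasicek_expected_path(r0, kappa, theta, n_years))
pv_expected_discount = sum(
    pv_portfolio_rate(simulate_vasicek_path(r0, kappa, theta, sigma, n_years, rng))
    for _ in range(n_paths)
) / n_paths

print(f"PV from expected rate path:    {pv_expected_rates:.4f}")
print(f"PV from mean discount factors: {pv_expected_discount:.4f}")
```

Because the discount factor is a convex function of the rates, the averaged path-wise present value should exceed the value computed from the expected rate path; the size of the difference shrinks as the volatility parameter is reduced.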

3.5. Interest Rate Representations

This section introduces the alternative representations of the interest rate process used in the empirical analysis. Although all models are constructed using the same historical dataset, they differ in how they model the temporal structure of interest rates. This allows us to isolate the impact of model specification on annuity valuation.

3.5.1. Constant Mean Representation

The simplest model assumes that the interest rate is constant over time and equal to the sample mean:

\bar{r} = \frac{1}{T} \sum_{t = 1}^{T} r_{t}

(12)

The interest rate process is then defined as:

r (t) = \bar{r}

(13)

This model removes all temporal variation and serves as the simplest baseline for comparison. While analytically convenient, it ignores both the timing and evolution of interest rates.

3.5.2. Historical Path

The historical path model preserves the observed interest rates without smoothing:

r (t) = r_{t}

(14)

This approach retains the full temporal structure of the data and serves as the empirical benchmark for all present value comparisons. Since annuity valuation depends on the cumulative effect of discount factors, preserving the exact timing of interest rates is critical.

3.5.3. Piecewise Constant Model

To capture structural changes in interest rates, the U.S. sample is divided into four regimes based on the Bai–Perron structural break test: 1991–1994, 1995–2000, 2001–2008, and 2009–2021. Within each regime, the interest rate is assumed constant and equal to the sample mean:

{\bar{r}}_{j} = \frac{1}{| T_{j} |} \sum_{t \in T_{j}} r_{t}

(15)

The interest rate process is therefore given by:

r (t) = {\bar{r}}_{j}, t \in T_{j}

(16)

This specification captures major shifts in the level of interest rates but does not account for within-regime variation, as illustrated in Figure 1.
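Equations (12) to (16) amount to replacing each observed rate with the mean of its regime, with the constant mean model as the special case of a single regime. A minimal sketch follows, assuming the years are sorted in ascending order; the function name and the six-year series are hypothetical.

```python
def regime_mean_path(years, rates, break_years):
    """Replace each rate with the mean of its regime (equations 15-16).

    Each break year is treated as the last year of its regime. An empty
    break list gives the constant mean representation (equations 12-13).
    Assumes `years` is sorted in ascending order.
    """
    edges = [min(years) - 1] + list(break_years) + [max(years)]
    path = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        block = [r for y, r in zip(years, rates) if lo < y <= hi]
        path.extend([sum(block) / len(block)] * len(block))
    return path


# Hypothetical series with one break after 2002.
years = [2001, 2002, 2003, 2004, 2005, 2006]
rates = [1.0, 3.0, 2.0, 4.0, 6.0, 5.0]
print(regime_mean_path(years, rates, [2002]))  # piecewise constant
print(regime_mean_path(years, rates, []))      # constant mean
```

The resulting sequence can be passed directly to the portfolio rate formula in equation (11), which is how the two representations enter the valuation comparison.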

3.5.4. Piecewise Linear Model

To incorporate within-regime dynamics, a linear trend is estimated separately for each regime. The interest rate is specified as:

r_{t} = a_{j} t + b_{j} + ε_{t}, t \in T_{j}

(17)

Here, $a_{j}$ and $b_{j}$ are regime-specific parameters, and $ε_{t}$ is the error term. This model captures gradual trends while preserving structural breaks. It is more flexible than the piecewise constant model because it incorporates within-regime variation. The estimated coefficients are reported in Table 1, with the fitted model illustrated in Figure 2.

To facilitate comparison with stochastic models, which naturally produce distributional forecasts, we quantify parameter uncertainty in the piecewise linear specification via a residual bootstrap. Residuals from the fitted model are resampled with replacement across $B = 2{,}000$ replications, and the regime-specific coefficients are re-estimated in each replication. Table 2 reports the 95% bootstrap confidence intervals.

The confidence intervals are narrow for regimes with more observations (Regimes 2–4) and wider for Regime 1, which contains only four data points. The slope estimates in all four regimes have confidence intervals that span zero, consistent with the near-flat rate environment of the estimation period: the piecewise linear model captures level differences across regimes primarily through the intercept $b_{j}$, while the within-regime slopes $a_{j}$ are individually imprecise. This suggests that the piecewise constant model, which restricts $a_{j} = 0$, is a plausible alternative when within-regime slope estimation is unreliable due to short regime lengths, and provides a bias-variance interpretation for the comparison in Table 10.
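The bootstrap step can be sketched for a single regime as follows. This is one possible reading of the procedure, not the authors' implementation: it assumes residuals are resampled within the regime, that percentile intervals are used, and that the data are the hypothetical six-point series shown; only the replication count of 2,000 and the 95% level come from the text.

```python
import random


def fit_line(ts, rs):
    """OLS slope a and intercept b for r = a * t + b."""
    n = len(ts)
    t_bar, r_bar = sum(ts) / n, sum(rs) / n
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (r - r_bar) for t, r in zip(ts, rs))
    a = sxy / sxx
    return a, r_bar - a * t_bar


def bootstrap_slope_ci(ts, rs, n_boot=2000, level=0.95, seed=0):
    """Percentile interval for the slope from resampled residuals (assumed scheme)."""
    a, b = fit_line(ts, rs)
    fitted = [a * t + b for t in ts]
    resid = [r - f for r, f in zip(rs, fitted)]
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_boot):
        # Rebuild a pseudo-sample: fitted values plus residuals drawn with replacement.
        rs_star = [f + rng.choice(resid) for f in fitted]
        slopes.append(fit_line(ts, rs_star)[0])
    slopes.sort()
    lo_idx = int((1.0 - level) / 2.0 * n_boot)
    hi_idx = int((1.0 + level) / 2.0 * n_boot) - 1
    return slopes[lo_idx], slopes[hi_idx]


# Hypothetical regime with six annual observations.
ts = [0, 1, 2, 3, 4, 5]
rs = [3.0, 3.4, 2.9, 3.6, 3.2, 3.8]
print(fit_line(ts, rs)[0], bootstrap_slope_ci(ts, rs))
```

With only a handful of observations per regime the resampled slopes vary widely, which is the mechanism behind the wide Regime 1 interval described above.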

3.5.5. Cubic Polynomial Model

The cubic polynomial model represents the interest rate as a smooth global function of time:

r (t) = α_{0} + α_{1} t + α_{2} t^{2} + α_{3} t^{3}

(18)

The parameters are estimated using ordinary least squares over the full sample. This model captures long-term trends and curvature in the data but may smooth over structural breaks. The estimated coefficients for the U.S. are reported in Table 3, and the fitted model is shown in Figure 3.

3.5.6. Piecewise Cubic Model

The piecewise cubic model allows for nonlinear dynamics within each regime. The interest rate is specified as:

r_{t} = c_{j, 0} + c_{j, 1} t + c_{j, 2} t^{2} + c_{j, 3} t^{3} + ε_{t}, t \in T_{j}

(19)

This model captures both structural breaks and nonlinear behavior within each regime. Each regression is estimated using the global time index t (where $t = 0$ corresponds to 1991), so the intercept $c_{j, 0}$ does not represent the interest rate level at the start of the regime but rather the extrapolated value at $t = 0$; the fitted rates within the regime are obtained by evaluating the polynomial at the relevant values of t. The estimated coefficients are presented in Table 4, with the fitted piecewise cubic approximation shown in Figure 4.

3.5.7. ARIMA Model

To incorporate stochastic dynamics, the interest rate series, denoted y_{t}, is modeled using an ARIMA(0,1,1) specification:

Δ y_{t} = ε_{t} + θ_{1} ε_{t - 1}

(20)

The estimated parameter for the U.S. is

θ_{1} = 0.4379

. The ARIMA(0,1,1) specification was selected through a systematic model identification procedure. An augmented Dickey-Fuller (ADF) test applied to the U.S. real interest rate level series yields a test statistic of

- 1.432

(

p = 0.567

), failing to reject the unit root null at any conventional significance level and motivating first-differencing. The ADF test on the first-differenced series yields a statistic of

- 3.325

(

p = 0.014

), rejecting the null at the 5% level and confirming that one round of differencing is sufficient to achieve stationarity.

Inspection of the ACF and PACF of the differenced series reveals a single significant spike at lag 1 in the ACF with no significant partial autocorrelations, consistent with an MA(1) structure. To confirm this formally, we estimate four candidate specifications and compare them on AIC and BIC (Table 5).

ARIMA(0,1,1) achieves the lowest AIC and BIC across all candidate models, providing formal justification for the selected specification. The parsimonious MA(1) structure in the differences implies that only the most recent shock carries predictive content for the next period’s rate change, consistent with the near-random-walk behaviour documented in the time-series literature on short-term interest rates (see, e.g., Box et al. 2015; Tsay 2010). An equivalent model identification procedure was applied to Italy and India; the corresponding ADF test results and AIC/BIC comparison tables are reported in Appendix B. The same ARIMA(0,1,1) specification was selected for both countries. The fitted ARIMA model for the U.S., along with its 95% confidence intervals, is presented in Figure 5.
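To make the forecasting implication of Equation (20) concrete, the sketch below recovers the shocks recursively and projects the level forward. It assumes a given value of θ_1 (no estimation is performed), no drift term, and a pre-sample shock of zero; the function name is hypothetical.

```python
def arima011_forecast(y, theta, steps=1):
    """Level forecasts from dy_t = eps_t + theta * eps_{t-1}.

    Assumes eps_0 = 0 and no drift (illustrative conventions only).
    """
    eps_prev = 0.0
    for prev, curr in zip(y, y[1:]):
        # Invert the MA(1) recursion: eps_t = dy_t - theta * eps_{t-1}.
        eps_prev = (curr - prev) - theta * eps_prev
    one_step = y[-1] + theta * eps_prev
    # Future shocks have zero expectation, so forecasts beyond one step
    # stay at the one-step level.
    return [one_step] * steps
```

Because the forecast path is flat after the first step, the model tracks the most recent level rather than extrapolating a trend.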

Rate scale.

All interest rates are converted to decimal form for estimation and valuation. Tables report interest rates and parameters in percentage terms for readability. All comparisons, likelihood calculations, present values, and Feller-condition diagnostics are evaluated consistently using the same rate scale.

3.5.8. Vasicek Model

To provide a more rigorous stochastic benchmark, the interest rate process is additionally modeled using the Vasicek (1977) continuous-time mean-reverting specification. Under the Vasicek model, the short rate

r (t)

follows the stochastic differential equation:

d r (t) = κ (θ - r (t)) d t + σ d W (t)

(21)

where

κ > 0

is the speed of mean reversion,

θ

is the long-run equilibrium rate,

σ

is the instantaneous volatility, and

W (t)

is a standard Brownian motion. The model is discretized using the exact transition density of the Vasicek process. Since the Vasicek SDE has a Gaussian conditional distribution, the discrete-time transition for annual observations admits the closed-form representation:

r (t + 1) = r (t) e^{- κ} + θ (1 - e^{- κ}) + ε (t + 1)

(22)

where

ε (t + 1) \sim N (0, σ^{2} \frac{(1 - e^{- 2 κ})}{2 κ})

. The three parameters

(κ, θ, σ)

are estimated by maximum likelihood over the in-sample period 1991–2015, consistent with the CIR calibration and all other models in the comparison. This ensures that no look-ahead information from the out-of-sample period (2016–2021) enters the Vasicek parameter estimates, placing the Vasicek and CIR models on the same informational footing for the out-of-sample evaluation. The calibration results are reported in Table 6.
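The exact annual transition in Equation (22) can be sketched as below, both for the conditional expected path and for a simulated path. The function names and the seed are assumptions for illustration, and the parameters are supplied by the caller rather than estimated.

```python
import math
import random


def vasicek_step_moments(r_t, kappa, theta, sigma):
    """Conditional mean and variance of r(t+1) given r(t), annual step."""
    decay = math.exp(-kappa)
    mean = r_t * decay + theta * (1.0 - decay)
    var = sigma ** 2 * (1.0 - math.exp(-2.0 * kappa)) / (2.0 * kappa)
    return mean, var


def vasicek_expected_path(r0, kappa, theta, horizon):
    """Sequence of conditional expected rates for 1..horizon years ahead."""
    path, r = [], r0
    for _ in range(horizon):
        r, _unused_var = vasicek_step_moments(r, kappa, theta, 0.0)
        path.append(r)
    return path


def vasicek_simulate(r0, kappa, theta, sigma, horizon, seed=0):
    """One simulated path drawn from the exact Gaussian transition."""
    rng = random.Random(seed)
    path, r = [], r0
    for _ in range(horizon):
        mean, var = vasicek_step_moments(r, kappa, theta, sigma)
        r = rng.gauss(mean, math.sqrt(var))
        path.append(r)
    return path
```

Iterating the conditional mean gives θ + (r_0 − θ)e^{−κh} at horizon h, so the expected path approaches θ geometrically.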

For the U.S., the estimated mean-reversion speed

κ = 0.1542

implies a half-life of 4.50 years, meaning deviations from the long-run equilibrium

θ = 3.25 %

decay by half within approximately four and a half years. Italy exhibits similarly slow mean reversion (

κ = 0.1505

, half-life 4.61 years) with an equilibrium rate of 4.21%, consistent with its structurally higher rate environment over the estimation period. India’s calibration is markedly different:

κ = 0.6890

implies a half-life of only 1.01 years, reflecting the rapid oscillations and absence of a sustained trend in Indian real rates. The volatility parameter

σ

ranges from 1.10% for the U.S. to 3.09% for India, consistent with the substantially higher rate variability in the Indian series.
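The half-lives quoted above follow from the exponential decay of the expected deviation from θ: setting e^{−κh} = 1/2 gives h = ln 2 / κ. A short check using the reported κ estimates (the function name is illustrative):

```python
import math


def half_life(kappa):
    """Years for an expected deviation from the long-run mean to halve."""
    return math.log(2.0) / kappa


for country, kappa in [("U.S.", 0.1542), ("Italy", 0.1505), ("India", 0.6890)]:
    print(f"{country}: {half_life(kappa):.2f} years")
# U.S.: 4.50 years
# Italy: 4.61 years
# India: 1.01 years
```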

In-sample, the U.S. Vasicek model achieves

R^{2} = 0.7297

and RMSE

= 1.0024

percentage points, comparable to the piecewise constant model (IS RMSE

= 1.0107

), but does not outperform the more flexible deterministic specifications such as piecewise cubic (see Section 4). For out-of-sample forecasting, the U.S. Vasicek model achieves RMSE

= 1.6958

and MAE

= 0.9387

percentage points over 2016–2021. Forecast accuracy and annuity present values are reported alongside the other models in Table 10 and Table 9, respectively.

3.5.9. Cox-Ingersoll-Ross (CIR) Model

As an additional stochastic benchmark, the interest rate process is modeled using the Cox et al. (1985) specification, which extends the Vasicek model by making conditional variance proportional to the level of the rate:

d r (t) = κ (θ - r (t)) d t + σ \sqrt{r (t)} d W (t)

(23)

The standard CIR formulation preserves mean reversion while keeping the modeled rate non-negative, a desirable property when the underlying rate series is strictly positive. Because the CIR process is naturally defined for non-negative rates, its application to real interest rate series containing negative observations requires caution. For India, where the in-sample real interest rate series includes a negative observation, the standard CIR process is not a structurally valid model; CIR results for India are therefore retained only as a diagnostic benchmark and should not be interpreted substantively. The exact discrete-time transition for the CIR process follows a scaled non-central chi-squared distribution (Cox et al. 1985). Specifically, conditional on

r (t)

, the variable

2 c \cdot r (t + Δ t)

follows a non-central chi-squared distribution with 2q + 2 degrees of freedom and non-centrality parameter 2u:

2 c \cdot r (t + Δ t) \sim χ^{2} (2 q + 2, 2 u),

(24)

where

c = \frac{2 κ}{σ^{2} (1 - e^{- κ Δ t})}

,

q = \frac{2 κ θ}{σ^{2}} - 1

, and

u = c \cdot r (t) \cdot e^{- κ Δ t}

. All three parameters

(κ, θ, σ)

are estimated by maximizing the exact non-central chi-squared log-likelihood over the in-sample period 1991–2015, following Cox et al. (1985). This exact MLE approach avoids the discretisation error introduced by the Euler–Maruyama scheme and provides the most statistically efficient parameter estimates attainable under the CIR model.

The Feller condition (

2 κ θ > σ^{2}

), which ensures the rate process remains strictly positive, is satisfied for the U.S. and Italy; for India the condition is not satisfied, as detailed in the footnote to Table 7. The calibration results are reported in Table 7.
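The Feller diagnostic can be expressed through the quantity q defined above, since 2κθ > σ² is equivalent to q > 0. The sketch below is illustrative only: the function name is hypothetical, and the example inputs are the reported U.S. values taken on a single common rate scale, which is an assumption about the reporting convention.

```python
def cir_feller(kappa, theta, sigma):
    """Return (q, holds): q = 2*kappa*theta/sigma**2 - 1, holds = (q > 0).

    All three inputs must be expressed on the same rate scale.
    """
    ratio = 2.0 * kappa * theta / sigma ** 2
    return ratio - 1.0, ratio > 1.0


# Illustrative inputs (assumed to share one scale): U.S. CIR estimates.
q_us, feller_us = cir_feller(kappa=0.1780, theta=3.3370, sigma=0.57)
print(feller_us)
```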

For the U.S., the CIR estimates

κ = 0.1780

and

θ = 3.3370 %

imply a half-life of 3.89 years — somewhat faster mean reversion than Vasicek (

κ = 0.1542

, half-life 4.50 years) — while the long-run equilibria of both models are closely aligned (

θ_{CIR} = 3.34 %

vs.

θ_{Vasicek} = 3.25 %

). The substantially smaller

σ

estimate under CIR (

0.57 %

vs.

1.10 %

for Vasicek) reflects the state-dependent scaling: at higher rate levels the effective CIR volatility

σ \sqrt{r}

is amplified, so a lower

σ

is needed to match the observed unconditional variance. Italy and India show the same ordering as the Vasicek results: slower mean reversion in Italy (half-life 4.10 years) and fast reversion in India (half-life 1.01 years).

In-sample, the CIR model achieves

R^{2} = 0.7293

and RMSE

= 1.0031

percentage points for the U.S., virtually identical to the Vasicek model (

R^{2} = 0.7297

, RMSE

= 1.0024

percentage points). Out-of-sample, the U.S. CIR model achieves RMSE

= 1.7328

and MAE

= 0.9721

percentage points over 2016–2021, slightly weaker than the Vasicek model (RMSE

= 1.6958

, MAE

= 0.9387

), confirming that the state-dependent volatility structure of CIR does not translate into meaningfully superior forecasting performance in this sample. The CIR AIC, evaluated using the exact non-central chi-squared transition density, is not directly comparable to the Gaussian-based AIC of the other models; cross-model comparisons therefore rely on RMSE, MAE, and

R^{2}

throughout.

4. Results

To assess both fit and predictive accuracy, all models are estimated on the in-sample period 1991–2015 and evaluated on the out-of-sample period 2016–2021. In-sample performance is measured using

R^{2}

, RMSE, and AIC. Out-of-sample performance is measured using RMSE, MAE, and forecast bias over the six-year test period. Annuity present values are computed using the portfolio rate method and are reported for both the full-sample fitted rates (in-sample) and the out-of-sample forecasts.
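The three out-of-sample metrics can be computed as in the sketch below. The sign convention (error = forecast minus realized rate) and the optional exclusion of specific years are assumptions for illustration; the function name is hypothetical.

```python
import math


def forecast_metrics(forecast, actual, years=None, exclude=()):
    """Return (RMSE, MAE, bias) with error = forecast - actual (assumed)."""
    if years is None:
        years = range(len(actual))
    errors = [
        f - a
        for f, a, y in zip(forecast, actual, years)
        if y not in exclude
    ]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    bias = sum(errors) / n
    return rmse, mae, bias
```

When every forecast error has the same sign, the bias equals the MAE in absolute value, which is a quick way to spot one-sided forecast errors in the tables.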

4.1. Comparison of Valuation Methods

Table 8 shows the present values of annuities computed using the yield curve and portfolio rate methods for maturities of 10, 20, and 30 years.

The results highlight significant differences between the two methods. The portfolio rate method produces stable values across maturities, reflecting sequential compounding of one-period rates. In contrast, the yield curve method generates lower values for short maturities and higher values for longer maturities, particularly for the 30-year annuity (PV = 19.35 vs. 16.29 under the portfolio rate method). This discrepancy arises from a fundamental structural difference: the yield curve method discounts each payment k independently using the k-period spot rate

i_{k}

, so there is no compounding across periods. The portfolio rate method, by contrast, compounds one-period rates sequentially, so that a high rate in period 1 reduces the present value of all subsequent payments. For annuities funded through a portfolio whose return evolves period by period, the portfolio rate method is the economically appropriate choice. Given its alignment with the cash-flow-matching interpretation of annuity reserves and its consistency with the historical path benchmark, the portfolio rate method is used as the benchmark in the following analysis.
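The structural difference between the two methods can be seen in a few lines. The sketch assumes a unit payment at the end of each period (annuity-immediate), which is an assumption made for the example; function names are hypothetical.

```python
def pv_portfolio_rate(one_period_rates):
    """PV of unit payments, discounting by sequentially compounded rates."""
    pv, discount = 0.0, 1.0
    for r in one_period_rates:
        # Each period's rate scales the discount factor for every later payment.
        discount /= (1.0 + r)
        pv += discount
    return pv


def pv_yield_curve(spot_rates):
    """PV of unit payments, payment k discounted at the k-period spot rate."""
    return sum(
        (1.0 + i_k) ** (-k) for k, i_k in enumerate(spot_rates, start=1)
    )
```

Under a flat rate the two functions coincide with the standard annuity formula; they diverge only when the rate path varies, because a high early one-period rate lowers every subsequent discount factor in `pv_portfolio_rate` but affects only one payment in `pv_yield_curve`.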

4.2. Results under Alternative Interest Rate Models

We now evaluate annuity present values under various interest rate models, using the portfolio rate method as the benchmark. The results for maturities of 10, 20, 30, and 31 years are shown in Table 9.

A note on comparability is warranted. Deterministic models are evaluated at their fitted interest rate paths, which by construction represent a single trajectory. For the stochastic ARIMA, Vasicek, and CIR models, we report annuity present values computed from each model’s sequence of conditional expected rates — that is, the expected trajectory of the interest rate process given estimated parameters and the observed starting value, rather than an average over simulated paths. As discussed in Section 3.4.3, this approach places both model classes on the same economic footing, so that any differences in Table 9 reflect genuine differences in how each model characterizes the expected evolution of interest rates. One property of mean-reverting models is worth noting: for long horizons, the expected rate path of the Vasicek model converges mechanically to its long-run equilibrium

θ

regardless of the starting rate. For the U.S.,

θ = 3.2488 %

, so for 30-year annuity valuations the Vasicek expected path approaches 3.25% as the horizon grows — a rate consistent with the early part of the historical sample, which tends to inflate the Vasicek present values for long maturities relative to models that project the recently observed low rate levels forward.

Table 9. Present values under alternative interest rate models — U.S.

Model	PV(10)	PV(20)	PV(30)	PV(31)
Historical path	7.67730	12.50397	16.29246	16.63208
Constant mean	8.15094	13.71000	17.50137	17.80670
Piecewise constant	7.69126	12.55201	16.37544	16.72058
Piecewise linear	7.69050	12.51027	16.33882	16.68716
Cubic polynomial	7.68734	12.49141	16.22943	16.52072
Piecewise cubic	7.67490	12.50422	16.22658	16.51804
ARIMA(0,1,1)	7.70748	12.48002	16.17997	16.49878
Vasicek	7.79913	12.70848	16.48539	16.81218
CIR	7.80556	12.72862	16.50414	16.82954

4.3. In-Sample and Out-of-Sample Forecasting Performance

The results reveal a clear divergence between in-sample fit and out-of-sample predictive accuracy. The piecewise cubic model achieves the best in-sample fit (

R^{2} = 0.9611

, RMSE

= 0.3802

pp), but performs poorly out-of-sample (OOS RMSE

= 3.4650

pp), indicating overfitting to the training period. Similarly, the cubic polynomial model fits moderately in-sample but degrades sharply out-of-sample (OOS RMSE

= 3.8742

pp), likely because its global functional form extrapolates poorly beyond the estimation window.

Table 10. In-sample (IS) and out-of-sample (OOS) forecasting performance of alternative interest rate models — U.S. All RMSE, MAE, and Bias values are in percentage points.

Model	IS RMSE	IS $R^{2}$	IS AIC	OOS RMSE	OOS MAE	OOS Bias
Constant mean	1.9279	0.0000	34.8209	2.3116	1.7429	$+ 1.7429$
Piecewise constant	1.0107	0.7251	8.5331	1.5490	1.2864	$- 0.3055$
Piecewise linear	0.9837	0.7397	15.1773	1.5674	1.3702	$- 0.4610$
Cubic polynomial	1.1652	0.6347	15.6455	3.8742	2.6218	$+ 2.6218$
Piecewise cubic	0.3802	0.9611	$- 16.3501$	3.4650	2.6288	$+ 2.6288$
ARIMA(0,1,1)	0.6761	N/A	$- 15.5683$	1.6979	1.0268	$+ 0.7594$
Vasicek	1.0024	0.7297	6.1190	1.6958	0.9387	$+ 0.6266$
CIR	1.0031	0.7293	—^†	1.7328	0.9721	$+ 0.6975$

^† CIR AIC is evaluated using the exact non-central chi-squared transition density and is not comparable to the Gaussian-based AIC of the other models; it is therefore omitted from cross-model AIC comparisons. ARIMA(0,1,1) IS

R^{2}

is reported as N/A: the model is estimated on first-differences (unit-root series), so a conventional

R^{2}

on levels is not a meaningful goodness-of-fit statistic; IS RMSE provides the appropriate in-sample comparison. Cross-model comparisons rely on RMSE, MAE, and

R^{2}

throughout.

In contrast, the piecewise constant model achieves the lowest OOS RMSE (1.5490 pp). Piecewise linear ranks second (1.5674 pp), followed by Vasicek (1.6958 pp) and ARIMA (1.6979 pp). The ordering depends on the loss function, however: on OOS MAE, Vasicek (0.9387 pp), CIR (0.9721 pp), and ARIMA (1.0268 pp) all improve on piecewise constant (1.2864 pp) and piecewise linear (1.3702 pp). The claim that the deterministic piecewise models outperform the stochastic models therefore holds under RMSE but not under MAE. These findings indicate that piecewise segmentation in deterministic models and mean reversion in stochastic models both contribute to out-of-sample accuracy, with the degree of advantage depending on the rate regime.

COVID-19 Robustness Check

The out-of-sample evaluation period (2016–2021) includes 2020, during which real interest rates were materially affected by the COVID-19 pandemic and associated monetary policy responses. To assess whether the OOS rankings are driven by this atypical observation, Table 11 re-computes all forecasting metrics for the U.S. excluding 2020, i.e. over the five-year period 2016–2019 and 2021.

When 2020 is excluded, the piecewise constant model retains the lowest RMSE (1.6372 pp), while the Vasicek model achieves the lowest MAE among the leading specifications (1.1178 pp). The broader qualitative findings are nonetheless preserved: the high-overfitting models (piecewise cubic, cubic polynomial) remain poor, stochastic models generally perform competitively with simple deterministic ones, and the catastrophic failures of trend-extrapolating models in volatile regimes are unchanged. This confirms that the main findings are not an artefact of the pandemic shock, though the narrow differences at the top of the U.S. ranking should be interpreted cautiously given only five or six out-of-sample observations.

Diebold–Mariano Forecast Accuracy Tests

To formally assess whether out-of-sample differences are statistically significant, we apply the Diebold–Mariano test with the Harvey et al. (1997) small-sample correction, comparing each model against the best-performing specification for each country: piecewise constant for the U.S., ARIMA(0,1,1) for Italy, and constant mean for India. The modified DM statistic MDM follows a

t (n - 1) = t (5)

distribution under the null hypothesis of equal predictive accuracy (Harvey et al. 1997); with

n = 6

out-of-sample observations the test has limited power, and non-rejection should not be interpreted as evidence of equal accuracy.
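A minimal sketch of the modified statistic under squared-error loss is given below. The forecast horizon h = 1 default, the function name, and the use of unweighted autocovariances are assumptions for illustration; the resulting value would be compared against the t(n − 1) distribution.

```python
import math


def modified_dm(e_bench, e_comp, h=1):
    """Harvey-Leybourne-Newbold corrected DM statistic, squared-error loss.

    Positive values mean the benchmark has the lower average loss.
    """
    n = len(e_bench)
    # Loss differential: competitor loss minus benchmark loss.
    d = [ec ** 2 - eb ** 2 for eb, ec in zip(e_bench, e_comp)]
    d_bar = sum(d) / n

    def gamma(k):
        # Sample autocovariance of the loss differential at lag k.
        return sum(
            (d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)
        ) / n

    long_run_var = gamma(0) + 2.0 * sum(gamma(k) for k in range(1, h))
    dm = d_bar / math.sqrt(long_run_var / n)
    correction = math.sqrt((n + 1 - 2 * h + h * (h - 1) / n) / n)
    return correction * dm
```

For h = 1 the corrected statistic reduces to the ordinary one-sample t-statistic of the loss differentials, which makes the small-sample t(n − 1) reference distribution natural.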

Results are reported in Table 12. For the U.S., no pairwise differences reach statistical significance at conventional levels (all

p > 0.10

), consistent with the narrow RMSE gaps documented in Table 10. For Italy, ARIMA significantly outperforms all competing specifications: MDM statistics range from 2.51 (piecewise cubic,

p = 0.054

) to 9.83 (constant mean,

p < 0.001

), with every competitor significantly inferior at the 10% level or better. These results provide formal statistical support for the central Italy finding that ARIMA’s local mean-tracking is genuinely superior in a rapidly declining rate environment, and are not merely a reflection of numerical chance in a small sample. For India, the piecewise linear model (MDM

= 2.40

,

p = 0.061

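) is significantly worse than the constant mean benchmark at the 10% level, supporting the view that trend extrapolation is harmful in a high-volatility, non-trending environment. The cubic polynomial (p = 0.119) and piecewise cubic (p = 0.113) differences do not reach significance despite much larger RMSE gaps, which is consistent with the limited power of the test at n = 6. The remaining top-performing India models (piecewise constant, ARIMA, Vasicek) do not differ significantly from the constant mean, consistent with the modest numerical gaps in Table 16.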

4.4. Discussion of Results

The historical path model serves as the benchmark, yielding a present value of 12.50397 for the 20-year annuity. The results show that annuity valuations are highly sensitive to the chosen interest rate model.

The constant mean model overestimates the 20-year annuity present value by approximately 9.6% (13.71 vs. 12.50 under the historical path benchmark), as it fails to capture the downward trend in interest rates, resulting in insufficient discounting of future cash flows. The piecewise constant model improves by incorporating structural breaks, showing only a

+ 0.38 %

error (PV(20)

= 12.552

), while the piecewise linear model tracks the benchmark even more closely (

+ 0.05 %

, PV(20)

= 12.510

).
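The percentage errors quoted in this section are simple relative deviations from the historical-path present value. The sketch below recomputes them from the PV(20) column of Table 9; the function and variable names are illustrative.

```python
def pv_error_pct(pv_model, pv_benchmark):
    """Relative present value error in percent."""
    return 100.0 * (pv_model / pv_benchmark - 1.0)


benchmark_pv20 = 12.50397  # Historical path, Table 9
pv20 = {
    "Constant mean": 13.71000,
    "Piecewise constant": 12.55201,
    "Piecewise linear": 12.51027,
    "ARIMA(0,1,1)": 12.48002,
    "Vasicek": 12.70848,
    "CIR": 12.72862,
}
for name, pv in pv20.items():
    print(f"{name}: {pv_error_pct(pv, benchmark_pv20):+.2f}%")
```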

Table 12. Diebold–Mariano forecast accuracy tests (Harvey et al. 1997 small-sample correction). Each model is compared against the best-performing specification for the respective country (shown in the panel header). MDM denotes the modified DM statistic, which follows a t(5) distribution under H_{0} of equal predictive accuracy. Positive MDM indicates the benchmark achieves lower squared error loss than the competitor. RMSE values are in percentage points. Significance: *** p < 0.01, ** p < 0.05, * p < 0.10 (two-sided t(5) critical values: 4.032, 2.571, 2.015).

Model	OOS RMSE	MDM	p-value
Panel A: United States — Benchmark: Piecewise constant (RMSE $= 1.549$ pp)
Constant mean	2.312	1.058	0.338
Piecewise linear	1.567	0.214	0.839
Cubic polynomial	3.874	1.177	0.292
Piecewise cubic	3.465	1.306	0.248
ARIMA(0,1,1)	1.698	0.334	0.752
Vasicek	1.696	0.323	0.760
CIR	1.733	0.374	0.724
Panel B: Italy — Benchmark: ARIMA(0,1,1) (RMSE $= 1.943$ pp)
Constant mean	3.644	9.832	0.0002***
Piecewise constant	2.073	6.914	0.001***
Piecewise linear	2.225	5.449	0.003***
Cubic polynomial	5.939	3.069	0.028**
Piecewise cubic	4.499	2.512	0.054*
Vasicek	2.277	3.294	0.022**
CIR	2.352	3.372	0.020**
Panel C: India — Benchmark: Constant mean (RMSE $= 2.266$ pp)
Piecewise constant	2.551	0.335	0.751
Piecewise linear	10.976	2.403	0.061*
Cubic polynomial	11.657	1.877	0.119
Piecewise cubic	19.311	1.922	0.113
ARIMA(0,1,1)	2.900	1.413	0.217
Vasicek	2.389	1.066	0.335
CIR	3.238	1.848	0.124

Loss function: squared error. Results under absolute error loss are qualitatively similar and available upon request. With

n = 6

OOS observations, non-rejection of

H_{0}

does not imply equal accuracy — it reflects limited statistical power.

The piecewise cubic model achieves near-perfect in-sample accuracy (

R^{2} = 0.9611

, RMSE

= 0.3802

pp), capturing both structural shifts and nonlinear dynamics within each regime. However, this in-sample precision does not carry over to the out-of-sample period: as shown in Table 10, the piecewise cubic model produces an OOS RMSE of 3.4650 pp — the second worst among all specifications — indicating overfitting of the training data. The cubic polynomial model exhibits the same pattern (OOS RMSE

= 3.8742

pp), likely because its global functional form extrapolates poorly beyond the estimation window.

The ARIMA(0,1,1) model produces present values close to the historical benchmark: PV(20)

= 12.480

versus the benchmark of 12.504, a deviation of approximately

- 0.19 %

. The Vasicek model, reverting toward its in-sample equilibrium of

θ = 3.2488 %

, overestimates the 20-year annuity present value by approximately 1.64% (PV(20)

= 12.708

vs. benchmark 12.504). The CIR model, calibrated via exact non-central chi-squared MLE, overestimates marginally more at

+ 1.80 %

(PV(20)

= 12.729

), reflecting its slightly higher equilibrium rate of

θ = 3.3370 %

. Crucially, piecewise constant is the best-performing model out-of-sample on RMSE, while Vasicek and ARIMA achieve lower OOS MAE than either piecewise model (Table 10), confirming that neither stochastic complexity alone nor deterministic segmentation alone guarantees superior performance; the appropriate choice is regime-dependent.

Overall, the U.S. results reveal a clear tension between in-sample fit and out-of-sample generalisability. The piecewise constant model achieves the best out-of-sample performance (OOS RMSE

= 1.549

pp), with piecewise linear second (1.567 pp), Vasicek third (1.696 pp), and ARIMA fourth (1.698 pp). For practitioners working in stable, gradually trending rate environments such as the U.S., the piecewise linear specification remains a strong choice: it captures regime-level trends, avoids the overfitting that undermines the piecewise cubic model, and performs competitively with stochastic alternatives on unseen data. However, as the cross-country evidence below demonstrates, this recommendation does not extend universally to all rate regimes.

4.5. Robustness: Cross-Country Evidence

To assess the generalisability of the findings, the full modelling pipeline is replicated for two additional countries: Italy and India. To avoid overloading the main text, graphical diagnostics are shown for the U.S. baseline case, while Italy and India are reported through equivalent numerical tables; the full modelling pipeline is identical across countries. Both series are drawn from the World Bank real interest rate database (World Bank 2024) and cover the same 1991–2021 period used for the United States, with the same train/test split (1991–2015 in-sample; 2016–2021 out-of-sample). Structural breakpoints are identified separately for each country using the Bai-Perron test.

Italy

The Bai-Perron test identifies breakpoints at 1995, 2000, and 2005 for Italy, yielding four regimes: 1991–1994, 1995–1999, 2000–2004, and 2005–2015. These correspond closely to the convergence of Italian interest rates toward the Euro area benchmark during preparations for entry into the Economic and Monetary Union (EMU), the introduction of the euro, and the post-2008 financial crisis period. Italian real interest rates began the sample at approximately 6.6%, peaked at 11.7% in 1992, and declined steadily to 0.73% by 2021, tracing a long structural downtrend broadly similar to, but more pronounced than, the U.S. experience.

Table 13. Present values under alternative interest rate models — Italy.

Model	PV(10)	PV(20)	PV(30)	PV(31)
Historical path	6.57053	10.58997	13.51877	13.78369
Constant mean	7.68199	12.35054	15.18775	15.40130
Piecewise constant	6.60239	10.63561	13.49538	13.73099
Piecewise linear	6.56505	10.58270	13.44132	13.67501
Cubic polynomial	6.58686	10.56476	13.35255	13.54513
Piecewise cubic	6.57053	10.59201	13.41492	13.62415
ARIMA(0,1,1)	6.65915	10.54901	13.33174	13.56182
Vasicek	6.80038	10.85394	13.71907	13.95225
CIR	6.81363	10.87697	13.74029	13.97246

Table 14. In-sample (IS) and out-of-sample (OOS) forecasting performance — Italy. All RMSE, MAE, and Bias values are in percentage points.

Model	IS RMSE	IS $R^{2}$	IS AIC	OOS RMSE	OOS MAE	OOS Bias
Constant mean	2.4828	0.0000	47.4693	3.6442	3.5914	$+ 3.5914$
Piecewise constant	1.0991	0.8040	12.7265	2.0728	1.9786	$+ 1.9786$
Piecewise linear	0.8702	0.8771	9.0503	2.2254	2.1293	$+ 2.1293$
Cubic polynomial	1.0995	0.8039	12.7449	5.9391	5.5277	$+ 5.5277$
Piecewise cubic	0.2908	0.9863	$- 29.7543$	4.4990	4.0768	$+ 4.0768$
ARIMA(0,1,1)	1.3999	N/A	20.8216	1.9426	1.8417	$+ 1.8417$
Vasicek	1.2684	0.7390	17.8872	2.2769	2.1502	$+ 2.1502$
CIR	1.2690	0.7388	—^†	2.3524	2.2220	$+ 2.2220$

^† CIR AIC is evaluated using the exact non-central chi-squared transition density and is not comparable to the Gaussian-based AIC of the other models; it is therefore omitted from cross-model AIC comparisons.

The Italy results present a notable contrast to the U.S. findings. The piecewise cubic model achieves the best in-sample fit (

R^{2} = 0.9863

, RMSE

= 0.2908

pp) but overfits badly, producing an OOS RMSE of 4.499 pp. The cubic polynomial similarly degrades out-of-sample (OOS RMSE

= 5.939

pp). Among the remaining specifications, ARIMA achieves the lowest OOS RMSE (1.943 pp), 0.13 pp below the next-best piecewise constant model (2.073 pp). Piecewise linear (2.225 pp), Vasicek (2.277 pp), and CIR (2.352 pp) follow within a narrow band, with ARIMA’s advantage attributable to its local mean-tracking, which prevents over-extrapolation of Italy’s steeply declining trend. The CIR model, calibrated via exact MLE with

κ = 0.1691

and

θ = 4.3050 %

, overestimates PV(20) by

+ 2.71 %

(10.877 vs. 10.590), slightly more than Vasicek (

+ 2.49 %

), consistent with CIR’s higher equilibrium rate. The core finding is preserved: ARIMA, the model with the best local level-tracking, generalizes best, and the mean-reverting stochastic models (Vasicek, CIR) generalize substantially better than the cubic polynomial and piecewise cubic specifications in Italy’s rapidly declining rate environment, although they trail the piecewise constant and piecewise linear models on OOS RMSE. The statistical significance of ARIMA’s advantage over all competitors is confirmed by the Diebold–Mariano tests in Table 12, Panel B.

For reference, Table 17 below shows that a full-sample Vasicek calibration would yield an OOS RMSE of 1.615 pp for Italy, lower than ARIMA (1.943 pp) and all deterministic alternatives. The in-sample-only design used throughout the main analysis is the methodologically correct comparison, but this full-sample sensitivity confirms that the Vasicek model’s relatively weaker Italy ranking is partly a consequence of the conservative calibration approach.

The Italy PV errors relative to the historical benchmark are summarised for the 20-year horizon: constant mean

+ 16.62 %

, piecewise constant

+ 0.43 %

, piecewise linear

- 0.07 %

, cubic polynomial

- 0.24 %

, piecewise cubic

+ 0.02 %

, ARIMA

- 0.39 %

, Vasicek

+ 2.49 %

, and CIR

+ 2.71 %

.

India

For India, the Bai-Perron test identifies breakpoints at 1995, 2005, and 2010, yielding regimes 1991–1994, 1995–2004, 2005–2009, and 2010–2015. India’s real interest rates are markedly more volatile than those of the U.S. or Italy, ranging from

- 1.98 %

in 2010 to 9.19% in 1999, and exhibit a non-monotone structure with no sustained long-run trend. This high volatility and the reversal in rates after 2015 (rising from 5.33% in 2017 to 6.89% in 2019 before falling sharply to 0.32% in 2021) make for a more demanding out-of-sample environment.

The India results reveal a striking picture. The piecewise cubic model again achieves the best in-sample fit (

R^{2} = 0.9118

) but collapses out-of-sample (OOS RMSE

= 19.311

pp), driven by extreme extrapolation of the steeply rising trend estimated in the final training regime. Piecewise linear and cubic polynomial models similarly fail (OOS RMSE

= 10.976

and 11.657 pp, respectively). The models that generalize best are constant mean (OOS RMSE

= 2.266

pp), Vasicek (2.389 pp), piecewise constant (2.551 pp), and ARIMA (2.900 pp). The differences among these models are modest relative to the catastrophic failures of the trend-extrapolating specifications; with only six out-of-sample observations these differences should be interpreted cautiously, but the substantive conclusion is clear: all mean-reverting or level-stable models substantially outperform the trend-extrapolating specifications in this volatile, non-trending environment. The Diebold–Mariano test in Table 12, Panel C indicates that piecewise linear is significantly inferior to the constant mean at the 10% level (

p = 0.061

), while the remaining top models are not statistically distinguishable.

Table 15. Present values under alternative interest rate models — India.

Model	PV(10)	PV(20)	PV(30)	PV(31)
Historical path	7.32611	11.18257	13.81831	14.02104
Constant mean	7.55269	11.99174	14.60076	14.79328
Piecewise constant	7.27939	11.13361	13.75805	13.97634
Piecewise linear	7.27997	11.15439	13.61496	13.73339
Cubic polynomial	7.28861	11.20027	13.66502	13.78503
Piecewise cubic	7.32504	11.17934	13.59559	13.68291
ARIMA(0,1,1)	7.64961	11.61641	14.30281	14.49493
Vasicek	7.56855	11.70162	14.28094	14.46794
CIR	7.53306	11.47255	14.10841	14.29035

Table 16. In-sample (IS) and out-of-sample (OOS) forecasting performance — India. All RMSE, MAE, and Bias values are in percentage points.

Model	IS RMSE	IS $R^{2}$	IS AIC	OOS RMSE	OOS MAE	OOS Bias
Constant mean	2.6055	0.0000	49.8816	2.2661	1.4811	$+ 0.7467$
Piecewise constant	2.0564	0.3771	44.0475	2.5505	2.3894	$- 1.3883$
Piecewise linear	1.3372	0.7366	30.5301	10.9764	9.7957	$+ 9.7957$
Cubic polynomial	1.9102	0.4625	40.3596	11.6565	9.5596	$+ 9.5596$
Piecewise cubic	0.7738	0.9118	19.1787	19.3110	16.0655	$+ 16.0655$
ARIMA(0,1,1)	3.4794	N/A^‡	66.3426	2.8996	2.0308	$+ 1.9570$
Vasicek	2.2323	0.2660	46.1512	2.3889	1.6534	$+ 1.2963$
CIR	2.5180	0.0661	—^†	3.2380	2.5174	$+ 2.5174$

^† CIR results for India are retained as a diagnostic benchmark only; the standard CIR process is not a structurally valid model for this series (negative in-sample observation; Feller condition violated; log-likelihood did not fully converge). See Table 7. ^‡ ARIMA(0,1,1) IS

R^{2}

reported as N/A: the model is estimated on first-differences (unit-root series), so a conventional

R^{2}

computed on levels is not a meaningful goodness-of-fit statistic; IS RMSE (3.479 pp) provides the appropriate in-sample comparison metric.

The India PV errors at the 20-year horizon are: constant mean $+7.24\%$, piecewise constant $-0.44\%$, piecewise linear $-0.25\%$, cubic polynomial $+0.16\%$, piecewise cubic $-0.03\%$, ARIMA $+3.88\%$, Vasicek $+4.64\%$, and CIR $+2.59\%$.
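The percentages above can be recovered from the PV(20) column of Table 15. The sketch below assumes that each error is the relative deviation from the historical path value, the convention stated in the caption of Table 18; the variable names are illustrative.

```python
# PV(20) values copied from Table 15 (India).
PV20 = {
    "Historical path": 11.18257,
    "Constant mean": 11.99174,
    "Piecewise constant": 11.13361,
    "Piecewise linear": 11.15439,
    "Cubic polynomial": 11.20027,
    "Piecewise cubic": 11.17934,
    "ARIMA(0,1,1)": 11.61641,
    "Vasicek": 11.70162,
    "CIR": 11.47255,
}

benchmark = PV20["Historical path"]

# Relative PV error in percent, measured against the historical path.
pv_errors = {
    model: 100.0 * (pv / benchmark - 1.0)
    for model, pv in PV20.items()
    if model != "Historical path"
}

for model, err in pv_errors.items():
    print(f"{model:20s} {err:+.2f}%")
```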

Table 17. Vasicek model: OOS performance under in-sample-only vs. full-sample calibration. All OOS RMSE values are in percentage points.

	In-sample only (1991–2015)			Full sample (1991–2021)
Country	$κ$	$θ$	OOS RMSE	$κ$	$θ$	OOS RMSE
U.S.	0.1542	3.2488%	1.6958	0.1263	2.0307%	1.5003
Italy	0.1505	4.2056%	2.2769	0.0911	2.2917%	1.6151
India	0.6890	5.7005%	2.3889	0.6702	5.2551%	2.1886

The in-sample-only calibration is used throughout the main analysis. The full-sample calibration is reported here for reference only; it uses information from the 2016–2021 test period and is therefore not a valid basis for out-of-sample comparison. The Italy difference (OOS RMSE 2.277 vs. 1.615 pp) is the largest, confirming that the in-sample-only design is more conservative for Vasicek in that country.

Taken together, the cross-country results provide two important qualifications to the U.S. baseline finding. First, the Italy evidence shows a partial reversal: in a developed economy with a more pronounced structural interest rate decline, ARIMA generalizes best out-of-sample, outperforming the deterministic segmentation models whose trend-extrapolation overshoots. Second, the India evidence shows that in high-volatility, non-trending emerging-market environments, trend-extrapolating deterministic models can fail catastrophically, and mean-reverting stochastic models or the simple constant mean offer a meaningful advantage. These results reinforce the central finding: the appropriate modeling choice is regime-dependent, and no single model class dominates across all environments.

4.6. Mortality-Adjusted Life Annuity Valuation

The analysis so far has treated annuity valuation as a pure present-value problem, abstracting from mortality. This is a deliberate assumption that isolates the effect of interest rate model specification. A natural question is whether the documented model risk persists once survival probabilities are introduced: if mortality discounting concentrates most of the annuity’s present value mass in the early years — where all models agree more closely — the valuation errors from misspecified interest rate assumptions might be substantially attenuated.

To address this question, we compute life annuity present values for a male aged 65 using the 2020 period life table of the Social Security Administration (Social Security Administration 2020). The mortality-weighted present value is defined as:

{\ddot{a}}_{65} = \sum_{k = 1}^{n} {}_{k}p_{65} \prod_{s = 1}^{k} {(1 + i_{s})}^{- 1}

(25)

where ${}_{k}p_{65}$ denotes the probability that a male aged 65 survives at least $k$ further years, and the interest rates $i_{s}$ are drawn from each model’s fitted or projected sequence. We use a 31-year horizon (ages 65–96), consistent with the PV(31) calculations reported in Table 9. Table 18 reports mortality-adjusted present values for selected models for the U.S. and Italy, alongside the corresponding unadjusted values for comparison.
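Equation (25) translates directly into a running-discount loop. The sketch below is a minimal illustration under stated assumptions: the flat 3% rate path and the flat annual death probability `q` are hypothetical placeholders chosen so the result can be checked in closed form, and they are not the SSA (2020) life table values or any of the fitted rate paths used in the paper.

```python
def life_annuity_pv(rates, survival):
    """Equation (25): sum over k of kp_65 * prod_{s<=k} (1 + i_s)^(-1).

    rates    -- annual rates i_1..i_n in decimal form
    survival -- survival probabilities 1p_65..np_65
    """
    if len(rates) != len(survival):
        raise ValueError("rates and survival must have the same length")
    pv = 0.0
    discount = 1.0
    for i_s, k_p in zip(rates, survival):
        discount /= 1.0 + i_s      # cumulative discount factor up to year k
        pv += k_p * discount       # payment of 1 weighted by survival to year k
    return pv


# Hypothetical inputs for illustration only (not SSA data, not fitted rates).
horizon = 31
rates = [0.03] * horizon
q = 0.04                                            # flat annual death probability
survival = [(1.0 - q) ** k for k in range(1, horizon + 1)]

no_mortality = life_annuity_pv(rates, [1.0] * horizon)
with_mortality = life_annuity_pv(rates, survival)
print(f"PV without mortality: {no_mortality:.4f}")
print(f"PV with mortality:    {with_mortality:.4f}")
```

Setting every survival probability to one recovers the pure present value, which is how the "no mortality" and "life annuity" columns of Table 18 relate to each other.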

The results confirm that interest rate model risk is not offset by mortality discounting. For the U.S., the constant mean model generates a $+8.15\%$ error in the life annuity present value, comparable to the $+7.06\%$ error in the pure interest rate setting, indicating that mortality weighting does not systematically attenuate model risk. Piecewise constant and piecewise linear models remain accurate after mortality adjustment ($+0.37\%$ and $+0.16\%$, respectively), confirming their robustness for practical reserving applications. The Vasicek and CIR models produce errors of $+1.37\%$ and $+1.52\%$, respectively, consistent with the pure-PV results in Table 9.

The Italy results are more striking: the constant mean model’s error remains $+15.09\%$ after mortality adjustment, still a very large valuation discrepancy that would be material for regulatory reserving purposes. The piecewise constant and piecewise linear models retain small errors ($-0.30\%$ and $-0.45\%$), while Vasicek ($+1.64\%$) and CIR ($+1.78\%$) remain modestly above the benchmark.

These findings have a straightforward interpretation. Mortality weighting does not mechanically shrink model errors: because survival probabilities decline gradually over the horizon rather than concentrating all cash flows in the first few years, the cumulative discounting differences across models remain large in absolute terms. The model risk documented in the pure interest rate analysis is therefore directly relevant to practical life annuity reserving, and is not an artefact of ignoring mortality.

5. Conclusion

Table 19 summarises the regime-contingent modelling recommendations that emerge from the cross-country evidence.

This paper has examined the impact of interest rate model specification on annuity valuation within a unified discrete-time framework, making two contributions to the existing literature. First, it provides a cross-country out-of-sample evaluation framework that compares nine model specifications across three qualitatively distinct interest rate regimes, with formal statistical testing via Diebold–Mariano tests and a mortality-adjusted robustness check, generating regime-contingent guidance for actuarial model selection. Second, it applies and evaluates piecewise linear and piecewise cubic representations of the interest rate process within discrete-time annuity valuation, combining Bai–Perron regime identification with within-regime polynomial approximation to capture both structural level shifts and smooth within-regime dynamics.

Our results reveal that annuity values are highly sensitive to interest rate model specification. The piecewise linear specification achieves near-benchmark in-sample accuracy with strong out-of-sample generalisability in stable, gradually trending environments (U.S.), offering practitioners a transparent and computationally accessible alternative to stochastic models in such regimes. The piecewise cubic model achieves superior in-sample fit but overfits badly out-of-sample across all three countries, demonstrating that high in-sample $R^{2}$ is not a reliable guide for model selection in this setting. For the U.S. and Italy, the CIR model, estimated via exact non-central chi-squared MLE, produces parameter estimates and forecasting performance closely in line with the Vasicek model, confirming that the choice between these two stochastic benchmarks has limited practical consequence in the present strictly positive in-sample calibrations. For India, CIR is retained only as a diagnostic benchmark because the in-sample real-rate series contains a negative observation and the Feller condition is violated.

The cross-country evidence presents a nuanced picture. In the U.S., piecewise constant and piecewise linear models generalize best out-of-sample, outperforming the stochastic alternatives. In Italy, ARIMA outperforms all deterministic alternatives out-of-sample, with its level-correcting properties preventing over-extrapolation of the trend; this advantage is formally confirmed by Diebold–Mariano tests against all competitors. In India, trend-extrapolating deterministic models fail catastrophically, while the constant mean and Vasicek specifications generalize considerably better. This suggests that the appropriate model class is not universal: the choice between deterministic and stochastic frameworks should be guided by both the structural stability of the interest rate regime and the speed of any underlying trend.

A mortality-adjusted robustness check using the SSA (2020) period life table confirms that the documented model risk is not attenuated by integrating survival probabilities. For a male aged 65, the constant mean model generates a life annuity error of $+8.15\%$ for the U.S. and $+15.09\%$ for Italy, confirming that the interest rate model risk identified in the main analysis is directly relevant to practical life annuity pricing and reserving.

Several limitations should be acknowledged. First, each country series contains only 31 annual observations (25 in-sample, 6 out-of-sample), which limits the precision of structural break detection and parameter estimation, and makes out-of-sample rankings sensitive to individual observations. The narrow RMSE differences at the top of the U.S. ranking, none of which reach statistical significance under the Diebold–Mariano test, should therefore be interpreted with particular caution. Rolling-window and expanding-window out-of-sample validations, which would yield more test observations and more stable rank comparisons, are identified as a priority for future research. Second, the analysis uses real interest rate data from the World Bank; alternative rate series (e.g., risk-free rates, nominal rates, central bank policy rates) may yield different relative rankings and are better suited to specific reserving applications under Solvency II and IFRS 17 that require risk-free discounting. Third, all models employ a single interest rate per year rather than a full yield curve across maturities, which limits applicability to products requiring term-structure information. Fourth, the annuity valuation framework considers only annuities-immediate with level payments; extensions to variable-benefit, inflation-linked, or participating products would require additional modeling components. Future research could address these limitations by using higher-frequency data, longer out-of-sample periods, mortality-integrated frameworks incorporating stochastic mortality, and a broader set of insurance products across additional country regimes.

In conclusion, the choice of interest rate modeling framework should be guided by the empirical characteristics of the rate regime rather than by a universal preference for deterministic or stochastic complexity. From a regulatory perspective, this regime-dependence has direct implications for model risk management under Solvency II and IFRS 17, both of which require insurers to demonstrate that interest rate assumptions are appropriate and adequately stress-tested. The present value errors of up to 17% documented here — arising solely from model specification — represent quantifiable model risk exposures that actuaries and risk managers should account for when selecting interest rate models for reserving and pricing.

Author Contributions

Conceptualization, A.M.; methodology, A.M., V.N. and D.W.; software, A.M.; validation, A.M. and V.N.; formal analysis, A.M.; investigation, A.M.; data curation, A.M.; writing—original draft preparation, A.M.; writing—review and editing, V.N. and D.W.; visualization, A.M.; supervision, D.W.; project administration, D.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The real interest rate data analyzed in this study are publicly available from the World Bank’s World Development Indicators database under the indicator FR.INR.RINR (https://data.worldbank.org/indicator/FR.INR.RINR). The complete dataset used in the analysis (1991–2021 annual real interest rates for the United States, Italy, and India) is also reproduced in Appendix A of this manuscript. No new data were created in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Real Interest Rate Data

Table A1. Annual real interest rates (%) for the United States, Italy, and India, 1991–2021. Source: World Bank, World Development Indicators (indicator FR.INR.RINR).

Year	US	Italy	India
1991	4.92	6.59	3.62
1992	3.88	11.65	9.13
1993	3.55	10.34	5.81
1994	4.90	8.15	4.34
1995	6.59	7.92	5.86
1996	6.32	8.09	7.79
1997	6.60	7.77	6.91
1998	7.15	6.13	5.12
1999	6.49	4.80	9.19
2000	6.81	5.18	8.34
2001	4.57	4.10	8.59
2002	3.07	3.13	7.91
2003	2.11	2.58	7.31
2004	1.61	2.80	4.91
2005	2.96	3.16	4.86
2006	4.73	3.33	2.57
2007	5.20	3.77	5.68
2008	3.10	4.33	3.77
2009	2.62	2.93	4.81
2010	2.01	3.41	$- 1.98$
2011	1.16	2.82	1.32
2012	1.36	3.46	2.47
2013	1.75	3.97	3.87
2014	1.59	3.93	6.70
2015	2.48	3.32	7.56
2016	2.64	2.22	6.23
2017	2.37	2.27	5.33
2018	2.46	1.59	5.36
2019	3.72	1.54	6.89
2020	2.85	0.74	4.14
2021	$- 1.09$	0.73	0.32

Appendix B. ARIMA Model Selection — Italy and India

This appendix reports the unit root tests and ARIMA model selection results for Italy and India, corresponding to the procedure described in Section 3.5.7 for the U.S.

Italy

An augmented Dickey–Fuller (ADF) test applied to the Italian real interest rate level series yields a test statistic of $-2.104$ ($p = 0.242$), failing to reject the unit root null at any conventional significance level and motivating first-differencing. The ADF test on the first-differenced series yields a statistic of $-4.871$ ($p < 0.001$), confirming stationarity in first differences. Inspection of the ACF and PACF of the differenced series is consistent with an MA(1) structure. Table A2 reports AIC and BIC for four candidate specifications.

Table A2. ARIMA model selection — Italy real interest rates (1991–2015).

Model	AIC	BIC
ARIMA(0,1,0)	79.34	80.52
ARIMA(0,1,1)	77.61	79.97
ARIMA(1,1,0)	78.93	81.29
ARIMA(1,1,1)	79.21	82.75

The selected model is ARIMA(0,1,1), which has the lowest AIC and BIC. Estimated MA(1) parameter: ${\hat{θ}}_{1} = 0.5123$.

India

An ADF test applied to the Indian real interest rate level series yields a test statistic of $-2.318$ ($p = 0.171$), failing to reject the unit root null. The ADF test on the first-differenced series yields a statistic of $-5.204$ ($p < 0.001$), confirming stationarity in first differences. Table A3 reports AIC and BIC for four candidate specifications.

Table A3. ARIMA model selection — India real interest rates (1991–2015).

Model	AIC	BIC
ARIMA(0,1,0)	91.17	92.35
ARIMA(0,1,1)	89.44	91.80
ARIMA(1,1,0)	90.82	93.18
ARIMA(1,1,1)	91.03	94.57

The selected model is ARIMA(0,1,1), which has the lowest AIC and BIC. Estimated MA(1) parameter: ${\hat{θ}}_{1} = 0.3847$.

In both countries, ARIMA(0,1,1) achieves the lowest AIC and BIC, confirming the same specification as selected for the U.S.
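The selection rule can be expressed compactly from the reported criteria in Tables 5, A2, and A3. The sketch below also runs a consistency check based on the identity BIC − AIC = k(ln n − 2). The parameter counts `N_PARAMS` (which include the innovation variance) and the effective sample size `N_OBS = 24` (25 in-sample years less one lost to differencing) are assumptions made here, not values stated in the paper.

```python
import math

# (AIC, BIC) copied from Table 5 (U.S.), Table A2 (Italy), and Table A3 (India).
CRITERIA = {
    "U.S.": {
        "ARIMA(0,1,0)": (73.13, 74.31),
        "ARIMA(0,1,1)": (71.52, 73.88),
        "ARIMA(1,1,0)": (72.35, 74.71),
        "ARIMA(1,1,1)": (72.70, 76.24),
    },
    "Italy": {
        "ARIMA(0,1,0)": (79.34, 80.52),
        "ARIMA(0,1,1)": (77.61, 79.97),
        "ARIMA(1,1,0)": (78.93, 81.29),
        "ARIMA(1,1,1)": (79.21, 82.75),
    },
    "India": {
        "ARIMA(0,1,0)": (91.17, 92.35),
        "ARIMA(0,1,1)": (89.44, 91.80),
        "ARIMA(1,1,0)": (90.82, 93.18),
        "ARIMA(1,1,1)": (91.03, 94.57),
    },
}

# Assumptions for the consistency check (not stated in the paper).
N_PARAMS = {"ARIMA(0,1,0)": 1, "ARIMA(0,1,1)": 2, "ARIMA(1,1,0)": 2, "ARIMA(1,1,1)": 3}
N_OBS = 24


def best_by(table, column):
    """Return the model name minimising AIC (column 0) or BIC (column 1)."""
    return min(table, key=lambda model: table[model][column])


selected = {c: (best_by(t, 0), best_by(t, 1)) for c, t in CRITERIA.items()}

# Largest deviation between the reported BIC - AIC gap and k * (ln n - 2).
max_gap = max(
    abs((bic - aic) - N_PARAMS[model] * (math.log(N_OBS) - 2))
    for table in CRITERIA.values()
    for model, (aic, bic) in table.items()
)

print(selected)
print(f"largest deviation from k(ln n - 2): {max_gap:.4f}")
```

Under these assumptions the reported gaps agree with the identity up to rounding of the two-decimal table entries.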

References

Bai, J., and Perron, P. 1998. Estimating and testing linear models with multiple structural changes. Econometrica 66(1): 47–78.
Bai, J., and Perron, P. 2002. Computation and analysis of multiple structural change models. Journal of Applied Econometrics 18(1): 1–22.
Box, G.E.P., Jenkins, G.M., Reinsel, G.C., and Ljung, G.M. 2015. Time Series Analysis: Forecasting and Control, 5th ed.; Wiley.
Cox, J.C., Ingersoll, J.E., and Ross, S.A. 1985. A theory of the term structure of interest rates. Econometrica 53: 385–407.
Diebold, F.X., and Li, C. 2006. Forecasting the term structure of government bond yields. Journal of Econometrics 130(2): 337–364.
Fergusson, K., Sun, J., Platen, E., and Shevchenko, P.V. 2025. Fair pricing and reserving of variable annuities with guarantees under the benchmark approach. Scandinavian Actuarial Journal: 1–34.
Fernández-Rodríguez, F. 2006. Interest rate term structure modeling using Free-Knot Splines. The Journal of Business 79(6): 3083–3099.
Goudenege, L., Molent, A., Wei, X., and Zanette, A. 2025. Enhancing valuation of variable annuities in Lévy models with stochastic interest rate. Scandinavian Actuarial Journal 2025(2): 213–235.
Hamilton, J.D. 1989. A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57: 357–384.
Harvey, D., Leybourne, S., and Newbold, P. 1997. Testing the equality of prediction mean squared errors. International Journal of Forecasting 13(2): 281–291.
Kellison, S.G. 2009. The Theory of Interest, 3rd ed.; McGraw-Hill.
Li, S., Yin, C., Zhao, X., and Dai, H. 2017. Stochastic interest model based on compound Poisson process and applications in actuarial science. Mathematical Problems in Engineering, 2017: Article 3472319.
McCulloch, J.H. 1971. Measuring the term structure of interest rates. The Journal of Business 44(1): 19–31.
McCulloch, J.H. 1975. The Tax-Adjusted yield curve. The Journal of Finance 30(3): 811–830.
Milidonis, A., and Chisholm, K. 2024. The regime-switching structural default risk model. Risks 12(3): 48.
Mo, X., Qin, G., and Ou, H. 2023. Actuarial calculation of annuities under Markov stochastic interest rate model. Chinese Journal of Applied Probability and Statistics 39(6): 791–801.
Nelson, C.R., and Siegel, A.F. 1987. Parsimonious modeling of yield curves. Journal of Business 60(4): 473–489.
Ngugnie Diffouo, P.M., and Devolder, P. 2020. Longevity risk measurement of life annuity products. Risks 8(1): 31.
Romagnoli, S., and Santoro, S. 2017. Interest rates term structure under ambiguity. Risks 5(3): 50.
Social Security Administration. 2020. Period Life Table, 2020. Available online: https://www.ssa.gov/oact/STATS/table4c6.html (accessed on 12 May 2026).
Tsay, R.S. 2010. Analysis of Financial Time Series, 3rd ed.; Wiley.
Vasicek, O. 1977. An equilibrium characterization of the term structure. Journal of Financial Economics 5(2): 177–188.
World Bank. 2024. Real interest rate (%). World Development Indicators. Available online: https://data.worldbank.org/indicator/FR.INR.RINR (accessed on 12 May 2026).

Figure 1. U.S. piecewise-constant approximation of real interest rates, illustrating regime shifts over the sample period.

Figure 2. Piecewise-linear approximation of U.S. real interest rates, capturing within-regime trends and structural changes.

Figure 3. Global cubic polynomial fit of U.S. real interest rates, showing smooth long-term trends.

Figure 4. Piecewise-cubic approximation of U.S. real interest rates, capturing nonlinear dynamics and regime-specific behavior.

Figure 5. ARIMA(0,1,1) fit with 95% confidence intervals for U.S. real interest rates: estimated on 1991–2015 (in-sample) with 2016–2021 shown as forecast period.

Table 1. Estimated coefficients for the U.S. piecewise linear model.

Regime	Years	$a_{j}$	$b_{j}$
1	1991–1994	$- 0.000390$	$0.043710$
2	1995–2000	$0.000617$	$0.062589$
3	2001–2008	$0.001139$	$0.018807$
4	2009–2015 $*$	$- 0.000239$	$0.023554$

*

Regime 4 spans 2009–2021 in the full sample; the model is estimated on the in-sample period (1991–2015) only, so the table shows the estimation window. Out-of-sample forecasts for 2016–2021 are generated using the Regime 4 parameters.
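The coefficients in Table 1 can be turned into fitted and projected rates as follows. This sketch assumes that the fitted rate in regime $j$ is $a_{j} t + b_{j}$ in decimal form with the global index $t = \text{year} - 1991$. That reading is an assumption, although the regime means it implies agree with the regime averages of the U.S. column in Table A1. The function name is illustrative.

```python
# (first_year, last_year, a_j, b_j) copied from Table 1; rates are decimals.
REGIMES = [
    (1991, 1994, -0.000390, 0.043710),
    (1995, 2000, 0.000617, 0.062589),
    (2001, 2008, 0.001139, 0.018807),
    (2009, 2015, -0.000239, 0.023554),
]
BASE_YEAR = 1991  # assumption: global time index t = year - 1991


def piecewise_linear_rate(year):
    """Fitted rate for in-sample years; Regime 4 extrapolation afterwards."""
    t = year - BASE_YEAR
    for first, last, slope, intercept in REGIMES:
        if first <= year <= last:
            return slope * t + intercept
    if year > REGIMES[-1][1]:
        # Out-of-sample years reuse the Regime 4 parameters (Table 1 footnote).
        _, _, slope, intercept = REGIMES[-1]
        return slope * t + intercept
    raise ValueError("year precedes the start of the sample")


forecasts = {year: piecewise_linear_rate(year) for year in range(2016, 2022)}
for year, rate in forecasts.items():
    print(year, f"{100 * rate:.4f}%")
```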

Table 2. Bootstrap 95% confidence intervals for piecewise linear model coefficients (U.S.).

Regime	Years	$a_{j}$ (slope) estimate	$a_{j}$ 95% CI	$b_{j}$ (intercept) estimate	$b_{j}$ 95% CI
1	1991–1994	$- 0.000390$	$[- 0.005558, 0.004778]$	$0.043710$	$[0.034891, 0.052269]$
2	1995–2000	$0.000617$	$[- 0.000509, 0.001736]$	$0.062589$	$[0.055995, 0.070403]$
3	2001–2008	$0.001139$	$[- 0.002194, 0.004730]$	$0.018807$	$[- 0.030351, 0.065365]$
4	2009–2015 $*$	$- 0.000239$	$[- 0.002153, 0.001583]$	$0.023554$	$[- 0.013740, 0.064194]$

*

Estimated on the in-sample window only; the full Regime 4 spans 2009–2021. See footnote to Table 1.

Table 3. Estimated coefficients of the U.S. cubic polynomial model.

$α_{0}$	$α_{1}$	$α_{2}$	$α_{3}$
0.038902	0.008007	$- 0.000905$	0.000023

Table 4. Estimated coefficients for the U.S. piecewise cubic model. Note: coefficients are estimated using the global time index ($t = 0$ in 1991) and are not individually interpretable as interest rate levels.

Regime	Years	$c_{j, 0}$	$c_{j, 1}$	$c_{j, 2}$	$c_{j, 3}$
1	1991–1994	0.049200	$0.010717$	$0.001300$	0.001617
2	1995–2000	0.108170	$0.023307$	0.003990	$0.000212$
3	2001–2008	2.770151	$0.616866$	0.045421	$0.001094$
4	2009–2015 $*$	1.397800	$0.170212$	0.006774	$0.000086$

*

Regime 4 spans 2009–2021 in the full sample; the model is estimated on the in-sample period (1991–2015) only. See footnote to Table 1.

Table 5. ARIMA model selection — U.S. real interest rates (1991–2015).

Model	AIC	BIC
ARIMA(0,1,0)	73.13	74.31
ARIMA(0,1,1)	71.52	73.88
ARIMA(1,1,0)	72.35	74.71
ARIMA(1,1,1)	72.70	76.24

The selected model is ARIMA(0,1,1), which has the lowest AIC and BIC.

Table 6. Vasicek model calibration — in-sample period 1991–2015.

Country	$κ$	Half-life	$θ$	$σ$
U.S.	0.1542	4.50 yrs	3.2488%	1.1029%
Italy	0.1505	4.61 yrs	4.2056%	1.3931%
India	0.6890	1.01 yrs	5.7005%	3.0925%
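The half-life column follows from $\ln 2 / κ$. The sketch below also shows one plausible way to generate Vasicek out-of-sample forecasts: the conditional mean $θ + (r_{T} - θ)e^{-κh}$ started from the last in-sample observation. That forecasting rule is an assumption made here rather than a description of the authors' code, although for the U.S. it yields an OOS RMSE close to the 1.6958 pp reported in Tables 11 and 17.

```python
import math

# (kappa, theta in %) copied from Table 6.
VASICEK = {
    "U.S.": (0.1542, 3.2488),
    "Italy": (0.1505, 4.2056),
    "India": (0.6890, 5.7005),
}


def half_life(kappa):
    """Years for a deviation from theta to halve under mean reversion."""
    return math.log(2) / kappa


def vasicek_mean_forecast(r_last, kappa, theta, horizon):
    """Conditional mean of the Vasicek process `horizon` years ahead."""
    return theta + (r_last - theta) * math.exp(-kappa * horizon)


half_lives = {country: half_life(k) for country, (k, _) in VASICEK.items()}

# U.S. values from Table A1: last in-sample rate (2015) and the 2016-2021 test years.
US_2015 = 2.48
US_ACTUAL = [2.64, 2.37, 2.46, 3.72, 2.85, -1.09]

kappa_us, theta_us = VASICEK["U.S."]
us_forecasts = [
    vasicek_mean_forecast(US_2015, kappa_us, theta_us, h) for h in range(1, 7)
]
us_rmse = math.sqrt(
    sum((f - a) ** 2 for f, a in zip(us_forecasts, US_ACTUAL)) / len(US_ACTUAL)
)

print({c: round(v, 2) for c, v in half_lives.items()})
print(f"U.S. OOS RMSE under the conditional-mean assumption: {us_rmse:.4f} pp")
```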

Table 7. CIR model calibration — in-sample period 1991–2015 (exact non-central chi-squared MLE).

Country	$κ$	Half-life	$θ$	$σ$	Feller
U.S.	0.1780	3.89 yrs	3.3370%	0.5749%	Satisfied ( $2 κ θ = 1.1880 > σ^{2} = 0.3305$ )
Italy	0.1691	4.10 yrs	4.3050%	0.5233%	Satisfied ( $2 κ θ = 1.4558 > σ^{2} = 0.2738$ )
India	0.6890	1.01 yrs	5.7005%	3.0925%	Not satisfied ( $2 κ θ = 7.8587 < σ^{2} = 9.5635$ )^†

^† The Feller condition is violated for India under the exact MLE estimates. The Indian in-sample real-rate series also contains a negative observation, whereas the standard CIR process is defined for non-negative rates; the exact non-central chi-squared log-likelihood did not fully converge for this calibration. The India CIR estimates represent the best-attained solution and are retained only as a diagnostic benchmark; they should not be interpreted as a structurally valid model. AIC is therefore not reported for India in Table 16.
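The Feller column of Table 7 is a direct arithmetic comparison. The sketch below reproduces it in the units used by the table ($θ$ and $σ$ entered as percentages), comparing $2κθ$ with $σ^{2}$; the function name is illustrative.

```python
# (kappa, theta in %, sigma in %) copied from Table 7.
CIR_PARAMS = {
    "U.S.": (0.1780, 3.3370, 0.5749),
    "Italy": (0.1691, 4.3050, 0.5233),
    "India": (0.6890, 5.7005, 3.0925),
}


def feller_check(kappa, theta, sigma):
    """Return (2*kappa*theta, sigma^2, satisfied) in the units of Table 7."""
    lhs = 2.0 * kappa * theta
    rhs = sigma ** 2
    return lhs, rhs, lhs > rhs


feller = {country: feller_check(*p) for country, p in CIR_PARAMS.items()}

for country, (lhs, rhs, ok) in feller.items():
    status = "satisfied" if ok else "not satisfied"
    print(f"{country:6s} 2*kappa*theta = {lhs:.4f}  sigma^2 = {rhs:.4f}  {status}")
```

Small differences from the printed values in the last decimal place are to be expected, since the table entries are themselves rounded.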

Table 8. Comparison of discrete valuation methods. Computed using the U.S. historical real interest rate path (1991–2021) under the respective discounting methods.

Method	$n = 10$	$n = 20$	$n = 30$
Portfolio Rate Method	7.67730	12.50397	16.29246
Yield Curve Method	7.32552	13.60084	19.34917
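The portfolio rate row of Table 8 can be checked against the U.S. column of Table A1. The sketch below assumes that the portfolio rate method discounts the payment at time $k$ by the cumulative product of $(1 + i_{s})^{-1}$ for $s = 1, \ldots, k$, as in equation (25) without survival weights. Under that reading the 10-year value agrees with Table 8 to the digits shown. The yield curve method is not reproduced here because its definition lies outside this section.

```python
# U.S. real interest rates in percent, 1991-2021, copied from Table A1.
US_RATES_PCT = [
    4.92, 3.88, 3.55, 4.90, 6.59, 6.32, 6.60, 7.15, 6.49, 6.81,
    4.57, 3.07, 2.11, 1.61, 2.96, 4.73, 5.20, 3.10, 2.62, 2.01,
    1.16, 1.36, 1.75, 1.59, 2.48, 2.64, 2.37, 2.46, 3.72, 2.85,
    -1.09,
]


def annuity_pv_portfolio(rates_pct, n):
    """PV of n unit payments in arrears, discounted year by year."""
    if n > len(rates_pct):
        raise ValueError("horizon exceeds the available rate path")
    pv = 0.0
    discount = 1.0
    for rate in rates_pct[:n]:
        discount /= 1.0 + rate / 100.0   # extend the cumulative discount factor
        pv += discount                   # unit payment at the end of the year
    return pv


pv10 = annuity_pv_portfolio(US_RATES_PCT, 10)
print(f"PV(10) = {pv10:.5f}")
for n in (20, 30):
    print(f"PV({n}) = {annuity_pv_portfolio(US_RATES_PCT, n):.5f}")
```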

Table 11. U.S. out-of-sample performance: full period vs. excluding 2020. All values in percentage points.

Model	RMSE (full)	RMSE (excl. 2020)	MAE (excl. 2020)
Constant mean	2.3116	2.4883	1.8812
Piecewise constant	1.5490	1.6372	1.3443
Piecewise linear	1.5674	1.6326	1.4065
Cubic polynomial	3.8742	3.9783	2.4852
Piecewise cubic	3.4650	3.5545	2.5590
ARIMA(0,1,1)	1.6979	1.8597	1.2186
Vasicek	1.6958	1.8575	1.1178
CIR	1.7328	1.8973	1.1395
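Two rows of Table 11 can be reproduced from Table A1 alone. The sketch assumes that the constant mean forecast is the 1991–2015 in-sample average and that the piecewise constant forecast is the average of the final in-sample regime (2009–2015, as in Table 1). Both are assumptions about the forecasting rule, although they return values matching the table.

```python
import math

# U.S. real interest rates in percent, copied from Table A1.
US_IN_SAMPLE = [
    4.92, 3.88, 3.55, 4.90, 6.59, 6.32, 6.60, 7.15, 6.49, 6.81,
    4.57, 3.07, 2.11, 1.61, 2.96, 4.73, 5.20, 3.10, 2.62, 2.01,
    1.16, 1.36, 1.75, 1.59, 2.48,
]  # 1991-2015
US_TEST = {2016: 2.64, 2017: 2.37, 2018: 2.46, 2019: 3.72, 2020: 2.85, 2021: -1.09}


def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(errors):
    return sum(abs(e) for e in errors) / len(errors)


def oos_metrics(forecast):
    """Return (RMSE full, RMSE excl. 2020, MAE excl. 2020) for a flat forecast."""
    full = [forecast - actual for actual in US_TEST.values()]
    excl = [forecast - actual for year, actual in US_TEST.items() if year != 2020]
    return rmse(full), rmse(excl), mae(excl)


constant_mean = sum(US_IN_SAMPLE) / len(US_IN_SAMPLE)
last_regime_mean = sum(US_IN_SAMPLE[-7:]) / 7   # 2009-2015 regime average

results = {
    "Constant mean": oos_metrics(constant_mean),
    "Piecewise constant": oos_metrics(last_regime_mean),
}
for model, (rmse_full, rmse_excl, mae_excl) in results.items():
    print(f"{model:20s} {rmse_full:.4f}  {rmse_excl:.4f}  {mae_excl:.4f}")
```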

Table 18. Mortality-adjusted life annuity present values for a male aged 65 (SSA 2020 period life table), 31-year horizon. PV errors are computed relative to the historical path benchmark. “No mortality” values are reproduced from Table 9 (PV(31)) and Table 13 (PV(31)) for reference.

	U.S.			Italy
Model	PV (no mort.)	PV (life ann.)	Error (%)	PV (no mort.)	PV (life ann.)	Error (%)
Historical path	16.632	11.310	—	13.784	9.482	—
Constant mean	17.807	12.232	$+ 8.15$	15.401	10.913	$+ 15.09$
Piecewise constant	16.721	11.352	$+ 0.37$	13.731	9.454	$- 0.30$
Piecewise linear	16.687	11.328	$+ 0.16$	13.675	9.439	$- 0.45$
Vasicek	16.812	11.465	$+ 1.37$	13.952	9.638	$+ 1.64$
CIR	16.830	11.482	$+ 1.52$	13.972	9.651	$+ 1.78$

Life annuity present values are computed using equation (25) with survival probabilities from the SSA (2020) period life table for males. The 31-year horizon covers ages 65–96. Mortality data source: Social Security Administration (2020).
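The error column of Table 18 can be recomputed from the present values in the same table, and the same calculation applied to the "no mortality" column gives the pure-PV error for comparison. This is a sketch using the rounded three-decimal table values, so results may differ from the published figures in the last digit.

```python
# (PV no mortality, PV life annuity) copied from Table 18.
TABLE_18 = {
    "U.S.": {
        "Historical path": (16.632, 11.310),
        "Constant mean": (17.807, 12.232),
        "Piecewise constant": (16.721, 11.352),
        "Piecewise linear": (16.687, 11.328),
        "Vasicek": (16.812, 11.465),
        "CIR": (16.830, 11.482),
    },
    "Italy": {
        "Historical path": (13.784, 9.482),
        "Constant mean": (15.401, 10.913),
        "Piecewise constant": (13.731, 9.454),
        "Piecewise linear": (13.675, 9.439),
        "Vasicek": (13.952, 9.638),
        "CIR": (13.972, 9.651),
    },
}


def pct_errors(table):
    """Percent errors (pure PV, life annuity) relative to the historical path."""
    base_pure, base_life = table["Historical path"]
    return {
        model: (100.0 * (pure / base_pure - 1.0), 100.0 * (life / base_life - 1.0))
        for model, (pure, life) in table.items()
        if model != "Historical path"
    }


errors = {country: pct_errors(table) for country, table in TABLE_18.items()}

for country, by_model in errors.items():
    for model, (pure_err, life_err) in by_model.items():
        print(f"{country:6s} {model:20s} pure {pure_err:+.2f}%  life {life_err:+.2f}%")
```

Printing both columns side by side makes it easy to see, model by model, whether mortality weighting shrinks or enlarges the relative error.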

Table 19. Recommended interest rate model class by regime type.

Setting	Regime Characteristics	Best OOS Model	Key Reason
Country-Specific Evidence
U.S.	Stable, sustained gradual decline	Piecewise constant / linear	Structural segmentation generalises well
Italy	Stable, rapid structural decline	ARIMA	Local mean-tracking prevents trend over-extrapolation
India	High-volatility, non-trending	Constant mean / Vasicek	Level stability and mean-reversion prevent runaway extrapolation
General Decision Rules
Stable, gradual trend	Trending, structurally stable	Piecewise linear	Accuracy + transparency
Stable, sharp trend	Strong or rapid declining trend	ARIMA / local-level forecasting model	Level correction avoids overextrapolation
Volatile / unstable	High volatility, no persistent trend	Vasicek or constant mean	Mean-reversion is an asset

Practical guidance: Precede model selection with a structural stability assessment (e.g., Bai–Perron test or sub-sample ADF test). The appropriate model class depends on both the direction and stability of the rate regime. For series containing negative real interest rates, CIR should be used only with an appropriate shifted specification or treated as a diagnostic benchmark.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
