1. Introduction
The yield curve remains a useful guide for financial market participants to the level of interest rates and to economic expectations. It is also referred to as the term structure of interest rates: a representation of yields plotted against a cross-section of unexpired maturities. Bonds and their yields serve as a relative pricing benchmark for other financial instruments. Traders of fixed income securities such as credit default swaps, loans, and related derivatives rely on the yield curve to price their trades. The yield is the discount rate at which the cash flows of a fixed income security are equated to the present value of the instrument at a point in time. The yield depends on short-term interest rates, for which, in the South African (SA) context, a short rate such as the repo rate set by the Monetary Policy Committee of the SA Reserve Bank serves as a proxy.
The yield is linked to a combination of observable and unobservable (latent) factors. These may be macroeconomic variables such as inflation and gross domestic product, yield curve factors, or the three statistical factors of level, slope, and curvature, as per the empirical evidence of [2]. They evaluate methods of constructing yield curve factors and their loadings, such as principal component analysis, the dynamic Nelson-Siegel model and affine term structure models (ATSMs). The yield curve factors, in particular, refer to changes in the instantaneous (short) rate of interest and its volatility.
Our study, which motivates the choice for ATSMs, seeks to address the cross-sectional yield dynamics with special attention to conditional volatility and correlation in a multi-factor setting. This would enable us to study several active SA Treasury bonds of different maturities simultaneously. Moreover, these dynamics have a critical impact on the pricing of bond options, risk management (hedging), and measurement such as calculation of value-at-risk. Stochastic volatility models of [
1,
3], and [
4] among others, consider the interest rate volatility to be a latent factor from which many properties of the yield may be observed.
A recent study by [
5] models the SA government bond yield using the interpolation method of Nelson-Siegel. Their report cites a remarkable statement that SA bond market is considered to be a leader within the emerging markets, and also ranked the highest in terms of liquidity. In addition, our review of the BIS quarter 4 of 2023 statistics highlights an amount of
billions of debt securities issued by the SA central government, in all markets at all original maturities denominated in domestic currency at nominal value; see [
6]. Whereas the Nelson-Siegel method produced good-fitting results in their study, our aim is to exploit the econometric modelling capabilities of ATSMs to model SA bond yields. ATSMs provide a tractable and flexible framework for describing the dynamic behaviour of the entire term structure of interest rates by modelling the evolution of short-term and long-term rates simultaneously. They have mathematical properties that make them analytically tractable. In particular, they can accommodate stochastic volatility, jumps and correlation among the risk factors that drive asset returns. Affine representations of state variables in dynamic asset pricing models (DAPMs) are popular because they lead to computationally tractable pricing relations and moment equations that can be used in estimation; see [
7].
The traditional practice of capturing bond yield movements over time, together with some macroeconomic variables, has relied on methods such as vector autoregression. However, such methods are limited in their treatment of certain aspects of bonds and yields. First, a bond is an asset, and the same bond may be traded at several different maturities at the same time. Second, holding long-dated bonds over a short horizon is risky, and investors may require compensation for this risk. In the absence of arbitrage, long-term yields are expected to represent a risk-adjusted average of a cross-section of short-term yields plus a risk premium. Third, yields are not normally distributed, which makes it difficult to compute the expected values of future short rates. ATSMs provide a solution to these problems; see [
8].
ATSMs commence with the presumption that the instantaneous short rate $r(t)$ is an affine function of an $N$-dimensional state vector $Y(t)$, $r(t) = \delta_0 + \delta_y^{\top} Y(t)$, and that $Y(t)$ follows Gaussian or square-root diffusions. Affine models were extended by [
9] into consistent and arbitrage-free multifactor model of the term structure of interest rates in which yields at selected fixed maturities follow a parametric multivariate Markov diffusion process with "stochastic volatility." [
1] explore the structural differences and relative goodness-of-fit of the ATSMs. In particular, they consider a trade-off between the following modelling issues. First, the economic representation of both short and long-term dynamics of the state variable when studied simultaneously. Second, the computational burden of estimation and curve fitting. This trade-off is formalized by their classification of
$N$-factor affine family into $N+1$ non-nested subfamilies of models. Their special attention to three-factor ATSMs suggests, based on theoretical considerations and empirical evidence, that some subfamilies of ATSMs are better suited than others to explaining historical interest rate behaviour. A practical difficulty arises when the conditional probability density of yields does not exist in closed form, which makes maximum likelihood infeasible. To circumvent this, a feasible choice is an estimation method such as a Fourier-based method, the generalised method of moments (GMM), Markov chain Monte Carlo (MCMC) or the simulated method of moments (SMM), among others. Because the conditional likelihood function of the latent state vector $Y(t)$ may not be known, [
1] follow the SMM of [
10].
The remainder of the paper is structured as follows:
Section 2 reviews the relevant literature.
Section 3 describes the ATSMs.
Section 4 defines a canonical representation of the ATSM and the restrictions that are imposed on the parameters.
Section 5 gives an overview of the three-factor ATSMs.
Section 6 briefly reviews the characteristic function-based estimation methods for the ATSMs.
Sections 7 to 10 discuss data collection, scenario determination, model implementation and analysis of the results.
Section 11 concludes.
2. Literature Review
This paper explores the behaviour of the SA interest rates in terms of historical time series and a cross-section of yields across a maturity spectrum. Inspired by the seminal work of [
1], we proceed by implementing their model and its maximal counterpart. Among three options of models tested on US Treasury swaps, their
was found to perform better than other models, followed by their
. Initially, they consider a comprehensive framework for the specification, analysis, and classification of ATSMs. They provide a complete characterisation of admissible and identified ATSMs from which it is required that sufficient general conditions exist; see [
11], who describe the regular affine process. They also characterise the sufficient general conditions that must be met for a process to be affine; see [
7,
8] among others.
ATSMs are among popular models in the vast literature on interest rates term structure and bond pricing. Few examples are the early generation consisting of a single-factor Gaussian of [
12], and a square-root process by [
13], extended by [
14] into a multi-factor. The next generation are the correlated mixture affine models of [
1,
9], among others. The reason for their popularity is the ability to accommodate stochastic volatility, jumps and correlations among factors driving the asset returns; and lead to computationally tractable closed-form prices, and estimation through moment equations; see [
7]. Among the research problems addressed using ATSMs is the description and treatment of the co-movement of short and long-term bond yields. An affine process $Y$ is defined as one in which the conditional mean $\mu(Y)$ and variance $\sigma^2(Y)$ are affine functions of $Y$. The process is further defined and characterised by [11] as a regular affine process, a class of time-homogeneous Markov processes. They consider a state space $D = \mathbb{R}^m_+ \times \mathbb{R}^n$, for integers $m \ge 0$ and $n \ge 0$, on which the logarithm of the characteristic function of the transition probability $p_t(x, \cdot)$ of such a process is affine with respect to the initial state $x \in D$. [
7] conveniently formalises it in terms of their exponential-affine Fourier (for continuous-time) and Laplace (for discrete-time) transforms. The affine relationship is defined by coefficients which are solved by a family of ordinary differential equations (ODEs). These ODEs are the essence of tractability of regular affine processes. [
9] apply the ODEs as time-dependent drivers of the solution to a zero-coupon bond, provided the parameters are admissible. An inverted form of these zero-coupon bond gives rise to a yield as a state variable. They also exploit the idea of a yield-only analysis without including additional economic variables as latent factors.
[
15,
16] are among several authors who have approached the application of ATSMs in discrete-time although they are known to have less popularity compared to their continuous-time counterparts. Earlier models exhibited a tendency of perfectly correlated returns of bonds of all maturities, which is an unrealistic behaviour and unsuitable for hedging; see [
17]. Several authors extended this one-factor Markov representation of the short rate by introducing a range of multi-factor models in which the long-run mean $\theta(t)$ and the stochastic volatility $v(t)$ of $r(t)$ are affine functions of $(r(t), \theta(t), v(t))$, for which [1] explore several specifications. [
18] endorse a parsimonious representation of the yield curve matching the time series and cross-sectional variation of bond yields through three-factor models. They develop a simple estimation approach by exploiting the exponential-affine structure of these models; see also [
19] on the stochastic mean and stochastic volatility three-factor model of the term structure of interest rates and its applications in derivatives pricing and risk management.
A specification of an ATSM should be "admissible" and therefore lead to well-defined bond prices. The admissibility property is completely characterised by [11] in the "canonical" state space $D = \mathbb{R}^m_+ \times \mathbb{R}^n$, with a non-negative diagonal matrix. However, this property requires parameter restrictions on the affine process to ensure that it is well defined. A typical example is the restriction of parameters to ensure that the conditional variance of a state variable remains non-negative. The requirements for admissibility become more complex as the number of state variables determining conditional variances increases; see [7]. The admissibility condition ensures that the process does not exit the domain $D$. The $A_M(N)$ models with domain $\mathbb{R}^M_+ \times \mathbb{R}^{N-M}$ are a common admissible family, in which $M$ factors evolve in a positive state space while $N-M$ factors evolve in an unrestricted space; see [20]. [1] verify this through admissible $N$-factor ATSMs that are uniquely classified into $N+1$ non-nested subfamilies.
Admissible models should also be canonical, meaning that they are econometrically identified and maximally flexible; see [7]. As a result, the $A_M(N)$ benchmark ATSMs should have a canonical representation and also satisfy the non-negative and non-explosive solution of ([21]). Their drift should satisfy a Lipschitz condition, and the diffusion should satisfy the uniqueness condition of [22]; see [8]. These conditions have the effect of restricting the correlation structure of the affine diffusions. Even with the Gaussian and square-root forms of diffusion, the regularity conditions of non-explosive growth and uniqueness may not be satisfied, which gives rise to the need for a Feller condition; see [8]. A multi-dimensional extension of the Feller condition was implemented by [9] and was found to handle general correlated affine diffusions. The condition ensures that only positive factors enter the volatility $S(t)$. This involves restrictions on the state variables that prevent the instantaneous conditional variances $[S(t)]_{ii}$ from becoming negative. This condition is sufficient for the existence of a unique solution to the affine SDE according to [11].
For each of the $N+1$ subfamilies, there exists a maximal model that econometrically nests all other models within the subfamily. They further describe the maximal models in relation to the $A_M(N)$ classification, and highlight an interaction within the family of ATSMs between the dependence of the conditional variance of each $Y_i(t)$ on $Y(t)$ and the admissible structure of the correlation matrix for $Y$. A key advantage of maximal models is that they avoid the overidentifying restrictions that are otherwise imposed on yield curve dynamics; see [1]. The admissibility property is also confirmed by the no-arbitrage solution for a zero-coupon bond following [9].
[1] apply the continuous-time approach to ATSMs, which is the approach taken by the majority of the empirical literature. They explore the structural differences and relative goodness-of-fit of ATSMs, and point to a trade-off between flexibility in modelling the conditional correlations and the volatilities of the risk factors. They classify the family of $N$-factor affine models into $N+1$ non-nested subfamilies. For their three-factor ATSMs, empirical analysis suggests that some subfamilies are better suited than others to explaining historical interest rate behaviour.
The focus of the research is to implement the specifications of [
1] to test the pricing of zero-coupon bonds and forecasting the yield curve dynamics when using the SA bond yield. It also attempts to extract the latent factors from the yield itself, without any consideration for other economic factors; see [
9]. ATSMs are proven to dominate both theoretical and empirical frameworks in term structure modelling; see [
8]. A link between the cross-sectional and time series properties is made consistent by the ATSMs. Evolution of unobserved factors from the risk-neutral dynamics of the yield are proved to have both the drift and the diffusion coefficients as affine functions of such factors by the ATSMs; see [
8]. Several methods of estimation are available and require mostly the knowledge of the joint conditional density of yields. In this study, we follow the estimation method of Fourier inversion for the characteristic function of a state variable, which is assumed to lead to a conditional density. This method leads to a closed-form solution where the maximum likelihood is an efficient estimator.
3. Model Establishment
We discuss the model in the context of admissibility of ATSMs. In the absence of arbitrage opportunities, a zero-coupon bond that matures at time $T$ is priced as
$$P(t,T) = E_t^Q\left[\exp\left(-\int_t^T r_s\,ds\right)\right], \tag{1}$$
where:
$P(t,T)$ is the price of a bond at time $t$ maturing at time $T$.
$t$ is the current or initial time at which the bond is evaluated.
$T$ is the maturity date, at which the bond pays its face value.
$s$ is a continuous time variable over which the interest rate process $r_s$ evolves.
$E_t^Q$ denotes the conditional expectation under the risk-neutral measure $Q$ given the information available at time $t$.
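To obtain an $N$-factor ATSM it is assumed that the instantaneous short rate $r(t)$ is an affine function of a vector of $N$ unobservable state variables $Y(t) = (Y_1(t), \ldots, Y_N(t))^{\top}$, written as
$$r(t) = \delta_0 + \sum_{i=1}^{N} \delta_i Y_i(t) = \delta_0 + \delta_y^{\top} Y(t), \tag{2}$$
where $\delta_0 \in \mathbb{R}$ and $\delta_y = (\delta_1, \ldots, \delta_N)^{\top} \in \mathbb{R}^N$.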
Another assumption is that $Y(t)$ follows an affine diffusion
$$dY(t) = K^Q\left(\theta^Q - Y(t)\right)dt + \Sigma\sqrt{S(t)}\,dW^Q(t). \tag{3}$$
$K^Q$ and $\theta^Q$ represent the reversion rate and central tendency (long-term mean) parameters under the risk-neutral measure, respectively. $W^Q(t)$ is an $N$-dimensional independent Brownian motion under the risk-neutral measure $Q$; $K^Q$ and $\Sigma$ are $N \times N$ matrices, which may be asymmetric or non-diagonal. $S(t)$ is a diagonal matrix with the $i$th diagonal element written as
$$[S(t)]_{ii} = \alpha_i + \beta_i^{\top} Y(t), \tag{4}$$
where $\alpha_i \in \mathbb{R}$ and $\beta_i \in \mathbb{R}^N$. The parameter $\alpha_i$ can be interpreted as an intercept, which represents the base or long-run level of the variance for the $i$th component. $\beta_i$ represents the sensitivity of the variance for the $i$th component to the state vector $Y(t)$. Together, $\alpha_i$ and $\beta_i$ must be restricted so that the conditional variance remains non-negative. The non-negativity in (4) is the core requirement for admissibility in this framework.
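As a rough illustration of the non-negativity requirement in (4), the following Python sketch evaluates the diagonal of $S(t)$ at a given state and flags states at which a conditional variance would turn negative. The function names and the numerical values of `alpha`, `beta` and the states are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
def conditional_variances(alpha, beta, y):
    """Diagonal of S(t): [S]_ii = alpha_i + beta_i' y for each component i."""
    return [a + sum(b_k * y_k for b_k, y_k in zip(b, y))
            for a, b in zip(alpha, beta)]


def is_admissible_at(alpha, beta, y):
    """True if every conditional variance is non-negative at the state y."""
    return all(s >= 0.0 for s in conditional_variances(alpha, beta, y))


# Hypothetical three-factor setup in which only the first factor drives variances.
alpha = [0.0, 1.0, 1.0]
beta = [[1.0, 0.0, 0.0],
        [0.5, 0.0, 0.0],
        [0.2, 0.0, 0.0]]

y_ok = [0.04, -0.01, 0.02]   # first (volatility) factor positive
y_bad = [-0.04, 0.00, 0.00]  # first factor negative: variance of component 1 < 0

s_ok = conditional_variances(alpha, beta, y_ok)
print(s_ok, is_admissible_at(alpha, beta, y_ok), is_admissible_at(alpha, beta, y_bad))
```

The check is pointwise only. Admissibility in the sense used here requires parameter restrictions that keep the variance-driving factors non-negative along the whole path, which a pointwise check cannot establish.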
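The drifts in (3) and conditional variances in (4) are both affine in $Y(t)$. [9] obtain the following time-dependent solution for the price of a zero-coupon bond with time to maturity $\tau = T - t$, provided that the parameters are admissible:
$$P(t,\tau) = \exp\left(A(\tau) - B(\tau)^{\top} Y(t)\right), \tag{5}$$
and the related yield is computed as
$$y(t,\tau) = -\frac{A(\tau)}{\tau} + \frac{B(\tau)^{\top} Y(t)}{\tau}, \tag{6}$$
where $A(\tau)$ and $B(\tau)$ are coefficients whose solution satisfies the following ODEs (Riccati equations):
$$\frac{dA(\tau)}{d\tau} = -\theta^{Q\top} K^{Q\top} B(\tau) + \frac{1}{2}\sum_{i=1}^{N}\left[\Sigma^{\top} B(\tau)\right]_i^2 \alpha_i - \delta_0, \tag{7}$$
$$\frac{dB(\tau)}{d\tau} = -K^{Q\top} B(\tau) - \frac{1}{2}\sum_{i=1}^{N}\left[\Sigma^{\top} B(\tau)\right]_i^2 \beta_i + \delta_y. \tag{8}$$
A solution to these ODEs is found through numerical integration, starting from the initial conditions $A(0) = 0$ and $B(0) = 0_{N \times 1}$. The risk-neutral dynamics of the short rate $r(t)$ in (2) through to (4) determine this specification of the ODEs.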
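The numerical integration can be sketched as follows, assuming a single factor ($N = 1$) so that all matrices reduce to scalars, and assuming the Riccati equations take the form given in [1] (restated in the code comments). The classical fourth-order Runge-Kutta scheme is one possible integrator; the text does not state which scheme is used. All parameter values are hypothetical. With `beta = 0` the factor is Gaussian, so the numerical solution can be checked against the known closed form for that case.

```python
import math


def riccati_rhs(A, B, kq, thq, sigma, alpha, beta, d0, d1):
    """One-factor Riccati right-hand sides (assumed form, following [1]):
    dA/dtau = -thq*kq*B + 0.5*(sigma*B)^2*alpha - d0
    dB/dtau = -kq*B     - 0.5*(sigma*B)^2*beta  + d1
    """
    sb2 = (sigma * B) ** 2
    dA = -thq * kq * B + 0.5 * sb2 * alpha - d0
    dB = -kq * B - 0.5 * sb2 * beta + d1
    return dA, dB


def solve_riccati(tau, params, steps=2000):
    """Integrate from A(0) = B(0) = 0 up to maturity tau with classical RK4."""
    h = tau / steps
    A, B = 0.0, 0.0
    for _ in range(steps):
        k1 = riccati_rhs(A, B, *params)
        k2 = riccati_rhs(A + 0.5 * h * k1[0], B + 0.5 * h * k1[1], *params)
        k3 = riccati_rhs(A + 0.5 * h * k2[0], B + 0.5 * h * k2[1], *params)
        k4 = riccati_rhs(A + h * k3[0], B + h * k3[1], *params)
        A += h / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        B += h / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
    return A, B


def zero_yield(tau, y0, params):
    """Yield implied by the price exp(A - B*y0): -(A - B*y0)/tau."""
    A, B = solve_riccati(tau, params)
    return (-A + B * y0) / tau


# Hypothetical parameters: (K^Q, theta^Q, Sigma, alpha, beta, delta_0, delta_y)
params = (0.5, 0.08, 0.02, 1.0, 0.0, 0.0, 1.0)
tau, y0 = 5.0, 0.07
A_num, B_num = solve_riccati(tau, params)

# Closed form for the Gaussian case (alpha = 1, beta = 0), used only as a check.
kq, thq, sigma = params[0], params[1], params[2]
B_cf = (1.0 - math.exp(-kq * tau)) / kq
A_cf = (thq - sigma**2 / (2.0 * kq**2)) * (B_cf - tau) - sigma**2 * B_cf**2 / (4.0 * kq)

price = math.exp(A_num - B_num * y0)
yld = zero_yield(tau, y0, params)
print(A_num, A_cf, B_num, B_cf, price, yld)
```

For the multi-factor case the same loop applies with `A` a scalar and `B` an $N$-vector; a library ODE solver would be a natural replacement for the hand-written RK4 step.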
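To use the closed-form representation of (1) in the empirical study of ATSMs, it is required that the distributions of $Y(t)$ and $P(t,\tau)$ under the actual (physical) measure $P$ be known. To this end, a market price of risk $\Lambda(t)$ is introduced as
$$\Lambda(t) = \sqrt{S(t)}\,\lambda, \tag{9}$$
where $\lambda$ is an $N \times 1$ vector of constants. The process $Y(t)$ under the physical measure $P$ therefore also has an affine form
$$dY(t) = K\left(\theta - Y(t)\right)dt + \Sigma\sqrt{S(t)}\,dW(t). \tag{10}$$
Note that the superscript $Q$ has been removed. $W(t)$ is an $N$-dimensional vector of independent Brownian motions under $P$, $K = K^Q - \Sigma\Phi$, and $\theta = K^{-1}\left(K^Q\theta^Q + \Sigma\psi\right)$. $\Phi$ is the $N \times N$ matrix with $\lambda_i\beta_i^{\top}$ in its $i$th row, and $\psi$ is an $N$-vector with $\lambda_i\alpha_i$ as its $i$th element.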
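The mapping between risk-neutral and physical parameters can be illustrated for one factor. The sketch below assumes the completely affine specification of [1], in which $K = K^Q - \Sigma\Phi$ and $\theta = K^{-1}(K^Q\theta^Q + \Sigma\psi)$, with $\Phi = \lambda\beta$ and $\psi = \lambda\alpha$ in the scalar case, and the sign convention $dW^Q = dW + \Lambda\,dt$ that is consistent with those formulas. The function names and parameter values are hypothetical.

```python
def physical_parameters(kq, thq, sigma, alpha, beta, lam):
    """One-factor map from (K^Q, theta^Q) to (K, theta) under P."""
    phi = lam * beta    # scalar analogue of the row lambda_i * beta_i'
    psi = lam * alpha   # scalar analogue of the element lambda_i * alpha_i
    k = kq - sigma * phi
    theta = (kq * thq + sigma * psi) / k
    return k, theta


def drift_p_direct(y, kq, thq, sigma, alpha, beta, lam):
    """P-drift built directly: Q-drift plus Sigma * S(t) * lambda,
    with S(t) = alpha + beta * y (assumed sign convention)."""
    return kq * (thq - y) + sigma * (alpha + beta * y) * lam


# Hypothetical square-root factor (alpha = 0, beta = 1) with a negative lambda.
kq, thq, sigma, alpha, beta, lam = 0.5, 0.08, 0.1, 0.0, 1.0, -0.5
k, theta = physical_parameters(kq, thq, sigma, alpha, beta, lam)
print(k, theta)
```

The two ways of writing the physical drift agree at every state, which is what preserving the affine structure under $P$ means in practice.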
[
1] acknowledge that their main purpose of (
9) is to preserve the affine structure of $Y(t)$ under $P$. They do not pursue the impact of the market price of risk on the forecast but focus only on the correlation and volatility dynamics of the state variables. The square-root form is followed here to ensure non-negativity of the price of risk. Their form is referred to as "completely affine", which has been found to have limitations as far as the pricing of risk is concerned. As a result, some authors extended the completely affine form of the market price of risk to address various factors that affect the price of risk itself; see [
25,
26] and [
27] among others. Our approach focusses on the econometric representation of state variables, which we assume to incorporate a detailed market price of risk. In spite of this, we follow in the footsteps of [
1] where the main focus was to address the specification problems considering both correlation and volatility on the cross-section of yield data. Our workings would not focus on analysing the market price of risk but rather leave it to the future research which might incorporate other forms suitable for SA yield curve. We also support the idea of extending a detailed market price of risk to the emerging markets to respond to factors such as volatility, liquidity and bond credit risk; see the recent report by [
28]. Their report addresses a joint modelling of liquidity and credit risk for SA bond market, where they recommend their model or similar approach for emerging markets and corporate bonds.
4. A Canonical Representation of ATSMs
According to [1], a general specification for (10) may not always lead to positive conditional variances over the range of $Y$, given an arbitrary set of parameters $(K, \theta, \Sigma, \mathcal{B}, \alpha)$. However, admissibility requires that the parameters restrict $[S(t)]_{ii}$ in (10) to be strictly positive for all $i$, where $\mathcal{B} = (\beta_1, \ldots, \beta_N)$ denotes the matrix of coefficients on $Y$ in $S(t)$.
From (4), there is a special case with no admissibility problem when $\beta_i = 0$ for all $i$, since the instantaneous conditional variances are then all constant. Outside this special case, it is necessary to impose constraints on the drift parameters $K$ and $\theta$, and on the diffusion coefficients $\Sigma$ and $\mathcal{B}$. The requirements for admissibility become more restrictive as the number of state variables determining $[S(t)]_{ii}$ increases.
They consider a case where there are $M$ state variables driving the instantaneous conditional variances of the $N$-vector $Y$, such that $M = \operatorname{rank}(\mathcal{B})$. They further propose a set of $N+1$ benchmark models $A_M(N)$ as the most flexible econometrically identified affine dynamic term structure models (DTSMs) on the state space $\mathbb{R}^M_+ \times \mathbb{R}^{N-M}$; see also [11]. It is only when the admissibility conditions are met that a canonical representation may be defined.
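Definition 3.1: For each $M$, $Y(t)$ is partitioned as $Y^{\top} = (Y^{V\top}, Y^{D\top})$, where $Y^V$ is $M \times 1$ and $Y^D$ is $(N-M) \times 1$; $V$ and $D$ denote the volatility sources and the dependent factors, respectively. The canonical representation of the benchmark model $A_M(N)$ is defined as a special case of (3) with
$$K = \begin{bmatrix} K^{VV}_{M \times M} & 0_{M \times (N-M)} \\ K^{DV}_{(N-M) \times M} & K^{DD}_{(N-M) \times (N-M)} \end{bmatrix}$$
for $M > 0$, and $K$ is either lower or upper triangular for $M = 0$.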
The matrix $K$ in the canonical representation is the mean-reversion matrix, whose diagonal terms pull each factor towards its mean level and thereby help to keep the variance-driving factors non-negative. Its off-diagonal terms, on the other hand, reflect how different state variables influence each other, indicating potential dependencies or interactions that could affect the overall system behaviour. The matrix $K$ therefore captures both the stabilising effects of the mean-reversion rates and the dynamic interplay between different state variables. In the three-factor analysis, this trade-off between non-negative variances and correlations requires special attention. It also has an impact on the choice of $M$, the number of state variables entering volatility, and on the interactions among the $N$ factors.
The following parameter restrictions are imposed:
[1] define a subfamily $A_M(N)$ of affine DTSMs as the nested special cases of the $M$th canonical model or its invariant transformations, where $0 \le M \le N$. Equivalent affine models are obtained under invariant transformations that preserve admissibility and identification and leave observable quantities such as the short rate unchanged. Details on invariant transformations are discussed in Appendix A of [
1].
The following issues are further noted from [
1]:
The assumed structure of
ensures that
for the
canonical representation. To verify that
M resides in
, instantaneous conditional correlations among
are zero, whereas the instantaneous correlations among
are determined by parameters
because
. Admissibility is established provided (
20) holds, and that the conditional covariance matrix of
Y depends only on
. Zero restrictions in the upper right
block of
and the constraints in (
18) and (
19) ensure that
is positive. Stationarity is also assured by ensuring that all the eigenvalues of
are strictly positive; see also Appendix B in [
1].
In addition to an admissible canonical representation, in which the minimal known sufficient conditions for admissibility are imposed, minimal normalisations for econometric identification are imposed to derive a "maximal" model in $A_M(N)$. The class of maximal $A_M(N)$ models, referred to as the equivalence class of the $A_M(N)$ model, is obtained by invariant transformations of the canonical representation; see Appendix A in [1].
[
1] further point out that the canonical representation of $A_M(N)$ models may not always be a practical way of analysing state variables in ATSMs. The existing literature has often opted for parameterising ATSMs with the riskless rate $r$ as a state variable, resulting in an "affine in $r$" (Ar) representation. This can be rewritten as an "affine in $Y$" (AY) representation, where $r(t)$ is expressed as an affine function of an unobserved state vector $Y(t)$. As a result, a thorough specification analysis for $N$-factor ATSMs necessitates evaluating $N+1$ non-nested, maximal models to obtain a full understanding of each model's structure and implications.
5. The Three-Factor ATSMs
Three-factor models are used to describe the historical behaviour of the term structure of interest rates. Traditionally, these factors are unobserved (latent) and can only be defined statistically using techniques such as principal component analysis to convey economic meaning. Popular yield-curve fitting approaches such as the dynamic Nelson-Siegel model apply the principal component analysis (PCA) loadings to fit a yield curve; see [
2]. These approaches appear to fit and forecast well but lack the theoretical rigour to enforce some no-arbitrage restrictions. Contrary to the yield-curve fitting approaches, the empirical approaches to the factor models such as the ATSMs are worth pursuing. They consider the maximal parameterisation through which in general the economic identification of factors can be revealed. [
18] is among the early works that enforce the no-arbitrage restrictions by implementing three-factor models. They constructed a simple affine model with the short-term interest rate, its mean rate and its volatility as the three factors, which is easy to estimate. They further conclude that the short rate plays an important role in yield curve modelling, following their observation that it could not be dominated by any other factor across all maturities.
[
1] explore various forms of the canonical ATSMs and their maximal counterparts, as influenced by the number of factors driving conditional volatility and by the correlation among factors. Fixing the number of factors at $N = 3$ gives rise to their three-factor models, which mainly posit the short rate itself, its mean rate and its volatility as the three factors. The Gaussian and square-root diffusion forms of the models are analysed and compared; the latter tend to be preferred because they impose the non-negative variance restrictions.
Three-factor models are derived from the notation $A_M(N)$ with $N = 3$, where $M$ is the number of state variables that enter the volatility $S(t)$ according to [1]. Emphasis is put on the trade-off between conditional volatility and correlation as a focus for the analysis of the term structure of interest rates. As previously discussed, [9] introduced a multi-dimensional Feller condition. It ensures that negative state factors do not enter the volatility $S(t)$ by restricting correlations; see also [8]. We have also discussed, in a similar context, the role of the mean-reversion rate matrix $K$, the non-negativity restrictions on its diagonal terms and the interactions among state variables through its off-diagonal terms.
The number $M$ of factors that enter the volatility $S(t)$ becomes the main consideration in the choice of an $A_M(N)$ model, depending on the purpose of the study. [29] point out that more volatility factors result in less flexibility in the risk premium and correlation structure. As a result, they favour models with one conditional volatility factor, $A_1(N)$, such as the $A_1(3)$ of [1]. [30] also favour the $A_M(N)$ with $M = 1$ and $N = 3$ for the same purpose of allowing flexibility in the risk premium and correlation. Their focus is to impose restrictions on the parameters of $A_1(3)$ such that the volatility factor $v(t)$ disappears from the bond pricing equation. In our approach, we analyse the admissibility of parameters and the cross-equation restrictions that result from interactions among the factors $Y(t)$. As previously discussed, the signs and magnitudes of the elements of the mean-reversion rate matrix $K$ determine whether the factors that enter the variance remain non-negative, and they also constrain the admissible correlations. This is also applicable in the case of a three-factor model.
In this study we focus on the $A_1(3)$ and $A_2(3)$ models and their maximal counterparts, and assess both fit and estimation when they are applied to the SA bond yield curve.
5.1. $A_1(3)$
These models are characterised by one factor of $Y$ as a source of conditional volatility. As a result, $M = 1$ gives rise to the model form $A_1(3)$. From the original BDFS model of [18], the $A_1(3)_{BDFS}$ according to the notation of [1] is specified as
where:
The state variables , and are the stochastic volatility for , central tendency or long-run mean of and short rate processes, respectively. The volatility affects the short rate through its volatility factor . The coefficient represents the rate at which the short rate reverts to the central tendency. The stochastic volatility also enters and it is also instantaneously correlated with as noted in the last term .
The maximal model is best suited for interpreting the parameter restrictions. As a result, [
1] prefer the following model in (
23) as a maximal
which is affine in
r. They determine their
by relaxing the parameters
and
in order to accommodate a non-zero correlation between the short rate and central tendency. All the other parameters inside the square boxes are set to zero to impose significant restrictions on the dynamics of interest rates and their volatility.
where:
serves as stochastic volatility for
, but also enters the drift of
, and correlated to
as noted in the term
;
is the central tendency of
r and
is the rate at which the short rate reverts to its central tendency. Appendix E in [
1] describes the transformation framework from which a test for admissibility and canonical representations in
can be achieved.
5.2. $A_2(3)$
These models are characterised by two factors of $Y$ as sources of conditional volatility. As a result, $M = 2$ gives rise to the model form $A_2(3)$. The model of [19] is a member of this subclass of models, and it is represented as
,
and
are independent Brownian motions. The
$\theta(t)$ process follows a square-root diffusion, unlike in the BDFS model. Other parameters
,
and
are the same as in the above models. This leads us to the convenient maximal model for $A_2(3)$,
which is represented as
[
1] relaxes the restrictions on
,
and
, while other parameters within a square box are restricted to zero.
6. Estimation for Affine Models
Several estimation strategies such as the maximum likelihood, generalised method of moments, simulated method of moments, Markov chain Monte Carlo, and the characteristic function-based method are discussed by many authors; see [
7,
8,
31]. We mention three of the issues to consider when selecting an estimation strategy for affine models. First, an infinite set of moment conditions can cause a stochastic singularity problem, which constrains the GMM. This results from cross-sectional yield data with many maturities. Second, a choice must be made between including and excluding a measurement error in the representation that links the observed yields to the state variables. Third, the efficiency of maximum likelihood depends on the conditional density of the state variable, which is not always known; see [
32].
In contrast to maximum likelihood estimation, which requires the density functions to be computed, methods based on the conditional characteristic function (CCF) are straightforward. They depend on knowledge of the functional form of the CCF for variables that are observed from affine diffusions. CCFs are the foundation for computationally tractable and asymptotically efficient estimators of the parameters of affine diffusions and of asset pricing models with affine state variables; see [
31].
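It is generally known that the conditional density function $f$ of $Y_{t+1}$ given $Y_t$ is the inverse Fourier transform of its CCF,
$$f(Y_{t+1} \mid Y_t; \gamma) = \frac{1}{(2\pi)^N}\int_{\mathbb{R}^N} e^{-iu^{\top}Y_{t+1}}\,\phi_t(u, \gamma)\,du. \tag{27}$$
The characteristic function $\phi_t$ for $Y_{t+1}$ given $Y_t$ is
$$\phi_t(u, \gamma) = E\left[e^{iu^{\top}Y_{t+1}} \mid Y_t\right].$$
From Proposition 1 of [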
33], it can be shown that under suitable regularity conditions (
5) is the conditional characteristic function of
, with
and
derived from the solution of (
7) and (
8) for
. Therefore, the conditional characteristic function becomes
The log-likelihood form for (27) becomes
$$\ell_T(\gamma) = \frac{1}{T}\sum_{t=1}^{T}\log f(Y_{t+1} \mid Y_t; \gamma). \tag{30}$$
By conjecturing the parameters $\gamma$ and computing the Fourier inversion, (30) can be maximised to obtain a maximum likelihood estimator by characteristic function (ML-CCF); see [7].
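The inversion step behind ML-CCF can be sketched for a one-factor Gaussian (Ornstein-Uhlenbeck) state, whose CCF is known in closed form, so that the numerically inverted density can be compared with the exact normal density. This is an illustrative assumption, not the model estimated in this paper; the function names, the trapezoidal rule, the truncation point and all parameter values are hypothetical choices.

```python
import cmath
import math


def ou_moments(y0, kappa, theta, sigma, dt):
    """Conditional mean and variance of Y_{t+dt} given Y_t = y0."""
    mean = theta + (y0 - theta) * math.exp(-kappa * dt)
    var = sigma**2 * (1.0 - math.exp(-2.0 * kappa * dt)) / (2.0 * kappa)
    return mean, var


def ou_ccf(u, y0, kappa, theta, sigma, dt):
    """CCF E[exp(i*u*Y_{t+dt}) | Y_t = y0] of the Gaussian transition."""
    mean, var = ou_moments(y0, kappa, theta, sigma, dt)
    return cmath.exp(1j * u * mean - 0.5 * var * u * u)


def density_by_inversion(y1, y0, kappa, theta, sigma, dt, n=4000):
    """f(y1 | y0) = (1/pi) * integral_0^inf Re[exp(-i*u*y1) * ccf(u)] du,
    approximated by the trapezoidal rule on a truncated grid."""
    _, var = ou_moments(y0, kappa, theta, sigma, dt)
    u_max = 12.0 / math.sqrt(var)  # the CCF is negligible beyond this point
    h = u_max / n
    total = 0.0
    for k in range(n + 1):
        u = k * h
        weight = 0.5 if k in (0, n) else 1.0
        term = cmath.exp(-1j * u * y1) * ou_ccf(u, y0, kappa, theta, sigma, dt)
        total += weight * term.real
    return total * h / math.pi


def log_likelihood(path, kappa, theta, sigma, dt):
    """Sum of log transition densities along an observed path."""
    return sum(math.log(density_by_inversion(b, a, kappa, theta, sigma, dt))
               for a, b in zip(path, path[1:]))


# Hypothetical weekly data and parameters.
kappa, theta, sigma, dt = 0.5, 0.08, 0.02, 1.0 / 52.0
path = [0.070, 0.071, 0.069, 0.072]
f_num = density_by_inversion(path[1], path[0], kappa, theta, sigma, dt)
ll_num = log_likelihood(path, kappa, theta, sigma, dt)
print(f_num, ll_num)
```

In an estimation routine, `log_likelihood` would be passed to a numerical optimiser over the parameters. For models without a closed-form density, only `ou_ccf` would change.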
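[7] considers the densities of the individual elements $Y_{j,t+1}$, for $j = 1, \ldots, N$. A selector vector $\iota_j$ is assigned an entry of 1 in its $j$th position and zero elsewhere. The density $f_j$ of $\iota_j^{\top}Y_{t+1}$ given the entire $Y_t$ is the inverse Fourier transform of $\phi_t(\xi\iota_j, \gamma)$:
$$f_j(Y_{j,t+1} \mid Y_t; \gamma) = \frac{1}{2\pi}\int_{\mathbb{R}} e^{-i\xi\iota_j^{\top}Y_{t+1}}\,\phi_t(\xi\iota_j, \gamma)\,d\xi. \tag{31}$$
Estimation based on (31) requires $N$ one-dimensional integrations instead of an $N$-dimensional integration.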
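Alternatively, the generalised method of moments (GMM) using a characteristic function is based on the residual
$$\epsilon_{t+1}(u, \gamma) = e^{iu^{\top}Y_{t+1}} - \phi_t(u, \gamma),$$
where $\phi_t(u, \gamma)$ is the CCF of $Y_{t+1}$ given $Y_t$. For an arbitrary instrument $z_t(u)$, the estimator is based on the moment condition
$$E\left[z_t(u)\,\epsilon_{t+1}(u, \gamma)\right] = 0.$$
The GMM approach is a better alternative to a multi-dimensional Fourier inversion. However, as the grid of $u$ becomes finer, correlations among the moments become increasingly large, leading to a singular distance matrix; see [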
7].
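A rough sketch of the CCF-based moment condition, again assuming a one-factor Gaussian (Ornstein-Uhlenbeck) state with a known CCF. The residual is $e^{iuY_{t+1}}$ minus the CCF, and the instrument $z_t(u) = e^{-iuY_t}$ is one illustrative choice. The simulated data, the seed, the grid point `u` and all parameter values are hypothetical.

```python
import cmath
import math
import random


def ou_moments(y0, p):
    """Conditional mean and variance for p = (kappa, theta, sigma, dt)."""
    kappa, theta, sigma, dt = p
    mean = theta + (y0 - theta) * math.exp(-kappa * dt)
    var = sigma**2 * (1.0 - math.exp(-2.0 * kappa * dt)) / (2.0 * kappa)
    return mean, var


def ccf(u, y0, p):
    """CCF of the Gaussian transition given Y_t = y0."""
    mean, var = ou_moments(y0, p)
    return cmath.exp(1j * u * mean - 0.5 * var * u * u)


def simulate(n, y0, p, seed=0):
    """Simulate n exact transitions (data stand-in only)."""
    rng = random.Random(seed)
    path = [y0]
    for _ in range(n):
        mean, var = ou_moments(path[-1], p)
        path.append(mean + math.sqrt(var) * rng.gauss(0.0, 1.0))
    return path


def sample_moment(u, path, p):
    """Sample average of z_t(u) * residual, with z_t(u) = exp(-i*u*Y_t)."""
    total = 0j
    for a, b in zip(path, path[1:]):
        residual = cmath.exp(1j * u * b) - ccf(u, a, p)
        total += cmath.exp(-1j * u * a) * residual
    return total / (len(path) - 1)


true_p = (0.5, 0.08, 0.02, 1.0 / 52.0)
wrong_p = (0.5, 0.08, 0.04, 1.0 / 52.0)  # volatility misspecified
path = simulate(5000, 0.08, true_p)
u = 1.0 / math.sqrt(ou_moments(0.08, true_p)[1])  # one grid point for u

g_true = abs(sample_moment(u, path, true_p))
g_wrong = abs(sample_moment(u, path, wrong_p))
print(g_true, g_wrong)
```

The sample moment is close to zero at the data-generating parameters and clearly away from zero under the misspecified volatility. A GMM estimator would minimise a weighted norm of such moments over a grid of `u` values.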
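For an affine DTSM, the link between an $N$-dimensional vector of yields $y_t$ with maturities $(\tau_1, \ldots, \tau_N)$ and the state vector follows from (6) as
$$y_t = A(\gamma) + B(\gamma)Y_t, \tag{34}$$
where $Y_t$ follows an affine diffusion, $A$ is an $N \times 1$ vector and $B$ is an $N \times N$ matrix. The vector $\gamma$ is the set of parameters linking $y_t$ to the affine representation under the risk-neutral measure $Q$.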
The latent variable
can be recovered, provided
is invertible, as
By the standard change-of-variable analysis, the conditional density function of
under
P becomes
If the conditional density function
of a state vector is known, it is easy to continue with the estimation of parameters
. For special cases of continuous-time Gaussian and independent square-root processes,
is known, provided the market price of risk is chosen to suit an affine process under
P; see [
7] and references therein.
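The inversion of the states and the change of variables can be sketched as follows, assuming a generic linear mapping y = A + B Y with a nonsingular square B. The sign and scaling conventions of (6), the function names, and the unit-variance Gaussian state density used for the demonstration are assumptions made for illustration only.

```python
import numpy as np

def invert_states(y, A, B):
    """Solve y = A + B @ Y for the latent state Y (B must be nonsingular)."""
    return np.linalg.solve(B, y - A)

def yield_log_density(y_next, y_prev, A, B, state_log_density):
    """log f_y(y_next | y_prev) = log f_Y(Y_next | Y_prev) - log|det B|."""
    Y_next = invert_states(y_next, A, B)
    Y_prev = invert_states(y_prev, A, B)
    _, log_abs_det = np.linalg.slogdet(B)
    return state_log_density(Y_next, Y_prev) - log_abs_det

def gaussian_state_log_density(Y_next, Y_prev):
    """Hypothetical state transition density: Y_next ~ N(Y_prev, I)."""
    d = Y_next - Y_prev
    return -0.5 * d @ d - 0.5 * len(d) * np.log(2.0 * np.pi)

A = np.full(3, 0.01)
B = 2.0 * np.eye(3)
Y_prev, Y_next = np.array([0.10, 0.20, 0.30]), np.array([0.15, 0.18, 0.33])
y_prev, y_next = A + B @ Y_prev, A + B @ Y_next
value = yield_log_density(y_next, y_prev, A, B, gaussian_state_log_density)
print(value)
```

The Jacobian term is the only difference between the density of the yields and that of the states, which is why knowledge of the state density suffices for estimation.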
For the continuous-time affine models in the family
, the unknown
can be computed from the CCF. It is easy to express the CCF of
in terms of the CCF of
, since
is nonsingular and both
and
can generate the same information set; see [
7].
For an affine diffusion it can be shown that CCF of
is
The Fourier inversion becomes
Fourier transforms are more suitable for low-dimensional problems, as they become numerically burdensome as
N increases. The burden could be avoided by selecting a method of moments, though at the expense of econometric efficiency; see [
7].
From (
34) it is clear that the measurement errors in the yields were excluded. As a result,
can be inverted to compute the state vector
, with Kalman filtering becoming very useful; see [
8] and references therein. Kalman filtering provides the best solution for extracting nonlinear state vectors from the affine diffusions. We do not discuss the Kalman filter and its different forms in this paper even though we apply it to filter out the state vector
. There are many sources for a detailed discussion on the Kalman filter; see [
23].
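For orientation only, a minimal linear Gaussian Kalman filter is sketched below. It assumes a measurement equation y_t = A + B x_t + e_t and a transition x_t = c + Phi x_{t-1} + w_t; all matrices, dimensions and noise levels are hypothetical, and the filter variant actually applied in the paper may differ (for instance an extended filter for non-Gaussian states).

```python
import numpy as np

def kalman_filter(ys, A, B, c, Phi, Q, R, x0, P0):
    """Filtered state means for y_t = A + B x_t + e_t, x_t = c + Phi x_{t-1} + w_t."""
    x, P = x0.copy(), P0.copy()
    filtered = []
    for y in ys:
        # Prediction step
        x = c + Phi @ x
        P = Phi @ P @ Phi.T + Q
        # Update step: gain K = P B' S^{-1}, with S the innovation covariance
        S = B @ P @ B.T + R
        K = np.linalg.solve(S, B @ P).T
        x = x + K @ (y - A - B @ x)
        P = (np.eye(len(x)) - K @ B) @ P
        filtered.append(x)
    return np.array(filtered)

# Synthetic check: seven yields driven by three latent factors.
rng = np.random.default_rng(1)
T, N, M = 200, 7, 3
A = np.full(N, 0.08)
B = rng.normal(size=(N, M))
c, Phi = np.zeros(M), 0.95 * np.eye(M)
Q, R = 1e-4 * np.eye(M), 1e-8 * np.eye(N)
states = np.zeros((T, M))
x = np.zeros(M)
for t in range(T):
    x = c + Phi @ x + 0.01 * rng.standard_normal(M)
    states[t] = x
ys = A + states @ B.T + 1e-4 * rng.standard_normal((T, N))
est = kalman_filter(ys, A, B, c, Phi, Q, R, np.zeros(M), np.eye(M))
print(np.max(np.abs(est - states)))
```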
7. Data Collection
A sample of weekly yields for active SA government treasury bonds over the period October 2013 to September 2024, with maturities of 3 months and 5, 10, 12, 20, 25, and 30 years, was retrieved from the Thomson Reuters database. Our in-sample and out-sample data are based on the periods October 2013 - September 2023 and October 2023 - September 2024, respectively. The out-sample is used for forecasting and validation. A summary of descriptive statistics for yields across maturities is presented in
Table 1. Mean values range from 6.6% to 10.4%, exhibiting an upward slope which is also convex in shape. A recent study by [
5] reported similar behaviour for the average yields. The highest weekly standard deviation of 1.3% is observed for the 3-month maturity, which is typical of the short end of the yield curve and suggests that yields may react quickly to changes in monetary policy or market sentiment. It is followed by a drop to 0.7% and 0.6% for the 5-year and 10-year maturities, respectively. The standard deviation rises to 0.9% at the 20-year maturity and remains constant towards the long end. This suggests higher volatility in the short end, followed by relative stability in the long end of the yield curve.
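The rows of Table 1 (count, mean, standard deviation, quantiles, skewness and kurtosis) can be computed column by column as in the sketch below. The function name `summarise` and the toy input are illustrative; the skewness and excess kurtosis here use simple moment estimators, which is an assumption and may differ slightly from the bias-corrected estimators of some statistical packages.

```python
import numpy as np

def summarise(yields):
    """Column-wise descriptive statistics for a (T, N) array of yields."""
    x = np.asarray(yields, dtype=float)
    mean = x.mean(axis=0)
    centred = x - mean
    m2 = (centred**2).mean(axis=0)
    m3 = (centred**3).mean(axis=0)
    m4 = (centred**4).mean(axis=0)
    q25, q50, q75 = np.percentile(x, [25, 50, 75], axis=0)
    return {
        "count": np.full(x.shape[1], x.shape[0]),
        "mean": mean,
        "std": x.std(axis=0, ddof=1),      # sample standard deviation
        "min": x.min(axis=0),
        "25%": q25, "50%": q50, "75%": q75,
        "max": x.max(axis=0),
        "skew": m3 / m2**1.5,              # moment estimator of skewness
        "kurtosis": m4 / m2**2 - 3.0,      # excess kurtosis
    }

# Toy data: two columns, five observations each.
demo = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0], [5.0, 10.0]])
stats = summarise(demo)
for name, values in stats.items():
    print(name, values)
```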
Table 2 presents the correlations across maturities of yields. The short end of the yield curve is characterised by weak correlations. The 3-month and 5-year terms exhibit negative correlations with their long end counterparts. This may suggest a difference in dynamics between the short and long end of the yield curve. One possible reason for the negative correlations is that the highly liquid SA bond market includes a portion of foreign investors who are favoured by falling currency exchange rates. SA bonds are also found to have long maturities when compared to their emerging market counterparts; see [
28]. Other possible reasons, pertaining to the positive correlations towards the long end, are lower volatility and the large portions of pension fund portfolios invested in the same bonds with similar maturities but less trading activity. From the 10-year maturity, higher correlations ranging from 0.612 to 0.999 are observed, suggesting stability as the yield curve approaches its long end.
Figure 1 presents the first three principal components (PC) of the yields over maturities. Together they explain about 99.70% of the variance, which is close to but slightly above the 98% empirical finding of [
34]; see also [
2,
8]. The first PC represents a key rate shift or level change in rates. It is the result of volatility causing rates of all maturities to fluctuate by almost the same amount. The observation is that the short end is associated with high volatility and increasing rates. Around the 10-year maturity they reach a peak, followed by stability as they approach the long end of the yield curve. The second PC represents the slope. It slopes downward, with its highest level in the short end, which might be associated with rate increases and volatility, followed by a drop in the mid-term region and stable but falling rates towards the long end. Volatility forces a fall in rates in the short end, followed by a drop towards the mid-term 10-year maturity and stabilisation towards the long end. Both the slope and curvature exhibit a downward and convex shape in the same direction, suggesting the stylised fact of high volatility in the short end and low volatility towards the long end.
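The loadings and explained-variance shares discussed above can be obtained from an eigendecomposition of the covariance matrix of yields. The sketch below uses synthetic yields generated from three hypothetical factor shapes; the function name `pca`, the shapes and the noise level are assumptions for illustration and do not reproduce the Thomson Reuters data.

```python
import numpy as np

def pca(yields):
    """Loadings, explained-variance shares and scores from the covariance of yields."""
    X = yields - yields.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(X, rowvar=False))
    order = np.argsort(eigval)[::-1]          # sort components by variance, descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    return eigvec, eigval / eigval.sum(), X @ eigvec

# Synthetic yields for the seven maturities used in the paper (in years).
rng = np.random.default_rng(2)
tau = np.array([0.25, 5.0, 10.0, 12.0, 20.0, 25.0, 30.0])
shapes = np.column_stack([np.ones_like(tau),            # level-like shape
                          np.exp(-tau / 5.0),           # slope-like shape
                          (tau / 5.0) * np.exp(-tau / 5.0)])  # curvature-like shape
factors = rng.normal(scale=[0.010, 0.008, 0.004], size=(418, 3))
yields = 0.09 + factors @ shapes.T + 1e-4 * rng.standard_normal((418, 7))

loadings, explained, scores = pca(yields)
print(explained[:3], explained[:3].sum())
```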
The behaviour of the PCs is also associated with the correlations discussed earlier, where the short end is associated with weak correlations whereas strong correlations were observed in the long end of the yield curve. These patterns are suitable for trading in swaps and correlation-based hedging strategies. Our focus being the ATSMs, we believe that these PCs are closely related to the latent factors derived from the solution of coefficients
and
in (
6). The three-factor models with the three labels short rate, volatility and central tendency should exhibit a pattern similar to that of the PCs; see [
8].
Figure 2 presents a time series of the three principal components level, slope and curvature as derived from the observed yields. They are compared to the Kalman-filtered time series of state vectors, namely factor 1, factor 2 and factor 3. Both time series are plotted over the observation period October 2013 - September 2023. Both plots have similar patterns, with the factors on the right-hand side exhibiting smoother shapes than the principal components on the left. The analysis is extended to a regression of the Kalman-filtered factors (dependent variable) on the principal components (PC), with results presented in
Table 3. We note a high cumulative variance (94.3%), suggesting that the first few PCs explain the variance of the yields. Intercept coefficients are close to zero for all the PCs, suggesting less impact in explaining behaviour of movements relative to the PCs. A sharp jump from 84.1% to 95.8% is a confirmation that the first few PCs captured most of the information in the yields data, with the later components contributing progressively less. The standard errors also vary across rows, with some being quite small, which indicates that the estimates of the corresponding coefficients are relatively precise. It can be concluded that the level, slope and curvature display similar characteristics to the factors from an affine model, therefore confirming the empirical findings of [
34].
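A regression of one filtered factor on the three principal components, with an intercept, coefficient standard errors and a goodness-of-fit measure, can be sketched with ordinary least squares as below. The synthetic data and the function name `ols` are illustrative assumptions; the figures in Table 3 come from the authors' own estimation.

```python
import numpy as np

def ols(y, X):
    """OLS of y on X with an intercept: coefficients, standard errors and R-squared."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    dof = len(y) - X1.shape[1]
    s2 = resid @ resid / dof                      # residual variance
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X1.T @ X1)))
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean())**2)
    return beta, se, r2

# Synthetic example: a factor that loads mainly on the first component.
rng = np.random.default_rng(3)
pcs = rng.standard_normal((418, 3))
factor = 0.9 * pcs[:, 0] - 0.2 * pcs[:, 2] + 0.1 * rng.standard_normal(418)
beta, se, r2 = ols(factor, pcs)
print(beta, se, r2)
```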
Figure 3 presents a selection of observed yields from the SA Treasury bonds plotted against maturities of up to 30 years. A spread between 5 and 20 years is also plotted and expected to represent a slope. Crossovers are observed among individual plots from time to time, indicative of either positive or negative (inverted) yield curves. Initial unobserved inputs to the three-factor simulation of
are set to the initial values of the first three principal components. The full matrix of yields comprises 418 weekly observations and seven maturities, its length being based on the maturity column with the minimum length. No gaps in the data were discovered; otherwise the omissions would have been filled with the average of the two preceding values. We apply the first vector of the yield matrix as initial input values, together with the initially guessed parameters, to simulate the state variables from (
10). These are further used as inputs to the solution of ODE (
7) and (
8) from which coefficients
and
are obtained. Thereafter, a model-based set of zero-coupon bonds and zero-yields are obtained from (
6). The selection is also guided by the observations from the PCA, suggesting that our selection is a proxy for level, slope and curvature, taking into consideration the PCA features for short end, mid-term and long end.
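The step from the ODE coefficients to model-based zero yields can be illustrated in the simplest one-factor Gaussian (Vasicek) case, where the ODEs have a closed-form solution to check against. The ODE system below, the convention P(tau) = exp(A(tau) - B(tau) r), the Runge-Kutta integrator and all parameter values are assumptions for illustration; the paper's three-factor system (7) and (8) has the same structure but different right-hand sides.

```python
import numpy as np

def riccati_rhs(state, kappa, theta, sigma):
    """dA/dtau = -kappa*theta*B + 0.5*sigma^2*B^2,  dB/dtau = 1 - kappa*B."""
    A, B = state
    return np.array([-kappa * theta * B + 0.5 * sigma**2 * B**2, 1.0 - kappa * B])

def solve_AB(tau, kappa, theta, sigma, steps=400):
    """Integrate the ODEs from A(0) = B(0) = 0 with classical fourth-order Runge-Kutta."""
    h, s = tau / steps, np.zeros(2)
    for _ in range(steps):
        k1 = riccati_rhs(s, kappa, theta, sigma)
        k2 = riccati_rhs(s + 0.5 * h * k1, kappa, theta, sigma)
        k3 = riccati_rhs(s + 0.5 * h * k2, kappa, theta, sigma)
        k4 = riccati_rhs(s + h * k3, kappa, theta, sigma)
        s = s + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return s

def closed_form_AB(tau, kappa, theta, sigma):
    """Known Vasicek solution, used only to check the numerical integration."""
    B = (1.0 - np.exp(-kappa * tau)) / kappa
    A = (theta - sigma**2 / (2.0 * kappa**2)) * (B - tau) - sigma**2 * B**2 / (4.0 * kappa)
    return np.array([A, B])

def zero_yield(tau, r, kappa, theta, sigma):
    """Zero-coupon yield y(tau) = -(A - B*r) / tau from P = exp(A - B*r)."""
    A, B = solve_AB(tau, kappa, theta, sigma)
    return (-A + B * r) / tau

kappa, theta, sigma, r = 0.5, 0.09, 0.02, 0.07   # hypothetical risk-neutral parameters
maturities = [0.25, 5.0, 10.0, 12.0, 20.0, 25.0, 30.0]
curve = [zero_yield(t, r, kappa, theta, sigma) for t in maturities]
print(curve)
```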
11. Conclusions
The historical behaviour of the term structure of interest rates for the SA treasury bond was analysed. The purpose was to establish whether the ATSMs specifications of the three-factor models were suitable for the data. The study also considered the conditional volatilities and correlations of the state variables as they are essential components. It has therefore been essential to thoroughly examine the interactions among the state variables—stochastic volatility, central tendency, and the short rate—within the framework of a three-factor approach to the ATSM.
Results of both AIC and BIC for model selection and parameter complexity are in favour of maximal models. The statistic also shows bias towards the maximal models. The differences in AIC, BIC and statistic are marginal, suggesting that both models and their maximal counterparts may fit the data equally well. Evidence from a pairwise analysis between models confirms this position, with RMSE, mean errors and statistical results accordingly. In both cases, despite a strong possibility of model similarities, RMSE tends to exhibit slightly better performance for and than their maximal counterparts. We extended the tests to a pairwise analysis on both models, which led to the conclusion that out-performs both in and out-sample.
Our study does not focus on the market price of risk despite its importance. It is only relevant here for preserving the affine structure of the SDE and the transition from the risk-neutral to the physical measure. SA as an emerging market is susceptible to several risk factors, such as liquidity, exchange rate, credit and political risk. It is crucial that a specific form of market price of risk be incorporated into these specifications. We could not explore any of the forms of market price of risk mentioned in the ATSM literature for SA. We leave them to future research.
The specification makes a selection of a number of factors M in the model to serve as proxies for volatility. There are limitations in this method giving rise to a need to consider unspanned stochastic volatility models (USV). USV is capable of identifying latent volatility that could not be easily detected by three-factor models. They are also found to work better for capturing jumps, option pricing and hedging as they are able to highlight the hidden volatility. Future research on suitable specifications for SA should be extended to incorporating USV.
Author Contributions
Conceptualization, M.M. and G.V.; methodology, G.V.; software, M.M.; validation, M.M. and G.V.; formal analysis, M.M. and G.V.; investigation, M.M. and G.V.; resources, M.M. and G.V.; data curation, M.M. and G.V.; writing - original draft preparation, M.M.; writing - review and editing, G.V.; visualization, M.M.; supervision, G.V.; project administration, G.V.; funding acquisition, none. All authors have read and agreed to the published version of the manuscript.
Figure 1.
Loadings of the first three principal components of the yields over maturities. Data were retrieved from Thomson Reuters.
Figure 2.
(
a) Time series of the first principal components of observed yields
. (
b) Time series of the latent factors
as extracted from the affine diffusion using (
34) and the Kalman filtering. Data were retrieved from Thomson Reuters.
Figure 3.
The 3-month and 5-, 10-, 20- and 30-year observed yields are plotted together with the spread between the 5- and 20-year yields, against maturities. The 5-20-year spread is indicative of a slope, while the overlaps between lines suggest either positive or inverted yield curves. Data were retrieved from Thomson Reuters.
Figure 4.
In-sample models and are plotted versus their maximal counterparts. The top panel plots RMSE against maturities while the bottom panel plots the MAE against maturities. (a) RMSE for model vs . (b) RMSE for model vs . (c) MAE for model vs . (d) MAE for model vs .
Figure 5.
Model-implied instantaneous rate in percentages is plotted against maturities. and appear in blue and magenta colours, respectively. Data were retrieved from Thomson Reuters.
Figure 6.
Model-implied instantaneous rate in percentages is plotted against maturities. and appear in blue and magenta colours, respectively. Data were retrieved from Thomson Reuters.
Table 1.
Statistical summary of in-sample yields for the SA Treasury bond by maturity. Data were retrieved from Thomson Reuters.
| | 3 mths | 5 yrs | 10 yrs | 12 yrs | 20 yrs | 25 yrs | 30 yrs |
| count | 418 | 418 | 418 | 418 | 418 | 418 | 418 |
| mean | 0.066 | 0.084 | 0.094 | 0.098 | 0.104 | 0.104 | 0.104 |
| std | 0.013 | 0.007 | 0.006 | 0.007 | 0.009 | 0.009 | 0.009 |
| min | 0.035 | 0.066 | 0.084 | 0.085 | 0.087 | 0.088 | 0.088 |
| 25% | 0.058 | 0.080 | 0.090 | 0.093 | 0.097 | 0.097 | 0.097 |
| 50% | 0.069 | 0.085 | 0.092 | 0.096 | 0.101 | 0.101 | 0.101 |
| 75% | 0.074 | 0.089 | 0.096 | 0.101 | 0.110 | 0.111 | 0.110 |
| max | 0.094 | 0.105 | 0.117 | 0.122 | 0.127 | 0.127 | 0.127 |
| skew | -0.406 | -0.361 | 1.162 | 0.939 | 0.569 | 0.561 | 0.583 |
| kurtosis | -0.278 | 0.116 | 1.349 | 0.633 | -0.511 | -0.536 | -0.457 |
Table 2.
Correlation matrix of in-sample yields across maturities. Data were retrieved from Thomson Reuters.
| | 3 mths | 5 yrs | 10 yrs | 12 yrs | 20 yrs | 25 yrs | 30 yrs |
| 3 mths | 1.000 | 0.713 | 0.302 | 0.046 | -0.140 | -0.152 | -0.152 |
| 5 yrs | 0.713 | 1.000 | 0.612 | 0.269 | -0.041 | -0.063 | -0.051 |
| 10 yrs | 0.302 | 0.612 | 1.000 | 0.918 | 0.736 | 0.718 | 0.724 |
| 12 yrs | 0.046 | 0.269 | 0.918 | 1.000 | 0.940 | 0.931 | 0.934 |
| 20 yrs | -0.140 | -0.041 | 0.736 | 0.940 | 1.000 | 0.999 | 0.998 |
| 25 yrs | -0.152 | -0.063 | 0.718 | 0.931 | 0.999 | 1.000 | 0.999 |
| 30 yrs | -0.152 | -0.051 | 0.724 | 0.934 | 0.998 | 0.999 | 1.000 |
Table 3.
Regression analysis between the three factors from the affine model against the three principal components. Factors are labelled , and and principal components level, slope and curvature are labelled , and , respectively.
| Dependent Variable | Independent Variable | Coefficient | Standard Error | |
| Factor 1 | Intercept | 0.000 | 0.000 | 0.943 |
| Factor 1 | Level | 0.921 | 0.011 | 0.943 |
| Factor 1 | Slope | -0.040 | 0.014 | 0.943 |
| Factor 1 | Curvature | -0.225 | 0.037 | 0.943 |
| Factor 2 | Intercept | 0.000 | 0.000 | 0.958 |
| Factor 2 | Level | -0.031 | 0.008 | 0.958 |
| Factor 2 | Slope | 0.950 | 0.010 | 0.958 |
| Factor 2 | Curvature | -0.084 | 0.026 | 0.958 |
| Factor 3 | Intercept | 0.000 | 0.000 | 0.841 |
| Factor 3 | Level | -0.037 | 0.005 | 0.841 |
| Factor 3 | Slope | -0.007 | 0.006 | 0.841 |
| Factor 3 | Curvature | 0.789 | 0.017 | 0.841 |
Table 4.
The estimators reported here are based on the log-likelihood computed from the Fourier inversion of the characteristic function of yields. Computation is based on variables
,
and
of both the model-based and observed yields. Parameters in the first column are the same as those used in (
23), second column are initial guesses while the 3rd and 4th columns are calibrated from the models
and
, respectively. Bold and underlined figures in the second column refer to initial parameter values which are restricted to zero in terms of the model assumptions.
| Parameter | Initial | Estimate, first model | Estimate, second model |
| | 0.365 | 0.365 | 0.366 |
| | 0.015 | 0.015 | 0.008 |
| | 0.001 | 0.001 | 0.001 |
| | 0.083 | 0.083 | 0.083 |
| | 0.000 | 0.000 | 0.000 |
| | 0.000 | 0.000 | 0.000 |
| | 4.270 | 4.270 | 4.200 |
| | 0.000 | 0.000 | 0.021 |
| | -0.094 | -0.094 | -0.089 |
| | -3.420 | -3.420 | -3.770 |
| | 0.000 | 0.000 | 0.000 |
| | 0.000 | 0.000 | 0.035 |
| | 17.400 | 17.400 | 18.000 |
| | 0.050 | 0.050 | 0.050 |
| | 0.050 | 0.050 | 0.050 |
| | 0.378 | 0.378 | 0.378 |
| | 0.756 | 0.756 | 0.756 |
| | 0.866 | 0.866 | 0.866 |
| | | 0 | 0.000 |
| | | -0.019 | 0.000 |
| | | 0.206 | 0.059 |
| AIC | | -1557.54 | -1615.92 |
| BIC | | -1460.68 | -1519.07 |
| Log-likelihood function | | -802.768 | -831.96 |
| | | 24.98 | 16.38 |
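As an arithmetic cross-check of the AIC and BIC rows of Table 4, the sketch below applies the usual definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L. It assumes that the reported log-likelihood entries are minimised negative log-likelihood values, that k = 24 parameters are counted, and that n = 418 observations are used. These three assumptions are inferred from the fact that they reproduce the reported figures; they are not stated in the text.

```python
import math

def aic(reported, k):
    """AIC = 2k - 2 ln L, with `reported` taken to be -ln L (an assumption)."""
    return 2 * k + 2 * reported

def bic(reported, k, n):
    """BIC = k ln(n) - 2 ln L, with `reported` taken to be -ln L (an assumption)."""
    return k * math.log(n) + 2 * reported

k, n = 24, 418                      # inferred values, not stated in the paper
for reported in (-802.768, -831.96):
    print(round(aic(reported, k), 2), round(bic(reported, k, n), 2))
```

The same arithmetic with k = 26 is consistent with the AIC and BIC rows of Table 5.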
Table 5.
The estimators reported here are based on the log-likelihood computed from the Fourier inversion of the characteristic function of yields. Computation is based on moment labels
,
and
of both the model-based and observed yields. Parameters in the first column are the same as those used in (
26), second column are initial guesses while the 3rd and 4th columns are calibrated from the models
and
, respectively. Bold and underlined figures in the second column refer to initial parameter values which are restricted to zero in terms of the model assumptions.
| Parameter | Initial | Estimate, first model | Estimate, second model |
| | 0.636 | 0.634 | 0.291 |
| | -33.900 | -33.962 | -12.400 |
| | -35.300 | -35.071 | -273.996 |
| | 0.000 | 0.000 | -0.002 |
| | 0.000 | 0.000 | 3.550 |
| | 2.700 | 2.694 | 3.540 |
| | 0.000 | 0.000 | 0.000 |
| | 0.026 | 0.026 | 0.014 |
| | 0.026 | 0.026 | 0.053 |
| | -182.000 | -182.301 | -133.003 |
| | 0.000 | 0.000 | 1.000 |
| | 0.000 | 0.000 | 0.095 |
| | 0.000 | 0.000 | 0.000 |
| | 0.003 | 0.003 | 0.002 |
| | 0.000 | 0.000 | 0.000 |
| | 0.000 | 0.000 | 0.000 |
| | 0.050 | 0.050 | 0.050 |
| | 0.562 | 0.562 | 0.562 |
| | 0.035 | 0.035 | 0.037 |
| | 0.111 | 0.111 | 0.111 |
| | | 0.000 | 0.000 |
| | | 0.058 | 0.000 |
| | | 0.643 | 1.990 |
| AIC | | -1721.53 | -1750.42 |
| BIC | | -1616.61 | -1645.50 |
| Degrees of freedom | | 389 | 384 |
| statistic | | 5.15 | 4.83 |
| P-value | | 1 | 1 |
| Log-likelihood function | | -886.77 | -901.21 |
Table 6.
The table contains a pairwise analysis between models and and their maximal counterparts. The top panel reports the in-sample RMSE for each maturity. The lower panel reports the mean errors for each maturity. Both are accompanied by the t-statistics and p-values at each maturity level.
| Maturity | | | t-stat | p-value | | | t-stat | p-value |
| In-sample RMSE | | | | | | | | |
| 0.25 | 0.044 | 0.051 | 2.002 | 0.092 | 0.034 | 0.073 | 3.589 | 0.012 |
| 5 | 0.029 | 0.033 | -1.341 | 0.228 | 0.017 | 0.055 | 5.597 | 0.001 |
| 10 | 0.024 | 0.026 | 3.529 | 0.012 | 0.012 | 0.046 | 3.563 | 0.012 |
| 12 | 0.023 | 0.024 | -0.599 | 0.571 | 0.012 | 0.042 | 1.633 | 0.154 |
| 20 | 0.023 | 0.022 | 1.328 | 0.233 | 0.015 | 0.037 | 1.025 | 0.345 |
| 25 | 0.024 | 0.023 | -1.244 | 0.260 | 0.016 | 0.037 | -0.009 | 0.993 |
| 30 | 0.023 | 0.022 | 2.179 | 0.072 | 0.015 | 0.037 | -0.455 | 0.665 |
| In-sample mean error | | | | | | | | |
| 0.25 | -0.037 | -0.045 | 2.746 | 0.102 | -0.030 | -0.070 | 31.978 | 0.007 |
| 5 | -0.019 | -0.027 | 2.746 | 0.006 | -0.012 | -0.052 | 31.978 | 0.000 |
| 10 | -0.009 | -0.017 | 2.746 | 0.538 | -0.002 | -0.042 | 31.978 | 0.000 |
| 12 | -0.005 | -0.013 | 2.746 | 0.041 | 0.002 | -0.038 | 31.978 | 0.000 |
| 20 | 0.001 | -0.007 | 2.746 | 0.008 | 0.007 | -0.033 | 31.978 | 0.000 |
| 25 | 0.001 | -0.007 | 2.746 | 0.663 | 0.008 | -0.032 | 31.978 | 0.000 |
| 30 | 0.001 | -0.007 | 2.746 | 0.033 | 0.007 | -0.033 | 31.978 | 0.000 |
Table 7.
In-sample pairwise analysis between models and as a refinement to the tests performed comprehensively against their maximal counterparts.
| Maturity | | | t-stat | p-value |
| 0.25 | 0.040 | 0.076 | -4.934 | 0.003 |
| 5 | 0.028 | 0.057 | -0.538 | 0.610 |
| 10 | 0.025 | 0.048 | 0.486 | 0.644 |
| 12 | 0.026 | 0.045 | 2.554 | 0.043 |
| 20 | 0.028 | 0.040 | 2.689 | 0.036 |
| 25 | 0.028 | 0.040 | 0.028 | 0.978 |
| 30 | 0.027 | 0.040 | 0.256 | 0.807 |
Table 8.
The table contains a pairwise analysis between models and and their maximal counterparts. The top panel reports the out-sample RMSE for each maturity. The lower panel reports the mean errors for each maturity. Both are accompanied by the t-statistics and p-values at each maturity level.
| Maturity | | | t-stat | p-value | | | t-stat | p-value |
| Out-sample RMSE | | | | | | | | |
| 0.25 | 0.016 | 0.022 | -1.337 | 0.230 | 0.022 | 0.018 | 3.373 | 0.015 |
| 5 | 0.014 | 0.020 | 0.922 | 0.392 | 0.025 | 0.021 | 6.831 | 0.000 |
| 10 | 0.014 | 0.017 | 1.304 | 0.240 | 0.036 | 0.032 | 5.646 | 0.001 |
| 12 | 0.018 | 0.018 | -0.498 | 0.636 | 0.045 | 0.040 | 6.219 | 0.001 |
| 20 | 0.029 | 0.027 | -0.644 | 0.543 | 0.058 | 0.054 | 5.331 | 0.002 |
| 25 | 0.030 | 0.028 | 0.185 | 0.860 | 0.059 | 0.055 | 5.171 | 0.002 |
| 30 | 0.029 | 0.027 | 0.431 | 0.681 | 0.058 | 0.054 | 4.185 | 0.006 |
| Out-sample mean error | | | | | | | | |
| 0.25 | -0.011 | -0.016 | -0.548 | 0.271 | 0.021 | 0.016 | 2.923 | 0.008 |
| 5 | -0.008 | -0.013 | -0.548 | 0.251 | 0.024 | 0.019 | 2.923 | 0.000 |
| 10 | 0.004 | -0.001 | -0.548 | 0.526 | 0.036 | 0.031 | 2.923 | 0.000 |
| 12 | 0.012 | 0.008 | -0.548 | 0.235 | 0.044 | 0.039 | 2.923 | 0.000 |
| 20 | 0.026 | 0.021 | -0.548 | 0.002 | 0.058 | 0.053 | 2.923 | 0.000 |
| 25 | 0.027 | 0.022 | -0.548 | 0.008 | 0.059 | 0.054 | 2.923 | 0.000 |
| 30 | 0.026 | 0.021 | -0.548 | 0.603 | 0.058 | 0.053 | 2.923 | 0.027 |
Table 9.
Out-sample pairwise analysis between models and as a refinement to the tests performed comprehensively against their maximal counterparts.
| Maturity | | | t-stat | p-value |
| 0.25 | 0.017 | 0.021 | 0.488 | 0.148 |
| 5 | 0.015 | 0.024 | 0.488 | 0.073 |
| 10 | 0.012 | 0.036 | 0.488 | 0.079 |
| 12 | 0.016 | 0.044 | 0.488 | 0.046 |
| 20 | 0.027 | 0.058 | 0.488 | 0.099 |
| 25 | 0.028 | 0.059 | 0.488 | 0.049 |
| 30 | 0.028 | 0.058 | 0.488 | 0.643 |