On the Performance of GARCH Family Models in the Presence of Additive Outliers

It is common practice to detect outliers in a financial time series in order to avoid the adverse effects of additive outliers. This paper investigates the performance of GARCH family models (sGARCH, gjrGARCH, iGARCH, TGARCH and NGARCH) in the presence of outliers of different sizes (small, medium and large) for different time series lengths (250, 500, 750, 1000, 1250 and 1500), using the root mean square error (RMSE) and the mean absolute error (MAE) to judge the models. In a simulation of 1,000 replications in the R environment using the rugarch package, the results revealed that iGARCH dominated for small outliers, sGARCH and gjrGARCH dominated for medium outliers, and gjrGARCH dominated for large outliers, in each case irrespective of the time series length. The study further revealed that, in the presence of additive outliers, both RMSE and MAE increased as the time series length increased.


Introduction
Response variables are affected not only by exogenous variables but also by their own past behavior. On the basis of this theoretical underpinning, autoregressive models were developed. Box and Jenkins time series modeling is indispensable in analyzing stochastic processes. Autoregressive and moving average models are used frequently in many disciplines.
The autoregressive framework has very useful applications in macroeconomics, such as for money supply, interest rates, prices, inflation, exchange rates and gross domestic product, and in financial time series analysis. The autoregressive heteroskedastic modeling framework is used in financial economics, such as in asset pricing, portfolio selection, option pricing, hedging and risk management (Ali, 2013). Studies abound in the financial literature on modeling the return on stocks. Usually, in the financial market, upward movements in stock prices are followed by lower volatilities, while negative movements of the same magnitude are followed by much higher volatilities (Ali, 2013). Engle (1982) developed the time-varying variance model known as the autoregressive conditional heteroskedasticity (ARCH) model, which was the first model to assume that volatility is not constant. Bollerslev (1986) extended the model to include the ARMA structure, giving the generalized autoregressive conditional heteroskedasticity (GARCH) model. Ali (2013) asserted that a number of studies have subsequently adopted ARCH or GARCH models to explain the volatility of the stock market. Some of these studies have also transformed and developed Engle's basic model into more sophisticated models, such as the GARCH, integrated GARCH (IGARCH), threshold GARCH (TGARCH), exponential GARCH (EGARCH) and GARCH-in-mean (GARCH-M) models, among others (Atoi, 2014; Grek, 2014); however, these sophisticated models have, in most cases, failed to improve on the forecast accuracy of the original ARCH model.

Outliers
In statistics, an outlier is an observation point that is distant from the other observations. An outlier may be due to variability in the measurement or it may indicate experimental error; the latter are sometimes excluded from the data set. An outlier can cause serious problems in statistical analyses. Outliers can occur by chance in any distribution, but they often indicate either measurement error or that the population has a heavy-tailed distribution. In the former case one wishes to discard them or use statistics that are robust to outliers, while in the latter case they indicate that the distribution has high skewness and that one should be very cautious in using tools or intuitions that assume a normal distribution (Wikipedia, 2017).
There are two types of outliers, namely the innovation outlier (IO), in which an outlier affects future values of the series, and the additive outlier (AO), in which an outlier affects only the current observation (McQuarrie & Tsai, 2003). It should be noted that additive outliers affect the forecast performance of GARCH models such that the sum of squares increases as the additive outlier increases to a large number (McQuarrie & Tsai, 2003).
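The distinction between the two outlier types can be illustrated with a minimal Python sketch. It assumes a simple AR(1) process rather than a GARCH model, and the function name `ar1_with_outlier`, its arguments and the numerical values are hypothetical, chosen only for illustration.

```python
def ar1_with_outlier(shocks, phi, t0, size, kind):
    """Return an AR(1) path y_t = phi * y_{t-1} + e_t with one outlier at index t0.

    kind="AO": the outlier is added to the observed value only (additive outlier).
    kind="IO": the outlier is added to the shock, so it feeds into later values.
    """
    path = []
    previous = 0.0
    for t, shock in enumerate(shocks):
        if kind == "IO" and t == t0:
            shock = shock + size      # enters the recursion
        previous = phi * previous + shock
        path.append(previous)
    if kind == "AO":
        path[t0] += size              # contaminates one observation only
    return path


# Illustrative values (assumptions): zero shocks so the outlier is easy to trace.
zero_shocks = [0.0] * 6
additive = ar1_with_outlier(zero_shocks, phi=0.5, t0=2, size=10.0, kind="AO")
innovation = ar1_with_outlier(zero_shocks, phi=0.5, t0=2, size=10.0, kind="IO")
print(additive)    # only index 2 is affected
print(innovation)  # index 2 and the values after it are affected
```

With the additive outlier the series returns to its clean path immediately after the contaminated point, whereas the innovation outlier decays gradually through the autoregressive recursion.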
This study focuses on the impact of additive outliers on performance of GARCH family models.
Consequently, some GARCH models are reviewed and the impacts of additive outliers on the GARCH models are examined. Furthermore, the study carried out a simulation of the GARCH family models in the presence of outliers, assuming three levels of outliers (small, medium and large) at different time series lengths. The simulation is replicated 1,000 times for each level of outliers and at each time series length, and the performance of the GARCH models is judged using the mean absolute error (MAE) and the root mean square error (RMSE).

Justification
There is a need for appropriate forecasting models that improve forecast performance in financial time series, especially when additive outliers are present, since such outliers violate some of the assumptions of the model.
Since additive outliers affect the forecast performance of GARCH models such that the sum of squares increases as the additive outlier increases to a large number, this study will reveal the GARCH model(s) that are more robust in forecasting volatility when additive outliers exist. The aim of this study is to compare the family of GARCH models when the problem of outliers exists in a financial time series.

The GARCH Family Models
The autoregressive conditional heteroskedasticity (ARCH) model introduced by Robert F. Engle in 1982 is the first model that assumed that volatility is not constant. ARCH models are commonly employed in modelling financial time series that exhibit time-varying volatility clustering, that is, periods of swings interspersed with periods of relative calm (Grek, 2014; Wikipedia, 2017).
Over the years the ARCH model has seen several modifications and extensions, resulting in different forms of the generalized autoregressive conditional heteroskedasticity (GARCH) model. The GARCH model, which is an extension of the ARCH model with an autoregressive moving average (ARMA) formulation, was proposed independently by Bollerslev (1986) and Taylor (1986) in order to model volatility in a parsimonious way and to solve some discovered disadvantages of the ARCH model; this position was corroborated by Rossi (2004), Ragnarsson (2011) and Kelkay & G/Yohannes (2014).

The exponential generalized autoregressive conditional heteroskedasticity (EGARCH) model was proposed by Nelson (1991) to overcome some weaknesses of the GARCH model in handling financial time series, in particular to allow for asymmetric effects between positive and negative asset returns. The log of the conditional variance in EGARCH signifies that the leverage effect is exponential and not quadratic. According to Tsay (2005), the transformation of volatility by its logarithm removes the restrictions on the parameters needed to guarantee the positivity of the variance. The Nonlinear Generalized Autoregressive Conditional Heteroskedasticity (NGARCH) model is described by Higgins & Bera (1992), Hsieh & Ritchken (2005) and Duan, et al (2006) as an important modification of the GARCH model, as it exhibits the leverage effect, a very attractive feature of stock return data, by shifting the minimum of the news impact curve away from the origin.
Other extensions of the model include the Glosten-Jagannathan-Runkle GARCH (GJR-GARCH) model proposed by Glosten, et al (1993), which models asymmetry in the ARCH process. The model assumes a specific parametric form for the conditional heteroskedasticity present in a zero-mean white noise series which, although serially uncorrelated, need not be serially independent. The Threshold GARCH (TGARCH) model was proposed by Zakoian (1994), and the Skew-GARCH (SGARCH) model by De Luca, Genton and Loperfido (2005). The SGARCH model is a GARCH structure that takes into account the heteroskedastic nature of financial time series. It allows for parsimonious modeling of multivariate skewness, and according to De Luca and Loperfido (2012), all its elements are either null or negative, consistent with previous empirical and theoretical findings.

Methodology
This study focuses on the GARCH models that are robust for forecasting the volatility of financial time series data in the presence of outliers, so the GARCH model and some of its extensions are presented.

Autoregressive Conditional Heteroskedasticity (ARCH) Family Model
Every ARCH or GARCH family model requires two distinct specifications, namely the mean equation and the variance equation (Atoi, 2014). The mean equation for a conditionally heteroskedastic return series $y_t$ is given by

$$y_t = E_{t-1}(y_t) + a_t, \qquad (1)$$

where

$$a_t = \sigma_t \varepsilon_t.$$

The mean equation in equation (1) also applies to other GARCH family models. $E_{t-1}(\cdot)$ is the expected value conditional on information available at time $t-1$, while $a_t$ is the error generated from the mean equation at time $t$ and $\{\varepsilon_t\}$ is a sequence of independent and identically distributed random variables with zero mean and unit variance.
The variance equation for an ARCH(p) model is given by

$$\sigma_t^2 = \alpha_0 + \alpha_1 a_{t-1}^2 + \cdots + \alpha_p a_{t-p}^2.$$

It can be seen in the equation that large values of the innovation of asset returns have a bigger impact on the conditional variance because they are squared, which means that a large shock tends to be followed by another large shock; this is the same way the clusters of volatility behave.
So the ARCH(p) model becomes

$$a_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i a_{t-i}^2,$$

where $\varepsilon_t$ is assumed to follow the standard normal, a standardized Student-t or a generalized error distribution (Tsay, 2005).

Asymmetric Power ARCH
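According to Rossi (2004), the asymmetric power ARCH (APARCH) model proposed by Ding, Granger & Engle (1993) forms the basis for deriving the GARCH family models. Given that $a_t = \sigma_t \varepsilon_t$, the model is

$$\sigma_t^{\delta} = \alpha_0 + \sum_{i=1}^{p} \alpha_i \left(|a_{t-i}| - \gamma_i a_{t-i}\right)^{\delta} + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^{\delta},$$

where $\alpha_0 > 0$, $\delta \ge 0$, $\alpha_i \ge 0$, $-1 < \gamma_i < 1$ and $\beta_j \ge 0$. This model imposes a Box-Cox transformation of the conditional standard deviation process and the asymmetric absolute residuals. The leverage effect is the asymmetric response of volatility to positive and negative "shocks".
Setting $\delta = 2$ and $\gamma_i = 0$ gives the GARCH(p, q) model

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i a_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2, \qquad \sum_{i=1}^{\max(p,q)} (\alpha_i + \beta_i) < 1.$$

The restriction on the ARCH and GARCH parameters $(\alpha_i, \beta_j)$ implies that the unconditional variance of $a_t$ is finite, whereas its conditional variance $\sigma_t^2$ evolves over time. It can be observed that if q = 0, then the GARCH parameters $\beta_j$ vanish and what is left is an ARCH(p) model.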
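To expatiate on the properties of GARCH models, the following representation is necessary. Let $\eta_t = a_t^2 - \sigma_t^2$, so that $\sigma_t^2 = a_t^2 - \eta_t$. By substituting $\sigma_{t-i}^2 = a_{t-i}^2 - \eta_{t-i}$ $(i = 0, \ldots, q)$ into Eq. (4), the GARCH model can be rewritten as

$$a_t^2 = \alpha_0 + \sum_{i=1}^{\max(p,q)} (\alpha_i + \beta_i)\, a_{t-i}^2 + \eta_t - \sum_{j=1}^{q} \beta_j \eta_{t-j},$$

with $\alpha_i = 0$ for $i > p$ and $\beta_i = 0$ for $i > q$. It can be seen that $\{\eta_t\}$ is a martingale difference series (i.e., $E(\eta_t) = 0$ and $\mathrm{cov}(\eta_t, \eta_{t-j}) = 0$ for $j \ge 1$). However, $\{\eta_t\}$ in general is not an iid sequence.
A GARCH model can be regarded as an application of the ARMA idea to the squared series $a_t^2$.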
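The ARMA view of the squared series can be checked numerically. The following Python sketch assumes a GARCH(1,1) process with Gaussian innovations and illustrative parameter values (0.1, 0.1, 0.8) that are not taken from the study; the helper name `simulate_garch11` is hypothetical. It verifies that $a_t^2 = \alpha_0 + (\alpha_1 + \beta_1) a_{t-1}^2 + \eta_t - \beta_1 \eta_{t-1}$ holds along a simulated path, with $\eta_t = a_t^2 - \sigma_t^2$.

```python
import math
import random


def simulate_garch11(n, alpha0, alpha1, beta1, seed=0):
    """Simulate shocks a_t and conditional variances sigma2_t from a GARCH(1,1)."""
    rng = random.Random(seed)
    sigma2 = [alpha0 / (1.0 - alpha1 - beta1)]   # start at the unconditional variance
    a = [math.sqrt(sigma2[0]) * rng.gauss(0.0, 1.0)]
    for t in range(1, n):
        sigma2.append(alpha0 + alpha1 * a[t - 1] ** 2 + beta1 * sigma2[t - 1])
        a.append(math.sqrt(sigma2[t]) * rng.gauss(0.0, 1.0))
    return a, sigma2


alpha0, alpha1, beta1 = 0.1, 0.1, 0.8            # illustrative values (assumptions)
a, sigma2 = simulate_garch11(500, alpha0, alpha1, beta1)
eta = [a_t ** 2 - s2 for a_t, s2 in zip(a, sigma2)]

# Largest gap between a_t^2 and its ARMA(1,1) form over t = 1, ..., n-1.
max_gap = max(
    abs(a[t] ** 2
        - (alpha0 + (alpha1 + beta1) * a[t - 1] ** 2 + eta[t] - beta1 * eta[t - 1]))
    for t in range(1, len(a))
)
print(max_gap)  # expected to be at floating-point rounding level
```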
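Using the unconditional mean of an ARMA model results in

$$E(a_t^2) = \frac{\alpha_0}{1 - \sum_{i=1}^{\max(p,q)} (\alpha_i + \beta_i)},$$

provided that the denominator of the fraction is positive (Tsay, 2005). When p = 1 and q = 1, we have the GARCH(1, 1) model given by

$$\sigma_t^2 = \alpha_0 + \alpha_1 a_{t-1}^2 + \beta_1 \sigma_{t-1}^2, \qquad 0 \le \alpha_1, \beta_1 \le 1, \quad \alpha_1 + \beta_1 < 1.$$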
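As a rough numerical check of the unconditional variance, the sketch below simulates a GARCH(1,1) process, $\sigma_t^2 = \alpha_0 + \alpha_1 a_{t-1}^2 + \beta_1 \sigma_{t-1}^2$, and compares the sample mean of $a_t^2$ with $\alpha_0 / (1 - \alpha_1 - \beta_1)$. The parameter values, the Gaussian innovations and the function names are assumptions made for illustration only; the study itself used the rugarch package in R.

```python
import math
import random


def unconditional_variance(alpha0, alpha1, beta1):
    """E(a_t^2) for a GARCH(1,1); requires alpha1 + beta1 < 1."""
    denominator = 1.0 - alpha1 - beta1
    if denominator <= 0.0:
        raise ValueError("alpha1 + beta1 must be less than 1")
    return alpha0 / denominator


def sample_mean_of_squares(n, alpha0, alpha1, beta1, seed=1):
    """Simulate n GARCH(1,1) shocks and return the average of a_t^2."""
    rng = random.Random(seed)
    sigma2 = unconditional_variance(alpha0, alpha1, beta1)
    total = 0.0
    for _ in range(n):
        a_t = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        total += a_t ** 2
        sigma2 = alpha0 + alpha1 * a_t ** 2 + beta1 * sigma2   # variance for the next step
    return total / n


theory = unconditional_variance(0.1, 0.1, 0.8)
estimate = sample_mean_of_squares(100_000, 0.1, 0.1, 0.8)
print(theory, estimate)  # the two values should be close for a long sample
```

The guard on the denominator mirrors the positivity condition stated above: when $\alpha_1 + \beta_1 \ge 1$ the unconditional variance is not defined.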

GJR-GARCH(p, q) Model
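The Glosten-Jagannathan-Runkle GARCH (GJR-GARCH) model, which attempts to address volatility clustering in an innovation process, is obtained from the asymmetric power ARCH model by letting $\delta = 2$:

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i \left(|a_{t-i}| - \gamma_i a_{t-i}\right)^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2,$$

which is the GJR-GARCH model (Rossi, 2004).
Then recall Eq. (8), which allows positive shocks to have a stronger effect on volatility than negative shocks when $\gamma_i < 0$ (Rossi, 2004). But when $p = q = 1$, the GJR-GARCH(1,1) model will be written as

$$\sigma_t^2 = \alpha_0 + \alpha_1 \left(|a_{t-1}| - \gamma_1 a_{t-1}\right)^2 + \beta_1 \sigma_{t-1}^2.$$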
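The asymmetry can be made concrete with a one-step variance update. The sketch below assumes the GJR-GARCH(1,1) form $\sigma_t^2 = \alpha_0 + \alpha_1 (|a_{t-1}| - \gamma_1 a_{t-1})^2 + \beta_1 \sigma_{t-1}^2$; the function name and all parameter values are hypothetical and serve only to show how the sign of $\gamma_1$ determines which shocks raise volatility more.

```python
def gjr_garch11_variance(a_prev, sigma2_prev, alpha0, alpha1, gamma1, beta1):
    """One-step conditional variance of a GJR-GARCH(1,1) in the delta = 2 APARCH form."""
    return alpha0 + alpha1 * (abs(a_prev) - gamma1 * a_prev) ** 2 + beta1 * sigma2_prev


params = dict(alpha0=0.05, alpha1=0.1, beta1=0.8)   # illustrative values (assumptions)

# gamma1 < 0: a positive shock raises next-period variance more than a negative one.
pos_when_gamma_negative = gjr_garch11_variance(2.0, 1.0, gamma1=-0.3, **params)
neg_when_gamma_negative = gjr_garch11_variance(-2.0, 1.0, gamma1=-0.3, **params)

# gamma1 > 0: the ordering reverses, which is the usual leverage effect.
pos_when_gamma_positive = gjr_garch11_variance(2.0, 1.0, gamma1=0.3, **params)
neg_when_gamma_positive = gjr_garch11_variance(-2.0, 1.0, gamma1=0.3, **params)

print(pos_when_gamma_negative > neg_when_gamma_negative)
print(neg_when_gamma_positive > pos_when_gamma_positive)
```

With $\gamma_1 = 0$ the update no longer depends on the sign of the shock and reduces to the symmetric GARCH(1,1) recursion.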

IGARCH(1, 1) Model
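The integrated GARCH (IGARCH) models are unit-root GARCH models. The IGARCH(1, 1) model is specified in Tsay (2005) and Grek (2014) as

$$a_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = \alpha_0 + \beta_1 \sigma_{t-1}^2 + (1 - \beta_1) a_{t-1}^2,$$

where $\{\varepsilon_t\}$ is iid with zero mean and unit variance, and $0 < \beta_1 < 1$. When $\alpha_0 = 0$, the model is also an exponential smoothing model for the $\{a_t^2\}$ series. To see this, rewrite the model as

$$\sigma_t^2 = (1 - \beta_1) a_{t-1}^2 + \beta_1 \sigma_{t-1}^2 = (1 - \beta_1) a_{t-1}^2 + \beta_1 \left[(1 - \beta_1) a_{t-2}^2 + \beta_1 \sigma_{t-2}^2\right].$$

By repeated substitutions, we have

$$\sigma_t^2 = (1 - \beta_1)\left(a_{t-1}^2 + \beta_1 a_{t-2}^2 + \beta_1^2 a_{t-3}^2 + \cdots\right),$$

which is the well-known exponential smoothing formation with $\beta_1$ being the discounting factor (Tsay, 2005).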
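The equivalence between the recursion and the exponentially weighted sum can be checked on a finite sample. This Python sketch assumes the IGARCH(1,1) recursion with zero intercept, $\sigma_t^2 = (1 - \beta_1) a_{t-1}^2 + \beta_1 \sigma_{t-1}^2$; because the sample is finite, the weighted sum carries an extra term $\beta_1^n$ times the starting variance. The function names and input values are hypothetical.

```python
import math


def igarch_recursive(squared_shocks, beta1, sigma2_start):
    """Apply sigma2 = (1 - beta1) * a_prev^2 + beta1 * sigma2_prev through the sample."""
    sigma2 = sigma2_start
    for a2 in squared_shocks:
        sigma2 = (1.0 - beta1) * a2 + beta1 * sigma2
    return sigma2


def igarch_smoothing(squared_shocks, beta1, sigma2_start):
    """Same quantity written as an exponentially weighted sum of past squared shocks."""
    n = len(squared_shocks)
    weighted = sum(
        (1.0 - beta1) * beta1 ** k * squared_shocks[n - 1 - k] for k in range(n)
    )
    return weighted + beta1 ** n * sigma2_start


squared_shocks = [0.4, 1.3, 0.2, 2.5, 0.9]      # illustrative values (assumptions)
recursive = igarch_recursive(squared_shocks, beta1=0.9, sigma2_start=1.0)
smoothed = igarch_smoothing(squared_shocks, beta1=0.9, sigma2_start=1.0)
print(math.isclose(recursive, smoothed))
```

The most recent squared shock receives weight $1 - \beta_1$ and each earlier one is discounted by a further factor of $\beta_1$, so a single large additive outlier keeps influencing the variance estimate for many periods when $\beta_1$ is close to one.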

TGARCH(p, q) Model
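The Threshold GARCH model is another model used to handle leverage effects, and a TGARCH(p, q) model is given by the following:

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} (\alpha_i + \gamma_i N_{t-i})\, a_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2,$$

where $N_{t-i}$ is an indicator that equals 1 if $a_{t-i} < 0$ and 0 otherwise, and $\alpha_i$, $\gamma_i$ and $\beta_j$ are nonnegative parameters satisfying conditions similar to those of GARCH models (Tsay, 2015). When $p = q = 1$, the TGARCH(1, 1) model becomes:

$$\sigma_t^2 = \alpha_0 + (\alpha_1 + \gamma_1 N_{t-1})\, a_{t-1}^2 + \beta_1 \sigma_{t-1}^2.$$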
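A one-step update shows how the threshold term works. The sketch assumes the TGARCH(1,1) form $\sigma_t^2 = \alpha_0 + (\alpha_1 + \gamma_1 N_{t-1}) a_{t-1}^2 + \beta_1 \sigma_{t-1}^2$, where $N_{t-1}$ is 1 for a negative shock and 0 otherwise; the function name and the parameter values are hypothetical.

```python
def tgarch11_variance(a_prev, sigma2_prev, alpha0, alpha1, gamma1, beta1):
    """One-step conditional variance of a TGARCH(1,1) with a negative-shock indicator."""
    negative_indicator = 1.0 if a_prev < 0 else 0.0
    return (alpha0
            + (alpha1 + gamma1 * negative_indicator) * a_prev ** 2
            + beta1 * sigma2_prev)


params = dict(alpha0=0.05, alpha1=0.1, gamma1=0.2, beta1=0.8)  # illustrative (assumptions)
after_positive = tgarch11_variance(2.0, 1.0, **params)
after_negative = tgarch11_variance(-2.0, 1.0, **params)

# A negative shock adds gamma1 * a_prev^2 on top of the symmetric response.
print(after_negative - after_positive)
```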

SGARCH(p, q) Model
The SGARCH model can be written as:

Simulation Procedure
The simulation procedure here is based on the GARCH(1,1) model. The case simulated is that of a financial time series containing outliers at three levels, namely small (0.000005 and 0.00006), medium (10 and 50) and large (100 and 500), at the following time series lengths: 250, 500, 750, 1000, 1250 and 1500. The rugarch package of the R software was used to execute the simulation.
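The simulation design can be sketched as follows. This is not the authors' rugarch code: it is a minimal Python illustration of the design grid (six series lengths by six outlier sizes) and of how an additive outlier contaminates selected observations. The number of outliers per series and their random placement are assumptions, since the text does not state them, and the helper `add_additive_outliers` is hypothetical.

```python
import random

SERIES_LENGTHS = [250, 500, 750, 1000, 1250, 1500]
OUTLIER_SIZES = {
    "small": [0.000005, 0.00006],
    "medium": [10, 50],
    "large": [100, 500],
}


def add_additive_outliers(series, size, n_outliers, seed=0):
    """Return a copy of `series` with `size` added at `n_outliers` random positions.

    The count and placement of outliers are assumptions of this sketch.
    """
    rng = random.Random(seed)
    positions = sorted(rng.sample(range(len(series)), n_outliers))
    contaminated = list(series)
    for position in positions:
        contaminated[position] += size   # only the chosen observations change
    return contaminated, positions


# Every (length, level, size) combination considered in the study design.
design = [
    (length, level, size)
    for length in SERIES_LENGTHS
    for level, sizes in OUTLIER_SIZES.items()
    for size in sizes
]

clean = [0.0] * 250                      # placeholder series for illustration
dirty, where = add_additive_outliers(clean, size=10, n_outliers=5)
print(len(design), where)
```

In a full replication, each cell of `design` would be simulated 1,000 times, each contaminated series would be fitted with the competing GARCH models, and the forecast errors would be averaged.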

Forecast Assessment
The following criteria were used for forecast assessment:
1. The Mean Absolute Error (MAE) is given as

$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| y_t - y_t^{f} \right|.$$

This criterion measures deviation from the series in absolute terms, and measures how much the forecast is biased. This measure is one of the most common ones used for analyzing the quality of different forecasts.
2. The Root Mean Square Error (RMSE) is given as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( y_t - y_t^{f} \right)^2},$$

where $y_t$ is the time series data, $y_t^{f}$ is the forecast value of $y_t$ and $n$ is the number of observations (Caraiani, 2010).
For the two measures above, the smaller the value, the better the fit of the model (Cooray, 2008).
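The two criteria can be computed as in the following Python sketch, which assumes the usual definitions of MAE and RMSE over paired actual and forecast values; the function names and the numbers are illustrative only.

```python
import math


def mean_absolute_error(actual, forecast):
    """Average absolute deviation between actual and forecast values."""
    return sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)


def root_mean_square_error(actual, forecast):
    """Square root of the average squared deviation."""
    return math.sqrt(
        sum((y - f) ** 2 for y, f in zip(actual, forecast)) / len(actual)
    )


actual = [1.0, 2.0, 3.0, 4.0]
forecast = [1.0, 2.0, 3.0, 8.0]          # one large miss, e.g. near an outlier
mae = mean_absolute_error(actual, forecast)
rmse = root_mean_square_error(actual, forecast)
print(mae, rmse)  # 1.0 2.0
```

Because the errors are squared before averaging, a single large deviation inflates RMSE more than MAE, which is why the two criteria can rank models differently when outliers are present.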

Results
The results of the simulation carried out are presented in Table 1 to Table 8 below.

series lengths, sGARCH dominated for lower time series lengths, irrespective of whether MAE or RMSE was used in the assessment.

Recommendations
The study therefore recommends that investors, financial analysts and researchers interested in stock prices and asset returns should adopt gjrGARCH and sGARCH when outliers exist in their data.
The TGARCH model of Zakoian (1994) is similar to GJR-GARCH. It is commonly used to handle leverage effects. It allows the conditional standard deviation to depend on the sign of the lagged innovation, and it does not require parameter restrictions to guarantee that the conditional variance is positive. The Integrated Generalized Autoregressive Conditional Heteroskedasticity (IGARCH) model is a restricted version of the GARCH model in which the persistence parameters sum to one, which imposes a unit root on the GARCH process. The Quadratic Generalized Autoregressive Conditional Heteroskedasticity (QGARCH) model by Sentana (1995) is also used to model asymmetric effects of positive and negative shocks. Hentschel (1995) proposed the family GARCH (fGARCH) model as an omnibus model that nests a variety of other popular symmetric and asymmetric GARCH models, including APARCH, GJR, AVGARCH, NGARCH, and so on. The Skew-Generalized Autoregressive Conditional Heteroskedasticity (SGARCH) model was introduced by De Luca, Genton and Loperfido (2005).

In the SGARCH model, $\sigma_t^2$ is the conditional variance and the innovation (or shock) of the market is hypothesized to be Gaussian. The intercept parameter has to be positive and the remaining parameters nonnegative in order to ensure the positivity of $\sigma_t^2$ (De Luca & Loperfido, 2012).

GARCH models performance in the presence of outliers using the root mean square error (RMSE) from the results of the simulation
When the additive outlier was small, iGARCH outperformed the other models at time series lengths (T) of 250, 750 and 1500, TGARCH performed better than the other models at T = 500 and T = 1000, and NGARCH performed better than the other models at T = 1250. For the medium level of additive outliers, the GARCH models that dominated were sGARCH and gjrGARCH: sGARCH performed better at T = 250, T = 500 and T = 750, whereas gjrGARCH outperformed the other models at T = 1000, T = 1250 and T = 1500. For the large level of outliers, gjrGARCH dominated, performing better at T = 500, 750, 1250 and 1500, while TGARCH performed better at T = 250 and sGARCH outperformed the other models at T = 1000.

GARCH models performance in the presence of outliers using the mean absolute error (MAE) from the results of the simulation
For the small level of additive outliers, iGARCH dominated, as it outperformed the other models at time series lengths (T) of 250, 750 and 1500. TGARCH performed better than the other models at T = 500 and T = 1000, while NGARCH outperformed the other models at T = 1250.

Table 2: The ranks of the RMSE and MAE values from the fGARCH family model at different levels of outlier of 0.

Table 3: The RMSE and MAE values from the fGARCH family model at different levels of outlier of 10, 50 at different time series lengths

Table 4: The ranks of the RMSE and MAE values from the fGARCH family model at different levels of outlier of 10, 50 at different

Table 5: The RMSE and MAE values from the fGARCH family model at different levels of outlier of 100, 500 at different time series lengths

Table 6: The ranks of the RMSE and MAE values from the fGARCH family model at different levels of outlier of 100, 500 at different time series lengths

Table 7: The performances of the fGARCH family models at different levels of outliers and at different time series lengths using RMSE

Table 8: The performances of the fGARCH family models at different levels of outliers and at different time series lengths using MAE