Submitted: 08 May 2023
Posted: 09 May 2023
Abstract
Keywords:
1. Introduction
- The LightGBM and categorical boosting (CatBoost) models yield more accurate VaR estimates than the benchmark QR-IM model.
- The ensemble models achieve good estimates across all quantiles.
- The neural network models are in general quite unstable and could benefit from more training data and perhaps a better model architecture.
- Model stacking and hyperparameter tuning improved overall model predictions.
2. Literature Review
3. Data
3.1. Spot-rate returns
3.2. Examining the data
3.3. Testing the data set
4. Methods
4.1. Generating the training data
4.1.1. The quantile regression implied moments model
4.2. Value-at-Risk
4.3. Ensemble learning
4.4. Deep learning
4.4.1. Artificial neural network (ANN)
4.4.2. Activation functions
4.5. Implementation
4.6. The chosen ensemble models
4.7. Hyperparameter architecture
4.8. The considered neural networks

4.9. Explainable AI
4.10. Developed neural network models
4.10.1. Feed forward neural network

4.10.2. Recurrent neural network

4.10.3. Long short-term memory neural network

4.11. Evaluating the models
5. Results
5.1. Estimating the quantile regression VaR model

5.2. General scheme for evaluating the machine learning models
5.2.1. Illustrating the scheme: Random Forest
5.3. Summary of results - comparing the ML models to the QR-IM model
6. Conclusion
- The ensemble methods are better at predicting the spike behavior of the EURUSD than the neural networks. The ensemble methods are well suited for predicting the daily EURUSD currency cross distribution, including its VaR.
- The second LightGBM model, categorical boosting (CatBoost), and the stacked combination of the two stand out when their breach ratios are compared with those of the baseline QR-IM model. These models perform as well as or better than the baseline at the lower quantiles, and as well as or worse than it at the higher quantiles.
- Among the stand-alone models, the random forest and LightGBM (1) and (2) perform best in terms of the sum-if-breach (i.b.) and sum-if-no-breach (i.n.b.) values.
- The neural networks leave considerable room for improvement relative to the ensemble methods. One way to improve them might be to introduce more layers and more nodes. However, doing so makes the models more of a black box.
- An advantage of the neural networks over the ensemble methods lies in how they estimate the quantiles. A neural network can be constructed so that all quantiles are estimated simultaneously, whereas the ensemble methods forecast one quantile at a time. For very large data sets, computing all quantiles simultaneously can significantly reduce computation time.
- Model stacking and hyperparameter tuning significantly improved the models' overall breach-ratio performance.
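The simultaneous estimation of all quantiles mentioned above is commonly obtained by giving a network one output per quantile level and training it on the pinball (quantile) loss averaged over all levels. The sketch below illustrates that loss in plain Python. It is an assumption that the neural networks here were trained this way, since the loss function is not stated in this outline; the names `QUANTILES`, `pinball_loss`, and `multi_quantile_loss` are hypothetical.

```python
from typing import Sequence

# Quantile levels reported in the results tables of this article.
QUANTILES = (0.01, 0.025, 0.05, 0.95, 0.975, 0.99)


def pinball_loss(y: float, q_hat: float, alpha: float) -> float:
    """Pinball loss for one observation y and one quantile forecast q_hat."""
    diff = y - q_hat
    # Under-forecasts are weighted by alpha, over-forecasts by (1 - alpha).
    return alpha * diff if diff >= 0 else (alpha - 1.0) * diff


def multi_quantile_loss(
    y: Sequence[float],
    q_hats: Sequence[Sequence[float]],
    alphas: Sequence[float] = QUANTILES,
) -> float:
    """Average pinball loss over all observations and all quantile levels.

    q_hats holds one row per observation with one forecast per level in
    alphas, mimicking a network with len(alphas) output nodes (assumed layout).
    """
    if len(y) != len(q_hats) or len(y) == 0:
        raise ValueError("y and q_hats must be non-empty and of equal length")
    total = 0.0
    for obs, row in zip(y, q_hats):
        if len(row) != len(alphas):
            raise ValueError("each forecast row needs one value per quantile")
        total += sum(pinball_loss(obs, q, a) for q, a in zip(row, alphas))
    return total / (len(y) * len(alphas))
```

A tree-based learner would instead minimise `pinball_loss` for a single `alpha` per fitted model, which is why the ensemble methods need one model per quantile.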
6.1. Future research
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest







| n | 2930 |
| Mean | -0.0001 |
| Std. dev | 0.0053 |
| Skewness | 0.0330 |
| Kurtosis | 4.7508 |
| Min | -0.0229 |
| Max | 0.0295 |
| Quantile – α | 1.0 % | 2.5 % | 5.0 % | 95.0 % | 97.5 % | 99.0 % |
| Constant | -0.61 (i) | -0.21 | -0.07 | 0.09 | 0.05 | 0.12 |
| ATM | -0.06 (i) | -0.09 (i) | -0.08 (i) | 0.09 (i) | 0.12 (i) | 0.15 (i) |
| 25-delta RR | 0.21 (i) | 0.16 (i) | 0.10 (ii) | 0.16 (i) | 0.30 (i) | 0.38 (i) |
| Breach Ratio | 1.04 | 2.48 | 4.97 | 94.99 | 97.43 | 99.01 |
| Quantile – α | 1.0 % | 2.5 % | 5.0 % | 95.0 % | 97.5 % | 99.0 % |
| Breach Ratio | 0.7 | 2.1 | 5.75 | 95.09 | 97.62 | 99.44 |
| Sum if breach | -0.005 | -0.018 | -0.054 | 0.067 | 0.022 | 0.005 |
| Sum if no breach | -7.421 | -5.641 | -4.483 | 4.729 | 5.982 | 7.650 |
| Min. VaR | -0.023 | -0.022 | -0.019 | 0.004 | 0.005 | 0.006 |
| Max. VaR | -0.008 | -0.005 | -0.004 | 0.016 | 0.023 | 0.029 |
| Model | Data set used for training |
| XGBoost (1) | QR-IM generated data set |
| XGBoost (2) | Gradient boost generated data set |
| LGBM (1) | QR-IM generated data set |
| LGBM (2) | QR-IM generated data set with new movement variable (*) |
| LGBM (3) | LGBM generated data set |
| Categorical Boosting | QR-IM generated data set |
| FFNN | QR-IM generated data set with new movement variable (*) |
| RNN | QR-IM generated data set with new movement variable (*) |
| LSTM | QR-IM generated data set with new movement variable (*) |
| Quantile – α | 1.0 % | 2.5 % | 5.0 % | 95.0 % | 97.5 % | 99.0 % |
| Breach ratio |
| Base line model |
| QR-IM | 0.7 | 2.1 | 6.3 | 95 | 97.5 | 99.3 |
| Ensemble methods |
| Random Forest | 0.7 | 2.1 | 5.75 | 95.09 | 97.62 | 99.44 |
| XGB (1) | 0.7 | 2.1 | 5.47 | 95.09 | 97.9 | 99.44 |
| XGB (2) | 1.26 | 3.65 | 7.01 | 95.37 | 98.46 | 99.16 |
| LightGBM (1) | 0.55 | 1.82 | 5.33 | 95.09 | 97.62 | 99.44 |
| LightGBM (2) | 0.7 | 2.1 | 5.05 | 95.23 | 97.76 | 99.44 |
| LightGBM (3) | 1.4 | 4.49 | 8.42 | 95.65 | 96.63 | 99.3 |
| CatBoost | 0.7 | 2.23 | 5.47 | 94.95 | 97.62 | 99.44 |
| Stacking models |
| LGBM2 & CatBoost | 0.7 | 2.18 | 5.31 | 95.05 | 97.67 | 99.44 |
| Neural network |
| FFNN | 2.38 | 4.35 | 8.27 | 90.74 | 96.21 | 98.74 |
| RNN | 2.42 | 4.27 | 7.68 | 90.75 | 96.02 | 98.72 |
| LSTM | 3.13 | 4.97 | 8.68 | 96.16 | 96.16 | 98.44 |
| Quantile – α | | 1.0 % | 2.5 % | 5.0 % | 95.0 % | 97.5 % | 99.0 % | Absolute sum |
| Ensemble methods |
| Random Forest | i.b. | -0.005 | -0.018 | -0.054 | 0.067 | 0.022 | 0.005 | 0.171 |
| | i.n.b. | -7.421 | -5.641 | -4.483 | 4.729 | 5.982 | 7.650 | 35.91 |
| XGB (1) | i.b. | -0.006 | -0.019 | -0.052 | 0.064 | 0.023 | 0.005 | 0.169 |
| | i.n.b. | -7.442 | -5.680 | -4.521 | 4.756 | 6.028 | 7.643 | 36.07 |
| XGB (2) | i.b. | -0.025 | -0.056 | -0.184 | 0.112 | 0.040 | 0.020 | 0.437 |
| | i.n.b. | -5.860 | -5.152 | -3.612 | 4.155 | 5.477 | 6.391 | 30.65 |
| LightGBM (1) | i.b. | -0.004 | -0.017 | -0.053 | 0.066 | 0.022 | 0.004 | 0.166 |
| | i.n.b. | -7.445 | -5.668 | -4.500 | 4.748 | 6.005 | 7.691 | 36.06 |
| LightGBM (2) | i.b. | -0.005 | -0.019 | -0.055 | 0.060 | 0.020 | 0.005 | 0.164 |
| | i.n.b. | -7.445 | -5.682 | -4.526 | 4.816 | 6.046 | 7.712 | 36.23 |
| LightGBM (3) | i.b. | -0.020 | -0.054 | -0.104 | 0.067 | 0.033 | 0.010 | 0.288 |
| | i.n.b. | -6.255 | -4.860 | -4.110 | 4.929 | 5.657 | 7.164 | 32.97 |
| CatBoost | i.b. | -0.006 | -0.020 | -0.056 | 0.064 | 0.022 | 0.005 | 0.173 |
| | i.n.b. | -7.411 | -5.641 | -4.496 | 4.789 | 6.017 | 7.688 | 36.04 |
| Neural network |
| FFNN | i.b. | -0.052 | -0.098 | -0.188 | 0.186 | 0.086 | 0.038 | 0.648 |
| | i.n.b. | -9.477 | -8.139 | -6.752 | 6.420 | 7.917 | 9.873 | 48.57 |
| RNN | i.b. | -0.069 | -0.157 | -0.208 | 0.201 | 0.088 | 0.047 | 0.770 |
| | i.n.b. | -9.120 | -8.486 | -6.287 | 6.274 | 8.406 | 10.064 | 47.00 |
| LSTM | i.b. | -0.052 | -0.095 | -0.200 | 0.213 | 0.110 | 0.046 | 0.716 |
| | i.n.b. | -9.303 | -8.114 | -6.529 | 6.132 | 7.346 | 9.242 | 46.66 |
| Christoffersen test (p-value) | 1.0 % | 2.5 % | 5.0 % | 95.0 % | 97.5 % | 99.0 % |
| Random Forest | 0.398 | 0.489 | 0.364 | 0.364 | 0.846 | 0.200 |
| XGB (1) | 0.398 | 0.489 | 0.564 | 0.917 | 0.489 | 0.200 |
| XGB (2) | 0.175 | 0.000 | 0.000 | 0.000 | 0.157 | 0.001 |
| LightGBM (1) | 0.200 | 0.343 | 0.810 | 0.917 | 0.846 | 0.200 |
| LightGBM (2) | 0.398 | 0.489 | 0.945 | 0.781 | 0.660 | 0.200 |
| LightGBM (3) | 0.306 | 0.002 | 0.000 | 0.419 | 0.157 | 0.398 |
| CatBoost | 0.398 | 0.660 | 0.564 | 0.945 | 0.846 | 0.200 |
| DQ test (p-value) | 1.0 % | 2.5 % | 5.0 % | 95.0 % | 97.5 % | 99.0 % |
| Random Forest | 0.622 | 0.929 | 0.191 | 0.727 | 0.255 | 0.965 |
| XGB (1) | 0.806 | 0.944 | 0.192 | 0.280 | 0.858 | 0.965 |
| XGB (2) | 0.128 | 0.000 | 0.000 | 0.000 | 0.084 | 0.001 |
| LightGBM (1) | 0.673 | 0.938 | 0.319 | 0.648 | 0.717 | 0.964 |
| LightGBM (2) | 0.617 | 0.942 | 0.491 | 0.381 | 0.201 | 0.963 |
| LightGBM (3) | 0.000 | 0.001 | 0.000 | 0.951 | 0.506 | 0.991 |
| CatBoost | 0.692 | 0.943 | 0.371 | 0.678 | 0.204 | 0.964 |
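The coverage-based backtests in the tables above start from the number of breaches. The sketch below shows how a breach ratio and the unconditional coverage likelihood-ratio test can be computed. That test is the first component of Christoffersen's conditional coverage test; the independence component and the DQ test are not shown. It is an assumption that a breach means the realized return falls below the quantile forecast, which is consistent with reported breach ratios lying close to the nominal level at both tails. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def breach_ratio(returns: Sequence[float], var_forecasts: Sequence[float]) -> float:
    """Percentage of observations whose return falls below the quantile forecast."""
    if len(returns) != len(var_forecasts) or len(returns) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    breaches = sum(r < v for r, v in zip(returns, var_forecasts))
    return 100.0 * breaches / len(returns)


def coverage_lr_test(n_breaches: int, n_obs: int, alpha: float) -> Tuple[float, float]:
    """Unconditional coverage LR statistic and its chi-square(1) p-value."""
    if not 0.0 < alpha < 1.0 or not 0 <= n_breaches <= n_obs or n_obs == 0:
        raise ValueError("need 0 < alpha < 1 and 0 <= n_breaches <= n_obs, n_obs > 0")

    def loglik(p: float) -> float:
        # Binomial log-likelihood; terms with a zero count are skipped.
        ll = 0.0
        if n_breaches > 0:
            ll += n_breaches * math.log(p)
        if n_obs - n_breaches > 0:
            ll += (n_obs - n_breaches) * math.log(1.0 - p)
        return ll

    p_hat = n_breaches / n_obs
    lr = max(0.0, -2.0 * (loglik(alpha) - loglik(p_hat)))
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value
```

For illustrative counts of 5 breaches in 713 observations at the 1 % level, `coverage_lr_test(5, 713, 0.01)` returns a p-value of about 0.40, so the nominal coverage would not be rejected at conventional significance levels.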
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).