Submitted: 13 June 2024
Posted: 21 June 2024
Abstract
Keywords:
MSC: 62M10; 62M20; 62J05
1. Introduction
2. Orthogonal Greedy Algorithm
3. Supervised Dynamic PCA
4. The Proposed GO-sdPCA
Algorithm 1: GOGA+HDAIC ()
Input: Maximum number of iterations , HDAIC parameter C, number of lags , candidate set
Initialization: ; selected index
Output: selected indices
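The individual steps of Algorithm 1 are not reproduced above, so the following is only a minimal sketch of a generic group orthogonal greedy algorithm with an HDAIC-type trimming step, not necessarily the authors' exact procedure. It assumes that each candidate index owns a group of columns (for example, a predictor and its lags), that each iteration picks the group giving the largest drop in the residual sum of squares, and that the criterion has the form (1 + C · #columns · log p / n) · σ̂². The function names `dot`, `orthonormalize`, and `goga_hdaic`, the form of the penalty, and the toy data are all assumptions made for illustration.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors stored as lists."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(cols, basis):
    """Modified Gram-Schmidt: orthonormalize `cols` against `basis` and each other."""
    new = []
    for c in cols:
        v = list(c)
        for q in basis + new:
            coef = dot(v, q)
            v = [a - coef * b for a, b in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        if norm > 1e-10:  # drop columns that are numerically collinear
            new.append([a / norm for a in v])
    return new


def goga_hdaic(y, groups, max_iter, C):
    """Greedy group selection followed by an HDAIC-type trimming step.

    y        : response, list of length n
    groups   : dict mapping a group index to its list of columns
    max_iter : maximum number of greedy iterations
    C        : penalty constant in the assumed criterion
    """
    n, p = len(y), len(groups)
    resid, basis = list(y), []
    path, crit = [], []
    candidates = set(groups)
    for _ in range(min(max_iter, p)):
        best, best_gain, best_q = None, -1.0, []
        for j in sorted(candidates):
            q = orthonormalize(groups[j], basis)
            gain = sum(dot(resid, v) ** 2 for v in q)  # drop in residual sum of squares
            if gain > best_gain:
                best, best_gain, best_q = j, gain, q
        for v in best_q:  # project the residual off the newly added directions
            coef = dot(resid, v)
            resid = [a - coef * b for a, b in zip(resid, v)]
        basis += best_q
        path.append(best)
        candidates.remove(best)
        sigma2 = dot(resid, resid) / n
        # Assumed criterion: (1 + C * (#columns) * log(p) / n) * sigma2
        crit.append((1.0 + C * len(basis) * math.log(p) / n) * sigma2)
    if not crit:
        return []
    k_hat = crit.index(min(crit)) + 1  # keep the path up to the minimizer
    return path[:k_hat]


# Toy data (hypothetical): six series, each contributing itself and one lag.
rng = random.Random(0)
n = 200
series = [[rng.gauss(0, 1) for _ in range(n + 1)] for _ in range(6)]
groups = {j: [s[1:], s[:-1]] for j, s in enumerate(series)}
y = [2.0 * series[0][t + 1] + 1.5 * series[2][t] + 0.1 * rng.gauss(0, 1)
     for t in range(n)]
print(goga_hdaic(y, groups, max_iter=4, C=5.0))
```

On this toy design only groups 0 and 2 carry signal, so the trimmed selection should typically consist of those two indices. Working with an orthonormal basis avoids refitting a full regression at every step, which is the usual reason for the "orthogonal" form of the greedy update.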
Algorithm 2: Peeling ()
Input: Number of peeling iterations M, number of GOGA iterations , HDAIC parameter C, number of lags in group predictors
Initialization: ,
Output: selected indices
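The steps of Algorithm 2 are likewise not reproduced above. One plausible reading, offered here only as a hedged sketch, is that peeling applies a selection routine repeatedly, removing the indices chosen in each round from the candidate set and accumulating them. The function `peel`, the callable `select`, and the toy selector `top_two` are hypothetical names introduced for illustration; in practice `select` would be a GOGA+HDAIC-style routine run on the remaining candidates.

```python
from typing import Callable, Hashable, Iterable, List, Set


def peel(candidates: Iterable[Hashable],
         select: Callable[[Set[Hashable]], List[Hashable]],
         n_peel: int) -> List[Hashable]:
    """Run `select` up to `n_peel` times, removing chosen indices each round."""
    remaining = set(candidates)
    chosen: List[Hashable] = []
    for _ in range(n_peel):
        if not remaining:
            break
        # Keep only indices that are still available.
        picked = [j for j in select(remaining) if j in remaining]
        if not picked:  # nothing new was selected, so stop early
            break
        chosen.extend(picked)
        remaining.difference_update(picked)
    return chosen


# Toy selector (hypothetical): take the two highest-scoring remaining indices.
scores = {0: 5.0, 1: 4.0, 2: 3.0, 3: 2.0, 4: 1.0}


def top_two(remaining):
    return sorted(remaining, key=lambda j: -scores[j])[:2]


print(peel(scores, top_two, n_peel=2))  # prints [0, 1, 2, 3]
```

Passing the selector as a callable keeps the peeling loop independent of how each round's selection is computed.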
5. Simulation Studies
5.1. Simulation Designs and Results
6. Empirical Examples
6.1. U.S. Macroeconomic Data
6.2. Particulate Matters in Taiwan
7. Discussion and Concluding Remarks


| | GsP* | GsP | sdPCA | SW | LYB | Lasso | RF |
| --- | --- | --- | --- | --- | --- | --- | --- |
| (5, 50) | 1.894 | 2.443 | 1.993 | 2.657 | 2.136 | 2.207 | 2.860 |
| (10, 100) | 2.406 | 3.299 | 2.488 | 3.288 | 2.909 | 2.831 | 4.353 |
| (15, 150) | 3.315 | 3.908 | 3.584 | 5.368 | 5.201 | 3.537 | 5.776 |
| (10, 100) | 2.391 | 2.947 | 2.590 | 3.482 | 2.077 | 2.544 | 4.428 |
| (20, 200) | 3.020 | 4.355 | 3.242 | 4.626 | 3.657 | 3.679 | 6.604 |
| (30, 300) | 3.818 | 5.581 | 4.155 | 5.666 | 4.651 | 4.450 | 8.253 |

| | GsP* | GsP | sdPCA | SW | LYB | Lasso | RF |
| --- | --- | --- | --- | --- | --- | --- | --- |
| (5, 50) | 0.857 | 0.883 | 0.695 | 1.017 | 0.966 | 0.870 | 2.017 |
| (10, 100) | 2.833 | 3.137 | 2.834 | 3.436 | 3.436 | 3.802 | 6.971 |
| (15, 150) | 6.559 | 7.068 | 6.379 | 7.801 | 7.761 | 9.532 | 16.264 |
| (10, 100) | 4.091 | 4.547 | 2.259 | 5.224 | 4.917 | 5.118 | 8.977 |
| (20, 200) | 15.891 | 17.357 | 8.411 | 20.012 | 19.411 | 20.961 | 35.132 |
| (30, 300) | 34.557 | 37.956 | 34.543 | 42.904 | 42.905 | 50.996 | 70.180 |

| | GsP* | GsP | sdPCA | SW | LYB | Lasso | RF |
| --- | --- | --- | --- | --- | --- | --- | --- |
| (5, 50) | 12.255 | 12.166 | 12.347 | 12.565 | 12.469 | 7.537 | 12.813 |
| (10, 100) | 17.818 | 18.857 | 17.608 | 17.371 | 17.701 | 16.670 | 18.427 |
| (15, 150) | 23.142 | 24.417 | 23.041 | 21.777 | 22.027 | 21.125 | 22.660 |
| (10, 100) | 16.523 | 17.540 | 17.767 | 18.647 | 18.968 | 14.833 | 18.602 |
| (20, 200) | 27.266 | 29.110 | 26.063 | 25.291 | 25.717 | 25.953 | 26.503 |
| (30, 300) | 33.763 | 35.635 | 32.438 | 31.672 | 31.827 | 31.340 | 32.427 |












Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).