Version 1
: Received: 15 April 2024 / Approved: 16 April 2024 / Online: 16 April 2024 (10:29:34 CEST)
How to cite:
Alharthi, M.; Mahmood, A. Enhanced Linear and Vision Transformer-Based Architectures for Time Series Forecasting. Preprints 2024, 2024041024. https://doi.org/10.20944/preprints202404.1024.v1
APA Style
Alharthi, M., & Mahmood, A. (2024). Enhanced Linear and Vision Transformer-Based Architectures for Time Series Forecasting. Preprints. https://doi.org/10.20944/preprints202404.1024.v1
Chicago/Turabian Style
Alharthi, M., and Ausif Mahmood. 2024. "Enhanced Linear and Vision Transformer-Based Architectures for Time Series Forecasting." Preprints. https://doi.org/10.20944/preprints202404.1024.v1
Abstract
Time series forecasting has been a challenging area in the field of Artificial Intelligence. Various approaches, such as linear neural networks, recurrent neural networks, convolutional neural networks, and, more recently, Transformers, have been attempted for the time series forecasting domain. Although Transformer-based architectures have been outstanding in the Natural Language Processing domain, especially in autoregressive language modeling, the initial attempts to use Transformers in the time series arena have met with mixed success. One recent paper demonstrated that modeling the multivariate time series problem as independent channels on a Transformer architecture produced lower mean squared and mean absolute errors on benchmarks; however, another recent paper claimed that a simple linear neural network outperforms previous approaches, including Transformer-based designs. We investigate this paradox in detail, comparing the linear neural network and Transformer-based designs, and provide insights into why a certain approach may be better for a particular type of problem. We also improve upon the recently proposed simple linear neural network architecture by using dual pipelines with batch normalization and reversible instance normalization. Our enhanced architecture outperforms all existing architectures for time series forecasting on the majority of the popular benchmarks.
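The abstract's two normalization components can be illustrated concretely. Below is a minimal NumPy sketch of reversible instance normalization (per-series standardization whose statistics are restored after the forecast head) combined with a simple batch-normalized linear pipeline. The dual-pipeline combination, the averaging of the two outputs, and the weight matrices `w1`/`w2` are illustrative assumptions, not the authors' exact architecture.

```python
import numpy as np

def revin_normalize(x, eps=1e-5):
    """Reversible instance normalization: standardize each series
    (each row) by its own mean and std, returning the statistics
    so the transform can be inverted after forecasting."""
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True) + eps
    return (x - mean) / std, (mean, std)

def revin_denormalize(y, stats):
    """Invert revin_normalize using the saved per-series statistics."""
    mean, std = stats
    return y * std + mean

def dual_pipeline_forecast(x, w1, w2):
    """Hypothetical dual-pipeline linear forecaster (sketch only).
    x: (batch, lookback) array; w1, w2: (lookback, horizon) weights."""
    # Pipeline 1: RevIN -> linear head -> inverse RevIN.
    xn, stats = revin_normalize(x)
    y1 = revin_denormalize(xn @ w1, stats)
    # Pipeline 2: batch-style normalization (z-score over the batch
    # dimension, a simplification of learned batch norm) -> linear head.
    bmean = x.mean(axis=0, keepdims=True)
    bstd = x.std(axis=0, keepdims=True) + 1e-5
    y2 = ((x - bmean) / bstd) @ w2
    # Simple average of the two pipelines (an assumed fusion rule).
    return 0.5 * (y1 + y2)
```

The key property RevIN provides is that distribution shift between the lookback window and the forecast horizon is absorbed by the saved per-instance statistics, so the linear head only has to model normalized dynamics.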
Keywords
transformer; linear network; time series forecasting; state space model
Subject
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.