Preprint
Article

Testing RNN-LSTM Forecasting with Simulated Astronomical Lightcurves

This version is not peer-reviewed.

Submitted: 19 July 2019

Posted: 22 July 2019


Abstract
With an explosion of data expected in the near future from observatories spanning radio to gamma-rays, we have entered the era of time-domain astronomy. Historically, this field has been limited to modeling temporal structure with time-series simulations, restricted to energy ranges blessed with excellent statistics, as in X-rays. In addition to the ever-increasing volume and variety of astronomical lightcurves, there is a plethora of transient types detected not only across the electromagnetic spectrum but across multiple messengers, such as counterparts of neutrino and gravitational-wave sources. As a result, precise and fast forecasting and modeling of lightcurves, or time series, will play a crucial role both in understanding the underlying physical processes and in coordinating multiwavelength and multimessenger campaigns. In this regard, deep learning algorithms such as recurrent neural networks (RNNs) should prove extremely powerful for forecasting, as they have in several other domains. Here we test the performance of a very successful class of RNNs, Long Short-Term Memory (LSTM) networks, on simulated lightcurves. We focus on univariate forecasting of the types of lightcurves typically found in active galactic nuclei (AGN) observations. Specifically, we explore the sensitivity of training and test losses to key parameters of the LSTM network and to data characteristics, namely gaps and complexity, the latter measured by the number of Fourier components. We find that LSTMs typically perform better on pink- or flicker-noise sources. The key parameters governing performance are the LSTM batch size and the gap percentage of the lightcurves. While a batch size of $10-30$ appears optimal, the best test and training losses occur with under $10 \%$ missing data, for both periodic and random gaps in pink noise. Performance is far worse for red noise, which compromises the detectability of transients.
Performance degrades monotonically with data complexity, measured by the number of Fourier components, which is especially relevant in the context of complicated quasi-periodic signals buried under noise. Thus, we show that time-series simulations are excellent guides for the use of RNN-LSTMs in forecasting.
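The simulated lightcurves described in the abstract — colored noise built from a fixed number of Fourier components, then degraded by periodic or random gaps — can be sketched as follows. This is a minimal illustration, not the authors' code: the function names (`simulate_lightcurve`, `mask_gaps`) and the simple sum-of-sinusoids construction are assumptions for the sketch, with power-law amplitudes giving pink noise for $\beta = 1$ and red noise for $\beta = 2$.

```python
import math
import random


def simulate_lightcurve(n_points, n_components, beta, seed=0):
    """Sum of Fourier components with power-law amplitudes A_k ~ k^(-beta/2).

    beta = 1 approximates pink/flicker noise; beta = 2 approximates red noise.
    More components means a more complex lightcurve (harder to forecast).
    """
    rng = random.Random(seed)
    flux = [0.0] * n_points
    for k in range(1, n_components + 1):
        amplitude = k ** (-beta / 2.0)
        phase = rng.uniform(0.0, 2.0 * math.pi)
        for t in range(n_points):
            flux[t] += amplitude * math.sin(2.0 * math.pi * k * t / n_points + phase)
    return flux


def mask_gaps(flux, gap_fraction, periodic=False, seed=0):
    """Replace a fraction of points with None, via periodic or random gaps."""
    rng = random.Random(seed)
    n = len(flux)
    n_gap = int(round(gap_fraction * n))
    if periodic and n_gap > 0:
        # Evenly spaced gaps, trimmed to exactly n_gap indices.
        step = n // n_gap
        gap_indices = set(sorted(range(0, n, step))[:n_gap])
    else:
        gap_indices = set(rng.sample(range(n), n_gap))
    return [None if i in gap_indices else f for i, f in enumerate(flux)]
```

In this setup, the masked series would be interpolated or windowed before being fed to the LSTM, and the gap fraction (e.g. below the 10% threshold the abstract identifies) becomes a controlled experimental knob.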
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
