Financial Market Prediction Using Recurrence Plot and Convolutional Neural Network

An application of a deep convolutional neural network and recurrence plots to financial market movement prediction is presented. Although its information is challenging and subjective to interpret, the pattern formed by a recurrence plot provides useful insight into the underlying dynamical system. We used recurrence plots of seven financial time series to train a deep neural network for financial market movement prediction. Our approach was tested on our dataset and achieved an average classification accuracy of 53.25%. The result suggests that a well-trained deep convolutional neural network can learn from a recurrence plot and predict financial market direction.


Introduction
A time series is a sequence of values observed at successive time points [1]. Time series data can be multivariate or univariate. A multivariate time series records more than one value at each time point. Since day-to-day life produces a great deal of sequential data, time series data are easy to obtain. They are vital in financial market analysis, clinical prediction (for instance, electrocardiogram (ECG) and electroencephalogram (EEG) signals), and the study of natural phenomena (for instance, weather readings). One of the most interesting approaches to business intelligence is stock price prediction using financial time series. However, due to the volatile and nonlinear nature of the stock market, financial market prediction is a complex problem. In addition, the high level of noise in stock time series worsens the problem.
Many traditional methods have been applied to stock market prediction. The two conventional approaches are technical analysis and fundamental analysis. Technical analysis relies on historical prices [2]. It uses charts to study the pattern of past stock prices, and several stock price prediction methods have been developed on the basis of this approach. Its goal is to analyze the existing trend and predict the future direction of the stock market. Technicians analyze technical indicators derived from historical data. The most common technical indicators are the moving average, moving average convergence divergence (MACD), relative strength index (RSI), and on-balance volume (OBV). Fundamental analysis, on the other hand, is information dependent [3]. Its aim is to gather information that affects supply and demand, such as company economic data, reports, and news, in order to forecast the market (for instance, [4][5][6]).
Although machine learning models are complex and prone to overfitting, numerous works have studied machine learning algorithms for financial market price prediction. Nowadays, traders use intelligent trading systems to support their trading and investment decisions. For this reason, algorithmic trading has drawn public attention. Consequently, there is a need for an algorithm that predicts market prices and movement direction efficiently. A potential problem of applying a machine learning algorithm to market prediction is that the model's decision is a black box and not easily understandable [7], yet traders need a comprehensible decision. Machine learning approaches include, but are not limited to, the artificial neural network (ANN), the support vector machine (SVM), and hybrid models [8]. Feed-forward artificial neural networks have shown good performance in multivariate time series prediction [9]. The ANN approach is well researched in stock price prediction and has achieved good results [10,11]. Notably, ANNs enhanced with advanced training algorithms perform well [12]. Moreover, studies have shown that the SVM is a promising approach for stock market prediction [13,14].
Recently, deep neural networks (DNNs) have gained considerable popularity in the machine learning community. DNNs have achieved remarkable results in image and speech recognition [15]. However, the use of DNNs for financial market prediction has not been researched adequately. Xiao Ding et al. [16] proposed a deep learning approach for event-driven stock market prediction and achieved an overall 6% improvement in S&P index and individual stock forecasting. Matthew Dixon et al. [17] proposed a deep neural network method to classify financial market movement. Their approach was tested on 43 CME commodities and futures mid-prices and achieved an average accuracy of 42%.
Research has shown that classification-based stock market prediction outperforms level-based models [18]. We are therefore motivated to train a deep convolutional neural network with recurrence plots of financial time series to classify market movement direction (up or down). The recurrence plot is a technique to reveal and visualize complex deterministic patterns in a dynamical system [19]. The recurrence plot and recurrence quantification analysis (RQA) are popular methodologies for nonlinear dynamical systems [20]. They are able to confirm a change in a dynamical system. However, to the best of the authors' knowledge, there has been no attempt to train a machine learning model using recurrence plots for algorithmic trading. Thus, we divide a historical financial time series into segments of multiple time frames using a fixed sliding window size and then generate a recurrence plot for each time frame, which we use to train our deep convolutional neural network (CNN). More recently, Dawei Song et al. [21] used recurrence plots to train a convolutional neural network for affective state recognition based on physiological signals and gained a 5% to 9% performance improvement over the baseline.
The overall contribution of this work is to study how well a deep CNN can learn patterns from recurrence plots of financial time series. A single deep learning model is used to classify financial market movement direction. The output of the model is either 0, interpreted as down, or 1, interpreted as up.
The rest of this paper is organized as follows. The following section describes the nature of the data used in this work. Section 3, Methods, introduces the theoretical background of the methods we follow and describes the deep learning model architecture (in our case, a deep CNN). Section 4 provides experimental results measuring the performance of our model. Finally, Section 5 gives concluding remarks.

The Data
This research work considered seven data signals, as shown in Table 1. All data represent daily prices. The closing prices of IBM, General Electric, and the S&P 500 were obtained from the Yahoo historical data source [22]. For both IBM and General Electric, 54 years of historical closing prices were taken, as depicted in Figure 1 and Figure 2 respectively. Since the market was less liquid before 1981, 35 years of price data are considered for the S&P 500 index. Gold price data covering 48 years were obtained from the Deutsche Bundesbank Data Repository [23]. The source of the rest of the experimental data is the Quandl financial data repository [24]. The Standard and Poor's 500 composite stock price index, known as the S&P 500, and the closing price of IBM, the world's largest information technology company, are well-known time series. The latter is mentioned by Box et al. [25].
The autocorrelation and histogram plots in Figure 3 and Figure 4 show the nature of sample data from our experimental data set. As the lag increases, the autocorrelation coefficient drops to zero.
Data set preprocessing consists of segmenting the raw signal into multiple time frames. Algorithm 1 is used to generate the segments of the data and their corresponding labels. The raw signal is segmented with a fixed sliding window and a fixed shift size. Since the goal of this research is to predict weekly market movement direction, a shift size of seven is employed. No further preprocessing is applied to the raw signal.
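The segmentation step described above can be sketched as follows. This is a minimal illustration rather than the paper's Algorithm 1: the function name `segment_series` is hypothetical, and the labeling rule (compare the price `shift` steps after the window with the last price inside the window) is an assumption, since the text does not spell out how labels are derived.

```python
import numpy as np

def segment_series(prices, window=300, shift=7):
    """Split a 1-D price series into overlapping windows with up/down labels.

    Labeling rule (an assumption, not taken from the paper): a segment is
    labeled 1 (up) if the price `shift` steps after the window's last point
    is higher than that last point, otherwise 0 (down).
    """
    prices = np.asarray(prices, dtype=float)
    segments, labels = [], []
    # Stop early enough that the look-ahead price used for the label exists.
    for start in range(0, len(prices) - window - shift + 1, shift):
        end = start + window                    # exclusive end of the window
        last_price = prices[end - 1]
        future_price = prices[end - 1 + shift]  # price one shift ahead
        segments.append(prices[start:end])
        labels.append(1 if future_price > last_price else 0)
    return np.array(segments), np.array(labels)
```

With `window=300` and `shift=7`, consecutive segments overlap in 293 points, so each new segment adds only one shift of fresh data.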

Methods
Below, we present the methodology of this work. We discuss the recurrence plot and briefly explain how the training data set is generated. We then present the proposed deep CNN architecture.

Recurrence Plot
The recurrence plot was introduced in 1987 as a method to visualize recurrences of a dynamical system [26]. A given time series can be represented in its phase space using a method called time-delay embedding [27]. For a time series x(t_i), i = 1, ..., N, the time-delay embedding method reconstructs the attractor from the following lagged coordinates:
$$\vec{x}(t_i) = \bigl(x(t_i),\; x(t_i + \tau),\; \ldots,\; x(t_i + (m-1)\tau)\bigr)$$
where m is the dimension of the reconstructed phase space and τ is the lag length. Selecting the time-delay parameters τ and m is a non-trivial task. According to Takens' embedding theorem (1980) and related theorems, an embedding with m > (2D + 1), where D is the dimension of the underlying system, provides an attractor that represents the same dynamical system as the original attractor in a different coordinate system. The brute-force approach to choosing an appropriate lag length is trial and error. However, according to the literature, a lag length of τ = 1 is more suitable for financial time series [28]. There are many approaches to determining the minimum sufficient embedding dimension (for instance, [29,30]). The false nearest neighbors method is one such algorithm; it determines the minimum embedding dimension by measuring the percentage of false nearest neighbors [31]. A suitable dimension is one at which the proportion of false nearest neighbors drops to zero for a given tolerance level. This work used false nearest neighbors to select an appropriate embedding dimension, together with a delay length of τ = 1.
A recurrence matrix can be constructed from the embedding vectors as follows:
$$R_{i,j} = \Theta\bigl(\varepsilon - \lVert \vec{x}(t_i) - \vec{x}(t_j) \rVert\bigr)$$
where \vec{x}(t_i) is the i-th embedding vector, Θ is the Heaviside step function, ‖·‖ is the L2-norm, and ε is a fixed threshold value. There are several rules of thumb for selecting an appropriate threshold. In this work, after multiple experiments, we set the threshold to 5% of the maximal phase space diameter [32]. Figure 5 shows recurrence plots generated for segmented signals with a window size of 300 data points. After a recurrence plot is produced for each segment, the next phase is treated as a classification problem.
The pattern formed by a recurrence plot provides useful insight into the system. However, interpretation of this information is often challenging and subjective.
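The construction above can be illustrated with a short NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the embedding dimension `m` is left as a parameter because the value selected by the false nearest neighbors method is not reported here, and the maximal phase space diameter is taken to be the largest pairwise distance between embedding vectors.

```python
import numpy as np

def time_delay_embed(x, m, tau=1):
    """Return the matrix of delay vectors, one row per embedded point.

    Row i is (x[i], x[i + tau], ..., x[i + (m - 1) * tau]).
    """
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this m and tau")
    return np.column_stack(
        [x[k * tau : k * tau + n_vectors] for k in range(m)]
    )

def recurrence_matrix(x, m, tau=1, threshold_fraction=0.05):
    """Binary recurrence matrix: R[i, j] = 1 if ||x_i - x_j|| <= eps."""
    vectors = time_delay_embed(x, m, tau)
    # Pairwise L2 distances between all embedding vectors.
    diff = vectors[:, None, :] - vectors[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Assumption: phase space diameter = largest pairwise distance.
    eps = threshold_fraction * dist.max()
    return (dist <= eps).astype(np.uint8)
```

For a 300-point segment with `m=3` and `tau=1`, the result is a 298×298 symmetric binary matrix whose main diagonal is all ones, since every point recurs with itself.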

Convolutional Neural Network
The convolutional neural network (CNN) has become an important model for image classification in machine learning for several reasons. It offers a degree of invariance to tilt, scaling, and other forms of alteration in the input images. In addition, hand-crafted feature extraction, which is the tedious part of computer vision, can be replaced by the convolution layers of a CNN. A sample convolutional neural network architecture is depicted in Figure 6. A deep CNN comprises several layers, each containing a convolution layer followed by a subsampling operation. The convolution operation is performed on the input image, or on feature maps from previous layers, using a particular kernel (filter). Moving a filter across the image extracts a feature independently of its position in the input image. A pooling layer performs a subsampling operation to reduce the resolution of the feature map. The last layer in a CNN is a fully connected (FC) layer. This layer performs the final high-level reasoning based on the features from the previous layer. The overall advantage of deep learning is its ability to learn nonlinear and complex problems.
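The two basic operations described above, sliding a filter across an image and subsampling the resulting feature map, can be sketched in plain NumPy. This is a didactic sketch only: the function names are hypothetical, max pooling is assumed as the subsampling operation (the text does not say which kind is used), and a real model would rely on a deep learning library instead of explicit loops.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1).

    Like most deep learning libraries, this computes a cross-correlation,
    i.e. the kernel is not flipped.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the image patch under the kernel.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; incomplete edge blocks are dropped."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))
```

Applied to a 64×64 input with a 3×3 kernel, `conv2d_valid` yields a 62×62 feature map, which `max_pool2d` then reduces to 31×31.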
Training a deep network is a challenging task [33,34]. The challenge lies not only in its computational cost but also in the careful hyperparameter optimization that has to be conducted. Hyperparameter optimization is the search for a combination of parameters that gives the best validation performance. Many approaches are available, such as grid search, random search, and more optimized techniques [35]. Nevertheless, parameter tuning is the most tedious stage of deep learning. In this research, we propose to use a deep CNN. The architecture of our model is summarized in Table 2. The convolution layers and the first two fully connected layers in this model use the tanh activation function as their nonlinearity.
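As an illustration of one of the strategies named above, random search can be sketched in a few lines. Everything specific here is hypothetical: the search space, the function names, and the toy scoring function, which merely stands in for training the CNN and returning its validation accuracy. None of the numbers are results from this study.

```python
import random

def random_search(score_fn, search_space, n_trials=20, seed=0):
    """Sample configurations uniformly at random and keep the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one value per hyperparameter, independently.
        config = {name: rng.choice(values)
                  for name, values in search_space.items()}
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

# Hypothetical search space; the values actually explored are not listed.
SEARCH_SPACE = {
    "activation": ["tanh", "relu", "sigmoid"],
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "batch_size": [32, 64, 128],
}

def toy_validation_score(config):
    """Placeholder for 'train the model, return validation accuracy'."""
    score = 0.0
    score -= abs(config["learning_rate"] - 1e-3)   # arbitrary preference
    score -= abs(config["batch_size"] - 64) / 1000  # arbitrary preference
    return score
```

Calling `random_search(toy_validation_score, SEARCH_SPACE)` returns the best sampled configuration and its score. Grid search would instead enumerate all 27 combinations of this space.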
The tanh activation function was chosen as a result of hyperparameter optimization. The objective function is cross entropy.
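For concreteness, the tanh nonlinearity and a cross-entropy objective for the two classes (down = 0, up = 1) can be written as below. This is a sketch under assumptions: the function names are hypothetical, and a two-unit softmax output layer is assumed, which the text does not confirm.

```python
import numpy as np

def tanh(z):
    """Hyperbolic tangent activation; outputs lie in (-1, 1)."""
    return np.tanh(z)

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true class.

    `probs` has shape (n_samples, 2); `labels` holds integers 0 or 1.
    """
    n = len(labels)
    picked = probs[np.arange(n), labels]
    # Clip to avoid log(0) for extremely confident wrong predictions.
    return -np.mean(np.log(np.clip(picked, 1e-12, 1.0)))
```

A model that outputs equal probabilities for both classes has a cross entropy of ln 2 ≈ 0.693, which is a useful reference point when the classification accuracy is close to 50%.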
The final data set of our experiment contains 64×64 pixel recurrence plot images, which were fed to the deep CNN.

Result and Discussion
Figure 7 summarizes the performance of our approach, which uses a deep CNN and recurrence plots for financial market movement prediction. The average accuracy over the seven data sets considered is 53.25%. For the S&P index price and the gold price, maximum accuracies of 56.62% and 56% are achieved, respectively. We believe that this result shows that a deep CNN is able to learn the complex pattern of a financial recurrence plot. As a first attempt to train a CNN with recurrence plots, the result suggests that a well-trained deep CNN can learn from a recurrence plot and predict financial market direction. Figure 8 further shows the filters of the first convolutional layer.
In this study, we trained a single deep CNN to classify market movement direction using recurrence plots generated from financial time series. Future work should improve the classification performance of the CNN. Deep learning requires a large amount of data to converge, and we faced a lack of data in this study. Consequently, model accuracy could be improved by using different forms of data augmentation to increase the size of the training set.

Conclusion
In this study, we explored a deep CNN and recurrence plots for weekly financial market movement prediction. We segmented the time series data set and generated recurrence plot images to train a deep CNN. To the best of our knowledge, this study is the first attempt to train a deep neural network with recurrence plots of financial time series. The system achieved an accuracy in the range of 50.68% to 56.62% on our dataset. We observed that a deep CNN can learn from recurrence plots of historical price data and can make a market movement prediction.
Overall, this work has demonstrated the ability of a deep CNN to learn recurrence plot patterns and predict market movement direction. Future work will focus on improving the accuracy of the model. In addition, we will evaluate its performance on a larger collection of historical data sets.