Regional collaborative forecast of primary energy consumption in China, Japan and South Korea based on multi-source data combination

This study aims to improve the forecast accuracy of primary energy consumption in China, Japan and South Korea and to verify the correlation in primary energy consumption among these neighboring countries. Considering the diversity of primary energy composition, the study selects six components of primary energy (oil, coal, natural gas, nuclear energy, hydropower and renewable energy) as characteristic variables. A collaborative prediction model based on support vector regression (SVR) is proposed to explore the correlation of primary energy consumption among the three countries. The results show a strong correlation between the countries' primary energy consumption under collaborative prediction, with the primary energy consumption of South Korea having the largest impact on that of China and Japan. In China-Japan-South Korea primary energy cooperation, a cooperation system with South Korea as the link should be established through regional coordination to alleviate the shortage of traditional fossil energy.


I. INTRODUCTION
Regional cooperation in primary energy consumption is, besides international energy cooperation, another collaborative approach to resolving the unequal distribution of resources. As primary energy plays an important role in economic development, countries need to adjust their forms of energy consumption in a timely manner to cope with market fluctuations [1]. Energy strategy [2], energy reserves [3], productivity [4] and geopolitics [3] are crucial for each country engaging in comprehensive energy cooperation. In particular, because primary energy sources differ in strategic status and commodity characteristics, their consumption also differs between regions; that is, the consumption of a primary energy source directly reflects its strategic status. For example, China, Japan and South Korea in East Asia have become the world's largest region in terms of primary energy consumption [5]. When considering energy partners in the crude oil market, the three countries tend to choose countries with which they have good economic and political relations to ensure the reliability and safety of cooperation, while also paying close attention to price and transportation costs. All these considerations are reflected in local collaborative patterns. Driven by these factors, cooperation in primary energy consumption among China, Japan and South Korea will evolve towards global and regional collaborative patterns. Therefore, studying the correlation pattern of interregional primary energy consumption is a necessary condition for formulating cooperation strategies between countries in the energy market [6]. Understanding the correlation of primary energy consumption between different countries can provide targeted suggestions for alleviating the shortage of fossil fuels.
In the prediction of primary energy consumption, regional collaborative forecasting has rarely been used to analyze regional correlations. For example, Giray Gozgor (2018) took the composition data of primary energy consumption of 29 OECD (Organization of Economic Cooperation and Development) countries from 1990 to 2013 as panel data to compare and analyze the impact of renewable and non-renewable energy consumption on economic growth [7]. T. Chen (2015) established a collaborative fuzzy neural network and combined different models into new models to analyze global carbon dioxide concentration effectively [8]. However, these collaborative studies lack a targeted exploration of internal causes and an estimation of future trends. The regional collaborative prediction used in this paper is based on the combination of multi-source data and provides a new research method for analyzing the interaction between regions. It pays special attention to the relationship between regions, predicts the primary energy consumption of each country through different forms of data combination, and judges the correlation between the primary energy consumption of the countries from the results of these combinations [9].
This study analyzes the internal correlation between the primary energy consumption of China, Japan and South Korea, focusing on the collaborative mode and mutual dependence between the three countries, and contributes to the collaborative development of primary energy in the three countries. A machine learning method is used to establish a support vector regression (SVR) regional joint model to predict the primary energy consumption of China, Japan and South Korea [10]. Starting from the input variables, and considering the degree of correlation of primary energy consumption between the three countries, the study combines the primary energy consumption data in different forms as input variables and compares the prediction errors of these combinations. In addition to further improving the prediction accuracy, this study observes the interaction of primary energy between different countries and analyzes the interaction between regions. Based on the predicted results, it provides targeted policy recommendations for the coordinated development of primary energy among regions.

A. PRIMARY ENERGY CONSUMPTION FORECAST
In recent years, scholars around the world have studied the prediction of national and regional primary energy consumption. The research falls mainly into two categories. The first category uses different characteristic variables to improve the prediction accuracy of primary energy consumption. Gokhan, A. (2015) employed economic indicators and demographic statistics as characteristic variables when predicting the primary energy consumption of Turkey [40]. Chen, G.Q. (2015) used production, consumption and international trade as input variables to analyze the global trend of primary energy consumption, and further studied the correlations and interactions between global primary energy consumption in production, consumption and international trade [15]. Vincenzo Bianco (2015) compared the influence of different fuels and carbon prices, analyzed the primary energy consumption of Italy's thermal power sector, and optimized the proportion of different fuels while meeting primary energy demand [16]. Akpinar, M. (2016) divided the natural gas consumption of cities into seasonal categories, which were used as forecasting characteristic indices to predict natural gas demand [42]. Brinda Mahalingam (2018) analyzed data on US gross domestic product (GDP) and energy consumption, and identified a correlation between energy consumption and GDP [13]. Li JR (2018) took GDP, population, export and import data as inputs for the prediction of the total petroleum consumption of China [18]. The second category builds hybrid models to increase the prediction accuracy of primary energy consumption. Torrini, F.C. (2016) predicted Brazil's energy consumption by using fuzzy forecasting [39]. Chaoqing Yuan (2016) combined the ARIMA model and the GM(1,1) gray model to forecast the primary energy consumption of China, resulting in higher prediction accuracy.

B. COLLABORATIVE FORECAST
In prediction studies, researchers use collaborative prediction to improve the prediction accuracy and to explore the correlation between characteristic variables [37]. Toly Chen (2013) proposed an effective fuzzy collaborative prediction method, which used each round of fuzzy artificial neural network training to generate the upper and lower bounds of the operating cycle time, and effectively predicted the working cycle time through the cooperation between the upper and lower bounds [20]. Usama Al-mulali (2014) used a combination of six characteristic variables to predict carbon dioxide emissions [24]. Can Eksoz (2014) proposed a conceptual framework for the factors related to collaborative prediction in the food supply chain, which improves the accuracy of long-term collaborative prediction between manufacturers and retailers for seasonal, perishable, promotional and newly launched products [21]. T. Chen (2015) established a cooperative fuzzy neural network and combined different models into a new model to analyze and predict global carbon dioxide concentration effectively [22]. Sayyed Mahdi Ziaei (2015) established a grey prediction model in which the carbon dioxide emissions of European, East Asian and Oceanian countries were used as input variables to analyze their impact on energy consumption and financial indicators [23]. Yanling Liu (2018) combined several prediction methods for the prediction of the natural gas consumption of China [38]. Giray Gozgor (2018) used data on the primary energy consumption composition of 29 member countries of the Organization of Economic Cooperation and Development (OECD) from 1990 to 2013 as panel data to collaboratively analyze the influence of renewable and non-renewable energy consumption on economic growth, and predicted the future trend of renewable energy consumption [25]. Mert Topcu (2018) adopted a group framework to discuss the inhomogeneity and trans-sectoral dependence between trade and energy consumption, so as to investigate the relationship between trade and energy consumption in OECD countries during 2015 [26].

C. SUPPORT VECTOR REGRESSION
Support vector machine (SVM) is a rigorous supervised learning model that can be used for both classification and regression. The model can introduce nonlinear classifiers for processing various types of input data. Based on SVM, Smola and Vapnik proposed the regression technique known as support vector regression (SVR) [27]. It is suitable for prediction and is less affected by the dimensionality of the data. SVR has been widely applied in engineering to the prediction of non-linear data. Yongbao Chen (2015) used SVR to predict short-term power load and then to calculate the power demand of office buildings [33]. YouLong Yang (2016) described an incremental model for SVR, which optimized the prediction of power load [32]. Andrés García-Floriano (2018) built an SVR model to predict software enhancement effort and to formulate a software enhancement plan [28]. Wentao Yang (2018) established a temporal-spatial SVR model to predict hourly PM2.5 concentration and to address spatial heterogeneity [29]. Petra Vrablecová (2018) predicted short-term power load by using SVR based on on-line processing of the massive data generated by the smart grid, and reduced grid imbalance as far as possible [30]. Gelayol Golkarnarenji (2018) developed an intelligent prediction model for the carbon fiber industry, based on an SVR prediction model and a genetic algorithm (GA), to reduce energy consumption [31]. SVR is also widely used for data analysis and prediction in the financial field. Abdolreza Nazemi (2018) applied a multi-factor SVR model to the analysis of corporate bonds and predicted the corporate bond recovery rate [34]. A. Khosravi (2018) combined a neuro-fuzzy inference system, a feedforward neural network and an SVR model to predict the wind direction and velocity of Bushehr. As for regression-based prediction more generally, many researchers have proposed specific regression models to explore the relationship between dependent and independent variables according to their research objects and fields [35].

III. SUPPORT VECTOR REGRESSION MODEL
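SVR is built upon statistical learning theory and emphasizes the minimization of structural risk. This technique can overcome the defects of conventional statistical pattern recognition methods, including neural networks. The SVR model derives an approximation function $g(x)$ from a data sample

$$G = \{(x_i, y_i)\}_{i=1}^{N}.$$

The SVR model is defined as follows:

$$g(x) = \sum_{j=1}^{D} w_j \phi_j(x) + b \tag{1}$$

The weights $w_j$ and the bias $b$ can be estimated by minimizing the function

$$R(g) = \sum_{i=1}^{N} |g(x_i) - y_i|_{\varepsilon} + \lambda \|w\|^2 \tag{2}$$

where $\lambda$ is a regularization constant; the function $|g(x) - y|_{\varepsilon}$ is defined as follows:

$$|g(x) - y|_{\varepsilon} = \max\{0,\ |g(x) - y| - \varepsilon\} \tag{3}$$

In addition to formula (2), the minimizing function can also be expressed in the following form [36]:

$$g(x, \alpha, \alpha^*) = \sum_{i=1}^{N} (\alpha_i^* - \alpha_i)\, k(x_i, x) + b \tag{4}$$

Here, $\alpha_i \alpha_i^* = 0$, $\alpha_i, \alpha_i^* \geq 0$, $i = 1, \dots, N$. Moreover, the kernel function $k$ defines the inner product of the $D$-dimensional feature space:

$$k(x, y) = \sum_{j=1}^{D} \phi_j(x)\, \phi_j(y) \tag{5}$$

The coefficients $\alpha_i$ and $\alpha_i^*$ are obtained by maximizing the quadratic form

$$R(\alpha^*, \alpha) = -\varepsilon \sum_{i=1}^{N} (\alpha_i^* + \alpha_i) + \sum_{i=1}^{N} y_i (\alpha_i^* - \alpha_i) - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j)\, k(x_i, x_j) \tag{6}$$

The constraint conditions are

$$\sum_{i=1}^{N} (\alpha_i^* - \alpha_i) = 0, \qquad 0 \leq \alpha_i, \alpha_i^* \leq \frac{1}{2\lambda} \tag{7}$$

In the present study, a programming language is used to implement SVR, and the continuous value of primary energy consumption is predicted.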
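As a minimal sketch of two building blocks named in this section, the Python code below evaluates the ε-insensitive loss max(0, |g(x) − y| − ε) and the kernel expansion g(x) = Σ (α_i* − α_i) k(x_i, x) + b. The paper does not state which kernel or implementation was used, so the Gaussian (RBF) kernel, the `gamma` value and all function names here are illustrative assumptions.

```python
import math


def eps_insensitive(prediction, target, eps):
    """Epsilon-insensitive loss: zero inside the eps-tube, linear outside it."""
    return max(0.0, abs(prediction - target) - eps)


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel. The kernel choice and gamma are assumptions,
    not taken from the paper."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def svr_predict(x, support_vectors, coefs, bias, kernel=rbf_kernel):
    """Evaluate g(x) = sum_i coef_i * k(x_i, x) + bias,
    where coef_i stands for (alpha_i* - alpha_i)."""
    return sum(c * kernel(sv, x) for c, sv in zip(coefs, support_vectors)) + bias
```

Training (solving the quadratic program for the coefficients) is deliberately left out; in practice a library solver would supply `support_vectors`, `coefs` and `bias`.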

A. DATA SOURCES
Because of the geographical proximity of the three countries, the primary energy consumption of China, Japan and South Korea from 1965 to 2017 is chosen as the research object. The data include the total annual primary energy consumption and the annual consumption of each of the six primary energy sources (namely, petroleum, coal, natural gas, nuclear energy, hydropower and renewable energy) for the three countries. All data are taken from the Statistical Yearbooks of the British Petroleum Company.
The data on total primary energy consumption and its components in the three countries from 1965 to 2009 were used as training samples, 45 in total. The data from 2010 to 2017 were used as testing samples, 8 in total (TABLE I).
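The chronological split described above can be sketched as follows. This is only an illustration: the `records` mapping (year to an arbitrary per-year record) and the function name are hypothetical, and only the year boundaries come from the text.

```python
TRAIN_YEARS = range(1965, 2010)  # 1965 to 2009 inclusive
TEST_YEARS = range(2010, 2018)   # 2010 to 2017 inclusive


def split_by_year(records):
    """Split a {year: record} mapping into chronological train and test lists.

    Each returned item is a (year, record) pair; the record layout itself is
    left open because the paper does not specify a storage format.
    """
    train = [(y, records[y]) for y in sorted(records) if y in TRAIN_YEARS]
    test = [(y, records[y]) for y in sorted(records) if y in TEST_YEARS]
    return train, test
```

Applied to one record per year from 1965 to 2017, this yields 45 training and 8 testing samples, matching the counts given above.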

B. MULTI-SOURCE DATA COLLABORATION SVR MODEL
A multi-source data collaboration approach is used for the forecast, which is divided into five steps. First, the consumption of each of the six primary energy sources of the target country is taken as the input variables of the prediction model, and the total primary energy consumption of this country is the output variable; SVR is then applied for prediction. Second, the consumption of each of the six primary energy sources of the target country plus that of another country (input country 1) is taken as the input variables, 12 in total; the total primary energy consumption of the target country is again the output variable predicted by SVR. Third, the consumption of each of the six primary energy sources of the target country plus that of the remaining country (input country 2) is taken as the input variables, 12 in total; the total primary energy consumption of the target country is the output variable predicted by SVR. Fourth, the consumption of each of the six primary energy sources of all three countries is taken as the input variables, 18 in total; the total primary energy consumption of the target country is the output variable predicted by SVR. Finally, the forecast values of the combination schemes are compared, and the best-performing scheme is selected for testing. The experimental steps are illustrated in Fig. 1.
The prediction errors are measured by the mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE):

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |\hat{y}_i - y_i|, \qquad \mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{\hat{y}_i - y_i}{y_i} \right|$$

where $\hat{y}_i$ is the total primary energy consumption predicted by the model, $y_i$ is the actual total primary energy consumption, and $n$ is the number of samples. MSE, RMSE and MAE are scale-dependent measures based on squared and absolute error values. MAE is less sensitive to outliers than MSE and RMSE, as it uses absolute values without squaring the error, and it is easy to calculate. MAE is one of the most common metrics used to calculate error and to compare different methods on the same dataset. MAPE is based on percentage error and is a scale-independent metric. Therefore, MAPE is the most suitable metric for comparing the predictive performance of the SVR model on different datasets.
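The four input schemes described in the steps above can be sketched in Python as follows. The nested `data[country][year][source]` layout, the source labels and the function names are assumptions made for illustration; the paper does not describe its data structures.

```python
COUNTRIES = ("China", "Japan", "South Korea")
# Labels for the six primary energy sources (naming is an assumption).
SOURCES = ("oil", "coal", "natural_gas", "nuclear", "hydro", "renewables")


def build_schemes(target, countries=COUNTRIES):
    """Return the four input schemes for a target country:
    target alone, target + each other country, and all three together."""
    others = [c for c in countries if c != target]
    return {
        target: [target],
        f"{target}+{others[0]}": [target, others[0]],
        f"{target}+{others[1]}": [target, others[1]],
        "+".join([target] + others): [target] + others,
    }


def feature_row(data, year, scheme_countries):
    """Concatenate the six source consumptions of every country in the scheme,
    giving 6, 12 or 18 input variables for one year."""
    return [data[c][year][s] for c in scheme_countries for s in SOURCES]
```

Each scheme would then be fitted with the same SVR settings, with the target country's total consumption as the output variable, so that differences in error can be attributed to the input combination.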

C. ANALYSIS OF RESULTS
In this study, multi-source data is used for combination forecasting; therefore a unified performance measurement standard is needed to measure the prediction accuracy of different combinations. MAPE solves this problem. The predictive results vary when using different combinations of primary energy consumption data. When there is a strong correlation between input indicator and output indicator, MAPE will be low.
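For reference, the four error indicators used in this section can be computed as in the sketch below, using their standard definitions. The function names are illustrative, and MAPE is returned in percent.

```python
import math


def mse(actual, predicted):
    """Mean squared error."""
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the data."""
    return math.sqrt(mse(actual, predicted))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((p - a) / a) for a, p in zip(actual, predicted)) / len(actual)
```

Because MAPE divides each error by the actual value, it stays comparable across countries whose consumption levels differ by an order of magnitude, which is the reason given above for using it as the unified standard.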
Following the design steps above, the total annual primary energy consumption of China is predicted with the multi-source data collaborative forecasting method under four combination schemes, namely, China, China+Japan, China+South Korea, and China+South Korea+Japan. The error indicators are MSE, RMSE, MAE and MAPE. The errors of the four combinations for the prediction of the total annual primary energy consumption of China are shown in Table 2. It can be seen that the errors are the smallest when using the combination of China+South Korea. That is to say, when the forecast of China's primary energy consumption uses the two-source combination of China+South Korea, the improvement is very obvious. For the given time span (1965 to 2017), the comparison of the predicted and actual values of the total primary energy consumption of China for the four combinations is shown in Fig. 2.

China's primary energy consumption
When Japan is the target country, four combinations, namely, Japan, Japan+China, Japan+South Korea, and Japan+China+South Korea, are used for the collaborative forecast based on SVR. As above, MSE, RMSE, MAE and MAPE are the error indicators. The errors of the four combinations for the prediction of the total annual primary energy consumption of Japan are shown in Table 3. It can be seen that the errors are the smallest when using the combination of Japan+South Korea. That is to say, when the forecast of Japan's primary energy consumption uses the two-source combination of Japan+South Korea, the improvement is very obvious. For the given time span (1965 to 2017), the comparison of the predicted and actual values of the total primary energy consumption of Japan for the four combinations is shown in Fig. 3.
When South Korea is the target country, the errors of the four combinations for the prediction of its total annual primary energy consumption are shown in Table 4. It can be observed that the errors are the smallest when only South Korea's own data are used. Therefore, the multi-source data collaboration approach does not achieve a much better effect in the case of South Korea. For the given time span (1965 to 2017), the comparison of the predicted and actual values of South Korea's total primary energy consumption for the four combinations is shown in Fig. 4.

V. DISCUSSION
The experiment shows that the prediction error of China's primary energy consumption is reduced when South Korea's data are included. Table 2 shows that the multi-source data collaborative prediction model reduces the prediction error more effectively than a prediction based on a single country. Similarly, in the forecast of Japan's total primary energy consumption, the combination of Japan's and South Korea's primary energy consumption achieves the best forecast result (Table 3). Finally, in the prediction for South Korea, MAPE is the lowest when only the primary energy composition of South Korea is used to predict its total primary energy consumption (Table 4), indicating that the primary energy consumption of South Korea is not affected by that of China and Japan.
This finding can be explained by the energy imports and exports among the three countries. Taking South Korea as an example, fossil energy in South Korea is very scarce, and its primary energy consumption mainly consists of oil, coal, natural gas and nuclear energy (figure 2), of which coal, oil and natural gas are mostly imported. However, South Korea has a large and well-established petroleum industry, and petroleum products have long been among its major export commodities. China and Japan face a shortage of petroleum together with a huge demand for it, so both countries rely on petroleum imports [43]. As a neighbor of China and Japan and a major producer of petroleum products, South Korea exports petroleum products to both countries (Fig. 5). Changes in South Korea's primary energy consumption influence its exports of oil products. Because China and Japan import large quantities of South Korean petroleum products, changes in these exports in turn affect the primary energy consumption of China and Japan.

FIGURE 5. South Korea imports and exports oil products to China and Japan
An SVR model is built in this paper to study these correlations, with the primary energy consumption of China and Japan as continuous variables and the oil products exported by South Korea to China and Japan as the response variable. The primary energy consumption of both South Korea and China is directly proportional to South Korea's exports of petroleum products to China (Fig. 6). In other words, South Korea's primary energy consumption influences that of China, and the former plays an important role in China's primary energy forecast (Table 2). In the structure of energy trade between China and South Korea, it is necessary to attach importance to the oil trade between the two countries, improve trade policies, increase the proportion of oil products imported from South Korea, and establish a China-South Korea oil product trade system with lower transportation costs. For Japan, South Korea's primary energy consumption is directly proportional to its exports of petroleum products to Japan, whereas Japan's primary energy consumption is inversely proportional to these exports (Fig. 7). In other words, South Korea's primary energy consumption influences that of Japan, and the former has strong predictive power for the latter (Table 3). South Korea imports 95% of its fossil energy, and its main suppliers are Middle Eastern countries such as Saudi Arabia, Kuwait, the United Arab Emirates and Qatar. Therefore, the primary energy consumption of neighboring countries has little impact on that of South Korea, which explains the small predictive power of the primary energy consumption of China and Japan for that of South Korea (Table 4). In the energy trade among the three countries, South Korea is the major exporter of petroleum products and the hub of regional energy cooperation, and it plays an important role in solving the petroleum shortage of the three countries.
In this paper, the prediction accuracy of primary energy consumption in China, Japan and South Korea is improved by using multi-source data for collaborative prediction, and the correlation between the countries' primary energy consumption is verified. The analysis finds that the energy consumption of South Korea influences the primary energy consumption of China and Japan, which is mainly related to the large amount of oil products exported by South Korea, for which China and Japan are important importers. While maintaining sufficient oil reserves through imports from other countries, China and Japan will export excess oil to South Korea at a low price. South Korea, with its complete petroleum processing system, manufactures petroleum products in large quantities, which are then exported to China and Japan at a low price. This regional energy cooperation system, dominated by petroleum cooperation, can help lessen the shortage of fossil energy in the three countries. The research has demonstrated the correlations among the primary energy consumption of the three countries through accurate prediction based on multi-source data collaboration.

VI. CONCLUSION
China, South Korea and Japan, three countries in East Asia, are selected as the research objects. The primary energy consumption of each country is forecast collaboratively by SVR using different combinations of primary energy components. The analysis shows that South Korea's primary energy consumption has major predictive power for that of Japan and China. We further propose recommendations for upgrading the regional energy cooperation mode. By predicting the primary energy consumption of the three countries, the following conclusions are drawn: (1) A regional collaborative prediction model is built, and combinations of the primary energy consumption composition data of different countries are used as the characteristic variables. Correlations between the primary energy consumption of neighboring countries are observed, which can be utilized to predict the primary energy consumption of a single target country. Research on the regional collaborative forecast of primary energy consumption based on different combinations of countries can be extended in the following respects. First, it is important to identify which energy component contributes most to the prediction among the primary energy consumption components of different countries. Second, a deep learning model can be introduced to avoid the manual selection of energy consumption components across countries, so that the model automatically extracts the information valuable for the forecast. Third, the multi-source data collaboration approach can be extended to a larger scope in the Asia-Pacific region, and the correlations between the primary energy consumption of other countries can be studied in the future.