ARTICLE | doi:10.20944/preprints202309.1565.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: photovoltaic power forecasting; deterministic forecasting; probability interval forecasting; ensemble learning; feature rise-dimensional method
Online: 22 September 2023 (11:37:45 CEST)
Due to the intermittency and fluctuation of photovoltaic (PV) output power, a high proportion of grid-connected PV power generation systems has a significant impact on power systems. Accurate PV power forecasting can alleviate the uncertainty of PV power and is of great significance for the stable operation and scheduling of power systems. Therefore, in this study, a feature rise-dimensional (FRD) two-layer ensemble learning (TLEL) model for short-term PV power deterministic forecasting and probability forecasting is proposed. First, based on the eXtreme Gradient Boosting (XGBoost), Random Forest (RF), CatBoost, and long short-term memory (LSTM) models, a TLEL model is constructed utilizing the ensemble learning algorithm. Meanwhile, the FRD method is introduced to construct the FRD-XGBoost-LSTM (R-XGBL), FRD-RF-LSTM (R-RFL), and FRD-CatBoost-LSTM (R-CatBL) models. Subsequently, the above models are combined to construct the FRD-TLEL model for deterministic forecasting, and probability interval forecasting is performed based on quantile regression (QR). Finally, the performance of the proposed model is demonstrated with a real-world dataset. Compared with other models, the proposed model displays better forecasting accuracy for deterministic forecasting, reliable forecasting intervals for probability forecasting, and good generalization ability on datasets of different seasons and weather types.
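Interval forecasts built on quantile regression are commonly trained and scored with the pinball (quantile) loss. The following is a minimal sketch of that loss, not the authors' implementation; the function names and the sample values in the usage note are illustrative assumptions.

```python
def pinball_loss(actual, predicted_quantile, tau):
    """Pinball loss for one observation at quantile level tau (0 < tau < 1)."""
    diff = actual - predicted_quantile
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return max(tau * diff, (tau - 1.0) * diff)


def mean_pinball_loss(actuals, predicted_quantiles, tau):
    """Average pinball loss over paired observations and quantile forecasts."""
    losses = [pinball_loss(a, q, tau) for a, q in zip(actuals, predicted_quantiles)]
    return sum(losses) / len(losses)
```

For example, at tau = 0.9 an under-prediction of 2 units costs about 1.8, while an over-prediction of 2 units costs about 0.2, which is what pushes the fitted curve toward the upper edge of the interval.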
ARTICLE | doi:10.20944/preprints202311.0806.v1
Subject: Business, Economics And Management, Accounting And Taxation Keywords: Forecasting; accounting; earnings
Online: 13 November 2023 (10:03:25 CET)
We propose a generalized, practitioner-oriented operating leverage model for predicting operating income using Standard and Poor’s Compustat items: SALE (net sales), COGS (cost of sales), DP (total depreciation and amortization), XSGA (selling, general, and administrative expenses), and OIADP (operating income after depreciation and amortization). Prior research finds that OIADP = SALE - COGS - DP - XSGA; hence, our model includes all aggregate revenues and expenses comprising OIADP. Also, prior research finds COGS is “much less” sticky than DP and XSGA; hence, we use COGS as a proxy for total variable costs and DP and XSGA as proxies for sticky fixed costs. We introduce a new adjustment to the textbook operating leverage model so that SALE-to-COGS remains constant for the reference and forecast periods. Also, inspired by prior research, we introduce adjustments to DP and XSGA for cost stickiness. We find our generalized operating leverage model improves estimates of changes in next-quarter and next-year OIADP compared to textbook operating leverage predictions, which are special cases of our model.
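The identity OIADP = SALE - COGS - DP - XSGA and the textbook special case can be sketched as follows. This is only an illustration under stated assumptions: COGS scales with sales (a constant SALE-to-COGS ratio), DP and XSGA are held fixed, and the sales forecast `sale_next` is supplied by the user. The stickiness adjustments of the generalized model are not reproduced here, and the function names are hypothetical.

```python
def oiadp(sale, cogs, dp, xsga):
    """Operating income after depreciation and amortization."""
    return sale - cogs - dp - xsga


def textbook_oiadp_forecast(sale, cogs, dp, xsga, sale_next):
    """Textbook operating leverage forecast.

    Assumes COGS is fully variable (SALE-to-COGS ratio unchanged between the
    reference and forecast periods) and DP and XSGA are fully fixed.
    """
    cogs_next = cogs * sale_next / sale
    return oiadp(sale_next, cogs_next, dp, xsga)
```

With SALE = 1000, COGS = 600, DP = 100, and XSGA = 200, reference OIADP is 100; a forecast of SALE = 1100 gives a predicted OIADP of 140, so a 10% rise in sales produces a 40% rise in operating income.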
ARTICLE | doi:10.20944/preprints202308.1371.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: AI; Forecasting; RES
Online: 18 August 2023 (11:49:50 CEST)
Forecasting electricity demand is of utmost importance for ensuring the stability of the entire energy sector. However, predicting future electricity demand and its value is a formidable challenge because the underlying processes are strongly influenced by renewable energy sources. In this article, we explore the efficacy of fundamental deep-learning models designed for electricity forecasting. Among the deep learning models, we built recurrent neural networks (RNNs) predominantly based on LSTM and combined architectures. The dataset was obtained from a SolarEdge designer. It contains daily records spanning the past year, with a broad collection of parameters extracted from a solar farm located in Central Europe (Swietokrzyskie Voivodeship, Poland). The experimental findings showed that the LSTM models outperformed the other models in forecasting accuracy. We also compared multilayer DNN architectures with the results provided by the simulator.
REVIEW | doi:10.20944/preprints202305.1534.v1
Subject: Engineering, Energy And Fuel Technology Keywords: Predictive models; Weather research and forecasting (WRF); Solar irradiance forecasting; Solar PV power forecasting; Renewable energy sources.
Online: 23 May 2023 (02:32:03 CEST)
Accurately predicting the power of solar power generation can greatly reduce the impact of the randomness and volatility of power generation on the stability of the power grid system, which is beneficial for the balanced operation and optimized dispatch of the power grid system and reduces operating costs. Solar PV power generation depends on weather conditions and is prone to large fluctuations as those conditions change. Its power generation is characterized by randomness, volatility, and intermittency. Recently, the need to further investigate and make effective use of the uncertainty of short-term solar PV power generation prediction has received increasing attention in many applications of renewable energy sources. In order to improve the predictive accuracy of the output power of solar PV power generation and develop a precise predictive model, the authors worked on predictive algorithms for the output power of a solar PV power generation system. Moreover, since short-term solar PV power forecasting is one of the important aspects of optimizing the operation and control of renewable energy systems and electricity markets, this review focuses on the predictive models of solar PV power generation, which can be verified in the daily planning and operation of a smart grid system. In addition, the predictive methods in the reviewed literature are classified according to the input data source used for accurate predictive models, and the case studies and examples proposed are analyzed in detail. The contributions, advantages, and disadvantages of the predictive probabilistic methods are compared. Finally, future studies of short-term solar PV power forecasting are proposed.
ARTICLE | doi:10.20944/preprints202309.0843.v1
Subject: Business, Economics And Management, Business And Management Keywords: textile; apparel; clothe; forecasting
Online: 13 September 2023 (08:56:58 CEST)
In 2021, almost 115 million tons of fibers were produced worldwide, of which almost 90 million tons were chemical fibers; they are used mainly for the production of clothing and footwear. 30% of textile and apparel products are never sold, which represents an extreme amount of waste. This article points out the possibilities of forecasting clothing sales in the case of one relatively large online store. Inadequate stocks of textile products in the company lead to losses and the need to sell products at a discount, which is undesirable for the company. The study calculates the sales forecast for 2019 for selected textile products by finding analogies with the sales of the tracked products. Sales for the years 2017 and 2018 serve as input data. The problem with textile products is that they have a short life cycle, approximately half a year, and also exhibit high seasonality. Therefore, seasonal indices and Holt-Winters methods (multiplicative and additive approaches) were used for product forecasting. Ultimately, this model could help reduce the loss from unsold goods, and thus reduce the waste of resources and increase the use of goods, in other similar companies.
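The seasonal-index part of such a forecast can be sketched in a few lines. This is a simplified illustration of the classical ratio-to-mean approach, not the study's procedure and not a Holt-Winters implementation; the function names, the use of quarters, and the sample figures in the usage note are assumptions.

```python
def seasonal_indices(history):
    """Ratio-to-mean seasonal indices.

    history: list of years, each a list of sales per period (equal lengths).
    Returns one index per period; the indices average to 1.
    """
    n_periods = len(history[0])
    period_means = [
        sum(year[p] for year in history) / len(history) for p in range(n_periods)
    ]
    overall_mean = sum(period_means) / n_periods
    return [m / overall_mean for m in period_means]


def seasonal_forecast(indices, expected_average_level):
    """Spread an expected average per-period sales level across the periods."""
    return [i * expected_average_level for i in indices]
```

Given two years of quarterly sales, [10, 20, 30, 40] and [20, 40, 60, 80], the indices come out near 0.4, 0.8, 1.2, and 1.6; multiplying them by an assumed average level of 50 units per quarter gives a forecast of roughly 20, 40, 60, and 80 units.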
ARTICLE | doi:10.20944/preprints202205.0386.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Lower upper bound estimation; random forest; feature selection; probabilistic forecasting; photovoltaic generation forecasting
Online: 30 May 2022 (05:10:06 CEST)
Photovoltaic power generation has high variability and uncertainty because it is affected by uncertain factors such as weather conditions. Therefore, probabilistic forecasting is useful for optimal operation and risk hedging in power systems with large amounts of photovoltaic power generation. However, deterministic forecasting is the mainstay of photovoltaic generation forecasting; there are few studies on probabilistic forecasting and feature selection from weather or time-oriented features in such forecasting. In this study, prediction intervals were generated by the lower upper bound estimation using neural networks with two outputs to make probabilistic predictions. The objective was to improve prediction interval coverage probability (PICP), mean prediction interval width (MPIW), and loss, which is the integration of these two metrics, by removing unnecessary features through feature selection. When features with high gain were selected by random forests (RF), in the forecast of 14.7-kW PV systems, loss improved by 1.57 kW, PICP by 0.057, and MPIW by 0.12 kW on average over two weeks compared to the case where all features were used without feature selection. Therefore, the low gain features from RF act as noise in LUBE and reduce the prediction accuracy.
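The two interval metrics named above have standard definitions that can be sketched as follows. The function names and sample values are illustrative, and the combined loss used in the study is not reproduced because its exact form is not given here.

```python
def picp(actuals, lower, upper):
    """Prediction interval coverage probability: share of observations
    that fall inside their [lower, upper] interval."""
    inside = sum(1 for y, lo, hi in zip(actuals, lower, upper) if lo <= y <= hi)
    return inside / len(actuals)


def mpiw(lower, upper):
    """Mean prediction interval width, in the units of the forecast (e.g. kW)."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)
```

For actual values [1.0, 2.0, 3.0, 4.0] with lower bounds [0.5, 2.5, 2.0, 3.0] and upper bounds [1.5, 3.5, 4.0, 5.0], three of four observations are covered (PICP = 0.75) and the average width is 1.5. A good interval forecast raises PICP while keeping MPIW small, which is why the two are usually combined into a single loss.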
ARTICLE | doi:10.20944/preprints202001.0387.v1
Subject: Engineering, Energy And Fuel Technology Keywords: forecasting; clustering; energy systems; classification
Online: 31 January 2020 (13:28:01 CET)
This paper proposes an ARIMA approach to battery health forecasting whose accuracy is improved by K-shape-based clustered predictors. Health prediction of the battery pack is an important function of a battery management system in data centers. In practice, accurate forecasting of battery life turns out to be very difficult without failure data to train a good forecasting model. The conventional ARIMA model is compared with total and clustered predictors for battery health forecasting. Results show that the forecasting accuracy of the ARIMA model is significantly improved by utilizing the results of the clustered predictors for 40 batteries in a real data center. One year of actual historical data from 40 batteries in a large-scale data center is presented to validate the effectiveness of the proposed methodology.
ARTICLE | doi:10.20944/preprints201810.0679.v1
Subject: Physical Sciences, Applied Physics Keywords: seismic forecasting; foreshocks; stochastic model
Online: 29 October 2018 (11:47:36 CET)
An increase in seismic activity is often observed before large earthquakes. Events responsible for this increase are usually named foreshocks, and their occurrence probably represents the most reliable precursory pattern. Many statistical features of foreshocks can be interpreted in terms of the standard mainshock-to-aftershock triggering process and are recovered in the Epidemic Type Aftershock Sequence (ETAS) model. Here we present a statistical study of instrumental seismic catalogs from four different geographic regions. We focus on some common features of foreshocks in the four catalogs which cannot be reproduced by the ETAS model. In particular, we find in instrumental catalogs a significantly larger number of foreshocks than that predicted by the ETAS model. We show that this foreshock excess cannot be attributed to catalog incompleteness. We therefore propose a generalized formulation of the ETAS model, the ETAFS model, which explicitly includes foreshock occurrence. Statistical features of aftershocks and foreshocks in the ETAFS model are in very good agreement with instrumental results.
REVIEW | doi:10.20944/preprints201810.0098.v2
Subject: Environmental And Earth Sciences, Environmental Science Keywords: flood prediction; machine learning; forecasting
Online: 26 October 2018 (11:56:27 CEST)
Floods are among the most destructive natural disasters and are highly complex to model. Research on the advancement of flood prediction models has contributed to risk reduction, policy suggestion, minimizing the loss of human life, and reducing the property damage associated with floods. To mimic the complex mathematical expressions of the physical processes of floods, machine learning (ML) methods have contributed substantially over the past two decades to the advancement of prediction systems, providing better performance and cost-effective solutions. Due to the vast benefits and potential of ML, its popularity has dramatically increased among hydrologists. By introducing novel ML methods and hybridizing existing ones, researchers aim to discover more accurate and efficient prediction models. The main contribution of this paper is to demonstrate the state of the art of ML models in flood prediction and to give insight into the most suitable models. The literature in which ML models are benchmarked through a qualitative analysis of robustness, accuracy, effectiveness, and speed is particularly investigated to provide an extensive overview of the use of various ML algorithms in the field. The performance comparison of ML models provides an in-depth understanding of the different techniques within the framework of a comprehensive evaluation and discussion. As a result, the paper introduces the most promising prediction methods for both long-term and short-term floods. Furthermore, the major trends in improving the quality of flood prediction models are investigated. Among them, hybridization, data decomposition, algorithm ensemble, and model optimization are reported as the most effective strategies for improving ML methods. This survey can be used as a guideline for hydrologists as well as climate scientists in choosing the proper ML method according to the prediction task.
CASE REPORT | doi:10.20944/preprints202308.0535.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Machine Learning; Time-Series Forecasting; Demand Forecasting; PM Gati-Shakti; Ministry of Power; Delhi
Online: 7 August 2023 (10:45:42 CEST)
The PM-Gati-Shakti Initiative integrates ministries covering railways, ports, waterways, logistic infrastructure, mass transport, airports, and roads. Aimed at enhancing connectivity and bolstering the competitiveness of Indian businesses, the initiative focuses on six pivotal pillars known as "Connectivity for Productivity": comprehensiveness, prioritization, optimization, synchronization, analytical, and dynamic. In this study, we explore the application of these pillars to address the problem of "Maximum Demand Forecasting in Delhi." Electricity forecasting plays a very significant role in the power grid, as it is required to maintain a balance between supply and load demand at all times, to provide a quality electricity supply, and to support financial planning, generation reserve, and more. Forecasting helps not only in production planning but also in scheduling, such as import/export, which is very frequent in India and mostly required by rural areas and the North Eastern Regions of India. As electrical forecasting involves many factors that cannot be detected by existing models, we use classical forecasting techniques to extract the seasonal patterns from the daily data of maximum demand for the Union Territory of Delhi. This research contributes to the power supply industry by helping to reduce the occurrence of disasters such as blackouts, power cuts, and increased tariffs imposed by regulatory commissions. The forecasting techniques can also help in reducing OD and UD of power for different regions. We use data provided by a department of the Ministry of Power and apply different forecast models, including seasonal forecasts for daily data.
ARTICLE | doi:10.20944/preprints202306.0135.v1
Subject: Engineering, Energy And Fuel Technology Keywords: Energy consumption prediction; Time-series forecasting; Forecasting Building Energy Consumption; Long Short-Term memory
Online: 2 June 2023 (05:11:04 CEST)
The global demand for energy has been steadily increasing due to population growth, urbanization, and industrialization. Numerous researchers worldwide are striving to create precise forecasting models for predicting energy consumption to manage supply and demand effectively. In this research, a time-series forecasting model based on multivariate multilayered long short-term memory (LSTM) is proposed for forecasting energy consumption and tested using data obtained from commercial buildings in Melbourne, Australia: the Advanced Technologies Center, Advanced Manufacturing and Design Center, and Knox Innovation, Opportunity, and Sustainability Center buildings. This research specifically identifies the best forecasting method for subtropical conditions and evaluates its performance by comparing it with the most used methods at present, including LSTM, bidirectional LSTM, and linear regression. The proposed multivariate multilayered LSTM model was assessed by comparing mean absolute error (MAE), root-mean-square error (RMSE), and mean absolute percentage error (MAPE) values with and without labeled time. Results indicate that the proposed model exhibits optimal performance with improved precision and accuracy. Specifically, the proposed LSTM model achieved a decrease in MAE by 30%, RMSE by 25%, and MAPE by 20% compared to the LSTM method. Moreover, it outperformed the bidirectional LSTM method with a reduction in MAE by 10%, RMSE by 20%, and MAPE by 18%. Furthermore, the proposed model surpassed linear regression with a decrease in MAE by 2%, RMSE by 7%, and MAPE by 10%. These findings highlight the significant performance increase achieved by the proposed multivariate multilayered LSTM model in energy consumption forecasting.
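The three error metrics used in this comparison follow standard definitions, sketched below. The function names and sample values are illustrative; MAPE as written here assumes that no actual value is zero.

```python
import math


def mae(actuals, predictions):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actuals, predictions)) / len(actuals)


def rmse(actuals, predictions):
    """Root-mean-square error; penalizes large errors more than MAE."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actuals, predictions)) / len(actuals)
    )


def mape(actuals, predictions):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actuals, predictions)
    ) / len(actuals)
```

For actual consumption [100, 200, 400] and predictions [110, 180, 400], MAE is 10, RMSE is about 12.91, and MAPE is about 6.67%. The gap between MAE and RMSE reflects the single larger miss of 20 units.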
ARTICLE | doi:10.20944/preprints202210.0004.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Electrical Power Grids; Fault Forecasting; Long Short-Term Memory; Time Series Forecasting; Wavelet Transform
Online: 3 October 2022 (10:36:14 CEST)
The electric power distribution utility is responsible for providing energy to consumers in a continuous and stable way. Failures in the electrical power system reduce the reliability indexes of the grid, directly harming its performance. For this reason, failure prediction is needed so that power can be reestablished in the shortest possible time. Considering an evaluation of the number of failures over time, this paper proposes to perform a failure prediction during the first year of the pandemic in Brazil (2020) to verify the feasibility of using time series forecasting models for fault prediction. The Long Short-Term Memory (LSTM) model is evaluated to obtain a forecast result that can be used by the electric power utility to organize the maintenance teams. The Wavelet transform proves promising in improving the predictive ability of the LSTM, making the Wavelet LSTM model suitable for the study at hand. The results show that the proposed approach achieves better results in terms of prediction error and remains robust when a statistical analysis is performed.
ARTICLE | doi:10.20944/preprints202110.0037.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: photovoltaic generation forecast; probabilistic forecast; prediction interval; ensemble forecast; day ahead forecasting; multiple PV forecasting
Online: 4 October 2021 (09:55:37 CEST)
Photovoltaic (PV) generation is potentially uncertain. Probabilistic PV generation forecasting methods have been proposed with prediction intervals (PIs). However, several studies have dealt with geographically distributed PVs in a certain area. In this study, a two-step probabilistic forecast scheme is proposed for geographically distributed PV generation forecasting. Each step of the proposed scheme adopts ensemble forecasting based on three different machine-learning methods. In this case study, the proposed scheme was compared with conventional non-multistep forecasting. The proposed scheme improved the reliability of the PIs and deterministic PV forecasting results through 30 days of continuous operation with real data in Japan.
REVIEW | doi:10.20944/preprints201812.0217.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Intelligent Load Forecasting; Demand-Side Management; Pattern Similarity; Hierarchical Forecasting; Feature Selection; Weather Station Selection
Online: 18 December 2018 (10:38:10 CET)
Electricity demand forecasting has been a real challenge for power system scheduling at different levels of the energy sector. Various computational intelligence techniques and methodologies have been employed in the electricity market for load forecasting; however, scant evidence is available about the feasibility of each of these methods considering the type of data and other potential factors. This work introduces several scientific and technical rationales behind intelligent forecasting methods, based on the work of previous researchers in the field of energy. The fundamental benefits and main drawbacks of these methods are discussed in order to depict the efficiency of each approach in various situations. Finally, a hybrid strategy is proposed.
ARTICLE | doi:10.20944/preprints202311.1248.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: AI; Energy Price Forecasting; LSTM; DNN
Online: 20 November 2023 (14:03:57 CET)
In the quest for sustainable energy solutions, predicting electricity prices for renewable energy sources plays a pivotal role in efficient resource allocation and decision-making. This article presents a novel approach to forecasting electricity prices for renewable energy sources using deep learning models, leveraging historical data from the Power System Operator (PSE). The proposed methodology encompasses data collection, preprocessing, feature engineering, model selection, training, and evaluation. By harnessing the power of recurrent neural networks (RNNs) and other advanced deep learning architectures, the model captures intricate temporal relationships, weather patterns, and demand fluctuations that impact renewable energy prices. The study demonstrates the applicability of this approach through empirical analysis, showcasing its potential to enhance energy market predictions and aid in the transition to more sustainable energy systems. The outcomes underscore the importance of accurate renewable energy price predictions in fostering informed decision-making and facilitating the integration of renewable sources into the energy landscape. As governments worldwide prioritize renewable energy adoption, this research contributes to the arsenal of tools driving the evolution towards a cleaner and more resilient energy future.
ARTICLE | doi:10.20944/preprints202209.0092.v1
Subject: Social Sciences, Political Science Keywords: elections; time series; forecasting; Chile; ARIMA
Online: 6 September 2022 (12:40:50 CEST)
This article presents the results of reviewing the predictive capacity of Google Trends for national elections in Chile. The electoral results of the elections between Michelle Bachelet and Sebastián Piñera in 2006, Sebastián Piñera and Eduardo Frei in 2010, Michelle Bachelet and Evelyn Matthei in 2013, Sebastián Piñera and Alejandro Guillier in 2017, and Gabriel Boric and José Antonio Kast in 2021 were reviewed. The time series analysed were organised on the basis of relative searches between the candidacies, assisted by R software, mainly with the gtrendsR and forecast libraries. With the series constructed, forecasts were made using the ARIMA technique to check the weight of one presidential option over the other. The ARIMA analyses were performed on 3 ways of organising the data: the linear series, the series transformed by moving average and the series transformed by Hodrick-Prescott. The result indicates that the method offers optimal predictive ability.
ARTICLE | doi:10.20944/preprints202202.0143.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: photovoltaic (PV) power forecast; multiple PV forecasting; short-term PV forecasting; motion estimation; optical flow; smart grid
Online: 10 February 2022 (02:22:32 CET)
The power-generation capacity of grid-connected photovoltaic (PV) power systems is increasing. As output power forecasting is required by electricity market participants and utility operators for the stable operation of power systems, several methods have been proposed using physical and statistical approaches for various time ranges. A short-term (30 min ahead) forecasting method has been previously proposed by our laboratory for geographically distributed PV systems using motion estimation. This study focuses on an important parameter for estimating the proposed motion and optimizing the parameter. This parameter is important because it is associated with the smoothness of the vector field, which is the result of motion estimation and influences the forecasting accuracy. In the periods with drastic power output changes, the evaluation was conducted on 101 PV systems located within a circle of 15-km radius in the Kanto region of Japan. The results indicate that the absolute mean error of the proposed method with the optimized parameter is 10.3%, whereas that of the persistent prediction method is 23.7%. Therefore, the proposed method is effective in forecasting for periods when PV output changes drastically in a short time.
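The persistence baseline used for comparison simply carries the latest observation forward. A minimal sketch, assuming an evenly spaced series and a horizon counted in time steps (one step would correspond to 30 minutes here); the function names are hypothetical and the error is left in the units of the series rather than normalized to a percentage.

```python
def persistence_forecast(series, horizon):
    """Forecast for time t + horizon equals the value observed at time t."""
    if horizon < 1 or horizon >= len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    return series[:-horizon]


def persistence_mean_absolute_error(series, horizon):
    """Mean absolute error of the persistence forecast against the
    values actually observed `horizon` steps later."""
    forecasts = persistence_forecast(series, horizon)
    actuals = series[horizon:]
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)
```

For the series [1.0, 2.0, 4.0, 4.0] and a one-step horizon, the forecasts are [1.0, 2.0, 4.0] against observed values [2.0, 4.0, 4.0], giving a mean absolute error of 1.0. Persistence performs worst exactly when output changes quickly, which is the regime the study evaluates.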
ARTICLE | doi:10.20944/preprints202308.2129.v1
Subject: Public Health And Healthcare, Health Policy And Services Keywords: epidemiology; COVID-19; agent-based model; forecasting
Online: 31 August 2023 (09:37:07 CEST)
Background. We created an agent-based model for short- and long-term forecasting of COVID-19 and for evaluating how the actions of the regulator affected the human and material resources of the healthcare system. Methods. The model was implemented in the AnyLogic software. It includes two state charts: social network and disease transmission. The COVID-19 Essential Supplies Forecasting Tool (COVID-ESFT, version 2.0) was used to determine the healthcare resources needed. Results. Satisfactory results were obtained with long-term (up to 50 days) forecasting when the total cases curve changed monotonically. However, when periods of relative stability were accompanied by sudden outbreaks, relatively satisfactory results were obtained only with short-term forecasting, up to 10 days. Simulation of various scenarios showed that families are the most important setting for the spread of infection, and that the maximum number of COVID-19 cases is observed in the 26-59 age group. Due to a set of measures taken by government agencies, the number of cases in Karaganda city was 3.2 times lower than predicted in the “no intervention” scenario. The economic effect is estimated at 40%. Conclusion. The model is an attempt to account as fully as possible for the peculiarities of the socio-demographic situation in the country. In the future, we will be prepared to some extent for challenges like those we have experienced in the past three years.
ARTICLE | doi:10.20944/preprints202307.0599.v1
Subject: Biology And Life Sciences, Animal Science, Veterinary Science And Zoology Keywords: Swine; mortality; data-wrangling; forecasting; machine-learning
Online: 10 July 2023 (10:40:52 CEST)
The performance of 5 forecasting models was investigated for predicting nursery mortality using the master table built for 3,242 groups of pigs (~13 million animals) and 42 variables, which concerned the pre-weaning phase of production and conditions at placement in growing sites. After training and testing each model’s performance through cross-validation, the model with the best overall prediction results was the Support Vector Machine (SVM) model in terms of Root Mean Squared Error (RMSE=0.406), Mean Absolute Error (MAE=0.284), and Coefficient of Determination (R2=0.731). Subsequently, the forecasting performance of the SVM model was tested on a new dataset containing 72 new groups, simulating ongoing and near real-time forecasting analysis. Despite a decrease in R2 on the new dataset (R2=0.554), the model demonstrated high accuracy (77.78%) for predicting groups with high (>5%) or low (<5%) nursery mortality. This study demonstrated the capability of forecasting models to predict the nursery mortality of commercial groups of pigs using pre-weaning information and stocking conditions variables collected post-placement in nursery sites.
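The high/low accuracy figure can be read as the share of groups whose predicted mortality falls on the same side of the 5% threshold as the observed mortality. The sketch below illustrates that calculation only; the function names and sample values are hypothetical, and treating a value of exactly 5% as "low" is an assumption, since the abstract does not say how ties are handled.

```python
def risk_class(mortality_pct, threshold=5.0):
    """Label a group 'high' if mortality exceeds the threshold, else 'low'.
    A value exactly at the threshold is treated as 'low' (an assumption)."""
    return "high" if mortality_pct > threshold else "low"


def classification_accuracy(observed, predicted, threshold=5.0):
    """Share of groups whose predicted and observed classes agree."""
    hits = sum(
        1
        for o, p in zip(observed, predicted)
        if risk_class(o, threshold) == risk_class(p, threshold)
    )
    return hits / len(observed)
```

With observed mortalities [3.0, 6.5, 4.9, 8.0] and predictions [4.0, 5.5, 5.2, 7.0], three of the four groups are classified consistently, so the accuracy is 0.75. A regression model can therefore score well on this measure even when its R2 drops, as long as its errors rarely cross the threshold.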
ARTICLE | doi:10.20944/preprints202305.2114.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: hailstorm; convective environment; statistical correlation; hail forecasting
Online: 30 May 2023 (10:36:08 CEST)
Using a database of 378 hail days between 20981 and 2020, the climatic characteristics of 23 convective parameters from sounding data and ERA5 data were statistically analyzed. The goal of this work is to evaluate the usefulness and representativeness of convective parameters derived from sounding data and reanalysis data for the operational forecast of the hail phenomenon. The average CAPE values at 12:00 UTC were 433 J/kg for ERA5 data and 505 J/kg for rawinsonde data. The Spearman correlation coefficient matrix between the values of the parameters indicates high correlations between the parameters calculated based on the parcel theory, the humidity indices, and the complex indices. The probability of large hail is highest with high low-level and boundary-layer moisture, high CAPE, and a high lifted condensation level height.
ARTICLE | doi:10.20944/preprints202204.0295.v1
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: Forecasting; SARIMA; Holt-Winters; Climate; Big Data
Online: 29 April 2022 (08:44:28 CEST)
As Indonesia’s capital, Jakarta plays a critical role in boosting the country’s economic growth and setting the precedent for broader change outside of the city. One crucial avenue of inquiry to better understand, and prepare for, the future of a country so heavily impacted by disastrous weather events is understanding the effects of climate change through data. This study investigates meteorological data collected from 1996 to 2021 and compares the application of the SARIMA and Holt-Winters methods to predict the future influence of climatic parameters on Jakarta’s weather. The SARIMA method provided better results than the Holt-Winters models, and both methods performed best when forecasting the humidity data. The forecast results reproduce the characteristics of the climate in Jakarta, with the dry season ranging from May to October and the wet season ranging from November to April.
ARTICLE | doi:10.20944/preprints201810.0625.v2
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: terrestrial modeling; real-time forecasting/monitoring; workflows
Online: 16 November 2018 (08:06:18 CET)
Operational weather and flood forecasting have been performed successfully for decades and are of great socioeconomic importance. Up to now, forecast products have focused on atmospheric variables, such as precipitation and air temperature, and, in hydrology, on river discharge. Considering the full terrestrial system from groundwater across the land surface into the atmosphere, a number of important hydrologic variables are missing, especially with regard to the shallow and deeper subsurface (e.g. groundwater), which are gaining considerable attention in the context of global change. In this study, we propose a terrestrial monitoring/forecasting system using the Terrestrial Systems Modeling Platform (TSMP) that predicts all essential states and fluxes of the terrestrial hydrologic and energy cycles from groundwater into the atmosphere. Closure of the terrestrial cycles provides a physically consistent picture of the terrestrial system in TSMP. TSMP has been implemented over a regional domain over North Rhine-Westphalia and a continental domain over Europe in a real-time forecast/monitoring workflow. Applying this workflow over both domains, experimental forecasts with different lead times have been produced since the beginning of 2016. Real-time forecast/monitoring products encompass all compartments of the terrestrial system, including additional hydrologic variables such as plant available soil water, groundwater table depth, and groundwater recharge and storage.
ARTICLE | doi:10.20944/preprints201708.0086.v1
Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: seasonality; forecasting; pull and push models; denoising
Online: 25 August 2017 (08:21:40 CEST)
In this paper we develop a forecasting algorithm for recurrent patterns in consumer demand. We study this problem in two different settings: pull and push models. We discuss several features of the algorithm concerning sampling, periodic approximation, denoising and forecasting.
ARTICLE | doi:10.20944/preprints201609.0031.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: electricity price forecasting; ensemble model; expert selection
Online: 8 September 2016 (11:52:52 CEST)
Day-ahead forecasting of electricity prices is important in deregulated electricity markets for all the stakeholders: energy wholesalers, traders, retailers, and consumers. Electricity price forecasting is an inherently difficult problem due to its special characteristics of dynamicity and non-stationarity. In this paper, we present a robust price forecasting mechanism that shows resilience towards the aggregate demand response effect and provides highly accurate forecasted electricity prices to the stakeholders in a dynamic environment. We employ an ensemble prediction model in which a group of different algorithms participate in predicting the price for each hour of a day. We propose two different strategies, namely, Fixed Weight Method (FWM) and Varying Weight Method (VWM), for selecting each hour's expert algorithm from the set of participating algorithms. In addition, we utilize a carefully engineered set of features selected from a pool of features derived from information such as past electricity price data, weather data, and calendar data. On the datasets for the New York, Australian, and Spanish electricity markets, the proposed ensemble model offers better results than both the Pattern Sequence-based Forecasting (PSF) method and our own previous work using Artificial Neural Networks (ANN) alone.
REVIEW | doi:10.20944/preprints202304.0917.v1
Subject: Engineering, Energy And Fuel Technology Keywords: Predictive models; Weather research and forecasting (WRF); Uncertainty; Wind forecasting; Ultra short term and Short term; Wind power generation
Online: 25 April 2023 (10:10:36 CEST)
The prediction of wind power output is part of the basic work of power grid dispatching and energy distribution. At present, the output power prediction is mainly obtained by fitting and regressing the historical data. The medium- and long-term power prediction results exhibit large deviations due to the uncertainty of wind power generation. In order to meet the demand for accessing large-scale wind power into the electricity grid and to further improve the accuracy of short-term wind power prediction, it is necessary to develop models for accurate and precise short-term wind power prediction based on advanced algorithms for studying the output power of a wind power generation system. This paper summarizes the contribution of the current advanced wind power forecasting technology and delineates the key advantages and disadvantages of various wind power forecasting models. These models have different forecasting capabilities, update the weights of each model in real time, improve the comprehensive forecasting capability of the model, and have good application prospects in wind power generation forecasting. Furthermore, the case studies and examples in the literature for accurately predicting ultra-short-term and short-term wind power generation with uncertainty and randomness are reviewed and analyzed. Finally, we present prospects for future studies that can serve as useful directions for other researchers planning to conduct similar experiments and investigations.
ARTICLE | doi:10.20944/preprints202312.0581.v1
Subject: Engineering, Other Keywords: Artificial Intelligence; Load Forecasting; Feature Selection; Outlier Rejection
Online: 8 December 2023 (12:42:36 CET)
Recently, the application of Artificial Intelligence (AI) in many areas of life has raised the efficiency of systems and converted them into smart ones, especially in the field of energy. Integrating AI with power systems allows electrical grids to be smart enough to predict the future load, which is known as Intelligent Load Forecasting (ILF). Hence, suitable decisions for power system planning and operation procedures can be taken accordingly. Moreover, ILF can play a vital role in electrical demand response, which guarantees a reliable transitioning of power systems. This paper introduces a Perfect Load Forecasting Strategy (PLFS) for predicting future load in smart electrical grids based on AI techniques. The proposed PLFS consists of two sequential phases: the Data Preprocessing Phase (DPP) and the Load Forecasting Phase (LFP). In the former phase, the input electrical load dataset is prepared before the actual forecasting takes place through two essential tasks, namely feature selection and outlier rejection. Feature selection is done using Advanced Leopard Seal Optimization (ALSO), a new nature-inspired optimization technique, while outlier rejection is accomplished through the Interquartile Range (IQR) as a measure of statistical dispersion. On the other hand, actual load forecasting takes place in LFP using a new predictor called the Weighted K-Nearest Neighbor (WKNN) algorithm. The proposed PLFS has been tested through extensive experiments. Results have shown that PLFS outperforms recent load forecasting techniques, as it achieves the maximum prediction accuracy with the minimum root mean square error.
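The IQR-based outlier rejection named in the abstract above can be illustrated with a minimal sketch. The abstract does not give the fence multiplier or the quartile convention, so the conventional factor `k=1.5`, the inclusive quantile method, the function name `iqr_filter`, and the sample load values are all assumptions made for illustration, not the authors' implementation.

```python
from statistics import quantiles

def iqr_filter(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional fence width (an assumption here;
    the abstract does not state the multiplier used in PLFS).
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical load readings with one obvious spike
loads = [102.0, 98.5, 101.2, 99.8, 100.4, 97.9, 250.0, 100.9]
cleaned = iqr_filter(loads)
```

With these sample values only the 250.0 reading falls outside the fences, so the remaining seven readings would be passed on to the forecasting phase.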
ARTICLE | doi:10.20944/preprints202310.1799.v1
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: stochastic processes; time series; ARIMA; ODE; price forecasting
Online: 30 October 2023 (07:37:48 CET)
The distribution laws of various natural and anthropogenic processes in the world around us are stochastic in nature. The development of mathematics and, in particular, of stochastic modelling allows us to study regularities in such processes. In practice, stochastic modelling finds a huge number of applications in various fields, including finance and economics. In this work, some particular applications of stochastic processes in finance are examined in the conditions of financial crisis. More specifically, Autoregressive Integrated Moving Average (ARIMA) models and Modified Ordinary Differential Equations (ODE) models, which have been previously developed by the authors to predict assets’ prices of four Bulgarian companies, are validated on a time period during the crisis. Estimated rates of return are calculated from the models for one period ahead. The errors are estimated and the models are compared. The predicted return values with each of the two approaches are used to derive optimal risk portfolios based on the Markowitz Model. The resulting portfolios are compared in terms of distribution (weights of the stocks), risk and rate of return.
ARTICLE | doi:10.20944/preprints202309.1191.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: prophet; LSTM; GRU; meteorological data; electricity consumption forecasting
Online: 19 September 2023 (08:33:08 CEST)
This study proposes a short- and medium-term electricity consumption prediction algorithm by combining the GRU model suitable for long-term forecasting and the Prophet model suitable for seasonality and event handling. (1) Manufacturing Company B's electricity consumption data and meteorological data in Naju, Jeollanam-do, South Korea are collected and preprocessed. (2) In the first step, the Prophet model is applied to the preprocessed data for seasonality and event handling prediction. (3) In the second step, seven multivariate inputs are fed to the GRU. Specifically, the seven inputs consist of six meteorological variables and the residuals between the predictions of the Prophet model in Step 1 and the observed data. These are utilized to predict electricity consumption at 15-minute intervals. (4) Electricity consumption is predicted for short-term (2 days and 7 days) and medium-term (15 days and 30 days) scenarios. The experimental results demonstrate that the proposed method outperforms the conventional Prophet model by more than 23 times and the modified GRU model by more than 2 times in terms of MAPE.
ARTICLE | doi:10.20944/preprints202308.1321.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Energy Forecasting; Modeling; Electricity Mix; Machine Learning Algorithms
Online: 18 August 2023 (09:26:27 CEST)
This study focused on predictive models incorporating machine learning techniques that induce new dynamics for forecasting energy generation, enabling effective planning, financing, and system monitoring. The research developed a machine learning-based power generation prediction model tailored explicitly for Kenya's Garissa solar power plant. The selected model demonstrated a root mean squared error of 5.23 during evaluation, resulting in a prediction accuracy of 90.42%. This high accuracy indicates that the model can be relied upon for precise generation prediction, facilitating effective planning and system performance monitoring.
ARTICLE | doi:10.20944/preprints202308.0427.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep neural networks; time series forecasting; covariates; retailing
Online: 4 August 2023 (10:37:47 CEST)
Retailers must have accurate sales forecasts to operate their businesses efficiently and effectively and to remain competitive in the marketplace. Global forecasting models like RNNs can be a powerful tool for forecasting in retail settings, where multiple time series are often interrelated and influenced by a variety of external factors. By including covariates in a forecasting model, we can better capture the various factors that can influence sales in a retail setting. This can help improve the accuracy of our forecasts and enable better decision-making for inventory management, purchasing, and other operational decisions. In this study, we investigate how the accuracy of global forecasting models is affected by the inclusion of different potential demand covariates. To ensure the significance of the study's findings, we used the M5 forecasting competition's openly accessible and well-established dataset. The results obtained from the DeepAR models, which were trained on various combinations of features including time, price, events, and IDs, suggest that individually only the features corresponding to IDs improve the baseline model. However, when all the features are used together, the best performance is achieved, indicating that the individual relevance of each feature is emphasized when the information is given jointly. Comparing the model with features to the model without features, there is an improvement of 1.76% for MRMSSE and 6.47% for MMASE.
ARTICLE | doi:10.20944/preprints202103.0269.v1
Subject: Business, Economics And Management, Accounting And Taxation Keywords: forecasting methods; statistical learning; high-frequency order book
Online: 9 March 2021 (12:24:12 CET)
This paper proposes a forecast-centric adaptive learning model that engages with the past studies on the order book and high-frequency data, with applications to hypothesis testing. In line with the past literature, we produce brackets of summaries of statistics from the high-frequency bid and ask data in the CSI 300 Index Futures market and aim to forecast the one-step-ahead prices. Traditional time series issues, e.g. ARIMA order selection, stationarity, together with potential financial applications are covered in the exploratory data analysis, which pave paths to the adaptive learning model. By designing and running the learning model, we found it to perform well compared to the top fixed models, and some could improve the forecasting accuracy by being more stable and resilient to non-stationarity. Applications to hypothesis testing are shown with a rolling window, and further potential applications to finance and statistics are outlined.
ARTICLE | doi:10.20944/preprints201911.0149.v1
Subject: Engineering, Control And Systems Engineering Keywords: car sharing; forecasting; machine learning; socio-demographic; weather
Online: 13 November 2019 (12:31:49 CET)
Free Floating Car Sharing (FFCS) services are a flexible alternative to car ownership. These transportation services show highly dynamic usage both over different hours of the day and across different city areas. In this work, we study the problem of predicting FFCS demand patterns, a problem of great importance to an adequate provisioning of the service. We tackle the prediction of the demand both i) over time and ii) over space. We rely on months of real FFCS rides in Vancouver, which constitute our ground truth. We enrich this data with detailed socio-demographic information obtained from large open-data repositories to predict usage patterns. Our aim is to offer a thorough comparison of several machine learning algorithms in terms of accuracy and ease of training, and to assess the effectiveness of current state-of-the-art approaches to address the prediction problem. Our results show that it is possible to predict the future usage with relative errors down to 10%, and the spatial prediction can be estimated with relative errors of about 40%. Our study also uncovered the socio-demographic features that most strongly correlate with FFCS usage, providing interesting insights for providers opening service in new regions.
ARTICLE | doi:10.20944/preprints201811.0096.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: machine learning; stacking; forecasting; regression; sales; time series
Online: 5 November 2018 (09:54:54 CET)
In this paper, we study the usage of machine learning models for sales time series forecasting. The effect of machine learning generalization has been considered. A stacking approach for building a regression ensemble of single models has been studied. The results show that, using stacking techniques, we can improve the performance of predictive models for sales time series forecasting.
ARTICLE | doi:10.20944/preprints201811.0010.v1
Subject: Physical Sciences, Optics And Photonics Keywords: forecasting; complex dynamics; fiber laser; chaos; ordinal patterns
Online: 2 November 2018 (04:21:25 CET)
Being able to forecast events is of great importance in many fields, from brain behavior to earthquakes or stock markets. Because each dynamical system has intrinsic features, different statistical tools have to be used for each system. Here we study the time series of the output intensity of a fiber laser with an ordinal patterns analysis, and we look for temporal correlations in order to statistically forecast the most intense events. We set two thresholds, a low one and a high one, to distinguish between low-intensity and high-intensity events. We find that, while its events remain below the low threshold, the time series shows some preferred temporal patterns before producing events above the high threshold.
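To make the ordinal patterns analysis mentioned above concrete, the following minimal sketch maps each sliding window of a series to the permutation that sorts it and counts how often each permutation occurs. The pattern length of 3, the function names, and the intensity values are assumptions for illustration; the abstract does not state the settings used in the study.

```python
from collections import Counter

def ordinal_pattern(window):
    """Return the permutation of indices that sorts the window ascending."""
    return tuple(sorted(range(len(window)), key=lambda i: window[i]))

def pattern_counts(series, order=3):
    """Count ordinal patterns over all sliding windows of length `order`.

    order=3 is an illustrative choice, not a value taken from the paper.
    """
    return Counter(
        ordinal_pattern(series[i:i + order])
        for i in range(len(series) - order + 1)
    )

# Hypothetical laser intensity samples
intensity = [0.2, 0.5, 0.9, 0.4, 0.1, 0.3, 0.8]
counts = pattern_counts(intensity)
```

In a forecasting setting one would compare pattern frequencies in the windows preceding high-intensity events with the frequencies over the whole series; a clear excess of some patterns would indicate a usable precursor.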
ARTICLE | doi:10.20944/preprints201808.0163.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: traffic flow forecasting; cluster computing; LSTM neural networks
Online: 8 August 2018 (08:50:09 CEST)
Accurate and fast traffic flow forecasting is vital in intelligent transportation systems because many of their advanced features are based on it. However, existing methods perform poorly in terms of accuracy and computational efficiency in long-term traffic flow forecasting under big data. Hence, in this paper we propose an improved long short-term memory (LSTM) network and its cluster computing implementation to address the above challenge. We propose a singular point probability LSTM (SDLSTM) algorithm. The method discards units of the network according to the singular point probability during the training process and amends the SDLSTM with an Autoregressive Integrated Moving Average (ARIMA) model to achieve accurate prediction of 24-hour traffic flow data. Furthermore, the paper designs a scheme for implementing this method through cluster computing to shorten the calculation time and improve the system's operating speed. Theoretical analysis and experimental results show that SDLSTM achieves higher accuracy and better stability in long-term traffic flow forecasting than previous methods.
ARTICLE | doi:10.20944/preprints201801.0216.v1
Subject: Engineering, Energy And Fuel Technology Keywords: Electricity Demand; ANN; PSO; GA; Hybrid Optimization; Forecasting
Online: 23 January 2018 (15:30:10 CET)
In the present study, a hybrid optimizing algorithm has been proposed using Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) for Artificial Neural Network (ANN) to improve the estimation of electricity demand of the state of Tamil Nadu in India. The GA-PSO model optimizes the coefficients of factors of gross state domestic product (GSDP), electricity consumption per capita, income growth rate and consumer price index (CPI) that affect the electricity demand. Based on historical data of 25 years from 1991 to 2015, the simulation results of GA-PSO models have greater accuracy and reliability than single optimization methods based on either PSO or GA. The forecasting results of ANN-GA-PSO are better than those of models based on single optimization, such as the ANN-BP, ANN-GA, and ANN-PSO models. Further, the paper also forecasts the electricity demand of Tamil Nadu based on two scenarios. The first scenario is the "as-it-is" scenario; the second scenario is based on milestones set for achieving the goals of the "Vision 2023" document for the state. The present research also explores the causality between economic growth and electricity demand in the case of Tamil Nadu. The research indicates that a direct causality exists between GSDP and the electricity demand of the state.
ARTICLE | doi:10.20944/preprints201711.0190.v2
Subject: Engineering, Energy And Fuel Technology Keywords: electricity demand; ANN; PSO; GA; hybrid optimization; forecasting
Online: 16 January 2018 (07:44:04 CET)
In the present study, a hybrid optimizing algorithm has been proposed using Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) for Artificial Neural Network (ANN) to improve the estimation of electricity demand of the state of Tamil Nadu in India. The GA-PSO model optimizes the coefficients of factors of gross state domestic product (GSDP), electricity consumption per capita, income growth rate and consumer price index (CPI) that affect the electricity demand. Based on historical data of 25 years from 1991 to 2015, the simulation results of GA-PSO models have greater accuracy and reliability than single optimization methods based on either PSO or GA. The forecasting results of ANN-GA-PSO are better than those of models based on single optimization, such as the ANN-BP, ANN-GA, and ANN-PSO models. Further, the paper also forecasts the electricity demand of Tamil Nadu based on two scenarios. The first scenario is the “as-it-is” scenario; the second scenario is based on milestones set for achieving the goals of the “Vision 2023” document for the state. The present research also explores the causality between economic growth and electricity demand in the case of Tamil Nadu. The research indicates that a direct causality exists between GSDP and the electricity demand of the state.
ARTICLE | doi:10.20944/preprints201710.0053.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: ARIMA; forecasting; fuzzy partition; fuzzy transform; time series
Online: 9 October 2017 (16:58:43 CEST)
We define a new seasonal forecasting method based on fuzzy transforms. We use the best interpolating polynomial for extracting the trend of the time series and generate the inverse fuzzy transform on each seasonal subset of the universe of discourse for predicting the value of an assigned output. As a first example, we use the daily weather dataset of the municipality of Naples (Italy), with data collected from 2003 to 2015, making predictions on the following outputs: mean temperature, max temperature and min temperature, all considered daily. As a second example, we use the daily mean temperature measured at the weather station “Chiavari Caperana” in the Liguria region of Italy. We compare the results of our method with those of the average seasonal variation, ARIMA and the usual fuzzy transforms, concluding that the best results are obtained under our approach in both examples.
REVIEW | doi:10.20944/preprints201812.0235.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Intelligent Load Forecasting; Demand-Side Management; Pattern Similarity; Hierarchical Forecasting; Feature Selection; Weather Station Selection
Online: 19 December 2018 (12:19:14 CET)
Electricity demand forecasting has been a real challenge for power system scheduling at the different levels of the energy sector. Various computational intelligence techniques and methodologies have been employed in the electricity market for load forecasting; however, scant evidence is available about the feasibility of each of these methods considering the type of data and other potential factors. This work introduces the scientific and technical rationale behind several intelligent forecasting methods, based on the work of previous researchers in the field of energy. The fundamental benefits and main drawbacks of these methods are discussed in order to depict the efficiency of each approach in various situations. Finally, a proposed hybrid strategy is presented.
REVIEW | doi:10.20944/preprints202309.1764.v2
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Machine-learning; Weather prediction; Climate prediction; Survey; Meteorological Forecasting
Online: 30 October 2023 (17:09:12 CET)
With the rapid development of artificial intelligence, machine learning is gradually becoming popular in predictions in all walks of life. In meteorology, it is gradually competing with traditional climate predictions dominated by physical models. This survey aims to consolidate the current understanding of Machine Learning (ML) applications in weather and climate prediction, a field of growing importance across multiple sectors including agriculture and disaster management. Building upon an exhaustive review of more than 20 methods highlighted in existing literature, this survey pinpointed eight techniques that show particular promise for improving the accuracy of both short-term weather and medium-to-long-term climate forecasts. According to the survey, while ML demonstrates significant capabilities in short-term weather prediction, its application in medium-to-long-term climate forecasting remains limited, constrained by factors such as intricate climate variables and data limitations. Current literature tends to focus narrowly on either short-term weather or medium-to-long-term climate forecasting, often neglecting the relationship between the two, as well as modelling structure and recent advances. By providing an integrated analysis of models spanning different time scales, this survey aims to bridge these gaps, thereby serving as a meaningful guide for future interdisciplinary research in this rapidly evolving field.
ARTICLE | doi:10.20944/preprints202306.0240.v1
Subject: Environmental And Earth Sciences, Pollution Keywords: deep learning; PM10; environmental forecasting; chaotic time series; Arctic
Online: 5 June 2023 (05:04:24 CEST)
In this study, we present a statistical forecasting framework and assess its efficacy using a range of established machine learning algorithms for predicting Particulate Matter (PM) concentrations in the Arctic, specifically in Pallas (FI), Reykjavik (IS), and Tromso (NO). Our framework leverages historical ground measurements and 24-hour predictions from 9 models provided by the Copernicus Atmosphere Monitoring Service (CAMS) to provide PM predictions for the following 24 hours. Furthermore, we compare the performance of various memory cells based on artificial neural networks (ANN), including recurrent neural networks (RNNs), gated recurrent units (GRUs), long short-term memory networks (LSTMs), echo state networks (ESNs), and windowed multi-layer perceptrons (MLPs), which are commonly employed in time series forecasting tasks. Irrespective of the chosen memory cell type, our results demonstrate that the proposed framework consistently outperforms the CAMS models in terms of mean squared error (MSE), with average improvements ranging from 25% to 40%. Additionally, we investigate the impact of outliers on the overall model performance.
ARTICLE | doi:10.20944/preprints202305.0623.v1
Subject: Computer Science And Mathematics, Computational Mathematics Keywords: Bayesian forecasting; stochastic simulation; parameter uncertainty; two-level simulation
Online: 9 May 2023 (09:24:32 CEST)
When there is uncertainty in the value of parameters of the input random components of a stochastic simulation model, two-level nested simulation algorithms are used to estimate the expectation of performance variables of interest. In the outer level of the algorithm (n) observations are generated for the parameters, and in the inner level (m) observations of the simulation model are generated with the value of parameters fixed at the value generated in the outer level. In this article, we consider the case in which the observations at both levels of the algorithm are independent, showing how the variance of the observations can be decomposed into the sum of a parametric variance and a stochastic variance. Next, we derive central limit theorems that allow us to compute asymptotic confidence intervals to assess the accuracy of the simulation-based estimators for the point forecast and the variance components. Under this framework, we derive analytical expressions for the point forecast and the variance components of a Bayesian model to forecast sporadic demand; and we use these expressions to illustrate the validity of our theoretical results by performing simulation experiments using this forecast model.
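The decomposition described in the abstract above corresponds to the law of total variance. The block below is a sketch in assumed notation (the symbols \Theta, W, W_{ij} and F are not taken from the paper): \Theta is the uncertain parameter, W a simulation output, and W_{ij} the j-th inner observation generated under the i-th outer parameter draw \theta_i. The plain double average shown for the point forecast is one natural estimator and may differ from the one analysed by the authors.

```latex
% Assumed notation: \Theta = uncertain parameter, W = simulation output,
% n outer parameter draws, m inner simulation runs per draw.
\begin{align}
\mu &= \mathbb{E}[W] = \mathbb{E}\big[\mathbb{E}[W \mid \Theta]\big], \\
\operatorname{Var}(W) &=
  \underbrace{\operatorname{Var}\big(\mathbb{E}[W \mid \Theta]\big)}_{\text{parametric variance}}
  + \underbrace{\mathbb{E}\big[\operatorname{Var}(W \mid \Theta)\big]}_{\text{stochastic variance}}, \\
\hat{\mu} &= \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} W_{ij},
  \qquad W_{ij} \sim F(\,\cdot \mid \theta_i).
\end{align}
```

The first term captures how much the conditional mean moves as the parameter changes, and the second captures the simulation noise that remains once the parameter is fixed.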
ARTICLE | doi:10.20944/preprints202111.0510.v1
Subject: Environmental And Earth Sciences, Environmental Science Keywords: Flood Early Warning; forecasting; hydrological extremes; Machine Learning; Andes
Online: 26 November 2021 (13:30:09 CET)
Flood Early Warning Systems (FEWSs) using Machine Learning (ML) have gained worldwide popularity. However, determining the most efficient ML technique is still a bottleneck. We assessed FEWSs with three river states, No-alert, Pre-alert, and Alert for flooding, for lead times from 1 to 12 hours using the most common ML techniques, such as Multi-Layer Perceptron (MLP), Logistic Regression (LR), K-Nearest Neighbors (KNN), Naive Bayes (NB), and Random Forest (RF). The Tomebamba catchment in the tropical Andes of Ecuador was selected as a case study. For all lead times, MLP models achieve the highest performance followed by LR, with f1-macro (log-loss) scores of 0.82 (0.09) and 0.46 (0.20) for the 1- and 12-hour cases, respectively. The ranking was highly variable for the remaining ML techniques. According to the g-mean, LR models correctly forecast and show more stability at all states, while the MLP models perform better in the Pre-alert and Alert states. Future efforts are recommended to enhance the input data representation and develop communication applications to boost society's awareness of floods.
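For readers unfamiliar with the f1-macro score reported above, the following minimal sketch computes it as the unweighted mean of the per-class F1 scores over the three river states. The function name and the observed/forecast sequences are hypothetical and are not data from the study.

```python
def f1_macro(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores (F1 = 2TP / (2TP + FP + FN))."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class that never appears in truth or prediction scores 0.0
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

states = ["No-alert", "Pre-alert", "Alert"]
# Hypothetical observed and forecast river states
observed = ["No-alert", "No-alert", "Pre-alert", "Alert", "Alert", "No-alert"]
forecast = ["No-alert", "Pre-alert", "Pre-alert", "Alert", "No-alert", "No-alert"]
score = f1_macro(observed, forecast, states)
```

Because every class contributes equally, the macro average penalizes poor performance on rare states such as Alert, which is the usual motivation for preferring it over plain accuracy on imbalanced warning data.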
ARTICLE | doi:10.20944/preprints202009.0518.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: aquaculture water quality; dissolved oxygen (DO); forecasting; EEMD; LSTM
Online: 22 September 2020 (10:02:09 CEST)
Dissolved Oxygen (DO) concentration is a vital parameter that indicates water quality. DO short term forecasting using time series analysis on data collected from an aquaculture pond is presented here. This can provide data support for an early warning system for an improved management of the aquaculture farm. The conventional forecasting approaches are commonly characterized by low accuracy and poor generalization problems. In this article, we present a novel hybrid DO concentration forecasting method with ensemble empirical mode decomposition (EEMD) based LSTM (Long short-term memory) neural network (NN). With this method, first, the sensor data integrity is improved through linear interpolation and moving average filtering methods of data preprocessing. Next, the EEMD algorithm is applied to decompose the original sensor data into multiple intrinsic mode functions (IMFs). Finally, the feature selection is used to carefully select IMFs that are strongly correlated with the original sensor data and integrate both into inputs for the NN. The hybrid EEMD-based LSTM forecasting model is then constructed. Performance of this proposed model in training and validation sets was compared with the observed real sensor data. To obtain the exact evaluation accuracy of the forecasted results of the hybrid EEMD-based LSTM forecasting model, three statistical performance indices were adopted: MAE, MSE, and RMSE. Results presented for short term (12-hour period) and long term (1-month period) give a strong indication of suitability of this method for forecasting DO values.
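The preprocessing step described above (linear interpolation followed by moving average filtering) can be sketched as follows. This is an illustration under stated assumptions: missing readings are marked with `None`, the first and last readings are present, the window length of 3 is arbitrary, and the function names and sample values are hypothetical rather than taken from the paper.

```python
def interpolate_gaps(values):
    """Fill interior None gaps by linear interpolation between known neighbours.

    Assumes the first and last readings are present; leading or trailing
    gaps are left untouched.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled

def moving_average(values, window=3):
    """Simple moving average; the output has window - 1 fewer samples."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# Hypothetical DO readings in mg/L with missing samples
do_mg_l = [6.0, None, 7.0, 7.5, None, None, 6.0]
filled = interpolate_gaps(do_mg_l)
smoothed = moving_average(filled, window=3)
```

The smoothed series would then be the input to the decomposition stage; the EEMD and LSTM stages themselves are outside the scope of this sketch.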
ARTICLE | doi:10.20944/preprints201810.0494.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: unsupervised training; features learning; deep learning; time series forecasting
Online: 22 October 2018 (12:24:43 CEST)
A continuous Deep Belief Network (cDBN) with two hidden layers is proposed in this paper, focusing on the problem of weak feature learning ability when dealing with continuous data. In cDBN, the input data is trained in an unsupervised way by using a continuous version of the transfer functions, contrastive divergence is designed into the hidden layer training process to raise convergence speed, an improved dropout strategy is then implemented in unsupervised training to realize feature learning by de-cooperating between the units, and then the network is fine-tuned using the back propagation algorithm. Besides, hyper-parameters are analysed through stability analysis to ensure that the network can find the optimum. Finally, experiments on the Lorenz chaos series, the CATS benchmark and other real-world series such as CO2 and waste water parameter forecasting show that cDBN has the advantage of higher accuracy, simpler structure and faster convergence speed than other methods.
ARTICLE | doi:10.20944/preprints201710.0129.v1
Subject: Engineering, Civil Engineering Keywords: hydropower; errors; multi-step ahead forecasting; recursive method; simulations
Online: 19 October 2017 (02:34:27 CEST)
Multi-step ahead streamflow forecasting is of practical interest for the operation of hydropower reservoirs. We provide generalized results on the error evolution in multi-step ahead forecasting by conducting several large-scale experiments based on simulations. We also present a multiple-case study using monthly time series of streamflow. Our findings suggest that some forecasting methods are more useful than others. However, the errors computed at each time step of a forecast horizon within a specific case study strongly depend on the case examined and can be either small or large, regardless of the forecasting method used and the time step of interest.
ARTICLE | doi:10.20944/preprints201710.0051.v1
Subject: Computer Science And Mathematics, Computational Mathematics Keywords: cloud computing; workload model; workload-aware resource forecasting model
Online: 9 October 2017 (12:40:34 CEST)
The primary attraction of IaaS is the provision of elastic resources on demand. It is therefore imperative that IaaS users have an effective methodology for learning which resources they require, how many, and for how long. However, the heterogeneity of resources, the diverse resource demands of different cloud applications, and the variation in application-user behavior pose a major challenge for IaaS users. In this paper, we propose a unified resource demand forecasting model suited to different applications, various resources, and diverse time-varying workload patterns. Given parameterized applications, resources, and workload scenarios as input, the model outputs the corresponding resource demands for any time interval. The experiments configure concrete functions and parameters to help illustrate the model.
ARTICLE | doi:10.20944/preprints201707.0013.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: renewable energy; solar; wind; interannual variability; seasonal forecasting; teleconnections
Online: 7 July 2017 (03:55:54 CEST)
Solar and wind resources available for power generation are subject to variability due to meteorological factors. Here, we use a new global climate reanalysis product, Version 2 of the NASA Modern-Era Retrospective Analysis for Research and Applications (MERRA-2), to quantify interannual variability of monthly-mean solar and wind resource from 1980 to 2016 at a resolution of about 0.5 degrees. We find an average coefficient of variation (CV) of 11% for monthly-mean solar radiation and 8% for windspeed. Mean CVs were about 25% greater over ocean than over land, and, for land areas, were greatest at high latitude. The correlation between solar and wind anomalies was near zero in the global mean, but markedly positive or negative in some regions. Both wind and solar variability were correlated with values of climate modes such as the Southern Oscillation Index and Arctic Oscillation, with correlations in the Northern Hemisphere generally stronger during winter. We conclude that reanalysis solar and wind fields could be helpful in assessing variability in power generation due to interannual fluctuations in the wind and solar resource. Skillful prediction of these fluctuations seems to be possible, particularly for certain regions and seasons, given persistence or predictability of climate modes with which these fluctuations are associated.
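As an illustration of the statistic reported above, the interannual coefficient of variation for one calendar month can be taken as the standard deviation across years divided by the mean. The sketch below is a minimal example on made-up values; the function name, the sample data, and the choice of the sample standard deviation are assumptions and are not taken from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    return stdev(values) / m


# Hypothetical monthly-mean solar radiation (W/m^2) for the same
# calendar month in five different years.
july_radiation = [200.0, 220.0, 180.0, 210.0, 190.0]
cv = coefficient_of_variation(july_radiation)
print(f"CV = {cv:.1%}")
```

Applied per grid cell and per calendar month, a calculation of this kind would give a map of CV values that could then be averaged over land or ocean.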
ARTICLE | doi:10.20944/preprints201703.0231.v1
Subject: Chemistry And Materials Science, Materials Science And Technology Keywords: copper resources; demand forecasting; system dynamics model; sustainability development
Online: 31 March 2017 (10:50:56 CEST)
Copper demand has a strong pull effect on a country's copper industry. In 2015, China accounted for 50% of the world's copper consumption. A scientific forecast of the trend in China's copper demand is also an important basis for analyzing its future environmental impact. This paper assumes high, medium, and low scenarios for China's economic development and forecasts economic and social indicators such as total GDP, population, and per capita GDP in China from 2016 to 2030. It then predicts China's demand for copper resources from 2016 to 2030 by combining a system dynamics model, a vector autoregressive moving average model, and an inverted U-type empirical model. The results show that: (1) in 2020, 2025, and 2030, China's refined copper demand will be 13 Mt, 15 Mt, and 15.5 Mt, respectively; (2) growth in China's copper demand slows significantly over 2016-2030; (3) over 2025-2030, China's copper resource demand is stable and enters a plateau, peaking at 15.5 Mt in 2027; and (4) after 2030, China's copper resource demand will enter a slow decline.
ARTICLE | doi:10.20944/preprints201608.0204.v1
Subject: Business, Economics And Management, Economics Keywords: logistics industry; sustainability; data envelopment analysis (DEA); grey forecasting
Online: 25 August 2016 (10:12:27 CEST)
Logistics plays an important role in globalized companies and contributes to the development of foreign trade. A large number of external conditions, such as recession and inflation, affect logistics. Therefore, managers should find ways to improve operational performance, enabling them to increase efficiency while considering environmental sustainability due to the industry’s large scale of energy consumption. Based on data collected from the financial reports of top global logistics companies, this study uses a DEA model to calculate corporate efficiency by implementing a Grey forecasting approach to forecast future sustainability values. Consequently, the study addresses the problem of how to enhance operational performance while accounting for the impact of external conditions. This research can help logistics companies develop operation strategies in the future that will enhance their competitiveness vis-à-vis rivals in a time of global economic volatility.
REVIEW | doi:10.20944/preprints202310.0057.v1
Subject: Environmental And Earth Sciences, Sustainable Science And Technology Keywords: Scientific mapping; greenhouse effect; renewable energy; energy generation; forecasting; review
Online: 2 October 2023 (11:59:51 CEST)
Higher concentrations of greenhouse gases resulting from anthropogenic actions associated with energy generation are one of the causes of climate change. In view of this, several efforts have been undertaken in the search for more sustainable alternatives, and photovoltaic (PV) technology has stood out among the different possibilities. However, PV generation is highly sensitive to future climate variability, which is a source of uncertainty that can complicate energy planning and compromise the viability of systems. This theme has received attention from the academic community, but some challenges to map and identify relevant literature have been encountered. Therefore, this study was conducted to analyze and identify relevant aspects of international scientific production on the impacts of climate change on the potential of photovoltaic production, CC-PVP, through bibliometric techniques. For this, 3900 articles from the Web of Science and Scopus databases, published between 1960 and 2021, were retrieved and analyzed through a bibliometric approach, using the SciMAT tool. Among the results obtained, it is worth pointing out that the CC-PVP research field (i) has moderate maturity, (ii) is concentrated in the areas of energy, fuels and technology, as well as environmental sciences and meteorology, (iii) has the most studied themes currently related to energy and the forecasting of photovoltaic energy production and electric energy consumption in the world, especially when considering climate change, and (iv) is more researched by Chinese, North Americans and Australians.
ARTICLE | doi:10.20944/preprints202308.2068.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: Raindrop Size Distribution; Microphysical Characteristics; LSTM Neural Network; Precipitation Forecasting
Online: 30 August 2023 (09:47:49 CEST)
Raindrop size distribution (RSD) is an index that reflects precipitation characteristics. Analyzing differences in RSD plays an important role in understanding precipitation microphysical processes and in improving quantitative precipitation prediction by radar. In this paper, the RSD data of the Dafang (57708), Majang (57828), and Luodian (57916) stations in Guizhou (at altitudes of 1722.7 m, 985.0 m, and 441.5 m, respectively) are analyzed according to their precipitation microphysical characteristics. First, particles smaller than 1 mm contribute the most to the particle number density, and this contribution decreases with descending altitude. Second, the Gamma distribution fits better than the M-P distribution, and the quality of the Gamma fit increases with altitude. Third, the mass-weighted average diameter is not sensitive enough to changes in precipitation intensity, with a correlation coefficient of 54.84%, whereas there is an obvious relationship between the average volume diameter and precipitation intensity, with a correlation coefficient of 69.15%. A precipitation prediction model is then established using an LSTM neural network after fusing the representative microphysical characteristics of RSD with radar and rain gauge data. The model is applied to the Dafang (57708) site to predict precipitation over a time range of 0-180 minutes. For both convective and stratiform cloud precipitation, the 60-minute predictions are the most consistent with the actual precipitation, with correlation coefficients of 92.87% and 92.57%, respectively. In conclusion, the results demonstrate that incorporating RSD base data could improve the reliability of precipitation forecasting.
ARTICLE | doi:10.20944/preprints202308.0693.v1
Subject: Engineering, Mechanical Engineering Keywords: wind energy; solar energy; renewable energy; machine learning; forecasting ensembles
Online: 9 August 2023 (10:56:29 CEST)
In this paper, solar irradiance and wind speed forecasts were performed for time horizons ranging from 10 min to 60 min, with a 10 min time step. Global horizontal irradiance (GHI) and wind speed were forecast using four models (Random Forest, k-Nearest Neighbours, Support Vector Regression, and Elastic Net), and their performance was compared against two alternative dynamic ensemble methods (windowing and arbitrating). The forecasting models and dynamic forecasting ensembles were implemented in Python for performance evaluation. The comparison was carried out using the RMSE, MAE, R² and MAPE to evaluate whether the dynamic ensemble forecasting methods achieved better performance. According to the results, the windowing dynamic ensemble method was the most efficient among those tested. For the wind speed data, varying its parameter λ from 1 to 100 produced a variable performance profile: from λ = 1 to λ = 74, windowing proved to be the most efficient, reaching maximum efficiency at λ = 19. Windowing was also the best method for the GHI analysis, reaching its best performance at λ = 1. The efficiency gain from windowing was 0.56% for the wind speed model and 1.96% for GHI.
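The four error measures named above have standard definitions, sketched below in plain Python. This is an illustrative sketch rather than the authors' code: the function name and the sample values are hypothetical, and MAPE is undefined when an observed value is zero (for example, night-time GHI), which a real pipeline would need to handle.

```python
import math


def forecast_errors(actual, predicted):
    """Compute RMSE, MAE, R^2 and MAPE (in percent) for paired series."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    # MAPE divides by the observed value, so zeros in `actual` are not allowed.
    mape = 100.0 * sum(abs(r / a) for r, a in zip(residuals, actual)) / n
    mean_actual = sum(actual) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "r2": r2, "mape": mape}


# Hypothetical wind speeds (m/s): observed values and one model's forecasts.
observed = [2.0, 4.0, 6.0, 8.0]
forecast = [2.5, 3.5, 6.5, 7.5]
scores = forecast_errors(observed, forecast)
print(scores)
```

Computing the same dictionary for each individual model and each ensemble gives a like-for-like basis for a comparison of the kind described.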
ARTICLE | doi:10.20944/preprints202306.1744.v1
Subject: Business, Economics And Management, Business And Management Keywords: Macroeconomic Forecasting Model; Scheduling Algorithm; Enterprise Benefit Optimization; Resource Management
Online: 26 June 2023 (05:34:15 CEST)
For an enterprise, resource management is the most critical aspect of development and has a significant impact on all parts of the business. Enterprises must therefore pay attention to resource allocation, which promotes sustainable development and helps obtain optimal benefits. In modern enterprises, the acquisition of benefits also depends on how resources are allocated in enterprise management. This paper proposes a benefit optimization scheduling algorithm based on a macroeconomic prediction model under an auction mechanism, and a grid resource scheduling algorithm driven by a benefit function, to allocate resources reasonably. Macroeconomic forecasting models are used to understand the resource needs of the various departments and the resources held by the enterprise, and to allocate resources rationally among departments. This can improve departmental work efficiency, thereby reducing costs and achieving optimal benefits for the enterprise. The experimental results show that, with a task volume of 125 in a multi-user environment, the cost of resource management allocation was 105.6 yuan for the benefit optimization scheduling algorithm under the auction mechanism and 46.8 yuan for the benefit-function-driven grid resource scheduling algorithm, and the time taken was 36.6 s and 18.9 s, respectively. The benefit-function-driven grid resource scheduling algorithm therefore had lower cost and time for resource management allocation and can achieve enterprise benefit optimization.
ARTICLE | doi:10.20944/preprints202305.0445.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: GDP; deep learning; time fusion transformers; multi-horizon forecasting; interpretability
Online: 8 May 2023 (04:47:10 CEST)
This paper applies a new artificial intelligence architecture, the Temporal Fusion Transformer (TFT), to the joint GDP forecasting of 25 OECD countries at different time horizons. This new attention-based architecture offers significant advantages over other deep learning methods. First, results are interpretable, since the impact of each explanatory variable on each forecast can be calculated. Second, it allows persistent temporal patterns to be visualized and significant events and different regimes to be identified. Third, it provides quantile regressions and allows the model to be trained on multiple time series from different distributions. Results suggest that TFTs outperform regression models, especially in periods of turbulence such as the COVID-19 shock. Interesting economic interpretations are obtained depending on whether a country's growth is domestic demand-led or export-led. In essence, TFT is revealed as a new tool that artificial intelligence provides to economists and policy makers, with enormous prospects for the future.
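Quantile forecasts of this kind are usually trained and scored with the quantile (pinball) loss. The sketch below shows that standard loss in plain Python; it is not taken from the paper, and the function name and sample values are hypothetical.

```python
def pinball_loss(actual, predicted, quantile):
    """Mean pinball loss of quantile forecasts at the given quantile level."""
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(actual, predicted):
        diff = y - q
        # Under-prediction (diff > 0) is weighted by `quantile`,
        # over-prediction (diff < 0) by `1 - quantile`.
        total += max(quantile * diff, (quantile - 1.0) * diff)
    return total / len(actual)


# Hypothetical GDP growth values (%) and a 0.9-quantile forecast.
under = pinball_loss([2.0], [1.0], 0.9)  # forecast below the outcome
over = pinball_loss([1.0], [2.0], 0.9)   # forecast above the outcome
print(under, over)
```

For a high quantile such as 0.9, falling below the outcome costs more than overshooting it by the same amount, which is what pushes the fitted curve towards the upper part of the distribution.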
ARTICLE | doi:10.20944/preprints202304.0144.v1
Subject: Engineering, Other Keywords: COVID-19; Machine learning; Demand Forecasting; Neural network; LSTM
Online: 10 April 2023 (04:27:16 CEST)
Accurate demand forecasting plays a critical role in most furniture businesses’ operational, tactical, and strategic decisions, as demand in the furniture business is considered seasonal and becomes more complex in crises. In this work, a neural network model using the Long Short-Term Memory (LSTM) method was developed to forecast the demand for specific product groups. LSTM is a leading deep learning model for time series prediction, particularly in seasonal, multi-item, and non-linear situations. The developed model was used to predict demand based on old data from before the COVID-19 pandemic and recent data from the first months of the pandemic, as a fast response to the crisis. In addition, a comparison study was conducted between the developed model and the traditional inventory planning used by the furniture businesses that provided us with the data. The results showed that the COVID-19 pandemic significantly impacted demand forecasting. Also, the fast response to the COVID-19 pandemic slightly increased the model performance. Finally, the comparison study demonstrated that our model is robust and better than the traditional demand forecasting method. Therefore, the developed model may help the business improve inventory and production planning to create a more flexible supply chain.
ARTICLE | doi:10.20944/preprints202301.0006.v1
Subject: Business, Economics And Management, Business And Management Keywords: Financial distress; Dual system banking; Loan Loss Provision; forecasting; econometrics
Online: 3 January 2023 (07:19:41 CET)
Nowadays, many Muslim-majority countries have implemented a dual banking system comprising sharia and conventional systems. Islamic banks have developed to meet Muslims' need for halal transactions in financial institutions. However, in some countries conventional banks still dominate the economy. It is therefore necessary to examine whether financial risk and earnings management differ between Islamic and conventional banks. The sample consists of conventional and Islamic banks in Southeast Asia, selected by purposive sampling, over 2010-2019. The analytical tools are a statistical difference test and econometric analysis using generalized least squares (GLS) regression with panel data (time series and cross-sectional data). These models are intended to forecast the macroeconomic effects of applying a dual banking system in one country or region. The non-parametric mean difference test shows that the first hypothesis is accepted, meaning that earnings management in conventional banks is greater than in Islamic banks. The random effects model (REM) used to test the second and third hypotheses on conventional banks shows that bankruptcy risk and NPL do not affect the dependent variable, earnings management (LLP). In the fixed effects model for Islamic banks, the second and third hypotheses are likewise rejected: bankruptcy risk (z-score) and non-performing loans (NPL) do not affect earnings management. Hypotheses 2 and 3 are thus rejected for both conventional and Islamic banking. A sensitivity analysis pooling conventional and Islamic banks in a fixed effects model also shows that the independent variables (bankruptcy risk and NPL) do not affect the dependent variable, earnings management (LLP).
It can be concluded from these results that Islamic banks engage in less earnings management. More research comparing dual banking systems within one region is therefore still needed in the long run.
ARTICLE | doi:10.20944/preprints202211.0278.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Long Short-Term Memory; time series forecasting; commodities; technical analysis
Online: 15 November 2022 (07:00:55 CET)
This article presents the implementation of a model to estimate the future price of commodities in the Brazilian market from time series of short-term technical evaluation. For this, data from two databases were used, one referring to the foreign market (opening, maximum, minimum, closing, adjusted closing, and volume values) and the other to the Brazilian market (the price of the day), for the commodities sugar, cotton, corn, soybean, and wheat. Technical indicators were then calculated with the TA-Lib technical analysis library. Pearson’s correlation coefficient was applied, records with low correlation were removed, and the database was consolidated. From the pre-processed data, Long Short-Term Memory (LSTM) recurrent neural networks were used to predict prices one and three days ahead. These models were evaluated using the mean squared error (MSE), obtaining results between 0.00010 and 0.00037 on test data one day ahead, and from 0.00017 to 0.00042 three days ahead. Based on these results, the developed model showed promising forecasting performance for all the commodities evaluated. A main contribution is the consolidation of databases that can be used in future scientific research. Furthermore, based on its interpretation, the model can assist in decision making regarding the buying and selling of commodities to increase financial gains.
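The correlation-based filtering step can be sketched as follows, assuming the filter compares each indicator column with the target price and drops weakly correlated ones. The abstract does not give the threshold or the exact procedure, so the 0.5 cut-off, the function names, and the sample columns are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (sx * sy)


def select_features(features, target, threshold=0.5):
    """Keep columns whose absolute correlation with the target meets the threshold."""
    return {
        name: column
        for name, column in features.items()
        if abs(pearson(column, target)) >= threshold
    }


# Hypothetical daily price and two made-up indicator columns.
price = [1.0, 2.0, 3.0, 4.0, 5.0]
indicators = {
    "moving_average": [2.0, 4.0, 6.0, 8.0, 10.0],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],
}
selected = select_features(indicators, price)
print(list(selected))
```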
REVIEW | doi:10.20944/preprints202208.0031.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: LSTM; GMDH; ANFIS; Ensemble Learning Models; Wavelet; Time Series Forecasting
Online: 2 August 2022 (03:50:14 CEST)
To improve the monitoring of the electrical power grid it is necessary to evaluate the influence of contamination in relation to leakage current and its progression to a disruptive discharge. In this paper, insulators were tested in a saline chamber to simulate the increase of salt contamination on their surface, and to evaluate the supportability of these components. From the time series forecasting of the leakage current, it is possible to evaluate the development of the fault before a flashover occurs. Choosing which method to use is always a difficult task since some models may have higher computational effort. In this paper, for a complete evaluation, the long short-term memory (LSTM), group method of data handling (GMDH), adaptive neuro-fuzzy inference system (ANFIS), bootstrap aggregation (bagging), sequential learning (boosting), random subspace, and stacked generalization (stacking) ensemble learning models are analyzed. A review and comparison of these well-established methods for time series forecasting is performed. From the results of the best structure of the model, the hyperparameters are evaluated and the Wavelet transform is used to obtain an enhanced model.
ARTICLE | doi:10.20944/preprints202203.0037.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: machine learning; neural network; forecasting system; western Pacific subtropical high
Online: 2 March 2022 (07:41:45 CET)
The ridge line of the western Pacific subtropical high (WPSHRL) plays an important role in determining the shift of the summer rain belt in eastern China. In this study, we developed a forecast system for the June WPSHRL index based on the latest autumn and winter sea surface temperature (SST). Considering the adverse condition of the small observed sample size, a very simple neural network (NN) model was selected to extract the non-linear relationship between input predictors (SST) and target predictands (WPSHRL) in the forecast system. In addition, some techniques are used to deal with the adverse condition, enhance the stabilization of forecast skills, and analyze the interpretability of the forecast system. The forecast experiments show that the linear correlation coefficient between the predictions from the forecast system and their corresponding observations is around 0.6, and about three-fifths of the observed abnormal years (the years with an obviously high or low WPSHRL index) are successfully predicted. Furthermore, sensitivity experiments show that the forecast system is relatively stable in terms of forecast skill. The above evaluations suggest that the forecast system is valuable in a real application sense.
ARTICLE | doi:10.20944/preprints202202.0192.v1
Subject: Medicine And Pharmacology, Obstetrics And Gynaecology Keywords: uterine cervical neoplasms; mortality; age-period-cohort analysis; forecasting; Brazil
Online: 16 February 2022 (05:03:16 CET)
Cervical cancer is a public health issue with high disease burden and mortality in Brazil. The objectives of the present study were to analyze age, period, and cohort effects on cervical cancer mortality in women 20 years old or older from 1980 to 2019 in the North, South, and Southeast Regions of Brazil, and to evaluate whether the implementation of a national screening program and the expansion of access to public health services had an impact over the examined period and reduced the risk of death in recent years and among younger cohorts. The effects were estimated by applying Poisson regression models with estimable functions. The highest mortality rate per 100,000 women was found in Amazonas (24.13), and the lowest in São Paulo (10.56). A positive gradient was obtained for death rates as women’s age increased. The states in the most developed regions (South and Southeast) showed a reduction in the risk of death in the period that followed the implementation of the screening program and in cohorts from the 1960s onwards. The North Region showed a decreased risk of death only in Amapá (2000–2004) and Tocantins (1995–2004; 2010–2019). The findings indicated that health inequities remain in Brazil and suggested that the health system has limitations in decreasing mortality associated with this type of cancer in regions with lower socioeconomic development.
ARTICLE | doi:10.20944/preprints202003.0158.v1
Subject: Engineering, Energy And Fuel Technology Keywords: energy; demand; forecasting; deep; learning; machine; convolutional; artificial; neural; networks
Online: 10 March 2020 (03:40:31 CET)
This paper investigates the use of deep learning techniques to perform energy demand forecasting. Specifically, the authors adapted a deep neural network originally designed for image classification, composed of a convolutional neural network (CNN) followed by a multilayered fully connected artificial neural network (ANN). The convolutional part of the network was fed with a grid of temperature forecast data distributed over the area of interest in order to extract a featured temperature. The subsequent ANN is then fed with this calculated temperature along with other data related to the timing of the forecast. The proposed structure was first trained and then used in a real setting to forecast French energy demand using ARPEGE weather forecast data. The results show that the performance of this approach is in line with that of the reference RTE subscription-based service, which opens the possibility of obtaining high-accuracy forecasts using widely accessible deep learning techniques through open-source machine learning platforms.
ARTICLE | doi:10.20944/preprints202003.0096.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep learning; Energy demand; Temporal convolutional network; Time series forecasting
Online: 5 March 2020 (15:02:37 CET)
Modern energy systems collect high volumes of data that can provide valuable information about energy consumption. Electric companies can now use historical data to make informed decisions on energy production by forecasting the expected demand. Many deep learning models have been proposed to deal with this type of time series forecasting problem. Deep neural networks, such as recurrent or convolutional networks, can automatically capture complex patterns in time series data and provide accurate predictions. In particular, Temporal Convolutional Networks (TCN) are a specialised architecture that has advantages over recurrent networks for forecasting tasks. TCNs are able to extract long-term patterns using dilated causal convolutions and residual blocks, and can also be more efficient in terms of computation time. In this work, we propose a TCN-based deep learning model to improve the predictive performance in energy demand forecasting. Two energy-related time series with data from Spain have been studied: the national electric demand, and the power demand at charging stations for electric vehicles. An extensive experimental study has been conducted, involving more than 1900 models with different architectures and parametrisations. The proposed TCN outperforms the forecasting accuracy of Long Short-Term Memory (LSTM) recurrent networks, which are considered the state of the art in the field.
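The dilated causal convolution mentioned above can be illustrated with a single one-dimensional filter. The sketch below is a minimal plain-Python version under simplifying assumptions (one channel, zero padding on the left, no residual block or activation); the function name and sample values are hypothetical and the paper's actual architecture is not reproduced here.

```python
def dilated_causal_conv(series, weights, dilation=1):
    """Apply a 1-D causal convolution with the given dilation.

    out[t] = sum_k weights[k] * series[t - k * dilation],
    where samples before the start of the series count as zero.
    Only present and past samples are used, never future ones.
    """
    out = []
    for t in range(len(series)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * series[idx]
        out.append(acc)
    return out


# Hypothetical hourly demand values and a two-tap filter with dilation 2,
# so each output combines the current sample and the one two steps earlier.
demand = [1.0, 2.0, 3.0, 4.0, 5.0]
result = dilated_causal_conv(demand, [1.0, 1.0], dilation=2)
print(result)  # [1.0, 2.0, 4.0, 6.0, 8.0]
```

Stacking such layers with dilations 1, 2, 4, and so on widens the receptive field exponentially with depth, which is how a TCN can reach far back in the series with few layers.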
ARTICLE | doi:10.20944/preprints201806.0365.v1
Subject: Engineering, Control And Systems Engineering Keywords: ARIMA model; data forecasting; multi-objective genetic algorithm; regression model
Online: 24 June 2018 (07:48:49 CEST)
The aim of this study was to develop a novel two-level multi-objective genetic algorithm (GA) to optimize time series forecasting data for fans used in road tunnels by the Swedish Transport Administration (Trafikverket). Level 1 forecasts time series cost data, while level 2 evaluates the forecasts. Level 1 implements either a multi-objective GA based on the ARIMA model or a multi-objective GA based on the dynamic regression model. Level 2 uses a multi-objective GA based on different forecasting error rates to identify an appropriate forecast. Our method is compared with using the ARIMA model only. The results show the drawbacks of time series forecasting using only the ARIMA model. In addition, the results of the two-level model show the drawbacks of forecasting using a multi-objective GA based on the dynamic regression model. A multi-objective GA based on the ARIMA model produces better forecasting results. In level 2, five forecasting accuracy functions help in selecting the best forecast. The selection of an appropriate forecasting methodology is based on the averages of the forecasted data, the historical data, the actual data and the polynomial trends. The forecasted data can be used for life cycle cost (LCC) analysis.
ARTICLE | doi:10.20944/preprints201609.0103.v1
Subject: Computer Science And Mathematics, Computational Mathematics Keywords: Maximum entropy model; K-means clustering; accuracy; classification; sports forecasting
Online: 27 September 2016 (11:10:50 CEST)
Predicting the outcome of a future game between two National Basketball Association (NBA) teams poses a challenging problem of interest to statistical scientists as well as the general public. In this article, we formalize the prediction of game results as a classification problem and apply the principle of maximum entropy to construct an NBA maximum entropy (NBAME) model that fits discrete statistics for NBA games, and then use the NBAME model to predict the outcomes of NBA playoffs. The best NBAME model correctly predicts the winning team 74.4 percent of the time, compared with 69.3 percent for some other machine learning algorithms.
ARTICLE | doi:10.20944/preprints201607.0001.v1
Subject: Business, Economics And Management, Finance Keywords: PUN; artificial intelligence models; regression tree; bootstrap aggregation; forecasting error
Online: 2 July 2016 (03:48:36 CEST)
Electricity price forecasting has become a crucial element for both private and public decision-making. Its importance has been growing since the worldwide wave of deregulation and liberalization of the energy sector in the late 1990s. Against this background, this paper seeks a precise and flexible hourly forecasting model for the wholesale electricity price in the Italian power market. We use artificial intelligence models such as neural networks and bagged regression trees, which are rarely used to forecast electricity prices. After model calibration, our final model is bagged regression trees with exogenous variables. The selected model outperformed the neural network and the bagged regression with single price used in this paper; it also outperformed other statistical and non-statistical models used in other studies. We also confirm some theoretical specifications of the model. As a policy implication, this model might be used by energy traders, transmission system operators and energy regulators for an enhanced decision-making process.
ARTICLE | doi:10.20944/preprints202311.0977.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: machine learning; neural networks; PV power forecasting; smart meters; solar energy
Online: 15 November 2023 (09:52:40 CET)
The power system has grown and expanded rapidly over the past decades and has been experiencing major changes and challenges. The increase in energy demand and modern advancements in the smart grid, such as solar and wind energy and electric vehicles, have created complexity and complications for utilities. A further layer of complexity and difficulty was added by the rapid expansion of behind-the-meter (BTM) photovoltaic (PV) systems with various designs and characteristic features. The rapid growth of this invisible (BTM) solar power has led to fluctuations in power grid stability and reliability and to inefficiency. Accurate forecasting of load generation will help to ensure optimal planning, minimize the negative effects of the PV systems, and minimize operational and maintenance costs. The authors propose a solution that combines K-means clustering with neural network machine learning models, real-world AMI PV load generation data, and weather data to forecast the generation load at customer locations, achieving a 2.49% error between actual and predicted generation load.
ARTICLE | doi:10.20944/preprints202311.0438.v1
Subject: Business, Economics And Management, Accounting And Taxation Keywords: Artificial Intelligence; Forecasting; Business Failure; Financial Sustainability; Financial Indicators; Transport Sector
Online: 7 November 2023 (11:19:43 CET)
The transport sector is pivotal and indispensable in our daily existence, being the exclusive conduit for the intercontinental conveyance of commodities and individuals. Hence, comprehending the phenomenon of business failure within the transport industry assumes paramount significance in delineating an effective competitive policy for this sector. The primary objective of this research paper is to execute a comparative investigation between statistical models forecasting business failure and models founded on artificial intelligence, within the context of the transport sector. This analysis spans the temporal expanse from 2014 to 2021 and encompasses the nations of Portugal, Spain, France, and Italy, aiming to ascertain superior efficacy among these models. The dataset employed for this endeavor encompassed a final cohort of 4866 companies, comprising 2881 that endured as going concerns and 1985 that succumbed to business failure. The models developed for this study exhibited the capacity to accurately categorize a proportion of companies ranging from 71% to 73%. Nonetheless, upon comparative scrutiny of these outcomes with those derived from the statistical models dedicated to business failure prediction, it becomes evident that the latter demonstrate an enhanced predictive prowess, manifesting fewer errors in the classification of the scrutinized companies.
ARTICLE | doi:10.20944/preprints202309.0354.v1
Subject: Biology And Life Sciences, Animal Science, Veterinary Science And Zoology Keywords: sheep milk constituents; Brix refractometer; milk protein percentages; Refractometer forecasting ability.
Online: 6 September 2023 (10:12:17 CEST)
In this study, 737 individual sheep milk samples were collected to evaluate the relationships between Brix refractometer measurements and milk constituents—particularly, protein and fat percentages—with the aim of verifying the ability of the refractometer to predict milk constituents. The Pearson's simple (rSP) and partial (rPP) correlations between milk constituents were calculated, and several first- and second-order regressions were tested to predict the protein and fat percentages. The results of the forecasts can be considered satisfactory only for the simple regression that predicted the percentage of milk protein through the measurements read with the Brix refractometer (PRT = -2.996 + 0.639*Brix), while the regression that predicted the percentage of fat + milk protein presented a weak forecasting capacity, which was probably due to the absence of partial correlations between the Brix refractometer measurements and the fat percentage.
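The simple regression quoted in this abstract can be applied directly. The sketch below only evaluates the reported equation PRT = -2.996 + 0.639*Brix; the function name and the example Brix reading are illustrative assumptions, not part of the study.

```python
def predict_protein_pct(brix: float) -> float:
    """Predict sheep milk protein percentage from a Brix refractometer reading.

    Uses the coefficients reported in the abstract:
    PRT = -2.996 + 0.639 * Brix.
    """
    return -2.996 + 0.639 * brix


if __name__ == "__main__":
    # Hypothetical reading of 10.0 degrees Brix, chosen only for illustration.
    print(round(predict_protein_pct(10.0), 3))
```

The equation is only meaningful within the range of Brix values observed in the study, which the abstract does not state.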
REVIEW | doi:10.20944/preprints202308.1578.v1
Subject: Engineering, Mechanical Engineering Keywords: forecasting; prevision; wind speed; wind power; renewable energy; Scopus base; Bibliometrix
Online: 23 August 2023 (07:27:24 CEST)
The most important step in the installation of a wind farm is to know the wind regime in the region. Because wind power scales with the cube of wind speed, an error in the estimated wind speed is amplified in the estimated power, resulting in financial losses for investors. Knowing the methods used for predicting wind speed is therefore important, and knowing how research in this area is progressing helps map the subject and outline strategies for developing research in strategic areas. For this purpose, the Scopus database was searched with keywords such as ("forecast" OR "prevision") AND "wind" AND ("turbine" OR "power" OR "energy" or "velocity" or "speed"), considering the period since 2019, and the data of the documents found were analyzed using the Bibliometrix package. With the results, it was possible to map the researchers and institutions that are developing work in this area, as well as the most cited articles, among other aspects analyzed.
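The sensitivity of power estimates to wind speed errors can be illustrated with a short calculation. This is a minimal sketch assuming the standard relation that available wind power is proportional to the cube of wind speed; the function name is hypothetical and turbine-specific power curves are ignored.

```python
def relative_power_error(relative_speed_error: float) -> float:
    """Relative error in estimated wind power for a given relative speed error.

    Assumes power is proportional to speed cubed, so a speed estimate that is
    off by a factor (1 + e) gives a power estimate off by (1 + e) ** 3.
    """
    return (1.0 + relative_speed_error) ** 3 - 1.0


if __name__ == "__main__":
    for e in (-0.10, 0.05, 0.10):
        print(f"speed error {e:+.0%} -> power error {relative_power_error(e):+.1%}")
```

Under this assumption, overestimating wind speed by 10% overestimates the available power by about 33%.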
ARTICLE | doi:10.20944/preprints202305.0934.v1
Subject: Computer Science And Mathematics, Applied Mathematics Keywords: Time Series; Forecasting, Deep Learning; Genetic Algorithm; Long Short-Term Memory
Online: 12 May 2023 (11:07:28 CEST)
Fluctuating stock prices make it difficult for investors to identify investment opportunities. Forecasting techniques are one tool that can help investors overcome this. Long Short-Term Memory (LSTM) is a deep learning method used for time series forecasting. The training and success of deep learning are strongly influenced by the selection of hyperparameters. This research uses a hybrid of the Genetic Algorithm (GA) and LSTM to find a suitable model for predicting stock prices. GA is used to optimize architectural settings such as the number of epochs, the window size, and the number of LSTM units in the hidden layer. The optimizer is also tuned by comparing several optimizers to achieve the best value. The results show that the method has a good level of accuracy, with MAPE values below 10% for every optimizer used. The error rate is quite low, with a minimum RMSE of 93.03 and 94.40 in case 1 and an RMSE of 104.99 and 150.06 in case 2 during training and testing. The Adam optimizer yields fairly stable and small error values.
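The two error measures reported here, MAPE and RMSE, can be computed as follows. This is a sketch using the usual textbook definitions; the abstract does not give the exact formulas the authors used, and the sample numbers are made up for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Illustrative prices only, not data from the study.
    actual = [100.0, 200.0]
    predicted = [110.0, 180.0]
    print(rmse(actual, predicted), mape(actual, predicted))
```

RMSE is expressed in the units of the price series, while MAPE is scale-free, which is why the abstract can compare MAPE against a 10% threshold.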
ARTICLE | doi:10.20944/preprints202108.0268.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Fuzzy collaborative intelligence; Dynamic random access memory; Fuzzy weighted intersection; Forecasting
Online: 11 August 2021 (18:08:46 CEST)
In a collaborative forecasting task, experts may have unequal authority levels. However, this has rarely been considered reasonably in the existing fuzzy collaborative forecasting methods. In addition, experts may not be willing to discriminate their authority levels. To address these issues, an auto-weighting fuzzy weighted intersection (FWI) fuzzy collaborative intelligence approach is proposed in this study. In the proposed auto-weighting FWI fuzzy collaborative intelligence approach, experts’ authority levels are automatically and reasonably assigned based on their past forecasting performances. Subsequently, the auto-weighting FWI mechanism is established to aggregate experts’ fuzzy forecasts. The theoretical properties of the auto-weighting FWI mechanism have been discussed and compared with those of the existing fuzzy aggregation operators. After applying the auto-weighting FWI fuzzy collaborative intelligence approach to a case of forecasting the yield of a DRAM product from the literature, its advantages over several existing methods were clearly illustrated.
ARTICLE | doi:10.20944/preprints202104.0138.v1
Subject: Business, Economics And Management, Accounting And Taxation Keywords: Energy consumption; BRICS; GM (1, 1); Fractional-order; GREY; Forecasting accuracy
Online: 5 April 2021 (13:51:38 CEST)
Brazil, Russia, China, India, and the Republic of South Africa (BRICS) represent developing economies facing different energy and economic development challenges. The current study aims to forecast energy consumption in BRICS at aggregate and disaggregate levels using an annual time series data set from 1992 to 2019 and to compare results obtained from a set of models. The time-series data are from the British Petroleum (BP-2019) Statistical Review of World Energy. The forecasting methodology is based on a novel Fractional-order Grey Model (FGM) with different order parameters. This study contributes to the literature by comparing the forecasting accuracy and forecasting ability of the FGM(1,1) with those of traditional models, such as the standard GM(1,1) and ARIMA(1,1,1) models. It also illustrates BRICS's energy consumption nexus at aggregate and disaggregate levels using the latest available data set, which will provide a reliable and broader perspective. The Diebold-Mariano test results confirmed the equal predictive ability of FGM(1,1) for a specific range of order parameters and the ARIMA(1,1,1) model, and the usefulness of both approaches for efficient forecasting of energy consumption.
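For readers unfamiliar with grey models, the standard GM(1,1) baseline mentioned in this abstract can be sketched in a few lines. This is the classical first-order grey model, not the fractional-order FGM proposed by the authors; the function name, the minimum-length check, and the sample series are illustrative assumptions.

```python
import math


def gm11_forecast(x0, steps=1):
    """Forecast `steps` future values with the classical GM(1,1) grey model."""
    n = len(x0)
    if n < 4:
        raise ValueError("GM(1,1) is conventionally fitted on at least 4 points")

    # 1-AGO: accumulated (cumulative sum) series x1.
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)

    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(x0[1:])

    # Least squares fit of the grey equation x0(k) + a * z(k) = b,
    # i.e. y = slope * z + intercept with a = -slope and b = intercept.
    m = n - 1
    z_mean, y_mean = sum(z) / m, sum(y) / m
    sxx = sum((zi - z_mean) ** 2 for zi in z)
    sxy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    slope = sxy / sxx
    a, b = -slope, y_mean - slope * z_mean
    if a == 0:
        raise ValueError("development coefficient is zero; series has no trend")

    def x1_hat(k):
        # Time response of the accumulated series at 0-based index k.
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Inverse AGO: difference consecutive accumulated predictions.
    return [x1_hat(n - 1 + s) - x1_hat(n - 2 + s) for s in range(1, steps + 1)]


if __name__ == "__main__":
    # Illustrative series growing 10% per period, not data from the study.
    print(gm11_forecast([100.0, 110.0, 121.0, 133.1, 146.41], steps=2))
```

On a short, near-exponential series such as the one above, the GM(1,1) forecast stays close to the true continuation, which is the setting where grey models are usually applied.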
ARTICLE | doi:10.20944/preprints202012.0619.v1
Subject: Engineering, Automotive Engineering Keywords: Adaptive Island; Minimum spanning tree; Load forecasting; Controllable switches; Distributed generations
Online: 24 December 2020 (12:35:42 CET)
Power system vulnerability leads to faults, and a severe fault may lead to prolonged load shedding. In extreme failure scenarios, the power system needs to be configured to protect the network from further contingencies and prolonged load shedding. Distributed generation resources (DGs) can be used to form intentional islands after faults, maintaining continuity of power supply to loads according to their weightage during faulty periods and reducing the overall load-shedding duration. The power system is bound to experience collapses, and a secondary collapse in the formed island is possible. This research presents a novel method of impedance-based path finding for intentional islanding, which adapts to changes in demand, DG outputs, or further severities during the restoration period. In this adaptive islanding approach, the network adjusts to changes in either the load demands or the renewable DG outputs and rearranges the restoration plan by curtailing or adding some of the loads through controllable switches. Furthermore, a secondary collapse in the existing island is studied by injecting multiple faults at various positions of the network to validate the system's resilience in coping with severities. A short-term load forecasting approach is used to predict changes in load demands and variations in DG outputs during the islanding scheme. During the restoration period, these variations are tracked and the islands are modified accordingly. To minimize the overall generation cost by using less fuel, an economical approach is used in the selection of controllable DGs. The proposed approach is formulated as a multi-objective problem that incorporates several operational constraints, and simulation is carried out using the modified IEEE 69-bus distribution system to assess the efficacy of the proposed model.
ARTICLE | doi:10.20944/preprints202010.0565.v1
Subject: Biology And Life Sciences, Anatomy And Physiology Keywords: ARIMA, CPUE, Fish biomass landings, Forecasting, Lake Malombe, Time series approach
Online: 27 October 2020 (21:04:35 CET)
Lake Malombe fish stocks have been depleted by chronic overfishing. Various management approaches (co-management, command control, and ecosystem-based management to fisheries) have been used to manage the fishery. However, the lack of an accurate predictive model has hampered their success. Therefore, we developed and tested a time series model for the Lake Malombe fishery. The seasonal fish biomass and CPUE trends were first examined, and both were non-stationary. Second-order differencing was applied to make the data stationary. Autocorrelation functions (AC), partial autocorrelation functions (PAC), and the Akaike information criterion (AIC) were estimated, which led to the identification and construction of autoregressive integrated moving average (ARIMA) models suitable for explaining the time series and for forecasting. The results showed that ARIMA (1,2,1) provided better predictions than its counterparts. The model satisfactorily predicted that by 2032, both fish biomass and CPUE will decrease to 3204.6 tons and 59.672 respectively, signifying the potential threat to the Lake Malombe fishery. The model justified the necessity of taking precautionary measures to avoid the total collapse of the fishery.
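The second-order differencing step, which corresponds to d = 2 in ARIMA (1,2,1), can be sketched as follows. This is a generic illustration of differencing; the helper name and the toy series are assumptions, and no data from the study are reproduced.

```python
def difference(series, order=1):
    """Apply differencing `order` times: each pass replaces the series
    with the changes between consecutive values."""
    out = list(series)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


if __name__ == "__main__":
    # A toy series with a quadratic trend; two passes remove the trend.
    toy = [1, 4, 9, 16, 25]
    print(difference(toy, order=1))  # [3, 5, 7, 9]
    print(difference(toy, order=2))  # [2, 2, 2]
```

Each pass shortens the series by one observation, so a second-order differenced series has two fewer points than the original.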
ARTICLE | doi:10.20944/preprints201811.0490.v1
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: forecasting; time series; vector autoregression (VAR), bayesian VAR; collinearity and autocorrelation
Online: 20 November 2018 (09:03:49 CET)
The goal of VAR or BVAR is to characterize the dynamics and endogenous relationships among time series. VAR models are also known for their applications to forecasting and policy analysis. This paper compares the performance of VAR and Sims-Zha Bayesian VAR models when the multiple time series are jointly influenced by different levels of collinearity and autocorrelation in the short term (T=16, 32, 64 and 128). Five levels (-0.9, -0.5, 0, +0.5, +0.9) of collinearity and autocorrelation were considered. The results of the simulation study revealed that the VAR(2) model dominated for no and moderate levels of autocorrelation (-0.5, 0, +0.5) irrespective of the collinearity level, except in a few cases when T=16, while the BVAR models dominated for high autocorrelation levels (-0.9 and +0.9) irrespective of the collinearity level, except in a few cases when T=128. The performance of the models varies with the level of collinearity and autocorrelated error, and also with the length of the short-term period. Furthermore, the values of the RMSE and MAE criteria decrease as the time series length increases. In conclusion, the performance of the forecasting models depends on the time series data structure and the time series length. It is therefore recommended that the data structure and series length be considered when choosing an appropriate model for forecasting.
ARTICLE | doi:10.20944/preprints201807.0440.v2
Subject: Engineering, Automotive Engineering Keywords: BEV; ownership cost analysis; design of experiments; forecasting; Monte Carlo simulation
Online: 17 August 2018 (12:59:31 CEST)
This study evaluates eight-year ownership costs for battery electric vehicles (BEV) versus non-plugin hybrid vehicles using forecasting to estimate future electricity and conventional gasoline prices and incorporating these in a multiple design of experiments simulation. Results suggest that while electric vehicles are statistically dominant in terms of variable costs over an 8-year life-span, high-performance hybrid non-plugins achieve variable fuel costs nearly as good as low-performing electric vehicles (those attaining only 3 miles per kilowatt hour) and that these hybrid acquisition costs are (on average) lower yet the vehicles retain higher residual values. In general, the six smallest ownership costs are split evenly between hybrid and electric vehicles; however, inflation for conventional regular gasoline is estimated to outstrip inflation per kilowatt hour. Thus, non-plugin hybrid cars are likely to require considerably more advanced engineering to keep pace.
ARTICLE | doi:10.20944/preprints201609.0104.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: icing forecasting; fireworks algorithm; least square support vector machine; feature selection
Online: 27 September 2016 (11:16:44 CEST)
Accurate forecasting of icing thickness is of great significance for ensuring the security and stability of the power grid. To improve forecasting accuracy, this paper proposes an icing forecasting system based on the fireworks algorithm and the weighted least square support vector machine (W-LSSVM). The fireworks algorithm is employed to select the proper input features in order to eliminate redundant influences. The W-LSSVM model is then trained and tested on the historical data set with the selected features. The capability of the proposed icing forecasting model and framework is tested through simulation experiments using real-world icing data from the monitoring center of the key laboratory of anti-ice disaster, Hunan, South China. The results show that the proposed W-LSSVM-FA method has higher prediction accuracy and may be a promising alternative for icing thickness forecasting.
ARTICLE | doi:10.20944/preprints202311.1712.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: graph-based neural network, traffic forecasting, Internet of Things, Contiki operating system
Online: 27 November 2023 (15:00:05 CET)
The paper illustrates a general framework into which a neural network application can be easily integrated and proposes a traffic forecasting approach that uses graph-based neural networks. The method minimizes the load on the communication network (between vehicles and the database servers) and represents a reasonable trade-off between communication network load and forecasting accuracy. Traffic prediction leads to the choice of less congested routes and therefore to reduced energy consumption. The traffic is forecast using an LSTM neural network with a regression layer. The inputs of the neural network are sequences, obtained from the graph that represents the road network, at specific moments in time; they are either read from traffic sensors or taken from the outputs of the neural network (forecast sequences). The input sequences can be filtered to improve forecasting accuracy. The general framework is based on the Contiki IoT operating system, which provides support for wireless communication and efficient implementation of processes in a resource-constrained system, and it is particularized to implement a graph neural network. Two cases are studied: one in which the traffic sensors are read periodically and another in which the traffic sensors are read when changes in their values are detected. The two cases are compared and the influence of filtering is evaluated. The obtained accuracy is very good, very close to the accuracy obtained in infinite-precision simulation, and the computation time is low enough for the system to work in real time.
ARTICLE | doi:10.20944/preprints202309.1966.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: wind speed forecasting; deep learning; LSTM; GRU; wind energy; CEEMDAN; EMD; VMD
Online: 28 September 2023 (11:41:50 CEST)
Advancements in technology, policies, and cost reductions have led to rapid growth in wind power production. One of the major challenges in wind energy production is the instability of wind power generation due to weather changes. Efficient power grid management requires accurate power output forecasting. New wind energy forecasting methods based on deep learning outperform traditional methods such as numerical weather prediction, statistical models, and machine learning models, particularly for short-term prediction. Because methods, climates, and forecasting complexity are related, forecasting methods do not always perform the same across the climate and terrain of the data source. This paper proposes a novel model that combines the variational mode decomposition method with a long short-term memory model, developed for next-hour wind speed prediction in a hot desert climate, such as the climate in Saudi Arabia. We compared the performance of the proposed model with that of two other hybrid models, six deep learning models, and four machine learning models using different feature sets. We also tested the proposed model on data from different climates, Caracas and Toronto. The proposed model showed a forecast skill of 61% to 74% based on mean absolute error, 64% to 72% based on root mean square error, and 59% to 68% based on mean absolute percentage error for locations in Saudi Arabia.
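The abstract reports forecast skill without defining it. A common definition, assumed here and not confirmed by the source, is the fractional error reduction relative to a reference forecast such as persistence. The sketch below uses that assumption; the function name and sample error values are hypothetical.

```python
def forecast_skill(model_error: float, reference_error: float) -> float:
    """Skill score as the fractional error reduction over a reference forecast.

    Assumed definition: skill = 1 - model_error / reference_error.
    A value of 0 means no improvement; 1 means a perfect forecast.
    """
    if reference_error <= 0:
        raise ValueError("reference_error must be positive")
    return 1.0 - model_error / reference_error


if __name__ == "__main__":
    # Illustrative MAE values in m/s, not results from the paper.
    print(f"{forecast_skill(0.30, 1.00):.0%}")
```

The same function applies to MAE, RMSE, or MAPE, as long as the model and reference errors use the same metric.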
ARTICLE | doi:10.20944/preprints202309.1637.v1
Subject: Environmental And Earth Sciences, Water Science And Technology Keywords: Streamflow Data Assimilation; Flood forecasting; Tropical Andes; Satellite Precipitation Products; GR4H model
Online: 25 September 2023 (09:00:46 CEST)
Flood modeling and forecasting are key to managing and preparing for extreme flood events. Hydrological flood forecasting aims to predict the system response to different input changes with minimum uncertainties. In that sense, streamflow Data Assimilation (DA) seeks to combine errors between the hydrological model and water discharge observations through the update of model states. This paper aims to assess a sub-daily flood forecast system in a basin of the Peruvian Tropical Andes using two sequential data assimilation algorithms, the Ensemble Kalman Filter (EnKF) and the Particle Filter (PF). The study was conducted in the Vilcanota River basin during the rainiest months of 2022 to assess recent potential river floods. This basin is in the southern Peruvian Andes and was selected because it is continually affected by river floods such as those that occurred in 2010. For this purpose, the lumped GR4H rainfall-runoff model was run forward with 100 ensemble members using two different satellite precipitation sources (IMERG-E' and GSMaP-NRT'). In addition, four DA experiments (IMERG-E'+EnKF, IMERG-E'+PF, GSMaP-NRT'+EnKF, and GSMaP-NRT'+PF) were conducted by assimilating real-time hourly discharges at the Pisac stream gauge station to examine the improvement in forecast accuracy for lead times of 1 to 24 hours. Results display good forecast performance during the first 10 hours, especially for the GSMaP'+EnKF scheme. Finally, this work benchmarks the application of streamflow DA in an Andean basin of Peru with sparse data availability and will support the development of more accurate climate services in Peru through hydrologic ensemble predictions.
ARTICLE | doi:10.20944/preprints202309.1400.v1
Subject: Business, Economics And Management, Econometrics And Statistics Keywords: hierarchical forecasting; short time series; local approach; global approach; household gas consumption
Online: 21 September 2023 (05:39:44 CEST)
This study presents a novel approach for predicting hierarchical short time series. In this article, our objective was to formulate long-term forecasts for household natural gas consumption by considering the hierarchical structure of territorial units within a country's administrative divisions. For this purpose, we utilized natural gas consumption data from Poland. The length of the time series was an important determinant of the data set. We contrast global techniques, which employ a uniform method across all time series, with local methods that fit a distinct method for each time series. Furthermore, we compare the conventional statistical approach with a machine learning (ML) approach. Based on our analyses, we devised forecasting methods for short time series that exhibit exceptional performance. We have demonstrated that global models provide better forecasts than local models. Among ML models, neural networks yielded the best results, with the MLP network achieving comparable performance to the LSTM network while requiring significantly less computational time.
REVIEW | doi:10.20944/preprints202303.0451.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Accurate predictions; Deep Learning; Energy management; Machine Learning; Renewable Energy Forecasting
Online: 27 March 2023 (07:56:16 CEST)
This article presents a review of current advances and future prospects in the field of forecasting renewable energy generation using machine learning (ML) and deep learning (DL) techniques. With the increasing penetration of renewable energy sources (RES) into the electricity grid, accurate forecasting of their generation becomes crucial for efficient grid operation and energy management. Traditional forecasting methods have limitations, and thus ML and DL algorithms have gained popularity due to their ability to learn complex relationships from data and provide accurate predictions. This paper reviews the different approaches and models that have been used for renewable energy forecasting and discusses their strengths and limitations. It also highlights the challenges and future research directions in the field, such as dealing with uncertainty and variability in renewable energy generation, data availability, and model interpretability. Finally, this paper emphasizes the importance of developing robust and accurate renewable energy forecasting models to enable the integration of RES into the electricity grid and facilitate the transition towards a sustainable energy future.
ARTICLE | doi:10.20944/preprints202212.0534.v1
Subject: Environmental And Earth Sciences, Environmental Science Keywords: Indoor air quality; forecasting; machine learning; IoT; Covid-19; environmental mapping; pandemic.
Online: 28 December 2022 (09:10:46 CET)
The current COVID-19 pandemic has raised huge concerns for outdoor air quality due to expected lung deterioration. These concerns include the challenges in the scalable prediction of harmful gases like carbon dioxide, iterative/repetitive inhaling due to masks, and the harshness of environmental temperature. Even in the presence of air quality sensing devices, these challenges lead to failed planning and strategy against respiratory diseases, epidemics, and, in severe cases, pandemics. In this work, a novel optimized regression algorithm based on a dual time-series, bi-cluster sensor data stream is proposed, with optimization predictors and optimization responses that use automated iterative optimization of the model based on the similarity coefficient index. The algorithm was implemented on SeReNoV2 sensor node data, i.e. multi-variate dual time-series of environmental and US Environmental Protection Agency standard sensor variables for the air quality index, measured from air quality sensors with geospatial profiling. The SeReNoV2 systems were placed at four locations 3 km apart to monitor air quality, and their data were collected on the Ubidots IoT platform over GSM. Results show that the proposed technique achieved a root mean square error (RMSE) of 1.0042 with a training time of 469.28 seconds for normal and an RMSE of 1.646 with a training time of 28.53 seconds for optimization. An estimated R-Squared error of 0.03 was observed, with a Mean-Square Error of 1.0084 °C for temperature and 293.98 ppm for CO2. Furthermore, the Mean-Absolute Error (MAE) was 0.66226 °C for temperature and 10.252 ppm for CO2, at prediction speeds of ~5100 observations/second for temperature and 45000 observations/second for CO2; with iteratively optimized training times of 469.28 seconds for temperature and 28.53 seconds for CO2, these results are very promising for forecasting COVID-19 countermeasures ahead of time.
ARTICLE | doi:10.20944/preprints202004.0257.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: COVID-19; predictive analytics; forecasting models; Machine Learning Method; prediction; epidemic; pandemic
Online: 16 April 2020 (05:54:13 CEST)
Globally, there is a massive uptake and explosion of data, and the challenge is to address issues such as the scale, pace, velocity, variety, volume, and complexity of the data. Considering the recent epidemic in China, modeling the cumulative number of infected cases of the COVID-19 epidemic using the data available in the early phase was a big challenge. Because COVID-19 became a pandemic within a very short time span, it is very important to analyze the trend of its spread and of infected cases. Such predictive analytics can be empowered using Information and Communication Technologies (ICT) services, tools, and applications. This paper presents a medical perspective on COVID-19 in terms of the epidemiological triad and a study of the state of the art. The main aim of this paper is to present the different predictive analytics techniques available for trend analysis, along with different models and algorithms and their comparison. Finally, the paper concludes with a prediction of COVID-19 using the Prophet algorithm, indicating faster spread in the short term. These predictions will be useful to governments and healthcare communities in initiating appropriate measures to control this outbreak in time.
ARTICLE | doi:10.20944/preprints201609.0053.v1
Subject: Engineering, Control And Systems Engineering Keywords: electricity markets; price forecasting; multi-output models; random forests; conditional inference trees
Online: 18 September 2016 (06:16:19 CEST)
Predicting electricity prices is a very important issue in modern society, because the associated decision process under uncertainty requires accurate forecasts for the economic agents involved. In this paper, we apply the decision tree extension of Random Forests to the prediction of electricity prices in Spain, with the novelty of modeling prices jointly with demand. The purpose is to achieve greater accuracy than with univariate-response Random Forests, particularly in price prediction, and to understand the effect of the input variables (lagged values of price and demand, current production levels of available energy sources) on the two outputs jointly. The results are very encouraging, showing a significant increase in price prediction accuracy. Interesting methodological challenges also arise concerning the appropriate choice of the relative weights of price and demand in the joint modeling, and a new procedure for ranking variable importance is proposed. The R package partykit, which allows for multivariate Random Forests, has been used.
CASE REPORT | doi:10.20944/preprints202309.1851.v1
Subject: Engineering, Control And Systems Engineering Keywords: Iron- and steel-making enterprise; Oxygen system; Forecasting model; Scheduling model; Energy-saving
Online: 28 September 2023 (04:21:44 CEST)
Due to the imbalance between the supply and demand of oxygen, the oxygen system of iron- and steel-making enterprises in China has problems with high oxygen emission and high pressure in the pipelines, resulting in the energy consumption of oxygen production being high. To relieve this problem, using a large-scale iron- and steel-making mill as a case study, the research on demand forecasting and optimal scheduling of the oxygen system was carried out. The ARIMA model and the GABP model are employed to forecast oxygen demand. Based on the forecast results, an optimal scheduling model for the oxygen system was developed to conduct optimal scheduling. The case study shows that based on the oxygen demand forecast and the optimal scheduling, the oxygen emission and the pipeline pressure in the studied iron- and steel-making enterprise can be significantly reduced, thereby achieving considerable energy-saving effects and economic benefits. Specifically, the following conclusions are obtained: (1) For the oxygen demand forecast, the prediction accuracy of the GABP model is better than that of the ARIMA model. The average MAPE of the 12 sets of data of the ARIMA and GABP models are 23.8% and 20.2%, respectively. (2) By comparing the scheduling results and the field data, it is found that after the scheduling, the amount of oxygen emission has decreased by 6.32%, the pipeline pressure has decreased by 0.61%, and the energy consumption of oxygen compression has decreased by 1.6%. Considering both the oxygen emission loss and the energy consumption of oxygen compression, the total power consumption of the oxygen system is reduced by 1.38%, which saves the electricity cost of about 9.03 million RMB per year.
ARTICLE | doi:10.20944/preprints202308.1979.v1
Subject: Business, Economics And Management, Finance Keywords: deep learning; system engineering; stock price forecasting; aggregate dynamic behavior; generative adversarial network
Online: 30 August 2023 (03:03:52 CEST)
Current stock market forecasting methods encompass fundamental, technical, emotional, and bargaining factors. Predominantly, price prediction hinges on order volume and price, although correlating these two within existing models proves challenging. This study employs Cycle Generative Adversarial Network (Cycle GAN) to unravel the intricate price-volume relationship, combining it with Bollinger Bands for trading signal analysis, overcoming hurdles in short-term forecasting prevalent in numerical analysis and AI. Focusing on TSMC (2330.TW) stock price, the research leverages Cycle GAN in deep learning to master the price-volume nexus, juxtaposed with LSTM and RNN. Historical TSMC closing prices and transaction counts are model inputs, scrutinizing their interconnectedness for predictions. This innovative approach aligns stock price, volume, market value, taxes, and prior changes via system engineering. By intertwining Bollinger Bands with stock price forecasts, trading signals are distilled, factoring in extended index %b for a comprehensive market picture.
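The %b indicator mentioned in this abstract locates a price within its Bollinger Bands. The sketch below uses the conventional definition (a simple moving average plus and minus a multiple of the standard deviation over a rolling window); the window length, band width, function name, and sample prices are illustrative assumptions, and the abstract does not specify the settings used in the study.

```python
from statistics import mean, pstdev


def bollinger_percent_b(prices, window=20, num_std=2.0):
    """Return %b for the latest price: 0 at the lower band, 1 at the upper band.

    Bands are the simple moving average of the last `window` prices plus or
    minus `num_std` population standard deviations. Returns None when the
    window is flat and the bands coincide.
    """
    if len(prices) < window:
        raise ValueError("not enough prices for the requested window")
    recent = prices[-window:]
    middle = mean(recent)
    spread = num_std * pstdev(recent)
    if spread == 0:
        return None
    lower, upper = middle - spread, middle + spread
    return (prices[-1] - lower) / (upper - lower)


if __name__ == "__main__":
    # Made-up closing prices, not TSMC data.
    closes = [10.0, 14.0, 12.0]
    print(bollinger_percent_b(closes, window=3))
```

Values above 1 or below 0 indicate a close outside the bands, which is how %b is typically turned into a trading signal.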
ARTICLE | doi:10.20944/preprints202110.0302.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: load forecasting; extreme learning machine (ELM); ant lion optimization (ALO); parameter optimization; model.
Online: 21 October 2021 (09:34:56 CEST)
The load of a power system changes with economic development, and short-term load forecasting plays a very important role in the dispatching and management of the power system. In this paper, the Ant Lion Optimizer (ALO) is introduced to improve the input weights and hidden-layer matrix of the extreme learning machine (ELM). After the parameters of the ELM are optimized by ALO, the input nodes, hidden-layer nodes, and output nodes are determined, and a load forecasting model based on the combined ALO-ELM algorithm is established. The proposed method is illustrated using the historical load data of a city in China. The results show that the average absolute error of the short-term load demand predicted by the ALO-ELM model is 1.41, while that predicted by ELM is 4.34. The proposed ALO-ELM algorithm is superior to ELM and meets the requirements of engineering accuracy, which demonstrates the effectiveness of the proposed method.
ARTICLE | doi:10.20944/preprints202104.0398.v1
Subject: Computer Science And Mathematics, Applied Mathematics Keywords: COVID-19; epidemic modeling; time series forecasting; nonlinear growth models; Prais-Winsten estimation
Online: 15 April 2021 (07:44:35 CEST)
Since December 2019, the coronavirus disease (COVID-19) has rapidly spread worldwide. The Mexican government has implemented public safety measures to minimize the spread of the virus. In this paper, the authors use statistical models in two stages to estimate the total number of coronavirus (COVID-19) cases per day at the state and national level in Mexico. Two types of models are proposed: first, a polynomial model of the growth for the first part of the outbreak until the inflection point of the pandemic curve and then a second nonlinear growth model is used to estimate the middle and the end of the outbreak. Model selection will be performed using Vuong’s test. The proposed models show overall fit similar to predictive models (e.g. time series, and machine learning); however, the interpretation of parameters is less complex for decision-makers and the residuals follow the expected distribution when fitting the models without autocorrelation being an issue.
ARTICLE | doi:10.20944/preprints202009.0228.v1
Subject: Engineering, Control And Systems Engineering Keywords: multilayer perceptron; support vector machine; COVID19; SarsCov2; forecasting; machine learning; public health; pandemic
Online: 10 September 2020 (08:05:49 CEST)
This paper presents an approach based on Multilayer Perceptron and Support Vector Machine algorithms to predict the number of COVID19 infections in different countries of America. It is intended to serve as a tool for decision-making and for tackling the pandemic that the world is currently facing. The models were trained and tested using open data from the European Union repository, in which a time series of confirmed cases was modeled until May 25, 2020. Hyperparameters such as the number of neurons per layer were set using a tabu list algorithm. The countries selected for the study were Brazil, Chile, Colombia, Mexico, Peru and the United States. The metrics used are Pearson's correlation coefficient (CP), Mean Absolute Error (MAE), and Mean Percentage Error (MPE). For the testing stage we obtained the following results: Brazil, CP=0.65, MAE=2508 and MPE=17%; Chile, CP=0.64, MAE=504, MPE=16%; Colombia, CP=0.83, MAE=76, MPE=9%; Mexico, CP=0.77, MAE=231, MPE=9%; Peru, CP=0.76, MAE=686, MPE=18% and the United States of America, CP=0.93, MAE=799, MPE=4%. The resulting models are powerful machine learning tools, although it is necessary to use specific algorithms depending on the data and the stage of the country’s pandemic.
ARTICLE | doi:10.20944/preprints202201.0107.v1
Subject: Engineering, Energy And Fuel Technology Keywords: Very short term load forecasting; VSTLF; Short term load forecasting; STLF; deep learning; RNN; LSTM; GRU; machine learning; SVR; random forest; extreme gradient boosting; energy consumption; ARIMA; time series prediction.
Online: 10 January 2022 (12:17:35 CET)
Commercial buildings are a significant consumer of energy worldwide. Logistics facilities, and specifically warehouses, are a common building type yet under-researched in the demand-side energy forecasting literature. Compared to other commercial and industrial buildings, warehouses have an idiosyncratic profile, with a significant reliance on a small number of energy systems. As such, warehouse owners and operators are increasingly entering into energy performance contracts with energy service companies (ESCOs) to minimise environmental impact, reduce costs, and improve competitiveness. ESCOs and warehouse owners and operators require accurate forecasts of their energy consumption so that precautionary and mitigation measures can be taken. This paper explores the performance of three machine learning models (Support Vector Regression (SVR), Random Forest, and Extreme Gradient Boosting (XGBoost)), three deep learning models (Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU)), and a classical time series model, Autoregressive Integrated Moving Average (ARIMA), for predicting daily energy consumption. The dataset comprises 8,040 records generated over an 11-month period from January to November 2020 from a non-refrigerated logistics facility located in Ireland. The grid search method was used to identify the best configurations for each model. The proposed XGBoost models outperform other models for both very short term load forecasting (VSTLF) and short term load forecasting (STLF); the ARIMA model performed the worst.
ARTICLE | doi:10.20944/preprints202310.0310.v3
Subject: Business, Economics And Management, Econometrics And Statistics Keywords: forecasting aggregate demand; clustering time series; pivot clustering; ARMA model; order-up-to policy
Online: 25 October 2023 (13:20:27 CEST)
In this paper we compare the effects of forecasting demand using individual (disaggregated) components versus first aggregating the components either fully or into several clusters. Demand streams are assumed to follow autoregressive moving average (ARMA) processes. Using individual demand streams will always lead to a superior forecast compared to any aggregates; however, we show that if several aggregated clusters are formed in a structured manner, then these subaggregated clusters will lead to a forecast with a minimal increase in mean-squared forecast error (MSFE). We show this result based on theoretical MSFE obtained directly from the models generating the clusters as well as estimated MSFE obtained directly from simulated demand observations. We suggest a pivot algorithm, which we call Pivot Clustering, to create these clusters. We also provide theoretical results to investigate sub-aggregation, including for special cases such as aggregating MA(1) streams and ARMA streams with similar or identical parameters.
ARTICLE | doi:10.20944/preprints202309.0717.v1
Subject: Environmental And Earth Sciences, Environmental Science Keywords: Interevent time; Probability distributions; probabilistic forecasting; seismic cycle; statistical seismology; statistical methods; Bayesian inference
Online: 12 September 2023 (10:13:49 CEST)
The probability distribution of the interevent time between two successive earthquakes has been the subject of numerous studies due to its key role in seismic hazard assessment. In recent decades, many distributions have been considered, and there has been a long debate about the possible universality of the shape of this distribution when the interevent times are suitably rescaled. In this work we aim to find out whether there is a link between the different phases of a seismic cycle and the variations in the distribution that best fits the interevent times. To do this, we consider the seismic activity related to the Mw 6.1 L’Aquila earthquake that occurred on April 6, 2009 in central Italy by analyzing the sequence of events recorded from April 2005 to July 2009, and then the seismic activity linked to the sequence of the Amatrice-Norcia earthquakes, of Mw 6 and 6.5 respectively, recorded in the period from January 2009 to June 2018. We take into account some of the most studied distributions in the literature (q-exponential, q-generalized gamma, gamma and exponential distributions) and, according to the Bayesian paradigm, we compare the value of their posterior marginal likelihood in shifting time windows with a fixed number of data. The results suggest that the distribution providing the best performance changes over time and that its variations may be associated with different phases of the seismic crisis.
ARTICLE | doi:10.20944/preprints202308.1581.v1
Subject: Business, Economics And Management, Other Keywords: Demand forecasting methods; Fuzzy ELECTRE I; Extended modified discordance matrix; Hadamard product; Closeness coefficient
Online: 22 August 2023 (12:44:02 CEST)
The selection of a demand forecasting method is critical for companies aiming to avoid manufacturing overproduction or shortages in pursuit of sustainable development. Various qualitative and quantitative criteria with different weights must be considered during the evaluation of a forecasting method. The qualitative criteria and criteria weights are usually assessed in linguistic terms. Aggregating these various criteria and linguistic weights for evaluating and selecting demand forecasting methods in sustainable manufacturing is a major challenge. This paper proposes an extension of fuzzy Elimination and Choice Translating Reality (ELECTRE) I to resolve this problem. In the proposed method, fuzzy weighted ratings are defuzzified with the signed distance to develop a crisp ELECTRE I model. Moreover, an extension to ELECTRE I is developed by suggesting an extended modified discordance matrix and a closeness coefficient for ranking alternatives. The proposed extension can overcome the problem of information loss, which can lead to incorrect ranking results when the Hadamard product is used to combine the concordance and modified discordance matrices. A comparison is conducted to show the advantage of the proposed extension. Finally, a numerical example is used to demonstrate the feasibility of the proposed method; furthermore, a numerical comparison is made to display the advantage of the proposed method.
ARTICLE | doi:10.20944/preprints202308.1240.v1
Subject: Business, Economics And Management, Econometrics And Statistics Keywords: Attention Mechanism; Stock Forecasting; Deep Learning; Technical Analysis Method; Lightweight Automated Stock Trading System
Online: 17 August 2023 (09:04:02 CEST)
Individual investors often struggle to predict stock prices due to the limitations imposed by the computational capacities of personal laptop Graphics Processing Units (GPUs) when running intensive deep learning models. This study proposes solving these GPU constraints by integrating deep learning models with technical analysis methods. This integration significantly reduces analysis time and equips individual investors with the ability to identify stocks that may yield potential gains or losses in an efficient manner. Thus, a comprehensive buy and sell algorithm, compatible with average laptop GPU performance, is introduced in the study. This algorithm offers a lightweight analysis method that emphasizes factors identified by technical analysis methods, thereby providing a more accessible and efficient approach for individual investors. To evaluate the efficacy of this approach, we analyzed the performance of eight deep learning models: LSTM (4 layers), CNN, BiLSTM (4 layers), CNN Attention, BiGRU CNN BiLSTM Attention, BiLSTM Attention CNN, CNN BiLSTM Attention, and CNN Attention BiLSTM. These models were used to predict stock prices for Samsung Electronics and Celltrion Healthcare. The CNN Attention BiLSTM model displayed superior performance among these models, with the lowest validation mean absolute error value. In addition, an experiment was conducted using WandB Sweep to determine the optimal hyperparameters for four individual hybrid models. These optimal parameters were then implemented in each model to validate their back-testing rate of return. The CNN Attention BiLSTM hybrid model emerged as the highest-performing model, achieving an approximate rate of return of 5 percent. Overall, this study offers valuable insights into the performance of various deep learning and hybrid models in predicting stock prices. These findings can assist individual investors in selecting appropriate models that align with their investment strategies, thereby increasing their likelihood of success in the stock market.
ARTICLE | doi:10.20944/preprints202308.0292.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: crop production; agricultural production; time series forecasting; artificial intelligence; transformer; machine learning; deep learning
Online: 3 August 2023 (10:34:48 CEST)
Accurate prediction of crop production is essential for effectively managing agricultural countries' food security and economic resilience. This study evaluates the performance of statistical and machine learning-based methods for large-scale crop production forecasting. We predict the quarterly production of 325 crops (including fruits, vegetables, cereals, non-food, and industrial crops) across 83 provinces in the Philippines. Using a comprehensive dataset of 10,949 time series over 13 years, we demonstrate that a global forecasting approach using a state-of-the-art deep learning architecture, the transformer, significantly outperforms traditional local forecasting approaches built on statistical and baseline methods. By leveraging cross-series information, our proposed approach is scalable and works well even with time series that are short, sparse, intermittent, or exhibit structural breaks/regime shifts. The results of this study further advance the field of applied forecasting in agricultural production and provide a practical and effective decision-support tool for policymakers who oversee the farm sector on a national scale.
ARTICLE | doi:10.20944/preprints202305.0111.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Greek power system; electric load forecasting; block-diagonal neurons; fuzzy neural network; internal feedback
Online: 3 May 2023 (07:34:20 CEST)
A block-diagonal fuzzy neural network for short-term load forecasting, DBD-FELF, is proposed. It consists of fuzzy rules whose consequent parts are neural networks with internal recurrence. These networks have a hidden layer that consists of pairs of neurons with feedback connections between them. The overall fuzzy model partitions the input space into partially overlapping fuzzy regions, where the recurrent neural networks of the respective rules operate. The partition of the input space and the determination of the fuzzy rule base are performed using the Fuzzy C-Means clustering algorithm, and the RENNCOM constrained optimization method is applied for consequent parameter tuning. The electric load time series of the Greek power system is examined, and hourly forecasting for the whole year is performed. The performance of DBD-FELF is tested via extensive experimental analysis and the results are promising, since an average percentage error of 1.18% is attained, along with an average yearly absolute error of 76.2 MW. Moreover, DBD-FELF is compared with Deep Learning, fuzzy and neurofuzzy rivals, so that its particular attributes are highlighted.
ARTICLE | doi:10.20944/preprints202206.0086.v1
Subject: Computer Science And Mathematics, Computational Mathematics Keywords: non-parametric modeling; flu; influenza; COVID-19; SARS-CoV-2; Empirical Dynamic Modeling; forecasting
Online: 6 June 2022 (10:24:45 CEST)
The evolution of some epidemics, such as influenza, shows common patterns both in different regions and from year to year. In contrast, epidemics like the novel COVID-19 show quite heterogeneous dynamics and are extremely susceptible to the measures taken to mitigate their spread. In this paper we propose empirical dynamic modeling to predict the evolution of influenza in Spain’s regions. It is a non-parametric method that looks into the past for coincidences with the present to make the forecasts. Here we extend the method to predict the evolution of other epidemics at any other starting territory, and we also test this procedure with Spanish COVID-19 data. Finally, we build influenza and COVID-19 networks to check possible coincidences in the geographical distribution of both diseases. With this, we grasp the uniqueness of the geographical dynamics of COVID-19.
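The idea described above, looking into the past for windows that resemble the present and using what followed them as the forecast, can be sketched as a simplified nearest-analog predictor in the spirit of simplex projection. This is an illustrative assumption about the general technique, not the authors' implementation; the function name, the embedding dimension, the number of neighbours, and the sample series are all hypothetical.

```python
import math


def analog_forecast(series, embed_dim=3, k=2):
    """Forecast the next value from the k past windows closest to the latest one.

    Each past window of length embed_dim is compared (Euclidean distance)
    with the most recent window; the values that followed the k nearest
    windows are averaged with weights that decay with distance.
    """
    n = len(series)
    if n - embed_dim < k:
        raise ValueError("series too short for the requested embed_dim and k")
    current = series[n - embed_dim:]
    candidates = []
    # Only windows that have a known successor are usable as analogs.
    for start in range(n - embed_dim):
        window = series[start:start + embed_dim]
        successor = series[start + embed_dim]
        candidates.append((math.dist(window, current), successor))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    d_min = nearest[0][0]
    if d_min > 0:
        weights = [math.exp(-d / d_min) for d, _ in nearest]
    else:
        # Exact matches exist: use them alone, equally weighted.
        weights = [1.0 if d == 0 else 0.0 for d, _ in nearest]
    total = sum(weights)
    return sum(w * value for w, (_, value) in zip(weights, nearest)) / total


# Hypothetical weekly incidence pattern, for illustration only.
history = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2]
next_value = analog_forecast(history, embed_dim=3, k=2)
```

Because the method has no parametric model, its quality depends on how well the historical record covers situations similar to the present one, which is consistent with the contrast the abstract draws between recurring influenza seasons and the more heterogeneous COVID-19 dynamics.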
ARTICLE | doi:10.20944/preprints202103.0310.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: Gelatinous zooplankton; Scyphozoa; Pelagia noctiluca; Rhizostoma pulmo; Forecasting system; Mitigation tool; Coastal zone management
Online: 11 March 2021 (10:57:35 CET)
Science is addressing global societal challenges and, due to limitations in research financing, scientists are turning to the public at large to jointly tackle specific environmental issues. Citizens are therefore increasingly involved in monitoring programs as citizen scientists, with the potential to deliver key data at almost no cost to address environmental challenges, thereby fostering scientific knowledge and advising policy- and decision-makers. One of the first and most successful examples of marine citizen science in the Mediterranean is the integrative and collaborative implementation of several jellyfish spotting campaigns in Italy, Spain, Malta, and Tunisia, started in 2009. Altogether, in terms of time coverage, geographic extent, and number of citizen records, these represent the most effective marine citizen science campaign implemented so far in the Mediterranean Sea. Here we analyzed a collective database merging records from the above four countries, featuring more than 100,000 records containing almost 25,000 observations of jellyfish specimens, collected over a period of 3 to 7 years (from 2009 to 2015) by citizen scientists participating in any of the national citizen science programs included in this analysis. Such a wide citizen science exercise proves to be one of the most valuable and cost-effective tools available so far for understanding the ecological drivers of jellyfish proliferations over the Western and Central Mediterranean basins, and a powerful contribution to developing tailored adaptation and management strategies, mitigating jellyfish impacts on human activities in coastal zones, and supporting the implementation of marine spatial planning, Blue Growth and conservation strategies.