Preprint
Article

This version is not peer-reviewed.

Optimal Portfolio and Trading Strategy Using Machine Learning

Submitted:

07 January 2025

Posted:

08 January 2025

Abstract
This research presents machine learning models for forecasting the future returns of a portfolio of NASDAQ semiconductor assets, combining financial analysis, optimization, and technical analysis to form a trading strategy. The performance of the portfolio is evaluated by back-testing. Data were collected from 2011 to 2019 for semiconductor companies listed on Nasdaq. The project consists of 4 sub-tasks. The first sub-task uses the annual financial ratios of each semiconductor company from 2011 to 2018 to project the company returns in 2019 using machine learning algorithms. The 5 assets with the highest predicted returns are then selected to form a portfolio. After the portfolio is optimized by Monte Carlo simulation, classifiers use the technical indicators of the portfolio assets from 2011 to 2018 to predict the trading signals (buy or sell) in 2019. The trading actions in 2019 are simulated by back-testing. The result shows that the optimal portfolio using the simulated trading strategy earns a profit of about 50%. This profit is lower than that of the buy-and-hold strategy but higher than that of the portfolio without optimization.

1. Introduction

Investing can generally be viewed as purchasing a portfolio of assets to gain predictable returns over a certain period. However, Random Walk Theory [18] holds that the trend of a stock price cannot be predicted from its past movement, so all methods of predicting stock prices are futile in the long run. A similar theory is the Efficient Market Hypothesis (EMH). EMH postulates that the stock market efficiently reflects the true value of a stock, on the assumption that all relevant information is freely and widely available to investors; a higher return therefore comes only with higher risk. However, [17] develops an adaptive market hypothesis in which the risk premium varies over time and arbitrage opportunities do exist from time to time. EMH is also challenged by momentum investors. [24] mentions that successful investors, such as George Soros and Warren Buffett, could use numerical and systematic approaches to beat the market. The common methods to sort and pick assets are fundamental analysis and technical analysis. Fundamental analysts seek to determine an asset’s proper value, whereas technical analysts study stock charts to predict the direction of the future trend. With the advent of machine learning, hybrid models [9,12] that apply machine learning to stock price forecasting using financial indicators (financial ratios and technical indicators) have become prevalent. This paper presents how to create an optimal portfolio from Nasdaq semiconductor assets using fundamental analysis and how to form a trading strategy using technical analysis, based on data from 2011 to 2018. For technical analysis, 5 forecast horizons are considered (1-day, 5-day, 7-day, 10-day, and 15-day). The portfolio contains 5 assets and is optimized by Monte Carlo simulation.
The portfolio’s performance with the specified strategy is assessed by back-testing in 2019 and compared with the buy-and-hold strategy and with the portfolio without optimization. The rest of this paper is organized as follows: Section 2 reviews the techniques and previous work in using fundamental analysis and technical analysis with machine learning algorithms for stock price forecasting; Section 3 describes the framework and flow modelling; Section 4 implements the models on the framework and presents the results; Section 5 evaluates the performance of the models; and Section 6 presents the conclusions and future work.

2. Related Works

The commonly used methods for forecasting trends and turning points in the stock market are fundamental analysis and technical analysis. Fundamental analysis is significant for horizons of 1 year and beyond, whereas technical analysis is significant for intraday trading and for horizons of 1 week to 3 months [3]. Fundamental analysis generally uses financial ratios as predictors to forecast equity returns. The studies [5,15,20,21] find that the price-to-earnings (PE) ratio has a statistically significant impact on future stock prices. However, [16] found that none of the financial ratios affected stock prices on the stock exchange of Switzerland in 2015. Another finding, by [4], is that some financial ratios show strong, positive, and significant relationships with stock price behaviour and trends for the years 2005-2014 in the Kuwaiti financial market. Many articles show that applying machine learning approaches to financial ratios can improve the quality of stock return forecasts. [1] indicates that prediction using a logistic regression model with financial ratios can be up to 89.77% accurate in separating good and bad stock performance. [7] applies multiple regression analysis to the monthly financial ratios of four agriculture companies on the stock exchange of Thailand from June 2005 to June 2015 and shows that the current ratio, net profit margin ratio, and total assets turnover ratio positively affect stock prices, and that the debt to equity ratio negatively affects stock prices, each at the statistical significance level of 0.01. [6] compares 7 machine learning techniques (Random Forest, AdaBoost, Kernel Factory, Neural Networks, Logistic Regression, Support Vector Machines, and K-Nearest Neighbour) using the financial ratios of 5767 publicly listed European companies to predict stock price direction one year ahead and finds that Random Forest is the top performer.
[20] also compares different machine learning techniques (decision trees, Support Vector Machines with Sequential Minimal Optimization, Random Trees, Random Forest, Logistic regression, Naïve Bayes and Bayesian Networks) and finds that Random Forest is the best performer with the highest F-Score of 0.751 [28]. On the other hand, the application of artificial neural network models and principal component analysis methods with 12 financial ratios plus 8 other financial variables can accurately predict stock prices on the Tehran Stock Exchange over a period from 2006 to March 2012 [22]. In technical analysis, the technical indicators are derived from both the past movement of stock prices and the volume of trading and are used to predict the future direction of stock prices. This direction gives investors the timing for buying or selling stocks when forming trading strategies. [14] compares 5 different supervised learning techniques (Support Vector Machine, Random Forest, K-Nearest Neighbor, Naïve Bayes, and Softmax) and finds that the random forest algorithm performs best for large datasets and the Naïve Bayes classifier is best for small datasets. However, the accuracy lies between 50% and 70%. To increase the accuracy, hybrid models that combine 2 or more learning techniques, as well as deep learning, have been proposed. [12] adopted 3 hybrid approaches (SVR with hierarchical clustering, SVR with principal component analysis, and SVR with genetic algorithms) for the prediction of the Shanghai-Shenzhen 300 index and found that SVR with hierarchical clustering (SVR-HC) outperforms the others. [2] uses a Fuzzy Metagraph (FM) to classify and predict the prices of stocks listed on the Bombay Stock Exchange and obtains a satisfactory result with a very low-risk error [19].
With technical indicators as predictors, [13] compares 6 different models (Bat algorithm with extreme gradient boosting, random forest, extreme gradient boosting, linear regression, support vector machine, and artificial neural networks) for classifying the prices of Facebook and Apple stocks and finds that the Bat algorithm with extreme gradient boosting (BA-XGB) performs best, with a maximum accuracy of 0.96.

3. Framework Design

The framework for creating an optimal portfolio and a trading strategy is shown in Figure 1. It consists of 4 parts: generate a portfolio (top 3 boxes); optimise the portfolio (3 boxes on the right); form a trading strategy (2 grey boxes); and perform simulations of trading (2 green boxes).
The financial ratios used in this project come from the companies’ annual financial statements and are listed in Table 1 [8]. The regression models for predicting expected returns are listed in Table 2.
Once a portfolio of the 5 assets with the highest predicted returns is formed, it is optimized for the maximum risk-adjusted return by the Monte Carlo method. The objective is to determine the weighting of each asset by finding the maximum value of the Sharpe ratio under a given level of risk. Regarding the grey boxes in Figure 1, the technical indicators adopted are listed in Table 3 and the processes for predicting trading signals are shown in Figure 2. The training/validation data are historical data from 2011 to 2018, whereas the test data are the transaction records from 2019 in the Nasdaq market. The cross-validation follows the time series method shown in section 7 of the Analytics Vidhya article [11]. For each asset, there are 252 samples per year, which gives 2,016 samples (252 × 8 years) of training/validation data. We use 10 splits for each model, and the ratio of training to validation data is 7 to 3. The classifiers tuned by grid search are random forest (RF), K-nearest neighbours (KNN), AdaBoost (ABT), gradient boosting (GBT), support vector machine (SVM), and an ensemble [27].
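As an illustration of the split described above, the following Python sketch divides the 2,016 chronologically ordered samples into 10 consecutive blocks and cuts each block 7 to 3 into training and validation ranges. The blocked layout and the name `blocked_time_series_splits` are assumptions made for illustration; the text states only the number of splits and the ratio, so the scheme actually used may differ.

```python
def blocked_time_series_splits(n_samples, n_splits=10, train_frac=0.7):
    """Yield (train_range, val_range) for consecutive, non-overlapping blocks.

    Assumed layout: each block keeps chronological order, so the validation
    samples always come after the training samples of the same block.
    """
    bounds = [round(i * n_samples / n_splits) for i in range(n_splits + 1)]
    for start, stop in zip(bounds[:-1], bounds[1:]):
        cut = start + int((stop - start) * train_frac)
        yield range(start, cut), range(cut, stop)


# 252 trading days per year over 8 years (2011 to 2018), as stated in the text.
n_samples = 252 * 8
splits = list(blocked_time_series_splits(n_samples))

for i, (train, val) in enumerate(splits, start=1):
    print(f"split {i}: train {train.start}-{train.stop - 1}, "
          f"validation {val.start}-{val.stop - 1}")
```

Keeping every validation range later in time than its training range avoids using future prices to predict past ones, which is the main reason to prefer a time-ordered split over a shuffled one for this kind of data.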
Back-testing is the simulation of portfolio trading in 2019 based on the predicted trading signals. There are 2 trading-signal labels: 1 indicates ‘buy’ and 0 indicates ‘sell’. Initially, the principal is divided into 5 portions, one for each portfolio asset, according to the optimization result. We make one trade on each trading day in 2019. The trading price is taken as the mean of the day’s highest and lowest prices for convenience of calculation. No commissions, dividends, or other fees are involved. Fractional shares and short selling are not allowed. For each buy action, the money invested is less than one quarter of that asset’s portion of the principal. For each sell action, we normally sell half of the shares in hand, unless only one share remains. There are 252 trading days in 2019. The net result is the sum of the daily returns, and this net return is the annual return for that year. We compare the results with the buy-and-hold strategy and with trading under equal weighting of the portfolio assets [29].
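The trading rules above can be sketched for a single asset as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `backtest_asset` is hypothetical, "less than one quarter" is read as a cap of one quarter of the asset's allotted principal per buy, and the result is reported as the simple return on the final portfolio value rather than as a sum of daily returns.

```python
def backtest_asset(highs, lows, signals, principal):
    """Simulate one asset: signal 1 = buy, signal 0 = sell.

    Assumptions (not taken from the paper): at most a quarter of the
    allotted principal is spent per buy, and remaining shares are valued
    at the last trading price.
    """
    cash, shares, price = principal, 0, 0.0
    budget = principal / 4  # assumed cap on the money invested per buy
    for high, low, signal in zip(highs, lows, signals):
        price = (high + low) / 2  # trading price: mean of high and low
        if signal == 1:
            # Whole shares only, limited by the cap and the cash in hand.
            n = int(min(budget, cash) // price)
            cash -= n * price
            shares += n
        elif shares > 0:
            # Sell half of the holding, or the last remaining share.
            n = max(shares // 2, 1)
            cash += n * price
            shares -= n
    final_value = cash + shares * price
    return final_value / principal - 1


# Toy prices and signals, chosen only to exercise both rules.
result = backtest_asset(
    highs=[11, 21, 31], lows=[9, 19, 29], signals=[1, 1, 0], principal=1000
)
print(f"return: {result:.2%}")
```

In a full simulation the same routine would run once per asset, with `principal` set to that asset's share of the total according to the optimized weights.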

4. Implementation and Results

4.1. Creating a Portfolio

The explanatory features are the annual financial ratios and the response variable is the daily returns. The greater the number of features, the higher the risk of degrading the performance of machine learning. In this project, principal component analysis (PCA) is the technique used for dimensionality reduction. The results of PCA are shown in Figure 3. The scree plot shows that the first 8 components explain almost 95% of the variance. The features adopted are ‘roa’, ‘roe’, ‘net_profit_margin’, ‘current ratio’, ‘quick ratio’, ‘debt ratio’, ‘debt to equity’ and ‘price to book’.
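Reading the number of components off a scree plot amounts to accumulating the explained-variance ratios until a threshold is reached. The sketch below shows that step only; the ratios are hypothetical values invented for illustration (they are not the values behind Figure 3), and `components_needed` is an illustrative helper name.

```python
from itertools import accumulate


def components_needed(explained_ratio, threshold=0.95):
    """Return the smallest number of leading components whose cumulative
    explained-variance ratio reaches the threshold."""
    for k, cumulative in enumerate(accumulate(explained_ratio), start=1):
        if cumulative >= threshold - 1e-9:  # tolerance for float rounding
            return k
    return len(explained_ratio)


# Hypothetical explained-variance ratios, sorted from largest to smallest.
ratios = [0.35, 0.20, 0.12, 0.09, 0.07, 0.05, 0.04, 0.03,
          0.02, 0.015, 0.01, 0.005]

print(components_needed(ratios, threshold=0.95))
print(components_needed(ratios, threshold=0.90))
```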
With the selected features, the data are standardized and then fed to 20 common regression models, whose predictive performance is shown in Figure 4. The models with the minimum error in each category are chosen for grid search to obtain the optimal parameters. The 6 chosen models are ANN, SVR, Bayesian ridge regression, ridge regression, GBR, and XGBR. The predicted and the actual top 5 highest-return assets are shown in Figure 5. Based on the minimum error, we choose the GBR prediction. As a result, the portfolio contains AMD, ASML, QCOM, KLAC, and NVDA [30].

4.2. Portfolio Optimisation

The optimisation method adopted is Monte Carlo simulation. We randomly generate 10,000 combinations of weights for the 5 assets; for each combination, the expected return of the portfolio is determined from the product of each asset’s weighting and its average yearly return from 2011 to 2018. The distribution of expected returns against expected volatility (standard deviation) is shown in Figure 6. The red dot in Figure 6 marks the maximum value of the Sharpe ratio, at an expected volatility of 0.3134 and an expected return of 0.4613. At this point, the weighting of each asset is 0.118075 (AMD), 0.291511 (ASML), 0.036439 (KLAC), 0.543872 (NVDA), and 0.010103 (QCOM).
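The random search for the maximum Sharpe ratio can be sketched as below. All inputs are hypothetical: the mean returns, volatilities, and the constant correlation of 0.3 are made-up numbers, not the 2011 to 2018 statistics of the five assets, and the risk-free rate is assumed to be zero because the text does not state one. The helper names are illustrative only.

```python
import math
import random


def random_weights(n, rng):
    """Draw n non-negative weights that sum to 1."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [x / total for x in raw]


def portfolio_stats(weights, mean_returns, cov):
    """Expected return (weighted sum) and volatility (sqrt of w' C w)."""
    ret = sum(w * m for w, m in zip(weights, mean_returns))
    var = sum(wi * wj * cov[i][j]
              for i, wi in enumerate(weights)
              for j, wj in enumerate(weights))
    return ret, math.sqrt(var)


def max_sharpe(mean_returns, cov, n_trials=10_000, risk_free=0.0, seed=0):
    """Return (sharpe, ret, vol, weights) of the best random portfolio."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        w = random_weights(len(mean_returns), rng)
        ret, vol = portfolio_stats(w, mean_returns, cov)
        sharpe = (ret - risk_free) / vol
        if best is None or sharpe > best[0]:
            best = (sharpe, ret, vol, w)
    return best


# Hypothetical yearly statistics for 5 assets (illustration only).
mean_returns = [0.30, 0.25, 0.20, 0.45, 0.10]
vols = [0.50, 0.30, 0.35, 0.45, 0.25]
rho = 0.3  # assumed constant pairwise correlation
cov = [[(1.0 if i == j else rho) * vols[i] * vols[j] for j in range(5)]
       for i in range(5)]

sharpe, ret, vol, weights = max_sharpe(mean_returns, cov)
print(f"Sharpe {sharpe:.3f}, return {ret:.3f}, volatility {vol:.3f}")
print([round(w, 4) for w in weights])
```

Plotting every trial's volatility against its return gives a scatter of the kind the text describes, with the best trial corresponding to the highlighted point.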

4.3. Trading Signals Prediction

The technical indicators are the explanatory features, and the movement of the stock price is the response variable. The response variable has 2 labels: 1 (positive daily return) and 0 (negative daily return). There are 5 different forecast horizons (1-day, 5-day, 7-day, 10-day, and 15-day). As technical analysis aims to predict short-term price fluctuations, it is better to keep the forecast horizon below 4 weeks (20 trading days). Therefore, it is appropriate to set the maximum forecast horizon to 15 days.
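One way to build the binary response for a given forecast horizon is sketched below. The text defines the labels by the sign of the daily return but does not spell out how the longer horizons are labelled, so comparing the price `horizon` days ahead with the current price is an assumption, as are the function name `horizon_labels` and the treatment of an unchanged price as 0.

```python
def horizon_labels(closes, horizon):
    """Label day t with 1 if the price `horizon` days later is higher
    than the price on day t, else 0 (assumed labelling rule)."""
    return [
        1 if closes[t + horizon] > closes[t] else 0
        for t in range(len(closes) - horizon)
    ]


# Toy closing prices, for illustration only.
closes = [10, 11, 9, 12, 13, 8]
print(horizon_labels(closes, horizon=1))  # [1, 0, 1, 1, 0]
print(horizon_labels(closes, horizon=2))  # [0, 1, 1, 0]
```

The last `horizon` days of the series receive no label because their future price is not yet known, so longer horizons leave slightly fewer training samples.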

4.4. Feature Selection

To avoid degrading the predictive performance of machine learning, features that have no correlation with the response variable may be removed. We adopt PCA to reduce the dimensionality. The 6 features with the maximum variation (PC1) from PCA for the 5 assets are shown in Figure 7. Almost 90% of the variance can be explained by 6 principal components.
After feature reduction, the explanatory features undergo standardization and time series cross-validation (described in Section 3) before modelling.

4.5. Modelling

Six different machine learning techniques with grid search are applied to predict the stock price movement. The accuracy of different models of each asset under different forecast horizons for the validation dataset is shown in Figure 8. The blue cells indicate the highest accuracy for each asset.
According to Figure 8, we choose the following forecast horizons for the technical indicators used as training data to predict the trading signals in 2019: 15-day for AMD, 10-day for ASML, 15-day for KLAC, 15-day for NVDA, and 5-day for QCOM. The confusion matrices of the test dataset for the 5 assets are shown in Figure 9.

4.6. Simulation of Trading

With the weightings obtained in Section 4.2 and the signals predicted in Section 4.3, we simulate trading in 2019. Figure 10 plots the simulation of trading. The net return is 0.406312 in log terms, or 50.13% as a simple return (e^0.406312 − 1 ≈ 0.5013).
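The two figures quoted for the net return are consistent if the first is read as a cumulative log return and the second as the corresponding simple return; this reading is an inference from the arithmetic, not something the text states. The short check below applies the same conversion to the pairs reported for the buy-and-hold and equal-weight portfolios in Section 5.4.

```python
import math


def log_to_simple(log_return):
    """Convert a cumulative log return to a simple return: exp(r) - 1."""
    return math.expm1(log_return)


reported = [
    ("optimal portfolio", 0.406312),
    ("buy-and-hold", 0.528),
    ("equal-weight", 0.342),
]
for name, log_return in reported:
    print(f"{name}: {log_to_simple(log_return):.1%}")
# optimal portfolio: 50.1%
# buy-and-hold: 69.6%
# equal-weight: 40.8%
```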

5. Evaluation of Performances and Limitations

5.1. Expected Returns by Machine Learning

Recalling Figure 5, the root-mean-square error seems ineffective in assessing the predictive performance for asset returns: the ANN model has the largest error, yet it correctly predicts 4 of the 5 highest-return assets. In addition, every model predicts AMD as one of the highest-return assets.

5.2. Optimisation

Recalling Figure 6, the expected return of the portfolio is 0.4613, whereas Figure 10 shows a return of 0.5013, a relative difference of about 8.7%.

5.3. Prediction of Trading Signal

From the confusion matrices in Figure 9, the prediction accuracy is 0.66 (AMD), 0.74 (ASML), 0.58 (KLAC), 0.72 (NVDA), and 0.61 (QCOM). The precision values are 0.68 (AMD), 0.75 (ASML), 0.78 (KLAC), 0.73 (NVDA), and 0.61 (QCOM). The recall values are 0.96 (AMD), 0.98 (ASML), 0.67 (KLAC), 0.97 (NVDA), and 1.00 (QCOM).
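For reference, the three metrics follow from the four cells of a confusion matrix as sketched below. The counts are hypothetical (Figure 9 is not reproduced here); they are chosen to show that a classifier which predicts 'buy' on every day has a recall of 1.00 and a precision equal to its accuracy, the same pattern as the values reported for QCOM.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall for the positive ('buy') class."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical counts: every day is predicted as 'buy' (no predicted sells).
always_buy = classification_metrics(tp=61, fp=39, fn=0, tn=0)
print(always_buy)

# Hypothetical counts for a classifier that predicts both classes.
mixed = classification_metrics(tp=60, fp=20, fn=10, tn=10)
print(mixed)
```

A recall close to 1 combined with a moderate precision therefore suggests that a model issues 'buy' signals far more often than 'sell' signals.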

5.4. Simulation

When we add the buy-and-hold gain and equal-weight portfolio gain curves to Figure 10, the result is shown in Figure 11. The buy-and-hold strategy performs best and the equal-weight portfolio performs worst. The net returns of the buy-and-hold and equal-weight portfolios are 0.528 (69.6%) and 0.342 (40.8%), respectively. When the prediction of trading signals is correct, the daily gain of the curves is the same. When the prediction is incorrect, the curves move differently. As the predictions are about 60% correct, the curves move in the same pattern for around 60% of the period.

5.5. Limitations

Both financial ratios and technical indicators rely on historical data, which may not necessarily reflect future performance. Past trends may not always recur. Different combinations of ratios and indicators may have varying degrees of influence on stock prices. However, this study does not account for these differences or assign specific weights to them. Changes in accounting policies and procedures can influence the calculation of these ratios. It is unclear whether the companies in this study maintained consistent accounting policies from 2011 to 2019. Additionally, factors such as government policies and interest rates can significantly affect investor sentiment in the stock market and, ultimately, stock prices [35,36].

6. Conclusion

This research paper describes how to apply machine learning to predict the expected returns of assets from the NASDAQ semiconductor industry using financial ratios as explanatory features, to optimize a portfolio of 5 assets using Monte Carlo simulation, to build classifiers for predicting stock price movements (up or down) using technical indicators as explanatory features, and to simulate trading in 2019 using our optimal portfolio and trading rules. Finally, we compare the trading performance of our optimal portfolio with the buy-and-hold strategy and the portfolio without optimization. The result shows that our portfolio has a gain of 50.1%, whereas the buy-and-hold strategy has a gain of 69.6% and the portfolio without optimization has a gain of 40.8%. Although our portfolio cannot beat the benchmark, the result is quite encouraging, as a profit of 50.1% is acceptable. In addition, the gain is higher than the expected return determined by optimisation. Financial ratios are fed into 6 different types of machine learning models to predict the expected return of assets, with grid search applied for hyperparameter tuning. The GBR technique performs best based on minimum error [31,32]. If the purpose is to select the top 5 assets from all the Nasdaq semiconductor assets based on the predicted expected returns, every model gets at least 20% of the selection correct (at least 1 of the 5 assets). That is not good but acceptable. It is likely that financial ratios can project the expected returns to a certain extent. When Monte Carlo simulation is applied to find the best asset allocation, a graph of the efficient frontier is formed. When the trading performance of the optimal portfolio is compared with that of the equal-weight portfolio by back-testing, the optimal portfolio outperforms. Diversification is important to investing.
Technical indicators are fed into 6 different types of modelling techniques to classify the stock price movement. Cross-validation and grid search are applied to reduce the effect of overfitting. The result shows no indication of a relationship between the level of accuracy and the forecast horizon. Evaluating model performance with the confusion matrix, accuracy ranges from 56% to 64%. These values alone do not indicate whether an investment strategy based on these predictions would be effective. However, the precision and recall values range from 0.61 to 0.78 and 0.67 to 1.00, respectively, suggesting that the predictions tend to favour price increases over decreases. Only back-testing is an effective way to evaluate the trading strategy. The simulation shows that the net return would be 50.13% using this predicted strategy [25]. However, our portfolio cannot beat the buy-and-hold trading strategy. Moreover, the gain is this high because of the upward trend of the stock market in 2019. It is recommended to repeat the back-testing with a similar approach in a different year, especially in a downward-trending market. In addition, the accuracy of prediction is not high, only around 60%. A more sophisticated model is recommended for future investigation. Adding sentiment analysis to technical analysis may also increase the predictive power and merits further investigation [10,24]. In actual investment, we would consider 3 states (buy, hold, and sell), whereas this project has only 2 states (buy and sell). Most investors are not frequent traders; they hold shares most of the time. For further investigation, it is recommended to design a more sophisticated simulation of trading [32,33,34].

References

  1. Ali, S., Mubeen, M., Lal, I. and Hussain, A. (2018). Prediction of stock performance by using logistic regression model: evidence from Pakistan Stock Exchange (PSX). Asian Journal of Empirical Research, 8(7), pp.247-258.
  2. Anbalagan, T. and Maheswari, S. (2015). Classification and Prediction of Stock Market Index Based on Fuzzy. Procedia Computer Science, 47, pp.214 – 221. [CrossRef]
  3. Antonacci, Gary (2014). Dual Momentum Investing: An Innovative Approach for Higher Returns with Lower Risk. New York: McGraw-Hill Education.
  4. Arkan, T. (2016). The Importance of Financial Ratios in Predicting Stock Price Trends: A Case Study in Emerging Markets. Finanse, Rynki Finansowe, Ubezpieczenia nr 1(79). [CrossRef]
  5. Asmirantho, E. and Somantri, O.K. (2017). The Effect of Financial Performance on Stock Price at Pharmaceutical Sub-sector Company Listed in Indonesia Stock Exchange. JIAFE, 3(2) Tahun 2017, Hal. 94-107. [CrossRef]
  6. Ballings, M., Poel, D., Hespeels, N. and Gryp, R. (2015). Evaluating multiple classifiers for stock price direction prediction. Expert Systems with Applications, 42, pp.7046-7056. http://dx.doi.org/10.1016/j.eswa.2015.05.013. [CrossRef]
  7. Banchuenvijit, W. (2016). Financial Ratios and Stock Prices: Evidence from the Agriculture Firms Listed on the Stock Exchange of Thailand. International Journal of Business and Economics, UTCC 8. Available from: http://www.ijbejournal.com/images/files/12731032205c5d46b8bf1ec.pdf.
  8. Carleton, P. and Siegel, R. (2021). 20 Key Financial Ratios [online]. InvestingAnswers. Available from: https://investinganswers.com/articles/financial-ratios-every-investor-should-use [Accessed 28 July 2022].
  9. Cavalcante, R., Minku, L. and Oliveira, A. (2016). FEDD: Feature Extraction for Explicit Concept Drift Detection in Time Series. In Proceedings of the 2016 IEEE International Joint Conference on Neural Networks (IJCNN). IEEE Xplore, Vancouver, Canada, pp. 740-747. [CrossRef]
  10. Christodoulaki, E., Kampouridis, M. and Kanellopoulos, P. (2022). Technical and Sentiment Analysis in Financial Forecasting with Genetic Programming. 2022 IEEE Symposium on Computational Intelligence for Financial Engineering and Economics (CIFEr), 2022, pp. 1-8. [CrossRef]
  11. Hazra, A. 2022. Top 7 Cross-Validation Techniques with Python Code. Analytics Vidhya [Online]. Available from: https://www.analyticsvidhya.com/blog/2021/11/top-7-cross-validation-techniques-with-python-code/ [Accessed 26 August 2022].
  12. Ibidapo, I., Adebiyi, A., and Okesola, O. (2017). Soft Computing Techniques for Stock Market Prediction: A Literature Survey. Covenant Journal of Informatics and Communication Technology, 5(2). Available from: https://journals.covenantuniversity.edu.ng/index.php/cjict/article/view/683.
  13. Jeyakarthic, M. and Punitha, S. (2020). Hybridization of Bat Algorithm with XGBOOST Model for Precise Prediction of Stock Market Directions. International Journal of Engineering and Advanced Technology (IJEAT), Vol 9, 3. [CrossRef]
  14. Kumar, I., Dogra, K., Utreja, C. and Yadav, P. (2018). A comparative study of supervised machine learning algorithms for stock market trend prediction. In 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), pp. 1003-1007. IEEE. [CrossRef]
  15. Kurach, R. and Słonski, T. (2015). The PE Ratio and the Predicted Earnings Growth—The Case of Poland. Folia Oecon. Stetin. 15, pp.127–138. [CrossRef]
  16. Ligocka, M., and Stavárek, D. (2018). The relationship between financial ratios and the stock prices of selected European food companies listed on Stock Exchanges. Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis, 67(1): 299-307. [CrossRef]
  17. Lo, A. (2005). Reconciling Efficient Markets with Behaviour Finance: The Adaptive Markets Hypothesis. Journal of Investment Consulting, 7(2), pp.21-44.
  18. Malkiel, B. G. (1973). A Random Walk Down Wall Street - The Time-Tested Strategy for Successful Investing. New York: W.W. Norton and Company Inc.
  19. Ghanem, M., Dawoud, F., Gamal, H., Soliman, E., El-Batt, T. and Sharara, H., 2022, September. FLoBC: A decentralized blockchain-based federated learning framework. In 2022 Fourth International Conference on Blockchain Computing and Applications (BCCA) (pp. 85-92). IEEE. [CrossRef]
  20. Milosevic, N., (2016). Equity Forecast: Predicting Long-Term Stock Price Movement using Machine Learning. Journal of Economics Library, 3(2), pp.288-294.
  21. Öztürk, H., and Karabulut, T. A. (2017). The relationship between earnings-to-price, current ratio, profit margin and return: an empirical analysis on Istanbul stock exchange. Accounting and Finance Research 7(1), 109-115. [CrossRef]
  22. Puspitaningtyas, Z. (2017). Is Financial Performance Reflected in Stock Prices? Proceedings of the 2nd International Conference on Accounting, Management, and Economics 2017 (ICAME 2017), Advances in Economics, Business and Management Research, 40, Atlantis Press. Available from: http://creativecommons.org/licenses/by-nc/4.0/ [Accessed 27 June 2022].
  23. Zahedi, J. and Rounaghi, M. (2015). Application of artificial neural network models and principal component analysis method in predicting stock prices on Tehran Stock Exchange. Physica A: Statistical Mechanics and its Applications, 438, pp.178-187. [CrossRef]
  24. Venkatesh, C. K. and Ganesh, L. (2011). Fundamental analysis as a method of share valuation in comparison with technical analysis. International Economics and Finance Journal, 6(1), pp.27-37.
  25. Ghanem, M., Mouloudi, A. and Mourchid, M., 2015. Towards a scientific research based on semantic web. Procedia Computer Science, 73, pp.328-335. [CrossRef]
  26. Nti, I.K., Adekoya, A.F. and Weyori, B.A. (2020). A systematic review of fundamental and technical analysis of stock market predictions. Artificial Intelligence Review, 53(4), pp.3007-3057. [CrossRef]
  27. Paiva, F.D., Cardoso, R.T.N., Hanaoka, G.P. and Duarte, W.M., 2019. Decision-making for financial trading: A fusion approach of machine learning and portfolio selection. Expert Systems with Applications, 115, pp.635-655. [CrossRef]
  28. Pinelis, M. and Ruppert, D., 2022. Machine learning portfolio allocation. The Journal of Finance and Data Science, 8, pp.35-54. [CrossRef]
  29. Loumachi, F.Y. and Ghanem, M.C., 2024. GenDFIR:Advancing Cyber Incident Timeline Analysis Through Rule Based AI and Large Language Models. arXiv preprint arXiv:2409.02572. https://arxiv.org/abs/2409.02572.
  30. Ghanem, M.C., Chen, T.M., Ferrag, M.A. and Kettouche, M.E., 2023. ESASCF: expertise extraction, generalization and reply framework for optimized automation of network security compliance. IEEE Access. [CrossRef]
  31. Farzaan, M.A., Ghanem, M.C. and El-Hajjar, A., 2024. AI-Enabled System for Efficient and Effective Cyber Incident Detection and Response in Cloud Environments. https://arxiv.org/abs/2404.05602.
  32. Ghanem, M.C., Mulvihill, P., Ouazzane, K., Djemai, R. and Dunsin, D., 2023. D2WFP: a novel protocol for forensically identifying, extracting, and analysing deep and dark web browsing activities. Journal of Cybersecurity and Privacy, 3(4), pp.808-829. [CrossRef]
  33. Basnet, A.S., Ghanem, M.C., Dunsin, D. and Sowinski-Mydlarz, W., 2024. Advanced Persistent Threats (APT) Attribution Using Deep Reinforcement Learning. arXiv preprint arXiv:2410.11463. https://doi.org/arXiv:2410.1146.
  34. Dunsin, D., Ghanem, M.C., Ouazzane, K. and Vassilev, V., 2024. Reinforcement learning for an efficient and effective malware investigation during cyber Incident response. arXiv preprint arXiv:2408.01999. [CrossRef]
  35. Ghanem, M., Mouloudi, A. and Mourchid, M., 2015. Towards a scientific research based on semantic web. Procedia Computer Science, 73, pp.328-335. [CrossRef]
  36. Hamouda, D., Ferrag, M.A., Benhamida, N., Seridi, H. and Ghanem, M.C., 2024. Revolutionizing intrusion detection in industrial IoT with distributed learning and deep generative techniques. Internet of Things, 26, p.101149. [CrossRef]
Figure 1. Framework of the processes.
Figure 2. Processes of Predicting Trading Signals.
Figure 3. Principal Component Analysis for Financial Ratios.
Figure 4. Performance of 20 Models.
Figure 5. Predictions of Models by Grid Search.
Figure 6. Predictions of Models by Grid Searches.
Figure 7. PCA of Technical Indicators.
Figure 8. Accuracy of Models for Training/Validation Dataset.
Figure 9. Confusion Matrices for Test Dataset.
Figure 10. Predictions of Models by Grid Searches.
Figure 11. Predictions of Models by Grid Searches.
Table 1. Description of Financial Ratios.
Profitability Ratios Liquidity Ratios Leverage Ratios Market Ratios Activity Ratios
Return on Assets (roa) Current Ratio Debt Ratio PE to Growth (PEG) Ratio Asset Turnover Ratio
Return on Equity (roe) Quick Ratio Debt to Equity Ratio Price-to-Sales (PS) Ratio Inventory Turnover Ratio
Net Profit Margin Cash Ratio Interest Coverage Ratio Price-to-Book (PB) Ratio Receivable Turnover Ratio
Price-to-Earnings (PE) Ratio Dividend Yield Payables Turnover Ratio
Dividend Payout Ratio Asset Turnover Ratio
Table 2. Automatic Relevance Determination.
Random Forest Linear Regression Bayesian Ridge Bayesian Ridge
Theil-Sen Regression Ridge Regression Kernel Ridge Decision Tree
Artificial Neural Network Nu Support Vector Elastic Net Linear Huber Regression
Support Vector Least Angle Regression Gaussian Process Linear Support Vector
Automatic Relevance Determination Extreme Gradient Boosting Orthogonal Matching Pursuit Passive Aggressive Regressor
Table 3. Technical Indicators.
Relative strength index (RSI) Vortex indicator Stochastic Oscillator %K %D Momentum indicator (MOM)
Money flow index (MFI) Rate of change (ROC) On balance volume (OBV) Ease of movement (EMV)
Commodity channel index (CCI) Exponential moving average (EMA50) Moving average convergence divergence (MACD)
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.