Submitted: 08 May 2025
Posted: 09 May 2025
Abstract
Keywords:
1. Introduction
1.1. The Importance of Using the UCS as a Design Parameter
1.2. Challenges in Predicting UCS Through Traditional Methods
2. Materials and Methods
2.1. Characterization of the Materials Used
2.1.1. Soil Samples
2.1.2. Portland Cement (PC)
2.2. Experimental Program
2.2.1. Sample Preparation
2.2.2. Testing Methodology
2.3. Machine Learning Methodology
2.3.1. Data Preprocessing and Feature Engineering
2.3.2. Algorithm Selection and Initial Screening
2.3.3. Nested Cross-Validation Framework
2.3.4. Hyperparameter Optimization Strategy
- Random Forest: The tuning process focused on optimizing the number of estimators, tree depth, and node splitting criteria, as these parameters significantly impact the model’s ability to balance bias and variance.
- Gradient Boosting: The search space was centered around learning rate, maximum tree depth, subsampling rate, and the number of estimators, as these factors govern the model’s ability to learn from sequential errors.
- XGBoost: Given its greater flexibility, additional hyperparameters, including minimum child weight and column sampling, were explored to optimize feature selection during training (an illustrative encoding of these search spaces is sketched after this list).
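As an illustration of how such search spaces can be encoded, the sketch below uses scikit-learn's RandomizedSearchCV with the hyperparameters named above. The specific ranges, distributions, and the choice of search tool are assumptions for illustration, not the settings used in the study.

```python
# Illustrative hyperparameter search spaces for the three tuned models.
# Ranges and distributions are placeholders, not the exact spaces used in the study.
from scipy.stats import randint, uniform
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBRegressor

search_spaces = {
    "random_forest": (
        RandomForestRegressor(random_state=42),
        {
            "n_estimators": randint(100, 300),      # number of trees
            "max_depth": randint(3, 12),            # tree depth
            "min_samples_split": randint(2, 10),    # node splitting criterion
            "min_samples_leaf": randint(1, 5),
        },
    ),
    "gradient_boosting": (
        GradientBoostingRegressor(random_state=42),
        {
            "n_estimators": randint(100, 300),
            "learning_rate": uniform(0.01, 0.2),    # shrinkage per boosting step
            "max_depth": randint(2, 6),
            "subsample": uniform(0.7, 0.3),         # row subsampling rate
        },
    ),
    "xgboost": (
        XGBRegressor(objective="reg:squarederror", random_state=42),
        {
            "n_estimators": randint(100, 300),
            "learning_rate": uniform(0.01, 0.2),
            "max_depth": randint(2, 10),
            "min_child_weight": randint(1, 8),      # minimum child weight
            "subsample": uniform(0.7, 0.3),
            "colsample_bytree": uniform(0.7, 0.3),  # column sampling per tree
        },
    ),
}

def tune(name, X, y, n_iter=50, cv=5):
    """Run a randomized search over the space defined for one model."""
    estimator, space = search_spaces[name]
    search = RandomizedSearchCV(
        estimator, space, n_iter=n_iter, cv=cv,
        scoring="neg_root_mean_squared_error", random_state=42,
    )
    return search.fit(X, y)
```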
2.3.5. Model Evaluation Metrics
2.3.6. Final Model Selection and Training
3. Results and Discussion
3.1. Experimental Test Results
3.1.1. Overview of UCS Measurements
3.1.2. Primary Factors Influencing UCS Development
3.1.3. Effect of Cement Content on Strength Development
3.1.4. Effect of Compaction Rate
3.2. Model Performance Comparison
3.2.1. Initial Model Screening Results
3.2.2. Nested Cross-Validation Results
3.2.3. Final Model Performance Analysis
3.3. Model Deployment and Accessibility
3.4. Practical Applications and Limitations
- For applications with moderate strength requirements (UCS ≤ 3000 kPa), cement contents in the range of 5-7.5% provide an optimal balance between mechanical performance and economic efficiency.
- For more demanding applications requiring higher strength (UCS > 3000 kPa), a cement content of 10% is recommended, with curing periods of at least 14 days to achieve consistent results.
- Compaction rate control (within the studied range of 0.75-1.25 mm/min) does not significantly impact final strength for most practical scenarios, allowing for more flexible field implementation protocols.
- Quality control measures should be more stringent for mixtures with higher cement contents (7.5-10%), as these exhibited greater variability in strength outcomes due to increased sensitivity to mixing homogeneity and curing conditions (a simple decision helper encoding these guidelines is sketched after this list).
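As a compact illustration, the guidance above can be folded into a small decision helper. The function name, return structure, and the 7-day curing suggestion for moderate-strength mixes are assumptions introduced here; the helper summarizes the reported ranges and is not a substitute for project-specific testing.

```python
# Illustrative helper encoding the mix-design guidance above.
# Thresholds reflect the ranges reported in this study; the function name,
# return structure, and the 7-day curing value are assumptions.
def suggest_mix_design(target_ucs_kpa: float) -> dict:
    """Suggest cement content and curing time for a target UCS (kPa)."""
    if target_ucs_kpa > 3000:
        return {
            "cement_content_pct": 10.0,   # higher-strength applications
            "min_curing_days": 14,        # needed for consistent results
            "note": "Apply stricter QC: higher cement contents showed more variability.",
        }
    return {
        "cement_content_pct": (5.0, 7.5), # moderate strength requirements
        "min_curing_days": 7,             # assumption: most strength gain occurs by 7 days
        "note": "Compaction rate in 0.75-1.25 mm/min had little effect on final strength.",
    }

print(suggest_mix_design(3500))
```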
3.5. Future Research Directions
4. Conclusions
- The experimental program revealed that cement content is the primary determinant of strength development, exhibiting a strong positive correlation (R2 = 0.87) with UCS values. This relationship follows a non-linear pattern with accelerating strength gains at higher cement contents (7.5-10%), which can be attributed to the formation of more continuous cementitious matrices throughout the soil structure.
- Curing period demonstrated a moderate positive correlation with UCS (R2 = 0.50), confirming the time-dependent nature of cement hydration and pozzolanic reactions in stabilized soils. While significant strength development occurred within the first 7 days, continued gains were observed through 28 days, particularly at higher cement contents.
- Within the range investigated (0.75-1.25 mm/min), compaction rate exhibited minimal influence on UCS development for most combinations of cement content and curing time. Notable exceptions were observed at early curing times (1 day) and high cement contents (10%), suggesting that compaction rate becomes more influential under specific conditions.
- The application of machine learning techniques demonstrated that tree-based ensemble methods significantly outperformed traditional linear models in predicting UCS. The optimized Random Forest model achieved exceptional accuracy (R2 = 0.9825, RMSE = 167.52 kPa), confirming that soil-cement interactions follow complex non-linear patterns that require sophisticated modeling approaches.
- Feature importance analysis from the Random Forest model provided quantitative confirmation of experimental observations, attributing 78.4% of predictive power to cement content, 21.2% to curing period, and only 0.4% to compaction velocity (a minimal computation of such importances is sketched after this list). This alignment between algorithmic feature ranking and experimental correlations strengthens the reliability of both approaches.
- For practical applications, cement contents of 5-7.5% were found to provide an optimal balance between strength enhancement and economic considerations for projects with moderate strength requirements (UCS ≤ 3000 kPa). For applications demanding higher strength levels, 10% cement content consistently delivered UCS values exceeding 3000 kPa after 14 days of curing.
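As a minimal sketch of how the quoted importances can be reproduced, the snippet below fits scikit-learn's RandomForestRegressor on the three input variables and prints its impurity-based feature importances. The file name, column names, and hyperparameters are assumptions and do not represent the study's exact pipeline.

```python
# Minimal sketch: fit a Random Forest on the three study inputs and report
# impurity-based feature importances. The file path, column names, and
# hyperparameters below are assumptions, not the tuned values from the study.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

FEATURES = ["cement_content_pct", "curing_days", "compaction_rate_mm_min"]  # assumed names
TARGET = "ucs_kpa"                                                          # assumed name

df = pd.read_csv("ucs_dataset.csv")  # hypothetical file holding the 171 test results
model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(df[FEATURES], df[TARGET])

for name, importance in sorted(
    zip(FEATURES, model.feature_importances_), key=lambda x: -x[1]
):
    print(f"{name}: {importance:.1%}")
```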
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References







Chemical composition of the Portland cement (%):

| CaO | MgO | | | | | | |
|---|---|---|---|---|---|---|---|
| 56.2 | 19.8 | 7.1 | 3.6 | 1.2 | 0.38 | 0.69 | 2.9 |
Sample preparations (number of samples per curing period, compaction velocity, and cement percentage):

| Curing Period | Compaction Velocity (mm/min) | 2.5% Cement | 5% Cement | 7.5% Cement | 10% Cement |
|---|---|---|---|---|---|
| 24 hours | 1.25 | 3 | 4 | 3 | 3 |
| | 1.00 | 3 | 3 | 4 | 4 |
| | 0.75 | 3 | 3 | 3 | 4 |
| 7 days | 1.25 | 4 | 4 | 4 | 4 |
| | 1.00 | 4 | 4 | 4 | 4 |
| | 0.75 | 4 | 4 | 4 | 4 |
| 14 days | 1.25 | 3 | 4 | 3 | 4 |
| | 1.00 | 4 | 4 | 4 | 4 |
| | 0.75 | 3 | 3 | 4 | 3 |
| 28 days | 1.25 | 3 | 3 | 4 | 3 |
| | 1.00 | 3 | 3 | 4 | 3 |
| | 0.75 | 4 | 3 | 3 | 4 |
| Overall samples used for tests | | 41 | 42 | 44 | 44 |

Total number of samples: 171.
| Model | MSE | RMSE | MAE | R2 |
|---|---|---|---|---|
| Linear Methods | | | | |
| Linear Regression | 229558.94 | 479.12 | 405.85 | 0.8390 |
| Ridge | 229731.25 | 479.3 | 406.75 | 0.8388 |
| Lasso | 229581.38 | 479.15 | 406.16 | 0.8389 |
| ElasticNet | 345532.78 | 587.82 | 516.76 | 0.7576 |
| Tree-Based Methods | | | | |
| Decision Tree | 83397.33 | 288.79 | 201.92 | 0.9415 |
| Random Forest | 70209.46 | 264.97 | 178.98 | 0.9507 |
| Gradient Boosting | 64004.83 | 252.99 | 171.5 | 0.9551 |
| XGBoost | 83396.61 | 288.78 | 201.92 | 0.9415 |
| LightGBM | 72716.76 | 269.66 | 181.41 | 0.9490 |
| CatBoost | 81478.5 | 285.44 | 200.55 | 0.9428 |
| AdaBoost | 84803.66 | 291.21 | 193.61 | 0.9405 |
| Other Non-Linear Methods | | | | |
| SVR | 1403743.36 | 1184.8 | 1018.6 | 0.0152 |
| KNN | 83397.33 | 288.79 | 201.92 | 0.9415 |
| Gaussian Process | 83397.3 | 288.79 | 201.92 | 0.9415 |
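The screening results above, and the cross-validation and final-model results that follow, are reported with MSE, RMSE, MAE, R2, and MAPE. Assuming the standard definitions, these metrics are

```latex
\begin{aligned}
\mathrm{MSE} &= \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2, &
\mathrm{RMSE} &= \sqrt{\mathrm{MSE}}, &
\mathrm{MAE} &= \frac{1}{n}\sum_{i=1}^{n}\lvert y_i - \hat{y}_i \rvert, \\
R^2 &= 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}, &
\mathrm{MAPE} &= \frac{100}{n}\sum_{i=1}^{n}\left\lvert \frac{y_i - \hat{y}_i}{y_i} \right\rvert, &&
\end{aligned}
```

where $y_i$ is a measured UCS value, $\hat{y}_i$ the corresponding prediction, $\bar{y}$ the mean of the measurements, and $n$ the number of samples.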
| Model | R2 (mean ± std) | RMSE (kPa) | MAE (kPa) | MAPE (%) |
|---|---|---|---|---|
| Random Forest | 0.9673 ± 0.0097 | 220.76 | 152.58 | 7.41 |
| Gradient Boosting | 0.9669 ± 0.0095 | 221.60 | 156.65 | 9.52 |
| XGBoost | 0.9655 ± 0.0099 | 226.61 | 164.63 | 9.02 |
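These nested cross-validation estimates correspond to the framework outlined in Section 2.3.3, in which an inner loop tunes hyperparameters and an outer loop estimates generalization performance. The sketch below shows one way to set such a scheme up with scikit-learn; the synthetic stand-in data, fold counts, and parameter grid are illustrative assumptions rather than the study's settings.

```python
# Nested cross-validation sketch: the inner loop tunes hyperparameters,
# the outer loop gives a near-unbiased estimate of generalization performance.
# Synthetic data, fold counts, and the parameter grid are illustrative assumptions.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

# Stand-in for the 171-sample dataset (cement content, curing period, compaction rate -> UCS)
X, y = make_regression(n_samples=171, n_features=3, noise=10.0, random_state=42)

param_grid = {"n_estimators": [100, 200, 300], "max_depth": [5, 10, None]}
inner_cv = KFold(n_splits=5, shuffle=True, random_state=42)   # hyperparameter tuning
outer_cv = KFold(n_splits=5, shuffle=True, random_state=42)   # performance estimation

search = GridSearchCV(RandomForestRegressor(random_state=42),
                      param_grid, cv=inner_cv, scoring="r2")
outer_scores = cross_val_score(search, X, y, cv=outer_cv, scoring="r2")
print(f"Nested CV R2: {outer_scores.mean():.4f} ± {outer_scores.std():.4f}")
```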
| Hyperparameter | Random Forest (Optimal) | Random Forest (Best Fold) | Gradient Boosting (Optimal) | Gradient Boosting (Best Fold) | XGBoost (Optimal) | XGBoost (Best Fold) |
|---|---|---|---|---|---|---|
| n_estimators | 174 | 211 | 157 | 120 | 231 | 203 |
| max_depth | 9 | 5 | 4 | 4 | 8 | 4 |
| min_samples_split | 4 | 4 | - | - | - | - |
| min_samples_leaf | 2 | 2 | - | - | - | - |
| learning_rate | - | - | 0.026 | 0.032 | 0.121 | 0.079 |
| min_child_weight | - | - | - | - | 4 | 6 |
| subsample | - | - | 0.857 | 0.856 | 0.835 | 0.954 |
| colsample_bytree | - | - | - | - | 0.968 | 0.813 |
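For reference, the "Optimal" Random Forest configuration listed above maps directly onto scikit-learn parameters. In the sketch below, any parameter not shown in the table is assumed to remain at its library default, and the random seed is added only for reproducibility.

```python
# Random Forest with the "Optimal" hyperparameters from the table above.
# Parameters not listed in the table are assumed to stay at scikit-learn defaults.
from sklearn.ensemble import RandomForestRegressor

final_rf = RandomForestRegressor(
    n_estimators=174,
    max_depth=9,
    min_samples_split=4,
    min_samples_leaf=2,
    random_state=42,  # added for reproducibility; not reported in the table
)
# final_rf.fit(X_train, y_train)  # X_train: cement %, curing days, compaction rate; y_train: UCS (kPa)
```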
| Metric | Random Forest (Optimal) | Random Forest (Best Fold) | Gradient Boosting (Optimal) | Gradient Boosting (Best Fold) | XGBoost (Optimal) | XGBoost (Best Fold) |
|---|---|---|---|---|---|---|
| R2 | 0.9825 | 0.9806 | 0.9817 | 0.9814 | 0.9804 | 0.9804 |
| RMSE (kPa) | 167.52 | 176.69 | 171.68 | 173.08 | 177.47 | 177.45 |
| MAE (kPa) | 112.48 | 118.84 | 120.08 | 121.92 | 124.93 | 124.52 |
| EVS | 0.9825 | 0.9806 | 0.9817 | 0.9814 | 0.9804 | 0.9804 |
| Max Error (kPa) | 636.97 | 701.06 | 665.92 | 677.68 | 708.35 | 682.96 |
| Median AE (kPa) | 69.17 | 73.86 | 73.56 | 71.42 | 84.79 | 85.83 |
| CV RMSE (%) | 7.83 | 8.26 | 8.02 | 8.09 | 8.29 | 8.29 |
| MAPE (%) | 5.55 | 5.80 | 7.08 | 7.57 | 6.37 | 6.39 |
| Metric | Random Forest (Optimal) | Random Forest (Best Fold) | Gradient Boosting (Optimal) | Gradient Boosting (Best Fold) | XGBoost (Optimal) | XGBoost (Best Fold) |
|---|---|---|---|---|---|---|
| Mean Error (kPa) | -1.62 | -0.58 | 0.99 | 0.56 | 0.71 | -0.57 |
| Std Dev (kPa) | 167.51 | 176.69 | 171.67 | 173.08 | 177.47 | 177.45 |
| Min Error (kPa) | -583.13 | -634.52 | -665.92 | -677.68 | -646.61 | -678.60 |
| Max Error (kPa) | 636.97 | 701.06 | 660.95 | 668.68 | 708.35 | 682.96 |
| Errors within ±100 kPa (%) | 60.6 | 57.7 | 58.3 | 57.1 | 53.7 | 56.6 |
| Errors within ±200 kPa (%) | 81.7 | 79.4 | 80.6 | 80.6 | 77.7 | 76.6 |
| Errors within ±300 kPa (%) | 91.4 | 89.1 | 90.9 | 91.4 | 90.9 | 91.4 |
