Submitted: 17 April 2026
Posted: 22 April 2026
Abstract

Keywords:
1. Introduction
2. Related Work
3. Materials and Methods
3.1. Overall Methodological Framework
3.2. Study Area and Data Source
3.3. Entropy-Weighted Water Quality Index (EWQI)
3.3.1. Hydrochemical Matrix Construction and Normalization
3.3.2. Ratio Matrix, Entropy, and Weight Calculation
3.3.3. Quality Rating and Final EWQI Calculation
3.3.4. EWQI Classification
3.4. Supervised Learning Design
3.5. Machine Learning Models
3.5.1. Feature Scenarios
- Scenario 1 (hydrochemical): pH, TDS, K, Na, Mg, Ca, Cl, SO4, HCO3, and NO3.
- Scenario 2 (enhanced_hydro): Scenario 1 plus Depth, Static_WL, Dynamic_WL, and Well_productivity.
- Scenario 3 (enhanced_spatial): Scenario 2 plus Decimal_X and Decimal_Y.
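
The three scenarios are nested feature sets, which can be written down compactly. The sketch below is illustrative only: the column names are taken from the list above, while the dictionary layout and the `select_features` helper are assumptions, not the authors' code.

```python
# Column names follow the scenario list; grouping them this way is an assumption.
HYDROCHEMICAL = ["pH", "TDS", "K", "Na", "Mg", "Ca", "Cl", "SO4", "HCO3", "NO3"]
WELL_ATTRIBUTES = ["Depth", "Static_WL", "Dynamic_WL", "Well_productivity"]
COORDINATES = ["Decimal_X", "Decimal_Y"]

# Each scenario extends the previous one.
SCENARIOS = {
    "hydrochemical": HYDROCHEMICAL,
    "enhanced_hydro": HYDROCHEMICAL + WELL_ATTRIBUTES,
    "enhanced_spatial": HYDROCHEMICAL + WELL_ATTRIBUTES + COORDINATES,
}


def select_features(record, scenario):
    """Return the feature vector of one sample (a dict of column -> value)."""
    return [record[name] for name in SCENARIOS[scenario]]
```

Keeping the scenarios in one mapping makes an ablation loop straightforward: the same model can be trained once per key, so any change in performance can be attributed to the added columns.
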
3.5.2. Preprocessing and Data Splitting
3.5.3. Model Descriptions
Support Vector Machine (SVM)
Random Forest (RF)
Backpropagation Multilayer Perceptron (BP-MLP)
One-Dimensional Convolutional Neural Network (1D-CNN)
3.5.4. Evaluation Metrics
4. Results and Discussion
4.1. Regression Performance: Continuous EWQI Prediction
4.2. Classification Performance: EWQI_Class Prediction
4.3. Scenario Comparison and Ablation Insight
4.4. Model Interpretability
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations

| Abbreviation | Definition |
|---|---|
| EWQI | Entropy-Weighted Water Quality Index |
| WQI | Water Quality Index |
| GWQI | Groundwater Quality Index |
| IQS | Iraqi Standards for Drinking Water |
| ML | Machine learning |
| SVM | Support Vector Machine |
| SVR | Support Vector Regression |
| SVC | Support Vector Classification |
| RF | Random Forest |
| BP-MLP | Backpropagation Multilayer Perceptron |
| CNN | Convolutional Neural Network |
| 1D-CNN | One-Dimensional Convolutional Neural Network |
| RMSE | Root Mean Square Error |
| MAE | Mean Absolute Error |
| EC | Electrical Conductivity |
| TDS | Total Dissolved Solids |
| WL | Water Level |
| RBF | Radial Basis Function |
| ROC | Receiver Operating Characteristic |
| AUC | Area Under the Curve |





| Rank | EWQI Range | Classification |
|---|---|---|
| I | < 50 | Excellent quality water |
| II | 50–100 | Good quality water |
| III | 100–150 | Median quality water |
| IV | 150–200 | Poor quality water |
| V | > 200 | Extremely poor quality water |

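
The classification table maps a continuous EWQI value to one of five ranks. A minimal sketch of that lookup follows. It assumes the open-ended classes are "below 50" and "above 200" (the table implies this but does not print those ranges), and it assumes a value exactly on a boundary goes to the higher rank, which the table leaves ambiguous. The function name `classify_ewqi` is hypothetical.

```python
from bisect import bisect_right

# Upper bounds of ranks I to IV; the open ends (<50, >200) are assumed.
BOUNDS = [50, 100, 150, 200]
RANKS = ["I", "II", "III", "IV", "V"]
LABELS = ["Excellent", "Good", "Median", "Poor", "Extremely Poor"]


def classify_ewqi(ewqi):
    """Return (rank, label) for an EWQI value.

    bisect_right places a value equal to a bound in the higher rank,
    e.g. 50 -> rank II. This tie rule is an assumption.
    """
    idx = bisect_right(BOUNDS, ewqi)
    return RANKS[idx], LABELS[idx]
```

Switching to `bisect_left` would assign boundary values to the lower rank instead; which rule the study used is not stated here.
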
| No. | EWQI_Class | Count |
|---|---|---|
| 1 | Good | 247 |
| 2 | Excellent | 225 |
| 3 | Extremely Poor | 192 |
| 4 | Median | 132 |
| 5 | Poor | 57 |

| Parameter | Min | Max | Mean | Std | Limit | Out of Limit (%) | Weight |
|---|---|---|---|---|---|---|---|
| pH | 7.09 | 7.90 | 7.2078 | 0.0643 | 6.5–8.5 | 0.00 | 0.209779 |
| TDS (ppm) | 260 | 15584 | 2406.7960 | 2361.3240 | 1000 | 68.58148 | 0.098124 |
| K (ppm) | 0.1 | 320 | 27.1508 | 45.8696 | 12 | 28.48769 | 0.077403 |
| Na (ppm) | 31 | 2162 | 357.0469 | 356.1764 | 200 | 53.34115 | 0.118096 |
| Mg (ppm) | 10 | 1194 | 143.7315 | 149.8041 | 100 | 45.25205 | 0.062903 |
| Ca (ppm) | 25 | 1492 | 232.7327 | 221.4152 | 150 | 50.99648 | 0.094106 |
| Cl (ppm) | 60 | 3223 | 515.2345 | 499.0093 | 250 | 56.74091 | 0.105907 |
| SO4 (ppm) | 10 | 4593 | 797.9918 | 769.5791 | 400 | 69.16764 | 0.127304 |
| HCO3 (ppm) | 6 | 2501 | 291.5006 | 347.5571 | 250 | 33.29426 | 0.075755 |
| NO3 (ppm) | 0 | 9 | 1.0341 | 0.7761 | 50 | 0.00 | 0.030624 |
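
To illustrate how the limits and weights in this table combine into a single index, the sketch below uses a formulation that is common in the EWQI literature: a quality rating q_j = 100 × C_j / S_j per parameter and EWQI = Σ w_j q_j. This is an assumption, not a reproduction of the paper's equations. The pH rule (deviation from 7 relative to the upper limit of 8.5) is likewise an assumed convention. Only the limit and weight values are taken from the table.

```python
# Limits and weights copied from the table; pH uses the upper bound 8.5 (assumed).
LIMITS = {"pH": 8.5, "TDS": 1000, "K": 12, "Na": 200, "Mg": 100,
          "Ca": 150, "Cl": 250, "SO4": 400, "HCO3": 250, "NO3": 50}
WEIGHTS = {"pH": 0.209779, "TDS": 0.098124, "K": 0.077403, "Na": 0.118096,
           "Mg": 0.062903, "Ca": 0.094106, "Cl": 0.105907, "SO4": 0.127304,
           "HCO3": 0.075755, "NO3": 0.030624}


def quality_rating(name, conc):
    """Quality rating q_j on a scale where 100 means 'at the limit'."""
    if name == "pH":
        # Assumed pH convention: distance from neutral relative to the limit.
        return abs(conc - 7.0) / (LIMITS["pH"] - 7.0) * 100.0
    return conc / LIMITS[name] * 100.0


def ewqi(sample):
    """Weighted sum of quality ratings for one sample (dict of name -> value)."""
    return sum(WEIGHTS[name] * quality_rating(name, sample[name])
               for name in WEIGHTS)
```

Under these assumptions, a sample whose parameters all sit exactly at their limits scores about 100, because the tabulated weights sum to 1 up to rounding.
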

| Scenario | Model | RMSE | MAE | R² |
|---|---|---|---|---|
| S1 | SVM | 4.1386 | 0.9297 | 0.9991 |
| S1 | BP-MLP | 5.0328 | 3.3509 | 0.9987 |
| S1 | RF | 8.1241 | 3.2019 | 0.9966 |
| S1 | CNN | 9.8056 | 5.6141 | 0.9951 |
| S2 | BP-MLP | 5.0520 | 3.5053 | 0.9987 |
| S2 | SVM | 6.1633 | 1.7722 | 0.9981 |
| S2 | RF | 8.2142 | 3.2709 | 0.9965 |
| S2 | CNN | 11.9060 | 8.3256 | 0.9927 |
| S3 | BP-MLP | 6.0866 | 4.1269 | 0.9981 |
| S3 | SVM | 7.5198 | 2.2760 | 0.9971 |
| S3 | RF | 8.2551 | 3.3523 | 0.9965 |
| S3 | CNN | 17.8854 | 12.0662 | 0.9836 |
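
For reference, the regression metrics reported in this table can be computed as in the sketch below, which uses the standard definitions of RMSE, MAE, and the coefficient of determination. It assumes the table's last column is the coefficient of determination, which the values suggest but the header does not state. The function name is hypothetical, and the authors' actual tooling is not given here.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2
```

Because RMSE squares the residuals, it is never smaller than MAE, and a wide gap between the two (as in the S1 SVM row) usually points to a small number of large errors rather than a uniform bias.
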

| Scenario | Model | Accuracy | Macro-F1 |
|---|---|---|---|
| S1 | SVM | 0.9708 | 0.9728 |
| S1 | CNN | 0.9708 | 0.9649 |
| S1 | RF | 0.9649 | 0.9609 |
| S1 | BP-MLP | 0.9240 | 0.8585 |
| S2 | RF | 0.9591 | 0.9551 |
| S2 | SVM | 0.9357 | 0.9452 |
| S2 | CNN | 0.8947 | 0.8494 |
| S2 | BP-MLP | 0.8772 | 0.8225 |
| S3 | RF | 0.9649 | 0.9609 |
| S3 | SVM | 0.9357 | 0.9336 |
| S3 | CNN | 0.9357 | 0.8742 |
| S3 | BP-MLP | 0.8889 | 0.8557 |
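
Accuracy and macro-F1 can diverge when classes are imbalanced, as they are here (57 Poor samples against 247 Good). The sketch below computes both with the standard definitions: macro-F1 is the unweighted mean of the per-class F1 scores. The function name is hypothetical and the implementation is illustrative, not the authors' code.

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Return (accuracy, macro-averaged F1) for paired class labels."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 if the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)

    return accuracy, sum(f1_scores) / len(f1_scores)
```

Since every class contributes equally to the macro average, a model that misclassifies a rare class is penalized more in macro-F1 than in accuracy. That is one plausible reading of rows such as S1 BP-MLP, where macro-F1 sits well below accuracy.
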

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).