Submitted:
24 September 2025
Posted:
24 September 2025
Abstract
Keywords:
1. Introduction
2. Materials and Methods
2.1. Models
2.2. Datasets
2.3. Evaluation Metrics
2.4. Evaluation Scenarios
- Case I: Same Dataset Evaluation. Both models are evaluated on the dataset X.
- Case II: Expanded Dataset with Both Models Re-Evaluated. Both models are evaluated on the combined dataset Y.
- Case III: Unequal Dataset Evaluation. Model A is evaluated on X, while Model B is evaluated on Y.
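The three scenarios can be sketched numerically. The snippet below is a minimal illustration, assuming the standard definitions RMSE = sqrt(SSE/n) and NSE = 1 − SSE/SST; the datasets and the constant-error "models" are hypothetical, chosen so that Case III produces the contradiction discussed in Section 4 (larger RMSE yet higher NSE).

```python
import math

def rmse(obs, sim):
    """Root mean square error: sqrt(SSE / n)."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / SST."""
    mean = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean) ** 2 for o in obs)
    return 1.0 - sse / sst

# Hypothetical dataset X and a strong Model A (constant +0.3 error)
obs_x = [1.0, 2.0, 3.0, 4.0, 5.0]
sim_a = [o + 0.3 for o in obs_x]

# Case III: Model B is scored on Y = X u Z, where Z has very large spread
obs_z = [53.0, -47.0]             # mean of Y stays 3, but SST explodes
obs_y = obs_x + obs_z
sim_b = [o + 1.0 for o in obs_y]  # constant +1.0 error: clearly worse

print(rmse(obs_x, sim_a), nse(obs_x, sim_a))  # Model A on X
print(rmse(obs_y, sim_b), nse(obs_y, sim_b))  # Model B on Y: larger RMSE, higher NSE
```

Model B is worse in absolute terms (RMSE 1.0 vs. 0.3) yet scores a higher NSE, because the high-variance block inflates SST in the NSE denominator.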
3. Results
3.1. Case I: Same Dataset Evaluation
3.2. Case II: Expanded Dataset with Both Models Re-Evaluated
3.3. Case III: Unequal Dataset Evaluation
4. Discussion
4.1. Case I and Case II
4.2. Case III
4.3. Implications and Potential for Inflated NSE
5. Algorithmic Demonstration of NSE Inflation
5.1. Algorithm Outline
5.2. Practical Notes and Constraints
- Valid range: the targeted NSE must stay within its admissible range; otherwise the resulting NSE would exceed 1, or the construction becomes infeasible.
- If the right-hand side of Equation (27) is negative, the chosen parameters cannot deliver the targeted NSE; increase the variance contribution of the new block or reduce the target.
- Error assignment on Z: construct an error vector whose squared norm equals the required error budget on Z, then assign its entries as the model residuals on Z. This can be done deterministically (equal-magnitude entries with alternating signs) or stochastically (i.i.d. random draws rescaled to the exact norm).
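The two assignment strategies above can be sketched as follows. This is an illustrative sketch: the function names and the parameter `target_sse` (the required squared norm of the error vector on Z) are not from the paper.

```python
import math
import random

def errors_deterministic(n_z, target_sse):
    # Equal-magnitude entries with alternating signs; the squared norm
    # equals target_sse exactly by construction.
    mag = math.sqrt(target_sse / n_z)
    return [mag if i % 2 == 0 else -mag for i in range(n_z)]

def errors_stochastic(n_z, target_sse, seed=42):
    # i.i.d. Gaussian draws rescaled so the squared norm matches target_sse.
    rng = random.Random(seed)
    draws = [rng.gauss(0.0, 1.0) for _ in range(n_z)]
    norm = math.sqrt(sum(d * d for d in draws))
    return [d * math.sqrt(target_sse) / norm for d in draws]
```

Either vector can then be added to the observations on Z to produce predictions with exactly the prescribed error budget.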
5.3. Minimal Worked Algorithms
All Spread (no mean shift)
Algorithm 1: Variance-Only Construction (All Spread).
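Since the algorithm's pseudocode is not reproduced here, the following is a hedged Python sketch of the variance-only idea: append a mean-preserving, high-spread block to the observations so that SST inflates while the mean is unchanged. Function names and parameter values are illustrative assumptions, not the paper's notation.

```python
def all_spread_block(x_mean, n_z, spread):
    # Mean-preserving pairs (x_mean + spread, x_mean - spread).
    assert n_z % 2 == 0, "use an even block size to preserve the mean"
    half = n_z // 2
    return [x_mean + spread] * half + [x_mean - spread] * half

def nse(obs, sim):
    mean = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean) ** 2 for o in obs)
    return 1.0 - sse / sst

obs_x = [1.0, 2.0, 3.0, 4.0, 5.0]
obs_y = obs_x + all_spread_block(x_mean=3.0, n_z=2, spread=50.0)
sim_x = [o + 0.5 for o in obs_x]  # same constant error everywhere
sim_y = [o + 0.5 for o in obs_y]
# NSE rises purely because the spread-only block inflates SST
```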
All Shift (minimal spread)
Algorithm 2: Mean-Shift-Only Construction (All Shift).
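Analogously, a hedged sketch of the mean-shift-only idea: append a constant block far from the original mean, so SST inflates through the displaced combined mean rather than through spread. Again, names and parameter values are illustrative assumptions.

```python
def all_shift_block(x_mean, n_z, shift):
    # Constant block at x_mean + shift: minimal spread, large mean displacement.
    return [x_mean + shift] * n_z

def nse(obs, sim):
    mean = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean) ** 2 for o in obs)
    return 1.0 - sse / sst

obs_x = [1.0, 2.0, 3.0, 4.0, 5.0]
obs_y = obs_x + all_shift_block(x_mean=3.0, n_z=2, shift=100.0)
sim_x = [o + 0.5 for o in obs_x]
sim_y = [o + 0.5 for o in obs_y]
# The shifted block moves the combined mean, inflating SST and hence NSE
```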
Mixed Spread+Shift (optional)
5.4. Ensuring Larger RMSE Together with Higher NSE
6. Conclusions
- When models are evaluated on the same dataset (Cases I and II), the rankings by RMSE and NSE are always consistent. A lower RMSE necessarily implies a higher NSE, and no contradictory outcomes are possible.
- When models are evaluated on unequal datasets (Case III), contradictions may arise: one model can have a lower RMSE yet simultaneously a lower NSE. The necessary and sufficient condition for this outcome is that the total variance of the expanded dataset more than doubles that of the original dataset. This situation may occur if the new data block has very large variability, a substantial mean shift, or both.
- A strengthened bound was derived showing that, for a targeted increase in NSE, a stricter variance condition must hold, which also guarantees that the combined RMSE exceeds the original RMSE. This result generalizes the variance-doubling rule: when the targeted increase vanishes, the strengthened bound reduces exactly to it. The implication is that an inferior model, already worse in RMSE, can nevertheless appear superior under NSE once the dataset is artificially expanded. In other words, the model remains less accurate in absolute terms yet appears better in relative efficiency, exposing a structural vulnerability of NSE.
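The first conclusion (ranking consistency on a shared dataset) follows directly from NSE = 1 − SSE/SST: with the observations fixed, SST is identical for both models, so the ranking can only flip when the datasets differ. A quick check with two hypothetical constant-error models:

```python
import math

def rmse(obs, sim):
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def nse(obs, sim):
    mean = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean) ** 2 for o in obs)
    return 1.0 - sse / sst

obs = [1.0, 2.0, 3.0, 4.0, 5.0]
sim_a = [o + 0.2 for o in obs]  # smaller error
sim_b = [o - 0.7 for o in obs]  # larger error
# On the same obs, lower RMSE always pairs with higher NSE (shared SST)
```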
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Nash, J.E.; Sutcliffe, J.V. River Flow Forecasting through Conceptual Models Part I—A Discussion of Principles. J. Hydrol. 1970, 10, 282–290.
- Liu, C.-Y.; Ku, C.-Y.; Wu, T.-Y.; Chiu, Y.-J.; Chang, C.-W. Liquefaction susceptibility mapping using artificial neural network for offshore wind farms in Taiwan. Eng. Geol. 2025, 351, 108013.
- Zhang, Q.; Miao, C.; Gou, J.; Zheng, H. Spatiotemporal characteristics and forecasting of short-term meteorological drought in China. J. Hydrol. 2023, 624, 129924.
- Hu, J.; Miao, C.; Zhang, X.; Kong, D. Retrieval of suspended sediment concentrations using remote sensing and machine learning methods: A case study of the lower Yellow River. J. Hydrol. 2023, 627, 130369.
- Sahour, H.; Gholami, V.; Vazifedan, M.; Saeedi, S. Machine learning applications for water-induced soil erosion modeling and mapping. Soil Tillage Res. 2021, 211, 105032.
- Chen, W.; Nguyen, K.A.; Lin, B.-S. Rethinking Evaluation Metrics in Hydrological Deep Learning: Insights from Torrent Flow Velocity Prediction. Sustainability.
- Chen, W.; Nguyen, K.A.; Lin, B.-S. Deep Learning and Optical Flow for River Velocity Estimation: Insights from a Field Case Study. Sustainability 2025, 17, 8181.
- Gupta, H.V.; Kling, H.; Yilmaz, K.K.; Martinez, G.F. Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. J. Hydrol. 2009, 377, 80–91.
- Williams, G.P. Friends Don’t Let Friends Use Nash–Sutcliffe Efficiency (NSE) or KGE for Hydrologic Model Accuracy Evaluation: A Rant with Data and Suggestions for Better Practice. Environ. Model. Softw. 2025, 106, 106665.
- Melsen, L.A.; Puy, A.; Torfs, P.J.J.F.; Saltelli, A. The Rise of the Nash–Sutcliffe Efficiency in Hydrology. Hydrol. Sci. J. 2025, 1–12.
- Onyutha, C. Pros and Cons of Various Efficiency Criteria for Hydrological Model Performance Evaluation. Proc. IAHS 2024, 385, 181–187.
- Lamontagne, J.R.; Barber, C.A.; Vogel, R.M. Improved Estimators of Model Performance Efficiency for Skewed Hydrologic Data. Water Resour. Res. 2020, 56, e2020WR027101.
- Chai, T.; Draxler, R.R. Root Mean Square Error (RMSE) or Mean Absolute Error (MAE)?—Arguments Against Avoiding RMSE in the Literature. Geosci. Model Dev. 2014, 7, 1247–1250.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).