Submitted: 21 August 2025
Posted: 22 August 2025
Abstract
Keywords:
1. Introduction
2. Methodology
2.1. Datasets
2.2. Machine Learning
2.3. Data Augmentation (PADRE Algorithm)
| Dataset | Before PADRE | After PADRE |
|---|---|---|
| DS0 | 398 items × 145 features | 158,006 items × 436 features |
| DS1 | 806 items × 145 features | 648,830 items × 436 features |
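
The item counts after augmentation are consistent with forming every ordered pair of distinct items: 398 × 397 = 158,006 and 806 × 805 = 648,830, the values listed for DS0 and DS1. The following is a minimal sketch of such a pairwise construction, not the authors' implementation. It assumes each pair is described by the two feature vectors and their difference, with the target difference as the label; the function name `pairwise_augment` and the synthetic data are hypothetical. This layout gives 3 × 145 = 435 columns, so the 436th feature in the table is not reproduced here and its definition is left open.

```python
import numpy as np

def pairwise_augment(X, y):
    """Build all ordered pairs (i, j) with i != j.

    Assumed layout (illustrative only): features [x_i, x_j, x_i - x_j],
    target y_i - y_j.
    """
    n = X.shape[0]
    # Row and column indices of every off-diagonal entry: n * (n - 1) pairs.
    i_idx, j_idx = np.nonzero(~np.eye(n, dtype=bool))
    X_pairs = np.hstack([X[i_idx], X[j_idx], X[i_idx] - X[j_idx]])
    y_pairs = y[i_idx] - y[j_idx]
    return X_pairs, y_pairs

# Small synthetic example; 145 features as in the original datasets.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 145))
y = rng.normal(size=20)

X_pairs, y_pairs = pairwise_augment(X, y)
print(X_pairs.shape, y_pairs.shape)

# Pair counts for the dataset sizes reported in the table.
ds0_pairs = 398 * 397
ds1_pairs = 806 * 805
print(ds0_pairs, ds1_pairs)
```

The example uses only 20 items because the full DS0-sized pair matrix would hold roughly 158,006 × 435 floating-point values, which is several hundred megabytes in double precision.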
3. Results and Discussion
3.1. Effect of Increasing the Dataset Size with New Real Data

| Model | MAE | R² |
|---|---|---|
| SVM | 1.13 | 0.82 |
| RFGB | 1.07 | 0.84 |
| RF | 1.47 | 0.74 |
| KNN | 1.82 | 0.64 |
3.2. Effect of Increasing the Dataset Size with Data Augmentation


3.3. Clustering and Further Analysis
3.4. Validation of Model Predictions

4. Conclusions
Supplementary Materials
Acknowledgments
Conflicts of Interest

| K-means++ cluster used as test set | Number of instances | MAE | R² |
|---|---|---|---|
| 1 | 531 | 5.97 | -2.68 |
| 2 | 43 | 2.92 | 0.1 |
| 3 | 69 | 3.26 | 0.01 |
| 4 | 176 | 1.25 | 0.82 |

| Material Class | Number of instances | MAE | R² |
|---|---|---|---|
| A2B | 10 | 1.65 | -4.06 |
| AB | 78 | 3.9 | 0.10 |
| AB2 | 454 | 2.10 | 0.37 |
| AB5 | 106 | 1.80 | -0.5 |
| Mg | 32 | 2.68 | -0.26 |
| MIC | 52 | 3.74 | -0.41 |
| SS | 85 | 2.25 | 0.54 |
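
The cluster-wise evaluation reported above, in which one K-means++ cluster serves as the test set while the model is trained on the remaining clusters, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the data are synthetic, the random forest regressor and its settings are placeholders, and the function name `leave_one_cluster_out` is hypothetical. The sketch also shows how the metrics are obtained; an R² below zero means the model predicts the held-out group worse than a constant equal to that group's mean target.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score

def leave_one_cluster_out(X, y, n_clusters=4, seed=0):
    """Hold out each K-means++ cluster in turn and score on it."""
    labels = KMeans(
        n_clusters=n_clusters, init="k-means++", n_init=10, random_state=seed
    ).fit_predict(X)
    rows = []
    for c in range(n_clusters):
        test = labels == c
        # Placeholder model; the choice of regressor is an assumption.
        model = RandomForestRegressor(n_estimators=50, random_state=seed)
        model.fit(X[~test], y[~test])
        pred = model.predict(X[test])
        rows.append(
            (c, int(test.sum()),
             mean_absolute_error(y[test], pred),
             r2_score(y[test], pred))
        )
    return rows

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)

results = leave_one_cluster_out(X, y)
for cluster, n_test, mae, r2 in results:
    print(f"cluster {cluster}: n={n_test}, MAE={mae:.2f}, R2={r2:.2f}")
```

Because each held-out cluster occupies a region of feature space the model never saw during training, this scheme probes extrapolation rather than interpolation, which is why its scores can fall well below those of a random train/test split.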
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).