Submitted:
29 December 2025
Posted:
31 December 2025
Abstract
Keywords:
1. Introduction
2. Materials and Methods
2.1. Data Collection and Preprocessing
2.1.1. Dataset Assembly and Cleaning
2.1.2. Novel Data Augmentation Protocol
- Gaussian Noise Injection
- Mixup Interpolation
- SMOTE Oversampling
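
The first two augmentation steps listed above can be sketched for tabular mix data as follows. This is a minimal illustration, not the authors' implementation: the function names, the noise scale, and the Beta parameter `alpha` are assumptions. SMOTE is omitted because the source does not state how it was applied to a continuous target.

```python
import numpy as np

def gaussian_noise(X, y, scale=0.01, rng=None):
    """Return a jittered copy of X (hypothetical helper).

    Each column receives zero-mean Gaussian noise whose standard deviation
    is `scale` times that column's own standard deviation.
    """
    rng = np.random.default_rng(rng)
    sigma = scale * X.std(axis=0, keepdims=True)
    return X + rng.normal(0.0, 1.0, size=X.shape) * sigma, y.copy()

def mixup(X, y, n_new, alpha=0.2, rng=None):
    """Create n_new convex combinations of randomly paired samples (Mixup)."""
    rng = np.random.default_rng(rng)
    i = rng.integers(0, len(X), size=n_new)
    j = rng.integers(0, len(X), size=n_new)
    lam = rng.beta(alpha, alpha, size=n_new)  # mixing weight in [0, 1]
    X_new = lam[:, None] * X[i] + (1.0 - lam[:, None]) * X[j]
    y_new = lam * y[i] + (1.0 - lam) * y[j]
    return X_new, y_new

# Toy data standing in for mix proportions (columns) and slump flow (target).
rng = np.random.default_rng(0)
X = rng.uniform(100.0, 500.0, size=(50, 3))
y = rng.uniform(550.0, 750.0, size=50)

X_noisy, y_noisy = gaussian_noise(X, y, scale=0.01, rng=1)
X_mix, y_mix = mixup(X, y, n_new=20, alpha=0.2, rng=2)
```

Because Mixup samples are convex combinations, every synthetic mix stays inside the range spanned by the original data, which keeps augmented points physically plausible as long as the originals are.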
2.2. Machine Learning Model Development
2.2.1. Model Selection and Training
2.2.2. Hyperparameter Optimization
2.3. Model Interpretability and Explainability (SHAP)
2.4. Sustainability Assessment (LCA)
2.5. Multi-Objective Optimization (NSGA-II)
- Maximize slump flow
- Minimize CO₂ emissions
- Minimize material cost
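
The three objectives above define a dominance relation between candidate mixes. The sketch below shows only the non-dominated filtering step that NSGA-II relies on; it does not implement the full algorithm (crowding distance, selection, crossover, mutation). The function name and the toy values are assumptions for illustration and are not taken from the study.

```python
import numpy as np

def pareto_mask(slump, co2, cost):
    """Boolean mask of non-dominated mixes.

    Slump flow is maximized; CO2 and cost are minimized.
    """
    # Negate slump so that every column is a minimization objective.
    F = np.column_stack([
        -np.asarray(slump, dtype=float),
        np.asarray(co2, dtype=float),
        np.asarray(cost, dtype=float),
    ])
    mask = np.ones(len(F), dtype=bool)
    for i in range(len(F)):
        # Mix j dominates mix i if it is no worse in every objective
        # and strictly better in at least one.
        no_worse = (F <= F[i]).all(axis=1)
        better = (F < F[i]).any(axis=1)
        if (no_worse & better).any():
            mask[i] = False
    return mask

# Illustrative values only (mm, kg CO2 per m3, cost per m3).
slump = [700, 650, 700, 600]
co2 = [400, 350, 420, 360]
cost = [80, 75, 85, 78]
front = pareto_mask(slump, co2, cost)
```

In this toy set the first two mixes trade slump flow against CO₂ and cost, so both remain on the front, while the last two are each beaten on all three objectives by another mix.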
2.6. External Validation
2.7. Software and Code Availability
3. Results
3.1. Predictive Performance of the Machine Learning Framework
3.2. Model Interpretability via SHAP Analysis
3.2.1. Global Feature Importance
3.2.2. Feature Dependence and Physical Interpretation
3.3. Multi-Objective Optimization for Sustainable SCC Design
3.3.1. Sustainability Benefits of Pareto-Optimal Mixes


3.3.2. Constrained Single-Objective Optimization
3.4. External Validation Using Industrial SCC Mixes
4. Discussion
4.1. Context, Implications, and Future Work
4.1.1. Contextualization with Previous Studies
- Generalization Proof: The external validation results in Figure 12 demonstrate the model’s successful transfer to industrial SCC mixes from Kuwait. Four independent production mixes from a local ready-mix supplier were predicted, and all predictions fall within the ±100 mm tolerance. The errors are tightly clustered (73.8–84.6 mm), with every prediction lying above the nominal 600 mm target. This indicates that a model trained exclusively on global academic data can generalize to real industrial conditions, a demonstration of real-world applicability that goes beyond cross-validation statistics alone.¹
- Transparency and Interpretability: The global SHAP feature importance in Figure 5 shows that the water-to-binder ratio, superplasticizer dosage, and powder content are consistently dominant, fully aligning with expected rheological behavior and reinforcing confidence in the learned relationships. These findings echo previous explainable-AI analyses on the same dataset, which independently identified water-to-binder ratio, aggregate content, and powder volume as the principal drivers of SCC workability. The close agreement between current SHAP patterns and earlier studies suggests that the improved model is not simply overfitting but is reinforcing physically meaningful trends.
- Integrated Sustainability Assessment: The strong dependence of embodied CO₂ on cement content (Figure 9) and the Pareto front of sustainable SCC designs (Figure 10) illustrate the value of coupling LCA with ML and evolutionary optimization in a unified framework. Compared with the original dataset, the Pareto-optimal solutions achieve reductions in CO₂, energy, and cost while maintaining acceptable workability, indicating that the optimization procedure is not only mathematically sound but also practically beneficial from a sustainability perspective.
4.1.2. Broader Implications of the Findings
- Accelerated Sustainable Design: Engineers can rapidly explore environmentally optimized mixes guided by the Pareto front in Figure 10. These mixes achieve up to 3.9% CO₂ reduction while preserving workability requirements, shortening design cycles and reducing experimental load. In combination with the optimization-validation results, which show that the vast majority of Pareto-optimal solutions satisfy standard SCC acceptance criteria, the framework effectively delivers a ready-to-use design map of feasible, greener alternatives rather than isolated “point” recommendations.
- Enhanced Quality Control: With accurate predictions of SCC workability from mix proportions (Figure 2 and Figure 4), the model can be integrated into batching systems to provide real-time guidance and reduce the risk of non-compliant deliveries. The external validation on Kuwaiti industrial mixes indicates that the predictive errors remain small and consistent even when materials and production conditions differ from those represented in the training data. This stability suggests that the model can function as a soft sensor for quality control, flagging potentially problematic batches before casting and supporting proactive adjustments in plant operations.
- Advancement of Data-Driven Materials Science: SHAP interaction patterns in Figure 6, Figure 7 and Figure 8 expose complex nonlinear effects and thresholds that traditional mixture design methods cannot capture, providing new mechanistic insights and hypothesis-generation opportunities. For example, the observed interaction between powder content and superplasticizer dosage, or between aggregate grading and water-to-binder ratio, may motivate targeted experimental campaigns aimed at refining existing design guidelines and updating empirical limits used in codes and company specifications.
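
The soft-sensor idea in the quality-control item above can be sketched as a simple acceptance check on a predicted slump flow. The function name, the default target of 600 mm with a ±100 mm tolerance (taken from the validation table), and the optional safety margin are illustrative assumptions, not part of the published framework.

```python
def check_batch(predicted_mm, target_mm=600.0, tol_mm=100.0, margin_mm=0.0):
    """Flag a batch from its predicted slump flow (hypothetical helper).

    A batch is compliant when the prediction lies within the tolerance band
    around the target, tightened on both sides by an optional safety margin.
    """
    deviation = predicted_mm - target_mm
    compliant = abs(deviation) <= (tol_mm - margin_mm)
    return {"deviation_mm": deviation, "compliant": compliant}

# Prediction reported for mix Kuwait_K700_1 in the validation table.
plain = check_batch(678.9)
# Same prediction with an assumed 25 mm safety margin.
strict = check_batch(678.9, margin_mm=25.0)
```

A margin of this kind is one way to account for prediction error: a batch that passes the nominal tolerance may still be flagged for inspection when its prediction sits close to the limit.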
4.1.3. Limitations of the Work
- Focus on Fresh Properties Only: The present framework targets workability-related fresh properties. Hardened properties such as compressive strength or durability indicators were not included but are essential for full structural optimization. In particular, the current optimization searches within a feasible fresh-state envelope but does not explicitly enforce long-term mechanical or durability constraints, which must still be checked separately.
- LCA Data Uncertainty: The sustainability assessment is based on regional average emission factors and cost data. Real impacts may vary with supplier-specific processes, transportation distances, and energy mixes. As a result, the absolute values of CO₂, energy, and cost should be interpreted as approximate indicators rather than precise project-specific quantities, and recalibration with local LCA datasets is advised before use in critical infrastructure projects.
- Literature-Derived Dataset: Although large, the dataset is derived from published studies and may therefore carry publication biases or over-representation of certain mix types. Industrial data from under-represented regions and applications (e.g., precast elements, high-powder or low-cement SCC) remain limited. While the Kuwaiti validation partially offsets this limitation by confirming performance on unseen industrial mixes, broader multi-regional validations would further strengthen confidence in global deployment.
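
To illustrate the recalibration advised in the LCA item above, the sketch below computes embodied CO₂ as the sum of constituent mass times emission factor, so that regional factors can be swapped for local ones. The function name, the mix, and every emission factor shown are placeholder values chosen for illustration; none of them come from the study.

```python
def embodied_co2(mix_kg_per_m3, factors_kg_co2_per_kg):
    """Embodied CO2 (kg per m3) as sum of mass x emission factor."""
    missing = set(mix_kg_per_m3) - set(factors_kg_co2_per_kg)
    if missing:
        raise KeyError(f"no emission factor for: {sorted(missing)}")
    return sum(
        mass * factors_kg_co2_per_kg[name]
        for name, mass in mix_kg_per_m3.items()
    )

# Placeholder mix and factors (hypothetical, for illustration only).
mix = {"cement": 400.0, "fly_ash": 100.0, "water": 180.0}
regional = {"cement": 0.9, "fly_ash": 0.03, "water": 0.001}
local = {"cement": 0.8, "fly_ash": 0.03, "water": 0.001}

co2_regional = embodied_co2(mix, regional)
co2_local = embodied_co2(mix, local)
change_pct = 100.0 * (co2_local - co2_regional) / co2_regional
```

Because cement dominates the sum in a mix like this, a change in the cement factor alone shifts the total almost proportionally, which is why supplier-specific data matter for absolute values.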
4.1.4. Future Research Directions
- Integration of Hardened Properties: Extend the framework to predict compressive strength, modulus of elasticity, and durability metrics, enabling fully performance-based optimization of SCC. A natural next step is to embed multi-objective optimization in a joint fresh–hardened property space, balancing workability, mechanical performance, and durability with environmental and economic indicators.
- Advanced Decision Support: Incorporate multi-criteria decision-making (MCDM) methods to help practitioners rank or select solutions from the Pareto front based on project-specific priorities (e.g., carbon-to-cost ratio, robustness to material variability, or construction speed). This would convert the current set of Pareto-optimal mixes into an interactive decision-support tool aligned with stakeholders’ preferences.
- Real-Time Intelligent Batching: Couple the predictive models with sensor-driven feedback from batching plants to automatically adjust mix proportions under material variability. In such a closed-loop system, the ML model would serve as a digital twin of workability, continuously updated with plant measurements and enabling adaptive control strategies that maintain SCC performance despite fluctuations in moisture content, grading, or admixture effectiveness.
- Transfer Learning and Regional Adaptation: Develop transfer learning pipelines to adapt the globally trained model to regional datasets with minimal local data, increasing accessibility for small- and medium-sized concrete producers. The Kuwaiti industrial validation suggests that only modest local calibration may be needed for good performance; formalizing this process through transfer learning, domain adaptation, or active learning would make the framework more scalable and easier to adopt in new regions and for new material systems (e.g., LC3 binders, recycled aggregates, or novel admixtures).
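
As one possible form of the decision-support step described above, the sketch below ranks Pareto-front mixes with a weighted sum over min-max normalized objectives. This is a deliberately simple stand-in for MCDM methods such as TOPSIS; the function name, the weights, and the toy objective values are assumptions for illustration.

```python
import numpy as np

def rank_pareto(F, weights, maximize):
    """Rank solutions by weighted sum of normalized objectives, best first.

    F        : (n_solutions, n_objectives) array of objective values
    weights  : relative importance of each objective (rescaled to sum to 1)
    maximize : True for objectives to maximize, False for those to minimize
    """
    F = np.asarray(F, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    lo, hi = F.min(axis=0), F.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    score = (F - lo) / span                 # 0 = lowest value, 1 = highest
    # Flip minimized objectives so that 1 always means "best".
    score = np.where(np.asarray(maximize, dtype=bool), score, 1.0 - score)
    total = score @ w
    return np.argsort(-total, kind="stable"), total

# Columns: slump flow (maximize), CO2 (minimize), cost (minimize).
front = [[700, 400, 80], [650, 350, 75], [680, 370, 82]]
goals = [True, False, False]

order_equal, _ = rank_pareto(front, weights=[1, 1, 1], maximize=goals)
order_flow, _ = rank_pareto(front, weights=[3, 1, 1], maximize=goals)
```

With equal weights the low-carbon, low-cost mix ranks first; tripling the weight on slump flow moves the most workable mix to the top, which shows how project-specific priorities change the recommendation without changing the front itself.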
5. Conclusions
- Superior Generalization Capability: The framework is built on the largest and most diverse SCC dataset reported to date, comprising 2,506 mixes from 156 global sources. A three-part data augmentation protocol (Gaussian Noise, Mixup, SMOTE) significantly improved robustness and mitigated the effects of dataset heterogeneity. As a result, the XGBoost model achieved excellent predictive performance, with R² = 0.835 for Slump Flow and R² = 0.828 for T50.
- Proven Real-World Applicability: In external validation on four SCC mixes from Kuwait, all predictions fell within the industry-standard tolerance of ±100 mm (Figure 12), with a Mean Absolute Error of 79.9 mm. This provides evidence of the model’s practicality for field adoption.
- Transparent and Physically Grounded Insights: Through comprehensive SHAP analysis (Figure 5, Figure 6, Figure 7 and Figure 8), the framework transitions from a black-box predictor to a transparent engineering tool. The model’s learned relationships are physically meaningful, identifying the water-to-binder ratio and superplasticizer dosage as the dominant parameters influencing SCC workability.
- Holistic Sustainable Optimization: By integrating cradle-to-gate LCA (Figure 9) with NSGA-II multi-objective optimization (Figure 10), the framework generates a Pareto front of 50 non-dominated, sustainability-enhanced mix designs. These optimized mixes achieve average reductions of 3.9% in embodied CO₂ and 2.2% in embodied energy compared to baseline designs.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| SCC | Self-Compacting Concrete |
| ML | Machine Learning |
| XAI | Explainable Artificial Intelligence |
| SHAP | SHapley Additive exPlanations |
| LCA | Life Cycle Assessment |
| NSGA-II | Non-Dominated Sorting Genetic Algorithm II |
References
- El Asri, Y.; Benaicha, M.; Zaher, M.; Hafidi Alaoui, A. Prediction of the compressive strength of self-compacting concrete using artificial neural networks based on rheological parameters. Structural Concrete 2022, 23, 3864–3876.
- Cheng, B.; Mei, L.; Long, W.J.; Kou, S.; Li, L.; Geng, S. AI-guided proportioning and evaluating of self-compacting concrete based on rheological approach. Construction and Building Materials 2023, 399, 132522.
- Safhi, A.E.M.; Dabiri, H.; Soliman, A.; Khayat, K.H. Prediction of self-consolidating concrete properties using XGBoost machine learning algorithm: Part 1–Workability. Construction and Building Materials 2023, 408, 133560.
- Cakiroglu, C.; Bekdaş, G.; Kim, S.; Geem, Z.W. Explainable Ensemble Learning Models for the Rheological Properties of Self-Compacting Concrete. Sustainability 2022, 14, 14640.
- Safhi, A.E.M.; Dabiri, H.; Soliman, A.; Khayat, K.H. Prediction of self-consolidating concrete properties using XGBoost machine learning algorithm: Rheological properties. Powder Technology 2024, 438, 119623.
- Chakravarthy H G, N.; Seenappa, K.M.; Naganna, S.R.; Pruthviraja, D. Machine Learning Models for the Prediction of the Compressive Strength of Self-Compacting Concrete Incorporating Incinerated Bio-Medical Waste Ash. Sustainability 2023, 15, 13621.
- Cui, T.; Kulasegaram, S.; Li, H. Design automation of sustainable self-compacting concrete containing fly ash via data driven performance prediction. Journal of Building Engineering 2024, 87, 108960.
- Cheng, B.; Mei, L.; Long, W.J.; Kou, S.; Luo, Q.; Feng, Y. AI-guided design of low-carbon high-packing-density self-compacting concrete. Journal of Cleaner Production 2023, 428, 139318.
- Jiang, P.; Zhao, D.; Jin, C.; Ye, S.; Luan, C.; Tufail, R.F. Compressive strength prediction and low-carbon optimization of fly ash geopolymer concrete based on big data and ensemble learning. PLOS ONE 2024, 19, e0310422.
- Wang, M.; Du, M.; Jia, Y.; Chang, C.; Zhou, S. Carbon Emission Optimization of Ultra-High-Performance Concrete Using Machine Learning Methods. Materials 2024, 17, 1670.
- Wakjira, T.G.; Kutty, A.A.; Alam, M.S. A novel framework for developing environmentally sustainable and cost-effective ultra-high-performance concrete (UHPC) using advanced machine learning and multi-objective optimization techniques. Construction and Building Materials 2024, 416, 135114.
- Huang, G.; Abou-Chakra, A.; Geoffroy, S.; Absi, J. Improving the mechanical and thermal performance of bio-based concrete through multi-objective optimization. Construction and Building Materials 2024, 421, 135673.
- Helali, S.; Albalawi, S.; Alanazi, M.; Alanazi, B.; Bel Hadj Ali, N. Optimizing Carbon Footprint and Strength in High-Performance Concrete Through Data-Driven Modeling. Sustainability 2025, 17, 7808.
- Wang, S.; Xia, P.; Gong, F.; Zeng, Q.; Chen, K.; Zhao, Y. Multi objective optimization of recycled aggregate concrete based on explainable machine learning. Journal of Cleaner Production 2024, 445, 141045.
- Huang, P.; Dai, K.; Yu, X. Machine learning approach for investigating compressive strength of self-compacting concrete containing supplementary cementitious materials and recycled aggregate. Journal of Building Engineering 2023, 79, 107904.
- Fang, G.H.; Lin, Z.M.; Xie, C.Z.; Han, Q.Z.; Hong, M.Y.; Zhao, X.Y. Optimized Machine Learning Model for Predicting Compressive Strength of Alkali-Activated Concrete Through Multi-Faceted Comparative Analysis. Materials 2024, 17, 5086.
- Pan, B.; Liu, W.; Zhou, P.; Wu, D.O. Predicting the Compressive Strength of Recycled Concrete Using Ensemble Learning Model. IEEE Access 2025, 13, 2958–2969.
- Sun, B.; Cui, W.; Liu, G.; Zhou, B.; Zhao, W. A hybrid strategy of AutoML and SHAP for automated and explainable concrete strength prediction. Case Studies in Construction Materials 2023, 19, e02405.
- Wang, J.; Deng, J.; Li, S.; Du, W.; Zhang, Z.; Liu, X. Explainable Machine Learning for Multicomponent Concrete: Predictive Modeling and Feature Interaction Insights. Materials 2025, 18, 4456.
- Shanthi Vengadeshwari, R.; Ujwal, M.S.; Shiva Kumar, G.; Mahesh, R.; Sanjay, N.; Rajiv, K.N.; Pandit, P. SHAP-based prediction and optimization of compressive strength in M30 concrete with dry sewage sludge as fine aggregate replacement. Discover Materials 2025, 5, 183.
- Hariri-Ardebili, M.A.; Mahdavi, P.; Pourkamali-Anaraki, F. Benchmarking AutoML solutions for concrete strength prediction: Reliability, uncertainty, and dilemma. Construction and Building Materials 2024, 423, 135782.
¹ See the detailed industrial validation summary for Kuwaiti mixes for full numerical metrics and per-mix errors.










| Target Property | Metric | Value | Interpretation |
|---|---|---|---|
| Slump Flow (mm) | R² | 0.835 | Excellent correlation with observed values |
| Slump Flow (mm) | MAE (mm) | 38.2 | Low average absolute error |
| Slump Flow (mm) | RMSE (mm) | 51.9 | Acceptable prediction dispersion |
| T50 (s) | R² | 0.828 | Highly reliable correlation |
| T50 (s) | MAE (s) | 0.21 | Very low absolute error |
| T50 (s) | RMSE (s) | 0.30 | High precision in time prediction |
| V-Funnel (s) | R² | 0.751 | Good correlation for flow time |
| V-Funnel (s) | MAE (s) | 0.35 | Acceptable error range |
| V-Funnel (s) | RMSE (s) | n/a | n/a |
| L-box (ratio) | R² | 0.724 | Acceptable predictive correlation |
| L-box (ratio) | MAE (ratio) | 0.04 | High precision for ratio prediction |
| L-box (ratio) | RMSE | n/a | n/a |


| Mix ID | Target Slump Flow (mm) | Predicted (mm) | Abs. Error (mm) | Within ±100 mm? |
|---|---|---|---|---|
| Kuwait_K700_1 | 600 ± 100 | 678.9 | 78.9 | Yes |
| Kuwait_SRC_Micro | 600 ± 100 | 673.8 | 73.8 | Yes |
| Kuwait_65Nmm2 | 600 ± 100 | 684.6 | 84.6 | Yes |
| Kuwait_SRC_OPC | 600 ± 100 | 682.2 | 82.2 | Yes |
| MAE = 79.9 mm; MRE = 13.3% | | | | |

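
The summary statistics in the validation table can be reproduced directly from its rows. The short check below is a worked example of that arithmetic, assuming the error is measured against the nominal 600 mm target; the variable names are illustrative.

```python
# Predicted slump flow (mm) per mix, as listed in the validation table.
predicted = {
    "Kuwait_K700_1": 678.9,
    "Kuwait_SRC_Micro": 673.8,
    "Kuwait_65Nmm2": 684.6,
    "Kuwait_SRC_OPC": 682.2,
}
target_mm = 600.0   # nominal target from the table
tol_mm = 100.0      # acceptance tolerance from the table

signed = [value - target_mm for value in predicted.values()]
abs_err = [abs(s) for s in signed]

mae = sum(abs_err) / len(abs_err)       # mean absolute error, mm
mre = 100.0 * mae / target_mm           # mean relative error, % (constant target)
mean_signed = sum(signed) / len(signed) # mean signed error, mm
all_within = all(e <= tol_mm for e in abs_err)

print(f"MAE = {mae:.1f} mm, MRE = {mre:.1f}%, within tolerance: {all_within}")
```

The mean signed error equals the MAE here because every prediction lies above the nominal target, so the four errors share the same sign.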
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
