Submitted: 07 July 2025
Posted: 08 July 2025
Abstract
Keywords:
1. Introduction
2. Model Description and Dataset Construction
2.1. Reservoir Model
2.2. Decision Variable Selection and Dataset Generation
- Geological Constraints: Variable ranges were defined to comply with reservoir properties. For instance, BHPO was constrained to remain above the bubble point pressure to prevent premature gas breakout and below the reservoir fracture pressure to ensure structural integrity. Similarly, RATEG and RATEW were restricted within the reservoir’s injection capacity to avoid fracturing.
- Engineering Feasibility: Values were chosen to align with practical field operation limits, ensuring system stability. For example, the WGR was constrained by the capacity of surface facilities and well designs, while the IC was optimized to prevent incomplete displacement and uneven fluid distribution.
- Economic Rationality: Parameter ranges were optimized to balance operational costs and economic returns. RATEG and RATEW were adjusted to maximize recovery efficiency while minimizing costs, and IC was optimized to enhance overall production efficiency.
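The three constraint types above translate into fixed lower and upper bounds for each decision variable. The following is a minimal sketch of generating candidate cases inside those bounds. It assumes independent uniform sampling, which is an illustration only, since the sampling design is not specified here. The bounds are taken from the minimum and maximum columns of the dataset statistics table; `BOUNDS` and `sample_cases` are hypothetical names.

```python
import random

# Bounds taken from the minimum/maximum columns of the dataset statistics table.
BOUNDS = {
    "BHPO": (15.0, 20.0),         # MPa
    "WGR": (10.0, 90.0),          # %
    "ORAT": (40.0, 90.0),         # m3/d
    "RATEG": (25000.0, 45000.0),  # m3/d
    "RATEW": (30.0, 60.0),        # m3/d
    "IC": (60.0, 150.0),          # days
}

def sample_cases(n_cases, seed=0):
    """Draw n_cases parameter sets uniformly inside BOUNDS (assumed design)."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in BOUNDS.items()}
        for _ in range(n_cases)
    ]

cases = sample_cases(5)
```

Each sampled case would then be run through the reservoir simulator to obtain its cumulative oil production (OPRO) label.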
2.3. Model Evaluation Metrics
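The model comparison table reports R², MAPE, MAE, RMSE, and MSE. As a minimal sketch, these metrics can be computed with their conventional definitions as shown below. The function name `regression_metrics` is hypothetical, and the sample values are the simulation and model predictions (×10⁶ m³) from the six validation cases. Any scaling the authors applied before computing each metric is not stated, so the output is not expected to reproduce the tabulated values.

```python
import math

def regression_metrics(y_true, y_pred):
    """Conventional regression metrics for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE in percent; assumes no true value is zero.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot
    return {"R2": r2, "MAPE": mape, "MAE": mae, "RMSE": math.sqrt(mse), "MSE": mse}

# Simulation vs. model predictions for the six validation cases (x10^6 m3).
simulated = [1.308, 1.348, 1.424, 1.495, 1.881, 2.355]
predicted = [1.311, 1.349, 1.421, 1.494, 1.881, 2.355]
metrics = regression_metrics(simulated, predicted)
```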
3. Methodology
3.1. XGBoost
- The number of iterations reaches a predefined limit.
- The reduction in residuals falls below a specified threshold.
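The two stopping criteria above can be illustrated with a small additive boosting loop. This is a simplified sketch, not XGBoost itself: it fits one-split regression stumps to the residuals of a squared-error loss and omits regularization and second-order gradient terms. The names `fit_stump` and `boost` and the parameters `max_rounds`, `tol`, and `learning_rate` are assumptions for illustration.

```python
def fit_stump(x, residuals):
    """Least-squares regression stump (one split) on 1-D inputs."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean

def mean_squared_error(y, pred):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / len(y)

def boost(x, y, max_rounds=100, tol=1e-6, learning_rate=0.3):
    """Add stumps until a stopping criterion is met; returns (predictions, rounds)."""
    pred = [sum(y) / len(y)] * len(y)      # start from the mean prediction
    previous = mean_squared_error(y, pred)
    rounds = 0
    for _ in range(max_rounds):            # criterion 1: iteration limit
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        pred = [pi + learning_rate * stump(xi) for xi, pi in zip(x, pred)]
        rounds += 1
        current = mean_squared_error(y, pred)
        if previous - current < tol:       # criterion 2: residual reduction too small
            break
        previous = current
    return pred, rounds

x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0]
pred, rounds = boost(x, y)
```

On this toy step function the loop stops through the second criterion well before the iteration limit, because each round shrinks the residuals by a constant factor.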
3.2. Hyperparameter Optimization Using Metaheuristic Algorithms
3.2.1. Crested Porcupine Optimizer (CPO)
- Data Preprocessing: The dataset is divided into training and testing subsets, and input features are normalized to eliminate dimensional biases and improve model stability.
- Population Initialization: Multiple candidate solutions, each representing a combination of hyperparameters, are randomly generated. Their fitness values are then evaluated to initialize the population.
- Iterative Optimization: The CPO algorithm iteratively updates the population using a four-stage defensive strategy:
- First Defense Phase: Enhances solution diversity and broadens the search range by adjusting the distance between the predator and the target point, incorporating random perturbations.
- Second Defense Phase: Simulates auditory defense behavior to improve local search capability and diversify solutions.
- Third Defense Phase: Combines local perturbations with regional expansion to enhance global search capability.
- Fourth Defense Phase: Simulates elastic collisions, avoiding local optima and improving global search efficiency.
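The data preprocessing and population initialization steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions: min-max scaling fitted on the training subset only, a 20% test split, and a hypothetical `SEARCH_SPACE` whose hyperparameter names and ranges are placeholders rather than the values used in the study. The fitness function is passed in by the caller.

```python
import random

def split_train_test(rows, test_ratio=0.2, seed=0):
    """Shuffle row indices and split them into training and testing subsets."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_test = max(1, round(len(rows) * test_ratio))
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test

def fit_min_max(rows):
    """Column-wise minimum and maximum, computed on the training rows only."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, lows, highs):
    """Scale each feature with the fitted bounds; constant columns map to 0."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# Hypothetical search ranges, chosen only for illustration.
SEARCH_SPACE = {
    "max_depth": (3, 18),          # would be rounded to an integer before use
    "learning_rate": (0.01, 0.4),
    "subsample": (0.3, 1.0),
}

def init_population(pop_size, fitness, seed=0):
    """Randomly generate candidates and evaluate the fitness of each one."""
    rng = random.Random(seed)
    population = [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in SEARCH_SPACE.items()}
        for _ in range(pop_size)
    ]
    scores = [fitness(candidate) for candidate in population]
    return population, scores

# Illustrative feature rows (two features per row).
rows = [[15.0, 10.0], [17.5, 50.0], [20.0, 90.0], [16.0, 30.0], [19.0, 70.0]]
train, test = split_train_test(rows)
lows, highs = fit_min_max(train)
train_scaled = apply_min_max(train, lows, highs)
# Stand-in fitness; in practice this would be a validation error of the trained model.
population, scores = init_population(10, fitness=lambda c: abs(c["learning_rate"] - 0.2))
```

Fitting the scaling bounds on the training subset and reusing them for the test subset avoids leaking test information into the model.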
1. First Defense Phase: This phase enhances solution diversity and expands the search range by adjusting the distance between the predator and the target point, incorporating random factors.
2. Second Defense Phase: Simulating the porcupine’s auditory defense behavior, this phase enhances local search capability and further diversifies solutions.
3. Third Defense Phase: This phase improves global search capability by combining local perturbations with regional expansion.
4. Fourth Defense Phase: This phase simulates elastic collisions to improve global search ability and avoid local optima.

Each phase has its own position-update formula. The quantities used in these formulas are: the best solution at iteration t; the position of the i-th individual at iteration t; the positions of two randomly selected individuals; random values in [0, 1] that control the update magnitude; two random integers within [0, N], where N is the population size; a randomly generated vector whose elements are either 0 or 1; the position of the predator; a parameter that controls the search direction; the odor diffusion factor; the adjustment factor; and the resistance experienced by individual i during iteration t.
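For reference, the four update rules are commonly written as follows in the original CPO publication. The symbol names are assumptions mapped onto the quantities described above and should be checked against that publication: $x_{CP}^{t}$ is the best solution, $x_i^{t}$ the position of individual $i$, $x_{r_1}^{t}$, $x_{r_2}^{t}$, $x_{r_3}^{t}$ randomly selected individuals, $\tau_1,\dots,\tau_5$ random values, $U_1$ the binary vector (applied element-wise), $y_i^{t}$ the predator position, $\delta$ the search-direction parameter, $S_i^{t}$ the odor diffusion factor, $\gamma_t$ the adjustment factor, $F_i^{t}$ the resistance term, and $\alpha$ a convergence-rate factor.

```latex
\begin{align}
x_i^{t+1} &= x_i^{t} + \tau_1 \left| 2\,\tau_2\, x_{CP}^{t} - y_i^{t} \right| \\
x_i^{t+1} &= (1 - U_1)\, x_i^{t} + U_1 \left( y_i^{t} + \tau_3 \left( x_{r_1}^{t} - x_{r_2}^{t} \right) \right) \\
x_i^{t+1} &= (1 - U_1)\, x_i^{t} + U_1 \left( x_{r_1}^{t} + S_i^{t} \left( x_{r_2}^{t} - x_{r_3}^{t} \right) - \tau_3\, \delta\, \gamma_t\, S_i^{t} \right) \\
x_i^{t+1} &= x_{CP}^{t} + \left( \alpha (1 - \tau_4) + \tau_4 \right) \left( \delta\, x_{CP}^{t} - x_i^{t} \right) - \tau_5\, \delta\, \gamma_t\, F_i^{t}
\end{align}
```

The four lines correspond, in order, to the first through fourth defense phases.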
3.2.2. Grey Wolf Optimizer (GWO)
3.2.3. Artificial Hummingbird Algorithm (AHA)
3.2.4. Black Kite Algorithm (BKA)
3.3. Enhanced Algorithm
3.3.1. Population Initialization via Chebyshev Chaotic Mapping
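As a minimal sketch of this kind of initialization, the Chebyshev map in its usual form, x_{k+1} = cos(a · arccos(x_k)) with values in [-1, 1], can generate a chaotic sequence that is then rescaled into each variable's search range. The map order `order=4`, the seed value `x0=0.7`, and the function names are assumptions for illustration, not settings taken from the study.

```python
import math

def chebyshev_sequence(length, x0=0.7, order=4):
    """Iterate x_{k+1} = cos(order * arccos(x_k)); values stay in [-1, 1]."""
    values, x = [], x0
    for _ in range(length):
        x = math.cos(order * math.acos(x))
        values.append(x)
    return values

def chaotic_population(pop_size, bounds, x0=0.7, order=4):
    """Build pop_size individuals by rescaling chaotic values into bounds."""
    dim = len(bounds)
    chaos = chebyshev_sequence(pop_size * dim, x0, order)
    population = []
    for i in range(pop_size):
        individual = []
        for j, (lo, hi) in enumerate(bounds):
            unit = (chaos[i * dim + j] + 1.0) / 2.0  # rescale [-1, 1] -> [0, 1]
            individual.append(lo + unit * (hi - lo))
        population.append(individual)
    return population

# Hypothetical bounds for three hyperparameters.
bounds = [(3.0, 18.0), (0.01, 0.4), (0.3, 1.0)]
population = chaotic_population(20, bounds)
```

Seed values such as 0 or ±1 should be avoided, since the map can collapse onto a fixed point from those starting values.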
3.3.2. Elite Opposition-Based Learning Strategy for Population Optimization
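The sketch below illustrates elite opposition-based learning in the form it is commonly given: for each elite individual, an opposite point k · (a_j + b_j) − x_j is generated per dimension, where a_j and b_j are the dynamic bounds spanned by the elites and k is a random value in [0, 1]. The original and opposite solutions are then merged and the best ones retained. The elite fraction, the out-of-range handling, and the function name are assumptions, and minimization of `fitness` is assumed.

```python
import random

def elite_opposition_step(population, fitness, elite_fraction=0.2, seed=0):
    """Return a population of the same size after one elite-opposition step."""
    rng = random.Random(seed)
    n, dim = len(population), len(population[0])
    ranked = sorted(population, key=fitness)          # best (lowest) first
    elites = ranked[:max(1, int(n * elite_fraction))]
    # Dynamic bounds spanned by the elite individuals in each dimension.
    lows = [min(e[j] for e in elites) for j in range(dim)]
    highs = [max(e[j] for e in elites) for j in range(dim)]
    opposites = []
    for elite in elites:
        k = rng.random()
        candidate = []
        for j in range(dim):
            value = k * (lows[j] + highs[j]) - elite[j]
            if not lows[j] <= value <= highs[j]:
                # Out-of-range opposite points are resampled inside the bounds.
                value = rng.uniform(lows[j], highs[j])
            candidate.append(value)
        opposites.append(candidate)
    merged = sorted(population + opposites, key=fitness)
    return merged[:n]

# Toy usage on a sphere function (stand-in for the model's validation error).
rng = random.Random(1)
population = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(10)]
sphere = lambda v: sum(c * c for c in v)
new_population = elite_opposition_step(population, sphere)
```

Because the original individuals stay in the merged pool, the best fitness in the population cannot get worse after this step.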
3.3.3. ICPO-XGBoost
4. Results and Analysis
4.1. Model Comparison and Analysis
4.2. Model Validation
4.3. Feature Importance Analysis
5. Conclusion
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest

| Components | Fraction | Components | Fraction |
|---|---|---|---|
| CO2 | 0.005 | C10+ | 0.115 |
| C1N2 | 0.253 | C12+ | 0.123 |
| C2+ | 0.083 | C16+ | 0.074 |
| C5+ | 0.112 | C20+ | 0.057 |
| C7+ | 0.178 | | |

| Characteristic parameter | Unit | Mean | Maximum | Minimum | Standard deviation | Skewness | Coefficient of variation |
|---|---|---|---|---|---|---|---|
| BHPO | MPa | 17.4 | 20 | 15 | 1.45 | 0.0048 | 0.0826 |
| WGR | % | 50 | 90 | 10 | 30 | 0.0032 | 0.5778 |
| ORAT | m³/d | 65.1 | 90 | 40 | 14.4 | 0.0017 | 0.2212 |
| RATEG | m³/d | 34974.4 | 45000 | 25000 | 4440.5 | 0.0028 | 0.2887 |
| RATEW | m³/d | 46.1 | 60 | 30 | 8.7 | 0.0031 | 0.4972 |
| IC | day | 106.9 | 150 | 60 | 53.9 | 0.0029 | 0.3849 |
| OPRO | m³ | 1757850.2 | 2354830.2 | 1019188.1 | 293397.1 | -0.2337 | 0.1671 |

| Model | Max learning rate | Learning rate | Warm-up rounds | Early stopping rounds | Sub-sample ratio | Iteration count |
|---|---|---|---|---|---|---|
| XGBoost | 12 | 0.18 | 5 | 20 | 0.4 | 80 |
| CPO-XGBoost | 14 | 0.21 | 8 | 25 | 0.5 | 140 |
| AHA-XGBoost | 17 | 0.27 | 10 | 30 | 0.6 | 145 |
| BKA-XGBoost | 15 | 0.29 | 10 | 30 | 0.4 | 125 |
| GWO-XGBoost | 18 | 0.30 | 10 | 30 | 0.3 | 150 |
| ICPO-XGBoost | 15 | 0.35 | 8 | 25 | 0.5 | 135 |

| Model | Data range | R² | MAPE | MAE | RMSE | MSE |
|---|---|---|---|---|---|---|
| XGBoost | Training set | 0.9343 | 29.54% | 276.71 | 271.24 | 0.0285 |
| | Test set | 0.9321 | 27.68% | 255.12 | 252.36 | 0.0242 |
| | Overall data | 0.9325 | 28.31% | 279.58 | 273.51 | 0.0277 |
| CPO-XGBoost | Training set | 0.9796 | 12.67% | 131.84 | 127.62 | 0.0095 |
| | Test set | 0.9784 | 11.07% | 126.63 | 105.24 | 0.0074 |
| | Overall data | 0.9788 | 12.26% | 149.12 | 129.67 | 0.0081 |
| ICPO-XGBoost | Training set | 0.9902 | 10.21% | 119.92 | 95.53 | 0.0072 |
| | Test set | 0.9894 | 8.47% | 112.65 | 67.94 | 0.0053 |
| | Overall data | 0.9896 | 9.87% | 125.63 | 102.98 | 0.0064 |
| AHA-XGBoost | Training set | 0.9749 | 16.52% | 148.15 | 135.15 | 0.0119 |
| | Test set | 0.9721 | 15.94% | 142.92 | 114.28 | 0.0088 |
| | Overall data | 0.9725 | 16.42% | 151.45 | 142.61 | 0.0126 |
| BKA-XGBoost | Training set | 0.9768 | 14.62% | 141.51 | 132.82 | 0.0129 |
| | Test set | 0.9754 | 14.28% | 138.87 | 125.95 | 0.0115 |
| | Overall data | 0.9758 | 14.59% | 142.12 | 136.08 | 0.0157 |
| GWO-XGBoost | Training set | 0.9632 | 28.42% | 252.81 | 245.34 | 0.0272 |
| | Test set | 0.9627 | 27.33% | 248.92 | 238.26 | 0.0226 |
| | Overall data | 0.9629 | 28.15% | 256.22 | 245.58 | 0.0257 |

| Parameter | Unit | Case1 | Case2 | Case3 | Case4 | Case5 | Case6 |
|---|---|---|---|---|---|---|---|
| BHPO | MPa | 18.5 | 17.3 | 18.4 | 17.6 | 17.5 | 16.6 |
| WGR | % | 30 | 85 | 87 | 90 | 45 | 90 |
| ORAT | m³/d | 82.73 | 89.85 | 42.01 | 44.85 | 78.45 | 82.73 |
| RATEG | m³/d | 29940.96 | 38116.19 | 39678.19 | 33328.05 | 38912.61 | 39940.96 |
| RATEW | m³/d | 45.38 | 41.47 | 41.16 | 56.77 | 51.04 | 52.35 |
| IC | day | 122.27 | 184.55 | 106.01 | 61.52 | 113.28 | 82.27 |
| Simulation prediction | ×10⁶ m³ | 1.308 | 1.348 | 1.424 | 1.495 | 1.881 | 2.355 |
| Model prediction | ×10⁶ m³ | 1.311 | 1.349 | 1.421 | 1.494 | 1.881 | 2.355 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).