Submitted: 08 June 2026
Posted: 09 June 2026
Abstract
Keywords:
1. Introduction
- Establishing an interpretable multidimensional sensitivity analysis and effective feature construction mechanism: To address temporal coupling and nonlinear disturbances in industrial operating data, we integrate dynamic causal identification, parameter influence quantification, and nonlinear importance ranking to quantitatively evaluate each variable’s contribution to energy efficiency objectives. By precisely identifying key influencing factors, this mechanism overcomes the ambiguity of variable selection under traditional experience-driven approaches, reconstructing an effective direct-control feature space centered on device operating frequencies. It accurately isolates environmental disturbances and clarifies the direct mapping between bottom-level devices and system PUE, providing interpretable data support for the paradigm shift from set-point indirect control to critical parameter direct control.
- Designing a device direct-control method under industrial safety constraints: We reformulate system optimization logic from temperature set-point optimization into a black-box optimization problem with multiple hard constraints (air supply safety, physical boundaries and device smoothness). By incorporating multiple industrial safety constraints, we establish a closed-loop collaborative optimization method directly targeting compressor and fan frequencies. This breaks the response delays of traditional cascade control and resolves common oscillation issues in direct-control modes, significantly enhancing industrial applicability. Furthermore, Bayesian optimization’s high sample efficiency and the robustness of surrogate models enable low-cost mapping of optimal compressor–fan frequency combinations, achieving smooth and efficient global energy savings while ensuring thermal safety.
- Conducting industrial application evaluation of energy efficiency and safety using real data: Based on actual operating data from indirect evaporative cooling systems in industrial data centers, we perform dual-dimensional evaluations of energy efficiency and safety. Results show that the proposed secure direct-control optimization method not only meets industrial requirements for energy efficiency and safety but also effectively identifies and coordinates device coupling relationships, achieving intelligent trade-offs between air-side and liquid-side cooling capacity.
2. Related Work
- PID-based Control
- MPC-based Control
- RL-based Control
- BO-based Control
3. Methodology
3.1. Research Object and Objectives
3.1.1. Research Object
- Primary Air (Indoor Side): This stream originates from the hot return air of the data center. It passes through the dry channel of the heat exchanger, where sensible heat is exchanged with the secondary air through the channel walls. Its temperature decreases while absolute humidity remains unchanged, and it is ultimately supplied back to the data hall as cooled air.
- Secondary Air (Outdoor Side): This stream is drawn from outdoor fresh air or partially from primary exhaust. It flows through the wet channel of the heat exchanger, where it is sprayed with circulating water. Evaporation of water on the wet channel surface absorbs substantial latent heat, significantly lowering the secondary air temperature and enabling efficient heat absorption from the primary air. The warmed, humid exhaust is then discharged outdoors.
- Heat Exchanger: The thermodynamic core of the system, enabling non-contact energy transfer between primary and secondary air streams while fully isolating outdoor contaminants.
- Supply Fan (Indoor Fan): Regulates primary air volume, directly determining the supply airflow and cooling delivery efficiency within the data center.
- Exhaust Fan (Outdoor Fan): Controls secondary air volume, establishing negative pressure within the heat exchanger to enhance evaporation, while expelling absorbed heat to the ambient environment.
- Circulating Pump: Delivers cooling water to the spray system. Its operating frequency influences spray coverage and evaporation efficiency.
- Mechanical Refrigeration Backup (Compressor/DX Unit): Comprising a variable-frequency compressor, condenser, and evaporator, this subsystem provides auxiliary cooling when evaporative cooling alone is insufficient (e.g., under hot and humid weather). As the primary energy consumer in the system, precise frequency regulation of the compressor is critical for energy optimization.
3.1.2. Research Objectives
- Boundary Constraints: The operating frequency or speed percentage of all equipment must remain within the physically allowed range:
- Rate-of-Change Constraints: To prevent frequent power-consuming equipment start-stop cycles or excessive mechanical stress, the magnitude of change in the control variables between adjacent control periods (t and t-1) is limited:
- Operational Safety Constraints: Critical thermal environment indicators (mainly the supply air temperature) must always remain within the safety threshold. Since the supply air temperature is a complex nonlinear function of X and Z, this constraint is expressed as:
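The three constraint families above can be sketched as a simple feasibility check. This is a minimal illustration, not the paper's formulation: the class `ActuatorLimits`, the functions `dynamic_bounds` and `is_feasible`, and all numeric limits are hypothetical, and the predicted supply-air temperature is assumed to come from some external model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActuatorLimits:
    """Hypothetical per-device limits (names and values are illustrative)."""
    lower: float     # physical lower bound, e.g. minimum frequency in Hz
    upper: float     # physical upper bound
    max_step: float  # largest allowed change between adjacent control periods


def dynamic_bounds(x_prev: float, lim: ActuatorLimits) -> tuple[float, float]:
    """Intersect the physical range with the rate-of-change window around x_prev."""
    return max(lim.lower, x_prev - lim.max_step), min(lim.upper, x_prev + lim.max_step)


def is_feasible(x_new, x_prev, limits, predicted_supply_temp, temp_limit) -> bool:
    """Check boundary, rate-of-change, and supply-air temperature constraints."""
    for new, prev, lim in zip(x_new, x_prev, limits):
        lo, hi = dynamic_bounds(prev, lim)
        if not lo <= new <= hi:
            return False
    return predicted_supply_temp <= temp_limit


# Assumed example: one compressor (Hz) and one fan (speed %).
limits = [ActuatorLimits(20.0, 50.0, 5.0), ActuatorLimits(30.0, 100.0, 10.0)]
print(is_feasible([42.0, 65.0], [45.0, 60.0], limits, 21.4, 22.0))
```

Intersecting the physical range with the rate-of-change window gives the feasible region for time t, which depends on the previous control action at t-1.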
3.2. Methodology Design
- Data Acquisition and Processing: The dataset collected from the real data center includes external environmental data, HVAC equipment control and operational data, indoor environmental and IT load data. Data processing involves sequential parsing, anomaly detection, missing-value handling, data transformation, and data fusion.
- Multidimensional Sensitivity Analysis and Effective Feature Construction: From temporal, global, and nonlinear perspectives, dynamic causal identification, parameter influence quantification, and nonlinear importance ranking are performed. Cross-validation is then used to construct the final set of effective key features.
- Control Strategy Optimization under Safety Constraints: Industrial safety constraints are embedded into each stage of the Bayesian optimization algorithm, including dataset construction and updating, surrogate model construction and updating, acquisition function design under safety constraints, candidate point selection within dynamic feasible regions, and application feedback. Specifically, sample set construction and updating covers initial dataset generation and iterative updates, while surrogate model construction and updating covers energy-consumption surrogate modeling and safety surrogate modeling.
- Energy Efficiency and Safety Evaluation for Industrial Applications: The principle of industrial energy optimization is to achieve energy savings under the premise of ensuring safety. Accordingly, the evaluation focuses on three key aspects: energy-saving effect evaluation, evaluation of control strategy fluctuation and composite safety index, and evaluation of temperature safety in control strategy application.
3.2.1. Multidimensional Sensitivity Analysis and Effective Feature Construction
3.2.1.1. Dynamic Causality Identification (Granger Causality Test)
- Restricted Model: Uses only the lagged terms of Y for prediction:
- Unrestricted Model: Simultaneously introduces the lagged terms of both Y and the candidate variable X for prediction:
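For reference, the standard textbook form of the two regressions and the SSR-based F statistic is sketched below. The symbols are assumptions rather than the paper's own notation: p is the lag order, n the number of usable observations, and the coefficient and residual names are illustrative.

```latex
\begin{aligned}
\text{Restricted:}\quad & Y_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i Y_{t-i} + \varepsilon_t \\
\text{Unrestricted:}\quad & Y_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i Y_{t-i} + \sum_{j=1}^{p} \beta_j X_{t-j} + u_t \\
\text{F statistic:}\quad & F = \frac{(\mathrm{SSR}_R - \mathrm{SSR}_U)/p}{\mathrm{SSR}_U/(n - 2p - 1)}
\end{aligned}
```

Under the null hypothesis that all coefficients on the lagged X terms are zero, F follows an F(p, n - 2p - 1) distribution. A small p-value indicates that the lagged values of X improve the prediction of Y.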
3.2.1.2. Parameter Influence Quantification (Sobol’ Sensitivity Analysis)
3.2.1.3. Non-linear Importance Ranking (XGBoost Feature Importance)
3.2.2. Control Strategy Optimization under Safety Constraints
- Energy surrogate model: Fits the PUE response surface and outputs the predictive mean and variance.
- Temperature surrogate model: Fits the supply-air temperature response surface and outputs the predictive mean and variance.
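One common way to combine two such surrogates is to weight expected improvement on the objective by the probability that the constraint holds. The text here does not state which acquisition function is used, so the sketch below is an assumption for illustration only. All function names, and the Gaussian treatment of both predictions, are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z: float) -> float:
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def expected_improvement(mu_pue: float, sigma_pue: float, best_pue: float) -> float:
    """EI for minimization: expected amount by which PUE falls below best_pue."""
    gain = best_pue - mu_pue
    if sigma_pue <= 0.0:
        return max(gain, 0.0)
    z = gain / sigma_pue
    return gain * normal_cdf(z) + sigma_pue * normal_pdf(z)


def prob_feasible(mu_temp: float, sigma_temp: float, temp_limit: float) -> float:
    """Probability that the predicted supply-air temperature stays below the limit."""
    if sigma_temp <= 0.0:
        return 1.0 if mu_temp <= temp_limit else 0.0
    return normal_cdf((temp_limit - mu_temp) / sigma_temp)


def constrained_ei(mu_pue, sigma_pue, best_pue, mu_temp, sigma_temp, temp_limit):
    """Expected improvement weighted by the probability of thermal feasibility."""
    return (expected_improvement(mu_pue, sigma_pue, best_pue)
            * prob_feasible(mu_temp, sigma_temp, temp_limit))


# Assumed numbers: the same PUE prediction scored for a safe and an unsafe candidate.
print(constrained_ei(1.30, 0.02, 1.40, 20.0, 0.5, 22.0))
print(constrained_ei(1.30, 0.02, 1.40, 30.0, 0.5, 22.0))
```

With this weighting, a candidate that promises a large PUE reduction but is likely to breach the temperature limit scores close to zero.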
3.2.3. Energy Saving Effect and Safety Evaluation for Industrial Applications
- 2) Evaluation of Control Strategy Fluctuation and Composite Safety Index: To quantify the impact of the control strategy on long-term hardware reliability, a dynamic composite safety index CSI(t) is adopted [31]. This index integrates six core fluctuation metrics (MAD, FII, SD, RSD, CV, CD) and introduces a load-aware weighting mechanism to reflect risk sensitivity under varying operating conditions:
- Temperature Violation Rate (TVR): Defined relative to a preset safety upper limit (e.g., 22 °C). Over the sampling horizon, TVR characterizes the frequency of departures from safe operating conditions:
- Temperature Standard Deviation (TSD): Calculated as the standard deviation of the supply-air temperature time series, this metric evaluates the stability of the constrained surrogate model during control:
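The two temperature metrics above can be computed as follows. This sketch assumes that TVR is the fraction of samples strictly above the upper limit and that TSD is the population standard deviation. The function names and sample values are hypothetical.

```python
from statistics import pstdev


def temperature_violation_rate(temps, upper_limit):
    """Fraction of samples whose supply-air temperature exceeds the upper limit."""
    if not temps:
        raise ValueError("temps must contain at least one sample")
    return sum(t > upper_limit for t in temps) / len(temps)


def temperature_std(temps):
    """Population standard deviation of the supply-air temperature series."""
    return pstdev(temps)


# Assumed example series in degrees Celsius with a 22 °C upper limit.
supply_temps = [21.0, 21.5, 22.5, 21.0]
print(temperature_violation_rate(supply_temps, 22.0))
print(temperature_std(supply_temps))
```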
4. Experimental Results and Analysis
4.1. Data Description and Experimental Setup
- Data Acquisition
- Data Preprocessing
- Environmental Variables (e.g., outdoor dry-bulb temperature, wet-bulb temperature/relative humidity);
- Internal Load Variables (e.g., IT server load);
- System Control Variables (e.g., equipment operating speeds or frequencies);
- System State Variables (e.g., temperature, pressure, flow rate, and power consumption).
- Feature Engineering and Selection
- Dataset Partitioning
- Experimental Environment
- Baseline Algorithms
- PID Control: Representing the conventional feedback mechanism most widely adopted in industry, the PID baseline reflects the standard engineering tuning level based on error regulation. As the original system under study utilizes PID control, this comparison quantifies the improvement of the proposed strategy in mitigating the inherent lag of high-inertia systems.
- RL Control: This baseline utilizes the Deep Deterministic Policy Gradient (DDPG) algorithm. As a state-of-the-art model for continuous action spaces, DDPG represents the frontier of purely data-driven intelligent control. It is used to evaluate the advantages and disadvantages of the proposed method in terms of model generalization and policy safety.
4.2. Results of Multidimensional Sensitivity Analysis and Effective Feature Construction
4.2.1. Dynamic Causality Identification Based on Granger Causality Test
- Environmental and Load Features: Outdoor input air temperature (outdoor_input_air_temp) was highly significant at all lag orders (p<0.001), supporting an immediate and continuous influence of the environment on system energy efficiency. Outdoor humidity and IT load exhibited clear time-lag effects, crossing the significance threshold only at Lag≥2 and Lag≥3, respectively, which is consistent with the physical inertia of heat transfer and temperature control compensation in industrial cooling systems.
- Equipment Control Features: Indoor and outdoor fan speed percentages, along with the frequencies of the four compressor groups (dxc01-04), were significant at every lag order (p<0.05; only the indoor fan at lag 1, with p=0.0388, was not below 0.001), indicating that past adjustments to control parameters carry strong predictive information about PUE fluctuations.
- Redundant Features: Notably, the p-values for pump frequency (pump_frequency) were much greater than 0.05 at all observed lag orders (ranging from 0.5169 to 0.7384). This suggests that, within this specific operating condition or data cycle, changes in pump frequency provide no significant information about the future trend of PUE, indicating a lack of temporal causality. A possible reason is that the pump operates at a constant state for extended periods in this system. Based on this result, this study treats pump_frequency as a redundant feature and excludes it from the final feature set to reduce model dimensionality.
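The lag-wise reading of the p-values can be reproduced with a few lines of code. The sketch below uses a subset of the reported SSR-based F-test p-values for lags 1 to 4. The helper `first_significant_lag` and the 0.05 threshold as a default argument are illustrative choices, not part of the original pipeline.

```python
# Reported p-values for lags 1..4 (subset of the Granger test results for PUE).
P_VALUES = {
    "outdoor_input_air_temp": [0.0006, 0.0000, 0.0000, 0.0000],
    "outdoor_input_air_humidity": [0.2829, 0.0139, 0.0033, 0.0000],
    "IT_load": [0.1240, 0.1233, 0.0000, 0.0000],
    "indoor_fan_speed_percent": [0.0388, 0.0000, 0.0000, 0.0000],
    "pump_frequency": [0.6779, 0.5169, 0.6005, 0.7384],
}


def first_significant_lag(p_values, alpha=0.05):
    """Return the smallest lag (1-based) with p < alpha, or None if there is none."""
    for lag, p in enumerate(p_values, start=1):
        if p < alpha:
            return lag
    return None


first_lags = {name: first_significant_lag(p) for name, p in P_VALUES.items()}
redundant = [name for name, lag in first_lags.items() if lag is None]

print(first_lags)
print(redundant)
```

Variables that never cross the threshold at any tested lag, here only pump_frequency, are the candidates for removal.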
4.2.2. Parameter Influence Quantification Based on Sobol’ Sensitivity Analysis
- Decisive Role of Compressor Power Consumption: As shown in Table 3, the sensitivity indices for the frequencies of the four compressor groups (dxc01-04) occupy an absolutely dominant position. The sum of their total-effect indices (ST) reaches 0.9865, indicating that in this indirect evaporative cooling system, the vast majority of PUE fluctuations are attributed to changes in compressor operating states. From an energy decoupling perspective, this result confirms that compressor frequency is the most critical controlled variable determining system energy efficiency and serves as the core entry point for energy efficiency improvement in the subsequent SDCO algorithm.
- Interaction Response Characteristics of Auxiliary Equipment: The impact of outdoor fan speed percentage (ST=0.0707) and pump frequency (ST=0.0700) on energy consumption ranks in the second tier, with sensitivity significantly lower than that of the compressors. Notably, pump frequency exhibits distinct nonlinear interaction characteristics: its interaction effect (ST−S1=0.0362) is comparable in magnitude to its first-order effect (S1=0.0338). This reflects that, in terms of physical operating logic, the pump is highly dependent on the cooling demand generated by the compressors and the heat dissipation conditions of the outdoor environment, lacking independent strong adjustment attributes. Combined with the non-significant performance of this variable in the previous Granger causality test, this validates the secondary status of pump frequency in the sequential control logic.
- Weak Sensitivity Response of Environmental Parameters and Load: Although IT load and environmental temperature/humidity showed significant time-lag lead relationships in the causality test, their indices in the Sobol’ transient sensitivity analysis are close to zero, and the first-order indices (S1) of outdoor temperature and IT load are even slightly negative (statistical noise). This indicates that in steady-state fluctuations over short time scales, the direct perturbation of environmental parameters on PUE is small; their influence is manifested more indirectly by triggering control strategies (such as frequency adjustments).
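The aggregate figures quoted above follow directly from the reported S1 and ST values. The short check below recomputes the compressor total-effect sum and the interaction share ST−S1 for the auxiliary equipment. The variable names are illustrative.

```python
# Reported Sobol' indices as (S1, ST) pairs for the control variables.
SOBOL = {
    "dxc02comphz_frequency": (0.3088, 0.3284),
    "dxc03comphz_frequency": (0.2540, 0.2992),
    "dxc04comphz_frequency": (0.1408, 0.2074),
    "dxc01comphz_frequency": (0.1225, 0.1515),
    "outdoor_fan_speed_percent": (0.0474, 0.0707),
    "pump_frequency": (0.0338, 0.0700),
    "indoor_fan_speed_percent": (0.0062, 0.0203),
}

# Sum of total-effect indices over the four compressor groups.
compressor_st_sum = sum(st for name, (_, st) in SOBOL.items() if name.startswith("dxc"))

# Interaction contribution of each variable: total effect minus first-order effect.
interaction = {name: round(st - s1, 4) for name, (s1, st) in SOBOL.items()}

print(round(compressor_st_sum, 4))
print(interaction["pump_frequency"])
```

Because total-effect indices count shared interaction terms once per participating variable, their sum over all inputs can exceed 1.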
4.2.3. Non-Linear Importance Ranking Based on XGBoost Gain Ranking
- Dominance of Core Control Variables: Compressor frequency (especially dxc03) demonstrates a decisive influence, with an importance coefficient as high as 0.4462. Combined with the aforementioned Sobol’ global sensitivity analysis, this reinforces that compressor operating status is the core factor determining the transient energy consumption of indirect evaporative cooling systems. The prominent contribution of dxc03 may stem from it bearing the primary peak-shaving load during the data collection period, characterized by higher fluctuation frequency and regulation gain.
- Synergy between Load Drive and Fan Regulation: IT_load (0.1161) ranks at the top of the second tier, with a contribution far exceeding other environmental parameters, indicating that load fluctuation is the source inducing changes in energy consumption. Meanwhile, the importance ranking of the outdoor fan (0.0781) and indoor fan (0.0413) closely follows the compressors, reflecting that in the air-side heat exchange cycle, the adjustment of fan speed percentage has a non-negligible marginal impact on overall system efficiency.
- Identification and Removal of Weakly Correlated Features: Experimental data reveal that the importance coefficients of outdoor input air temperature (0.0015), pump frequency (0.0010), and outdoor humidity (0.0007) all approach zero. This conclusion holds significant physical meaning: within a short 3-minute sampling period, quasi-static changes in environmental temperature and humidity are difficult to reflect immediately in sharp PUE fluctuations via time-delay effects. The extremely low contribution of pump frequency corroborates its non-significant p-value (>0.05) in the Granger causality test, further confirming that this indicator can be eliminated as a redundant feature in subsequent energy consumption modeling.
4.2.4. Comprehensive Feature Identification and Feature Space Construction
- Synergy and Convergence of Core Control Variables: The analysis shows that compressor frequency (dxc01-04), outdoor fan speed percentage, and indoor fan speed percentage are consistent across the three evaluation dimensions. The Granger test established their temporal driving relationships (P < 0.05), the Sobol’ analysis revealed the dominance of the compressor frequencies over total energy consumption fluctuations (led by dxc02 and dxc03), and the XGBoost gain ranking, in which dxc03 ranks first, further verified their critical role in high-dimensional nonlinear mapping. Consequently, these variables are defined as the core control decision variables of the system.
- Redundant Feature Elimination: Regarding pump frequency (pump_frequency), the experimental results show a certain peculiarity. Although Sobol’ sensitivity analysis yielded a moderate ST of 0.0700—reflecting that the pump’s operation itself accounts for a certain share of energy consumption—both the Granger test (P > 0.05) and XGBoost ranking (coefficient of only 0.0010) consistently indicate that this variable contributes almost nothing to dynamic response and predictive gain. This divergence suggests that under the current control logic, the pump may be operating at a constant frequency or in a passive following state, and its adjustment has an extremely weak marginal effect on the system’s total energy consumption fluctuations. Following the principle of parsimony (Occam’s Razor), it is identified as a redundant feature and excluded from subsequent modeling to reduce the dimensionality of the control space.
- Physical Analysis of Environmental and Operating Condition Features: IT_load combines high predictive gain (XGBoost 0.1161) with a low transient fluctuation contribution (Sobol’ ST of 0.0007). This indicates that while IT load does not trigger instantaneous sharp oscillations in energy consumption, it is the core operating-condition variable determining the baseline energy consumption level. Combined with the causality found by the Granger test, it is defined as a key operating condition disturbance variable. The direct predictive contribution of outdoor temperature and humidity is near zero (XGBoost < 0.002), which deviates from traditional physical intuition. One plausible explanation is that the control algorithm of this indirect evaporative cooling system largely decouples PUE from external disturbances. However, considering the physical thermal inertia of the building and heat exchange media, the impact of the outdoor environment may involve significant lag effects.
4.3. Energy Saving Effect and Application Safety Evaluation
4.3.1. Energy-Saving Effect Evaluation
4.3.2. Application Safety Evaluation
4.3.2.1. Evaluation of Control Strategy Fluctuations and Composite Safety Index
4.3.2.2. Evaluation of Temperature Safety in Control Strategy Applications
4.3.3. Comprehensive Discussion
4.3.3.1. Generalizability
- Search precision in complex nonlinear spaces: Experimental data show that SDCO achieves a maximum deviation of only 0.213 °C in high-thermal-inertia cooling systems, far lower than traditional PID. This suggests that when facing large-scale state spaces and nonlinear constraints (e.g., nonlinear coupling between compressor frequency and supply-air temperature), globally coordinated search algorithms can avoid local optima and track target temperatures precisely. This property is highly valuable in industrial scenarios with stringent environmental control requirements.
- Device lifespan and O&M cost trade-off: Evaluation of fluctuation (TSD) reveals that SDCO not only reduces energy consumption but also mitigates compressor mechanical wear by suppressing frequency oscillations. In industrial practice, the quality of a control strategy depends not only on instantaneous efficiency but also on its impact on equipment life-cycle cost (LCC). SDCO’s ability to maintain steady operation provides a technical pathway to reducing long-term hardware failure rates in data centers.
- Effectiveness of thermal inertia compensation: SDCO incorporates lag effects of ambient temperature and humidity into feature engineering, constructing predictive regulation logic that offsets physical delays in cooling cycles. This transition from feedback regulation to predictive-collaborative regulation represents a core direction for the evolution of industrial automation.
4.3.3.2. Limitations
- Potential interference from multi-compressor collinearity: When multiple compressors operate in parallel, strong collinearity among variables may cause local collapse of the search space. Under extreme operating transitions, highly correlated input signals may induce temporary decision stagnation. Further decoupling algorithms or regularization constraints are needed to enhance robustness.
- Trade-off between computational demand and response latency: Compared with the lightweight PID algorithm, SDCO and deep RL require higher computational resources. In high-frequency real-time control systems, convergence speed may become a bottleneck. Balancing search precision with reduced computational overhead is a key challenge for deployment on edge-computing devices.
- Generalization under extreme conditions: Current evaluation is based on typical meteorological data and stable load scenarios. Under sudden hot spots or extreme outdoor heat, conflicts may arise between boundary protection logic and energy-optimization logic (i.e., safety vs. efficiency prioritization). Stress testing in harsher physical simulation environments is required.
5. Conclusions
- Feature reconstruction and selection framework for direct control: To address high redundancy in industrial data, this study integrates Granger causality (temporal identification), Sobol’ sensitivity analysis (global quantification), and XGBoost (nonlinear selection). This framework not only achieves dimensionality reduction but also reveals input-output coupling mechanisms: it identifies IT load as the core feedforward disturbance variable, fixes compressor and fan frequencies as the core direct-control variables, and eliminates redundant variables (e.g., pump frequency).
- Direct energy-saving control and multi-constraint optimization via SDCO: Energy-efficiency optimization is modeled as a black-box problem with multiple engineering constraints (physical boundaries, rate-of-change limits, temperature safety thresholds). An end-to-end safety-constrained Bayesian optimization algorithm is applied, bypassing intermediate temperature setpoints and directly optimizing device frequencies. Experiments confirm that SDCO avoids surrogate model search-space collapse and maintains high precision and stability in nonlinear, non-convex control spaces.
- Validation of superior energy savings and intelligent coordination: Under peak conditions, SDCO achieves a PUE optimization rate of 11.733%, significantly outperforming PID and RL. Its intelligent coordination logic prioritizes low-energy devices, leveraging fan-side cooling potential to reduce compressor frequency. This smooth control avoids PID’s high-frequency oscillations and RL’s response lag, simultaneously saving energy and reducing mechanical wear.
- Establishment of high temperature safety and robustness: Safety evaluation shows that SDCO exhibits strong predictive compensation under environmental disturbances and load surges. Its supply-air temperature deviation is only 0.213 °C, with TSD as low as 0.037, achieving a near-zero violation target. In contrast, PID suffers frequent overshoot due to system inertia, while RL shows instability during condition switching. By explicitly handling boundary constraints, SDCO ensures server thermal safety and demonstrates high reliability and generalization under industrial-scale complex conditions.
Author Contributions
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Gao, Y.; Wang, L.; Chen, J. Energy consumption and cooling efficiency in data centers: A review. Energies 2020, 13, 1–29.
- Duan, Z.; Zhan, C.; Zhang, X.; et al. Indirect evaporative cooling: Past, present and future potentials. Renew. Sustain. Energy Rev. 2012, 16, 6823–6850.
- Johnson, M.A.; Moradi, M.H. PID Control: New Identification and Design Methods; Springer: London, UK, 2005.
- Wei, T.; Wang, Y.; Zhu, Q. Deep reinforcement learning for building HVAC control. Article No. 22; Proceedings of the 54th Annual Design Automation Conference (DAC ’17). New York, NY, USA, ACM, 2017; 4, pp. 1–6.
- Dulac-Arnold, G.; Mankowitz, D.; Hester, T.; et al. Challenges of real-world reinforcement learning. arXiv 2019, arXiv:1904.12901.
- Ang, K.H.; Chong, G.C.Y.; Li, Y. PID control system analysis, design, and technology. IEEE Trans. Control Syst. Technol. 2005, 13, 559–576.
- Afram, A.; Janabi-Sharifi, F. Theory and applications of HVAC control systems – A review of model predictive control (MPC). Build. Environ. 2014, 72, 343–355.
- Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; et al. Continuous control with deep reinforcement learning. arXiv 2015, arXiv:1509.02971.
- Shahriari, B.; Swersky, K.; Wang, Z.; et al. Taking the human out of the loop: A review of Bayesian optimization. Proc. IEEE 2016, 104, 148–175.
- Castellano, S. Investigation of Proportional Integral Derivative Control Stability for Mission Critical Cooling Equipment. In 2015 IEEE IAS Electrical Safety Workshop (ESW); IEEE: New York, NY, USA, 2015; pp. 1–8.
- Bell, G.C.; Storey, B.; Patterson, M.K. Control of Computer Room Air Conditioning using IT Equipment Sensors; UNT Digital Library: Denton, TX, USA, 2017; p. 11.
- Zhao, J.; Chen, Z.; Li, H.; et al. A model predictive control for a multi-chiller system in data center considering whole system energy conservation. Energy Build. 2024, 324, 114919.
- Lazic, N.; Boutilier, C.; Lu, T.; et al. Data center cooling using model-predictive control. In Advances in Neural Information Processing Systems (NeurIPS 2018); Curran Associates, Inc.: Montréal, Canada, 2018; pp. 3818–3827.
- Hübsch, O.; Horovitz, S. Learning and Model Predictive Control Applied to Energy Optimization of Chiller Plants for Data Centers. TFRT-6291; Master’s thesis. Lund University, 2025.
- Gao, J. Machine learning applications for data center optimization. In Google White Paper; 2014; pp. 1–13.
- Gao, Y.; Xu, M.; Wang, Z.; et al. Data center cooling system energy saving using deep reinforcement learning. Appl. Energy 2020, 278, 115607.
- Gao, G.; Li, J.; Wen, Y. DeepComfort: Energy-efficient thermal comfort control in smart buildings via deep reinforcement learning. IEEE Internet Things J. 2020, 7, 8472–8484.
- Du, Y.; Zandi, H.; Kotevska, O.; et al. Intelligent multi-zone residential HVAC control strategy based on deep reinforcement learning. Appl. Energy 2021, 281, 116117.
- Wang, Y.; Lu, Y.; Zhang, W. Adaptive energy optimization for cloud data centers: A deep reinforcement learning approach. IEEE Trans. Sustain. Comput. 2022, 7, 600–612.
- Chervonyi, Y.; Dutta, P.; Trochim, P.; et al. Semi-analytical industrial cooling system model for reinforcement learning. arXiv 2022, arXiv:2207.13131.
- Chauhan, A.K. AI-driven predictive control for data center HVAC systems. Heat Pump. Technol. Mag. 2025, 43, 12–19.
- Tian, Y.; Wang, J.; Qi, Z.; et al. Calibration method for sensor drifting bias in data center cooling system using Bayesian Inference coupling with Autoencoder. J. Build. Eng. 2023, 67, 105961.
- Sharma, D.; Shah, S.L. Control co-design of commercial building chiller plant using Bayesian optimization. Energy Build. 2020, 210, 109736.
- Konstantakopoulos, G.D.; Bletsas, A. Safe contextual Bayesian optimization for PID tuning. IEEE Control Syst. Lett. 2020, 4, 722–727.
- Granger, C.W.J. Investigating causal relations by econometric models and cross-spectral methods. Econometrica 1969, 37, 424–438.
- Sobol, I.M. Sensitivity estimates for nonlinear mathematical models. Math. Model. Comput. Exp. 1993, 1, 407–414.
- Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA; ACM, 2016; pp. 785–794.
- Wang, R.; Zhou, X.; Dong, K.S.; et al. Kalibre: Knowledge-based Neural Surrogate Model Calibration for Data Center Digital Twins. In Proceedings of the 7th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation, New York, NY, USA; ACM, 2020; pp. 220–229.
- Liu, L.X.; Dong, H.; Liu, M.; et al. Comprehensive Assessment of Data Center Energy Optimization Methods for Industrial Applications. Energy Rep. 2026, 12, 88–101.







| Ref. | Control Mode | Experimental Data | Feature Analysis & Construction | Optimization Type & Targets | Energy Savings Metrics | Safety / Volatility Metrics |
|---|---|---|---|---|---|---|
| [12] | MPC | Multi-chiller system (Real-world operational data) | / | Direct Device Control: Chilled water pump frequency, cooling tower fan speed | System COP, Total energy savings rate | Supply pressure fluctuation (Pressure Surge) |
| [13] | MPC | Google Data Center (Real-world data) | / | Set-point Optimization: Supply air temp, condenser water pump pressure set-point | Energy savings percentage | Sensor observation bias, Data confidence intervals |
| [14] | MPC | Chiller system (Simulation data) | / | Set-point Optimization: Thermal storage thresholds, chiller supply water temp | Total cooling power, Peak load reduction rate | Storage system pressure limits, Switching frequency |
| [15] | MPC | Google Data Center (Real-world data) | / | Set-point Optimization: Target ambient temp, chiller load distribution | PUE optimization rate | Hardware lifespan loss, System failure rate |
| [16] | RL | Cooling system (Real-world operational data) | / | Direct Device Control: Compressor frequency, fan speed, pump frequency | Total Power Consumption | Indoor air temperature excursion |
| [17] | RL | Smart Buildings (Simulation data) | Feature Space Reduction: Defined state boundaries to strip redundant features and improve convergence | Hybrid Optimization: PMV target set-point, air flow rate and heater power direct control | Energy Efficiency Ratio (EER) | PMV index deviation, Dew point safety |
| [18] | RL | Multi-zone residential HVAC (Simulation) | / | Hybrid Optimization: Zone target temp set-points, VAV terminal opening direct control | Multi-zone collaborative energy savings | Zone temperature fluctuation variance |
| [19] | RL | Cloud Data Center (Real-world data) | / | Joint Optimization: Server frequency (DVFS) and cooling fan frequency | PUE, Cloud server energy efficiency ratio | Task Latency, System availability |
| [20] | RL | Industrial cooling (Semi-analytical simulation) | / | Direct Device Control: Industrial pump frequency, cooling tower speed % | / | Control action jittering |
| [21] | MPC | Data Center HVAC (Real-world data) | / | Direct Device Control: Chilled pump frequency, cooling tower fan frequency | Comprehensive efficiency, PUE optimization rate | Flow fluctuation rate, Sensor redundancy safety |
| [22] | BO | Cooling system (Hybrid Real/Simulation) | Deep Feature Extraction: Autoencoder for high-dimensional compression and non-linear reconstruction | Sensor Bias Calibration: Calibrated drifting sensors to ensure data-source-level control accuracy | Reduction in over-cooling loss due to perception errors | Sensor Health Index, Fault detection accuracy (F1-score) |
| [23] | BO | HVAC Systems (Real-world data) | / | Set-point Optimization: Min outdoor air flow, return air temp set-point | Daily average HVAC power | Dynamic response time, Duration of temp excursion |
| [24] | BO | PID Tuning (Simulation data) | / | Set-point Optimization: Adaptive tuning of PID gains (Kp, Ki, Kd) | Regulating Power | Constraint Violation frequency |
| Our Paper | BO | Industrial Data Center IECS (Real-world data) | Multi-dimensional Decoupled Analysis: Combined Granger Causality, Sobol’ Sensitivity, and XGBoost for interpretability and rigorous reduction | End-to-End Direct Control: Direct frequency/speed control of internal/external fans and compressors | PUE optimization rate, Total energy reduction | Control stability, Integrated Safety Index, Temperature Violation Rate (TVR), Temp Std Dev (TSD) |

SSR-based F-test for PUE

| Variables | lag=1 | lag=2 | lag=3 | lag=4 |
|---|---|---|---|---|
| outdoor_input_air_temp | 0.0006 | 0.0000 | 0.0000 | 0.0000 |
| outdoor_input_air_humidity | 0.2829 | 0.0139 | 0.0033 | 0.0000 |
| IT_load | 0.1240 | 0.1233 | 0.0000 | 0.0000 |
| indoor_fan_speed_percent | 0.0388 | 0.0000 | 0.0000 | 0.0000 |
| outdoor_fan_speed_percent | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| dxc01comphz_frequency | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| dxc02comphz_frequency | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| dxc03comphz_frequency | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| dxc04comphz_frequency | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| pump_frequency | 0.6779 | 0.5169 | 0.6005 | 0.7384 |
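
One way to read the table above is to record, for each variable, the smallest lag at which the test becomes significant. The sketch below assumes the tabulated entries are p-values of the SSR-based F-test and applies a conventional 0.05 threshold; neither is stated in the table itself, and `first_significant_lag` is a hypothetical helper introduced only for illustration.

```python
# Entries copied from the table; assumed here to be p-values.
P_VALUES = {
    "outdoor_input_air_temp": [0.0006, 0.0000, 0.0000, 0.0000],
    "outdoor_input_air_humidity": [0.2829, 0.0139, 0.0033, 0.0000],
    "IT_load": [0.1240, 0.1233, 0.0000, 0.0000],
    "indoor_fan_speed_percent": [0.0388, 0.0000, 0.0000, 0.0000],
    "outdoor_fan_speed_percent": [0.0000, 0.0000, 0.0000, 0.0000],
    "dxc01comphz_frequency": [0.0000, 0.0000, 0.0000, 0.0000],
    "dxc02comphz_frequency": [0.0000, 0.0000, 0.0000, 0.0000],
    "dxc03comphz_frequency": [0.0000, 0.0000, 0.0000, 0.0000],
    "dxc04comphz_frequency": [0.0000, 0.0000, 0.0000, 0.0000],
    "pump_frequency": [0.6779, 0.5169, 0.6005, 0.7384],
}

ALPHA = 0.05  # assumed significance level, not taken from the paper


def first_significant_lag(p_values, alpha=ALPHA):
    """Return the smallest 1-based lag with p < alpha, or None if no lag qualifies."""
    for lag, p in enumerate(p_values, start=1):
        if p < alpha:
            return lag
    return None


screening = {name: first_significant_lag(p) for name, p in P_VALUES.items()}

for name, lag in screening.items():
    label = "not significant at any tested lag" if lag is None else f"lag={lag}"
    print(f"{name}: {label}")
```

Under these assumptions, the compressor and fan variables pass at lag 1, `IT_load` only from lag 3, and `pump_frequency` at none of the four lags.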

| Variables | First-order index (S1) | Total-order index (ST) |
|---|---|---|
| dxc02comphz_frequency | 0.3088 | 0.3284 |
| dxc03comphz_frequency | 0.2540 | 0.2992 |
| dxc04comphz_frequency | 0.1408 | 0.2074 |
| dxc01comphz_frequency | 0.1225 | 0.1515 |
| outdoor_fan_speed_percent | 0.0474 | 0.0707 |
| pump_frequency | 0.0338 | 0.0700 |
| indoor_fan_speed_percent | 0.0062 | 0.0203 |
| outdoor_input_air_temp | -0.0004 | 0.0072 |
| outdoor_input_air_humidity | 0.0016 | 0.0042 |
| IT_load | -0.0003 | 0.0007 |
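
The gap between the total-order and first-order Sobol' indices, ST − S1, is commonly read as the share of output variance that a variable contributes through interactions with other inputs. The following minimal sketch computes that gap from the tabulated values; `interaction_share` is a hypothetical helper. The slightly negative S1 entries are consistent with estimator noise around zero and are used as reported.

```python
# (S1, ST) pairs copied from the table.
SOBOL = {
    "dxc02comphz_frequency": (0.3088, 0.3284),
    "dxc03comphz_frequency": (0.2540, 0.2992),
    "dxc04comphz_frequency": (0.1408, 0.2074),
    "dxc01comphz_frequency": (0.1225, 0.1515),
    "outdoor_fan_speed_percent": (0.0474, 0.0707),
    "pump_frequency": (0.0338, 0.0700),
    "indoor_fan_speed_percent": (0.0062, 0.0203),
    "outdoor_input_air_temp": (-0.0004, 0.0072),
    "outdoor_input_air_humidity": (0.0016, 0.0042),
    "IT_load": (-0.0003, 0.0007),
}


def interaction_share(s1, st):
    """ST - S1: variance attributed to interactions involving this input."""
    return st - s1


interactions = {
    name: round(interaction_share(s1, st), 4) for name, (s1, st) in SOBOL.items()
}

# Rank variables by how much of their effect comes from interactions.
ranked = sorted(interactions.items(), key=lambda kv: kv[1], reverse=True)

for name, gap in ranked:
    print(f"{name:28s} ST-S1 = {gap:.4f}")
```

On these values the largest gap belongs to `dxc04comphz_frequency`, which suggests that its effect on PUE depends more on the state of the other devices than the remaining inputs do.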

| Variables | Coefficient |
|---|---|
| dxc03comphz_frequency | 0.4462 |
| IT_load | 0.1161 |
| outdoor_fan_speed_percent | 0.0781 |
| dxc01comphz_frequency | 0.0725 |
| dxc04comphz_frequency | 0.0612 |
| dxc02comphz_frequency | 0.0530 |
| indoor_fan_speed_percent | 0.0413 |
| outdoor_input_air_temp | 0.0015 |
| pump_frequency | 0.0010 |
| outdoor_input_air_humidity | 0.0007 |

| Data | Method | PUE | PUE Optimization Rate |
|---|---|---|---|
| 2022.5 | PID | 1.26835 | / |
| 2022.5 | RL | 1.22226 | 3.607% |
| 2022.5 | SDCO | 1.19528 | 5.761% |
| 2022.6 | PID | 1.36608 | / |
| 2022.6 | RL | 1.24777 | 8.661% |
| 2022.6 | SDCO | 1.20580 | 11.733% |
| 2022.7 | PID | 1.34872 | / |
| 2022.7 | RL | 1.29667 | 3.859% |
| 2022.7 | SDCO | 1.25950 | 6.615% |
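
The optimization rates in the table appear to be relative PUE reductions against the PID baseline of the same period. The sketch below states that reading explicitly; the formula is an assumption inferred from the tabulated numbers, not a definition quoted from the paper, and `pue_optimization_rate` is a hypothetical helper.

```python
def pue_optimization_rate(pue_baseline, pue_method):
    """Relative PUE reduction versus the baseline, in percent."""
    return (pue_baseline - pue_method) / pue_baseline * 100.0


# PUE values for the 2022.6 period, copied from the table (PID is the baseline).
pid, rl, sdco = 1.36608, 1.24777, 1.20580

rl_rate = pue_optimization_rate(pid, rl)
sdco_rate = pue_optimization_rate(pid, sdco)

print(f"RL: {rl_rate:.3f}%  SDCO: {sdco_rate:.3f}%")
```

For the 2022.6 period this gives 8.661% for RL and 11.733% for SDCO, matching the tabulated rates.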

| Control Strategies | Method | SD | RSD | CV | MAD | CD | FII | CSI |
|---|---|---|---|---|---|---|---|---|
| indoor_fan_speed_percent | PID | 0.847 | 0.136 | 0.013 | 0.154 | 0.064 | 0.270 | 0.832 |
| indoor_fan_speed_percent | RL | 0 | 0.000 | 0.000 | 0 | 0 | 0 | 1.000 |
| indoor_fan_speed_percent | SDCO | 1.740 | 0.123 | 0.029 | 0.153 | 0.131 | 0.089 | 0.894 |
| outdoor_fan_speed_percent | PID | 2.548 | 0.614 | 0.052 | 0.735 | 0.356 | 0.501 | 0.681 |
| outdoor_fan_speed_percent | RL | 0 | 0.000 | 0.000 | 0 | 0 | 0 | 1.000 |
| outdoor_fan_speed_percent | SDCO | 4.032 | 0.558 | 0.093 | 0.585 | 0.483 | 0.416 | 0.680 |
| dxc01comphz_frequency | PID | 23.842 | 2.353 | 0.305 | 2.488 | 1.511 | 0.259 | 0.474 |
| dxc01comphz_frequency | RL | 0 | 0.000 | 0.000 | 0 | 0 | 0 | 1.000 |
| dxc01comphz_frequency | SDCO | 13.578 | 1.402 | 0.274 | 1.421 | 1.090 | 0.282 | 0.558 |
| dxc02comphz_frequency | PID | 23.165 | 2.888 | 0.324 | 3.206 | 1.921 | 0.293 | 0.424 |
| dxc02comphz_frequency | RL | 2.699 | 0.561 | 0.044 | 0.676 | 0.223 | 0.579 | 0.679 |
| dxc02comphz_frequency | SDCO | 18.453 | 2.995 | 0.286 | 3.114 | 1.968 | 0.450 | 0.393 |
| dxc03comphz_frequency | PID | 11.815 | 5.603 | 0.075 | 7.082 | 0.657 | 0.507 | 0.353 |
| dxc03comphz_frequency | RL | 3.117 | 0.638 | 0.029 | 0.761 | 0.163 | 0.603 | 0.672 |
| dxc03comphz_frequency | SDCO | 17.793 | 2.132 | 0.207 | 2.356 | 0.941 | 0.488 | 0.454 |
| dxc04comphz_frequency | PID | 27.953 | 7.171 | 0.414 | 9.004 | 2.221 | 0.479 | 0.217 |
| dxc04comphz_frequency | RL | 3.185 | 0.659 | 0.029 | 0.788 | 0.151 | 0.586 | 0.673 |
| dxc04comphz_frequency | SDCO | 18.758 | 3.101 | 0.307 | 3.495 | 1.706 | 0.482 | 0.374 |

| Method | Mean | Max_bound | Mean_bound | TVR(0.5) | TVR(0.25) | TSD |
|---|---|---|---|---|---|---|
| PID | 22.028 | 1.078 | 0.131 | 0.400% | 6.550% | 0.164 |
| RL | 22.028 | 0.656 | 0.123 | 0.200% | 6.600% | 0.153 |
| SDCO | 22.106 | 0.213 | 0.106 | 0.000% | 0.000% | 0.037 |
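
As a rough illustration of how safety metrics of this kind can be computed, the sketch below treats TVR(δ) as the percentage of temperature samples deviating from a target by more than δ, and TSD as the population standard deviation of the samples. These definitions, the 22.0 target, and the sample series are all assumptions made for illustration; they are not taken from the paper's data or its formal metric definitions.

```python
import statistics


def temperature_violation_rate(temps, target, tolerance):
    """Percentage of samples whose absolute deviation from target exceeds tolerance."""
    violations = sum(1 for t in temps if abs(t - target) > tolerance)
    return 100.0 * violations / len(temps)


def temperature_std_dev(temps):
    """Population standard deviation of the temperature samples."""
    return statistics.pstdev(temps)


# Hypothetical supply-air temperatures in degrees Celsius, for illustration only.
temps = [22.0, 22.1, 21.9, 22.3, 22.6, 22.0, 21.7, 22.2]
TARGET = 22.0  # assumed target temperature

tvr_050 = temperature_violation_rate(temps, TARGET, 0.5)
tvr_025 = temperature_violation_rate(temps, TARGET, 0.25)
tsd = temperature_std_dev(temps)

print(f"TVR(0.5) = {tvr_050:.3f}%  TVR(0.25) = {tvr_025:.3f}%  TSD = {tsd:.3f}")
```

Under this reading, a TVR of 0.000% at both tolerances means no sample left the corresponding band around the target.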

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).