Submitted: 25 December 2023
Posted: 26 December 2023
Abstract
Keywords:
1. Introduction
1.1. Contextual Framework
1.2. Literature Review
2. Materials and Methods
2.1. Digitised Screen-Based Procedures
2.2. Experimental Setup
- Pressure indicator control failure. In this scenario, the automatic pressure management system in the tank ceases to function. Consequently, the operator must manually modulate the inflow of nitrogen into the tank to preserve the pressure. During this scenario, the cessation of nitrogen flow into the tank results in a pressure drop as the pump continues to channel nitrogen into the plant.
- Nitrogen valve primary source failure. This scenario is a variant of the first one. Here, the primary source of nitrogen to the tank fails, and the operator has to switch to a backup system. While the backup system slowly starts up, the operator has to regulate the pump power to slow the pressure drop inside the tank.
- Temperature indicator control failure in the Heat Recovery section. The operator initially attempts to rectify the issue by manually adjusting the set point of the cooling water flow in the absorber. However, this intervention is ineffective. Consequently, the operator has to contact the supervisor for further guidance. The supervisor informs the operator that the fault is beyond the scope of control room resolution and requires the intervention of a field operator. While the field operator is dispatched to address the issue on-site, the control room operator is tasked with managing the reactor’s temperature. The primary concern in this scenario is the potential for the reactor to overheat. To mitigate this risk, the operator must closely monitor and adjust the cooling water temperature of the reactor.



2.3. Dynamic Influence Diagrams
2.3.1. Influence Diagrams
- A DAG $G = (V, E)$ with nodes $V$ and directed links $E$ encoding dependence relations and information precedence.
- A set of discrete random variables $X_C$ and discrete decision variables $X_D$, such that $X = X_C \cup X_D$, represented by nodes of $G$.
- A set of conditional probability distributions $P$ containing one distribution $P(X_v \mid X_{\mathrm{pa}(v)})$ for each discrete random variable $X_v$ given its parents $X_{\mathrm{pa}(v)}$.
- A set of utility functions $U$ containing one utility function $u(X_{\mathrm{pa}(v)})$ for each node $v$ in the subset $V_U \subset V$ of utility nodes.
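An influence diagram with these four components is commonly read as a compact representation of a joint expected utility function. The notation below is an assumption based on the usual textbook convention, since the symbols are not recoverable from the list above: $X_v$ is the variable at node $v$, $X_{\mathrm{pa}(v)}$ its parents, $V_C$ the chance nodes and $V_U$ the utility nodes.

```latex
\[
EU(X) \;=\; \prod_{v \in V_C} P\bigl(X_v \mid X_{\mathrm{pa}(v)}\bigr)
\sum_{w \in V_U} u\bigl(X_{\mathrm{pa}(w)}\bigr)
\]
```

Solving the diagram then amounts to selecting, for each decision, the option that maximises expected utility given what has been observed when that decision is made.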
2.3.2. Dynamic Influence Diagram
- Decision Node (depicted in pink): This node represents the array of decision options available to an operator at a point in time. Each state within this node has a direct impact on the distribution of physical values within the system, embodying the influence of operator decisions on the process.
- Physical Value Node: This node models the physical parameters of the system. However, rather than using precise values, the states are represented as intervals. This discretization approach is essential for simplifying the model while capturing the necessary detail. States indicating hazardous conditions in this node elevate the probability of adverse consequences.
- Consequence Node: This node encompasses the potential outcomes or consequences (e.g., tank explosion, tank implosion) that may arise from the system’s current state. Each state within this node is associated with a specific cost, reflecting the severity or impact of that outcome.
- Utility Node (colored in green): The utility node is where the cost (or reward) of each potential consequence is integrated. The utility value is computed by considering the probability of each consequence and its corresponding cost, thereby quantifying the overall risk or benefit associated with a particular system state.
- Physical Value Node with Stripes: This node represents the physical values from the previous time step. It is linked to the current physical value node to model the influence of past states on the present. Additionally, it is connected to the decision node to represent the operator’s ability to observe past physical values and make informed decisions in future time steps.
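The interaction between the decision, physical value, consequence and utility nodes can be illustrated with a minimal single time-slice sketch. Every state name, probability and cost below is hypothetical and chosen only for illustration; none of them comes from the model described here, and a real model would be evaluated with an influence diagram engine rather than hand-written loops.

```python
# Hypothetical single time-slice example; all names and numbers are illustrative.

# P(pressure interval | decision): the decision node shifts the distribution
# of the discretised physical value node.
P_PRESSURE = {
    "keep_inflow":     {"low": 0.60, "normal": 0.35, "high": 0.05},
    "increase_inflow": {"low": 0.10, "normal": 0.80, "high": 0.10},
}

# P(consequence | pressure interval): hazardous intervals raise the
# probability of adverse consequences.
P_CONSEQUENCE = {
    "low":    {"implosion": 0.20, "explosion": 0.00, "none": 0.80},
    "normal": {"implosion": 0.00, "explosion": 0.00, "none": 1.00},
    "high":   {"implosion": 0.00, "explosion": 0.15, "none": 0.85},
}

# Utility node: each consequence state carries a cost (negative utility).
COST = {"implosion": -1000.0, "explosion": -5000.0, "none": 0.0}


def expected_utility(decision: str) -> float:
    """Sum probability-weighted costs over intervals and consequences."""
    total = 0.0
    for interval, p_interval in P_PRESSURE[decision].items():
        for consequence, p_cons in P_CONSEQUENCE[interval].items():
            total += p_interval * p_cons * COST[consequence]
    return total


def best_decision() -> str:
    """Return the decision option with the highest expected utility."""
    return max(P_PRESSURE, key=expected_utility)
```

In a dynamic model, the striped node from the previous time step would additionally condition the pressure distribution; the sketch omits this to keep the arithmetic visible.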
2.3.3. Conflict Analysis
2.4. Deep Reinforcement Learning (DRL)
Twin Delayed Deep Deterministic Policy Gradient (TD3) Architecture:
- $Q(s, a)$: expected cumulative reward for action $a$ in state $s$.
- $r$: immediate reward obtained from taking action $a$ in state $s$.
- $\gamma$: discount factor for future rewards.
- $\max_{a'} Q(s', a')$: maximum expected future reward in the next state $s'$.
- $Q_{\theta'}(s', a')$: Q-value predicted by the target critic network.
- $y$: target value for the critic update.
- $(s, a, r, s')$: state, action, reward, and next state, sampled from the replay buffer.
- $\pi_{\phi'}(s')$: action selected by the target policy in the next state.
- $Q_{\theta}(s, a)$: Q-value associated with the selected action and state.
- $\pi_{\phi}(s)$: action selected by the policy in the current state.
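As a rough illustration of how these quantities combine, the following sketch computes the critic target and the actor objective in the way TD3 is usually described: clipped noise on the target action, the minimum of two target critics, and a bootstrapped target. It uses scalar states and actions and plain Python callables in place of neural networks. All function names, parameter names and default values are assumptions made for this example, not the configuration used in this work.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(r, s_next, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               a_low=-1.0, a_high=1.0, rng=random):
    """Target value y for the critic update on one transition."""
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    a_next = clip(target_actor(s_next) + noise, a_low, a_high)
    # Clipped double-Q: use the smaller of the two target critic estimates.
    q_next = min(target_q1(s_next, a_next), target_q2(s_next, a_next))
    # Bootstrapped target; terminal transitions do not bootstrap.
    return r + gamma * (0.0 if done else 1.0) * q_next


def actor_objective(states, actor, q1):
    """Mean Q1(s, pi(s)) over a batch; the actor is updated to increase it."""
    return sum(q1(s, actor(s)) for s in states) / len(states)
```

The "delayed" part of TD3 is not shown: in the usual formulation, the actor and the target networks are updated less frequently than the critics.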
2.4.1. DRL for Process Control
State:
Action:
Reward:
2.4.2. Specialized Reinforcement Learning Agent (SRLA)
SRLA in Process Industry:
3. Statistical Test
4. Result
4.1. Construction of the Model
4.1.1. Operational Framework and Objectives of the DID Model
4.1.2. Fault Detection
4.1.3. Parameter and Structure Specification

4.1.4. Utility
4.1.5. Dynamic Model
4.2. AI Framework
Algorithm 1: Influence Diagram-based Recommendation Algorithm.
4.3. Use of the Model
- First, we use the anomaly detection model to detect potential faults in the system.
- Next, we incorporate both the observed data at T0 and the potential anomaly identified in the previous stage. We then evaluate various actions at T1 for their maximum utility, thereby formulating an optimal set of actions intended to either prevent or mitigate a potentially critical event.
- If a set point is recommended to the operator in the form of an interval, the appropriate reinforcement learning agent is called to narrow it down to a precise value.
- Finally, the operator is presented with the optimal procedure, which outlines the recommended course of action based on the preceding analyses. The detected fault and its potential consequences are also shown to the operator.
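The four steps above can be sketched as a single function that chains the components together. This is only a control-flow illustration: the `Recommendation` type, the `recommend` function and the callables passed to it are hypothetical stand-ins for the anomaly detection model, the influence diagram evaluation and the reinforcement learning agent, none of whose real interfaces are given here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Interval = Tuple[float, float]
Observations = Dict[str, float]


@dataclass
class Recommendation:
    fault: Optional[str]        # fault reported by the anomaly detector, if any
    action: str                 # action with the highest expected utility
    set_point: Optional[float]  # precise value chosen inside the interval


def recommend(
    observations: Observations,
    detect_fault: Callable[[Observations], Optional[str]],
    expected_utility: Callable[[str, Observations, Optional[str]], float],
    actions: Dict[str, Optional[Interval]],
    refine_set_point: Callable[[Observations, Interval], float],
) -> Recommendation:
    # Step 1: detect a potential fault from the current observations.
    fault = detect_fault(observations)
    # Step 2: pick the action that maximises expected utility given the
    # observations and the detected fault.
    best = max(actions, key=lambda a: expected_utility(a, observations, fault))
    # Step 3: if the action comes with a set-point interval, let the
    # (stand-in) RL agent choose a precise value inside it.
    interval = actions[best]
    set_point = None
    if interval is not None:
        set_point = refine_set_point(observations, interval)
    # Step 4: return everything that is shown to the operator.
    return Recommendation(fault=fault, action=best, set_point=set_point)
```

Passing the components in as callables keeps the sketch independent of any particular influence diagram engine or agent implementation.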
5. Experiment
5.1. Participants
5.2. Situation Awareness
- Monitoring:
  - The Shapiro-Wilk test indicates a non-normal distribution for G1 but a normal distribution for G2.
  - Monitoring differs significantly between the groups, with G2 showing lower levels according to both the t-test and the Wilcoxon Rank-Sum Test.
- Planning:
  - Both groups exhibit a normal distribution according to the Shapiro-Wilk test.
  - There is no significant difference in planning, though G2 tends to have slightly lower levels.
- Intervention:
  - The Shapiro-Wilk test suggests a borderline normal distribution for both groups.
  - A significant difference is found, with G2 engaging in less intervention than G1.
- SPAM Index:
  - The Shapiro-Wilk test suggests a normal distribution for both groups.
  - The SPAM Index differs significantly, with G2 having a lower index than G1.
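The way the normality and variance checks feed into the choice of comparison test can be made explicit with a small decision rule. The rule and the 0.05 threshold below are assumptions reflecting a common workflow for these tests; the text does not state the exact criteria that were applied, and the function name is made up for this example.

```python
ALPHA = 0.05  # assumed significance level; not stated in the text


def choose_test(shapiro_a: float, shapiro_b: float, levene: float,
                alpha: float = ALPHA) -> str:
    """Pick a two-sample test from Shapiro-Wilk and Levene p-values."""
    # If either group departs from normality, fall back to the
    # non-parametric Wilcoxon rank-sum test.
    if shapiro_a < alpha or shapiro_b < alpha:
        return "wilcoxon_rank_sum"
    # Normal data with unequal variances: Welch's t-test.
    if levene < alpha:
        return "welch_t"
    # Normal data with comparable variances: Student's t-test.
    return "student_t"
```

Applied to the p-values reported for the situation-awareness measures, this rule would point to the rank-sum test for Monitoring (0.00, 0.10, Levene 0.01), Student's t-test for Planning (0.38, 0.36, Levene 0.13) and Welch's t-test for the SPAM index (0.16, 0.61, Levene 0.03).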

5.3. Workload
- Mental Demand: The statistical tests indicate no significant difference in mental demand between groups G1 and G2.
- Physical Demand: Similarly, there is no significant difference in physical demand between the groups.
- Temporal Demand: The results show no significant difference in temporal demand, suggesting both groups experienced similar time-related pressures.
- Performance: No significant difference in perceived performance is observed between the groups.
- Effort: The effort levels do not differ significantly between the groups.
- Frustration: A significant difference in frustration levels is observed, with G1 experiencing more frustration than G2.
- TLX Index: The overall TLX Index, representing the combined workload, shows no significant difference between the groups.
5.4. Performance
- Reaction time/Response time/Overall Performance: Group 2 outperformed Group 1 overall (Table 4), and the same pattern holds for the reaction and response times. The statistical tests (Table 5) indicate significant differences between the two groups in the time-based and overall performance metrics while solving the scenarios, except in scenario 3, where no significant difference is found. One possible interpretation is that the complexity of this task led to a wide range of behaviors within each group. Overall, the group with decision support performed better than the group without.
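To give a sense of the size of the gap, the mean times in Table 4 can be turned into relative reductions from Group 1 to Group 2. The sketch below only does that arithmetic on the reported means; it ignores the standard deviations and says nothing about significance. The function and dictionary names are made up for this example.

```python
# Mean reaction and response times per scenario and group, copied from the
# M columns of Table 4. The time unit is not stated in the table.
MEANS = {
    "S1": {"G1": {"reaction": 276.30, "response": 339.30},
           "G2": {"reaction": 106.96, "response": 259.04}},
    "S2": {"G1": {"reaction": 266.91, "response": 361.48},
           "G2": {"reaction": 63.43, "response": 154.83}},
    "S3": {"G1": {"reaction": 190.17, "response": 787.87},
           "G2": {"reaction": 117.38, "response": 680.00}},
}


def relative_reduction(scenario: str, metric: str) -> float:
    """Percentage by which the G2 mean is lower than the G1 mean."""
    g1 = MEANS[scenario]["G1"][metric]
    g2 = MEANS[scenario]["G2"][metric]
    return 100.0 * (g1 - g2) / g1


if __name__ == "__main__":
    for scenario in MEANS:
        for metric in ("reaction", "response"):
            pct = relative_reduction(scenario, metric)
            print(f"{scenario} {metric}: {pct:.1f}% lower in G2")
```

On these means, the reaction time of Group 2 is roughly 61% lower in scenario 1, 76% lower in scenario 2 and 38% lower in scenario 3, while the response-time reductions are smaller (roughly 24%, 57% and 14%).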
5.5. Physiological Data



6. Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Meaning |
|---|---|
| DID | Dynamic Influence Diagram |
| DRL | Deep Reinforcement Learning |
| TD3 | Twin Delayed Deep Deterministic Policy Gradient |
| SRLA | Specialized Reinforcement Learning Agent |
| AI | Artificial Intelligence |
| GUI | Graphical User Interface |
| DSS | Decision Support Systems |
| CPT | Conditional Probability Table |

| | Shapiro-Wilk G3 | Shapiro-Wilk G4 | Levene’s Test | t-test | Wilcoxon Rank-Sum Test |
|---|---|---|---|---|---|
| Instability | 0.43 | 0.66 | 0.25 | 0.19 | 0.10 |
| Variability | 0.07 | 0.36 | 0.35 | 0.66 | 0.35 |
| Complexity | 0.22 | 0.39 | 0.63 | 0.75 | 0.42 |
| Arousal | 0.84 | 0.28 | 0.34 | 0.45 | 0.16 |
| Spare_capacity | 0.67 | 0.48 | 0.38 | 0.60 | 0.32 |
| Concentration | 0.016 | 0.18 | 0.58 | (0.97) | 0.33 |
| Attention_division | 0.05 | 0.96 | 0.88 | (0.99) | 0.39 |
| Quantity | 0.28 | 0.29 | 0.08 | 0.72 | 0.36 |
| Quality | 0.27 | 0.28 | 0.20 | 0.37 | 0.22 |
| Familiarity | 0.56 | 0.82 | 0.10 | 0.88 | 0.48 |

| | Shapiro-Wilk G1 | Shapiro-Wilk G2 | Levene’s Test | t-test | Wilcoxon Rank-Sum Test |
|---|---|---|---|---|---|
| Monitoring | 0.00 | 0.10 | 0.01 | (0.00) | 0.00 |
| Planning | 0.38 | 0.36 | 0.13 | 0.14 | 0.09 |
| Intervention | 0.07 | 0.26 | 0.76 | (0.00) | 0.00 |
| SPAM_index | 0.16 | 0.61 | 0.03 | 0.00 | 0.00 |

| | Shapiro-Wilk G3 | Shapiro-Wilk G4 | Levene’s Test | t-test | Wilcoxon Rank-Sum Test |
|---|---|---|---|---|---|
| Mental_demand | 0.74 | 0.37 | 0.52 | 0.38 | 0.18 |
| Physical_demand | 0.04 | 0.00 | 0.20 | (0.56) | 0.47 |
| Temporal_demand | 0.81 | 0.05 | 0.28 | 0.72 | 0.37 |
| Performance | 0.53 | 0.01 | 0.05 | (0.61) | 0.16 |
| Effort | 0.54 | 0.72 | 0.68 | 0.85 | 0.41 |
| Frustration | 0.18 | 0.03 | 0.83 | (0.04) | 0.02 |
| TLX_index | 0.79 | 0.63 | 0.82 | 0.59 | 0.21 |

| Scenario | Group | Reaction time M | Reaction time SD | Response time M | Response time SD |
|---|---|---|---|---|---|
| S1 | G1 | 276.30 | 46.35 | 339.30 | 150.39 |
| S1 | G2 | 106.96 | 130.96 | 259.04 | 170.13 |
| S2 | G1 | 266.91 | 80.42 | 361.48 | 102.57 |
| S2 | G2 | 63.43 | 84.26 | 154.83 | 168.95 |
| S3 | G1 | 190.17 | 117.22 | 787.87 | 207.56 |
| S3 | G2 | 117.38 | 55.78 | 680.00 | 226.43 |

| Wilcoxon Rank-Sum Test | S1 | S2 | S3 |
|---|---|---|---|
| Reaction Time | 0.00 | 0.00 | 0.14 |
| Response Time | 0.07 | 0.00 | 0.08 |
| Overall Performance | 0.00 | 0.05 | 1.00 |

| Comparison | Shapiro-Wilk G1 | Shapiro-Wilk G2 | Levene’s Test | t-test | Wilcoxon Rank-Sum Test |
|---|---|---|---|---|---|
| Heart rate | 0.22 | 0.18 | 0.02 | 0.05 | 0.09 |
| Temperature | 0.46 | 0 | 0.98 | 0.13 | 0.1 |
| EDA | 0 | 0 | 0.19 | 0.17 | 0.21 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
