(383b) Dynamic Bayesian Network Based Networked Process Monitoring, Fault Propagation Identification and Root Cause Diagnosis | AIChE

Mori, J. - Presenter, McMaster University
Yu, J., McMaster University

Principal component analysis (PCA) and partial least squares (PLS) are widely used in multivariate statistical process monitoring (MSPM) to build data-driven models in a low-dimensional subspace that retains the variance or covariance structure of the data. Statistical indices such as T² and the squared prediction error (SPE) are then used to detect abnormal process variations. Once a fault has been detected, contribution plots can be generated to identify the variables most affected by the fault without prior process knowledge. However, because variables interact in intricate ways throughout the process, contribution plots alone may not identify the root causes of faulty operation; doing so typically requires in-depth process knowledge and analysis.
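For reference, the two monitoring indices mentioned above are commonly defined as follows in PCA-based monitoring. The symbols are assumptions introduced here, not notation from the abstract: x is a mean-centered, scaled sample with m variables, P is the m-by-k loading matrix of the k retained principal components, and Λ is the diagonal matrix of the corresponding eigenvalues.

```latex
\begin{align}
  T^2 &= x^{\top} P \,\Lambda^{-1} P^{\top} x,
      & \Lambda &= \operatorname{diag}(\lambda_1, \dots, \lambda_k), \\
  \mathrm{SPE} &= \bigl\lVert \left(I - P P^{\top}\right) x \bigr\rVert^{2}.
\end{align}
```

T² measures variation inside the retained subspace, while SPE measures the residual that the subspace does not explain; a fault is flagged when either index exceeds its control limit.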

To identify cause-effect relationships and diagnose root-cause variables, the signed directed graph (SDG) method has been developed for incipient fault diagnosis. The SDG approach captures both the cause-effect relationships and the direction of each effect [1]. However, because this method identifies candidate faults from a prior fault database, it can be difficult to diagnose faulty operations that the database does not include. A method for fault source identification, propagation analysis and time delay estimation has also been developed [2]. Nevertheless, when the process sampling frequency is high, online implementation becomes challenging because of the heavy computational load. As an alternative, Granger causality methods have been proposed for diagnosing the root cause of plant-wide oscillations [3]. Granger causality, which is based on linear prediction of time series, can extract cause-effect relationships as well as features of the process dynamics. However, because industrial processes are often strongly nonlinear, Granger causality based diagnosis may not always be effective in identifying root causes.

To overcome these limitations, a dynamic Bayesian network (DBN) based networked process monitoring and diagnosis framework is proposed for fault detection, propagation pathway identification and root cause diagnosis. DBNs are graphical models that characterize time-varying dynamics and variable causality under system uncertainty [4]. A DBN is essentially a directed acyclic graph (DAG) whose nodes are connected to neighboring nodes both within the same time slice and across different time slices.
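The graph structure described above implies a factorization of the joint distribution over all variables and time slices. The notation below is an assumption that follows the standard DBN treatment (as in [4]) and does not appear in the abstract: X_t^i is variable i at time slice t, N is the number of variables, T is the number of slices, and Pa(·) denotes the parents of a node.

```latex
\begin{equation}
  P\!\left(X_{1:T}\right)
  = \prod_{t=1}^{T} \prod_{i=1}^{N}
    P\!\left(X_t^{i} \,\middle|\, \mathrm{Pa}\!\left(X_t^{i}\right)\right)
\end{equation}
```

Under a first-order Markov assumption, the parents of a node in slice t lie either in slice t (intra-slice arcs) or in slice t-1 (inter-slice arcs), so the whole model is specified by one two-slice template.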

The first important task is to construct the underlying network structure of the DBN so that it qualitatively characterizes the complex process. The network structure can be designed in two ways: from process knowledge or from data. In the knowledge-based strategy, the qualitative cause-effect relationships are identified from process knowledge and analysis. The network structure can then be inferred from the process flow sheet, with the intra-slice topology (within one time slice) and the inter-slice topology (across time slices) defined over the process measurement variables. Specifically, network arcs are drawn from parent nodes to child nodes according to the cause-effect relationships obtained from process flow diagrams and from the physical or chemical interactions among the monitored variables. The inter-slice topology is then determined to characterize the process dynamics: each node in the previous time slice is connected to the same node in the current time slice. Moreover, to account for inherent process time delays, each parent node in the previous time slice is connected to its child nodes in both the previous and current time slices. If, on the other hand, the cause-effect relationships are difficult to identify because process knowledge is lacking or the process is complicated, the data-driven strategy can be adopted. The basic idea of structure learning is to find the graph that maximizes the likelihood function, but this optimization problem is NP-hard. In this study, heuristic and approximation algorithms are proposed to optimize the network structure. After the dynamic Bayesian network structure is defined, the model parameters, including the conditional probability density functions of all network nodes, are estimated from historical process data.
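The knowledge-based construction rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the node encoding as (variable, slice) pairs, and the example variable names are all hypothetical.

```python
def build_two_slice_arcs(variables, cause_effect):
    """Build the arcs of a two-slice network template.

    Nodes are (variable, slice) pairs; slice 0 is the previous time
    slice and slice 1 is the current one. `cause_effect` is a list of
    (parent, child) pairs taken from process knowledge.
    """
    arcs = set()
    for parent, child in cause_effect:
        # Intra-slice arcs: parent -> child within each time slice.
        arcs.add(((parent, 0), (child, 0)))
        arcs.add(((parent, 1), (child, 1)))
        # Inter-slice arc for time delay: previous parent -> current child.
        arcs.add(((parent, 0), (child, 1)))
    for var in variables:
        # Inter-slice arc for dynamics: each variable feeds its own next value.
        arcs.add(((var, 0), (var, 1)))
    return arcs


def is_acyclic(arcs):
    """Check that the arcs form a DAG using Kahn's topological sort."""
    nodes = {node for arc in arcs for node in arc}
    indegree = {node: 0 for node in nodes}
    children = {node: [] for node in nodes}
    for src, dst in arcs:
        indegree[dst] += 1
        children[src].append(dst)
    queue = [node for node in nodes if indegree[node] == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Every node is visited only if the graph has no directed cycle.
    return visited == len(nodes)


# Hypothetical three-variable chain, for illustration only.
example_variables = ["feed_flow", "reactor_temp", "product_conc"]
example_pairs = [("feed_flow", "reactor_temp"), ("reactor_temp", "product_conc")]
example_arcs = build_two_slice_arcs(example_variables, example_pairs)
```

Checking acyclicity after construction is a useful guard, because cause-effect pairs read off a flow sheet with recycle streams can easily introduce a loop within a time slice.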

Once the DBN model is obtained, the likelihood of a new observation can be computed for fault detection and propagation pathway identification. The smaller the likelihood, the more likely it is that an abnormal event has occurred. A log-likelihood based index, termed the abnormal likelihood index (ALI), is therefore proposed for process fault detection. A dynamic Bayesian probability index (DBPI) is then developed from the conditional probability function of each network node. With the DBPI, probabilistic inference rules are designed to identify fault propagation pathways by searching backwards through the Bayesian network from the downstream process to the upstream process. Finally, the ending nodes of the fault propagation pathways are taken as the root-cause variables that lead to the process upsets.
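The abstract does not give the exact formulas for the ALI or the DBPI, so the following Python sketch only illustrates the general idea under explicit assumptions: each node has a linear-Gaussian conditional density, the detection index is the negative log-likelihood of the observation, and the set of flagged nodes is supplied by the caller rather than derived from a DBPI threshold. The time-slice structure is collapsed into a single slice for brevity, and all names and numbers are hypothetical.

```python
import math


def neg_log_gaussian(x, mean, std):
    """Negative log density of a univariate Gaussian."""
    z = (x - mean) / std
    return 0.5 * z * z + math.log(std) + 0.5 * math.log(2.0 * math.pi)


def abnormal_index(obs, model):
    """Negative log-likelihood of one observation (an ALI-like index).

    Assumes each node is Gaussian with a mean that is linear in its
    parents; a larger value means a less likely observation.
    """
    total = 0.0
    for node, spec in model.items():
        mean = spec["bias"] + sum(w * obs[p] for p, w in spec["weights"].items())
        total += neg_log_gaussian(obs[node], mean, spec["std"])
    return total


def trace_back(start, model, flagged):
    """Search upstream from `start` through flagged parents.

    Returns the nodes where the search ends, i.e. flagged nodes with no
    flagged parent, as root-cause candidates.
    """
    ends, stack, seen = set(), [start], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        flagged_parents = [p for p in model[node]["weights"] if p in flagged]
        if flagged_parents:
            stack.extend(flagged_parents)
        else:
            ends.add(node)
    return ends


# Hypothetical chain A -> B -> C with unit weights and unit noise.
MODEL = {
    "A": {"weights": {}, "bias": 0.0, "std": 1.0},
    "B": {"weights": {"A": 1.0}, "bias": 0.0, "std": 1.0},
    "C": {"weights": {"B": 1.0}, "bias": 0.0, "std": 1.0},
}
normal_index = abnormal_index({"A": 0.0, "B": 0.0, "C": 0.0}, MODEL)
faulty_index = abnormal_index({"A": 4.0, "B": 4.0, "C": 4.0}, MODEL)
root_candidates = trace_back("C", MODEL, flagged={"A", "B", "C"})
```

In this toy chain the shifted observation yields a larger index than the normal one, and the backward search from the downstream node C ends at A, the only flagged node without a flagged parent.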

The proposed dynamic Bayesian network based networked process monitoring and diagnosis method is applied to the Tennessee Eastman Chemical benchmark process. The results demonstrate that it can accurately detect abnormal operating events, identify fault propagation pathways, and ultimately diagnose root-cause variables.


[1] Maurya, M., Rengaswamy, R. and Venkatasubramanian, V. (2004). Application of signed digraphs-based analysis for fault diagnosis of chemical process flowsheets. Engineering Applications of Artificial Intelligence, 17, 501-518.

[2] Stockmann, M., Haber, R. and Schmitz, U. (2012). Source identification of plant-wide faults based on k nearest neighbor time delay estimation. Journal of Process Control, 22, 583-598.

[3] Yuan, T. and Qin, S. (2012). Root cause diagnosis of plant-wide oscillations using Granger causality. 8th IFAC International Symposium on Advanced Control of Chemical Processes 2012, 8, 160-165.

[4] Murphy, K. (2002). Dynamic Bayesian Networks: Representation, Inference and Learning. Ph.D. thesis, University of California, Berkeley.