(615f) Bayesian Fault Detection and Isolation: Test Results for a Simple Benchmark Problem | AIChE

Srinivasan, B. - Presenter, Columbia University
Rengaswamy, R. - Presenter, Texas Tech University
Rieger, C. - Presenter, Idaho National Laboratory
Garcia, H. - Presenter, Idaho National Laboratory
Narasimhan, S. - Presenter, Indian Institute of Technology
Nallasivam, U. - Presenter, Clarkson University
Villez, K. - Presenter, Purdue University

In this contribution, we develop a Bayesian framework for Fault Detection and Isolation (FDI) based on Kalman filtering. The FDI problem is solved in two steps, detection and identification, as is classical in the FDI literature. This development is to be part of a supervisory control system aimed at increasing resiliency in complex, automated systems [1].

For the detection problem, a Bayesian binary classifier is developed. The first class represents the normal, faultless behavior of the system. The conditional likelihood for this class (indexed 0), L(Y|C=0), follows directly from the Kalman filter [2]. Evaluating the likelihood of the alternative class (indexed 1), L(Y|C=1), is less obvious. In theory, one should evaluate and integrate over all alternative behaviors. When a wide range of faulty scenarios must be considered, this is not computable in real time. For this reason, a generic likelihood function is developed which can incorporate all considered abnormal behaviors, including those that cannot be parametrized upfront, at the cost of some loss in specificity. Figure 1 displays both likelihood functions for a single future measurement of a simple tank system (a SISO system). The Kalman filter delivers the likelihood function for faultless operation (C=0) based on a one-step-ahead prediction. The alternative class (C=1) is characterized by a uniform likelihood function over the measuring range of the level sensor (0--2). This function is independent of any previous measurement and expresses the notion that, given a total lack of process state knowledge, the level measurement could take any value in the measuring range. As can be seen, a small region exists where the likelihood for the normal class is higher than that for the abnormal class. If the measurement falls in this region, normal operation is assumed. If it falls outside this region, abnormal operation is assumed and fault diagnosis is initiated.
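The detection rule above can be sketched as follows; the Gaussian one-step-ahead predictive density, the sensor range [0, 2], and all numeric values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def loglik_normal(y, y_pred, s2):
    """Log of the Kalman one-step-ahead predictive density (class C=0),
    a Gaussian with predicted mean y_pred and innovation variance s2."""
    return -0.5 * (np.log(2 * np.pi * s2) + (y - y_pred) ** 2 / s2)

def loglik_abnormal(y, lo=0.0, hi=2.0):
    """Uniform log-likelihood over the sensor measuring range (class C=1)."""
    return -np.log(hi - lo) if lo <= y <= hi else -np.inf

def detect_fault(y, y_pred, s2):
    """Declare abnormal operation when the abnormal class is more likely."""
    return loglik_abnormal(y) > loglik_normal(y, y_pred, s2)
```

For example, with a predicted level of 1.0 and innovation variance 0.01, a measurement of 1.0 is classified as normal, while a measurement of 1.5 falls outside the small high-likelihood region of the normal class and triggers diagnosis.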

Figure 1: Log Likelihood function for the normal (blue) and the abnormal (red) class.

Conventional fault identification tools make use of multiple hypothesis testing: either the likelihood is evaluated for several alternative fault models [3-4], or a specific test statistic is constructed for each fault [5]. The former approach requires considerable computational effort, while the latter requires corrections for multiple hypothesis testing. Instead, we suggest a single fault model composed of a library of faults, including bias, drift, and stiction band, for both sensors and actuators, resulting in 4 parameters for a single sensor or actuator. This includes the one-parameter model for valve stiction as in [6]. The complete fault model is specified as follows:

tf = min{max{θ/α, 0}, 1}   if α > 0
tf = H(θ)                  if α = 0
Z(t) = Y(t) + β·tf
Yf(t) = Yf(t-1)   if |Z(t) - Yf(t-1)| < δ
Yf(t) = Z(t)      if |Z(t) - Yf(t-1)| ≥ δ
Y(t): the faultless signal
Yf(t): the faulty signal
H(.): Heaviside function
α: drift slope parameter
β: bias parameter
δ: stiction band parameter
θ: time elapsed since fault introduction
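A minimal discrete-time sketch of this fault model, assuming unit sampling intervals and a fault start index t0; the function and variable names are ours, not the paper's:

```python
import numpy as np

def apply_fault(y, t0, alpha, beta, delta):
    """Superimpose the combined fault on the faultless signal y:
    a bias/drift term (beta scaled by the ramp tf, with drift slope
    alpha) and a stiction band of width delta. t0 is the sample index
    at which the fault is introduced."""
    y = np.asarray(y, dtype=float)
    yf = np.empty_like(y)
    prev = 0.0
    for t in range(len(y)):
        theta = t - t0  # time elapsed since fault introduction
        if alpha > 0:
            tf = min(max(theta / alpha, 0.0), 1.0)  # ramp clipped to [0, 1]
        else:
            tf = 1.0 if theta >= 0 else 0.0  # Heaviside H(theta)
        z = y[t] + beta * tf
        # stiction: the output only moves when the change exceeds delta
        if t > 0 and abs(z - prev) < delta:
            yf[t] = prev
        else:
            yf[t] = z
        prev = yf[t]
    return yf
```

Setting α = 0 and δ = 0 yields a pure step bias of size β at t0; a positive α turns the step into a ramp, and a positive δ alone reproduces the stiction behavior.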

This model can reproduce both simple faults and combined faults. However, an unmodified maximum likelihood (ML) estimation of the parameters makes the identified faults unnecessarily complex when they are truly simple. To circumvent this problem, a Bayesian regularization approach is taken from regression theory, in which the parameters of the fault model are shrunk to zero. If all parameters are zero, the identified condition is the fault-free condition. By shrinking the parameters to zero, the least complex fault which explains the observations is identified. This corresponds to the application of a Bayesian prior on the parameters of the model [7]. An independent Laplacian prior is used for each parameter, while taking into account that the parameters α, δ and θ are positive and applying lower and upper bounds for θ based on the fault detection results. Indeed, the fault detection step can give useful hints on the start time of the fault. Fault identification now corresponds to finding the parameters of the model that deliver the Maximum A Posteriori (MAP) likelihood.

This likelihood function is shown in Figure 2 for a bias problem and in Figure 3 for a combined bias-drift problem, both evaluated for the correct values of δ and θ. As can be seen, the likelihood is maximal for α equal to zero in the first case, while a non-zero α delivers the maximal value in the second case. As such, finding the MAP parameter set can deliver the right fault classification. By means of the Laplacian prior, this parameter set is expected to enable effective identification of single and combined faults at the same time. Indeed, the application of the Laplacian prior corresponds to the LASSO regression technique, which has been shown to drive to zero the parameters that are ineffective in explaining empirical observations [8]. To increase the chances of finding the global optimum for all parameters, Simulated Annealing (SA) will be applied.
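A bare-bones sketch of this MAP estimation, assuming a simple bias-plus-drift residual model, a Gaussian innovation term, an L1 (Laplacian-prior) penalty, and an illustrative annealing schedule; none of these specifics come from the paper:

```python
import numpy as np

def neg_log_posterior(params, resid, t, lam=2.0, s2=0.01):
    """Gaussian innovation term plus an L1 penalty (the Laplacian prior)
    that shrinks the fault parameters toward zero, as in LASSO.
    params = (beta, slope): bias and drift acting on the residuals."""
    beta, slope = params
    model = beta + slope * t
    return np.sum((resid - model) ** 2) / (2 * s2) + lam * np.sum(np.abs(params))

def anneal(resid, t, n_iter=5000, seed=0):
    """Minimal simulated-annealing loop with a linear cooling schedule."""
    rng = np.random.default_rng(seed)
    x = np.zeros(2)
    fx = neg_log_posterior(x, resid, t)
    best, fbest = x.copy(), fx
    for k in range(n_iter):
        temp = 1.0 * (1.0 - k / n_iter) + 1e-3  # cooling schedule
        cand = x + rng.normal(scale=0.1, size=2)
        fc = neg_log_posterior(cand, resid, t)
        # accept improvements always, worse moves with Boltzmann probability
        if fc < fx or rng.random() < np.exp((fx - fc) / temp):
            x, fx = cand, fc
            if fc < fbest:
                best, fbest = cand.copy(), fc
    return best
```

On residuals containing a pure bias, the penalty keeps the drift slope near zero while the bias parameter is estimated close to its true value, which is the shrinkage behavior described above.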

Figure 2: Likelihood function for a bias fault. Maximum log likelihood (LLH) is found at α = 0, as desired.

Figure 3: Likelihood function for a combined bias and ramp fault. Maximum likelihood is found for α and β values near their true values (1 and 10), as expected.

Both the Bayesian classifier for fault detection and the Bayesian regularization are valid and intuitive tools for fault identification. By means of Bayesian regularization, it becomes possible to discriminate among simple faults (bias, drift, stiction) and combined faults without compromising the identifiability of the faults. This, in turn, is expected to lead to more accurate control actions in the envisioned supervisory control system, and thereby to increased robustness and resiliency in automated systems.


Work supported by the U.S. Department of Energy under DOE Idaho Operations Office Contract DE-AC07-05ID14517, performed as part of the Instrumentation, Control, and Intelligent Systems Distinctive Signature (ICIS) of Idaho National Laboratory.


[1] Rieger, C., Gertman, D. and McQueen, M. (2009). Resilient control systems: Next generation design research. In: Proceedings of the 2nd Conference on Human System Interactions, 632--636.

[2] Harvey, A. C. (1989). Forecasting, structural time series models and the Kalman filter. Cambridge University Press, Cambridge, 554pp.

[3] Prakash, J., Patwardhan, S.C. and Narasimhan, S. (2002). A supervisory approach to Fault-Tolerant Control of Linear Multivariable Systems. Ind. Eng. Chem. Res., 41, 2270-2281.

[4] Prakash, J., Narasimhan, S. and Patwardhan, S.C. (2005). Integrating model based fault diagnosis with Model Predictive Control. Ind. Eng. Chem. Res., 44, 4344-4360.

[5] Gertler, J. (1998). Fault detection and diagnosis in engineering systems. Marcel Dekker, New York, 484pp.

[6] Srinivasan, R., Rengaswamy, R., Narasimhan, S. and Miller, R. (2005). Control loop performance assessment 2: Hammerstein model approach for stiction diagnosis. Ind. Eng. Chem. Res., 44, 6719-6728.

[7] MacKay, D. J. (2003). Information Theory, Inference, and Learning Algorithms. Cambridge University Press, Cambridge, 628pp.

[8] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Royal. Statist. Soc B., 58, 267-288.