(625e) Fault Detection and Diagnosis of Continuous Processes Via Non-Linear Support Vector Machine Based Feature Selection

Onel, M., Texas A&M Energy Institute, Texas A&M University
Kieslich, C. A., Texas A&M University
Guzman, Y. A., Princeton University
Floudas, C. A., Texas A&M University
Pistikopoulos, E. N., Texas A&M Energy Institute, Texas A&M University
Advances in sensor and data collection technologies have increased the value of data-driven modeling approaches for process monitoring and fault detection in process systems engineering [1]. Today, machine learning and pattern recognition techniques play a significant role in extracting actionable insights and supporting decision-making from the vast amounts of available process data by building accurate and robust data-driven models. One of the most popular machine learning techniques is the Support Vector Machine (SVM) [2-6], which allows the use of high-dimensional feature sets for learning problems such as classification and regression. Yet reducing the dimensionality of the feature space, through dimensionality reduction or feature selection, remains a key task in data-driven modeling: it improves model accuracy and reduces the amount of data that must be collected a priori, which in turn yields enhanced efficiency in chemical processes.

In this work, we present the application of a novel non-linear (kernel-dependent) SVM-based feature selection algorithm [7-8] to process monitoring and fault detection of continuous processes. The methodology is derived from sensitivity analysis of the dual SVM objective and uses existing and novel greedy algorithms to rank features; the resulting ranking also guides fault diagnosis. Specifically, we train two-class SVM models to detect known faults and use one-class support vector data descriptors to characterize normal operation. The manipulated and measured variables of the process constitute the input feature space, and instances of normal and faulty operation serve as training samples for the SVM models. The feature selection algorithm is used both to improve the accuracy of the fault detection models and to perform fault diagnosis. We present results for the Tennessee Eastman process [9] as a case study and compare our approach to existing approaches for fault detection and diagnosis.
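The feature-ranking idea described above can be illustrated with a minimal sketch. It is not the algorithm of [7-8], whose details are not given here. It assumes a bias-free two-class SVM dual solved by projected coordinate ascent, an RBF kernel with a hypothetical width `gamma`, a hypothetical penalty `C`, and a familiar recursive-elimination-style criterion: the change in the dual objective when one feature is dropped from the kernel while the dual variables are held fixed. The three-variable data set is synthetic; variable 0 stands in for a process variable that shifts under a fault.

```python
import math
import random

def rbf_kernel(a, b, active, gamma=0.5):
    """RBF kernel evaluated only on the feature indices in `active`."""
    sq_dist = sum((a[k] - b[k]) ** 2 for k in active)
    return math.exp(-gamma * sq_dist)

def gram(X, y, active):
    """Q[i][j] = y_i * y_j * K(x_i, x_j), restricted to the active features."""
    n = len(X)
    return [[y[i] * y[j] * rbf_kernel(X[i], X[j], active) for j in range(n)]
            for i in range(n)]

def train_dual(Q, C=10.0, sweeps=200):
    """Projected coordinate ascent on the bias-free SVM dual:
    maximize sum(alpha) - 0.5 * alpha^T Q alpha, subject to 0 <= alpha_i <= C."""
    n = len(Q)
    alpha = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            grad = 1.0 - sum(Q[i][j] * alpha[j] for j in range(n))
            alpha[i] = min(C, max(0.0, alpha[i] + grad / Q[i][i]))
    return alpha

def quad_form(Q, alpha):
    """Compute alpha^T Q alpha."""
    n = len(Q)
    return sum(alpha[i] * Q[i][j] * alpha[j] for i in range(n) for j in range(n))

def rank_features(X, y):
    """Greedy backward elimination; returns features from most to least important."""
    active = list(range(len(X[0])))
    removed = []
    while len(active) > 1:
        Q = gram(X, y, active)
        alpha = train_dual(Q)
        base = quad_form(Q, alpha)
        # Sensitivity: change in the dual objective when feature k is dropped
        # from the kernel while alpha is held fixed.
        score = {}
        for k in active:
            rest = [f for f in active if f != k]
            score[k] = 0.5 * abs(base - quad_form(gram(X, y, rest), alpha))
        weakest = min(active, key=score.get)
        active.remove(weakest)
        removed.append(weakest)
    return active + removed[::-1]

# Synthetic data: label +1 = normal, -1 = faulty; only variable 0 shifts.
random.seed(0)
X, y = [], []
for label in (+1, -1):
    for _ in range(10):
        X.append([2.0 * label + random.gauss(0, 0.3),
                  random.gauss(0, 0.3),
                  random.gauss(0, 0.3)])
        y.append(label)

ranking = rank_features(X, y)
print("features, most to least important:", ranking)
```

In this toy setting the top-ranked variable is the one that separates normal from faulty samples, which is the sense in which a feature ranking can point toward the variables implicated in a fault. A practical study would use a full SVM solver with a bias term and tuned hyperparameters.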


[1] Chiang, L. H., Braatz, R. D., & Russell, E. L. (2001). Fault Detection and Diagnosis in Industrial Systems. Springer Science & Business Media.

[2] Vapnik, V. (1995). The Nature of Statistical Learning Theory. Springer.

[3] Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273-297.

[4] Schölkopf, B., Smola, A. J., Williamson, R. C., & Bartlett, P. L. (2000). New support vector algorithms. Neural Computation, 12(5), 1207-1245.

[5] Schölkopf, B., Platt, J. C., Shawe-Taylor, J., Smola, A. J., & Williamson, R. C. (2001). Estimating the support of a high-dimensional distribution. Neural Computation, 13(7), 1443-1471.

[6] Tax, D. M., & Duin, R. P. (2004). Support vector data description. Machine Learning, 54(1), 45-66.

[7] Guzman, Y. A., Kieslich, C. A., & Floudas, C. A. (Submitted). A global optimization framework for feature selection with Support Vector Machines.

[8] Kieslich, C. A., Tamamis, P., Guzman, Y. A., Onel, M., & Floudas, C. A. (2016). Highly Accurate Structure-Based Prediction of HIV-1 Coreceptor Usage Suggests Intermolecular Interactions Driving Tropism. PLoS ONE, 11(2), e0148974.

[9] Downs, J. J., & Vogel, E. F. (1993). A plant-wide industrial process control problem. Computers & Chemical Engineering, 17(3), 245-255.