(710a) Online Outlier Detection with a Bayesian Supervisory Approach for Recursive Soft Sensor Update
Partial least squares (PLS) based soft sensors, which predict the primary variables of a process from secondary measurements, have attracted increasing research interest in recent years. Such data-driven soft sensors are easy to develop and require only a good historical data set. In our previous work, a reduced-order dynamic PLS (RO-DPLS) soft sensor was developed to address some limitations of the traditional DPLS soft sensor when applied to processes with large transport delays. By taking the process characteristics into account, the RO-DPLS soft sensor can significantly reduce the number of regressor variables and improve prediction performance.
Because industrial processes often change over time, it is desirable to update the soft sensor model with new process data once the soft sensor is implemented online. Recently, we extended the RO-DPLS soft sensor to an online adaptive version in order to track process changes. Because our focus in that work was to investigate the properties of different recursive updating schemes and data scaling methods, we preprocessed the industrial datasets to remove all outliers before using them in the experiments.
However, it has been recognized that PLS algorithms are sensitive to outliers in the dataset. Outlier detection and handling therefore play a critical role in the development of PLS-based soft sensors, and outlier detection for off-line model building has been studied extensively. Despite the many published results, outlier detection remains an unsolved problem, and it is often recommended to accompany any outlier detection method with a graphical inspection of the residual space and model parameters to eliminate any possible outlier masking effect (i.e., outliers are classified as consistent samples) and outlier swamping effect (i.e., consistent samples are classified as outliers).
In this work, we propose multivariate approaches for both off-line outlier detection (for initial soft sensor model building) and online outlier detection (for recursive soft sensor model update). Specifically, for off-line outlier detection we combine leverage and y-studentized residuals, while for online outlier detection we use squared prediction error (SPE) indices for X and Y to monitor the independent and dependent variable spaces, respectively. For online outlier detection, to differentiate outliers caused by erroneous readings from those caused by process changes, we propose a Bayesian supervisory approach that further analyzes and classifies the identified outliers. Both simulated and industrial case studies of a Kamyr digester show that the Bayesian supervisory approach is very effective in differentiating outliers caused by erroneous readings from those caused by a process change, and that the soft sensor equipped with the Bayesian supervisory approach consequently delivers better performance.
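As an illustration of the online monitoring step only, the following minimal sketch computes squared prediction error indices for X and Y from a one-latent-variable PLS model and flags samples that exceed a limit. It is not the authors' implementation: the single-component model, the synthetic data, the empirical 99% quantile limits, and all function names (`fit_one_component_pls`, `spe_indices`, `empirical_limit`, `flag_sample`) are assumptions made for this example, and the Bayesian supervisory classification of flagged samples is not shown.

```python
import random


def fit_one_component_pls(X, y):
    """Fit a one-latent-variable PLS model on mean-centered data (sketch)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of X^T y, normalized to unit length.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Scores, X loadings, and the inner regression coefficient.
    t = [sum(Xc[i][j] * w[j] for j in range(m)) for i in range(n)]
    tt = sum(v * v for v in t)
    p = [sum(Xc[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
    q = sum(yc[i] * t[i] for i in range(n)) / tt
    return {"x_mean": x_mean, "y_mean": y_mean, "w": w, "p": p, "q": q}


def spe_indices(model, x, y):
    """Return (SPE_x, SPE_y) for one sample."""
    xc = [xj - mj for xj, mj in zip(x, model["x_mean"])]
    t = sum(a * b for a, b in zip(xc, model["w"]))
    spe_x = sum((xc[j] - t * model["p"][j]) ** 2 for j in range(len(x)))
    spe_y = (y - model["y_mean"] - model["q"] * t) ** 2
    return spe_x, spe_y


def empirical_limit(values, quantile=0.99):
    """Assumed control limit: an empirical quantile of the training SPE."""
    s = sorted(values)
    return s[min(len(s) - 1, int(quantile * len(s)))]


def flag_sample(model, limits, x, y):
    """Return (x_outlier, y_outlier) flags for one new sample."""
    spe_x, spe_y = spe_indices(model, x, y)
    return spe_x > limits[0], spe_y > limits[1]


# Synthetic training data (assumption): three correlated secondary
# variables driven by one latent factor z, and a primary variable y.
random.seed(0)
X_train, y_train = [], []
for _ in range(200):
    z = random.gauss(0.0, 1.0)
    X_train.append([z + random.gauss(0.0, 0.05),
                    2.0 * z + random.gauss(0.0, 0.05),
                    -z + random.gauss(0.0, 0.05)])
    y_train.append(3.0 * z + random.gauss(0.0, 0.1))

model = fit_one_component_pls(X_train, y_train)
train_spe = [spe_indices(model, x, y) for x, y in zip(X_train, y_train)]
limits = (empirical_limit([s[0] for s in train_spe]),
          empirical_limit([s[1] for s in train_spe]))

# A consistent sample, a sample that breaks the X correlation structure,
# and a sample whose y reading disagrees with its X measurements.
print(flag_sample(model, limits, [0.5, 1.0, -0.5], 1.5))
print(flag_sample(model, limits, [3.0, -3.0, 3.0], 1.5))
print(flag_sample(model, limits, [0.5, 1.0, -0.5], 10.0))
```

In this sketch, a sample flagged in either space would be withheld from the recursive model update until a supervisory step has decided whether it reflects an erroneous reading or a genuine process change.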
 Kadlec, P.; Gabrys, B. & Strandt, S. Data-driven Soft Sensors in the process industry. Computers & Chemical Engineering, 2009, 33, 795-814.
 Galicia, H. J.; He, Q. P. & Wang, J. A reduced order soft sensor approach and its application to a continuous digester. Journal of Process Control, 2011, 21, 489-500.
 Kadlec, P.; Grbic, R. & Gabrys, B. Review of adaptation mechanisms for data-driven soft sensors. Computers & Chemical Engineering, 2011, 35, 1-24.
 Galicia, H. J.; He, Q. P. & Wang, J. Comparison of the performance of a reduced-order dynamic PLS soft sensor with different updating schemes for digester control. Submitted to Control Engineering Practice, 2011.
 Hubert, M. & Branden, K. V. Robust methods for partial least squares regression. Journal of Chemometrics, 2003, 17, 537-549.
 Martens, H. & Naes, T. Multivariate Calibration. John Wiley and Sons Ltd., 2002.