(373d) Intelligent Recursive Soft Sensor Adaptation Via Bayesian Outlier Detection and Classification

Galicia, H. - Presenter, Auburn University

Soft sensors that predict the primary variables of a process by using the
secondary measurements have drawn increasing research interest recently. Among
them, the partial least squares (PLS) based soft sensor is the most commonly
used approach for industrial applications. As industrial processes often
experience time-varying changes, it is desirable to update the soft sensor
model with the new process data once the soft sensor is implemented online. In
our previous work [1-2], a recursive reduced-order dynamic PLS (RO-DPLS) soft
sensor was developed to provide quality estimates of primary process variables
in the event of large transport delays and time-varying process
conditions. Both simulated and industrial case studies of a continuous Kamyr
digester showed that a recursive RO-DPLS soft sensor can provide accurate
estimates of the extent of reaction (i.e., the Kappa number) and can cope with
time-varying process behavior.

Because the focus in [2] was to investigate the properties of different recursive
updating schemes and data scaling methods, the industrial datasets were
pre-processed to remove all outliers before subjecting them to different
experiments. This step was taken since static and recursive PLS algorithms are
sensitive to outliers in the dataset [3]. Therefore, outlier detection and
handling play a critical role in the development of PLS-based soft
sensors. Although there exist extensive studies on outlier detection for
off-line model building [3-7], outlier detection remains a challenging problem.
In addition, detecting outliers online poses additional challenges.
First, the resources dedicated to developing the soft sensor off-line (e.g., expert
knowledge) are not available during the online phase. Second, for online
adaptation of soft sensor models, if erroneous readings are used to update the
soft sensor model, future predictions from the updated model may deteriorate
significantly. Furthermore, the challenge increases because online outliers are
not only erroneous readings; they can also be normal samples from new
process states that represent new process behavior. Such samples must be
used in the update so that the model receives accurate information about the
current process behavior, which improves the soft sensor adaptation.

In this work, a
multivariate approach for online outlier detection based on the squared
prediction error (SPE) statistic is developed to improve the automatic
adaptation of the soft sensor model. In addition, to differentiate outliers
caused by erroneous readings from those caused by process changes, a Bayesian
supervisory approach is proposed to analyze and classify the detected outliers.
A challenging simulation case study of a continuous Kamyr digester [8] is used
to assess the performance of the recursive soft sensor with outlier detection
and classification. In this case study a major process disturbance, a wood type
change, is simulated. It is worth noting that after this major process change
occurs, the process settles in a completely new state, and the normal
monitoring indices (i.e., SPE indices) may switch to a different level.
Therefore, the thresholds of the online monitoring SPE indices need to be
updated as well. Otherwise, the performance of outlier detection will
deteriorate considerably. In this work, a robust way to update the monitoring
thresholds based on an exponentially weighted moving average (EWMA) filter is proposed.
The update strategy is shown in Eqn. (1):

$$T_{new} = \lambda \, T_{old} + (1 - \lambda) \, \hat{T} \qquad (1)$$

where $T_{old}$ is the previous monitoring threshold for outlier detection before the
update; $T_{new}$ is the threshold after the update; and $\hat{T}$ is the
threshold estimated using the reconstructed SPE indices of new measurements.
The initial thresholds are determined using historical data under normal
operating conditions. The parameter $\lambda$ is a tuning parameter which
controls how fast the thresholds are updated. Under normal operation (i.e., when
no outliers are detected), the thresholds are updated every 20 samples.
For this case, a relatively conservative setting of $\lambda$ is
used ($0.7 < \lambda < 0.9$). For cases where a process change is
detected, i.e., detected outliers are classified as part of a process change, a
more conservative setting is used ($\lambda > 0.95$).
It is also worth noting that different settings are usually used for monitoring
the independent and dependent variable spaces via the SPEx and SPEy indices. This
is necessary because of the differences in variability of the corresponding
monitoring indices. Because only a limited number of samples is available for a
regular update (i.e., 20 samples) or a process change update (i.e., 5 samples),
the threshold $\hat{T}$ estimated from the new measurements is
computed empirically as shown in Eqn. (2):

$$\hat{T} = \mu_{SPE} + k \, \sigma_{SPE} \qquad (2)$$

where $\mu_{SPE}$ and $\sigma_{SPE}$ are the mean and
standard deviation of the SPE of the samples used for the update, and $k$ is a tuning
parameter, usually around 2 to 3.
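
The threshold update described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that Eqn. (1) has the usual EWMA form, with the tuning parameter weighting the previous threshold, and that Eqn. (2) is the window mean plus a multiple of the window standard deviation. The function names, the argument names `lam` and `k`, and the choice of the sample standard deviation (`ddof=1`) are hypothetical and are not taken from the original work.

```python
import numpy as np


def spe(residuals):
    """Squared prediction error per sample: row-wise sum of squared residuals."""
    residuals = np.atleast_2d(np.asarray(residuals, dtype=float))
    return np.sum(residuals ** 2, axis=1)


def estimate_threshold(spe_values, k=3.0):
    """Assumed form of Eqn. (2): mean + k * standard deviation of the SPE window.

    k is the tuning parameter (roughly 2 to 3 in the text). Using the sample
    standard deviation (ddof=1) is an assumption of this sketch.
    """
    spe_values = np.asarray(spe_values, dtype=float)
    return spe_values.mean() + k * spe_values.std(ddof=1)


def update_threshold(t_old, t_est, lam=0.8):
    """Assumed form of Eqn. (1): EWMA blend of the old and estimated thresholds.

    lam close to 1 keeps most of the previous threshold (slow, conservative
    update); lam close to 0 follows the new estimate quickly.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * t_old + (1.0 - lam) * t_est


def flag_outliers(spe_values, threshold):
    """Boolean mask of samples whose SPE exceeds the current threshold."""
    return np.asarray(spe_values, dtype=float) > threshold


# Illustrative use with synthetic residuals (not data from the case study).
rng = np.random.default_rng(0)
window = spe(rng.normal(scale=0.5, size=(20, 4)))   # 20-sample regular update
t_est = estimate_threshold(window, k=3.0)
t_new = update_threshold(t_old=5.0, t_est=t_est, lam=0.8)
mask = flag_outliers(window, t_new)
```

In this sketch, separate calls with different `lam` and `k` values would be made for the SPEx and SPEy indices, and a larger `lam` would be passed when the 5-sample process change update is used.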

The results obtained from the challenging simulation
case study indicate that the recursive soft sensor with outlier detection,
classification and threshold update provides a robust way to address the time-varying
nature of industrial processes and to adapt the soft sensor model intelligently.
In addition, the good performance of the proposed approaches observed in the
simulation case study is confirmed by their application to a more
challenging industrial case study of a continuous Kamyr digester.


[1] Galicia, H. J.; He, Q. P. & Wang, J. A reduced order soft sensor approach
and its application to a continuous digester. Journal of Process Control,
vol. 21, pp. 489-500, 2011.

[2] Galicia, H. J.; He, Q. P. & Wang, J. Comparison of the performance of a
reduced-order dynamic PLS soft sensor with different updating schemes for
digester control. Control Engineering Practice, 2012, accepted.

[3] Hubert, M. & Vanden Branden, K. Robust methods for partial least squares
regression. Journal of Chemometrics, vol. 17, pp. 537-549, 2003.

[4] Hodge, V. & Austin, J. A survey of outlier detection methodologies.
Artificial Intelligence Review, vol. 22, pp. 85-126, 2004.

[5] Pearson, R. K. Outliers in process modeling and identification. IEEE
Transactions on Control Systems Technology, vol. 10, pp. 55-63, 2002.

[6] Davies, L. & Gather, U. The identification of multiple outliers. Journal
of the American Statistical Association, vol. 88, pp. 782-792, 1993.

[7] Jolliffe, I. T. Principal Component Analysis. Springer, 2002.

[8] Wisnewski, P. A.; Doyle, F. & Kayihan, F. Fundamental
continuous-pulp-digester model for simulation and control. AIChE Journal,
vol. 43, no. 12, pp. 3175-3192, 1997.

See more of this Session: Process Monitoring and Fault Detection II

See more of this Group/Topical: Computing and Systems Technology Division