(470h) Subspace Based Model Identification for Missing Data | AIChE

Authors 

Patel, N. - Presenter, McMaster University
Mhaskar, P., McMaster University
Corbett, B., McMaster University
Batch processes are important for a wide range of manufacturing industries such as chemicals, polymers, specialty glass, ceramics, and steel production. The recipe can be adjusted based on results from previous batches to maintain and improve quality.[1] Economic considerations motivate the need for advanced batch process control strategies, which in turn often necessitate a good process model. Recent advances in data storage technology have increased the availability of historical process data, making data-driven modeling more viable than before. Data-driven modeling techniques, however, often face several challenges ranging from process nonlinearity to incomplete data.

One problem that is particularly common when dealing with historical data is missing data, where process data is not recorded at certain time instances. In some cases, there are periods where an entire set of data is missing due to sensor failure or maintenance on the system. In other cases, the measurements contain large errors and need to be removed from the data set. A common source of missing data in chemical engineering processes is sensors with different sampling periods. Thus, while there is continuous data from each sensor, the measurements cannot be readily aligned with the other recorded variables.
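The multi-rate case can be illustrated with a minimal sketch. The function name `align_on_common_grid`, the sampling periods, and the synthetic sine signals are all assumptions made for illustration; none of them come from the work itself. Each sensor is placed on a shared time grid, and every instant at which a sensor did not sample becomes a missing entry (NaN).

```python
import numpy as np

def align_on_common_grid(n_steps, periods):
    """Place each sensor's samples on a shared grid; unsampled instants stay NaN.

    periods[j] is the (hypothetical) sampling period of sensor j in grid steps.
    """
    data = np.full((n_steps, len(periods)), np.nan)
    t = np.arange(n_steps)
    for j, period in enumerate(periods):
        sampled = t[::period]                         # instants this sensor reports
        data[sampled, j] = np.sin(0.1 * sampled + j)  # synthetic measurement
    return data

# Three sensors sampling every 1, 3, and 4 grid steps over 12 steps
X = align_on_common_grid(12, periods=[1, 3, 4])
missing_fraction = np.isnan(X).mean(axis=0)           # share of NaN per sensor
```

Even though no sensor has failed, the slower sensors leave most rows of the aligned data matrix incomplete, which is the situation a missing-data method has to handle.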

Partial least squares (PLS) is a data-driven technique that is often applied to industrial data, especially in cases for which missing data must be accounted for.[2-4] The inherent ability to handle missing data is one of the attributes that makes PLS an attractive method for modeling batch processes. The method identifies a time-varying model and, at its core, does not distinguish between output and input variables, which may limit its natural applicability to traditional model predictive control, particularly in applications where the process duration itself might be a decision variable.
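One way latent variable models cope with an incomplete observation is to estimate its scores from the observed variables only, by least squares against the corresponding rows of the loading matrix. The sketch below assumes a loading matrix `P` is already available; the function name `scores_from_incomplete` and the random loadings are hypothetical and serve only to show the mechanics.

```python
import numpy as np

def scores_from_incomplete(x, P):
    """Estimate latent scores of one observation x using only its observed entries.

    x : (m,) observation with NaN at missing variables
    P : (m, a) loading matrix of an already fitted latent variable model
    """
    observed = ~np.isnan(x)
    P_obs = P[observed, :]                    # loadings of the observed variables
    t, *_ = np.linalg.lstsq(P_obs, x[observed], rcond=None)
    return t

rng = np.random.default_rng(0)
P, _ = np.linalg.qr(rng.normal(size=(6, 2)))  # hypothetical orthonormal loadings
t_true = np.array([1.5, -0.7])
x = P @ t_true                                # noise-free observation
x[[1, 4]] = np.nan                            # two variables were not recorded
t_est = scores_from_incomplete(x, P)
```

With noise-free data and enough observed variables, the scores are recovered exactly; with noisy data the estimate degrades as more variables go missing.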

To address these issues, new approaches adapting existing subspace identification techniques[5-8] for batch processes have been proposed.[9,10] The present contribution focuses on adapting a recently proposed batch subspace identification approach.[9] In recent results, the traditional subspace algorithm has been modified to use the same SVD method for batch data with varying batch lengths.[11] Few subspace identification results, however, can directly handle missing data, since SVD requires fully populated matrices. Motivated by these considerations, this work presents a different approach to subspace identification that readily enables handling of missing data. The first step is to use latent variable methods (PCA followed by PLS) to identify a reduced dimensional space for the variables, which accounts for missing data values. The second step replaces singular value decomposition with PCA to identify the states of the system. The proposed approach is demonstrated on a polymethyl methacrylate (PMMA) process as the motivating example and is shown to be more accurate than both mean replacement and linear interpolation techniques.
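To illustrate why PCA can stand in where SVD cannot, the sketch below uses NIPALS, a standard iterative PCA algorithm whose sums can be restricted to observed entries. It is not the authors' identification algorithm: the function name `nipals_pca_missing`, the toy rank-1 data, the deterministic missing pattern, and the omission of mean-centering are all assumptions made for illustration. Mean replacement is included only as a simple point of comparison.

```python
import numpy as np

def nipals_pca_missing(X, n_components, n_iter=500, tol=1e-10):
    """PCA by NIPALS where every sum runs over observed (non-NaN) entries only."""
    X = np.asarray(X, dtype=float)
    M = (~np.isnan(X)).astype(float)          # 1 where observed, 0 where missing
    R = np.where(M > 0, X, 0.0)               # residual, zero at missing entries
    T = np.zeros((X.shape[0], n_components))
    P = np.zeros((X.shape[1], n_components))
    for a in range(n_components):
        # Start from the column with the largest observed sum of squares
        t = R[:, np.argmax((R ** 2).sum(axis=0))].copy()
        for _ in range(n_iter):
            p = (R.T @ t) / np.maximum(M.T @ (t ** 2), 1e-12)
            p /= np.linalg.norm(p)
            t_new = (R @ p) / np.maximum(M @ (p ** 2), 1e-12)
            converged = np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new)
            t = t_new
            if converged:
                break
        T[:, a], P[:, a] = t, p
        R = np.where(M > 0, R - np.outer(t, p), 0.0)  # deflate observed entries
    return T, P

# Toy rank-1 data with a deterministic pattern of missing entries
u = np.linspace(1.0, 2.0, 20)
v = np.array([1.0, -2.0, 0.5, 3.0])
X_true = np.outer(u, v)
rows, cols = np.indices(X_true.shape)
X = X_true.copy()
X[(rows + 2 * cols) % 5 == 0] = np.nan

T, P = nipals_pca_missing(X, n_components=1)
X_nipals = T @ P.T                            # low-rank reconstruction

col_means = np.nanmean(X, axis=0)             # mean replacement baseline
X_mean = np.where(np.isnan(X), col_means, X)

err_nipals = np.abs(X_nipals - X_true).max()
err_mean = np.abs(X_mean - X_true).max()
```

On this exactly low-rank toy matrix the missing entries are recovered to numerical precision, while mean replacement is not. That outcome reflects the idealized data and should not be read as a reproduction of the PMMA results.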

References

(1) Lee, K. S.; Lee, J. H. Iterative learning control-based batch process control technique for integrated control of end product properties and transient profiles of process variables. Journal of Process Control 2003, 13, 607-621. Selected Papers from the Sixth IFAC Symposium on Bridging Engineering with Science - DYCOPS-6.

(2) Hu, B.; Zhao, Z.; Liang, J. Multi-loop nonlinear internal model controller design under nonlinear dynamic PLS framework using ARX-neural network model. Journal of Process Control 2012, 22, 207-217.

(3) Nelson, P. R.; Taylor, P. A.; MacGregor, J. F. Missing data methods in PCA and PLS: Score calculations with incomplete observations. Chemometrics and Intelligent Laboratory Systems 1996, 35, 45-65.

(4) Walczak, B.; Massart, D. Dealing with missing data: Part I. Chemometrics and Intelligent Laboratory Systems 2001, 58, 15-27.

(5) Moonen, M.; De Moor, B.; Vandenberghe, L.; Vandewalle, J. On- and off-line identification of linear state-space models. International Journal of Control 1989, 49, 219-232.

(6) Qin, S. J. An overview of subspace identification. Computers and Chemical Engineering 2006, 30, 1502-1513.

(7) Huang, B.; Ding, S. X.; Qin, S. J. Closed-loop subspace identification: an orthogonal projection approach. Journal of Process Control 2005, 15, 53-66.

(8) Van Overschee, P.; De Moor, B. A unifying theorem for three subspace system identification algorithms. Automatica 1995, 31, 1853-1864.

(9) Corbett, B.; Mhaskar, P. Subspace identification for data-driven modeling and quality control of batch processes. AIChE Journal 62, 1581-1601.

(10) Dorsey, A. W.; Lee, J. H. Building inferential prediction models of batch processes using subspace identification. Journal of Process Control 2003, 13, 397-406.

(11) Corbett, B.; Mhaskar, P. Data-driven modeling and quality control of variable duration batch processes with discrete inputs. Industrial & Engineering Chemistry Research 2017, 56, 6962-6980.