(470h) Subspace Based Model Identification for Missing Data
AIChE Annual Meeting
2020
2020 Virtual AIChE Annual Meeting
Computing and Systems Technology Division
Data-Driven Techniques for Dynamic Modeling, Estimation and Control II
Tuesday, November 17, 2020 - 9:45am to 10:00am
One problem that is particularly common when dealing with historical data is missing data, where process data is not recorded at certain time instances. In some cases, an entire set of data is missing for a period because of sensor failure or maintenance on the system. In other cases, the measurements contain large errors and must be removed from the data set. Missing data also commonly arises in chemical engineering processes when different sensors have different sampling periods: although each sensor provides a continuous record, its measurements cannot be readily aligned with the other recorded variables.
Partial least squares (PLS) is a data-driven technique that is often applied to industrial data, especially in cases where missing data must be accounted for.[2-4] The inherent ability to handle missing data is one of the attributes that make PLS an attractive method for modeling batch processes. However, the method identifies a time-varying model and, at its core, does not distinguish between output and input variables. This may limit its natural applicability to traditional model predictive control, particularly in applications where the process duration itself might be a decision variable.
To address these issues, new approaches that adapt existing subspace identification techniques[5-8] to batch processes have been proposed.[9,10] The present contribution focuses on adapting a recently proposed batch subspace identification approach.[9] In recent results, the traditional subspace algorithm has been modified to apply the same SVD-based method to batch data with varying batch lengths.[11] In subspace identification methods, however, there are limited results that can directly handle the problem of missing data, since SVD requires matrices to be full rank. Motivated by these considerations, this work presents a different approach to subspace identification that readily enables handling of missing data. The first step is to use latent variable methods (PCA followed by PLS) to identify a reduced-dimensional space for the variables that accounts for missing data values. The second step replaces singular value decomposition with PCA to identify the states of the system. The proposed approach is demonstrated on a polymethyl methacrylate (PMMA) process as the motivating example, and is shown to be more accurate than both mean replacement and linear interpolation techniques.
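To make the role of a missing-data-tolerant PCA concrete, the following is a minimal sketch of one well-known option, the NIPALS algorithm with missing entries skipped in each regression step. The abstract does not state which PCA variant the authors use, so this is an illustrative assumption and not the authors' implementation. The function names `nipals_pca_missing` and `demo`, the synthetic low-rank data, and all parameter values are hypothetical. The sketch compares reconstruction of the missing entries against mean replacement only; it does not cover the PLS step or the state identification.

```python
import numpy as np


def nipals_pca_missing(X, n_components, max_iter=500, tol=1e-9):
    """PCA by NIPALS, skipping NaN entries (hypothetical helper, not the authors' code)."""
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)
    W = observed.astype(float)                   # 1 where a value was recorded
    col_mean = np.nanmean(X, axis=0)             # column means from observed values only
    R = np.where(observed, X - col_mean, 0.0)    # centered residual, zero at missing entries
    n_rows, n_cols = X.shape
    T = np.zeros((n_rows, n_components))         # scores
    P = np.zeros((n_cols, n_components))         # loadings
    eps = 1e-12                                  # guards against division by zero
    for a in range(n_components):
        # Start from the residual column with the largest sum of squares.
        t = R[:, np.argmax((R ** 2).sum(axis=0))].copy()
        for _ in range(max_iter):
            # Loadings: regress each column on t using only its observed rows.
            p = (R.T @ t) / np.maximum(W.T @ (t ** 2), eps)
            p /= np.linalg.norm(p)
            # Scores: regress each row on p using only its observed columns.
            t_new = (R @ p) / np.maximum(W @ (p ** 2), eps)
            converged = np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new)
            t = t_new
            if converged:
                break
        T[:, a], P[:, a] = t, p
        # Deflate, keeping missing positions at zero so they never influence the fit.
        R = np.where(observed, R - np.outer(t, p), 0.0)
    return T, P, col_mean


def demo(seed=0, n_rows=200, n_cols=8, rank=2, missing_fraction=0.10):
    """Compare NIPALS reconstruction with mean replacement on synthetic data."""
    rng = np.random.default_rng(seed)
    X_true = rng.normal(size=(n_rows, rank)) @ rng.normal(size=(rank, n_cols))
    X_true += 0.01 * rng.normal(size=X_true.shape)        # small measurement noise
    missing = rng.random(X_true.shape) < missing_fraction
    X_obs = np.where(missing, np.nan, X_true)

    T, P, col_mean = nipals_pca_missing(X_obs, rank)
    X_hat = T @ P.T + col_mean                            # low-rank reconstruction

    def rmse(E):
        return float(np.sqrt(np.mean(E[missing] ** 2)))   # error on missing entries only

    return rmse(X_true - X_hat), rmse(X_true - col_mean)


if __name__ == "__main__":
    rmse_pca, rmse_mean = demo()
    print(f"RMSE on missing entries: NIPALS PCA {rmse_pca:.3f}, mean replacement {rmse_mean:.3f}")
```

The design choice illustrated here is that missing entries are excluded from every least-squares step rather than being filled in beforehand, so no imputed value influences the loadings or scores.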
References
(1) Lee, K. S.; Lee, J. H. Iterative learning control-based batch process control technique for integrated control of end product properties and transient profiles of process variables. Journal of Process Control 2003, 13, 607–621, Selected Papers from the Sixth IFAC Symposium on Bridging Engineering with Science - DYCOPS-6.
(2) Hu, B.; Zhao, Z.; Liang, J. Multi-loop nonlinear internal model controller design under nonlinear dynamic PLS framework using ARX-neural network model. Journal of Process Control 2012, 22, 207–217.
(3) Nelson, P. R.; Taylor, P. A.; MacGregor, J. F. Missing data methods in PCA and PLS: Score calculations with incomplete observations. Chemometrics and Intelligent Laboratory Systems 1996, 35, 45–65.
(4) Walczak, B.; Massart, D. Dealing with missing data: Part I. Chemometrics and Intelligent Laboratory Systems 2001, 58, 15–27.
(5) Moonen, M.; De Moor, B.; Vandenberghe, L.; Vandewalle, J. On- and off-line identification of linear state-space models. International Journal of Control 1989, 49, 219–232.
(6) Qin, S. J. An overview of subspace identification. Computers and Chemical Engineering 2006, 30, 1502–1513.
(7) Huang, B.; Ding, S. X.; Qin, S. J. Closed-loop subspace identification: an orthogonal projection approach. Journal of Process Control 2005, 15, 53–66.
(8) Van Overschee, P.; De Moor, B. A unifying theorem for three subspace system identification algorithms. Automatica 1995, 31, 1853–1864.
(9) Corbett, B.; Mhaskar, P. Subspace identification for data-driven modeling and quality control of batch processes. AIChE Journal 62, 1581–1601.
(10) Dorsey, A. W.; Lee, J. H. Building inferential prediction models of batch processes using subspace identification. Journal of Process Control 2003, 13, 397–406.
(11) Corbett, B.; Mhaskar, P. Data-driven modeling and quality control of variable duration batch processes with discrete inputs. Industrial & Engineering Chemistry Research 2017, 56, 6962–6980.