(11d) Subspace Based Model Identification for Batch Quality Analysis Using Missing Data Algorithms | AIChE

Authors 

Patel, N. - Presenter, McMaster University
Mhaskar, P., McMaster University
Sivanathan, K., McMaster University
Developing a process model capable of handling batch process data is key to good quality control. Batch processes tend toward low-volume production of high-value products, as this allows poor-quality batches to be discarded. Each discarded batch represents a significant loss in revenue, motivating the need for advanced batch control approaches and, fundamentally, the development of accurate quality models. Recent advances in computing technology have increased the amount of historical data available, making data-driven modeling a viable choice for model identification.

An important consideration when identifying data-driven models is the choice of input and output variables. While the distinction is typically between variables that are controlled and those that are measured, batch processes also introduce a separate type of output variable: quality variables. Quality variables are still considered outputs of the process; however, they are not measured continuously, nor are they always measured online. Quality measurements are often calculated from the regularly measured outputs or determined by separate analyses of the batch. This difference matters for data-driven modeling techniques because a prevalent assumption in process modeling is that inputs and outputs are sampled at a single, uniform sampling rate. In practice, industrial processes often have different sampling rates for input and output variables. Additionally, quality measurements are recorded at an even slower rate than traditional inputs and outputs. As a result, inputs and outputs have some missing values due to differences in sampling rates, whereas quality measurements are available only at extremely low frequencies. This presents a challenge to traditional data-driven modeling approaches, which require complete data sets to identify a model.
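The multi-rate situation described above can be pictured as a data table with gaps. The following is a minimal sketch rather than anything taken from the presented work: the function names, the sampling intervals, and every numeric value are hypothetical, chosen only to show how a slower output rate and a single end-of-batch quality measurement leave most entries missing when all variables are placed on the input time grid.

```python
import math


def build_batch_table(n_steps, output_every):
    """Rows of [input, output, quality] on the input (fastest) time grid.

    All numeric values are made up for illustration; NaN marks "not sampled".
    """
    rows = []
    for k in range(n_steps):
        u = 0.1 * k  # input: recorded at every step
        y = 2.0 * u if k % output_every == 0 else math.nan  # output: slower rate
        q = 5.0 if k == n_steps - 1 else math.nan  # quality: end of batch only
        rows.append([u, y, q])
    return rows


def missing_fraction(rows):
    """Fraction of NaN entries in each column."""
    n = len(rows)
    return [sum(math.isnan(r[j]) for r in rows) / n for j in range(len(rows[0]))]


table = build_batch_table(n_steps=12, output_every=3)
fractions = missing_fraction(table)  # one value per column: input, output, quality
```

With these assumed settings the input column is complete, the output column is missing two of every three samples, and the quality column is missing everywhere except the final row.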

This presentation addresses the problem of quality modeling from batch process data. A missing data subspace algorithm that adapts the nonlinear iterative partial least squares (NIPALS) algorithms from both partial least squares (PLS) and principal component analysis (PCA) is used to build a data-driven model. The use of NIPALS algorithms allows the correlation structure of the input-output data to be exploited, minimizing the impact of the large number of missing quality measurements. These techniques are applied to a polymethyl methacrylate (PMMA) process example to show the efficacy of the missing data approach in identifying batch quality models.
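To make the role of NIPALS with missing data concrete, the following is a minimal sketch of a generic NIPALS PCA in which every regression step simply skips missing entries. It is not the subspace identification algorithm of this work, only the textbook missing-data idea it builds on. The function names, the tolerance, and the small data set are assumptions, and the data are assumed to be already mean-centered.

```python
import math


def nipals_pca_missing(X, n_components=1, tol=1e-12, max_iter=500):
    """NIPALS PCA that skips NaN entries in every regression step.

    X is a list of rows, assumed already mean-centered; NaN marks a missing
    value. Returns (scores, loadings), one list per extracted component.
    """
    n, m = len(X), len(X[0])
    E = [row[:] for row in X]  # residual matrix, deflated after each component
    obs = [[not math.isnan(v) for v in row] for row in X]
    scores, loadings = [], []
    for _ in range(n_components):
        # Start from the column with the most observed entries (missing -> 0).
        j0 = max(range(m), key=lambda j: sum(obs[i][j] for i in range(n)))
        t = [E[i][j0] if obs[i][j0] else 0.0 for i in range(n)]
        p = [0.0] * m
        for _it in range(max_iter):
            # Loadings: regress each column on t using observed rows only.
            for j in range(m):
                num = sum(E[i][j] * t[i] for i in range(n) if obs[i][j])
                den = sum(t[i] ** 2 for i in range(n) if obs[i][j])
                p[j] = num / den if den > 0 else 0.0
            norm = math.sqrt(sum(v * v for v in p))
            if norm == 0.0:
                break
            p = [v / norm for v in p]
            # Scores: regress each row on p using observed columns only.
            t_new = []
            for i in range(n):
                num = sum(E[i][j] * p[j] for j in range(m) if obs[i][j])
                den = sum(p[j] ** 2 for j in range(m) if obs[i][j])
                t_new.append(num / den if den > 0 else 0.0)
            change = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if change <= tol * max(sum(v * v for v in t), 1e-300):
                break
        # Deflate only the entries that were actually observed.
        for i in range(n):
            for j in range(m):
                if obs[i][j]:
                    E[i][j] -= t[i] * p[j]
        scores.append(t)
        loadings.append(p[:])
    return scores, loadings


def reconstruct(scores, loadings):
    """Model estimate of every entry, including the missing ones."""
    n, m = len(scores[0]), len(loadings[0])
    return [[sum(t[i] * p[j] for t, p in zip(scores, loadings))
             for j in range(m)] for i in range(n)]


nan = math.nan
# Hypothetical rank-one data: the third column stands in for a sparse quality variable.
X = [[1.0, 2.0, nan],
     [2.0, nan, nan],
     [3.0, 6.0, nan],
     [4.0, 8.0, 2.0]]
scores, loadings = nipals_pca_missing(X, n_components=1)
X_hat = reconstruct(scores, loadings)
```

Because the example data are exactly rank one, the correlation between the fully observed first column and the others is enough for the model to fill in the missing entries, which is the intuition behind using the input-output correlation structure to offset sparse quality measurements. Real process data are noisy and higher-rank, so the estimates would only be approximate.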