(188x) Dual-Rate Approach for Data-Driven Modeling and Prediction of Behavior of Processes with Variations in Sampling Frequencies

Authors: 
Gan, J., Chemical and Biological Engineering, Illinois Institute of Technology
Parulekar, S. J., Illinois Institute of Technology
Cinar, A., Illinois Institute of Technology
Multi-rate systems, in which inputs and/or outputs are sampled at different rates, are encountered in many chemical and biological engineering applications, typically wherever multiple variables are measured. Process variables such as temperature, pressure, and pH are measured frequently, at times nearly continuously, while other variables, such as species concentrations in a reaction mixture, are measured less frequently. Batch and fed-batch mammalian cell cultures are one example: the sampling rates for viable hybridoma cells, the nutrients glucose and glutamine, the by-products lactate and ammonia, and the target product, a monoclonal antibody (MAb), differ significantly. The simplest way to handle a multi-rate system is to neglect the excess data from the fast-sampled signals and synchronize all signals to the slowest sampling rate. This approach discards a large amount of data that may contain crucial information about the system dynamics. Some effort has therefore been devoted to modeling, analysis, and identification of multi-rate systems so as to take advantage of the rich information available in experimental databases. In all the techniques proposed, a multi-rate system is generally simplified to multiple dual-rate systems in which the slower sampling period is a positive integer multiple of the faster sampling period. One of the most successful of these, the polynomial transformation technique, converts a single-rate model into a dual-rate model that can use the fast-sampled and slow-sampled signals simultaneously. For such model structures, several identification techniques with good estimation accuracy and convergence rates have been proposed.
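As a minimal sketch of the polynomial transformation idea, assume a single-rate model A(z) y(t) = B(z) u(t) whose denominator factors as A(z) = prod_i (1 - lambda_i z^-1). Multiplying both sides by phi(z) = prod_i (1 + lambda_i z^-1 + ... + lambda_i^(q-1) z^-(q-1)) gives a denominator prod_i (1 - lambda_i^q z^-q) that involves only the outputs y(t), y(t-q), y(t-2q), ..., that is, only the slowly sampled values. The function name `polynomial_transform`, the symbol `q` for the integer ratio of the two sampling periods, and the numerical coefficients are illustrative assumptions and are not taken from the abstract.

```python
import numpy as np

def polynomial_transform(a, b, q):
    """Convert a single-rate model A(z) y = B(z) u into a dual-rate form.

    a, b : coefficients in ascending powers of z^-1, with a[0] == 1.
    q    : integer ratio of the slow sampling period to the fast one
           (hypothetical symbol, chosen for this sketch).
    Returns (alpha, beta), the coefficients of A*phi and B*phi.
    """
    # Roots lambda_i of z^n + a1 z^(n-1) + ... + an, so A = prod(1 - lambda_i z^-1).
    poles = np.roots(a)
    phi = np.array([1.0 + 0.0j])
    for lam in poles:
        # Factor 1 + lam z^-1 + ... + lam^(q-1) z^-(q-1)
        phi = np.convolve(phi, lam ** np.arange(q))
    phi = phi.real  # complex poles occur in conjugate pairs, so phi is real
    alpha = np.convolve(a, phi)  # nonzero only at powers z^0, z^-q, z^-2q, ...
    beta = np.convolve(b, phi)   # still uses every fast-rate input sample
    return alpha, beta

# Illustrative second-order model (assumed values, not from the abstract)
a = [1.0, -1.5, 0.7]
b = [0.0, 1.0, 0.5]
alpha, beta = polynomial_transform(a, b, q=3)
print(np.round(alpha, 6))
```

For this example, up to rounding, the only nonzero entries of `alpha` sit at positions 0, 3, and 6, so the transformed model needs the output only at every third fast sampling instant while `beta` still draws on all the input samples.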

In this work, a second-order discrete-time system is considered first to demonstrate convergence of the proposed parameter estimation algorithm for a dual-rate model. Convergence rates and estimation accuracy are compared for various values of the integer ratio between the slow and fast sampling periods to examine the efficacy of the dual-rate parameter estimation technique. Illustrative examples are discussed, one of which applies a dual-rate model with parameter estimation to the modeling and prediction of mammalian cell cultures. The fastest sampling rate corresponds to the viable hybridoma cell concentration, an intermediate sampling rate to the glucose and glutamine concentrations, and the slowest sampling rate to the monoclonal antibody (MAb) concentration; the MAb is sampled much less frequently than glucose and glutamine. Recursive time series models are developed for the key culture variables, and the model parameters are estimated recursively with a least-squares algorithm. Model stability is evaluated and confirmed by converting each time series model into its state-space counterpart. The most frequently measured variable, the viable cell concentration, is expressed by a recursive ARMAX model. The performance of the dual-rate model, coupled with frequent parameter estimation, in representing and predicting a mammalian cell culture producing a MAb is examined in considerable detail.
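The following sketch illustrates, under stated assumptions, how recursive least squares might be applied to a dual-rate form of a second-order system and how stability might then be checked through a state-space (companion) matrix. It is not the authors' algorithm: the model coefficients, the ratio `q = 2`, the noise-free data, and the names `simulate`, `rls_dual_rate`, and `is_stable` are all hypothetical choices made for illustration. The regressor contains the output only at the slow sampling instants and the input at every fast instant.

```python
import numpy as np

def simulate(a1, a2, b1, b2, u):
    """Fast-rate response of y(t) = -a1 y(t-1) - a2 y(t-2) + b1 u(t-1) + b2 u(t-2)."""
    y = np.zeros(len(u))
    for t in range(2, len(u)):
        y[t] = -a1 * y[t - 1] - a2 * y[t - 2] + b1 * u[t - 1] + b2 * u[t - 2]
    return y

def rls_dual_rate(u, y, q, n=2, p0=1e6):
    """Recursive least squares for the dual-rate model
    y(t) = -sum_i alpha_iq y(t - i q) + sum_j beta_j u(t - j),
    with i = 1..n and j = 1..n*q, updated only when t is a multiple of q."""
    m = n + n * q
    theta = np.zeros(m)          # [alpha_q, ..., alpha_nq, beta_1, ..., beta_nq]
    P = p0 * np.eye(m)           # large initial covariance (weak prior)
    for t in range(n * q, len(u), q):   # instants at which y is assumed measured
        phi = np.concatenate((-y[t - q * np.arange(1, n + 1)],
                              u[t - np.arange(1, n * q + 1)]))
        gain = P @ phi / (1.0 + phi @ P @ phi)
        theta = theta + gain * (y[t] - phi @ theta)
        P = P - np.outer(gain, phi @ P)
    return theta

def is_stable(alpha_slow):
    """Build the companion state matrix of the slow-rate autoregressive part
    and test whether all of its eigenvalues lie inside the unit circle."""
    n = len(alpha_slow)
    F = np.zeros((n, n))
    F[0, :] = -np.asarray(alpha_slow)
    F[1:, :-1] = np.eye(n - 1)
    return bool(np.max(np.abs(np.linalg.eigvals(F))) < 1.0)

# Assumed example: A = 1 - 1.5 z^-1 + 0.7 z^-2, B = z^-1 + 0.5 z^-2, q = 2
rng = np.random.default_rng(0)
u = rng.standard_normal(2000)            # fast-rate input, white noise
y = simulate(-1.5, 0.7, 1.0, 0.5, u)     # only y[0], y[2], y[4], ... are used
theta = rls_dual_rate(u, y, q=2)
stable = is_stable(theta[:2])
print(np.round(theta, 3), stable)
```

For these assumed coefficients, the polynomial transformation with q = 2 gives the dual-rate parameters alpha_2 = -0.85, alpha_4 = 0.49 and beta = (1.0, 2.0, 1.45, 0.35), which the estimates should approach. Repeating the run with other values of `q` is one way to compare convergence behavior across sampling ratios.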

Monoclonal antibodies (MAbs) have extensive biomedical applications and are produced in mammalian cell bioreactors at a variety of scales, with glucose and glutamine being the principal carbon and nitrogen sources required for cellular metabolism. Fed-batch operation has certain inherent advantages over batch culture for MAb production. Design, optimization, scale-up, and control of bioreactors used for MAb production require reliable predictive empirical or mechanistic models of key cellular activities. The models used in prior studies have largely been first principles-based models (FPMs), although data-driven models are receiving increasing attention because of their inherent benefits and the growing body of information available about biological processes. The simpler and much less rigid structure of data-driven models facilitates frequent updating of parameters and prediction of process trajectories, which increases their utility in the representation, monitoring, and control of these processes. In this work, recursive time series models are developed to represent a mammalian cell culture in which the process variables that determine culture performance are measured at different sampling frequencies. Appropriate parameter constraints are imposed in the parameter estimation algorithms, and the stability of the resulting models has been examined and ensured. The data required for parameter estimation are generated from simulated experiments using a well-tested FPM, with random variations in the manipulated inputs and in the kinetic parameters of the FPM, and with measurement error in an output. Because glucose and glutamine are determinants of mammalian cell metabolism, their supply rates are taken as the inputs. The predictive ability of the data-driven models is examined and demonstrated over a broad range of prediction horizons.