(371l) The Evolution of Approximated Kinetic Model Structures

Authors: 
Quaglio, M., University of Padova
Fraga, E. S., University College London
Galvanin, F., University College London

Marco Quaglio, Eric S. Fraga and Federico Galvanin

CPSE, Department of Chemical Engineering, University College London (UCL), Gower St., WC1E 6BT London, United Kingdom

Kinetic phenomena are often modeled as systems of differential and algebraic equations that may involve a large number of kinetic parameters and state variables. The complexity of kinetic phenomena frequently leads to model structures characterized by some degree of approximation. Whenever an approximated model is falsified by data, i.e. whenever a significant process-model mismatch is observed, the model structure should be modified in light of the available experimental evidence. Nonetheless, improving an approximated model structure is a challenging task that relies heavily on the judgment of experienced scientists.

Once the free parameters of the model are estimated by data fitting [1], the model is typically validated through an analysis of the residuals of the fit [2]. A goodness-of-fit test on the model residuals can reveal the presence of modeling errors, namely over-fitting or under-fitting, but it does not indicate how the model structure should be improved. Systematic approaches are available to amend the structure of an over-fitting model. If over-fitting is detected, one should perform a Wald test [3] to challenge the hypothesis that some free parameter equals 0. If the available data do not provide sufficient evidence to reject this hypothesis, the parameter should be removed from the set of free model parameters and fixed to 0. Whenever under-fitting is detected, a modification of the model structure may be required to reduce the observed process-model mismatch. Superstructure-based approaches have been proposed to improve parametric models. These approaches regard the kinetic model as a constrained instance of one or more alternative superstructures [4,5]. Statistical tests are then employed to challenge the constrained model against the different alternatives. However, a significant limitation of these approaches is that the definition of superstructures relies on the intuition of the modeler.
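As an illustration of the Wald test mentioned above, the following minimal sketch tests the hypothesis that a single parameter equals 0, given its estimate and the estimated variance of that estimate. The function name wald_test, the 5% significance level and the numerical values are assumptions made for illustration only; they do not come from the original work.

```python
import math


def wald_test(theta_hat, variance, alpha=0.05):
    """Wald test of H0: theta = 0 for a single parameter.

    theta_hat : parameter estimate
    variance  : estimated variance of theta_hat (e.g. a diagonal entry
                of the parameter covariance matrix)
    Returns (statistic, p_value, keep_parameter).
    """
    statistic = theta_hat ** 2 / variance
    # With 1 degree of freedom, the chi-square survival function has
    # the closed form erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    keep_parameter = p_value < alpha  # H0 rejected: the parameter is needed
    return statistic, p_value, keep_parameter


# Hypothetical estimates: theta_a is poorly determined, theta_b is not.
for name, est, var in [("theta_a", 0.1, 1.0), ("theta_b", 5.0, 1.0)]:
    w, p, keep = wald_test(est, var)
    action = "keep free" if keep else "fix to 0"
    print(f"{name}: W = {w:.2f}, p = {p:.3g} -> {action}")
```

In this sketch a parameter is fixed to 0 only when the test fails to reject the hypothesis, which mirrors the trimming rule described in the paragraph above.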

A data-driven approach to improve the structure of kinetic models is proposed. It is assumed that a dataset of kinetic data is available and that a candidate model structure is proposed to describe the kinetic phenomenon under analysis. The procedure, illustrated in Figure 1, involves the following sequential steps:

  1. Parameter estimation. The free parameters involved in the candidate model structure are fitted to the dataset using a maximum likelihood approach [1].
  2. Goodness-of-fit test. A goodness-of-fit test is performed on the model residuals to detect the presence of process-model mismatch [2].
  3. Evolution. If process-model mismatch is detected, the model structure is modified by evolving relevant model parameters into more complex state-dependent expressions. The evolution step is performed in two stages:

    1. Model diagnosis. At this stage a Lagrange multipliers test [6] is performed to detect which model parameters, if evolved into more complex state-dependent functions, are expected to improve the quality of the fit the most. A Model Modification Index (MMI) is computed for each model parameter as a function of the test statistic and is proposed as a measure of model misspecification. A large MMI for a model parameter suggests that a significant improvement in the quality of the fit is expected if that parameter is evolved into a more complex function of the state variables.

    2. Model evolution. The parameter with the highest MMI is selected for mutation. The Lagrange multipliers test is employed again at this stage to compute a set of Effect Relevance Indexes (ERIs) and to detect which effects (i.e. which state variables or which state-dependent expressions) should be considered for an appropriate mutation of the parameter. The parameter is evolved into a function of the most relevant computed effect. Once the model structure is evolved, the procedure is repeated from step 1. Notice that neither the model diagnosis nor the model evolution stage requires re-estimation of the model parameters.
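The idea behind the two evolution stages can be sketched with a generic score (Lagrange multiplier) statistic. The sketch below is not the authors' definition of MMI or ERI, which is not given here. It assumes a toy model y = theta * s with a single parameter, a known measurement standard deviation sigma, and synthetic data. It tests whether replacing theta by theta + gamma * z, for a candidate state variable z, would be supported by the data, using only the residuals and sensitivities of the already fitted restricted model. All names and values are hypothetical.

```python
import random


def score_statistic(residuals, dfdtheta, effect, sigma):
    """Score (Lagrange multiplier) statistic for H0: gamma = 0 when a
    scalar parameter theta is replaced by theta + gamma * effect.

    residuals : y_i - f_i at the fitted restricted model
    dfdtheta  : sensitivities df_i/dtheta at the fitted restricted model
    effect    : candidate state variable values z_i (must not be constant)
    sigma     : measurement standard deviation (assumed known)
    """
    # Chain rule: df_i/dgamma = (df_i/dtheta) * z_i at gamma = 0.
    dfdgamma = [d * z for d, z in zip(dfdtheta, effect)]
    score = sum(r * g for r, g in zip(residuals, dfdgamma)) / sigma ** 2
    # Fisher information blocks for (theta, gamma).
    i_tt = sum(d * d for d in dfdtheta) / sigma ** 2
    i_tg = sum(d * g for d, g in zip(dfdtheta, dfdgamma)) / sigma ** 2
    i_gg = sum(g * g for g in dfdgamma) / sigma ** 2
    # Score variance for gamma, corrected for the estimated theta.
    return score ** 2 / (i_gg - i_tg ** 2 / i_tt)


rng = random.Random(0)
sigma = 0.05
s = [1.0, 2.0, 3.0, 4.0] * 2     # regressor of the toy model y = theta * s
x1 = [0.0] * 4 + [1.0] * 4       # state variable that truly affects theta
x2 = [1.0, -1.0] * 4             # state variable with no real effect
y = [(2.0 + 0.5 * a) * b + rng.gauss(0.0, sigma) for a, b in zip(x1, s)]

# Restricted fit: least-squares estimate of theta in y = theta * s.
theta_hat = sum(b * v for b, v in zip(s, y)) / sum(b * b for b in s)
residuals = [v - theta_hat * b for v, b in zip(y, s)]

# Rank the candidate effects without re-estimating any parameter.
stats = {name: score_statistic(residuals, s, z, sigma)
         for name, z in [("x1", x1), ("x2", x2)]}
best = max(stats, key=stats.get)
print(stats, "-> evolve theta into a function of", best)
```

Under the null hypothesis each statistic is approximately chi-square distributed with one degree of freedom, so a value far above that range points to the effect worth including. A model with several parameters would need the matrix form of the same calculation.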

The procedure continues until process-model mismatch is no longer detected. In the illustrated approach, the evolution of approximated kinetic models is driven by data. Statistical evidence, quantified by MMIs and ERIs, provides a measure to drive changes in the model structure with the aim of reducing the observed process-model mismatch. If process-model mismatch is not detected, one may proceed with additional model trimming procedures to improve the statistical quality of the parameter estimates [7] and/or to remove parameters that are irrelevant for model fitting.
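The overall loop can be summarised by the following skeleton. It is only a sketch of the control flow: the four callables (fit, passes_gof, diagnose, mutate), the dictionary representation of the model and the iteration cap max_iterations are hypothetical placeholders introduced here, not part of the original framework.

```python
def evolve_until_adequate(model, data, fit, passes_gof, diagnose, mutate,
                          max_iterations=10):
    """Iterate estimation, goodness-of-fit testing and evolution.

    All callables are hypothetical placeholders supplied by the user:
      fit(model, data)                 -> fitted parameters
      passes_gof(model, params, data)  -> True if no mismatch is detected
      diagnose(model, params, data)    -> (parameter_name, effect_name)
      mutate(model, parameter_name, effect_name) -> new model structure
    """
    history = []
    for iteration in range(1, max_iterations + 1):
        params = fit(model, data)                        # step 1
        if passes_gof(model, params, data):              # step 2
            return model, params, history
        target, effect = diagnose(model, params, data)   # step 3
        history.append((iteration, target, effect))
        model = mutate(model, target, effect)
    raise RuntimeError("mismatch still detected after max_iterations")


# Toy stand-ins: the model maps each parameter to its current expression.
final_model, _, history = evolve_until_adequate(
    model={"theta2": "theta2"},
    data=None,
    fit=lambda m, d: {},
    passes_gof=lambda m, p, d: "x1" in m["theta2"],
    diagnose=lambda m, p, d: ("theta2", "x1"),
    mutate=lambda m, t, e: {**m, t: m[t] + " + theta5*" + e},
)
print(final_model, history)
```

The iteration cap is a safeguard added in this sketch so that the loop terminates even if the mismatch is never resolved.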

The approach is demonstrated with a simulated baker’s yeast cultivation [8]. The case study is illustrated in Figure 2. In the case study, the aim of the modeler is to accurately describe the kinetic behaviour of biomass concentration x1 and substrate concentration x2 in a fed-batch reactor. A kinetic dataset is generated in silico assuming that baker’s yeast growth obeys Contois-type kinetics, where the growth rate r is inhibited by the biomass concentration x1. An approximated iteration 1 model structure is employed to fit the simulated measurements. In the iteration 1 model, the inhibition effect of biomass on the growth rate is not considered and a goodness-of-fit test highlights the presence of a significant process-model mismatch. The largest model modification index (highlighted in red in Figure 2) is computed for parameter θ2, which appears in the denominator of the growth rate expression, meaning that a significant improvement in the model performance is expected if θ2 is evolved into a more complex state-dependent function. The largest effect relevance index associated with θ2 (highlighted in red in Figure 2) is computed for the biomass concentration x1. The model structure is modified by evolving parameter θ2 into a first-order response surface θ2+θ5x1, which includes the main detected effect. After the evolution, the iteration 2 model is fitted to the kinetic dataset and process-model mismatch is no longer detected by the goodness-of-fit test.
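The effect of the evolution on the growth rate expression can be sketched as follows. The text does not give the rate equations, so the forms below are assumptions: the iteration 1 rate is taken to be of Monod type, with θ2 as the only constant in the denominator, and the data-generating rate is written in the usual Contois form. The parameter values are made up for illustration.

```python
def monod_rate(x1, x2, theta1, theta2):
    """Assumed iteration 1 form: no dependence on biomass x1."""
    return theta1 * x2 / (theta2 + x2)


def evolved_rate(x1, x2, theta1, theta2, theta5):
    """Iteration 2 form: theta2 replaced by theta2 + theta5 * x1."""
    return theta1 * x2 / (theta2 + theta5 * x1 + x2)


def contois_rate(x1, x2, theta1, k):
    """Contois form, assumed here as the data-generating rate law."""
    return theta1 * x2 / (k * x1 + x2)


# Illustrative (made-up) values.
theta1, k, x2 = 0.3, 0.2, 5.0
for x1 in (1.0, 10.0):
    print(f"x1 = {x1:5.1f}: "
          f"iteration 1 = {monod_rate(x1, x2, theta1, 0.2):.4f}, "
          f"iteration 2 = {evolved_rate(x1, x2, theta1, 0.0, k):.4f}, "
          f"Contois = {contois_rate(x1, x2, theta1, k):.4f}")
```

Under these assumptions the evolved expression contains the Contois form as the special case θ2 = 0, which is consistent with the mismatch disappearing after a single evolution step, whereas the iteration 1 rate cannot respond to changes in x1 at all.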

In the illustrated case study, an approximated model structure of baker's yeast growth was improved using the proposed approach, which is based on the computation of MMIs and ERIs. It was possible to detect an inhibiting effect of biomass concentration on the growth rate that was not considered in the initially available approximated model structure. This work paves the way to a fully automated search-based framework for kinetic model identification. The validation of the proposed approach on further case studies, with both simulated and real experimental data, will be the subject of future research.


Figure 1. Block diagram showing the proposed framework for kinetic model improvement.

Figure 2. Iteration 1 and Iteration 2 model structures for baker's yeast growth. The iteration 1 model structure incorrectly assumes that biomass concentration x1 does not inhibit yeast growth rate and a significant process-model mismatch is detected. The computed MMIs and ERIs suggest that a significant model improvement may be achieved if parameter θ2 is evolved into a more complex function of the biomass concentration x1. The iteration 2 model is constructed by evolving θ2 into the first-order response surface θ2+θ5x1. After evolution, the iteration 2 model structure is fitted to the data and process-model mismatch is no longer detected.

References

  1. Y. Bard, Nonlinear Parameter Estimation. Academic Press, 1974.
  2. S. D. Silvey, Statistical Inference. CRC Press, 1975.
  3. A. Wald, “Tests of statistical hypotheses concerning several parameters when the number of observations is large,” Trans. Am. Math. Soc., vol. 54, no. 3, pp. 426–482, 1943.
  4. C. Tsay, R. C. Pattison, M. Baldea, B. Weinstein, S. J. Hodson, and R. D. Johnson, “A superstructure-based design of experiments framework for simultaneous domain-restricted model identification and parameter estimation,” Comput. Chem. Eng., Feb. 2017.
  5. R. F. Engle, “A general approach to Lagrange multiplier model diagnostics,” J. Econom., vol. 20, no. 1, pp. 83–104, Oct. 1982.
  6. S. D. Silvey, “The Lagrangian Multiplier Test,” Ann. Math. Stat., vol. 30, no. 2, pp. 389–407, Jun. 1959.
  7. G. Franceschini and S. Macchietto, “Model-based design of experiments for parameter precision: State of the art,” Chem. Eng. Sci., vol. 63, no. 19, pp. 4846–4872, Oct. 2008.
  8. S. P. Asprey and S. Macchietto, “Statistical tools for optimal dynamic model building,” Comput. Chem. Eng., vol. 24, no. 2, pp. 1261–1267, Jul. 2000.