(415f) A General Strategy for Quantification of the Uncertainty on a Hybrid Model’s Uncertainty Estimates | AIChE

Authors 

Rossi, F. - Presenter, Purdue University
Huang, Y. S., Purdue University
Kumar, S., Purdue University
Lagare, R., Purdue University
Mockus, L., Purdue University
Reklaitis, G., Purdue University
The Industry 4.0 paradigm [1] calls for the systematic use of model-based design, monitoring, control and optimization to improve quality and reproducibility in product manufacturing and to enhance process automation, economics, reliability and safety. This paradigm shift in industrial practice has already influenced several industrial sectors, including the chemical and pharmaceutical industries [2], and is expected to accelerate in the future. Any research contribution that helps facilitate this transition should therefore interest both academia and industry.

A key barrier to the systematic implementation of the Industry 4.0 guidelines is the lack of process models that are accurate and reliable enough for process design, monitoring, control and optimization (the predictions of any mathematical model are affected by parametric and/or structural uncertainty [3]). To help mitigate this problem, we can develop hybrid mechanistic/statistical models, i.e. conventional first-principles models, based on mass, energy and momentum conservation, augmented with Bayesian neural networks. Such models retain the benefits of first-principles models, learn from experimental/process data as data-driven models do, and provide point estimates of the process variables of interest together with estimates of their degree of uncertainty. These smart machine learning models are accurate, reliable and easy to maintain, and can thus be safely embedded into any framework for optimal design and operation of industrial processes [4].

Although hybrid mechanistic/statistical models represent a substantial step forward compared to conventional first-principles and machine learning models, they still suffer from certain limitations, the most important of which is that the degree of uncertainty of their own uncertainty estimates is unknown. It is not surprising that a hybrid model’s uncertainty estimates are, in turn, uncertain: their computation ultimately relies on Bayes’ theorem and thus on a likelihood function (Figure 1), for which we often do not have an exact expression. Formulating an exact likelihood function would require the actual probability distributions of the experimental/process data used to train the hybrid model, and these are usually not available. This contribution offers a way to mitigate this limitation of smart machine learning models: it describes a general strategy for quantifying the uncertainty on their uncertainty estimates.
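The role of the likelihood can be made explicit by writing out Bayes’ theorem. The notation below is assumed here for illustration and does not come from the abstract: θ denotes the parameters/hyperparameters of the hybrid model, D the experimental/process data used for training, p(θ) the prior, and L(D | θ) the likelihood function.

```latex
p(\theta \mid D) \;=\;
\frac{\mathcal{L}(D \mid \theta)\, p(\theta)}
     {\int \mathcal{L}(D \mid \theta')\, p(\theta')\, \mathrm{d}\theta'}
```

Any misspecification of the likelihood carries over to the posterior p(θ | D), and from there to every uncertainty estimate obtained by propagating that posterior through the model.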

The aforementioned strategy, called HM-UQ2, relies on hierarchical Bayesian inference [5] and encompasses four major steps:

1. First, given the available experimental/process data, we use hierarchical Bayesian inference to estimate an appropriate distribution over distributions (DoD) for every measured process variable (each of these statistical objects is a parametric family of probability distributions, in which every member distribution is associated with a specific probability value).

2. Then, we define two different likelihood functions through appropriate manipulation of the DoDs estimated in step 1: a conventional likelihood function, computed with the global modes of the individual DoDs (the global mode of a DoD is the member distribution associated with the maximum probability value), and an augmented likelihood function, computed considering all the member distributions of all the DoDs, which incorporates valuable information on the degree of uncertainty of the likelihood formulation.

3. Next, we use the two likelihood functions formulated in step 2, together with hierarchical Bayesian inference, to estimate two different probability distributions of the parameters/hyperparameters of the hybrid model.

4. Finally, to estimate the degree of uncertainty of the hybrid model’s uncertainty estimates, we perform uncertainty propagation with the two model parameter/hyperparameter distributions estimated in step 3, and use the results of these calculations to compute an appropriate measure of the distance between the two resulting uncertainty estimates.
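The four steps above can be illustrated on a deliberately small toy problem. The sketch below is not the authors' implementation: it replaces the hybrid model with a hypothetical one-parameter linear model y = θ·x, replaces hierarchical Bayesian inference with brute-force Bayes on a grid with flat priors, assumes that the DoD of the measurement error is a family of zero-mean Gaussians indexed by their standard deviation, assumes that the augmented likelihood is the DoD-weighted mixture of the member likelihoods, and uses the absolute difference between predictive standard deviations as the distance measure. All function and variable names are made up for illustration.

```python
import numpy as np

def log_normal_pdf(residuals, sigma):
    """Log density of zero-mean Gaussian residuals with standard deviation sigma."""
    return -0.5 * np.log(2.0 * np.pi * sigma**2) - 0.5 * (residuals / sigma) ** 2

def logsumexp(a, axis=None):
    """Numerically stable log(sum(exp(a))) along the given axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def weighted_mean_std(values, weights):
    """Mean and standard deviation of values under normalized weights."""
    mean = np.sum(weights * values)
    return mean, np.sqrt(np.sum(weights * (values - mean) ** 2))

def hm_uq2_toy(x, y, replicate_errors, x_new, sigma_grid, theta_grid):
    # Step 1 (assumed form): DoD of the measurement error. Each member is
    # N(0, sigma); its probability is the grid posterior of sigma given
    # replicate measurement errors, with a flat prior on sigma.
    log_w = np.array([log_normal_pdf(replicate_errors, s).sum() for s in sigma_grid])
    log_dod = log_w - logsumexp(log_w)

    # Step 2: log-likelihood of the data for every (theta, sigma) pair.
    residuals = y[None, :] - theta_grid[:, None] * x[None, :]
    loglik = np.stack(
        [log_normal_pdf(residuals, s).sum(axis=1) for s in sigma_grid], axis=1
    )
    # Conventional likelihood: only the global mode of the DoD is used.
    loglik_conv = loglik[:, np.argmax(log_dod)]
    # Augmented likelihood (assumed form): DoD-weighted mixture of all members.
    loglik_aug = logsumexp(loglik + log_dod[None, :], axis=1)

    # Step 3: two posteriors of theta on the grid (flat prior on theta).
    post_conv = np.exp(loglik_conv - logsumexp(loglik_conv))
    post_aug = np.exp(loglik_aug - logsumexp(loglik_aug))

    # Step 4: propagate both posteriors to a prediction at x_new and compare
    # the resulting uncertainty estimates (parameter uncertainty only).
    prediction = theta_grid * x_new
    mean_conv, std_conv = weighted_mean_std(prediction, post_conv)
    mean_aug, std_aug = weighted_mean_std(prediction, post_aug)
    return {
        "posterior_conventional": post_conv,
        "posterior_augmented": post_aug,
        "mean_conventional": mean_conv,
        "std_conventional": std_conv,
        "mean_augmented": mean_aug,
        "std_augmented": std_aug,
        "distance": abs(std_aug - std_conv),
    }

# Synthetic data: true theta = 2, true measurement-error std = 0.5.
rng = np.random.default_rng(0)
x = np.linspace(1.0, 5.0, 8)
y = 2.0 * x + rng.normal(0.0, 0.5, size=x.size)
replicate_errors = rng.normal(0.0, 0.5, size=5)

result = hm_uq2_toy(
    x, y, replicate_errors, x_new=3.0,
    sigma_grid=np.linspace(0.1, 2.0, 60),
    theta_grid=np.linspace(0.0, 4.0, 801),
)
print("predictive std, conventional likelihood:", result["std_conventional"])
print("predictive std, augmented likelihood:  ", result["std_augmented"])
print("distance between uncertainty estimates:", result["distance"])
```

In this toy setting, a small distance suggests that the uncertainty estimate is insensitive to how the likelihood is formulated, while a large distance flags an uncertainty estimate that should itself be treated with caution. With few replicate measurements the DoD is broad, so the two posteriors can be expected to differ more.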

Note that HM-UQ2 moderately increases the computational cost of training smart machine learning models, because it requires solving a larger number of hierarchical Bayesian inference problems than traditional training approaches. However, it only marginally affects the computational burden of using these models to make predictions, because the uncertainty propagation step is fully parallelizable and thus quite fast. Consequently, the benefits offered by HM-UQ2 far outweigh its additional computational cost.

The value of quantifying the uncertainty of a hybrid model’s uncertainty estimates is demonstrated on two important pharmaceutical engineering applications, namely, online quality assurance and sensor health monitoring in drug product manufacturing. The pilot-scale plant for production of solid oral doses, used as the reference system in these case studies, belongs to the Center for Particulate Products and Processes (CP3) at Purdue University.

References

1. Santos, C., Mehrsai, A., Barros, A. C., Araújo, M., and Ares, E., Towards Industry 4.0: an overview of European strategic roadmaps, Procedia Manufacturing, 13, 972-979 (2017)

2. Mockus, L., Reklaitis, G., Morris, K., and LeBlond, D., Risk-Based Approach to Lot Release, Journal of Pharmaceutical Sciences, 109, 1035-1042 (2020)

3. Rossi, F., Mockus, L., and Reklaitis, G., Rigorous Bayesian Inference VS New Approximate Strategies for Estimation of the Probability Distribution of the Parameters of DAE Models, Computer Aided Chemical Engineering, 46, 931-936 (2019a)

4. Rossi, F., Manenti, F., Buzzi-Ferraris, G., and Reklaitis, G., Stochastic NMPC/DRTO of batch operations: Batch-to-batch dynamic identification of the optimal description of model uncertainty, Computers & Chemical Engineering, 122, 395-414 (2019b)

5. Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B., Bayesian Data Analysis, CRC Press (2013)