(522b) A New Approximate Method for Rapid Estimation of Probability Distributions

Authors: 
Rossi, F., Purdue University
Mockus, L., Purdue University
Reklaitis, G. V. R., Purdue University

Francesco Rossi*, Linas Mockus, Gintaras Reklaitis

Purdue University, Forney Hall of Chemical Engineering, 480 Stadium Mall Drive,
West Lafayette, IN 47907-2100, United States

frossi@purdue.edu

Numerical algorithms for estimating probability distributions are the backbone
of advanced process design, monitoring and optimization techniques, e.g.
stochastic dynamic optimization (Rossi et al., 2016), robust state estimation
(Mondal et al., 2010) and real-time risk assessment (Si et al., 2012). Owing to
the increasing availability of experimental and process data, these techniques
have recently attracted the attention of both industry and academia. The
development of new, more efficient estimation methods may therefore enable the
systematic application of advanced process design, monitoring and optimization
strategies to industrial plants, which may in turn benefit process economics
and improve process safety.

Most conventional algorithms for estimation of probability density functions
(PDFs), e.g. Metropolis-Hastings, Gibbs sampling and Hamiltonian Monte Carlo
(Gamerman and Lopes, 2006), are Monte Carlo sampling strategies designed to
draw random samples from the high-probability regions of the PDF of interest.
These algorithms are well established, accurate and reliable, but suffer from
some important drawbacks: (I) considerable computational cost; (II) poor
scalability and limited intrinsic concurrency; and (III) poor performance in
the presence of high parameter correlation. They also suffer from an additional
disadvantage specific to chemical engineering applications, which often involve
estimating the probability distribution of the parameters of a process model:
they require an implicit mathematical formulation of the parameter PDF, which
is computed by multiplying a prior distribution by a likelihood function
(Bayes' theorem). This is a considerable disadvantage because, in many
circumstances, it is challenging to derive an exact expression for this
likelihood function, for example when measurement errors are correlated and/or
not normally distributed.
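To make the dependence on the likelihood concrete, the following is a minimal random-walk Metropolis sketch in Python. It is an illustration only, not code from this work: the single-parameter model, the standard normal prior, the noise level `sigma`, the step size and the data `y` are all assumptions. Note that every iteration evaluates the prior and the likelihood, so the sampler cannot run without an expression for both.

```python
import numpy as np

def log_prior(theta):
    # Assumed prior (hypothetical): standard normal on the single parameter.
    return -0.5 * theta**2

def log_likelihood(theta, y, sigma=0.5):
    # Assumed likelihood: independent, normally distributed measurement errors.
    # This is exactly the kind of expression that may be hard to justify in practice.
    return -0.5 * np.sum((y - theta) ** 2) / sigma**2

def metropolis(y, n_samples=5000, step=0.3, seed=0):
    rng = np.random.default_rng(seed)
    theta = 0.0
    log_post = log_prior(theta) + log_likelihood(theta, y)
    chain = np.empty(n_samples)
    for i in range(n_samples):
        # Propose a random-walk move and evaluate the unnormalized log posterior.
        proposal = theta + step * rng.normal()
        log_post_new = log_prior(proposal) + log_likelihood(proposal, y)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < log_post_new - log_post:
            theta, log_post = proposal, log_post_new
        chain[i] = theta
    return chain

# Hypothetical measurements of the parameter itself.
y = np.array([0.9, 1.1, 1.3, 0.8, 1.0])
chain = metropolis(y)
```

The loop is also inherently sequential, since each proposal depends on the current state of the chain, which is one reason for the limited concurrency noted above.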

Figure 1: Architecture of the new PDF estimation algorithm proposed in this
contribution.

This contribution therefore proposes a new approximate strategy for PDF
estimation, tailored to (chemical) engineering applications, which can mitigate
some of the aforementioned issues (Figure 1). The new algorithm can estimate
the PDF of the parameters of any linear/nonlinear, algebraic/differential
process model in three consecutive phases. First, we project the measurements
of every process state onto the space of the model parameters, i.e. the
uncertainty space, by solving several small-scale, independent dynamic
optimization problems (conceptually, this is the inverse of propagating
probability distributions through mathematical models). This operation
generates a set of parameter samples. Second, these samples are used to
construct a closed-form approximation of the likelihood function, using a
combination of expectation maximization and kernel density estimation
techniques. Finally, we compute the posterior distribution by multiplying this
closed-form estimate of the likelihood by a user-supplied prior distribution (a
straightforward application of Bayes' theorem). Thanks to this three-stage
architecture, the new algorithm offers much higher computational efficiency and
better scalability than conventional Monte Carlo methods. In addition, it
computes an estimate of the likelihood function internally, thus relieving the
user of this challenging task. On the other hand, because its calculations are
approximate, the posterior distribution it provides may not be as accurate as
that computed by conventional Monte Carlo methods.
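The three phases can be sketched on a toy problem as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the first-order decay model, the noise level, the parameter bounds, the normal prior and all function names are hypothetical; a scalar fit per measurement stands in for the small dynamic optimization problems; and a plain Gaussian kernel density estimate stands in for the combination of expectation maximization and kernel density estimation.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import gaussian_kde, norm

def model(k, t):
    # Hypothetical process model: first-order decay of a normalized concentration.
    return np.exp(-k * t)

def project_measurements(t, y):
    # Phase 1: one small, independent fit per measurement maps it to a parameter sample.
    samples = []
    for t_i, y_i in zip(t, y):
        res = minimize_scalar(lambda k: (model(k, t_i) - y_i) ** 2,
                              bounds=(0.0, 5.0), method="bounded")
        samples.append(res.x)
    return np.array(samples)

def estimate_posterior(samples, prior_pdf, grid):
    # Phase 2: closed-form likelihood approximation (plain Gaussian KDE in this sketch).
    likelihood = gaussian_kde(samples)(grid)
    # Phase 3: Bayes' theorem on a grid, normalized with the trapezoidal rule.
    unnormalized = likelihood * prior_pdf(grid)
    area = np.sum(0.5 * (unnormalized[1:] + unnormalized[:-1]) * np.diff(grid))
    return unnormalized / area

# Synthetic measurements generated with an assumed true parameter and noise level.
rng = np.random.default_rng(0)
k_true = 1.2
t = np.linspace(0.2, 2.0, 40)
y = model(k_true, t) + 0.02 * rng.normal(size=t.size)

samples = project_measurements(t, y)
grid = np.linspace(0.0, 3.0, 601)
posterior = estimate_posterior(samples, norm(loc=1.0, scale=0.5).pdf, grid)
k_map = grid[np.argmax(posterior)]  # most probable parameter value on the grid
```

Because each fit in phase 1 is independent of the others, the loop in `project_measurements` could be distributed across workers, which is consistent with the scalability argument made above.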

The new PDF estimation strategy has been demonstrated on two case studies:
estimation of the probability distribution of the key parameters of a simple
fed-batch reactor model and of a more complex physiologically based
pharmacokinetic model. As a basis for comparison, both problems have also been
solved with conventional Monte Carlo methods. Both case studies confirm that
the new strategy offers a very good trade-off between accuracy and
computational efficiency, which makes it an ideal candidate for time-critical
PDF estimation tasks. Such tasks are common in stochastic dynamic optimization,
robust data reconciliation and robust soft-sensing problems.

References

Gamerman, D., Lopes, H.F.
(2006). Markov Chain Monte Carlo - Stochastic simulation for Bayesian inference.
Taylor & Francis Group, New York (NY).

Mondal, S., Chakraborty, G., Bhattacharyya, K. (2010). LMI approach to robust
unknown input observer design for continuous systems with noise and
uncertainties. International Journal of Control, Automation and Systems, 8,
210-219.

Si, H., Ji, H., Zeng, X.
(2012). Quantitative risk assessment model of hazardous chemicals leakage and
application. Safety Science, 50, 1452-1461.

Rossi, F., Reklaitis, G., Manenti,
F., Buzzi-Ferraris, G. (2016). Multi-scenario
robust online optimization and control of fed-batch systems via dynamic
model-based scenario selection. AIChE Journal, 62, 3264-3284.