(226d) Tractable Global Solutions to Chance-Constrained Bayesian Optimal Experiment Design for Arbitrary Prior and Noise Distributions | AIChE

(226d) Tractable Global Solutions to Chance-Constrained Bayesian Optimal Experiment Design for Arbitrary Prior and Noise Distributions

Authors 

Rodrigues, D. - Presenter, Instituto Superior Tecnico
Makrygiorgos, G., UC Berkeley
Mesbah, A., University of California, Berkeley
Optimal experiment design (OED) uses a system model to select experimental conditions by maximizing the information content of observations [1]. In Bayesian OED, the design criteria are defined in terms of expected utility, which is often expressed in terms of prior and posterior distributions of the parameters [2]. Bayesian OED is especially advantageous when the system observations are noisy, incomplete, and indirect [3]. Gradient-based optimization approaches can be used to attain locally optimal designs [4]. However, these methods cannot guarantee the global optimality of the selected designs. Furthermore, the handling of path constraints remains a relatively unexplored topic in OED.

This contribution presents a tractable approach for obtaining globally optimal solutions to Bayesian OED for constrained nonlinear systems with probabilistic uncertainty in model parameters. We extend our previous work on Bayesian OED for the particular case of Gaussian prior and observation noise distributions [5] to the general case of arbitrary prior and observation noise distributions via Monte Carlo integration in the observation space. In addition, we propose a method to ensure that the selected design satisfies path constraints with a specified probability. Then, by using a parsimonious input parameterization, we formulate the problem in terms of as few as possible decision variables to enable tractable global solutions [6].

We consider continuous-time, nonlinear dynamical systems where the states depend on the uncertain parameters, the inputs are restricted between lower and upper bounds, and the initial conditions depend on the parameters and the design. Noisy measurements are collected at various instants. We aim to maximize the information content of the observations for estimation of the unknown parameters. To this end, we adopt a Bayesian perspective. Under a given design and a realization of the observations, the change in the information about the parameters between a prior probability density function (pdf) and a posterior pdf is given by Bayes' rule, which also depends on a likelihood function that results in an evidence.

In Bayesian OED, the optimal design is chosen by maximizing an expected utility that corresponds to the prior expectation of a utility function in terms of an integral over the parameter space. In this work, the utility function is the Kullback-Leibler (KL) divergence from the evidence to the likelihood function, and the optimal design is also subject to chance path constraints that specify a probability of violation for each path constraint. The resulting design also maximizes the expected gain in Shannon information from the prior to the posterior distributions.

The expected utility is typically evaluated using nested Monte Carlo integration over the joint observation and parameter space, which can become prohibitively expensive. The chance path constraints are also intractable in their original form due to their formulation in terms of an integral of a nonsmooth function. To address this computational challenge, we approximate the Bayesian OED problem. Under relatively mild assumptions about the prior pdf and observation noise, the optimal design is approximated as the one that maximizes the scalar metric that involves the prior expectation of an integral in the space of observations, which can be approximated via sampling. The chance path constraints are approximated in terms of their first two moments, which are integrals of smooth functions, by choosing appropriate back-off parameters.

The computation of the expected utility and the constraint moments via multivariate integration in the space of parameters typically requires computing the integrands for many samples, which is computationally prohibitive when this procedure is repeated for many designs. Thus, we seek an integration rule based on as few quadrature points as possible. Methods based on Gaussian quadrature minimize the number of points needed for exact integration of polynomials of a given degree. Hence, we use an efficient approach that corresponds to sparse stochastic collocation and is the multivariate equivalent of Gaussian quadrature.

One can express the integrand as an expansion in terms of orthogonal polynomials with respect to the prior pdf. The orthogonal polynomials are given by a truncation scheme to introduce sparsity since this reduces the number of quadrature points when the number of parameters is large. If the number of quadrature points is sufficiently large, one can choose weights and points for exact integration of the orthogonal polynomials. It follows that the approximation error vanishes when the integrand is exactly expressed in terms of these orthogonal polynomials. Hence, this method for multivariate integration is effective for smooth functions.

The approximate expected utility and constraint moments are explicit functions of the states for the different quadrature points. This means that the Bayesian OED problem can be approximated by an optimal control problem (OCP). The inputs that represent the solution to the OCP are composed of several arcs. For each input, each arc can be input constraint-seeking, such that it is determined by the input bounds, state constraint-seeking, such that it is determined by the path constraints, or sensitivity-seeking, such that it is determined by an equality that stems from the dynamics [6]. Hence, there is a finite number of arc types from which arc sequences can be formed. It follows that the number of plausible sequences is also finite.

We aim to show how Bayesian OED problems reformulated as OCPs can be solved efficiently to global optimality. The proposed approach for global optimality relies on computing the optimal values of the decision variables for each arc sequence. Once this is done for each sequence, it is trivial to determine the optimal arc sequence. For a given plausible arc sequence, the inputs are defined by the following decision variables: the switching times between arcs and the initial conditions of the sensitivity-seeking arcs.

For a given arc sequence, we describe the inputs in each interval by defining states for this interval and combining all the states into a vector of extended states. Then, one can eliminate input dependencies and reformulate the OCP in terms of the new decision variables, which is convenient for numerical optimization since there are only a few decision variables. We reformulate the OCP for each arc sequence as a polynomial optimization problem (POP) that is amenable to global optimization. This entails expressing the metric as a polynomial [7]. To this end, we compute the metric and its first-order partial derivatives with respect to the decision variables.

An efficient approach to approximating the metric as a polynomial consists in using multivariate Hermite interpolation to obtain a polynomial that fits the value and the first-order partial derivatives at some sample points. This requires no more than computing the extended states and the extended adjoint variables for each point. The number of sample points is polynomial in the number of decision variables, which is typically small owing to the parsimonious input parameterization. Hence, the OCP for that arc sequence is reformulated as a POP. This POP is solved efficiently to global optimality via reformulation as a hierarchy of convex semidefinite programs using sum-of-squares polynomials. Although the method to solve such problems to global optimality is out of the scope of this contribution, standard methods for this purpose have been described [7].

The proposed approach is applied to a Lotka-Volterra system that is widely used as a benchmark problem for OED. In this case study, the measurement error is state-dependent, the parameters follow a beta prior pdf, and a chance constraint is considered. For each plausible arc sequence, it is possible to extract the unique solution to the POP and certify its global optimality. One can determine the globally optimal solution with no more than 3 arcs, which only requires solving 6 problems in parallel in less than 3600 s. If we had only used local optimization, we could have obtained a worse local solution and it would not have been possible to provide any guarantee that the local solution is in any way close to the globally optimal solution.

In summary, this contribution presents solution methods for obtaining tractable global solutions to Bayesian OED problems in the case of arbitrary prior and observation noise distributions. The proposed methods also deal with designed initial conditions and chance path constraints.

[1] G. Franceschini and S. Macchietto. Model-based design of experiments for parameter precision: State of the art, Chem. Eng. Sci., 63(19):4846 – 4872, 2008.

[2] K. Chaloner and I. Verdinelli. Bayesian experimental design: A review, Stat. Sci., 10(3):273-304, 1995.

[3] X. Huan and Y. M. Marzouk. Simulation-based optimal Bayesian experimental design for nonlinear systems, J. Comput. Phys., 232(1):288-317, 2013.

[4] J. A. Paulson, M. Martin-Casas, and A. Mesbah. Optimal Bayesian experiment design for nonlinear dynamic systems with chance constraints, J. Process Control, 77:155-171, 2019.

[5] D. Rodrigues, G. Makrygiorgos, and A. Mesbah. Tractable global solutions to Bayesian optimal experiment design. In Proceedings of the 59th IEEE Conference on Decision and Control, 1614–1619, Jeju Island, Republic of Korea, 2020.

[6] D. Rodrigues and D. Bonvin. On reducing the number of decision variables for dynamic optimization, Optim. Control Appl. Meth., 41(1):292-311, 2020.

[7] D. Henrion and J. B. Lasserre. GloptiPoly: Global optimization over polynomials with Matlab and SeDuMi, ACM Trans. Math. Softw., 29(2):165-194, 2003.

Checkout

This paper has an Extended Abstract file available; you must purchase the conference proceedings to access it.

Checkout

Do you already own this?

Pricing

Individuals

AIChE Pro Members $150.00
AIChE Emeritus Members $105.00
AIChE Graduate Student Members Free
AIChE Undergraduate Student Members Free
AIChE Explorer Members $225.00
Non-Members $225.00