
(234f) Exploiting Grey-Box Hybrid Models in Constrained Bayesian Optimization Using a Smoothed Sample Average Approximation

Authors 

Lu, C. - Presenter, The Ohio State University, Department of Chemical and Biomolecular Engineering
Paulson, J., The Ohio State University
Optimization is a crucial tool in the process system industry, as optimal solutions for key design parameters can lead to improved performance measured in terms of significant improvement in net profit, sustainability, and/or other key process indicators [1]. A typical optimization procedure not only requires access to an equation-oriented (also known as white-box) model, but also the ability to efficiently evaluate first- and second-order derivatives of this model [2]. However, in many cases, there is at least one or more components of the model that are unknown (commonly referred to as black-box models) that prevent us from applying established derivative-based optimization methods. In such situations, one often resorts to a general-purpose derivative-free optimization (DFO) solver. DFO methods can be broadly categorized into stochastic and deterministic approaches, with stochastic approaches (e.g., genetic algorithm, particle swarm optimization) being known to require many function evaluations, which prevents them from being applied to expensive-to-evaluate functions [3]. Bayesian optimization (BO) [4] is a class of deterministic DFO methods that are specifically built to handle expensive functions that produce noisy evaluations. There are two main steps in BO: (i) the construction of a probabilistic surrogate model using Bayesian statistics and (ii) the combination of this surrogate model with an expected utility (or acquisition) function to defines the “value” of the next query point, which is maximized to decide where next to sample the expensive objective. Through proper selection of the acquisition function, this sequential optimization procedure automatically tradeoffs between exploration of regions where the surrogate model is most uncertain, and exploitation of the regions predicted to be near-optimal based on the current model.

Due to recent successes in several different application domains [5], [6], [7], there has been significant interest in extending BO to settings beyond that of simply an unknown objective with known constraints. For example, two important features that are highly relevant in engineering problems are constraints and hybrid (or grey-box) models. There has been a significant amount of work on the development of constrained BO methods, which can be broadly categorized as either implicit or explicit. Implicit methods directly modify the acquisition function to account for the constraints, whereas explicit methods include additional hard constraints in the acquisition optimization sub-problem. No matter which of these methods is chosen, they all assume that the entire problem is black-box in nature, meaning we have little-to-no prior information about its structure, which is rarely the case in real-world problems. It has been shown in [8] that this black-box assumption fundamentally limits the attainable rate of convergence, such that we can expect the biggest gains in performance to be achieved when this assumption is relaxed. This concept forms the basis of most grey-box optimization algorithms that have been developed within the process systems engineering community over the past several years, e.g., [9], [10], [11], [12], [13]. Despite the prevalence of grey-box modeling/optimization, there has been limited work on grey-box (constrained) BO due to the complexities introduced when combining known equations with a probabilistic model. Our group recently proposed such a method, COBALT [14], which is applicable to any constrained grey-box optimization problem whose objective and/or constraints are represented by composite functions, i.e., f(x) = g(h(x)) where g(.) and h(.) are white-box and black-box functions, respectively. In [14], we demonstrated that COBALT could achieve significant performance gains by exploiting this composite structure on a variety of test and real-world problems. However, a potential challenge with the current version of COBALT is that it uses an explicit constraint handling method based on linearization of the probabilistic surrogate model around its mean function. This approximation greatly simplifies the constrained acquisition optimization sub-problem (by allowing us to replace complex joint chance constraints with a simple moment-based approximation); however, it makes the theoretical analysis of the algorithm more difficult and may not provide ideal performance when any of the constraints are highly nonlinear.
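As a rough illustration of the composite structure f(x) = g(h(x)) (not the surrogate formulation used in COBALT [14]), the sketch below models only the unknown inner function h with Gaussian processes and propagates posterior samples through the known outer function g; the functions h and g here are arbitrary placeholders.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Illustrative composite objective f(x) = g(h(x)):
# h is expensive/unknown (to be modeled), g is known in closed form.
def h(x):
    return np.column_stack([np.sin(3 * x), x ** 2])    # black-box, vector-valued

def g(h_val):
    return h_val[..., 0] - 2.0 * h_val[..., 1]         # white-box algebraic expression

X = np.random.uniform(0, 1, (8, 1))                    # evaluated designs
H = h(X)                                               # observed inner-function values

# Grey-box surrogate: independent GPs on each output of h (rather than one GP on f)
gps = [GaussianProcessRegressor(normalize_y=True).fit(X, H[:, j]) for j in range(H.shape[1])]

# Posterior over f at a test point: sample h from its GP posterior, then push through g.
# Note that f's posterior is generally non-Gaussian even though h's is Gaussian.
x_test = np.array([[0.5]])
h_samples = np.stack(
    [gp.sample_y(x_test, n_samples=500, random_state=j) for j, gp in enumerate(gps)], axis=-1
)
f_samples = g(h_samples)
print("mean/std of f at x_test:", f_samples.mean(), f_samples.std())
```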

In this talk, we develop a modified version of COBALT that overcomes the previous constraint-handling limitation. In particular, we propose a novel acquisition function that directly incorporates the constraints without having to define additional parameters related to the degree of “backoff” in the constraints. This acquisition function is directly analogous to the expected improvement with constraints (EIC) function from [15] (one of the most popular acquisition functions in constrained BO) but extends it to readily account for the composite structure of the constraints. However, even when we use a Gaussian process (GP) model for the unknown portion of the objective and constraints, we cannot evaluate the modified EIC function directly since it is defined in terms of the probability of constraint satisfaction. Although we can easily estimate the required constraint satisfaction probability using the whitening transformation and a sample average approximation (SAA), this estimator is non-differentiable due to the indicator functions that appear in the sample-based probability estimate. To overcome this limitation, we incorporate a smoothing function into the SAA that allows us to effectively optimize the modified EIC using state-of-the-art gradient-based optimization methods. Although this introduces some level of approximation error, we show how this error can be sequentially refined by updating a smoothing parameter that ensures the smoothing function converges to the indicator function in an appropriate limit. The resulting modified COBALT method, which is completely parameter-free, is compared to the original COBALT method along with other constrained grey-box optimization algorithms on a variety of test problems, including optimization of a complex model of an ambient air & solar (AAS) power plant [16]. We will show that the parameter-free version of COBALT can achieve improved performance (in terms of metrics related to sample efficiency) compared to these available alternatives.
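To sketch the smoothed SAA idea in its simplest form (an illustrative simplification, not the exact formulation in the modified COBALT), the snippet below estimates the probability that a composite constraint g(h(x)) <= 0 holds by mapping fixed standard-normal base samples through the GP posterior of h via the whitening (Cholesky) transformation and replacing the non-differentiable indicator with a sigmoid whose temperature tau is tightened over iterations; the constraint function and posterior moments are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def g(h_val):
    # Known (white-box) function of the unknown inner function h;
    # the constraint is g(h(x)) <= 0.  Placeholder expression for illustration.
    return h_val[..., 0] + h_val[..., 1] - 1.0

# Fixed base samples (SAA): reused across all acquisition evaluations so the
# estimator is a deterministic, smooth function of the candidate point x.
N, d_h = 256, 2
Z = rng.standard_normal((N, d_h))

def prob_feasible(mu, L, tau):
    """Smoothed SAA estimate of P[g(h(x)) <= 0].

    mu : (d_h,) posterior mean of h(x) from the GP
    L  : (d_h, d_h) Cholesky factor of the posterior covariance of h(x)
    tau: smoothing temperature; sigmoid(-g/tau) -> indicator{g <= 0} as tau -> 0
    """
    h_samples = mu + Z @ L.T             # whitening transformation of base samples
    g_vals = g(h_samples)                # push posterior samples through known constraint
    return np.mean(1.0 / (1.0 + np.exp(g_vals / tau)))

# Example posterior moments at some candidate x (placeholder values)
mu = np.array([0.3, 0.4])
L = np.linalg.cholesky(np.array([[0.05, 0.01], [0.01, 0.04]]))

for tau in [0.5, 0.1, 0.02]:             # sequentially refine the smoothing parameter
    print(tau, prob_feasible(mu, L, tau))
```

Because the base samples Z are held fixed, the smoothed estimator is differentiable in the GP posterior moments (and hence in x), which is what allows gradient-based optimization of the modified EIC.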

References

[1] J. Herskovits, P. Mappa, E. Goulart, C. M. M. Soares, Mathematical programming models and algorithms for engineering design optimization, Computer Methods in Applied Mechanics and Engineering 194 (30-33) (2005) 3244-3268.

[2] N. V. Sahinidis, BARON: A general purpose global optimization software package, Journal of Global Optimization 8 (2) (1996) 201-205.

[3] L. M. Rios, N. V. Sahinidis, Derivative-free optimization: A review of algorithms and comparison of software implementations, Journal of Global Optimization 56 (3) (2013) 1247-1293.

[4] B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, N. De Freitas, Taking the human out of the loop: A review of Bayesian optimization, Proceedings of the IEEE 104 (1) (2015) 148-175.

[5] F. Sorourifar, G. Makrygirgos, A. Mesbah, J. A. Paulson, A data-driven automatic tuning method for MPC under uncertainty using constrained Bayesian optimization, IFAC-PapersOnLine 54 (3) (2021) 243-250.

[6] F. Sorourifar, N. Choksi, J. A. Paulson, Computationally efficient integrated design and predictive control of flexible energy systems using multi-fidelity simulation-based Bayesian optimization, Optimal Control Applications and Methods (2021).

[7] P. I. Frazier, J. Wang, Bayesian optimization for materials design, in: Information Science for Materials Discovery and Design, Springer, Cham, 2016, pp. 45-75.

[8] H. Chen, Lower rate of convergence for locating a maximum of a function, The Annals of Statistics 16 (3) (1988) 1330-1334.

[9] J. P. Eason, L. T. Biegler, A trust region filter method for glass box/black box optimization, AIChE Journal 62 (9) (2016) 3124-3136.

[10] I. Bajaj, S. S. Iyer, M. M. F. Hasan, A trust region-based two phase algorithm for constrained black-box and grey-box optimization with infeasible initial point, Computers & Chemical Engineering 116 (2018) 306-321.

[11] S. H. Kim, F. Boukouvala, Surrogate-based optimization for mixed-integer nonlinear problems, Computers & Chemical Engineering 140 (2020) 106847.

[12] B. Beykal, F. Boukouvala, C. A. Floudas, N. Sorek, H. Zalavadia, E. Gildin, Global optimization of grey-box computational systems using surrogate functions and application to highly constrained oil-field operations, Computers & Chemical Engineering 114 (2018) 99-110.

[13] B. Beykal, F. Boukouvala, C. A. Floudas, E. N. Pistikopoulos, Optimal design of energy systems using constrained grey-box multi-objective optimization, Computers & Chemical Engineering 116 (2018) 488-502.

[14] J. A. Paulson, C. Lu, COBALT: COnstrained Bayesian optimizAtion of computationaLly expensive grey-box models exploiting derivaTive information, Computers & Chemical Engineering 160 (2022) 107700.

[15] J. R. Gardner, M. J. Kusner, Z. E. Xu, K. Q. Weinberger, J. P. Cunningham, Bayesian optimization with inequality constraints, in: Proceedings of the 31st International Conference on Machine Learning (ICML), vol. 32, 2014.

[16] S. Yang, Solar-driven liquid air power plant modeling, design space exploration, and multi-objective optimization, Energy 246 (2022) 123324, doi: 10.1016/j.energy.2022.123324.