(56f) Entmoot: A Framework for Optimization over Ensemble Tree Models | AIChE


Authors 

Misener, R. - Presenter, Imperial College London
Thebelt, A., Imperial College London
Mistry, M., Imperial College London
Lee, R. M., BASF SE
Using data-driven methods in chemical engineering remains challenging due to the high-dimensional input feature space of complex physical systems. Ever-increasing monitoring of chemical plants allows large-scale collection of system information. However, with control equipment keeping processes at only a few operating points [Pantelides and Renfro, 2013; Tsay et al., 2018], the resulting datasets show low variability. Experimental studies may help to gain more system information but are expensive and time-consuming, resulting in inherently sparse datasets. Coupled with high system dimensionality, this leads to two related challenges: large datasets with low variability and small datasets with high variability.

Using data-driven models as surrogates for mechanistic models in optimization settings is becoming increasingly popular [Bhosekar and Ierapetritou, 2018; Ning and You, 2017]. Successful applications of such surrogate models use algebraic equations [Boukouvala and Floudas, 2017; Wilson and Sahinidis, 2017], artificial neural networks [Henao and Maravelias, 2011; Schweidtmann and Mitsos, 2019], Gaussian processes [Palmer and Realff, 2002; Davis and Ierapetritou, 2007; Caballero and Grossmann, 2008; Olofsson et al., 2019] and gradient boosted trees [Mišić, 2017; Mistry et al., 2018].

We present the ENsemble Tree MOdel Optimization Tool (ENTMOOT) [Thebelt et al., 2020], which provides (i) strong regression performance, (ii) reliable model uncertainty quantification, (iii) natural support of discrete and categorical features, (iv) mathematically proven ε-global optimal solutions of the underlying acquisition function and (v) support for hard constraints. ENTMOOT uses gradient boosted trees [Friedman, 2001; Friedman, 2002], a well-established machine learning architecture that effectively handles sparse data and naturally supports discrete and categorical features. ENTMOOT encodes gradient boosted trees as a mixed-integer linear program (MILP) following Mišić (2017). A distance-based model uncertainty measure captures the prediction performance of the surrogate model and identifies its weaknesses. ENTMOOT derives an acquisition function to trade off the objective captured in the surrogate model against the model uncertainty.
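The distance-based uncertainty idea can be illustrated with a minimal sketch: here, uncertainty is taken as the Euclidean distance from a candidate point to its nearest training point. This is a hypothetical simplification for illustration; ENTMOOT's actual measure and choice of metric may differ.

```python
import math

def nearest_distance(x, X_train):
    # Distance to the closest training point: a simple proxy for model
    # uncertainty (hypothetical simplification of ENTMOOT's measure).
    # Small distance -> well-covered region; large -> extrapolation.
    return min(math.dist(x, xi) for xi in X_train)

X_train = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5)]
# A point between two training points is moderately uncertain ...
print(nearest_distance((1.0, 0.0), X_train))  # 1.0
# ... while a training point itself has zero uncertainty.
print(nearest_distance((0.0, 0.0), X_train))  # 0.0
```

In the actual framework this quantity is expressed inside the MILP rather than evaluated pointwise, so the solver can reason about it over the whole feature space.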

We distinguish between two related applications: (1) optimal decision-making under uncertainty and (2) black-box optimization. Optimal decision-making uses surrogate models to approximate physical systems, e.g. the behavior of a manufacturing unit, and embeds these models in larger optimization problems. Model-based black-box optimization trains surrogate models to guide sequential evaluations of the black-box toward its optimum, e.g. proposing promising new catalyst compositions to maximize reaction yield. ENTMOOT captures these modes in the acquisition function by either penalizing model uncertainty, deriving optimal solutions where we expect good model performance, or incentivizing model uncertainty, efficiently exploring promising areas of the feature space. Depending on how the distance-based uncertainty measure contributes to the objective function, the resulting mixed-integer nonlinear program (MINLP) is convex for optimal decision-making and nonconvex for black-box optimization.
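The two modes can be sketched with a hypothetical additive trade-off between the surrogate prediction and the uncertainty measure, weighted by a parameter `kappa` (both the additive form and the parameter name are illustrative assumptions, not ENTMOOT's exact acquisition function):

```python
def acquisition(pred, uncertainty, kappa, mode):
    # Hypothetical acquisition for a minimization problem.
    # "exploit": penalize uncertainty -> stay where the model is trusted
    #            (optimal decision-making under uncertainty).
    # "explore": reward uncertainty -> probe poorly covered regions
    #            (black-box optimization).
    if mode == "exploit":
        return pred + kappa * uncertainty
    return pred - kappa * uncertainty

# Candidate A: good prediction but far from data; B: worse prediction, near data.
print(acquisition(1.0, 2.0, kappa=1.0, mode="exploit"))  # 3.0 -> B preferred
print(acquisition(1.5, 0.1, kappa=1.0, mode="exploit"))  # 1.6
print(acquisition(1.0, 2.0, kappa=1.0, mode="explore"))  # -1.0 -> A preferred
```

The sign of the uncertainty term is also what drives the convexity distinction above: penalizing a convex distance measure keeps the problem convex, while rewarding it does not.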

Both MINLP formulations are optimized internally by the third-party solver Gurobi 9, providing ε-global optimal solutions and capturing the best trade-off between model uncertainty and predicted performance as intended in the acquisition function. Moreover, the MINLP can be extended with additional hard constraints to include safety-related restrictions or existing physical domain knowledge, enhancing the underlying data-driven model. Other approaches using tree-based models for model-based black-box optimization often rely on stochastic optimization techniques [Shahriari et al., 2016] and thereby fail to provide ε-global optimal solutions of the acquisition function or to satisfy additional constraints.
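Conceptually, a hard constraint simply restricts the search to feasible points. The toy sketch below enumerates a small candidate set with a hypothetical safety constraint; ENTMOOT instead encodes the trees and constraints exactly in the MILP solved by Gurobi, so no enumeration is needed:

```python
def constrained_argmin(candidates, objective, constraints):
    # Pick the best feasible candidate (constraints in g(x) <= 0 form).
    # Illustration only: the real framework expresses feasibility and the
    # acquisition objective inside one MILP rather than enumerating points.
    feasible = [x for x in candidates if all(g(x) <= 0 for g in constraints)]
    return min(feasible, key=objective)

candidates = [(0, 0), (0, 1), (1, 0), (1, 1)]
# Hypothetical hard constraint: x0 + x1 <= 1 (e.g. a safety restriction).
constraints = [lambda x: x[0] + x[1] - 1]
best = constrained_argmin(candidates, lambda x: -(x[0] + 2 * x[1]), constraints)
print(best)  # (0, 1) -- (1, 1) scores better but is infeasible
```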

In an extensive numerical study, we compare ENTMOOT’s performance with scikit-optimize, a popular library for Bayesian optimization using gradient boosted trees and Gaussian processes. We test ENTMOOT on a set of 14 black-box functions, including global optimization benchmark functions [Surjanovic and Bingham, 2020] and a fermentation model. In particular, the study shows that: (i) ENTMOOT’s distance-based measure compares well against other tree-model uncertainty measures, (ii) global optimization techniques become increasingly important for high-dimensional problems, and (iii) the ENTMOOT approach competes well against state-of-the-art Bayesian optimization frameworks using Gaussian processes.

ENTMOOT is an attractive framework to effectively handle sparse data in high-dimensional settings for optimal decision-making under uncertainty and black-box optimization to propose new promising data points to enhance system knowledge.

References:

A. Bhosekar, M. Ierapetritou, 2018. Advances in surrogate based modeling, feasibility analysis, and optimization: A review. Computers & Chemical Engineering 108, 250–267.

F. Boukouvala, C. A. Floudas, 2017. Argonaut: Algorithms for global optimization of constrained grey-box computational problems. Optimization Letters 11 (5), 895–913.

J. A. Caballero, I. E. Grossmann, 2008. An algorithm for the use of surrogate models in modular flowsheet optimization. AIChE Journal 54 (10), 2633–2650.

E. Davis, M. Ierapetritou, 2007. A kriging method for the solution of nonlinear programs with black-box functions. AIChE Journal 53 (8), 2001–2012.

J. H. Friedman, 2001. Greedy function approximation: A gradient boosting machine. Annals of Statistics 29 (5), 1189–1232.

J. H. Friedman, 2002. Stochastic gradient boosting. Computational statistics & data analysis 38 (4), 367–378.

C. A. Henao, C. T. Maravelias, 2011. Surrogate-based superstructure optimization framework. AIChE Journal 57 (5), 1216–1232.

M. Mistry, D. Letsios, G. Krennrich, R. M. Lee, R. Misener, 2018. Mixed-integer convex nonlinear optimization with gradient-boosted trees embedded. arXiv 1803.00952.

V. V. Mišić, 2017. Optimization of tree ensembles. arXiv 1705.10883.

C. Ning, F. You, 2017. Data-driven adaptive nested robust optimization: General modeling framework and efficient computational algorithm for decision making under uncertainty. AIChE Journal 63 (9), 3790–3817.

S. Olofsson, M. Mehrian, R. Calandra, L. Geris, M. P. Deisenroth, R. Misener, 2019. Bayesian Multiobjective Optimisation with Mixed Analytical and Black-Box Functions: Application to Tissue Engineering. IEEE Transactions on Biomedical Engineering 66 (3), 727–739.

K. Palmer, M. Realff, 2002. Optimization and validation of steady-state flowsheet simulation metamodels. Chemical Engineering Research and Design 80 (7), 773–782.

C. C. Pantelides, J. G. Renfro, 2013. The online use of first-principles models in process operations: Review, current status and future needs. Computers & Chemical Engineering 51, 136–148.

A. M. Schweidtmann, A. Mitsos, 2019. Deterministic Global Optimization with Artificial Neural Networks Embedded. Journal of Optimization Theory and Applications 180 (3), 925–948.

B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, N. De Freitas, 2016. Taking the Human Out of the Loop: A Review of Bayesian Optimization. Proceedings of the IEEE 104 (1), 148–175.

S. Surjanovic, D. Bingham, 2020. Virtual Library of Simulation Experiments: Test Functions and Datasets. URL: http://www.sfu.ca/~ssurjano.

A. Thebelt, J. Kronqvist, M. Mistry, R. M. Lee, N. Sudermann-Merx, R. Misener, 2020. ENTMOOT: A Framework for Optimization over Ensemble Tree Models. arXiv 2003.04774.

C. Tsay, R. C. Pattison, M. R. Piana, M. Baldea, 2018. A survey of optimal process design capabilities and practices in the chemical and petrochemical industries. Computers & Chemical Engineering 112, 180–189.

Z. T. Wilson, N. V. Sahinidis, 2017. The ALAMO approach to machine learning. Computers & Chemical Engineering 106, 785–795.