(384d) Gaussian Processes for Hybridizing Analytical & Data-Driven Decision-Making

Misener, R., Imperial College
Olofsson, S., Imperial College
Wiebe, J., Imperial College
Deisenroth, M. P., Imperial College
Surrogate models are widely appreciated in process systems engineering [1]. The typical setting focuses on expensive-to-evaluate, possibly uncertain functions. Examples include: modular process simulators [2], integrated gasification combined cycle processes [3], a carbon capture absorber [4], and many other applications, e.g. [5-8]. Resources are typically limited, so effective decision making requires data-efficient learning.

The data science and statistical machine learning communities typically focus on models learned solely from observed data. But chemical engineering applications may also require explicit, parametric models, e.g. modeling known process constraints, operations constraints, and cost objectives [9]. So, work has integrated semi-algebraic functions with those learned from data [10] or developed semi-physical modeling techniques [11, 12].

This presentation surveys the state of the art in hybridizing analytical and data-driven decision-making [13]. We consider three applications of probabilistic modeling to these hybrid situations:

Design of experiments for model discrimination [14]. We bridge the gap between classical, analytical methods [15] and Monte Carlo-based approaches [16]. Classical methods may have difficulty managing non-analytical model functions, and data-driven Monte Carlo approaches come at a high computational cost. We replace the original, parametric models with probabilistic, non-parametric Gaussian process surrogates learned from model evaluations. The surrogates are flexible regression tools that extend classical analytical results to non-analytical models, while providing confidence bounds on the model predictions and avoiding the computational complexity of Monte Carlo approaches.

Multi-objective optimization [17, 18]. We make novel extensions to Bayesian multi-objective optimization in the case of one analytical objective function and one black-box, i.e. simulation-based, objective function. The resulting method has been applied to a bone neotissue application [19] and a more general test suite.

Scheduling plant operations under uncertainty. For processes with equipment degradation, we use Gaussian processes to approximate large-scale, mixed-integer optimization problems.
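
To make the surrogate idea in the first application concrete, the following is a minimal sketch, not the method of [14]. It assumes two hypothetical rival models (model_a, model_b), a squared-exponential kernel with fixed hyperparameters, and a simplified divergence score (squared difference of surrogate means divided by the summed predictive and measurement variances). All function names, data, and parameter values are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict(x_train, y_train, x_test, length_scale=1.0, variance=1.0, noise=1e-6):
    """Zero-mean GP regression: return predictive mean and variance at x_test."""
    K = rbf_kernel(x_train, x_train, length_scale, variance)
    K += noise * np.eye(len(x_train))  # jitter for numerical stability
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    K_s = rbf_kernel(x_train, x_test, length_scale, variance)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = variance - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off

# Hypothetical rival models; stand-ins for expensive or non-analytical simulators.
def model_a(x):
    return np.sin(x)

def model_b(x):
    return x - x ** 3 / 6.0

# Each surrogate is learned from a handful of model evaluations.
x_train = np.linspace(0.0, 3.0, 8)
x_cand = np.linspace(0.0, 3.0, 61)  # candidate experimental designs
mean_a, var_a = gp_predict(x_train, model_a(x_train), x_cand)
mean_b, var_b = gp_predict(x_train, model_b(x_train), x_cand)

# Simplified divergence score (an assumption of this sketch): prefer designs
# where the surrogates disagree strongly relative to their total uncertainty.
meas_var = 0.01  # assumed measurement noise variance
score = (mean_a - mean_b) ** 2 / (var_a + var_b + meas_var)
x_next = x_cand[np.argmax(score)]
print("next experiment at x =", x_next)
```

The predictive variances play the role of the confidence bounds mentioned above: a candidate design is only rewarded for model disagreement that exceeds the surrogates' own uncertainty, and no Monte Carlo sampling of the rival models is needed to evaluate the score.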

We close by offering a broad outlook on applying probabilistic surrogate models to chemical engineering. Statistical machine learning has recently attracted significant interest in process systems engineering [20]. Here we show that state-of-the-art research in Gaussian processes [21, 22], and in probabilistic modeling more generally [23], can have a substantial impact on chemical engineering.


[1] Bhosekar A, Ierapetritou M. Advances in surrogate based modeling, feasibility analysis and optimization: A review. Comput Chem Eng. 108: 250-267, 2018.

[2] Caballero JA, Grossmann IE. An algorithm for the use of surrogate models in modular flowsheet optimization. AIChE J. 54(10):2633-2650, 2008.

[3] Lang Y, Zitney SE, Biegler LT. Optimization of IGCC processes with reduced order CFD models. Comput Chem Eng. 35(9):1705-1717, 2011.

[4] Cozad A, Sahinidis NV, Miller DC. Learning surrogate models for simulation-based optimization. AIChE J. 60(6):2211-2227, 2014.

[5] Boukouvala F, Ierapetritou M. Surrogate-based optimization of expensive flowsheet modeling for continuous pharmaceutical manufacturing. J Pharm Innov. 8: 131, 2013.

[6] Soepyan FB, Cremaschi S, Sarica C, et al. Estimation of percentiles using the Kriging method for uncertainty propagation. Comput Chem Eng. 93:143-59, 2016.

[7] Shokry A, Ardakani MH, Escudero G, Graells M, Espuña A. Dynamic Kriging-based fault detection and diagnosis approach for nonlinear noisy dynamic processes. Comput Chem Eng. 106:758-76, 2017.

[8] Tran AP, Georgakis C. On the estimation of high-dimensional surrogate models of steady-state of plant-wide processes characteristics. Comput Chem Eng. DOI 10.1016/j.compchemeng.2018.02.014, 2018.

[9] Boukouvala F, Hasan MF, Floudas CA. Global optimization of general constrained grey-box models: new method and its application to constrained PDEs for pressure swing adsorption. J Glob Optim. 67(1-2): 3-42, 2017.

[10] Boukouvala F, Floudas CA. ARGONAUT: AlgoRithms for Global Optimization of coNstrAined grey-box compUTational problems. Optim Lett. 11(5): 895-913, 2017.

[11] Pearson RK, Pottmann M. Gray-box identification of block-oriented nonlinear models. J Process Contr. 10(4):301-315, 2000.

[12] Cozad A, Sahinidis NV, Miller DC. A combined first-principles and data-driven approach to model building. Comput Chem Eng. 73: 116-27, 2015.

[13] Boukouvala F, Misener R, Floudas CA. Global optimization advances in mixed-integer nonlinear programming, MINLP, and constrained derivative-free optimization, CDFO. Eur J Oper Res. 252: 701-727, 2016.

[14] Olofsson S, Deisenroth MP, Misener R. Design of Experiments for Model Discrimination Hybridising Analytical and Data-Driven Approaches. arXiv:1802.04170. 2018.

[15] Buzzi-Ferraris G, Forzatti P, Emig G, Hofmann H. Sequential experimental design for model discrimination in the case of multiple responses. Chem Eng Sci. 39(1): 81-85, 1984.

[16] Vanlier J, Tiemann CA, Hilbers PAJ, van Riel NAW. Optimal experiment design for model selection in biochemical networks. BMC Syst Biol. 8:20, 2014.

[17] Olofsson S, Mehrian M, Geris L, Calandra R, Deisenroth MP, Misener R. Bayesian Multi-Objective Optimisation of Neotissue Growth in a Perfusion Bioreactor Set-Up. Computer Aided Chemical Engineering. 40: 2155-2160, 2017.

[18] Beykal B, Boukouvala F, Floudas CA, Pistikopoulos EN. Optimal design of energy systems using constrained grey-box multi-objective optimization. Comput Chem Eng. DOI 10.1016/j.compchemeng.2018.02.017, 2018.

[19] Mehrian M, Guyot Y, Papantoniou I, Olofsson S, Sonnaert M, Misener R, Geris L. Maximizing neotissue growth kinetics in a perfusion bioreactor: An in silico strategy using model reduction and Bayesian optimization. Biotechnol Bioeng. 115(3):617-29, 2018.

[20] Lee JH, Shin J, Realff MJ. Machine learning: Overview of the recent progresses and implications for the process systems engineering field. Comput Chem Eng. DOI 10.1016/j.compchemeng.2017.10.008, 2017.

[21] Damianou A, Lawrence N. Deep Gaussian processes. In Artificial Intelligence and Statistics. 207-215, 2013.

[22] Deisenroth MP, Ng JW. Distributed Gaussian processes. In 32nd International Conference on Machine Learning. 37: 1481-1490, 2015.

[23] Ghahramani Z. Bayesian non-parametrics and the probabilistic approach to modelling. Phil Trans R Soc A. 371(1984):20110553, 2013.