(11h) Hybrid Modeling and Optimization of Process Flowsheets Using Bayesian Symbolic Regression | AIChE


Vázquez, D., ETH Zürich
Guillén-Gosálbez, G., Imperial College London

Surrogate formulations of first-principles models that mimic their behavior while improving their numerical robustness have become popular to address the optimization of process flowsheets. However, such purely black-box models show poor extrapolation capabilities (they lead to large errors outside the training set) and lack interpretability (due to the absence of mechanistic insights). Additionally, a large number of data points are often required to construct accurate surrogate models [1].

In the recent past, black-box models have been refined by adding mechanistic equations, leading to hybrid models that combine the complementary strengths of first-principles and data-driven approaches. Notably, Artificial Neural Networks (ANNs) and Gaussian processes (GPs) have been widely applied in process optimization, yet they are difficult to interpret and require tailored optimization approaches. The latter is particularly true when global optimization is sought, as in the work by Schweidtmann and Mitsos [2], who proposed a global optimization algorithm for ANNs based on propagating McCormick relaxations in a reduced space.

In recent years, symbolic regression has attracted increasing interest as an alternative to ANNs and GPs. In contrast to the latter two, symbolic regression generates a closed-form analytical expression based on expression trees [3]. In essence, the goal is to find an analytical expression that explains the given data without prior knowledge of the model structure. Such an analytical model can be interpreted more easily and facilitates the use of derivative-based optimization techniques. Furthermore, having analytical surrogates defined in the space of the degrees of freedom could improve the numerical performance of global optimization solvers such as BARON [4].
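To make the idea of an expression tree concrete, the following is a minimal Python sketch, not the implementation of any cited tool. The `Node` class, the operator tables, and the sample expression 2x + exp(-x) are all hypothetical and chosen only for illustration. A symbolic regression method would search over many such trees and score each one against the data, here with a plain sum of squared errors.

```python
import math
from dataclasses import dataclass
from typing import Optional

# Hypothetical operator tables; a real tool would support a richer set.
BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}
UNARY = {
    "exp": math.exp,
    "neg": lambda a: -a,
}


@dataclass
class Node:
    """One node of an expression tree (illustrative structure)."""
    label: str                       # operator, variable name, or "const"
    left: Optional["Node"] = None    # first (or only) child
    right: Optional["Node"] = None   # second child for binary operators
    value: float = 0.0               # used only when label == "const"


def evaluate(node: Node, env: dict) -> float:
    """Recursively evaluate the tree for the variable values in env."""
    if node.label == "const":
        return node.value
    if node.label in BINARY:
        return BINARY[node.label](evaluate(node.left, env),
                                  evaluate(node.right, env))
    if node.label in UNARY:
        return UNARY[node.label](evaluate(node.left, env))
    return env[node.label]           # otherwise the node is a variable


def to_string(node: Node) -> str:
    """Render the tree as a closed-form expression."""
    if node.label == "const":
        return repr(node.value)
    if node.label in BINARY:
        return f"({to_string(node.left)} {node.label} {to_string(node.right)})"
    if node.label in UNARY:
        return f"{node.label}({to_string(node.left)})"
    return node.label


def sse(node: Node, data: list) -> float:
    """Sum of squared errors of the tree over (x, y) samples."""
    return sum((evaluate(node, {"x": x}) - y) ** 2 for x, y in data)


# Candidate expression: 2*x + exp(-x)
x_var = Node("x")
tree = Node("+",
            Node("*", Node("const", value=2.0), x_var),
            Node("exp", Node("neg", x_var)))

# Synthetic samples generated from the same formula, for illustration only.
data = [(x, 2.0 * x + math.exp(-x)) for x in (0.0, 0.5, 1.0, 2.0)]

print(to_string(tree))
print(sse(tree, data))
```

Because the result is an explicit formula rather than a set of network weights, it can be read directly and differentiated symbolically.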

Despite the appealing properties of symbolic regression, hybrid modeling of process flowsheets often relies on ANNs and GPs. Azarpour et al. [5] developed hybrid models for the terephthalic acid production process and the methanol production process based on ANNs complemented with first-principles equations (mass and energy balances). Moreover, Kahrs and Marquardt [6] constructed a hybrid model of ethylene glycol production using ANNs coupled with mass balances and vapor-liquid equilibrium equations, which was optimized to maximize the ethylene glycol yield.

Here we explore the use of Bayesian symbolic regression to build hybrid models of process flowsheets. In essence, we develop hybrid process models by applying symbolic regression tools based on Bayesian learning [7] to generate closed-form analytical expressions of individual process units. These expressions are then inserted into a mechanistic backbone based on mass and energy balances, giving rise to a fully analytical model. This model can be optimized following an equation-oriented approach implemented in an algebraic modeling system interfacing with off-the-shelf solvers. Notably, the problem is formulated as a mixed-integer non-linear programming (MINLP) model. Here, continuous variables denote process variables (e.g., flow rates and temperatures of process streams), while integer variables represent structural decisions (e.g., selection of process units or number of trays in a distillation column). We demonstrate the successful implementation of this approach in two case studies, i.e., propylene glycol and green methanol production. The results show that the hybrid modeling approach outperforms the standalone optimization of the rigorous process simulation, avoiding convergence issues and enabling the use of state-of-the-art solvers, including global optimization packages.
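The structure of such a hybrid model can be illustrated with a toy Python sketch. Everything in it is an assumption made for illustration: the conversion formula stands in for an expression that symbolic regression might return, the two reactor options, prices, and bounds are invented, and the solution method (enumerating the integer choice and running a golden-section search over temperature) is a simple substitute for the algebraic modeling system and MINLP solvers mentioned above.

```python
import math

# Hypothetical data: two reactor options with different surrogate
# coefficients and fixed costs (all numbers invented for illustration).
OPTIONS = {
    0: {"a": 0.02, "fixed_cost": 50.0},
    1: {"a": 0.05, "fixed_cost": 120.0},
}
FEED = 100.0        # feed flow of reactant, kmol/h (assumed)
PRICE = 10.0        # product value per kmol (assumed)
HEAT_COST = 2.0     # cost per K above the reference temperature (assumed)
T_MIN, T_MAX = 300.0, 400.0   # temperature bounds, K (assumed)


def conversion_surrogate(temp: float, a: float) -> float:
    """Closed-form surrogate of a reactor (stand-in for a regressed formula)."""
    return 1.0 - math.exp(-a * (temp - T_MIN))


def mass_balance(feed: float, conversion: float):
    """Mechanistic backbone: split the feed into product and unreacted flow."""
    product = feed * conversion
    unreacted = feed - product
    return product, unreacted


def profit(temp: float, option: int) -> float:
    """Objective combining the surrogate with the mass balance."""
    params = OPTIONS[option]
    product, _ = mass_balance(FEED, conversion_surrogate(temp, params["a"]))
    return PRICE * product - HEAT_COST * (temp - T_MIN) - params["fixed_cost"]


def golden_max(f, lo: float, hi: float, tol: float = 1e-6) -> float:
    """Golden-section search for the maximizer of a unimodal function."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d      # maximizer lies in [a, d]
        else:
            a = c      # maximizer lies in [c, b]
    return 0.5 * (a + b)


def solve():
    """Enumerate the integer choice; optimize temperature for each option."""
    best = None
    for option in OPTIONS:
        temp = golden_max(lambda t: profit(t, option), T_MIN, T_MAX)
        value = profit(temp, option)
        if best is None or value > best[2]:
            best = (option, temp, value)
    return best


best_option, best_temp, best_profit = solve()
print(best_option, round(best_temp, 2), round(best_profit, 2))
```

Enumeration works here only because there is one binary choice and one continuous variable, and the toy objective is concave in temperature. A realistic flowsheet has many coupled variables, which is where a dedicated MINLP solver becomes necessary.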


[1] Fahmi, I., & Cremaschi, S. (2012). Process synthesis of biodiesel production plant using artificial neural networks as the surrogate models. Computers & Chemical Engineering, 46, 105-123.

[2] Schweidtmann, A. M., & Mitsos, A. (2019). Deterministic global optimization with artificial neural networks embedded. Journal of Optimization Theory and Applications, 180(3), 925-948.

[3] Cozad, A., & Sahinidis, N. V. (2018). A global MINLP approach to symbolic regression. Mathematical Programming, 170(1), 97-119.

[4] Tawarmalani, M., & Sahinidis, N. V. (2005). A polyhedral branch-and-cut approach to global optimization. Mathematical Programming, 103(2), 225-249.

[5] Azarpour, A., Borhani, T. N., Alwi, S. R. W., Manan, Z. A., & Mutalib, M. I. A. (2017). A generic hybrid model development for process analysis of industrial fixed-bed catalytic reactors. Chemical Engineering Research and Design, 117, 149-167.

[6] Kahrs, O., & Marquardt, W. (2007). The validity domain of hybrid models and its application in process optimization. Chemical Engineering and Processing: Process Intensification, 46(11), 1054-1066.

[7] Guimerà, R., Reichardt, I., Aguilar-Mogas, A., Massucci, F. A., Miranda, M., Pallarès, J., & Sales-Pardo, M. (2020). A Bayesian machine scientist to aid in the solution of challenging scientific problems. Science Advances, 6(5), eaav6971.