(728e) Comparison of Surrogate Modeling Techniques for Surrogate-Based Optimization | AIChE

Williams, B. - Presenter, Auburn University
Cremaschi, S., Auburn University
Surrogate models, also known as response surfaces, black-box models, metamodels, or emulators, are simplified approximations of more complex, higher-order models (Wang et al., 2014). These models are used to map input data to output data when the actual relationship between the two is unknown or computationally expensive to evaluate (Han and Zhang, 2012). Surrogate models also allow traditional derivative-based optimization techniques to be applied to systems defined by these complex, higher-order models. Surrogate modeling techniques are of particular interest where high-fidelity, and thus expensive, simulations are used, for example in computational fluid dynamics (CFD) or computational structural dynamics (CSD). These techniques are of great importance to the chemical engineering community because they have wide-reaching applications, including process synthesis, process control, and supply chain management. Recent examples include process synthesis applications, such as the optimization of multiphase flow networks (Grimstad et al., 2016), and process control applications in pharmaceutical production (Icten et al., 2015).

Construction of a surrogate model comprises three steps: (1) selection of the sample points, (2) optimization or “training” of the model parameters, and (3) evaluation of the accuracy of the surrogate model (Wang et al., 2014). Although several machine learning and regression techniques have been developed for surrogate model construction, little work has been done on how best to select the appropriate model for a particular application, for both surrogate modeling and surrogate-based optimization. Most studies comparing surrogate model performance compare only a few models on a limited number of functions or complex models. Davis et al. (2017) investigated a more extensive selection of surrogate modeling techniques for 35 challenge functions and concluded that Artificial Neural Networks, Automated Learning of Algebraic Models using Optimization, and Extreme Learning Machines yielded the most accurate predictions for the challenge functions tested. This work aims to build upon that study and to further address the knowledge gap by comparing the ability of eight different surrogate modeling techniques both to learn and accurately model the responses of a set of challenge functions and to locate the extrema of these functions using surrogate-based optimization. The surrogate modeling techniques considered include Artificial Neural Networks (ANN), Automated Learning of Algebraic Models using Optimization (ALAMO), Radial Basis Networks (RBN), Extreme Learning Machines (ELM), Gaussian Process Regression (GPR), Random Forests (RF), Support Vector Regression (SVR), and Multivariate Adaptive Regression Splines (MARS). These techniques are used to construct surrogate models for the 47 optimization challenge functions from the Virtual Library of Simulation Experiments (Surjanovic and Bingham, 2013).
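The three construction steps can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the Gaussian radial basis model is a simple stand-in for the techniques listed, and the function names (`latin_hypercube`, `fit_rbf`, `sphere`), the sample sizes, the `length_scale`, and the `ridge` value are all assumptions chosen for illustration.

```python
import numpy as np

def latin_hypercube(n, bounds, rng):
    """Step 1: draw n sample points, one per stratum in every dimension."""
    bounds = np.asarray(bounds, dtype=float)
    d = len(bounds)
    # One random permutation of the n strata per dimension,
    # with a uniform jitter inside each stratum.
    strata = np.stack([rng.permutation(n) for _ in range(d)], axis=1)
    unit = (strata + rng.random((n, d))) / n
    return bounds[:, 0] + unit * (bounds[:, 1] - bounds[:, 0])

def fit_rbf(X, y, length_scale=1.0, ridge=1e-6):
    """Step 2: "train" a Gaussian radial basis surrogate by solving for its weights."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-sq / (2.0 * length_scale ** 2))
    # The small ridge term (an assumed value) keeps the linear solve well conditioned.
    weights = np.linalg.solve(K + ridge * np.eye(len(X)), y)

    def predict(X_new):
        sq_new = ((X_new[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-sq_new / (2.0 * length_scale ** 2)) @ weights

    return predict

def sphere(X):
    """Illustrative stand-in for an expensive model: sum of squared inputs."""
    return (X ** 2).sum(axis=1)

rng = np.random.default_rng(0)
bounds = [(-2.0, 2.0), (-2.0, 2.0)]

X_train = latin_hypercube(40, bounds, rng)       # (1) select sample points
surrogate = fit_rbf(X_train, sphere(X_train))    # (2) train the surrogate

# (3) evaluate accuracy on an independent set of test points
X_test = latin_hypercube(200, bounds, rng)
rmse = float(np.sqrt(np.mean((surrogate(X_test) - sphere(X_test)) ** 2)))
print(f"test RMSE: {rmse:.4f}")
```

In practice the sample size and the model hyperparameters would be tuned for each function; the fixed values here only keep the sketch short.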
The effects of the challenge function characteristics, including function shape and number of inputs, and of the sampling method on surrogate model performance are evaluated. The sampling methods studied are Sobol sequence sampling and Latin Hypercube sampling (LHS). Four performance measures are used to evaluate the accuracy of the surrogate models: root mean squared error (RMSE), maximum percent error (MPE), the R-squared value, and the Akaike Information Criterion (AIC). The models’ ability to locate the extrema of the functions is evaluated by calculating the distance between the extreme point(s) estimated by the model and the actual function extrema. The results provide guidance on selecting which surrogate modeling technique to use based on the specifics and characteristics of the function or data set being modeled.
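The performance measures named above can be written out as a short Python sketch. The exact definitions used in this work are not stated in the abstract, so the forms below are assumptions: MPE is taken relative to the true response, AIC uses the common least-squares form n ln(SSE/n) + 2k, and the extremum distance is the Euclidean distance to the nearest true extremum. All function names are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def max_percent_error(y_true, y_pred):
    """Largest absolute error relative to the true response, in percent.
    Assumes no true response is exactly zero."""
    return float(np.max(np.abs((y_pred - y_true) / y_true)) * 100.0)

def r_squared(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def aic_least_squares(y_true, y_pred, n_params):
    """AIC in its least-squares form, n * ln(SSE / n) + 2k (an assumed definition)."""
    n = len(y_true)
    sse = np.sum((y_true - y_pred) ** 2)
    return float(n * np.log(sse / n) + 2 * n_params)

def extremum_distance(x_est, x_true_set):
    """Euclidean distance from an estimated extremum to the nearest true extremum."""
    x_true_set = np.atleast_2d(np.asarray(x_true_set, dtype=float))
    diffs = x_true_set - np.asarray(x_est, dtype=float)
    return float(np.min(np.linalg.norm(diffs, axis=1)))

# Made-up responses, used only to exercise the functions.
y_true = np.array([1.0, 2.0, 4.0, 5.0])
y_pred = np.array([1.1, 1.9, 4.2, 4.8])

print("RMSE:", rmse(y_true, y_pred))
print("MPE (%):", max_percent_error(y_true, y_pred))
print("R-squared:", r_squared(y_true, y_pred))
print("AIC:", aic_least_squares(y_true, y_pred, n_params=2))
print("distance:", extremum_distance([3.0, 4.0], [[0.0, 0.0], [3.0, 0.0]]))
```

Taking the minimum over all true extrema matters for functions with several equivalent optima, where a surrogate that finds any one of them should not be penalized.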


Davis, S., Cremaschi, S., Eden, M., 2017, “Efficient Surrogate Model Development: Optimum Model Form Based on Input Function Characteristics”, Computer Aided Chemical Engineering 70.1, 457-462.

Grimstad, B., Foss, B., Heddle, R., Woodman, M., 2016, “Global optimization of multiphase flow networks using spline surrogate models”, Computers and Chemical Engineering 84.1, 237-254.

Han, ZH. and Zhang, KH., 2012, “Surrogate-Based Optimization”, Real-World Applications of Genetic Algorithms. InTech. 343-362.

Icten, E., Nagy, Z., Reklaitis, G., 2015, “Process control of a dropwise additive manufacturing system for pharmaceuticals using polynomial chaos expansion based surrogate model”, Computers and Chemical Engineering 83.1, 221-231.

Surjanovic, S., Bingham, D., 2013, “Virtual Library of Simulation Experiments: Test Functions and Datasets”, http://www.sfu.ca/~ssurjano.

Wang, C., Duan, Q., Gong, W., Ye, A., Di, Z., Miao, C., 2014, “An evaluation of adaptive surrogate modeling based optimization with two benchmark problems”, Environmental Modelling and Software 60.1, 167-179.