(284c) Accurate Surrogate Models for Stochastic Simulations
AIChE Annual Meeting
2021
Computing and Systems Technology Division
Advances in Computational Methods and Numerical Analysis II
Tuesday, November 9, 2021 - 1:08pm to 1:27pm
Surrogate models built with traditional techniques do not accurately represent the outputs of high-fidelity stochastic simulations, e.g., simulations with uncertain parameters (Staum, 2009). The literature on modeling the outputs of a stochastic simulation can be grouped into two categories. In the first category, the uncertain parameter(s) are fixed at a selected subset of their values, and a surrogate model is trained for each value (Hüllen et al., 2019). In some cases, the subset includes only a single nominal value. Because the parameter values are fixed, part of the uncertainty information is lost. The second category employs stochastic kriging to construct the surrogate model (Ankenman et al., 2008). Although stochastic kriging has yielded promising results in predicting the expected output (Ankenman et al., 2008), it fixes the surrogate modeling technique in advance. This is a limitation because comparative analyses of surrogate modeling techniques show that the best technique depends on the characteristics of the input-output data (Williams and Cremaschi, 2021, 2019).
This work introduces a new approach for building accurate surrogate models of stochastic simulations and compares its performance to the existing approaches. Our approach treats the uncertain parameter(s) as additional inputs to the simulation, which converts the high-fidelity stochastic simulation into a deterministic one. The most appropriate surrogate modeling technique is then used to approximate the output of the deterministic model. Next, the impact of the uncertain parameters is propagated to the outputs using the surrogate model and an efficient uncertainty propagation method (Mohammadi and Cremaschi, 2019), yielding an approximation of the stochastic simulation output.
The proposed approach is illustrated in Figure 1. Let Y = g(X; K) be a high-fidelity simulation model, where Y is the stochastic output, X is the input vector with dimension d1, and K is the vector of uncertain system parameters with d2 components. The new simulation model is defined as Y′ = g′(X*), where X* is a d-dimensional vector of inputs with d = d1 + d2, which contains all inputs and the uncertain parameters of the stochastic simulation (g(X; K)). We assume that the distributions of the uncertain parameters (K) are known, and the parameters of the distributions are constant. With these definitions, the new simulation model (Y′ = g′(X*)) becomes deterministic, and any of the surrogate modeling techniques can be utilized to train a model representing this simulation, g′(X*) ≈ F′(X*), where F′(X*) is the trained surrogate model of the deterministic simulation (Figure 1).
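A minimal sketch of this workflow follows. It assumes a hypothetical test function `g` with one input and one uncertain parameter, a polynomial least-squares fit as the surrogate, and plain Monte Carlo sampling as the uncertainty propagation step. These are illustrative stand-ins, not the techniques or test functions used in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

def g(x, k):
    """Hypothetical stand-in for g(X; K) with d1 = 1 input and d2 = 1 parameter."""
    return k * x**2 + np.sin(x)

def features(x_star):
    """Polynomial features of the augmented input X* = [x, k] (an assumed basis)."""
    x, k = x_star[:, 0], x_star[:, 1]
    return np.column_stack(
        [np.ones_like(x), x, k, x * k, x**2, k**2, x**2 * k, x**3]
    )

# 1. Sample the augmented space X*: inputs and uncertain parameters together.
n_train = 200
x_train = rng.uniform(-2.0, 2.0, n_train)
k_train = rng.uniform(0.0, 2.0, n_train)
x_star_train = np.column_stack([x_train, k_train])
y_train = g(x_train, k_train)  # deterministic once k is treated as an input

# 2. Train the surrogate F'(X*) by ordinary least squares.
coef, *_ = np.linalg.lstsq(features(x_star_train), y_train, rcond=None)

def surrogate(x_star):
    """Evaluate the fitted surrogate F'(X*)."""
    return features(x_star) @ coef

# 3. Propagate the uncertainty in K through the surrogate at a fixed x.
#    K ~ Normal(k_mean, k_std) is an assumed, known distribution.
def propagate(x, n_samples=5000, k_mean=1.0, k_std=0.2):
    k = rng.normal(k_mean, k_std, n_samples)
    x_star = np.column_stack([np.full(n_samples, x), k])
    y = surrogate(x_star)
    return y.mean(), y.std(ddof=1)

mean_hat, std_hat = propagate(1.5)
```

Here `mean_hat` and `std_hat` approximate the mean and standard deviation of Y at x = 1.5. For this stand-in function the exact values are 1.5² + sin(1.5) and 0.2 · 1.5², which makes the sketch easy to check.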
Seven different machine learning techniques are studied to build the surrogate model, F′(X*), for the deterministic model, g′(X*). The performance of the new method is evaluated computationally for a set of test functions with uncertain parameters. The test functions are chosen from the Virtual Library of Simulation Experiments (Surjanovic and Bingham, 2013) with different numbers of inputs and uncertain parameters to investigate their impact. We also compare the performance of the new method with three existing approaches from the literature. In the first approach, which is considered the base case, the uncertain parameter(s) are fixed at their nominal values. The second approach considers a subset of fixed values for the uncertain parameter(s), and the third is stochastic kriging. The quality of the estimates is assessed with respect to three features of the test functions: the number of inputs (dimension of X), the number of uncertain parameters (cardinality of K), and the nonlinearity of the uncertain parameters. The performance metrics are the root mean square error and the mean absolute error of the predicted outputs, namely Ŷ and its associated standard deviation. This presentation will discuss the improvements in performance metrics of all the methods compared to the base case.
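The two error metrics named above can be computed as in the sketch below. The reference and predicted values are made-up numbers for illustration only, not results from this work, and the helper names `rmse` and `mae` are hypothetical.

```python
import numpy as np

def rmse(pred, ref):
    """Root mean square error between predictions and reference values."""
    pred, ref = np.asarray(pred, dtype=float), np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

def mae(pred, ref):
    """Mean absolute error between predictions and reference values."""
    pred, ref = np.asarray(pred, dtype=float), np.asarray(ref, dtype=float)
    return float(np.mean(np.abs(pred - ref)))

# Illustrative (made-up) reference and predicted output means at four test points.
ref_mean = np.array([1.0, 2.0, 3.0, 4.0])
pred_mean = np.array([1.1, 1.9, 3.2, 3.8])

# Illustrative (made-up) reference and predicted output standard deviations.
ref_std = np.array([0.50, 0.40, 0.30, 0.20])
pred_std = np.array([0.45, 0.42, 0.33, 0.18])

# Each metric is reported separately for the mean and for the standard deviation.
metrics = {
    "rmse_mean": rmse(pred_mean, ref_mean),
    "mae_mean": mae(pred_mean, ref_mean),
    "rmse_std": rmse(pred_std, ref_std),
    "mae_std": mae(pred_std, ref_std),
}
```

Reporting the metrics for both quantities keeps errors in the predicted mean separate from errors in the predicted spread.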
References
Ankenman, B., Nelson, B.L., Staum, J., 2008. Stochastic kriging for simulation metamodeling. Proc. - Winter Simul. Conf. 362–370. https://doi.org/10.1109/WSC.2008.4736089
Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32.
Eaton, A.N., Beal, L.D.R., Thorpe, S.D., Hubbell, C.B., Hedengren, J.D., Nybø, R., Aghito, M., 2017. Real time model identification using multi-fidelity models in managed pressure drilling. Comput. Chem. Eng. 97, 76–84. https://doi.org/10.1016/j.compchemeng.2016.11.008
Friedman, J.H., 1991. Multivariate adaptive regression splines.
Haleem, K., Gan, A., Lu, J., 2013. Using multivariate adaptive regression splines (MARS) to develop crash modification factors for urban freeway interchange influence areas. Accid. Anal. Prev. 55, 12–21. https://doi.org/10.1016/j.aap.2013.02.018
Han, Z.-H., Zhang, K.-S., 2012. Surrogate-based optimization. Real-world Appl. Genet. algorithms 343–362.
Hüllen, G., Zhai, J., Kim, S.H., Sinha, A., Realff, M.J., Boukouvala, F., 2019. Managing Uncertainty in Data-Driven Simulation-Based Optimization. Comput. Chem. Eng. 106519. https://doi.org/10.1016/j.compchemeng.2019.106519
Liu, B., Koziel, S., Zhang, Q., 2016. A multi-fidelity surrogate-model-assisted evolutionary algorithm for computationally expensive optimization problems. J. Comput. Sci. 12, 28–37. https://doi.org/10.1016/j.jocs.2015.11.004
Mohammadi, S., Cremaschi, S., 2019. Efficiency of Uncertainty Propagation Methods for Estimating Output Moments, in: Muñoz, S.G., Laird, C.D., Realff, M.J. (Eds.), Proceedings of the 9th International Conference on Foundations of Computer-Aided Process Design, Computer Aided Chemical Engineering. Elsevier, pp. 487–492. https://doi.org/10.1016/B978-0-12-818597-1.50078-3
Peherstorfer, B., Kramer, B., Willcox, K., 2017. Combining multiple surrogate models to accelerate failure probability estimation with expensive high-fidelity models. J. Comput. Phys. 341, 61–75. https://doi.org/10.1016/j.jcp.2017.04.012
Quirante, N., Javaloyes, J., Ruiz-Femenia, R., Caballero, J.A., 2015. Optimization of chemical processes using surrogate models based on a kriging interpolation, in: Computer Aided Chemical Engineering. Elsevier, pp. 179–184.
Staum, J., 2009. Better Simulation Metamodeling: The why, what, and how of Stochastic Kriging 119–133.
Surjanovic, S., Bingham, D., 2013. Virtual Library of Simulation Experiments: Test Functions and Datasets.
Szilágyi, B., Agachi, P.Ş., Nagy, Z.K., 2018. Chord Length Distribution Based Modeling and Adaptive Model Predictive Control of Batch Crystallization Processes Using High Fidelity Full Population Balance Models. Ind. Eng. Chem. Res. 57, 3320–3332. https://doi.org/10.1021/acs.iecr.7b03964
Williams, B., Cremaschi, S., 2021. Selection of Surrogate Modeling Techniques for Surface Approximation and Surrogate-Based Optimization. Chem. Eng. Res. Des. https://doi.org/10.1016/j.cherd.2021.03.028
Williams, B.A., Cremaschi, S., 2019. Surrogate Model Selection for Design Space Approximation And Surrogate-based Optimization, in: Computer Aided Chemical Engineering. Elsevier, pp. 353–358.
Williams, C.K.I., Rasmussen, C.E., 2006. Gaussian processes for machine learning. MIT press Cambridge, MA.