(362ac) Physics-Based Penalization for Hyperparameter Optimization in Gaussian Process Regression | AIChE

Authors 

Boukouvala, F., Georgia Institute of Technology
Paynabar, K., Georgia Institute of Technology
Luettgen, C. O., Georgia Institute of Technology
Gaussian Process Regression (GPR) is a powerful non-parametric model that can flexibly approximate continuous functions while providing an accompanying measure of prediction uncertainty. Because GPR is a kernel-based method, both the choice of kernel and the optimization of its hyperparameters via Maximum-Likelihood Estimation (MLE) significantly affect model performance [1]. Due to the nonconvexity of this optimization problem, convergence to a global optimum is rarely guaranteed and can be computationally expensive. Locally optimal hyperparameters can lead to poor extrapolation and interpretability and can cause overfitting. A common approach to this issue is to launch the optimizer from multiple starting points drawn from a chosen prior distribution and select the hyperparameters with the largest marginal likelihood [2]. However, this process depends on the starting points and may fail if the prior distribution is not chosen properly. Moreover, the traditional GP model is a black-box surrogate, with no guarantee that the underlying physics are satisfied and with poor extrapolation properties, especially in sparse-data scenarios. If first-principles knowledge is available, embedding it in various forms during training has been shown to improve the generalizability of the fitted surrogate. GPR with different physical constraints (e.g., bound constraints, monotonicity constraints, convexity constraints) has been studied [3] via the truncated Gaussian assumption [4-6], bounded likelihood functions [7-9], and constrained hyperparameter optimization [10].
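The multi-start MLE heuristic described above can be sketched as follows. This is a minimal illustrative example (not the authors' implementation), assuming an RBF kernel with amplitude, length-scale, and noise hyperparameters optimized in log-space; each restart draws an initial point from a log-uniform prior and the local optimum with the smallest negative log marginal likelihood (NLML) is kept.

```python
import numpy as np
from scipy.optimize import minimize

def rbf(X1, X2, length, amp):
    # Squared-exponential (RBF) kernel on 1-D inputs.
    r2 = (X1[:, None] - X2[None, :]) ** 2
    return amp**2 * np.exp(-0.5 * r2 / length**2)

def nlml(theta, X, y):
    # Negative log marginal likelihood; theta holds log-hyperparameters.
    length, amp, noise = np.exp(theta)
    K = rbf(X, X, length, amp) + noise**2 * np.eye(len(X))
    try:
        L = np.linalg.cholesky(K)
    except np.linalg.LinAlgError:
        return 1e25  # large finite penalty keeps Nelder-Mead moving
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (0.5 * y @ alpha + np.log(np.diag(L)).sum()
            + 0.5 * len(X) * np.log(2 * np.pi))

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, 15)        # sparse training inputs
y = np.sin(2 * np.pi * X)            # toy response

# Multiple restarts from a log-uniform prior; keep the best local optimum.
best = min((minimize(nlml, rng.uniform(-2, 2, 3), args=(X, y),
                     method="Nelder-Mead")
            for _ in range(10)), key=lambda r: r.fun)
print(np.exp(best.x))                # (length-scale, amplitude, noise)
```

Different seeds or priors can converge to different local optima, which is exactly the initialization sensitivity the abstract points to.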

While various physics-based equality and inequality constraints have been successfully incorporated into GPR models, there is no systematic study of how physics-based knowledge affects hyperparameter tuning, particularly when that knowledge is incorporated directly into the marginal likelihood function. In this work, we use physics-based knowledge as a penalization term in the MLE objective. We formulate the augmented MLE objective with a physics-violation function by exploiting the analytical property that any linear transformation of a Gaussian Process is itself a Gaussian Process [1]. The physics-violation function is formulated as the squared L2-norm of the mean prediction under the linear transformation and is added directly to the marginal likelihood, yielding an unconstrained optimization formulation. Through several case studies governed by linear PDEs, including the heat and Laplace equations, we report GPR accuracy and tuning sensitivity for cases where initial and boundary conditions are unavailable. We observe that penalizing the MLE objective yields hyperparameters that improve the predictive performance of GPR while reducing the violation of physics far more consistently than conventional initialization approaches, even in sparse-data scenarios.
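A toy sketch of this penalized objective, under stated assumptions rather than as the authors' implementation: minimize the NLML plus a term lam * ||L mu(x_c)||^2, where L is a linear differential operator applied to the GP posterior mean at collocation points x_c. Since the mean is a weighted sum of kernel evaluations, L passes through to the kernel analytically. As a 1-D analogue of the Laplace equation we penalize the residual of u''(x) = 0 for data from a linear ground truth; the weight `lam`, the collocation grid, and all names are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def rbf(X1, X2, length, amp):
    r2 = (X1[:, None] - X2[None, :]) ** 2
    return amp**2 * np.exp(-0.5 * r2 / length**2)

def d2_rbf(Xc, X, length, amp):
    # Second derivative of the RBF kernel in its first argument:
    # d^2k/dx^2 = amp^2 * exp(-r^2 / (2 l^2)) * (r^2 - l^2) / l^4
    r = Xc[:, None] - X[None, :]
    return amp**2 * np.exp(-0.5 * r**2 / length**2) * (r**2 - length**2) / length**4

def penalized_nlml(theta, X, y, Xc, lam=10.0):
    length, amp, noise = np.exp(theta)
    K = rbf(X, X, length, amp) + noise**2 * np.eye(len(X))
    try:
        L = np.linalg.cholesky(K)
    except np.linalg.LinAlgError:
        return 1e25
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    val = (0.5 * y @ alpha + np.log(np.diag(L)).sum()
           + 0.5 * len(X) * np.log(2 * np.pi))
    residual = d2_rbf(Xc, X, length, amp) @ alpha   # u''(x_c) of posterior mean
    return val + lam * np.sum(residual**2)          # physics-violation penalty

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, 8)                          # sparse data
y = 2.0 * X + 0.5 + 0.05 * rng.standard_normal(8)     # linear truth: u'' = 0
Xc = np.linspace(0.0, 1.0, 25)                        # collocation grid

best = min((minimize(penalized_nlml, rng.uniform(-2, 2, 3),
                     args=(X, y, Xc), method="Nelder-Mead")
            for _ in range(10)), key=lambda r: r.fun)
print(np.exp(best.x))    # (length-scale, amplitude, noise) hyperparameters
```

The same objective remains smooth in the hyperparameters, so the multi-start machinery is unchanged; only the landscape is reshaped toward physics-consistent optima.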

  1. Rasmussen, C.E. Gaussian processes in machine learning. in Summer school on machine learning. 2003. Springer.
  2. Chen, Z. and B. Wang, How priors of initial hyperparameters affect Gaussian process regression models. Neurocomputing, 2018. 275: p. 1702-1710.
  3. Swiler, L.P., et al., A survey of constrained Gaussian process regression: Approaches and implementation challenges. Journal of Machine Learning for Modeling and Computing, 2020. 1(2).
  4. Maatouk, H. and X. Bay, Gaussian process emulators for computer experiments with inequality constraints. Mathematical Geosciences, 2017. 49(5): p. 557-582.
  5. Da Veiga, S. and A. Marrel. Gaussian process modeling with inequality constraints. in Annales de la Faculté des sciences de Toulouse: Mathématiques. 2012.
  6. López-Lopera, A.F., et al., Finite-dimensional Gaussian approximation with linear inequality constraints. SIAM/ASA Journal on Uncertainty Quantification, 2018. 6(3): p. 1224-1255.
  7. Jensen, B.S., J.B. Nielsen, and J. Larsen. Bounded Gaussian process regression. in 2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP). 2013. IEEE.
  8. Riihimäki, J. and A. Vehtari. Gaussian processes with monotonicity information. in Proceedings of the thirteenth international conference on artificial intelligence and statistics. 2010. JMLR Workshop and Conference Proceedings.
  9. Bachoc, F., A. Lagnoux, and A.F. López-Lopera, Maximum likelihood estimation for Gaussian processes under inequality constraints. Electronic Journal of Statistics, 2019. 13(2): p. 2921-2969.
  10. Pensoneault, A., X. Yang, and X. Zhu, Nonnegativity-enforced Gaussian process regression. Theoretical and Applied Mechanics Letters, 2020. 10(3): p. 182-187.
  11. Wang, Y.B., et al., A numerical method for solving the inverse heat conduction problem without initial value. Inverse Problems in Science and Engineering, 2010. 18(5): p. 655-671.
  12. Xiong, X., C. Fu, and H.-F. Li, Fourier regularization method of a sideways heat equation for determining surface heat flux. Journal of Mathematical Analysis and Applications, 2006. 317: p. 331-348.
  13. Vessella, S., Stability Estimates for an Inverse Hyperbolic Initial Boundary Value Problem with Unknown Boundaries. SIAM J. Math. Anal., 2015. 47: p. 1419-1457.
  14. Wang, Z., X. Huan, and K. Garikipati, Variational system identification of the partial differential equations governing microstructure evolution in materials: Inference over sparse and spatially unrelated data. Computer Methods in Applied Mechanics and Engineering, 2021. 377: p. 113706.