(11a) A Hybrid Model Feature Relevance Analysis for First-Principle Model Refinement Suggestions | AIChE

Authors 

Deng, Y. - Presenter, Auburn University
Cremaschi, S., Auburn University
Eden, M., Auburn University
Cheng, S., Chevron Energy Technology Company
Gao, H., Chevron Energy Technology Company
Mathematical models of process behavior can be classified as first-principle models (FPMs), data-driven models (DDMs), or hybrid models (HMs), depending on whether their development relies on process mechanisms or on inference from data [1]. An FPM is derived from process knowledge, whereas a DDM depends on information extracted from process data. A hybrid model combines an FPM with a DDM that is inferred from data where the mechanistic details are not well understood. Within the hybrid model, the DDM compensates for the disagreement between the FPM prediction and the experimental measurements that arises from process knowledge missing in the FPM. The FPM, in turn, extends the DDM's applicability and allows the hybrid model to extrapolate over a wider range of operating conditions. As a result, the hybrid model extrapolates better than the DDM alone and, unlike the FPM alone, can model a system whose mechanisms are only partly known.

Hybrid models are generally arranged in one of two structures: serial or parallel. The serial hybrid model is suitable for systems with few precisely known underlying mechanisms but rich data sets. The parallel hybrid model is preferred when the system can be decomposed into separate effects that are each modeled individually [2]. In a typical parallel hybrid model, the system is decomposed into the FPM prediction and the model discrepancy, which is the difference between the experimental measurements and the FPM prediction [3]. The model discrepancy therefore contains the information in the experimental measurements that the FPM fails to capture. If the model discrepancy is fitted with a DDM, the mechanism missing from the FPM can be inferred from the DDM. If this missing mechanism can be attributed to specific input variables, the inputs whose impact the FPM does not correctly incorporate can be identified. Identifying these features gives FPM developers potential avenues for refinement.
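
The parallel structure described above can be illustrated with a minimal one-input sketch. Everything in it is an assumption made for illustration: the functions f_true and f_fpm, the sample points, and the one-parameter least-squares fit that stands in for the DDM (the study itself uses Gaussian Process Regression).

```python
# Hypothetical 1-D illustration of a parallel hybrid model.

def f_true(x):
    """Stand-in for the experimental system (assumed, not from the study)."""
    return 2.0 * x + 0.5 * x ** 2

def f_fpm(x):
    """Stand-in FPM that omits the quadratic effect (assumed)."""
    return 2.0 * x

# "Experimental" samples and the resulting model discrepancy.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
y_exp = [f_true(x) for x in xs]
discrepancy = [y - f_fpm(x) for x, y in zip(xs, y_exp)]

# Simple stand-in DDM: least-squares fit of c * x**2 to the discrepancy.
c = sum(d * x ** 2 for x, d in zip(xs, discrepancy)) / sum(x ** 4 for x in xs)

def f_ddm(x):
    """Data-driven model of the discrepancy."""
    return c * x ** 2

def f_hybrid(x):
    """Parallel hybrid model: FPM prediction plus DDM correction."""
    return f_fpm(x) + f_ddm(x)
```

In this toy case the fitted DDM depends only on the quadratic term, which is exactly the effect the stand-in FPM leaves out.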

In our previous study [4], we introduced a framework that suggests refinements to a mechanistic model from an input-variable perspective. The framework is based on a parallel hybrid model composed of an FPM and a DDM built on the model discrepancy. After the DDM is built using Gaussian Process Regression (GPR), feature relevance analysis approaches are used to evaluate each feature's importance with respect to the FPM prediction and the model discrepancy. The feature importances from the FPM and the DDM are then compared to determine which features the FPM fails to capture information from. This comparison enables qualitative inference of the equations and parameters on which mechanistic model refinement studies should focus.

This study extends our framework by adding a quantitative evaluation metric to the feature relevance comparison method. The metric estimates the amount of information that the FPM fails to capture from each variable. After the hybrid model is built, feature importance is evaluated using the partial derivatives of the output with respect to the inputs. For each sample point, the partial derivatives are decomposed into contributions from the FPM and the DDM. The magnitude of the partial derivative of the DDM output with respect to an input variable quantifies the amount of information the FPM fails to capture from that input. Then, the FPM feature parameter refinement space, which is the space over which a feature parameter can be tuned to match the relationships inferred from experimental data, is interpreted from the DDM. Feature parameters with a larger refinement space should be modified with priority.
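
The derivative decomposition can be sketched as follows. Because the parallel hybrid output is the sum of the FPM and DDM outputs, its partial derivative with respect to each input is the sum of the two contributions. The functions f_fpm and f_ddm, the sample points, the finite-difference step, and the choice to average the absolute DDM derivative over samples are all illustrative assumptions rather than the study's actual models or metric definition.

```python
def central_diff(f, x, i, h=1e-5):
    """Central finite-difference estimate of df/dx_i at point x."""
    up, dn = list(x), list(x)
    up[i] += h
    dn[i] -= h
    return (f(up) - f(dn)) / (2.0 * h)

def f_fpm(x):
    """Hypothetical FPM: linear in both inputs."""
    return 3.0 * x[0] + x[1]

def f_ddm(x):
    """Hypothetical fitted discrepancy model: a missing effect of x[0] only."""
    return 0.5 * x[0] ** 2

def f_hybrid(x):
    """Parallel hybrid model output."""
    return f_fpm(x) + f_ddm(x)

# Sample points (assumed): a small grid over the two inputs.
samples = [[a, b] for a in (0.5, 1.0, 1.5, 2.0) for b in (0.0, 1.0)]
n_inputs = 2

# Per-sample decomposition: d(hybrid)/dx_i = d(FPM)/dx_i + d(DDM)/dx_i.
fpm_grads = [[central_diff(f_fpm, x, i) for i in range(n_inputs)] for x in samples]
ddm_grads = [[central_diff(f_ddm, x, i) for i in range(n_inputs)] for x in samples]
hyb_grads = [[central_diff(f_hybrid, x, i) for i in range(n_inputs)] for x in samples]

# Assumed summary: mean absolute DDM derivative per input across samples.
ddm_importance = [
    sum(abs(g[i]) for g in ddm_grads) / len(samples) for i in range(n_inputs)
]
```

Here the DDM contribution is nonzero only for the first input, which flags that input as the one whose effect the stand-in FPM captures incompletely.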

The validation experiments cover various scenarios and functions and are designed to test whether the amount of information the FPM fails to capture can be assessed from the DDM feature importance evaluated with the proposed method. To simulate a known true model structure, in each experiment we generate samples from one function, fe, to represent the experimental data and define another function, fm, to represent the FPM. A Gaussian process (GP) is then built using samples of the difference between the fe and fm outputs to represent the model discrepancy. The feature importance inferred from the GP is compared with the true feature importance difference between fe and fm to test the effectiveness of the approach. We employ the Sobol-G function, the Ishigami function, and polynomial functions as test functions. The validation experiments also include scenarios in which the FPM is built using only a subset of the features in the experimental data. The validation results show that the updated feature relevance comparison approach effectively quantifies the amount of information the FPM fails to capture. The approach accurately identifies the feature parameter refinement space for the FPM, whether it is built using a subset or all of the features.
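
A validation experiment of this kind might look like the sketch below, which uses a polynomial test case in which the stand-in FPM is built on a subset of the features. It is not the study's implementation: the functions f_e and f_m, the grids, the fixed squared-exponential kernel with length scale 1.0, the jitter value, and the mean absolute derivative used as the importance measure are all assumptions. The GP posterior mean and its analytic gradient are written directly in NumPy so the example stays self-contained.

```python
import numpy as np

def f_e(X):
    """Stand-in 'experimental' function (assumed polynomial)."""
    return 2.0 * X[:, 0] + X[:, 1] ** 2

def f_m(X):
    """Stand-in FPM that uses only the first feature (assumed)."""
    return 2.0 * X[:, 0]

def rbf(A, B, ell):
    """Squared-exponential kernel matrix between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * d2 / ell ** 2)

def gp_fit(X, y, ell=1.0, jitter=1e-6):
    """Weights alpha of a zero-mean GP posterior mean, mu(x) = k(x, X) @ alpha."""
    K = rbf(X, X, ell) + jitter * np.eye(len(X))
    return np.linalg.solve(K, y)

def gp_mean_grad(Xq, X, alpha, ell=1.0):
    """Analytic gradient of the GP posterior mean at the query points Xq."""
    Kq = rbf(Xq, X, ell)                      # (n_query, n_train)
    diff = Xq[:, None, :] - X[None, :, :]     # (n_query, n_train, n_dim)
    return -(Kq[:, :, None] * diff * alpha[None, :, None]).sum(axis=1) / ell ** 2

def grid(lo, hi, n):
    """Regular n-by-n grid of 2-D points on [lo, hi] x [lo, hi]."""
    g = np.linspace(lo, hi, n)
    return np.array([[a, b] for a in g for b in g])

# Build the GP on the discrepancy between the two stand-in functions.
X_train = grid(-1.0, 1.0, 7)
delta = f_e(X_train) - f_m(X_train)
alpha = gp_fit(X_train, delta)

# Compare GP-inferred importance with the true derivative of f_e - f_m,
# which is 0 for the first feature and 2 * x2 for the second.
X_test = grid(-0.9, 0.9, 10)
gp_importance = np.abs(gp_mean_grad(X_test, X_train, alpha)).mean(axis=0)
true_importance = np.array([0.0, np.abs(2.0 * X_test[:, 1]).mean()])
```

Comparing gp_importance with true_importance per feature mirrors the check described above: the feature omitted from the stand-in FPM should carry nearly all of the discrepancy importance.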

References

  1. Zendehboudi, S.; Rezaei, N.; Lohi, A. Applications of Hybrid Models in Chemical, Petroleum, and Energy Systems: A Systematic Review. Appl. Energy 2018, 228, 2539–2566, doi:10.1016/j.apenergy.2018.06.051.
  2. von Stosch, M.; Oliveira, R.; Peres, J.; Feyo de Azevedo, S. Hybrid Semi-Parametric Modeling in Process Systems Engineering: Past, Present and Future. Comput. Chem. Eng. 2014, 60, 86–101, doi:10.1016/j.compchemeng.2013.08.008.
  3. Jiang, Z.; Chen, W.; Fu, Y.; Yang, R.J. Reliability-Based Design Optimization with Model Bias and Data Uncertainty. SAE Int. J. Mater. Manuf. 2013, 6, doi:10.4271/2013-01-1384.
  4. Deng, Y.; Cremaschi, S.; Eden, M. A Hybrid Model Feature Relevance Analysis for White-Box Model Refinement Suggestions. In Proceedings of the 2021 AIChE Annual Meeting; 2021.