(152a) Combining functional group and Graph Neural Networks Towards Interpretable Molecular Property Models | AIChE

Authors 

Aouichaoui, A. - Presenter, Technical University of Denmark
Abildskov, J., Technical University of Denmark
Mansouri, S. S., Technical University of Denmark
Sin, G., Technical University of Denmark
Fan, F., Technical University of Denmark
Recent advances in machine learning and deep learning have driven a surge of applications across nearly every field of engineering and the life sciences, ranging from fault diagnostics in production facilities (Bao et al., 2022) and supply and demand forecasting (Hubbs et al., 2020) to molecular property prediction (Wieder et al., 2020). The ability to infer the properties of molecules from their structure, known as Quantitative Structure-Property Relationships (QSPRs), has long been of great interest and importance in the development of computer-aided tools for process and product design as well as high-throughput in silico screening.

The introduction of Graph Neural Networks (GNNs) as an alternative approach to QSPR modeling has eliminated the tedious task of developing molecular descriptors that express molecular structural information, replacing it with the message-passing concept (Gilmer et al., 2017). GNNs operate on molecular graphs, a natural representation of molecular topology in which nodes and edges represent the atoms and bonds of a molecule, respectively, and extract a representation that correlates well with the property of interest (Jiménez-Luna et al., 2020). The flexibility of these models and their ability to model properties in an end-to-end learning framework have led to a large number of models for various applications, such as predicting critical properties (Aouichaoui et al., 2022) and various ADMET-related properties of chemicals (Yang et al., 2019).
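To make the message-passing idea concrete, the following is a minimal sketch in plain Python, not the model presented in this work. It assumes a toy molecular graph (the heavy atoms of ethanol), one-hot atom features over a hypothetical two-element vocabulary, and a simple unweighted sum as the update rule; practical GNNs replace that sum with learned weights and nonlinearities.

```python
# Minimal message-passing sketch on a molecular graph (illustrative only).
# Heavy atoms of ethanol: C0 - C1 - O2. The feature encoding and the
# unweighted sum update are assumptions, not a trained model.

# One-hot node features over the hypothetical element vocabulary [C, O].
node_features = {
    0: [1.0, 0.0],  # carbon (CH3)
    1: [1.0, 0.0],  # carbon (CH2)
    2: [0.0, 1.0],  # oxygen (OH)
}

# Undirected bonds as pairs of atom indices.
bonds = [(0, 1), (1, 2)]


def build_neighbors(num_atoms, bonds):
    """Build an adjacency list from the bond list."""
    neighbors = {i: [] for i in range(num_atoms)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    return neighbors


def message_passing_step(states, neighbors):
    """Each atom adds the sum of its neighbors' states to its own state."""
    new_states = {}
    for atom, state in states.items():
        message = [0.0] * len(state)
        for nb in neighbors[atom]:
            message = [m + s for m, s in zip(message, states[nb])]
        new_states[atom] = [s + m for s, m in zip(state, message)]
    return new_states


def readout(states):
    """Sum all atom states into a single molecule-level vector."""
    dim = len(next(iter(states.values())))
    return [sum(state[k] for state in states.values()) for k in range(dim)]


neighbors = build_neighbors(len(node_features), bonds)
states = node_features
for _ in range(2):  # two rounds: each atom sees atoms up to two bonds away
    states = message_passing_step(states, neighbors)
molecule_vector = readout(states)
```

After two rounds, the state of each atom reflects its neighborhood up to two bonds away, and the readout vector would be the input to a property regressor in an end-to-end setting.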

Although these models can match and even exceed the performance of descriptor-based models (Jiang et al., 2021), they suffer from a lack of transparency and interpretability (Jiménez-Luna et al., 2020). Their black-box nature might hinder wider adoption, especially in fields that rely on a first-principles understanding of the phenomenon of interest. Interpretability could also provide new insights into the targeted phenomenon and help avoid the “Clever Hans effect” (correct predictions made for the wrong reasons) (Jiménez-Luna et al., 2020).

In this work, we present a model that combines functional groups, as defined in the well-known group additivity models, with GNNs. The model combines the strengths of both concepts: the interpretability of group-contribution models and the flexibility of GNNs, including their ability to extract neighborhood information. The constructed model is benchmarked against state-of-the-art GNN models (Gilmer et al., 2017; Xiong et al., 2020; Yang et al., 2019; Zhang et al., 2021). The results show comparable accuracy with the added benefit of interpretability, which is shown to be consistent with chemical and thermodynamic insights into the target properties investigated, such as aqueous solubility, enthalpy of fusion, and heat of combustion. In short, the presentation will provide the following:

  • Showcase how chemistry knowledge is combined with GNN models.
  • Showcase the interpretable aspect of the model.
  • Benchmark the performance of the model against state-of-the-art GNN models and group-additivity models.
  • Showcase the insights obtained through such models and compare them to prior knowledge.
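As a rough illustration of why group-additivity models are regarded as interpretable, the sketch below evaluates a generic first-order additive scheme in which a property is the sum of group counts multiplied by group contributions. It is not the model presented in this work: the group names, the decomposition of 1-propanol, and all contribution values are hypothetical placeholders rather than fitted parameters.

```python
# Generic first-order group-additivity sketch:
#   property = sum over groups of (occurrence count * group contribution)
# The contribution values are made-up placeholders, not fitted parameters.

HYPOTHETICAL_CONTRIBUTIONS = {"CH3": 1.5, "CH2": 1.0, "OH": 4.0}


def predict_property(group_counts, contributions):
    """Return the additive prediction and the per-group breakdown."""
    breakdown = {
        group: count * contributions[group]
        for group, count in group_counts.items()
    }
    return sum(breakdown.values()), breakdown


# 1-propanol decomposed as CH3-CH2-CH2-OH (assumed decomposition).
propanol_groups = {"CH3": 1, "CH2": 2, "OH": 1}
prediction, breakdown = predict_property(
    propanol_groups, HYPOTHETICAL_CONTRIBUTIONS
)

# Each entry of `breakdown` attributes part of the prediction to one group,
# which is what makes the prediction directly inspectable.
for group, value in breakdown.items():
    print(f"{group}: {value}")
print(f"total: {prediction}")
```

Because the prediction is a plain sum, the share attributed to each functional group can be read off and compared against chemical intuition.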

References

Aouichaoui, A.R.N., Mansouri, S.S., Abildskov, J., Sin, G., 2022. Uncertainty estimation in deep learning-based property models: Graph neural networks applied to the critical properties. AIChE Journal.

Bao, Y., Wang, B., Guo, P., Wang, J., 2022. Chemical process fault diagnosis based on a combined deep learning method. Can J Chem Eng 100, 54–66.

Gilmer, J., Schoenholz, S.S., Riley, P.F., Vinyals, O., Dahl, G.E., 2017. Neural Message Passing for Quantum Chemistry, in: 34th International Conference on Machine Learning, ICML 2017. pp. 2053–2070.

Hubbs, C.D., Li, C., Sahinidis, N.V., Grossmann, I.E., Wassick, J.M., 2020. A deep reinforcement learning approach for chemical production scheduling. Comput Chem Eng 141, 106982.

Jiang, D., Wu, Z., Hsieh, C.Y., Chen, G., Liao, B., Wang, Z., Shen, C., Cao, D., Wu, J., Hou, T., 2021. Could graph neural networks learn better molecular representation for drug discovery? A comparison study of descriptor-based and graph-based models. J Cheminform 13, 1–23.

Jiménez-Luna, J., Grisoni, F., Schneider, G., 2020. Drug discovery with explainable artificial intelligence. Nat Mach Intell.

Wieder, O., Kohlbacher, S., Kuenemann, M., Garon, A., Ducrot, P., Seidel, T., Langer, T., 2020. A compact review of molecular property prediction with graph neural networks. Drug Discov Today Technol 37, 1–12.

Xiong, Z., Wang, D., Liu, X., Zhong, F., Wan, X., Li, X., Li, Z., Luo, X., Chen, K., Jiang, H., Zheng, M., 2020. Pushing the boundaries of molecular representation for drug discovery with the graph attention mechanism. J Med Chem 63, 8749–8760.

Yang, K., Swanson, K., Jin, W., Coley, C., Eiden, P., Gao, H., Guzman-Perez, A., Hopper, T., Kelley, B., Mathea, M., Palmer, A., Settels, V., Jaakkola, T., Jensen, K., Barzilay, R., 2019. Analyzing Learned Molecular Representations for Property Prediction. J Chem Inf Model 59, 3370–3388.

Zhang, Z., Guan, J., Zhou, S., 2021. FraGAT: a fragment-oriented multi-scale graph attention model for molecular property prediction. Bioinformatics 37, 2981–2987.