(416b) Uncertainty Quantification for Molecular Property Predictions Using Automatic Graph Neural Architecture Search | AIChE

Authors 

Jiang, S. - Presenter, University of Wisconsin-Madison
Zavala, V., University of Wisconsin-Madison
Qin, S., University of Wisconsin-Madison
Van Lehn, R., University of Wisconsin-Madison
Balaprakash, P., Argonne National Laboratory
Generating and evaluating candidates for new molecules using data-driven models is essential in modern synthesis platforms. Quantitative structure−activity relationships (QSARs) provide a powerful data-driven framework for estimating molecular properties without resorting to expensive and time-consuming experiments, but they tend to be limited in predictability and require the user to pre-define structural features [1]. Over the past few years, the field of molecular property prediction has been advanced by deep learning techniques, especially neural networks (NNs) [2]. NNs can leverage flexible representations of molecules that can be tailored to a task rather than depending on expert-defined molecular features [3]–[5]. Despite this progress, however, the use of NNs for molecular modeling still faces a number of limitations. Specifically, the lack of expressiveness and transparency of NNs makes it difficult to assess their robustness, out-of-domain applicability, and potential failure modes. This is particularly relevant when seeking to embed NNs in design-of-experiments tasks. To overcome these limitations, uncertainty quantification (UQ) capabilities need to be provided [6]. UQ is a set of mathematical techniques that aim to characterize data uncertainty and model uncertainty; the former arises from inherent variability or noise in the data, while the latter is associated with model parameter estimation or out-of-distribution predictions [7]. While data uncertainty is typically assumed to be irreducible, model uncertainty can be reduced by gathering more training data in appropriate regions of an experimental space [8].

Recent work has explored UQ for NN models within the context of molecular prediction [9]–[14]. In particular, Hirschfeld et al. systematically evaluated a variety of UQ techniques on message-passing neural networks (MPNNs), which learn parameterized mappings from graph-structured objects to feature vectors and have achieved state-of-the-art performance across various industrial datasets [15]. Probabilistic models, such as full Bayesian formulations, can quantify uncertainty but are computationally intractable for MPNN models that contain millions of trainable parameters. Ensemble approaches that use multiple independently trained MPNNs have shown promise in terms of scalability [15]. Here, each candidate model in the ensemble can be trained in parallel, drastically reducing the computational time. A critical component of the ensemble is its diversity, without which uncertainty cannot be properly quantified. For example, models that share the same NN architecture but use different weight initializations can yield poor estimates of model uncertainty [16]. The lack of scalable UQ capabilities in NN models limits the use of active learning and experimental design.
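One common way an ensemble separates the two uncertainty types is via the law of total variance: each member predicts both a mean and a variance (e.g., by training with a Gaussian negative log-likelihood), the mean of the predicted variances estimates data uncertainty, and the variance of the predicted means estimates model uncertainty. The sketch below illustrates this with made-up member outputs; the numbers are purely hypothetical and stand in for what trained MPNN ensemble members would produce:

```python
import math
import statistics

def gaussian_nll(y, mu, var):
    # Heteroscedastic Gaussian negative log-likelihood for one point:
    # 0.5 * log(2*pi*var) + (y - mu)**2 / (2*var)
    return 0.5 * math.log(2 * math.pi * var) + (y - mu) ** 2 / (2 * var)

# Illustrative (made-up) outputs of a 4-member ensemble at one test input:
# each member predicts a mean and a variance for the target property.
means = [1.10, 0.95, 1.05, 0.90]
variances = [0.04, 0.05, 0.03, 0.06]

# Law-of-total-variance decomposition of the ensemble's predictive variance:
data_uncertainty = statistics.mean(variances)     # mean of predicted variances
model_uncertainty = statistics.pvariance(means)   # variance of predicted means
total_variance = data_uncertainty + model_uncertainty
```

Note that a diverse ensemble (different architectures, not just different weight initializations) tends to spread the predicted means more widely on out-of-distribution inputs, which is exactly what makes `model_uncertainty` informative.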

In this work, we propose an automated approach to construct diverse MPNN models using an adaptation of the AutoDEUQ method [16]. Here, we use aging evolution (AE) to find candidate models automatically. AE constructs the initial population of MPNNs by sampling a set of N random architectures. It evaluates the initial population and records the validation loss of each individual. To quantify uncertainty, the approach uses the negative log-likelihood loss (as opposed to the usual mean squared error) during training. Following initialization, AE samples S architectures uniformly from the population with replacement. The architecture with the lowest validation loss within the sample is selected as the parent. A mutation is performed on the parent to construct a new child architecture; a mutation corresponds to choosing a different MPNN layer operation (e.g., activation function, number of hidden units). The child is trained, its validation loss is recorded, and it is added to the population by replacing the oldest architecture. Over multiple cycles, architectures with lower validation loss are retained in the population via repeated sampling and mutation. The next step is to select the top-k MPNNs from the search to build the ensemble. We leverage variance decomposition to separate data and model uncertainty from the predicted variance of the ensemble: the data uncertainty is the mean of the predicted variances of the ensemble members, and the model uncertainty is the variance of the predicted means. Using the negative log-likelihood as the UQ metric, we demonstrate improved UQ compared to previous ensemble methods on various benchmark datasets, including QM7 and ESOL. We also extend our UQ method to a new multi-molecule dataset for activity coefficient estimation and demonstrate promising results.
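The aging-evolution loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `SEARCH_SPACE` dictionary, the `validation_nll` stand-in (which fakes MPNN training and returns a pseudo-random score), and all hyperparameter values are hypothetical.

```python
import random
from collections import deque

# Hypothetical search space: each architecture is a dict of layer choices.
SEARCH_SPACE = {
    "activation": ["relu", "tanh", "elu"],
    "hidden_units": [32, 64, 128],
}

def random_architecture(rng):
    return {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}

def mutate(arch, rng):
    # A mutation changes one layer operation, e.g. the activation or width.
    child = dict(arch)
    key = rng.choice(list(SEARCH_SPACE))
    child[key] = rng.choice(SEARCH_SPACE[key])
    return child

def validation_nll(arch, rng):
    # Stand-in for training the architecture with the negative log-likelihood
    # loss and returning its validation loss; here just a seeded random score.
    return rng.random() + 0.1 * SEARCH_SPACE["hidden_units"].index(arch["hidden_units"])

def aging_evolution(n_pop=10, sample_size=3, cycles=30, top_k=5, seed=0):
    rng = random.Random(seed)
    population = deque()
    history = []
    # Initial population: N random architectures, each evaluated.
    for _ in range(n_pop):
        arch = random_architecture(rng)
        population.append((arch, validation_nll(arch, rng)))
        history.append(population[-1])
    for _ in range(cycles):
        # Sample S candidates uniformly with replacement; best becomes parent.
        sample = [rng.choice(population) for _ in range(sample_size)]
        parent = min(sample, key=lambda t: t[1])[0]
        child = mutate(parent, rng)
        population.append((child, validation_nll(child, rng)))
        population.popleft()  # "aging": drop the oldest architecture
        history.append(population[-1])
    # Top-k models found during the whole search form the ensemble.
    return sorted(history, key=lambda t: t[1])[:top_k]

ensemble = aging_evolution()
```

Because tournament selection only ever compares a small sample, the loop is cheap per cycle and each child evaluation can run in parallel with others, which is what makes the search (and the resulting ensemble) scale.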

References

[1] A. Cherkasov et al., “QSAR modeling: Where have you been? Where are you going to?,” Journal of Medicinal Chemistry, vol. 57, no. 12, 2014, doi: 10.1021/jm4004285.

[2] W. P. Walters and R. Barzilay, “Applications of Deep Learning in Molecule Generation and Molecular Property Prediction,” Accounts of Chemical Research, vol. 54, no. 2, 2021, doi: 10.1021/acs.accounts.0c00699.

[3] E. N. Feinberg et al., “PotentialNet for Molecular Property Prediction,” ACS Central Science, vol. 4, no. 11, 2018, doi: 10.1021/acscentsci.8b00507.

[4] Z. Hao et al., “ASGN: An Active Semi-supervised Graph Neural Network for Molecular Property Prediction,” 2020. doi: 10.1145/3394486.3403117.

[5] J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, “Neural message passing for quantum chemistry,” in 34th International Conference on Machine Learning, ICML 2017, 2017, vol. 3.

[6] Y. Gal and Z. Ghahramani, “Dropout as a Bayesian approximation: Representing model uncertainty in deep learning,” in 33rd International Conference on Machine Learning, ICML 2016, 2016, vol. 3.

[7] E. Hüllermeier and W. Waegeman, “Aleatoric and epistemic uncertainty in machine learning: an introduction to concepts and methods,” Machine Learning, vol. 110, no. 3, 2021, doi: 10.1007/s10994-021-05946-3.

[8] Y. Gal and Z. Ghahramani, “Dropout as a Bayesian approximation: Representing model uncertainty in deep learning,” in 33rd International Conference on Machine Learning, ICML 2016, 2016, vol. 3.

[9] J. P. Janet, C. Duan, T. Yang, A. Nandy, and H. J. Kulik, “A quantitative uncertainty metric controls error in neural network-driven chemical discovery,” Chemical Science, vol. 10, no. 34, 2019, doi: 10.1039/c9sc02298h.

[10] N. Aniceto, A. A. Freitas, A. Bender, and T. Ghafourian, “A novel applicability domain technique for mapping predictive reliability across the chemical space of a QSAR: Reliability-density neighbourhood,” Journal of Cheminformatics, vol. 8, no. 1, 2016, doi: 10.1186/s13321-016-0182-y.

[11] R. Liu and A. Wallqvist, “Molecular Similarity-Based Domain Applicability Metric Efficiently Identifies Out-of-Domain Compounds,” Journal of Chemical Information and Modeling, vol. 59, no. 1, 2019, doi: 10.1021/acs.jcim.8b00597.

[12] Y. Zhang and A. A. Lee, “Bayesian semi-supervised learning for uncertainty-calibrated prediction of molecular properties and active learning,” Chemical Science, vol. 10, no. 35, 2019, doi: 10.1039/c9sc00616h.

[13] S. Ryu, Y. Kwon, and W. Y. Kim, “Uncertainty quantification of molecular property prediction with Bayesian neural networks,” Mar. 2019, doi: 10.48550/arxiv.1903.08375.

[14] K. Tran, W. Neiswanger, J. Yoon, Q. Zhang, E. Xing, and Z. W. Ulissi, “Methods for comparing uncertainty quantifications for material property predictions,” Machine Learning: Science and Technology, vol. 1, no. 2, 2020, doi: 10.1088/2632-2153/ab7e1a.

[15] L. Hirschfeld, K. Swanson, K. Yang, R. Barzilay, and C. W. Coley, “Uncertainty Quantification Using Neural Networks for Molecular Property Prediction,” Journal of Chemical Information and Modeling, vol. 60, no. 8, 2020, doi: 10.1021/acs.jcim.0c00502.

[16] R. Egele, R. Maulik, K. Raghavan, B. Lusch, I. Guyon, and P. Balaprakash, “AutoDEUQ: Automated Deep Ensemble with Uncertainty Quantification,” Oct. 2021, Accessed: Mar. 04, 2022. [Online]. Available: https://arxiv.org/abs/2110.13511v2