(203c) Characterizing Uncertainty and Error in Machine Learning Chemical Property Prediction | AIChE

Heid, E. - Presenter, Massachusetts Institute of Technology
McGill, C. J., North Carolina State University
Vermeire, F., Massachusetts Institute of Technology
Green, W., Massachusetts Institute of Technology
Deep neural networks have recently made a tremendous impact across chemical engineering disciplines, where graph-convolutional neural networks can predict molecular properties with higher accuracy than previous state-of-the-art techniques. One of the main remaining challenges is to better understand and quantify the different sources of uncertainty associated with molecular property prediction. The uncertainty of a machine learning model's predictions, and thus their deviation from the true target values, arises from uncertainties due to the model (epistemic uncertainty, comprising model bias, parameter uncertainty, and interpolation uncertainty) and from noise in the underlying data (aleatoric uncertainty). Allocating a model's error between epistemic and aleatoric contributions is non-trivial, but essential for identifying where performance improvements are possible. Categorizing the sources of uncertainty in a machine learning model is especially difficult for chemical applications, where the vast chemical space and the diverse nature and number of targets admit a multitude of possible sources of prediction error. The high dimensionality of chemical space further complicates distinguishing interpolation from extrapolation for a given model prediction.
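The epistemic/aleatoric split described above can be illustrated with the law of total variance applied to a deep ensemble. The following is a minimal numpy sketch, not code from the work, and the numbers are invented for illustration: each ensemble member outputs a predicted mean and a predicted aleatoric variance (e.g. from a heteroscedastic output head).

```python
import numpy as np

# Hypothetical ensemble predictions for a single molecule (invented values).
ensemble_means = np.array([2.10, 2.25, 1.95, 2.18, 2.02])
ensemble_vars = np.array([0.30, 0.28, 0.35, 0.31, 0.29])

# Law of total variance: the total predictive variance splits into
#   epistemic = spread of the member means (parameter/model uncertainty)
#   aleatoric = average predicted data noise
epistemic = ensemble_means.var()   # population variance over members
aleatoric = ensemble_vars.mean()
total = epistemic + aleatoric

print(f"epistemic={epistemic:.5f}  aleatoric={aleatoric:.5f}  total={total:.5f}")
```

Note that only the epistemic term shrinks as the ensemble members agree more closely; the aleatoric term reflects data noise that no amount of additional modeling can remove.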

We systematically study the influence of model bias (such as errors due to the model architecture and input representation), model variance, and target data noise on the performance of graph-convolutional neural networks for chemical prediction tasks. Through careful design of molecular prediction tasks for which an exact solution is known and achievable by a graph-convolutional neural network, we are able to add errors to the data and models in a controlled manner and study their effects on model performance. We combine the addition of controlled errors with different uncertainty estimation techniques, changes to the model architecture, and changes in the size or makeup of the dataset to demonstrate trends important to users of machine learning for property prediction. We show that under random noise in the training and test sets, the true performance of a model can continue to improve with larger datasets while the apparent performance approaches an asymptote and ceases to improve. Further, we demonstrate the utility of heteroscedastic and homoscedastic loss functions for assessing the presence of noise errors in a dataset, both when those errors are associated with model features and when they are not. We apply measured ensemble variance as a method of assessing epistemic error and use statistical analysis of the results to project how much of the model error is due to variance that can be observed with ensembling and how much is a baseline bias. Using trends over batch size and observed interactions between different uncertainty characterizations, we provide methods for estimating the contribution of different error types to model performance, the likely effects of adding more data, and the maximum benefit available from ensembling.
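The asymptote in apparent performance can be seen in a toy numerical experiment (our own illustration, not the authors' setup): score an oracle model against labels corrupted by Gaussian noise. The apparent error cannot fall below the noise floor even though the true error is exactly zero.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.5                                   # assumed label-noise std. dev.
y_clean = rng.uniform(0.0, 10.0, 100_000)     # "true" property values
y_noisy = y_clean + rng.normal(0.0, sigma, y_clean.size)  # measured labels

pred = y_clean                                # oracle: recovers truth exactly
true_rmse = np.sqrt(np.mean((pred - y_clean) ** 2))       # exactly zero
apparent_rmse = np.sqrt(np.mean((pred - y_noisy) ** 2))   # close to sigma

print(f"true RMSE = {true_rmse:.3f}, apparent RMSE = {apparent_rmse:.3f}")
```

A real model with nonzero true error would show the same effect: as datasets grow, measured test error flattens out near the noise floor even while the model itself keeps improving.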
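The two loss functions contrasted above can be sketched as plain MSE (homoscedastic: one constant noise level for all samples) versus a Gaussian negative log-likelihood with a per-sample predicted variance (heteroscedastic). This is our own minimal numpy sketch under those standard definitions; in a real model, `mu` and `log_var` would be predicted from molecular features.

```python
import numpy as np

def homoscedastic_loss(y_true, mu):
    # Plain mean-squared error: implicitly assumes a single, constant
    # noise level across the whole dataset.
    return np.mean((y_true - mu) ** 2)

def heteroscedastic_loss(y_true, mu, log_var):
    # Gaussian negative log-likelihood with a per-sample predicted
    # variance (constant terms dropped); the model can down-weight
    # noisy targets by inflating log_var for those samples.
    var = np.exp(log_var)
    return np.mean(0.5 * (log_var + (y_true - mu) ** 2 / var))
```

Comparing the per-sample variances learned under the heteroscedastic loss against a homoscedastic baseline is one way to probe whether label noise is associated with model features or uniform across the dataset.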