(142b) Machine-Learning Enabled Optimization of Force Fields

Authors: 
Befort, B. - Presenter, University of Notre Dame
DeFever, R. S., Clemson University
Tow, G., University of Notre Dame
Maginn, E., University of Notre Dame
Dowling, A., University of Notre Dame
Molecular simulation is a powerful tool for studying the thermodynamic and dynamic properties of materials. Looking ahead, it shows great promise for screening vast molecular design spaces that could be expensive or infeasible to probe experimentally. The rise of molecular engineering (e.g., designer solvents, porous materials, energy storage technology, and biological systems), combined with ever-increasing computing power, has enabled the field of computational molecular science and engineering to progress rapidly. With these new capabilities, simulations can be integrated within multiscale modeling and design frameworks and used to inform experimental design decisions. However, using molecular simulation in this capacity requires quantitatively accurate molecular models, called force fields. Classical molecular simulations model intra- and intermolecular interactions with force fields, which use a functional form and parameters to describe the potential energy of a system. Developing generalized, or transferable, force fields to describe large swaths of chemical space has historically been a laborious endeavor, often taking months to years to complete. Though these off-the-shelf force fields offer accurate predictions for some systems, they inevitably lack quantitative accuracy across the extraordinary range of chemistries found in the natural and synthetic world. Further manual parameter tuning is often necessary to ensure the model has the required accuracy for the molecule(s) and properties of interest [Wang and Kollman, 2001]. Thus, force field optimization represents a bottleneck to applying molecular simulation to new systems.
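To make the idea of "a functional form and parameters" concrete, the sketch below evaluates the 12-6 Lennard-Jones pair potential, a common nonbonded term in classical force fields. It is offered only as an illustration: the abstract does not state which functional forms were used in this work, and the parameter values here are hypothetical numbers in arbitrary units.

```python
def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones pair energy.

    epsilon is the well depth and sigma is the separation at which the
    energy crosses zero; both are tunable force field parameters.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Hypothetical parameter values (arbitrary units), chosen only for illustration.
epsilon, sigma = 0.5, 3.0

# The energy minimum of the 12-6 form sits at r = 2^(1/6) * sigma.
r_min = 2.0 ** (1.0 / 6.0) * sigma

print(lennard_jones(sigma, epsilon, sigma))  # energy at the zero crossing
print(lennard_jones(r_min, epsilon, sigma))  # energy at the minimum, about -epsilon
```

Changing epsilon or sigma shifts the depth and position of the well, which is the kind of adjustment that parameter tuning performs, typically over many interaction sites at once.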

In this work, we propose a machine learning-enabled automated force field optimization framework. Specifically, we show that integrating Gaussian process (GP) regression models, used as surrogates, with support vector machine (SVM) classifiers facilitates rapid tuning of force fields and provides a quick and efficient route to highly accurate, physics-based molecular models. We train a GP surrogate model on the results of molecular simulations so that, for a given set of force field parameters, the GP model predicts the property values that a simulation with those parameters would yield. We explore different methods of constructing the GP model, including various kernel and mean functions, to determine how best to harness its ability to identify optimal regions of force field parameter space and to obtain the most reliable predictions of simulation results. We also examine the use of an SVM classifier to capture discontinuities in the properties that a GP model cannot replicate.
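The surrogate idea can be sketched in a few lines. The example below is a minimal, standard-library-only GP regression with a squared-exponential kernel and a constant mean; it is not the authors' implementation. The function `toy_simulation`, the two normalized parameters, the kernel choice, and the length scale are all assumptions made for illustration, and a real workflow would more likely rely on an established GP library. The SVM classification step is not shown.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two parameter vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def gp_fit(X, y, length=1.0, noise=1e-6):
    """Fit a GP with constant mean; returns what is needed for prediction."""
    mean = sum(y) / len(y)
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    alpha = solve(K, [yi - mean for yi in y])
    return {"X": X, "alpha": alpha, "mean": mean, "length": length}


def gp_predict(model, x):
    """Posterior mean of the GP at a new parameter vector x."""
    k = [rbf(x, xi, model["length"]) for xi in model["X"]]
    return model["mean"] + sum(ki * ai for ki, ai in zip(k, model["alpha"]))


def toy_simulation(s, e):
    """Hypothetical stand-in for an expensive molecular simulation."""
    return 1000.0 + 200.0 * s - 150.0 * e


# Training set: a small grid over two normalized (hypothetical) parameters.
X = [(s, e) for s in (0.0, 0.5, 1.0) for e in (0.0, 0.5, 1.0)]
y = [toy_simulation(s, e) for s, e in X]
model = gp_fit(X, y, length=1.0)

query = (0.25, 0.75)  # a parameter set that was never "simulated"
print(gp_predict(model, query), toy_simulation(*query))
```

Once fitted, the surrogate can be evaluated at many candidate parameter sets at negligible cost, so that only the most promising candidates need to be passed on to full simulations.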

As a demonstration case, we optimize force fields for two hydrofluorocarbons (HFCs), HFC-32 and HFC-125, for properties including liquid and vapor densities, vapor pressure, and enthalpy of vaporization. Within a timeframe of weeks, we find at least 26 HFC-32 and 45 HFC-125 force field parameter sets that give a mean absolute percent error of at most 5% in every property of interest. Additionally, we find that these parameter sets predict transport and critical properties accurately for HFC-32 and HFC-125 without the need for further tuning. We also applied our framework to tuning an ammonium perchlorate force field to predict solid properties, including lattice parameters, unit cell structure, and hydrogen bond distances, angles, and symmetry. Multiple parameter sets have been found that outperform existing force fields in reproducing experimental observations of the listed quantities. Future work involves expanding this tuning method to encompass additional thermodynamic and transport properties of interest in an automated parameterization workflow. As further capabilities are developed, we envision that this tool will be integrated within computer-aided molecular and process design schemes to facilitate rapid multiscale design and optimization.
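The acceptance criterion described above, a mean absolute percent error (MAPE) of at most 5% in every property, can be sketched as follows. The function names, property labels, and all numerical values are placeholders invented for illustration; they are not data from this work.

```python
def mape(predicted, experimental):
    """Mean absolute percent error between predicted and experimental values."""
    errors = [abs(p - e) / abs(e) for p, e in zip(predicted, experimental)]
    return 100.0 * sum(errors) / len(errors)


def passes_screen(predictions, experiments, threshold=5.0):
    """True if the MAPE is within the threshold for every property."""
    return all(
        mape(predictions[prop], experiments[prop]) <= threshold
        for prop in experiments
    )


# Placeholder "experimental" values at two state points (made-up numbers).
experiments = {
    "liquid_density": [1000.0, 900.0],
    "vapor_pressure": [10.0, 20.0],
}

# Placeholder predictions for two hypothetical candidate parameter sets.
candidate_a = {
    "liquid_density": [1010.0, 891.0],
    "vapor_pressure": [10.4, 19.0],
}
candidate_b = {
    "liquid_density": [1010.0, 891.0],
    "vapor_pressure": [11.0, 22.0],
}

for name, cand in (("A", candidate_a), ("B", candidate_b)):
    print(name, passes_screen(cand, experiments))
```

Requiring every property to pass, rather than averaging across properties, prevents a parameter set from offsetting a poor vapor pressure prediction with a good density prediction.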

References:

Wang, J., & Kollman, P. A. (2001). Automatic parameterization of force field by systematic search and genetic algorithms. Journal of Computational Chemistry, 22(12), 1219-1228.