(118b) A Universal Framework for Featurization of Atomistic Systems | AIChE

Medford, A. - Presenter, Georgia Institute of Technology
A major challenge in the development of machine-learned force fields is that most featurization schemes are element-specific. As a result, they scale poorly with the number of elements in the system and require specialized machine-learned force fields for each combination of elements. This prevents the construction of general-purpose machine-learned force fields and makes datasets containing many (5+) elements impractical to treat with feature-based machine-learning models. In this talk we introduce the Gaussian multipole (GMP) featurization scheme (https://arxiv.org/abs/2102.02390), which uses physically relevant multipole expansions of the electron density around atoms to yield feature vectors that interpolate between element types and have a fixed dimension regardless of the number of elements present. We combine GMP with neural networks and compare it directly to the widely used Behler-Parrinello symmetry functions on the MD17 dataset, finding improved accuracy and computational efficiency. Further, we demonstrate that GMP-based models can achieve chemical accuracy on the QM9 dataset, and that their accuracy remains reasonable even when extrapolating to new elements. Finally, we test GMP-based models on the Open Catalyst Project (OCP) dataset, finding comparable performance and improved learning rates relative to graph convolutional deep learning models. The results indicate that this featurization scheme fills a critical gap in the construction of efficient and transferable reactive force fields.
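
The fixed-dimension property can be illustrated with a toy sketch. This is not the GMP implementation from the linked paper: it assumes each atom contributes a single isotropic Gaussian density whose amplitude and width stand in for element identity, and it keeps only the monopole (order 0) and dipole-norm (order 1) terms. The function name `toy_multipole_features`, its parameters, and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def toy_multipole_features(positions, amplitudes, widths, center, sigmas):
    """Toy multipole-style features around one center (illustrative only).

    Each atom i is modeled as amplitudes[i] * exp(-|r - R_i|^2 / (2 * widths[i]^2)).
    For every probe width s, the density is projected onto a Gaussian probe
    centered at `center`, giving an order-0 overlap and the norm of an
    order-1 (dipole-like) moment. Both integrals are evaluated analytically.
    """
    positions = np.asarray(positions, dtype=float)
    amplitudes = np.asarray(amplitudes, dtype=float)
    widths = np.asarray(widths, dtype=float)
    rel = positions - np.asarray(center, dtype=float)  # atom positions relative to center
    r2 = np.sum(rel**2, axis=1)                         # squared distances to center

    feats = []
    for s in sigmas:
        var = s**2 + widths**2
        # Overlap integral of the probe Gaussian with each atomic Gaussian.
        overlap = amplitudes * (2 * np.pi * s**2 * widths**2 / var) ** 1.5 \
            * np.exp(-r2 / (2 * var))
        feats.append(overlap.sum())                      # order 0: scalar, rotation invariant
        # First moment of each product Gaussian is overlap * R_i * s^2 / (s^2 + a_i^2).
        dipole = ((overlap * s**2 / var)[:, None] * rel).sum(axis=0)
        feats.append(np.linalg.norm(dipole))             # order 1: vector norm, rotation invariant
    return np.array(feats)

# Hypothetical systems: two element types versus three element types.
sigmas = [0.5, 1.0]
pos_a = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])
feat_a = toy_multipole_features(pos_a, [6.0, 1.0, 1.0], [0.5, 0.3, 0.3], pos_a[0], sigmas)

pos_b = np.array([[0.0, 0.0, 0.0], [1.1, 0.0, 0.0], [0.0, 1.2, 0.0], [0.0, 0.0, 1.3]])
feat_b = toy_multipole_features(pos_b, [4.0, 1.0, 6.0, 5.0], [0.6, 0.3, 0.5, 0.55],
                                pos_b[0], sigmas)

print(feat_a.shape, feat_b.shape)  # same length for both systems
```

Because element identity enters only through continuous density parameters, adding a new element changes the feature values but not the length of the vector, which is `2 * len(sigmas)` in this sketch. Element-specific schemes, by contrast, typically need a separate set of features per element or element pair.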