(595e) Mapping Transition Metal Chemical Space for Machine Learning Models

Authors: 
Janet, J. P., Massachusetts Institute of Technology
Kulik, H. J., Massachusetts Institute of Technology
The unique and tunable electronic properties of transition metal complexes make them ideal targets for molecular design. However, the high complexity and dimensionality of transition metal chemical space both pose challenges for virtual screening and necessitate new approaches to it. Data-driven machine learning models can circumvent the high computational cost of first-principles simulation, but predictive models require knowledge of how to optimally map heuristic chemical and topological properties of transition metal complexes to energetic outputs. We have recently trained the first artificial neural network (ANN) to predict electronic structure properties of transition metal complexes to 3 kcal/mol accuracy, outperforming previously developed descriptors (i.e., those designed for organic molecules) by an order of magnitude [1]. We have also implemented this ANN in our virtual high-throughput screening toolkit, molSimplify [2], to enable prediction of both structure and electronic properties prior to first-principles simulation. We will discuss the descriptor set behind this ANN and recent modifications to it through our development of fully continuous variable representations. We will describe how we have applied established feature selection techniques to this widened space of candidate descriptors to identify the most important subsets of variables for predictive models. We analyze the selected feature sets by how well they resolve differences and similarities among representative transition metal complexes, using self-organizing maps and principal component analysis. Finally, we conclude with our approaches for decoding continuous variable representations into lead candidate transition metal complexes, so that our predictive machine learning models can be integrated into multi-level molecular design workflows.
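As a rough illustration of the feature selection and principal component analysis steps mentioned above, the following is a minimal sketch on synthetic data. It is not the descriptor set, the feature selection method, or the code used in this work: the descriptor matrix `X`, the target `y`, and the helper names `standardize`, `rank_features`, and `pca_project` are all hypothetical, and a simple univariate correlation filter stands in for whichever established selection technique is actually applied.

```python
import numpy as np

def standardize(X):
    """Scale each descriptor column to zero mean and unit variance."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant columns
    return (X - mu) / sigma

def rank_features(X, y):
    """Rank descriptor columns by absolute Pearson correlation with y.

    A univariate filter, used here only as a stand-in for a real
    feature selection method.
    """
    Xs = standardize(X)
    ys = (y - y.mean()) / y.std()
    corr = Xs.T @ ys / len(y)
    return np.argsort(-np.abs(corr)), corr

def pca_project(X, n_components=2):
    """Project standardized descriptors onto the leading principal components."""
    Xs = standardize(X)
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    return Xs @ Vt[:n_components].T, explained[:n_components]

# Synthetic stand-in data: 200 hypothetical complexes, 6 hypothetical descriptors.
rng = np.random.default_rng(0)
n_complexes, n_descriptors = 200, 6
X = rng.normal(size=(n_complexes, n_descriptors))
# By construction, the synthetic target depends only on descriptors 0 and 3.
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.normal(size=n_complexes)

order, corr = rank_features(X, y)
top = sorted(order[:2].tolist())          # indices of the two top-ranked descriptors
scores, explained = pca_project(X, n_components=2)
```

Here `top` holds the descriptors the filter ranks highest, and `scores` gives two-dimensional coordinates that could be plotted to see how well a descriptor set separates complexes. A self-organizing map would play a similar role for nonlinear structure but is omitted from this sketch.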

[1] J. P. Janet and H. J. Kulik, “Predicting Electronic Structure Properties of Transition Metal Complexes with Neural Networks,” arXiv preprint arXiv:1702.05771 (2017).

[2] E. I. Ioannidis, T. Z. H. Gani, and H. J. Kulik, “molSimplify: A toolkit for automating discovery in inorganic chemistry,” J. Comput. Chem. 37, 2106-2117 (2016).