(195a) Multi-Objective, Machine-Learning Assisted First-Principles Design of Transition Metal Complexes for Redox Couples
- Conference: AIChE Annual Meeting
- Year: 2019
- Proceeding: 2019 AIChE Annual Meeting
- Group: Topical Conference: Applications of Data Science to Molecules and Materials
Monday, November 11, 2019 - 3:30pm-3:45pm
Truly automatic, in silico design of materials with targeted properties is a central goal in chemistry, chemical engineering, and materials science. This challenge is exacerbated by the enormous number of possible chemistries that could be considered, and by the fact that real materials must fulfil multiple criteria to be useful; for example, we might seek redox couples that are active, stable, and soluble. Unfortunately, many promising materials lie in relatively poorly explored regions of chemical space, such as the space of open-shell transition metal (TM) complexes, which are promising candidates for homogeneous catalysis and molecular electronics. While first-principles simulation with density functional theory (DFT) provides a probe to explore large, heterogeneous design spaces, long simulation times and complicated electronic structure severely limit the number of candidates that can be considered. Machine learning (ML) methods can potentially address this problem by using the results of already-completed simulations to generate low-cost surrogate models for target properties, greatly increasing the number of candidates that can be screened. We have previously demonstrated that neural networks based on graph descriptors can predict various properties of TM complexes to near baseline DFT uncertainty with a relatively small number (100s to 1000s) of DFT simulations. However, when applying these models to large design spaces, we observed highly variable generalization performance, reflecting the biases and limited variety of chemical motifs in the training data. We have therefore introduced geometric metrics to equip our surrogate models with reliable estimates of uncertainty, and we have used these estimates to avoid regions of low data coverage when designing spin crossover complexes and complexes with targeted orbital properties.
Here, we use our uncertainty-equipped surrogate models to search a large, million-compound design space for TM complexes that have both target redox energetics and solubility properties for application in redox flow batteries. We use a multi-dimensional formulation of expected improvement that balances exploration (to improve model coverage) and exploitation (to find complexes with targeted properties) to screen this space for lead complexes, which we then evaluate with DFT. By repeating this cycle, we achieve first-principles coverage of the design space only where it is needed, iteratively refining a Pareto set of candidate complexes that sample a range of redox and solvation properties.
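
The two ingredients named above, expected improvement and a Pareto set, can be illustrated with a minimal Python sketch. This is not the multi-dimensional formulation used in this work, whose details the abstract does not give; it shows only the standard closed-form single-objective expected improvement for a Gaussian surrogate prediction, together with a brute-force non-dominated filter. The function names, the Gaussian assumption, and the convention that every objective is minimized are assumptions made for illustration.

```python
import math


def expected_improvement(mean, std, best, minimize=True):
    """Closed-form expected improvement for one Gaussian prediction.

    mean, std: surrogate prediction and its uncertainty estimate (assumed Gaussian).
    best: best value of the property observed so far.
    """
    delta = (best - mean) if minimize else (mean - best)
    if std <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(delta, 0.0)
    z = delta / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    # The first term rewards exploitation, the second rewards exploration.
    return delta * cdf + std * pdf


def pareto_set(points):
    """Return indices of non-dominated points, with all objectives minimized."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            all(qk <= pk for qk, pk in zip(q, p))
            and any(qk < pk for qk, pk in zip(q, p))
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front
```

In a loop of the kind described, candidates with high expected improvement would be sent to DFT, the surrogate retrained on the new results, and the Pareto set recomputed from the evaluated complexes. The brute-force filter scales quadratically with the number of points, which is adequate for a set of evaluated leads but not for a million-compound space.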