(189ay) Applications of the ChemML Program Suite in Predicting Properties of Organic Materials: A Path to Data-Driven Discovery in Chemistry
AIChE Annual Meeting
2018
Computational Molecular Science and Engineering Forum
Poster Session: Computational Molecular Science and Engineering Forum (CoMSEF)
Monday, October 29, 2018 - 3:30pm to 5:00pm
Data mining and machine learning (ML) algorithms have received increased attention in chemical and materials research in recent years. To facilitate the broader dissemination of these techniques, we have developed an open-source program suite called ChemML. Recent technical advances in ChemML allow us to uncover hidden structure-property relationships that govern the behavior of chemical and materials systems. These insights are a prerequisite for the rational design and inverse engineering capabilities outlined in the White House Materials Genome Initiative. This presentation highlights the capabilities of ChemML for accurately predicting properties of organic molecules in different applications. For a benchmark data set of electronic properties of more than 2 million organic semiconductors, we demonstrate that substructure-based fingerprints allow the construction of ML models with state-of-the-art performance. We also employ ChemML to predict the refractive index of polymers using different deep learning architectures. These models merge different feature representations of polymers and, using design methodologies, achieve predictions that are competitive with those of significantly more complex machine learning techniques. All of these models can further be used in virtual high-throughput screening to accelerate materials discovery and as a guide for rational design.
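To make the idea of fingerprint-based property prediction concrete, the sketch below shows one minimal way such a model could be structured. It is not ChemML's API and not the method used in this work. As assumptions for illustration only, character n-grams of a SMILES string stand in for real substructure fingerprints, a Tanimoto-similarity nearest-neighbor average stands in for the ML model, and the target is a placeholder (the count of carbon characters in the string) rather than a measured property. All function names are hypothetical.

```python
from typing import FrozenSet, List, Tuple


def ngram_fingerprint(smiles: str, n_max: int = 3) -> FrozenSet[str]:
    """Return all SMILES substrings of length 1..n_max.

    Assumption: these string fragments are only a crude stand-in for
    chemically meaningful substructure keys.
    """
    return frozenset(
        smiles[i:i + n]
        for n in range(1, n_max + 1)
        for i in range(len(smiles) - n + 1)
    )


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprint sets."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def knn_predict(query: str, training: List[Tuple[str, float]], k: int = 2) -> float:
    """Average the targets of the k training molecules most similar to the query."""
    query_fp = ngram_fingerprint(query)
    ranked = sorted(
        training,
        key=lambda item: tanimoto(query_fp, ngram_fingerprint(item[0])),
        reverse=True,
    )
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / len(nearest)


def placeholder_target(smiles: str) -> float:
    """Placeholder target (not a real property): number of carbon characters."""
    return float(sum(ch in "Cc" for ch in smiles))


# Toy training set: ethanol, 1-propanol, benzene, phenol.
training = [(s, placeholder_target(s)) for s in ["CCO", "CCCO", "c1ccccc1", "c1ccccc1O"]]

# Predict the placeholder target for 1-butanol from its two nearest neighbors.
prediction = knn_predict("CCCCO", training, k=2)
```

In a realistic workflow the n-gram function would be replaced by fingerprints from a cheminformatics toolkit, and the nearest-neighbor average by a trained regression model. The overall shape (featurize, compare or fit, predict) would stay the same.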