(290h) Towards Development of Novel Heterogeneous Catalysts Using Extrapolative Machine Learning Methods | AIChE

Authors 

Toyao, T. - Presenter, Hokkaido University
Shimizu, K. I., Hokkaido University
Molecular and materials informatics has become a central paradigm in molecular and materials science because of its potential to transform the design of functional molecules and materials. Proof-of-concept studies have shown that artificial intelligence (AI) can reduce the time and cost of discovery and can identify new compounds. However, most of these approaches have been tested only on benchmark problems, and no fundamentally new molecules, materials, or synthetic transformations have been found. This is primarily due to two factors: a lack of data, and the fact that machine learning (ML), the main tool in this effort, is largely focused on optimization rather than on finding novel compounds and phenomena (extrapolation).

Establishing “Catalysis Informatics” is even more challenging. Although it is closely related to materials informatics and chemoinformatics, it is distinguished by the fact that catalysis is a time-dependent, dynamic event controlled by the structure and chemical nature of catalytically active sites. Heterogeneous catalysis in particular remains a largely empirical science because of the complexity of the surface chemistry involved. This leads to a lack of data: the computational cost of obtaining accurate theoretical models for such complex systems is currently prohibitive, and high-throughput experimental methods, which have been applied successfully in related fields, have not yet been fully explored for catalysis. It is therefore highly desirable to build ML models that can find novel catalysts across a diverse chemical space from “real world” experimental catalysis data rather than from well-behaved computational data.

In this context, we have proposed an ML approach that uses elemental features as the input representation instead of the catalyst compositions themselves. Specifically, the elemental composition ratios are multiplied by elemental descriptors, such as electronegativity, melting point, and atomic radius, whose values are unique to each element. We call this the Sorted Weighted Elemental Descriptor (SWED) representation. Importantly, this method has the potential to guide catalyst design and discovery where there is little overlap in catalyst composition, and even for elements that do not appear in the given data, enabling ambitious extrapolative exploration beyond the training data. We have applied the approach to literature data on the oxidative coupling of methane (OCM) and water gas shift (WGS) reactions. The method was found to be effective at predicting promising novel catalyst candidates for future studies, including candidates that contain elements absent from the original dataset (Figure 1).
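The weighting step described above can be illustrated with a minimal Python sketch. It is one plausible reading of the description, not the authors' implementation: the function name `swed_like_features`, the `n_slots` parameter, the choice to sort elements by decreasing composition ratio, the zero padding, and the two-descriptor table are all assumptions made for illustration. The descriptor values are approximate textbook figures (Pauling electronegativity, melting point in kelvin).

```python
# Hypothetical descriptor table: approximate textbook values, illustration only.
DESCRIPTOR_NAMES = ("pauling_electronegativity", "melting_point_K")
ELEMENT_DESCRIPTORS = {
    "Li": (0.98, 453.7),
    "Na": (0.93, 371.0),
    "Mg": (1.31, 923.0),
    "Mn": (1.55, 1519.0),
    "W": (2.36, 3695.0),
}


def swed_like_features(composition, n_slots=3):
    """Map {element: amount} to a fixed-length, composition-weighted vector.

    Assumed scheme: normalise amounts to ratios, sort elements by decreasing
    ratio (ties broken alphabetically), multiply each element's descriptors
    by its ratio, and pad unused slots with zeros.
    """
    total = sum(composition.values())
    ranked = sorted(composition.items(), key=lambda kv: (-kv[1], kv[0]))[:n_slots]
    features = []
    for element, amount in ranked:
        ratio = amount / total
        features.extend(ratio * value for value in ELEMENT_DESCRIPTORS[element])
    # Pad so every catalyst yields a vector of the same length.
    n_features = n_slots * len(DESCRIPTOR_NAMES)
    features.extend([0.0] * (n_features - len(features)))
    return features


if __name__ == "__main__":
    print(swed_like_features({"Mn": 2, "W": 1, "Na": 1}))
    print(swed_like_features({"Li": 1, "Mg": 9}))
```

Because the vector is built from elemental properties rather than from element identities, a catalyst containing an element that never appears in the training set still maps into the same feature space, provided its descriptors are known. Under the assumptions of this sketch, that is what would let a downstream regression model make predictions for such compositions.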

Analysis with the extrapolative ML approach can reveal not only effective catalyst compositions but also the elemental features and electronic properties they require, so that ideal catalysts can be designed more precisely. Because the catalytic properties of a material should in principle be determined by its electronic structure, the strategy is to design target electronic structures by changing the composition and physical nature of selected materials. The concept of controlling the properties of matter at the molecular scale by engineering electronic structure should be relevant not only to catalytic materials but also, more generally, to other challenges in materials science.
