Concluding Remarks

The revitalization of artificial intelligence (AI) over the last decade, driven by dramatic progress in data science, has caused a resurgence in the development of machine learning methods for materials science. The Holy Grail in materials science is to be able to design and/or discover new materials with desired macroscopic properties rationally, systematically, and quickly, rather than by the slow and expensive trial-and-error Edisonian approach that has dominated for decades. To accomplish this, one has to solve two different but related problems: (i) the forward problem, where one predicts the material properties given the structure or formulation, and (ii) the inverse problem, where one determines the appropriate material structure or formulation given the desired properties. AI methods have the potential to address this challenge in important ways.
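As a rough illustration of this forward-inverse pairing, the minimal sketch below fits a neural-network surrogate to (descriptor, property) data for the forward problem and then runs a simple directed-evolution-style search over candidate descriptors against that surrogate for the inverse problem. The four-dimensional descriptor, the synthetic property function, and all parameter choices are assumptions made purely for illustration, not any particular published method.

```python
# Toy sketch of the forward/inverse loop (illustrative assumptions throughout).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Forward problem: learn property = f(descriptor) from data.
# The hidden "true" property below stands in for experiments or simulations.
def true_property(x):
    return 3.0 * x[:, 0] ** 2 - 2.0 * x[:, 1] * x[:, 2] + np.sin(4.0 * x[:, 3])

X_train = rng.uniform(0.0, 1.0, size=(500, 4))   # 4-dimensional structure descriptor
y_train = true_property(X_train)

forward_model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
forward_model.fit(X_train, y_train)

# Inverse problem: search for a descriptor whose predicted property hits a target,
# using a simple directed-evolution-style loop over candidate descriptors.
target = 1.5
population = rng.uniform(0.0, 1.0, size=(64, 4))
for _ in range(50):
    fitness = -np.abs(forward_model.predict(population) - target)   # closer is fitter
    parents = population[np.argsort(fitness)[-16:]]                 # keep the best candidates
    offspring = parents[rng.integers(0, len(parents), size=64)]     # resample the parents
    population = np.clip(offspring + rng.normal(0.0, 0.05, size=offspring.shape), 0.0, 1.0)

best = population[np.argmin(np.abs(forward_model.predict(population) - target))]
print("candidate descriptor:", best)
print("predicted property:", forward_model.predict(best[None, :])[0])
```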

However, applying AI to materials science is not new – it has a 35-year history and a rich literature. The forward-inverse conceptual framework, for instance, and its solution using hybrid neural networks and directed evolution, were demonstrated in 1992. What is exciting now is the ability to do all this more easily, and for more complicated materials, thanks to the availability of powerful and user-friendly hardware and software environments and, of course, plenty of data.

I classify materials design problems into three classes – “easy”, “hard”, and “harder”. They correspond to the “good”, the “bad”, and the “ugly” problems, or the other way around, depending on your persuasion. The relatively “easy” ones are those where there is plenty of data that can be analyzed using off-the-shelf machine learning tools – for example, many structure-to-property prediction problems fall into this class. These can be, and are being, addressed right now. The “hard” problems are those that require combining first-principles knowledge of the underlying physics and/or chemistry with data-driven methods. Although researchers demonstrated how to do this a couple of decades ago, much work remains before it can be done systematically and quickly for wider impact. This might take another decade or so.
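Concretely, one common instance of such a hybrid is residual learning: a first-principles expression supplies the baseline prediction and a data-driven model is trained only on the deviation between that baseline and the measurements. The sketch below is one such illustration under stated assumptions (an Arrhenius-type baseline, arbitrary constants, and synthetic data); it is not the specific hybrid neural-network approach referred to above.

```python
# Hedged sketch of a hybrid first-principles + data-driven (residual-learning) model.
# The Arrhenius-style baseline, constants, and synthetic data are illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# First-principles part: an approximate rate model k = A * exp(-Ea / (R * T)).
A, Ea, R = 1.0e3, 5.0e4, 8.314
def physics_model(T):
    return A * np.exp(-Ea / (R * T))

# Synthetic "measurements": the real system deviates from the idealized physics.
T = rng.uniform(300.0, 600.0, size=200)
measured = physics_model(T) * (1.0 + 0.3 * np.sin(T / 50.0)) + rng.normal(0.0, 1e-4, size=T.size)

# Data-driven part: learn only the residual that the physics model misses.
correction = GradientBoostingRegressor(random_state=0)
correction.fit(T.reshape(-1, 1), measured - physics_model(T))

# Hybrid prediction = physics baseline + learned correction.
def hybrid_predict(T_new):
    T_new = np.asarray(T_new, dtype=float)
    return physics_model(T_new) + correction.predict(T_new.reshape(-1, 1))

print(hybrid_predict([350.0, 450.0, 550.0]))
```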

Harder still is the last class, where one needs to develop domain-specific representations, languages, compilers, ontologies, molecular structure search engines, etc. – i.e., domain-specific “Watson-like” materials discovery engines. These really interesting and intellectually challenging problems would require going beyond purely data-centric machine learning, despite all the current excitement, and leveraging other knowledge representation and reasoning methods from the earlier phases of AI. They would require a proper integration of symbolic reasoning with data-driven processing. This might take a couple of decades to accomplish for extensive and routine usage. I will discuss these challenges and opportunities using materials design examples.