(7ep) Integration of Machine-Learning and Data Management Methods for Accelerated Catalyst Modeling and Exploration | AIChE

Authors 

Boes, J. R. - Presenter, Stanford University
Research Interests:

My primary research interest is the development of new machine-learning-based techniques to accelerate the discovery of increasingly effective catalysts. As the quality of life continues to improve on a global scale, the demand for energy resources continues to grow. However, the use of fossil fuels results in harmful CO2 and NOx emissions, which contribute to dangerous global climate change. As a result, sustainable energy sources are becoming increasingly necessary and popular. These technologies are still in the early stages of development, however, and their energy efficiency must improve to support commercial viability. Thus, the rapid development of new catalysts that can help these technologies compete in the current energy economy is of critical importance.

Towards these ends, I have spent the last five years studying computational catalysis at Carnegie Mellon University. I worked with Professor John Kitchin to develop a new kind of atomic potential constructed from neural networks. These potentials have become increasingly popular in the last decade with the advent of fingerprinting schemes, proposed by Behler and Parrinello, which allow them to be trained to a wide variety of atomic input structures. This has unlocked their ability to reproduce potential energy surfaces (PES) of a multitude of chemical systems with levels of accuracy unrivaled by their physical potential counterparts [1]. They are also significantly faster to evaluate than a full quantum chemistry calculation, which makes it possible to study much larger unit cells and more complex chemistry. Utilizing these techniques, we have demonstrated that it is possible to model the high-dimensional PES for the mobile surface of Pd with coverage dependence of oxygen [2]. These techniques have also been used to make quantitative predictions of segregation for AuPd under vacuum conditions at all bulk compositions of the alloy [3], an integral step for relating the active sites that computationalists model to the factors we can control when creating the catalyst.
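To make the idea of a fingerprint concrete, the following is a minimal sketch of a Behler-Parrinello-style radial symmetry function, which describes each atom by a cutoff-weighted sum of Gaussians over its neighbor distances. It is an illustration only and not the code used in the cited work: the function names, the parameter values (eta, r_s, r_c), and the three-atom example are assumptions chosen for demonstration, and periodic boundary conditions are ignored.

```python
import numpy as np


def cutoff(r, r_c):
    """Cosine cutoff: 1 at r = 0, decaying smoothly to 0 at r = r_c."""
    return np.where(r < r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)


def radial_fingerprint(positions, eta, r_s=0.0, r_c=6.0):
    """Radial symmetry function for every atom in a non-periodic cluster.

    G_i = sum over j != i of exp(-eta * (R_ij - r_s)**2) * f_c(R_ij)

    The parameter values are illustrative assumptions, not fitted ones.
    """
    positions = np.asarray(positions, dtype=float)
    # Pairwise distance matrix R_ij (no periodic images in this sketch).
    diff = positions[:, None, :] - positions[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    terms = np.exp(-eta * (r - r_s) ** 2) * cutoff(r, r_c)
    # Exclude the i == j self-interaction term.
    np.fill_diagonal(terms, 0.0)
    return terms.sum(axis=1)


# Hypothetical three-atom cluster (coordinates in angstroms).
cluster = [[0.0, 0.0, 0.0], [2.8, 0.0, 0.0], [1.4, 2.4, 0.0]]
print(radial_fingerprint(cluster, eta=0.5))
```

Because the fingerprint depends only on interatomic distances, it is unchanged by translating or rotating the structure, which is part of what allows a single network to be trained on a wide variety of atomic input structures.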

Currently, I am a post-doc at Stanford working with Thomas Bligaard on developing grey-box models to accurately predict adsorbate coverage effects across transition metals. While here, I will gain additional insight into input parameter selection as well as into implementing unsupervised machine-learning for characterizing these parameters. Selecting the inputs for these problems is critical because it is only through these inputs that the algorithms are able to make any useful predictions. However, when so many possible input parameters are available, it becomes challenging to decide which are most important for characterizing the system of interest. For example, many machine-learning potentials are based on descriptions of the unit cell geometry and the composition of the atoms. While this is quite tractable for modeling a PES, the inputs quickly become too numerous when considering all possible adsorbate and transition metal elements. Moreover, we would intuitively no longer expect the positions of the atoms to be an effective descriptor for the interaction energies of different species. Fortunately, based on known relationships of adsorption across metal surfaces, we know that effective input descriptors are likely to exist. It then follows that accuracy can be improved by using other inputs that are important to co-adsorption within a machine-learning framework. By learning and testing these new aspects of machine-learning, I will acquire the tools I need for streamlining and automating the technical aspects of the search for new catalysts.

As available computational resources continue to increase, the rate of data generation in the field of computational chemistry is quickly beginning to outstrip the ability of those in the field to interpret it all. Furthermore, results are often useful well beyond the purpose intended by those who generated them. To improve the ability of all those in the field of catalysis to continue making rapid scientific progress, and to achieve more with less, there is a strong need to develop data management techniques capable of storing and searching this information, similar to what the Materials Project and others have done for the field of materials science. For catalysts, this is significantly more challenging because of the large number of possible catalytic systems of interest, which creates a greater need for machine-learning. Even with a robust database structure capable of storing a breadth of relevant information, there is a critical need for tools capable of intelligently querying that database. Machine-learning not only benefits from such large repositories of information, it can also assist in making the intelligent identifications that allow the user to find new and interesting results. All of these tools would provide an incredible resource to the field, and my work tailors these methods to best suit the field of catalysis.

Teaching Interests:

As an instructor, I teach: thermodynamics, kinetics, molecular simulation (and underlying quantum theory), reaction engineering, functional coding languages, and general chemical engineering. Chemical engineers fill a diverse range of industry positions today. As modern tools become increasingly computationally intensive, there is an ever-growing need to shift the emphasis of education toward computational tools and techniques. As such, I strongly support the integration of coding into the classroom. This also ties into my interest in educational research, which I have outlined below.

As an undergraduate, I spent a cumulative three years working in a general chemistry tutoring center, mentoring individuals and small groups. I often attribute my initial love of education to this time, and I look forward to a future as an educator. In my first year at CMU, I was a teaching assistant for the undergraduate introductory courses in general chemical engineering as well as thermodynamics. Since then, I have also been the TA for the graduate-level molecular simulation and reaction engineering courses. Most recently, I have been the math software TA, providing introductory-level courses in MATLAB to all students in the chemical engineering department. For all of these courses I have provided regular office hours and often additional recitations for the students. I have also given multiple guest lectures in graduate-level courses in the mechanical and chemical engineering departments, for which I planned the subject material. For my educational contributions to the department, I received The Mark Dennis Karl Outstanding Teaching Assistant Award in 2015.

I have also pursued instruction on educational techniques for engineering by taking an evidence-based teaching course specific to STEM fields. I am a strong proponent of the flipped classroom as a teaching model. This model requires students to study lecture material before class and apply it to assignments given during class. Among other benefits, such a system allows the instructor to provide immediate feedback to students, which has been shown to improve student retention rates and the longevity of student knowledge. Further details of my teaching strategies and achievements can be found on my website.

I would like to study the potential benefits of teaching code-based chemical engineering tools in an immersive environment. The concept is similar to that of language immersion, in which subject material is taught through the medium of a second language. Language immersion is known to dramatically improve student comprehension of the second language in technical communication with no discernible negative impact on primary language skills. Teaching students to solve chemical engineering problems can similarly be framed in the medium of a coding language, such as Python or MATLAB.

Relevant Work

1. Jacob R. Boes, Mitchell C. Groenenboom, John A. Keith, and John R. Kitchin. Neural network and ReaxFF comparison for Au properties. International Journal of Quantum Chemistry, 116(13):979–987, 2016.

2. Jacob R. Boes and John R. Kitchin. Neural network predictions of oxygen interactions on a dynamic Pd surface. Molecular Simulation, 43(5–6):346–354, 2017.

3. Jacob R. Boes and John R. Kitchin. Modeling Segregation on AuPd(111) Surfaces with Density Functional Theory and Monte Carlo Simulations. The Journal of Physical Chemistry C, 121(6):3479–3487, 2017.

Curriculum Vitae

ORCID

Website

Phone: (607) 342-1846

Email: jacobboes@gmail.com | jrboes@stanford