(681h) Application of High Performance Computing and Machine Learning to Accelerate Material Discovery for Energy Capture and Storage

Authors: 
Beckner, W., University of Washington
Pfaendtner, J., University of Washington
High performance computing (HPC) and open-source software are revolutionizing the strategies for material discovery. In contrast to material discovery in the lab, which demands the scientist's close attention at every step (design, implementation, and testing), material discovery driven by the best use of digital resources shifts the researcher's focus to the creation of automated and adaptive experimental architectures. Such architectures are built on the foundations of machine learning (ML), HPC, and database management, and they allow the exploration of design spaces that would otherwise be intractable.[1]

The design space for materials in alternative energy is one such space that is simply too vast to be explored by laboratory methods alone. The first part of this talk will discuss ways that ML algorithms can be applied to the experimental characterization of new materials such as those used in solar cells. In particular, we employ the random forest regressor in the scikit-learn Python package to predict the photoluminescence properties of perovskites. The random forest is trained on experimentally measured photoluminescence images together with height, potential, phase, and amplitude atomic force microscopy (AFM) images; it then predicts the photoluminescence using the four AFM channels as inputs to the learning algorithm. Random forests are convenient for simple applications such as this because they require little parameterization from the user; are easy to interpret, since the constituent decision trees can be visualized; require comparatively little data preparation, whereas many other learning algorithms need input data to be preprocessed in some form; have a prediction cost that scales logarithmically with the number of data points each tree is trained on; and are white box models, so that a pattern found in the data can be articulated by the boolean logic generated by the trees (in contrast to black box models such as artificial neural networks, where relationships between features of the data are highly convoluted).
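
As a minimal sketch of the workflow described above, the following Python example fits scikit-learn's RandomForestRegressor to four input channels and predicts a fifth quantity. Everything about the data is an assumption made for illustration: the "AFM channels" are random numbers, the target relationship is invented, and the helper name make_synthetic_channels is hypothetical. It is not the data or the model configuration used in this work.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

CHANNELS = ["height", "potential", "phase", "amplitude"]


def make_synthetic_channels(n_pixels=2000, seed=0):
    """Hypothetical stand-in for co-registered images, one row per pixel."""
    rng = np.random.default_rng(seed)
    # Four input channels in arbitrary units (assumed, not measured).
    X = rng.normal(size=(n_pixels, len(CHANNELS)))
    # Invented target: depends on the first two channels plus noise.
    y = 2.0 * X[:, 0] - X[:, 1] ** 2 + 0.1 * rng.normal(size=n_pixels)
    return X, y


X, y = make_synthetic_channels()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Default hyperparameters apart from the seed: little tuning is needed.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Coefficient of determination on held-out pixels.
r2 = model.score(X_test, y_test)

# Impurity-based importances give a first look at which channels matter.
importances = dict(zip(CHANNELS, model.feature_importances_))

print(f"held-out R^2: {r2:.3f}")
for name, value in importances.items():
    print(f"{name:>10s}: {value:.3f}")
```

In this synthetic setup the importances should concentrate on the two channels that actually enter the invented target, which illustrates the kind of interpretability referred to above.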

Following this, we will describe how we are applying data science tools to the design of new electrolyte solutions for batteries. Our entire discovery process takes place within HPC: we design an all-atom electrolyte solution, perform molecular dynamics (MD) simulations to compute the physical properties of interest, and then run a search algorithm that increments a single dimension of the solution composition before iterating again. In this case, we search for the eutectic ratio of an ionic liquid (IL) to a hydrogen bond donor. These so-called deep eutectic solvents (DESs) have recently been shown to be promising charge-carrying fluids for redox flow batteries but require further reduction in viscosity before they will be viable commercial products. While this one-dimensional search allows us to find eutectic ratios for a given IL and hydrogen bond donor, it does not address the much larger design space of the individual components themselves. Extensions of our approach to larger design spaces via the use of artificial neural networks (ANNs) will also be discussed. The talk will end with a short perspective on how these tools can be further generalized and applied to a wide range of problems in chemical engineering.
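
The simulate-then-search loop described above can be sketched in a few lines of Python. The abstract does not specify which search algorithm is used, so golden-section search is swapped in here as one common choice for a one-dimensional, unimodal problem. The function mock_property is a hypothetical analytic stand-in for an expensive MD simulation, and the location of its minimum (0.67) is arbitrary rather than a result.

```python
import math


def mock_property(x_hbd):
    """Hypothetical stand-in for an MD-derived property as a function of
    the hydrogen bond donor mole fraction. The minimum at 0.67 is arbitrary."""
    return 300.0 + 400.0 * (x_hbd - 0.67) ** 2


def golden_section_minimize(f, lo, hi, tol=1e-3):
    """Minimize a unimodal function f on [lo, hi].

    Returns the midpoint of the final bracket and the number of calls to f.
    Each iteration needs only one new evaluation, which matters when every
    evaluation is a full simulation.
    """
    inv_phi = (math.sqrt(5) - 1) / 2  # about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    evaluations = 2
    while (b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper interior point.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower interior point.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
        evaluations += 1
    return (a + b) / 2, evaluations


x_best, n_evals = golden_section_minimize(mock_property, 0.0, 1.0)
print(f"estimated optimum mole fraction: {x_best:.3f} after {n_evals} evaluations")
```

In a real workflow each call to the objective would launch and analyze a simulation, so a method that reuses earlier evaluations keeps the number of HPC jobs small. Noisy simulation output would call for a more robust search than this sketch.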

[1] Beck, D. A. C., Carothers, J. M., Subramanian, V. R. and Pfaendtner, J. (2016), Data science: Accelerating innovation and discovery in chemical engineering. AIChE J., 62: 1402–1416. doi:10.1002/aic.15192