(272j) The Physical Analytics Pipeline - a Bayesian Optimization of the Hybrid Organic-Inorganic Perovskite Compositional Space | AIChE

Authors 

Herbol, H. C. - Presenter, Cornell University
Poloczek, M., University of Arizona
Clancy, P., Cornell University
Hybrid Organic-Inorganic Perovskites (HOIPs) are an exciting class of photovoltaic materials because they combine high solar cell efficiency with the ability to be fabricated by a low-energy, room-temperature solution process. Perovskites are classified by their characteristic ABX3 structure, where A is an organic (or inorganic) cation (e.g., methylammonium, formamidinium or Cs), B is a metal cation (typically Pb or Sn), and X is a halide (I, Br or Cl). The choice of solvent is also very important in the fabrication process, with perhaps eight common solvents to choose from. With all these possible choices of ABX3, including mixed metal cations, mixed halides and binary solvent blends, the HOIP compositional space quickly becomes unwieldy, with at least 500,000 possible combinations. Experimental exploration of this space is Edisonian, limited by time and resources and by a lack of underlying rational design guidelines. Even computational approaches are limited by the lack of available force fields for multi-species HOIP systems, which restricts the calculations to expensive quantum mechanical ones. The Physical Analytics pipeLine (PAL), which we have developed, tackles this problem by bridging Gaussian Process Bayesian Optimization with computational methods to minimize the number of calculations needed to locate the global maximum. As a proof of concept, PAL has been implemented for pure and mixed halide perovskites and benchmarked against Sequential Model-based Algorithm Configuration (SMAC). PAL performs extremely well in comparison to SMAC and is robust upon replication with different initial training sets. The PAL model has been further improved to combine a variety of information sources of differing accuracy and cost, allowing cheaper, less accurate calculations to improve our predictions while minimizing the total cost of optimization.
This combination of exploration and exploitation allows efficient searching of broad parameter spaces where data may be sparse.
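
To illustrate the general idea of Gaussian Process Bayesian optimization over a finite set of candidates, the following is a minimal sketch and not the PAL implementation. Everything in it is an assumption made for illustration: the squared-exponential kernel, the expected-improvement acquisition function, the hypothetical function names (rbf_kernel, gp_posterior, expected_improvement, bayes_opt), and the toy one-dimensional objective, which stands in for an expensive quantum mechanical calculation on a perovskite composition.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel between the rows of a and b (assumed kernel choice).
    d2 = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    # Zero-mean GP posterior mean and standard deviation at x_test.
    k = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_s = rbf_kernel(x_train, x_test)
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_s.T @ alpha
    v = np.linalg.solve(chol, k_s)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance is 1 for this kernel
    return mean, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mean, std, best):
    # Expected improvement over the best value seen so far (maximization).
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf

def bayes_opt(candidates, objective, n_init=3, n_iter=10, seed=0):
    # Evaluate a few random candidates, then repeatedly evaluate the
    # not-yet-sampled candidate with the highest expected improvement.
    rng = np.random.default_rng(seed)
    sampled = [int(i) for i in rng.choice(len(candidates), size=n_init, replace=False)]
    values = [float(objective(candidates[i])) for i in sampled]
    for _ in range(n_iter):
        y = np.array(values)
        y_mean, y_std = y.mean(), y.std() + 1e-12
        mean, std = gp_posterior(candidates[sampled], (y - y_mean) / y_std, candidates)
        ei = expected_improvement(mean, std, (y.max() - y_mean) / y_std)
        ei[sampled] = -np.inf  # never re-evaluate a candidate
        nxt = int(np.argmax(ei))
        sampled.append(nxt)
        values.append(float(objective(candidates[nxt])))
    return sampled, values

# Toy usage: 30 candidates on a line, with a cheap stand-in objective.
toy_candidates = np.linspace(0.0, 5.0, 30)[:, None]
toy_objective = lambda x: -(x[0] - 3.2) ** 2
toy_sampled, toy_values = bayes_opt(toy_candidates, toy_objective)
best = int(np.argmax(toy_values))
print("best candidate:", toy_candidates[toy_sampled[best], 0], "value:", toy_values[best])
```

The acquisition function is what balances exploration (high posterior uncertainty) against exploitation (high posterior mean), so each new evaluation is spent where it is expected to help most. In a real application the candidates would be encoded compositions and solvents rather than points on a line.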