(346q) Discovery of Self-Assembling π-Conjugated Peptides By Active Learning-Directed Coarse-Grained Molecular Simulation | AIChE

Authors 

Shmilovich, K. - Presenter, University of Chicago
Ferguson, A., University of Chicago
Sidky, H., University of Chicago
Mansbach, R. A., Los Alamos National Laboratory
Panda, S. S., Johns Hopkins University
Tovar, J. D., Johns Hopkins University
Dunne, O. E., University of Chicago
In this work we integrate coarse-grained molecular dynamics simulation, deep representational learning, and Bayesian optimization to discover pi-conjugated peptides capable of self-assembling into biocompatible optoelectronic nanoaggregates. The pi-conjugated peptides studied here are triblock molecules consisting of a central aromatic core flanked by peptide wings. This class of molecules has emerged as an extensible building block for self-assembling electronics: they have been demonstrated experimentally to form mesoscopic fibers micrometers in length and nanometers in diameter, in which overlap between pi-orbitals gives rise to emergent optical and electronic properties. Edisonian trial-and-error discovery of these molecules, whether by experiment or by simulation, is rendered intractable by the combinatorial explosion of the molecular design space of pi-cores and peptide wings. We efficiently navigate the design space in search of high-performing candidates by deploying an active learning procedure that integrates three machine learning components: (i) an unsupervised deep representation learning approach to learn continuous low-dimensional embeddings of the discrete molecular design space, (ii) a supervised surrogate model using Gaussian process regression to predict molecular performance, as measured in simulation, as a function of position in this embedded space, and (iii) Bayesian optimization over the surrogate model to select which molecules should be evaluated next. Using this protocol, we obtain a converged surrogate model for predicting the molecular performance of one particular peptide family, comprising tetrapeptide wings and an oligophenylenevinylene pi-core, after sampling only 2.3% of the design space. We identify molecules that we predict to possess unprecedented self-assembly behavior and optoelectronic activity, and we uncover design rules to guide the rational engineering of these molecular systems.
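The surrogate-plus-acquisition loop of components (ii) and (iii) can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the random 2D `embeddings` stand in for the learned latent space of component (i), `toy_score` stands in for a coarse-grained simulation, and the squared-exponential kernel, unit length scale, noise level, and expected-improvement acquisition function are choices the abstract does not specify.

```python
import math
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two sets of row vectors."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


def gp_posterior(x_train, y_train, x_query, noise=1e-4, length_scale=1.0):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k_tt = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query, length_scale)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_tq)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance k(x, x) is 1
    return k_tq.T @ alpha, np.sqrt(np.maximum(var, 1e-12))


def expected_improvement(mean, std, best):
    """Expected improvement over the best score seen so far (maximization)."""
    z = (mean - best) / std
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf


def active_learning(embeddings, evaluate, n_init=5, n_rounds=20, seed=0):
    """Alternate between fitting the surrogate and evaluating the top candidate."""
    rng = np.random.default_rng(seed)
    n = len(embeddings)
    sampled = [int(i) for i in rng.choice(n, size=n_init, replace=False)]
    scores = [evaluate(i) for i in sampled]
    for _ in range(n_rounds):
        taken = set(sampled)
        remaining = np.array([i for i in range(n) if i not in taken])
        y = np.array(scores)
        mu, sd = y.mean(), (y.std() or 1.0)  # standardize the observed scores
        mean, std = gp_posterior(embeddings[sampled], (y - mu) / sd, embeddings[remaining])
        ei = expected_improvement(mean, std, (y.max() - mu) / sd)
        pick = int(remaining[int(np.argmax(ei))])
        sampled.append(pick)
        scores.append(evaluate(pick))
    return sampled, scores


# Hypothetical stand-ins: random 2D embeddings and a synthetic score function.
_rng = np.random.default_rng(0)
embeddings = _rng.uniform(-2.0, 2.0, size=(200, 2))
target = np.array([0.5, -1.0])


def toy_score(i):
    """Synthetic 'performance': higher when the embedding is closer to target."""
    return -float(np.sum((embeddings[i] - target) ** 2))


sampled, scores = active_learning(embeddings, toy_score, n_init=5, n_rounds=20, seed=1)
print(f"evaluated {len(sampled)} of {len(embeddings)} candidates; best score {max(scores):.3f}")
```

In this toy setting the loop evaluates 25 of 200 candidates. In a real application each call to `evaluate` would be a full simulation, which is why the acquisition function has to balance the predicted mean against the predictive uncertainty rather than sample at random.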