(323a) Data-Driven Design of Self-Assembling Pi-Conjugated Oligopeptides (Invited Talk) | AIChE

Ferguson, A. - Presenter, University of Chicago
Shmilovich, K., University of Chicago
Synthetic oligopeptides containing pi-conjugated cores are attractive molecular building blocks for the self-assembly of biocompatible nanoaggregates with engineered optical and electronic properties. The peptidic and pi-conjugated blocks can be tailored to control the aggregation morphology and emergent optoelectronic properties such as electronic delocalization, electron/hole transport, and the optical absorption spectrum. The design space of possible peptide sequences and pi-conjugated cores is so large that Edisonian trial-and-improvement of the structure and properties of the self-assembled aggregates is intractable by either experimentation or simulation. This motivates a data-driven guided search in which surrogate models trained on data for a small number of oligopeptide chemistries inform an active learning protocol that identifies the next most promising chemistries to study. In this work we combine coarse-grained molecular dynamics (CGMD) simulations, variational autoencoders (VAEs), and Gaussian process regression (GPR) to establish an integrated pipeline for the rational exploration of pi-conjugated oligopeptide chemical space. CGMD is used to predict the quality of core-core alignment within the self-assembled nanoaggregates as a measure of fitness; VAEs are trained from scratch to identify a low-dimensional parameterization of chemical space within which to perform the search; and GPR with active learning is used to navigate the space efficiently through an acquisition function that balances exploitation (i.e., selecting chemistries predicted to possess good assembly behavior) against exploration (i.e., searching in under-sampled regions of chemical space). This protocol leads to the identification of new, unanticipated pi-conjugated oligopeptide chemistries for all-atom molecular modeling, electronic structure calculations, and experimental synthesis and characterization.
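The active learning loop described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the talk: the candidates are random points in a 2D space standing in for a VAE latent embedding, `toy_fitness` is a hypothetical placeholder for the CGMD-derived alignment score, and the acquisition function is an upper confidence bound (mean plus `kappa` times standard deviation), which is one common way to trade off exploitation against exploration. The abstract does not state which acquisition function or kernel the authors used.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=0.3):
    # Squared-exponential kernel between the rows of A and the rows of B.
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale**2)

def gp_posterior(X_train, y_train, X_query, length_scale=0.3, noise=1e-6):
    # GP regression posterior mean and standard deviation at X_query,
    # using a constant prior mean (the training average) and unit prior variance.
    y_mean = y_train.mean()
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_query, length_scale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    mean = K_s.T @ alpha + y_mean
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def toy_fitness(Z):
    # Hypothetical stand-in for the simulated fitness of each candidate;
    # in the described pipeline this would come from CGMD.
    return np.exp(-np.sum((Z - 0.5) ** 2, axis=1))

def active_learning(candidates, n_init=5, n_rounds=10, kappa=2.0, seed=0):
    # Start from a few random candidates, then repeatedly fit the GP
    # and evaluate the unsampled candidate with the highest UCB score.
    rng = np.random.default_rng(seed)
    sampled = [int(i) for i in rng.choice(len(candidates), size=n_init, replace=False)]
    for _ in range(n_rounds):
        X = candidates[sampled]
        y = toy_fitness(X)
        mean, std = gp_posterior(X, y, candidates)
        ucb = mean + kappa * std   # exploitation term + exploration term
        ucb[sampled] = -np.inf     # never re-select an evaluated candidate
        sampled.append(int(np.argmax(ucb)))
    return sampled

if __name__ == "__main__":
    latent = np.random.default_rng(1).uniform(0.0, 1.0, size=(200, 2))
    picked = active_learning(latent)
    best = max(picked, key=lambda i: toy_fitness(latent[i:i + 1])[0])
    print("evaluated candidates:", picked)
    print("best candidate found:", best, latent[best])
```

A larger `kappa` pushes the search toward under-sampled regions, while a smaller one concentrates evaluations near candidates already predicted to score well. In a real setting each round would trigger a new simulation for the selected chemistry rather than a call to a closed-form function.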