(274a) Data-Driven Protein Engineering Using Deep Representational Active Learning | AIChE

Authors 

Ferguson, A. - Presenter, University of Chicago
Ranganathan, R., University of Chicago
Lian, X., University of Chicago
Praljak, N., University of Chicago
The design of synthetic proteins with desired function is a long-standing goal in biomolecular science with broad applications in biochemical engineering, agriculture, medicine, and public health. Rational de novo design and experimental directed evolution have achieved remarkable successes but are challenged by the requirement to find functional needles in the vast haystack of protein sequence space. We have developed data-driven models based on semi-supervised deep representational learning and Bayesian optimization to provide a predictive genotype-to-phenotype map that can prospectively identify functional sequences. Designs are realized by high-throughput gene synthesis and experimentally assayed along one or more dimensions of function. Experimental measurements are then passed back to retrain the models within a virtuous feedback loop that rationally traverses protein sequence space to optimize protein function. We will discuss applications of this data-driven design platform to Sho1, S1A, and avGFP proteins.
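
The design, assay, and retrain loop described above can be illustrated with a minimal toy sketch. It is not the authors' platform. In place of a deep semi-supervised representation it uses a Bayesian linear regression on one-hot encoded sequences, and in place of gene synthesis and a wet-lab assay it uses a hypothetical noisy additive fitness function. The names `assay`, `fit_posterior`, `ucb`, and `design_loop`, the sequence length, and all hyperparameters are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
SEQ_LEN = 6                        # toy sequence length (assumption)

def one_hot(seqs):
    """Encode integer-coded sequences of shape (n, SEQ_LEN) as flat one-hot vectors."""
    return np.eye(len(ALPHABET))[seqs].reshape(len(seqs), -1)

# Hypothetical hidden fitness landscape standing in for the real experiment.
true_w = rng.normal(size=SEQ_LEN * len(ALPHABET))

def assay(seqs):
    """Stand-in for an experimental measurement: additive fitness plus noise."""
    return one_hot(seqs) @ true_w + rng.normal(scale=0.1, size=len(seqs))

def fit_posterior(X, y, prior_var=1.0, noise_var=0.01):
    """Bayesian linear regression: posterior mean and covariance of the weights."""
    d = X.shape[1]
    precision = X.T @ X / noise_var + np.eye(d) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov

def ucb(X, mean, cov, beta=2.0):
    """Upper confidence bound acquisition: predicted mean plus beta times std."""
    mu = X @ mean
    var = np.einsum("ij,jk,ik->i", X, cov, X)
    return mu + beta * np.sqrt(np.maximum(var, 0.0))

def design_loop(n_rounds=5, batch=20, pool=2000):
    """Alternate between model fitting, candidate selection, and (mock) assays."""
    seqs = rng.integers(0, len(ALPHABET), size=(batch, SEQ_LEN))
    ys = assay(seqs)
    best_per_round = [ys.max()]
    for _ in range(n_rounds):
        # Retrain the surrogate model on all measurements collected so far.
        mean, cov = fit_posterior(one_hot(seqs), ys)
        # Score a random candidate pool and keep the top-scoring batch.
        candidates = rng.integers(0, len(ALPHABET), size=(pool, SEQ_LEN))
        scores = ucb(one_hot(candidates), mean, cov)
        chosen = candidates[np.argsort(scores)[-batch:]]
        # "Synthesize and assay" the chosen designs, then feed the data back.
        seqs = np.vstack([seqs, chosen])
        ys = np.concatenate([ys, assay(chosen)])
        best_per_round.append(ys.max())
    return seqs, ys, best_per_round
```

Calling `design_loop()` returns every sequence tested, its measured value, and the best value seen after each round. The upper confidence bound is only one possible acquisition rule; it is used here because it makes the trade-off between exploiting predicted function and exploring uncertain regions of sequence space explicit in a single line.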

A.L.F. is a co-founder and consultant of Evozyne, LLC and a co-author of US Provisional Patents 62/853,919 and 62/900,420 and International Patent Applications PCT/US2020/035206 and PCT/US20/50466. R.R. is a co-founder and consultant of Evozyne, LLC and a co-author of US Provisional Patent 62/900,420 and International Patent Application PCT/US20/50466.