(347f) Protid: Systematic Identification of Protein Scaffolds for Molecular Recognition

Authors: 
Hackel, B. J., University of Minnesota
Holec, P., University of Minnesota
Kruziki, M. A., University of Minnesota

Numerous protein scaffolds have been proposed and validated for the generation of molecular recognition reagents for clinical targeting, biotechnology, and fundamental scientific studies. Yet although these scaffolds have met with varying degrees of success, the causes of success or failure have not been systematically explored. In this study, we merge computational and experimental tools to elucidate the biophysical factors that dictate evolutionary potential, particularly for novel binding function, and thereby empower rational design of future protein scaffolds.

We have developed and implemented a computational algorithm to efficiently evaluate naturally occurring protein domains on numerous metrics that may affect the evolution of binding function. In one implementation, we considered scaffolds in which two loops are diversified to serve as the paratope, or binding region. We identified all domains in the Protein Data Bank with two accessible loops and scored them on the size, shape, orientation, and target accessibility of their paratopes, as well as on protein size, stability upon mutation, and dependence on disulfide bonds and/or cofactors. Several metrics, most notably paratope shape, orientation, and accessibility, can be measured in more than one way; each of these measurement methods was executed, and the utility of each will be assessed based on scaffold performance. A machine learning algorithm was used to preliminarily weight each metric based on historical scaffold performance. The top 30 scaffolds identified by the resulting algorithm were chosen for experimental analysis.

For each scaffold, combinatorial libraries of at least 10 million mutants with diversified loops were synthesized at the genetic level and introduced into the yeast surface display system. The libraries were then compared for their ability to evolve specific, high-affinity binding to numerous targets, as identified via magnetic and fluorescent target-binding selections. We will discuss the relative efficacy of each scaffold as well as the scaffold design metrics that correlate most strongly with performance. Implications for future scaffold design, including synthetic protein topologies, will be addressed, as will other paratope designs. For example, in another implementation, we consider paratopes that are integral to secondary structure but have exposed side chains, rather than loops.
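
The abstract does not state how the individual metrics are combined into a single ranking, so the following is only a minimal sketch of one plausible approach: min-max normalization of each metric followed by a linear weighted sum. The scaffold names, metric names, metric values, and weights are all hypothetical placeholders. In the study the weights came from a machine learning algorithm trained on historical scaffold performance, and that step is not reproduced here.

```python
# Illustrative sketch only: a linear weighted-sum ranking is an assumption,
# not the scoring function used in the study.

# Hypothetical metric values for three made-up scaffolds.
SCAFFOLDS = {
    "scaffold_A": {"paratope_area": 600.0, "protein_size": 90.0, "disulfides": 0.0},
    "scaffold_B": {"paratope_area": 800.0, "protein_size": 150.0, "disulfides": 2.0},
    "scaffold_C": {"paratope_area": 500.0, "protein_size": 60.0, "disulfides": 1.0},
}

# Hypothetical weights: positive rewards a metric, negative penalizes it.
WEIGHTS = {"paratope_area": 1.0, "protein_size": -0.5, "disulfides": -0.5}


def normalize(values):
    """Min-max scale a list of values to [0, 1]; constant lists map to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_scaffolds(scaffolds, weights, top_n):
    """Return the top_n (name, score) pairs, highest weighted score first."""
    names = list(scaffolds)
    totals = {name: 0.0 for name in names}
    for metric, weight in weights.items():
        scaled = normalize([scaffolds[name][metric] for name in names])
        for name, value in zip(names, scaled):
            totals[name] += weight * value
    ranked = sorted(names, key=lambda name: totals[name], reverse=True)
    return [(name, totals[name]) for name in ranked[:top_n]]


if __name__ == "__main__":
    for name, score in rank_scaffolds(SCAFFOLDS, WEIGHTS, top_n=2):
        print(f"{name}: {score:.3f}")
```

Normalizing each metric before weighting keeps quantities on very different scales (for example, a surface area versus a disulfide count) from dominating the sum. With `top_n=30` and real metric values, the same call would yield a shortlist analogous to the 30 scaffolds selected for experimental analysis.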