(526b) Princeton_TIGRESS: Protein Geometry Refinement Using Simulations and Support Vector Machines

Authors: 
Khoury, G., Princeton University
Floudas, C. A., Princeton University
Pinnaduwage, N., Princeton University
Tamamis, P., Texas A&M University
Smadbeck, J., Princeton University
Kieslich, C. A., Texas A&M University

Protein structure refinement aims to apply a set of operations to a predicted structure that improve model quality and accuracy with respect to the native structure in a blind fashion [1,2]. Despite numerous computational [3-12] and even collaborative [13] approaches to the protein refinement problem reported in the previous three Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments (2008-2012) [14,15], an overwhelming majority of methods degrade models rather than improve them. This results from many generated models being highly similar to one another and from the inability of current force fields to distinguish the better models from the worse ones. We first developed a method that was tested in blind predictions during CASP10 and was officially ranked 5th among all methods in the refinement category.

Here, we present Princeton_TIGRESS, which, when benchmarked on all CASP7, CASP8, CASP9, and CASP10 refinement targets, simultaneously increased GDT_TS 76% of the time with an average improvement of 0.83 GDT_TS points per structure. The protocol begins by deriving constraints from the input structure, including backbone Cα-Cα, disulfide bridge, α-helix hydrogen bond, and β-sheet hydrogen bond constraints. Next, a decision is made based on the sequence length. For structures of 154 amino acids or fewer, the protocol continues with CYANA sampling, Rosetta FastRelax relaxation, SVM filtering, scoring with the dDFIRE energy function, and a constrained molecular dynamics simulation in CHARMM. If the structure is larger than the cutoff of 154 residues, the protocol proceeds directly to a CHARMM MD stage. The procedure yields one refined structure with improved model accuracy and quality, as benchmarked in this work.
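The length-based branching described above can be illustrated with a minimal Python sketch. It only plans the order of stages and does not call CYANA, Rosetta, dDFIRE, or CHARMM. The function name `plan_refinement` and the stage labels are hypothetical and chosen for illustration; only the 154-residue cutoff and the order of stages are taken from the text.

```python
from typing import List

# Cutoff stated in the abstract: chains of 154 residues or fewer get the full pipeline.
LENGTH_CUTOFF = 154

# Stage labels are illustrative strings, not calls into the real programs.
CONSTRAINT_STAGE = "derive constraints (Ca-Ca, disulfide, helix H-bond, sheet H-bond)"
SHORT_CHAIN_STAGES = [
    "CYANA sampling",
    "Rosetta FastRelax relaxation",
    "SVM filtering",
    "dDFIRE scoring",
    "constrained CHARMM MD",
]
LONG_CHAIN_STAGES = ["CHARMM MD"]


def plan_refinement(sequence: str) -> List[str]:
    """Return the ordered stage list for a sequence (hypothetical helper)."""
    stages = [CONSTRAINT_STAGE]
    if len(sequence) <= LENGTH_CUTOFF:
        stages.extend(SHORT_CHAIN_STAGES)
    else:
        stages.extend(LONG_CHAIN_STAGES)
    return stages


if __name__ == "__main__":
    for length in (120, 154, 155):
        print(length, plan_refinement("A" * length))
```

Keeping the cutoff as a single named constant makes the boundary case explicit: a 154-residue chain takes the full sampling path, while a 155-residue chain goes straight to the MD stage.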

The method was additionally benchmarked on models produced by top-performing three-dimensional structure prediction servers during CASP10, refining predictions with only one model 79% of the time on average. Cumulative comparisons of the extent of refinement and the number of wins were performed. The robustness of the Princeton_TIGRESS protocol was also tested with different random seeds. We make the Princeton_TIGRESS refinement protocol freely available as a web server at http://atlas.princeton.edu/refinement. Using this protocol, one can consistently refine a prediction to help bridge the gap between a predicted structure and the actual native structure. We will also present our recent blind benchmarking results in CASP11 and advances in the methodology, and describe how Princeton_TIGRESS can be used to aid protein structure prediction and design given only an initial model generated through existing protein structure prediction servers and techniques.
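The benchmark figures quoted in this abstract (fraction of targets improved and average GDT_TS change) can be illustrated with a small Python sketch. It uses the standard definition of GDT_TS as the mean percentage of Cα atoms within 1, 2, 4, and 8 Å of the native positions, but assumes a single fixed superposition, whereas the official score searches over many superpositions. The function names and all input values are hypothetical and are not results from this work.

```python
from typing import List, Sequence, Tuple

GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)  # angstroms


def gdt_ts(ca_deviations: Sequence[float]) -> float:
    """GDT_TS (0-100) from per-residue Ca deviations for one fixed superposition.

    Simplification: the official score maximizes each fraction over many
    superpositions; here the deviations are taken as given.
    """
    if not ca_deviations:
        raise ValueError("need at least one residue")
    n = len(ca_deviations)
    fractions = [sum(d <= cutoff for d in ca_deviations) / n
                 for cutoff in GDT_TS_CUTOFFS]
    return 100.0 * sum(fractions) / len(GDT_TS_CUTOFFS)


def summarize(pairs: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (fraction of targets improved, mean GDT_TS change).

    Each pair is (GDT_TS before refinement, GDT_TS after refinement).
    """
    deltas = [after - before for before, after in pairs]
    improved = sum(delta > 0 for delta in deltas) / len(deltas)
    mean_delta = sum(deltas) / len(deltas)
    return improved, mean_delta


if __name__ == "__main__":
    # Made-up deviations and scores, for illustration only.
    print(gdt_ts([0.5, 1.5, 3.0, 9.0]))
    print(summarize([(50.0, 52.0), (60.0, 59.0), (40.0, 41.0), (70.0, 72.0)]))
```

Under this convention, a statement such as "increased GDT_TS 76% of the time" corresponds to the first value returned by `summarize`, and the average improvement per structure corresponds to the second.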

References

1. Khoury GA, Tamamis P, Pinnaduwage N, Smadbeck J, Kieslich CA, Floudas CA. Princeton_TIGRESS: Protein geometry refinement using simulations and support vector machines. Proteins: Struct, Funct, Bioinf 2013 (In Press). DOI: 10.1002/prot.24459.

2. Khoury GA, Smadbeck J, Kieslich CA, Floudas CA. Protein folding and de novo protein design for biotechnological applications. Trends Biotechnol 2014;32(2):99-109.

3. Bhattacharya D, Cheng J. 3Drefine: Consistent protein structure refinement by optimizing hydrogen bonding network and atomic-level energy minimization. Proteins: Struct, Funct, Bioinf 2013;81(1):119-131.

4. Adams PD, Baker D, Brunger AT, Das R, DiMaio F, Read RJ, Richardson DC, Richardson JS, Terwilliger TC. Advances, interactions, and future developments in the CNS, Phenix, and Rosetta structural biology software systems. Annu Rev Biophys 2013;42:265-287.

5. Rodrigues JP, Levitt M, Chopra G. KoBaMIN: a knowledge-based minimization web server for protein structure refinement. Nucleic Acids Res 2012;40(W1):W323-W328.

6. Raval A, Piana S, Eastwood MP, Dror RO, Shaw DE. Refinement of protein structure homology models via long, all-atom molecular dynamics simulations. Proteins: Struct, Funct, Bioinf 2012;80(8):2071-2079.

7. Zhang J, Liang Y, Zhang Y. Atomic-level protein structure refinement using fragment-guided molecular dynamics conformation sampling. Structure 2011;19(12):1784-1795.

8. Nugent T, Cozzetto D, Jones DT. Evaluation of predictions in the CASP10 model refinement category. Proteins: Struct, Funct, Bioinf 2014;82(S2):98-111.

9. Mirjalili V, Feig M. Protein structure refinement through structure selection and averaging from molecular dynamics ensembles. J Chem Theory Comput 2013;9(2):1294-1303.

10. Mirjalili V, Noyes K, Feig M. Physics based protein structure refinement through multiple molecular dynamics trajectories and structure averaging. Proteins: Struct, Funct, Bioinf 2014;82(S2):196-207.

11. Heo L, Park H, Seok C. GalaxyRefine: protein structure refinement driven by side-chain repacking. Nucleic Acids Res 2013;41(W1):W384-W388.

12. Li J, Bhattacharya D, Cao R, Adhikari B, Deng X, Eickholt J, Cheng J. The MULTICOM protein tertiary structure prediction system. Methods Mol Biol 2014;1137:29-41.

13. Khoury GA, Liwo A, Khatib F, Zhou H, Chopra G, Bacardit J, Bortot L, Delbem ACB, Deng X, Faccioli R, He Y, Krupa P, Li J, Mozolewska M, Baker D, Cheng J, Floudas CA, Keasar C, Levitt M, Popović Z, Scheraga HA, Skolnick J, Crivelli SN, Foldit Players. WeFold: A coopetition for protein structure prediction. Proteins: Struct, Funct, Bioinf 2014 (In Press).