(573a) A High Resolution Side Chain Centroid Based Distance Dependent Force Field - Effect of Protein Side Chain Interactions | AIChE

Authors 

Rajgaria, R. - Presenter, Princeton University
McAllister, S. R. - Presenter, Princeton University
Floudas, C. A. - Presenter, Princeton University


Moving from medium resolution to high resolution structure prediction requires a force field that can distinguish between structures with very low root mean square deviations (RMSD). Recently, a high resolution Cα-Cα distance dependent force field was proposed and found to be very successful ([1]). However, Cα distance based force fields completely disregard the presence of the side chain atoms. A force field that accounts for the side chains of each amino acid is expected to be more effective in its energy calculation. This work continues the previous work ([1]) by generating a side chain centroid based high resolution force field.

The force field was trained on a well represented set of proteins from the Protein Data Bank (PDB), and high quality decoys were generated for each of the training proteins. The generation of decoy structures is split into two stages. The first stage identifies the hydrophobic core of the protein and uses a set of tolerances to establish varying degrees of protein flexibility within the bounds. In the second stage, an ensemble of decoy structures is created with DYANA ([2]), which uses simulated annealing with torsion angle dynamics. Using this method, high resolution decoys (i.e., decoy structures with a minimum RMSD < 2.0 Å to the native structure) have been generated for a set of 1489 non-homologous proteins that are expected to span the experimentally determined structures in the Protein Data Bank ([3]). A linear programming based approach was used to train the force field on these decoys. Two different distance dependencies were studied (a 6 bin definition and a 7 bin definition). To incorporate a large number of decoys (> 612,500), an iterative dropping scheme ("Rank and Drop") based on RMSD and energy criteria was used. Decoys for each protein were first ranked by RMSD, and a force field was generated using the 45 lowest RMSD decoys of each protein. This force field was then used to rank all decoys, and the best decoys (ranked by energy) were used to generate the next force field. This iterative scheme was repeated until the ranking of the decoys no longer changed.
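The "Rank and Drop" iteration described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`rank_and_drop`, `toy_fit`, `toy_energy`), the decoy representation, and the `max_iter` cap are all hypothetical. The sketch assumes that the "best" decoys are those with the lowest energy under the current force field, and it approximates "no more change in the ranking" by checking that the selected subset and its order stop changing. The linear programming training step is replaced by a caller-supplied `fit` function; the toy version here only averages features and stands in for it.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical decoy record: (RMSD to native, feature vector such as
# pair counts per distance bin). The real feature definition is not
# given in the abstract.
Decoy = Tuple[float, Sequence[float]]


def rank_and_drop(
    decoys: Dict[str, List[Decoy]],
    fit: Callable[[Dict[str, List[Decoy]]], List[float]],
    energy: Callable[[List[float], Sequence[float]], float],
    n_keep: int = 45,
    max_iter: int = 100,
) -> Tuple[List[float], Dict[str, List[int]]]:
    """Iteratively refit a force field on a per-protein subset of decoys."""
    # Initial subset: the n_keep lowest-RMSD decoys of each protein.
    selected = {
        protein: sorted(range(len(ds)), key=lambda i: ds[i][0])[:n_keep]
        for protein, ds in decoys.items()
    }
    model: List[float] = []
    for _ in range(max_iter):
        # Train on the currently selected decoys only.
        model = fit({p: [decoys[p][i] for i in idx] for p, idx in selected.items()})
        # Re-rank ALL decoys by energy and keep the n_keep lowest
        # (assumption: lowest energy means "best" in the text above).
        new_selected = {
            protein: sorted(range(len(ds)), key=lambda i: energy(model, ds[i][1]))[:n_keep]
            for protein, ds in decoys.items()
        }
        if new_selected == selected:  # selection and order unchanged: stop
            break
        selected = new_selected
    return model, selected


def toy_fit(training: Dict[str, List[Decoy]]) -> List[float]:
    # Stand-in for the LP training step: weight = mean feature value.
    rows = [features for ds in training.values() for _, features in ds]
    return [sum(col) / len(rows) for col in zip(*rows)]


def toy_energy(weights: List[float], features: Sequence[float]) -> float:
    # Linear energy: dot product of weights and features.
    return sum(w * f for w, f in zip(weights, features))


toy_decoys = {"A": [(0.5, [3.0]), (1.0, [1.0]), (1.5, [2.0])]}
model, selected = rank_and_drop(toy_decoys, toy_fit, toy_energy, n_keep=2)
```

In the toy run, the first subset is chosen by RMSD (decoys 0 and 1), the second by energy (decoys 1 and 2), and the loop stops once the energy-based selection repeats.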

The effectiveness of a force field can be rigorously evaluated by applying it to an independent test set. A set of 148 randomly selected proteins (41-200 amino acids in length) was used as the test set. The 6bin-HRSC (high resolution side chain) force field correctly identified the native structure for 128 of the 148 test proteins. The average ranking and average Z-score for this test set were 2.49 and 3.62, respectively. The performance of this force field on the high resolution decoy set is better than that of other leading force fields ([4],[5],[6],[7]).
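The abstract does not define how the native ranking and Z-score are computed, so the following is a sketch under a commonly used convention, stated here as an assumption: the rank of the native structure is one plus the number of decoys with lower energy, and the Z-score is the gap between the mean decoy energy and the native energy, in units of the decoy energy standard deviation. The function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence


def native_rank(native_energy: float, decoy_energies: Sequence[float]) -> int:
    """Rank 1 means no decoy has a lower energy than the native structure."""
    return 1 + sum(1 for e in decoy_energies if e < native_energy)


def native_z_score(native_energy: float, decoy_energies: Sequence[float]) -> float:
    """Assumed convention: (mean decoy energy - native energy) / decoy std dev.

    A larger positive value means the native structure is better separated
    from the decoy ensemble. Requires decoy energies that are not all equal.
    """
    return (mean(decoy_energies) - native_energy) / pstdev(decoy_energies)


# Toy energies, not data from the study.
toy_decoy_energies = [-1.0, -3.0]
rank_native = native_rank(-5.0, toy_decoy_energies)
z_native = native_z_score(-5.0, toy_decoy_energies)
rank_misranked = native_rank(-2.0, toy_decoy_energies)
```

Averaging these two quantities over all test proteins would give summary figures of the kind reported above, under the stated convention.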

[1] Rajgaria, R., McAllister, S. R., and Floudas, C. A. (2006). Development of A Novel High Resolution Calpha-Calpha Distance Dependent Force Field Using A High Quality Decoy Set. Proteins: Structure, Function, and Bioinformatics, Submitted.

[2] Guntert, P., Mumenthaler, C., and Wuthrich, K. (1997). Torsion angle dynamics for NMR structure calculation with the new program DYANA. Journal of Molecular Biology, 273: 283-298.

[3] Zhang, Y. and Skolnick, J. (2004). Automated structure prediction of weakly homologous proteins on a genomic scale. Proceedings of the National Academy of Sciences, 101: 7594-7599.

[4] Rajgaria, R., McAllister, S. R., and Floudas, C. A. (2006). Improving the Performance of A High Resolution Distance Dependent Force Field By Including Protein Side Chains. In Preparation.

[5] Tobi, D. and Elber, R. (2000). Distance-dependent, pair potential for protein folding: Results from linear optimization. Proteins: Structure, Function, and Bioinformatics, 41: 40-46.

[6] Loose, C., Klepeis, J. L. and Floudas, C. A. (2004). A new pairwise folding potential based on improved decoy generation and side-chain packing. Proteins: Structure, Function, and Bioinformatics, 54: 303-314.

[7] Hinds, D. A. and Levitt, M. (1994). Exploring Conformational Space with a Simple Lattice Model for Protein Structure. Journal of Molecular Biology, 243: 668-682.