(677a) New Advances in the First Principles Protein 3D Structure Prediction Method ASTRO-FOLD

Wei, Y., Princeton University
Subramani, A., Princeton University
Floudas, C. A., Princeton University

The ASTRO-FOLD approach [1] is an ab initio method for predicting the tertiary structure of proteins from their primary amino acid sequences. Advances have been made at several stages of ASTRO-FOLD: secondary structure prediction using integer linear programming [2]; derivation of distance and angle constraints from secondary structure geometries, residue contact prediction [3], structural topology prediction and loop prediction; a 3D prediction algorithm combining the deterministic global optimization method αBB with conformational space annealing [1]; and near-native structure identification using a novel clustering method [4] and a high resolution force field [5].
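As an illustration of how distance constraints can follow from secondary structure geometry, the sketch below computes Cα-Cα distance bounds for residue pairs inside a predicted alpha helix. It is not the ASTRO-FOLD implementation: the ideal-helix parameters are textbook values, and the function names, the tolerance, and the maximum sequence separation are assumptions chosen for the example.

```python
import math

# Ideal alpha-helix geometry for C-alpha atoms (textbook values, assumed here).
RISE_PER_RESIDUE = 1.5    # angstroms travelled along the helix axis per residue
TURN_PER_RESIDUE = 100.0  # degrees of rotation per residue (3.6 residues per turn)
CA_RADIUS = 2.3           # angstroms between the helix axis and each C-alpha


def helix_ca_distance(separation):
    """Distance between C-alpha atoms i and i+separation in an ideal alpha helix."""
    angle = math.radians(TURN_PER_RESIDUE * separation)
    # Chord length between the two atoms when projected onto the plane
    # perpendicular to the helix axis.
    chord = 2.0 * CA_RADIUS * abs(math.sin(angle / 2.0))
    rise = RISE_PER_RESIDUE * separation
    return math.hypot(chord, rise)


def helix_distance_bounds(start, end, tolerance=0.5, max_separation=4):
    """Return (i, j, lower, upper) bounds for residue pairs in helix start..end.

    `tolerance` and `max_separation` are hypothetical parameters for this sketch.
    """
    bounds = []
    for i in range(start, end + 1):
        for k in range(2, max_separation + 1):
            j = i + k
            if j > end:
                break
            d = helix_ca_distance(k)
            bounds.append((i, j, d - tolerance, d + tolerance))
    return bounds


if __name__ == "__main__":
    for i, j, lo, hi in helix_distance_bounds(1, 5):
        print(f"CA {i} - CA {j}: {lo:.2f} to {hi:.2f} angstroms")
```

Bounds of this kind shrink the conformational space that a global optimization step has to search; analogous bounds for beta strands would use extended-chain geometry instead.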

In addition to the existing first principles secondary structure prediction methods, a consensus secondary structure prediction method using mixed integer linear programming has been developed. It is based on 7 well-known secondary structure predictors and performs better than each individual method. The derivation of constraints has been improved through a) a new mixed integer linear optimization model that predicts residue contacts and structural topologies for alpha, beta and mixed alpha/beta proteins [3]; b) a new loop structure prediction method based on iterative improvement of bounds and nonlinear local optimization; and c) a new sheet topology prediction method based on support vector machines and integer linear optimization. The hybrid 3D structure prediction algorithm has been enhanced by adding or modifying several elements, for example, initial conformation selection using a more detailed torsional angle dynamics annealing procedure, and side chain rotamer optimization as an effective local minimizer [6]. To identify the near-native structures, a novel traveling-salesman-problem-based clustering method (ICON) has been developed; on average, it selects the top 3.5% of the conformers in the ensemble [4]. The identified structures can then be refined iteratively through chemical shift predictions using SPARTA [7] and structure predictions using CS23D [8].
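The idea of combining several secondary structure predictors can be sketched with a weighted per-residue vote. This is a deliberately simplified stand-in, not the mixed integer linear programming model described above: the function name, the three-state alphabet (H, E, C), and the weights are assumptions made for illustration only.

```python
from collections import defaultdict


def consensus_secondary_structure(predictions, weights=None):
    """Combine per-residue predictions from several predictors by weighted vote.

    predictions: list of equal-length strings over a state alphabet such as
                 'H' (helix), 'E' (strand), 'C' (coil).
    weights:     optional list with one non-negative weight per predictor
                 (hypothetical; defaults to equal weights).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all predictions must have the same length")
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("one weight per predictor is required")

    consensus = []
    for pos in range(n):
        score = defaultdict(float)
        for pred, w in zip(predictions, weights):
            score[pred[pos]] += w
        # Iterating over the sorted states makes ties resolve to the
        # alphabetically first state, so the result is deterministic.
        best = max(sorted(score), key=lambda state: score[state])
        consensus.append(best)
    return "".join(consensus)


if __name__ == "__main__":
    preds = ["HHHCC", "HHCCC", "HEHCE"]
    print(consensus_secondary_structure(preds))
    print(consensus_secondary_structure(preds, weights=[1.0, 1.0, 3.0]))
```

A plain vote treats every residue independently. An optimization-based consensus can additionally enforce constraints across residues, such as a minimum segment length, which a per-position vote cannot express.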


1. Klepeis J.L. and Floudas C.A., ASTRO-FOLD: Ab initio secondary and tertiary structure prediction in protein folding, in European Symposium on Computer Aided Process Engineering - 12, Elsevier (2002).

2. Klepeis J.L. and Floudas C.A., Prediction of beta-sheet topology and disulfide bridges in polypeptides, Journal of Computational Chemistry, 24, 191-208 (2003).

3. Rajgaria R., Wei Y., and Floudas C.A., Contact prediction for beta and alpha-beta proteins using integer linear optimization and its impact on the first principles 3D structure prediction method ASTRO-FOLD, Proteins, 78, 1825-1846 (2009).

4. Subramani A., DiMaggio P.A., and Floudas C.A., Selecting high quality protein structures from diverse conformational ensembles, Biophysical J., 97, 1728-1736 (2009).

5. Rajgaria R., McAllister S.R., and Floudas C.A., A novel high resolution Cα-Cα distance dependent force field based on a high quality decoy set, Proteins, 65(3), 726-741 (2006).

6. McAllister S.R. and Floudas C.A., An improved hybrid global optimization method for protein tertiary structure prediction, Computational Optimization and Applications, 45, 377-413 (2009).

7. Shen Y. and Bax A., Protein backbone chemical shifts predicted from searching a database for torsion angle and sequence homology, J. Biomol. NMR, 38, 289-302 (2007).

8. Wishart D.S., Arndt D., Berjanskii M., Tang P., Zhou J., and Lin G., CS23D: a web server for rapid protein structure generation using NMR chemical shifts and sequence data, Nucleic Acids Res., 36(Web Server issue), W496-502 (2008).