(508f) Novel Framework for Beta-Sheet Topology Prediction Using Low-Homology Template-Based Constraints
Accurate prediction of beta-sheet topology remains a major unresolved challenge in protein structure prediction. Current state-of-the-art approaches use sequence alignments, secondary structure assignments, and pairwise potentials to derive rank-ordered lists of solutions [1-3]. Because the number of possible topologies grows combinatorially even for a small number of beta strands, mixed-integer linear optimization (MILP) models have been proposed to identify the optimal topology [1-2]. However, non-local tertiary contacts can make beta-sheet prediction from sequence information alone very difficult. We therefore propose a novel framework for beta-sheet topology prediction that uses low-homology structural templates to derive likely beta-strand pairs, which serve as constraints for a MILP model.
Given a query sequence, structural templates are first identified using a modified version of SPARKS-X [4]. Distance constraints are extracted from each template and used as input for CYANA [5] to generate structural models. The initial set of structural templates is then reduced by hierarchical clustering based on pairwise GDT, a measure of protein structure similarity. The beta-sheet topology of each template structure is extracted from its sequence alignment to the query and its secondary structure assignment. A final set of structural templates is selected by clustering the template beta-sheet topologies, which yields a set of observed strand pairs.
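The template-reduction step can be illustrated with a minimal sketch of hierarchical clustering on a pairwise GDT matrix. The linkage rule (average linkage), the merge threshold, the function name cluster_by_gdt, and the toy scores below are all assumptions made for illustration; the abstract does not specify these details.

```python
from itertools import combinations


def cluster_by_gdt(gdt, threshold):
    """Average-linkage agglomerative clustering on a symmetric GDT matrix.

    gdt[i][j] is the pairwise GDT score between templates i and j
    (0-100, higher means more similar). Clusters are merged greedily
    while the best average inter-cluster GDT is at least `threshold`.
    """
    clusters = [[i] for i in range(len(gdt))]

    def avg_gdt(a, b):
        # Mean GDT over all cross-cluster template pairs.
        return sum(gdt[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the two most similar clusters.
        x, y = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg_gdt(clusters[p[0]], clusters[p[1]]),
        )
        if avg_gdt(clusters[x], clusters[y]) < threshold:
            break  # nothing left that is similar enough to merge
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical GDT scores for four templates (not real data).
toy_gdt = [
    [100, 80, 30, 25],
    [80, 100, 35, 30],
    [30, 35, 100, 75],
    [25, 30, 75, 100],
]
toy_clusters = cluster_by_gdt(toy_gdt, threshold=60)
```

With these toy scores, templates 0 and 1 fall into one cluster and templates 2 and 3 into another. One representative per cluster could then be carried forward, although how representatives are chosen is not stated in the abstract.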
The presented MILP model uses pairwise potentials calculated by BetaPro [6], together with the template-based constraints derived as described above. Additional constraints, initially proposed by Subramani and Floudas [2], are imposed to ensure that only biologically relevant topologies are generated. Ultimately, a rank-ordered list of likely beta-sheet topologies is produced. We present results for the application of the proposed framework to all beta and mixed alpha-beta proteins of the PDBSelect25 data set, as well as to the most difficult targets from recent CASP competitions.
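To make the role of template-derived strand pairs concrete, the following toy sketch ranks strand-pairing topologies under a linear objective with binary pairing decisions. It is not the authors' model: brute-force enumeration stands in for a MILP solver, the feasibility rule (each strand pairs with one or two partners) is only an illustrative stand-in for the constraints of Subramani and Floudas, template pairs are modeled as a score bonus rather than as hard constraints, and all scores are hypothetical rather than BetaPro output.

```python
from itertools import combinations


def rank_topologies(n_strands, score, template_pairs, bonus=1.0, top_k=3):
    """Enumerate binary strand-pairing choices and rank them.

    score: dict mapping (i, j), i < j, to a pairwise potential
           (higher is better; hypothetical values).
    template_pairs: set of (i, j) pairs observed in templates; each
           selected template pair adds `bonus` to the objective.
    Feasibility (illustrative assumption): every strand has 1 or 2 partners.
    """
    candidates = sorted(score)
    ranked = []
    for r in range(1, len(candidates) + 1):
        for chosen in combinations(candidates, r):
            degree = [0] * n_strands
            for i, j in chosen:
                degree[i] += 1
                degree[j] += 1
            if any(d < 1 or d > 2 for d in degree):
                continue  # violates the partner-count rule
            value = sum(
                score[p] + (bonus if p in template_pairs else 0.0)
                for p in chosen
            )
            ranked.append((value, chosen))
    ranked.sort(key=lambda t: -t[0])  # best objective value first
    return ranked[:top_k]


# Hypothetical potentials for four strands (not BetaPro output).
toy_score = {
    (0, 1): 1.2, (0, 2): -1.0, (0, 3): -0.2,
    (1, 2): -0.1, (1, 3): -1.0, (2, 3): 0.9,
}
without_templates = rank_topologies(4, toy_score, set())
with_templates = rank_topologies(4, toy_score, {(0, 3)})
```

In this toy case the sequence-based scores alone favor two separate hairpins, (0, 1) and (2, 3). Adding the non-local pair (0, 3) as a template-observed pair moves the single sheet with strand order 1-0-3-2 to the top of the list, which is the kind of correction template-based constraints are meant to supply.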
1. Klepeis, J. L.; Floudas, C. A. Prediction of Beta-Sheet Topology and Disulfide Bridges in Polypeptides. Journal of Computational Chemistry 2003, 24, 191-208.
2. Subramani, A.; Floudas, C. A. β-sheet Topology Prediction with High Precision and Recall for β and Mixed α/β Proteins. PLoS ONE 2012, 7 (3), e32461.
3. Ho, H. K.; Zhang, L.; Ramamohanarao, K.; Martin, S. A survey of machine learning methods for secondary and supersecondary protein structure prediction. Methods in Molecular Biology 2013, 932, 87-106.
4. Yang, Y.; Faraggi, E.; Zhao, H.; Zhou, Y. Improving protein fold recognition and template-based modeling by employing probabilistic-based matching between predicted one-dimensional structural properties of query and corresponding native properties of templates. Bioinformatics 2011, 27 (15), 2076-2082.
5. López-Méndez, B.; Güntert, P. Automated protein structure determination from NMR spectra. Journal of the American Chemical Society 2006, 128, 13112-13122.
6. Cheng, J.; Baldi, P. Three-stage prediction of protein beta-sheets by neural networks, alignments and graph algorithms. Bioinformatics 2005, 21, 75-84.