(230d) Protein Structure Alignment by Derivative-Free Optimization | AIChE

Authors 

Shah, S. B. - Presenter, Carnegie Mellon University
Sahinidis, N. - Presenter, Carnegie Mellon University


The functional properties of a protein are strongly dependent on its structural conformation. The primary question addressed in this work is how to determine structural, and therefore functional, similarity from 3D protein structures. The approach we take relies on protein structure alignment, which elucidates functional protein relationships that are not apparent from the sequence alone.

There are various ways to analyze the similarity of proteins, the most common being calculation of the root mean square deviation (RMSD) between superimposed protein structures. To obtain an alignment, one protein is rotated and translated to superimpose it onto the other protein structure while minimizing the RMSD between the superimposed structures. The alignment is characterized by the rotation-translation transformation and the assignment of the amino acid residues of the two proteins. An approximate alignment may be obtained by iteratively determining the assignment through a dynamic programming method and computing the rotation-translation variables through a least-squares method.
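
As an illustration of the least-squares step described above, the following is a minimal sketch of superposition for a fixed, known assignment. It uses the standard SVD-based (Kabsch) solution, which is an assumption here; the abstract does not say which least-squares procedure is meant. The function name `kabsch_superpose` and the convention that row i of `P` is paired with row i of `Q` are hypothetical choices made for this example.

```python
import numpy as np

def kabsch_superpose(P, Q):
    """Least-squares superposition of paired 3D points.

    P, Q: (n, 3) arrays; row i of P is assumed to be assigned to row i of Q.
    Returns (R, t, rmsd) such that P @ R.T + t is as close as possible to Q.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)

    # The optimal translation maps the centroid of P onto the centroid of Q.
    p_mean = P.mean(axis=0)
    q_mean = Q.mean(axis=0)
    P0 = P - p_mean
    Q0 = Q - q_mean

    # 3x3 cross-covariance matrix of the centered coordinates.
    H = P0.T @ Q0
    U, _, Vt = np.linalg.svd(H)

    # Force a proper rotation (determinant +1) rather than a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean

    # RMSD between the superimposed structures.
    diff = P @ R.T + t - Q
    rmsd = np.sqrt((diff ** 2).sum() / len(P))
    return R, t, rmsd
```

In an iterative scheme of the kind described above, a routine like this would be called each time the dynamic programming step proposes a new assignment.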

We introduce a novel approach to the problem that differs from the traditional approach, in which the assignment evaluation is the main focus, since a suitable rotation-translation transformation for structure superposition is easy to obtain once the assignment is known. Our approach deals with assignment evaluation and protein superposition simultaneously by formulating the alignment problem as a single optimization problem. The new formulation also eliminates the sequentiality constraints on alignments, thus generalizing the scope of the alignment methodology to include non-sequential protein alignments. We employ derivative-free optimization (DFO) methodologies to search for the global optimum of the highly nonlinear and non-differentiable RMSD and weighted-RMSD (wRMSD) functions encountered in the proposed model. We analyze the performance of 28 different DFO solvers and techniques, including direct search, surrogate management frameworks, and optimization by partitioning the search space [1], to determine the most effective techniques for this approach. We also evaluate the sensitivity of the DFO solvers to RMSD-based and wRMSD-based models. Alignments obtained with the proposed model capture similarity between proteins accurately and are in agreement with the fold classification of the SCOP database. Our extensive computational studies provide insights into developing fast protein structure alignment tools as well as into the behavior and relative performance of a large number of numerical optimization techniques for algebraic-model-free optimization.
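
To make the idea of a single, derivative-free optimization problem concrete, here is a rough sketch under several assumptions that are not taken from the abstract: the six decision variables are three Euler angles and three translations, the assignment is replaced by a nearest-neighbor match (non-sequential, and not necessarily one-to-one), and the solver is a basic compass (coordinate) search, one simple member of the direct search family. All function names are hypothetical. This is not the authors' model or any of the 28 solvers they compare, and compass search is a local method that does not guarantee a global optimum.

```python
import math

def transform(x, pts):
    """Rotate pts by Euler angles x[0:3] (R = Rz @ Ry @ Rx), then translate by x[3:6]."""
    ca, sa = math.cos(x[0]), math.sin(x[0])
    cb, sb = math.cos(x[1]), math.sin(x[1])
    cc, sc = math.cos(x[2]), math.sin(x[2])
    R = [
        [cc * cb, cc * sb * sa - sc * ca, cc * sb * ca + sc * sa],
        [sc * cb, sc * sb * sa + cc * ca, sc * sb * ca - cc * sa],
        [-sb, cb * sa, cb * ca],
    ]
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + x[3 + i] for i in range(3))
        for p in pts
    ]

def nn_rmsd(x, A, B):
    """RMSD after matching each transformed point of A to its nearest point in B.

    The inner min() makes the objective non-differentiable in x, and the match
    ignores sequence order (an assumed stand-in for a non-sequential assignment).
    """
    total = 0.0
    for p in transform(x, A):
        total += min(
            (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 + (p[2] - q[2]) ** 2 for q in B
        )
    return math.sqrt(total / len(A))

def compass_search(f, x0, step=0.5, tol=1e-6, max_evals=20000):
    """Basic compass search: poll +/- step along each coordinate, using only f values."""
    x = list(x0)
    fx = f(x)
    evals = 1
    while step > tol and evals < max_evals:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                ft = f(trial)
                evals += 1
                if ft < fx:
                    # Accept the first improving move along this coordinate.
                    x, fx = trial, ft
                    improved = True
                    break
        if not improved:
            # No poll direction improved: shrink the step.
            step *= 0.5
    return x, fx
```

A call such as `compass_search(lambda x: nn_rmsd(x, A, B), [0.0] * 6)` returns a transformation vector and its objective value. Because the search is local, the result depends on the starting point, which is one reason global DFO techniques are of interest for this problem.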

Reference: [1] Rios, L. M. and N. V. Sahinidis, Derivative-free optimization: A review of algorithms and comparison of software implementations, INFORMS Journal on Computing, submitted for publication (2010).