(575f) Protein Structure Guided Machine Learning Models for Enzyme Kinetic Parameter Prediction | AIChE

Boorla, V. S. - Presenter, Pennsylvania State University
The turnover number (kcat) and Michaelis constant (Km) of the Michaelis-Menten equation are important enzymatic kinetics parameters needed for both assessing individual enzyme performance and for parameterizing metabolic models. Their values tend to span orders of magnitude reflecting the nature of the reaction chemistry, the protein fold of the enzyme, and the bioenergetic/biosynthetic demands imposed by the cell. De novo prediction of these values has been quite challenging due to the mechanistic complexity of the underlying biological processes. Nevertheless, a few recent studies have shown the possibility of training organism independent machine learning models for prediction of kcat and Km using only the enzyme’s amino acid sequence and substrate’s chemical features. These results coupled with advances in protein structure prediction algorithms (e.g., AlphaFold2.0) have motivated our efforts at embedding protein structural features alongside sequence within machine learning models for kcat and Km prediction. We used a graph neural network architecture for extracting features from three-dimensional structures of enzymes and two-dimensional topologies of their substrates which are combined using an attention network to learn features for enzyme-substrate interactions. We integrated the interaction features learned by the model with evolutionary embeddings of enzymes extracted from state-of-the-art protein-language models and “expert crafted” molecular fingerprints for the substrates to predict the kcat and Km values of enzyme-substrate pairs. The constructed Deep Learning models for Michaelis Menten parameter Prediction (DeepMMPred) show better generalizability to blind-test datasets compared to baseline-models that do not use structural features as well as perform on par with existing methods with increased coverages of enzymes and substrates. 
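The feature-combination scheme described above can be sketched in miniature. The following is a hedged illustration only, not the DeepMMPred implementation: all dimensions, the dot-product form of the attention, and the linear head are assumptions, and random vectors stand in for the graph-network, protein-language-model, and fingerprint features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; the real model's sizes are not given in the abstract)
n_res, n_atoms, d = 8, 5, 16  # enzyme residues, substrate atoms, feature width

# Stand-ins for features a graph neural network would extract from the
# enzyme's 3D structure and the substrate's 2D topology
enzyme_feats = rng.standard_normal((n_res, d))
substrate_feats = rng.standard_normal((n_atoms, d))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Cross-attention: each residue attends over substrate atoms; attention
# coefficients like these are what the abstract proposes inspecting for
# interpretability of residue contributions
scores = enzyme_feats @ substrate_feats.T / np.sqrt(d)  # (n_res, n_atoms)
attn = softmax(scores, axis=1)
interaction = attn @ substrate_feats                    # (n_res, d)

# Pool residue-level interaction features, then concatenate with global
# descriptors: a protein-language-model embedding and a molecular fingerprint
plm_embedding = rng.standard_normal(d)  # stand-in for an evolutionary embedding
fingerprint = rng.standard_normal(d)    # stand-in for an "expert crafted" fingerprint
joint = np.concatenate([interaction.mean(axis=0), plm_embedding, fingerprint])

# Linear regression head standing in for the trained predictor of log10(kcat)
w, b = rng.standard_normal(joint.size), 0.0
log10_kcat_pred = float(joint @ w + b)
print(round(log10_kcat_pred, 3))
```

In this sketch the attention matrix is the interpretable object: each row is a probability distribution over substrate atoms for one residue, so pooling and regression downstream inherit a per-residue attribution.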
Further, we extend DeepMMPred to inhibitor constant (Ki) prediction by training on experimental measurements for the corresponding enzyme-inhibitor pairs. We explored the physical interpretability of the trained models by estimating attention coefficients for the amino acid residues that contribute most to the kinetic parameter predictions. We envision these models as comprehensive tools for automated functional annotation of enzymes, for parameterizing genome-scale metabolic models, and for generating hypotheses about the presence of substrate-level enzyme inhibition. The trained models and code will be made available on GitHub at https://github.com/maranasgroup/DeepMMPred and through an easy-to-use interactive graphical interface at https://maranasgroup.com/DeepMMPred.
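For reference, the three predicted parameters enter the standard Michaelis-Menten rate law and its inhibition extension as follows (the competitive case is shown purely as the textbook example; the abstract does not specify an inhibition mechanism):

```latex
v = \frac{k_{\mathrm{cat}}\,[E]_0\,[S]}{K_m + [S]},
\qquad
K_m^{\mathrm{app}} = K_m\left(1 + \frac{[I]}{K_i}\right)
```

Here $[E]_0$ is the total enzyme concentration, $[S]$ and $[I]$ are substrate and inhibitor concentrations, and a smaller $K_i$ corresponds to stronger inhibition.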