(509dg) Computer-Aided Prediction of Enzymatic Reactions | AIChE

Authors 

Heid, E. - Presenter, Massachusetts Institute of Technology
Goldman, S., Massachusetts Institute of Technology
Sankaranarayanan, K., Massachusetts Institute of Technology
Coley, C., Massachusetts Institute of Technology
Jensen, K. F., Massachusetts Institute of Technology
Flamm, C., University of Vienna
Green, W., Massachusetts Institute of Technology
The prediction of enzymatic reaction properties is a problem of long-standing interest. These properties include the activity of known enzymes towards non-natural substrates, the ranking of candidate co-substrates and co-factors, regio- and stereoselectivity, and enzyme promiscuity. Computational tools traditionally score the feasibility of a desired reaction by simple pairwise comparison to one or more known substrates, and often rely on general, hand-curated reaction templates. This severely limits both the accuracy of predictions and the applicability of the reaction templates, which led us to develop new tools to extract and score reaction templates. Here, we present the open-source software packages EHreact (Extended Hasse diagrams for the extraction and scoring of reaction templates) and ESPsim (ElectroStatic Potential similarity).

EHreact extracts meaningful reaction templates at various degrees of generality by searching for common substructures around the imaginary transition state of a reaction across a set of known substrates. The extracted templates are arranged into a tree-like structure, a Hasse diagram, which allows for a more complex and accurate scoring of template applicability than previous approaches. The scoring algorithm takes chemical similarity into account, and additionally includes an estimate of enzyme promiscuity derived from the shape of the template tree, volume and steric effects, optionally electrostatic effects, and a penalty if a query substrate does not comply with chemical substructures conserved across the known substrates. EHreact scoring outperforms simpler scoring schemes in predicting high-throughput measurements of selected enzymatic activities on a range of substrates, as well as in predicting co-substrates for multi-substrate reactions.
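
As an illustration of how templates of differing generality can be ordered in a Hasse diagram, the following is a minimal sketch and not the EHreact implementation. It assumes, purely for illustration, that each template can be reduced to a set of feature labels, so that a more general template is a proper subset of a more specific one; EHreact itself works on molecular substructures around the imaginary transition state. The feature names and the helper functions `hasse_edges` and `most_specific_match` are hypothetical.

```python
# Toy sketch: templates as sets of hypothetical feature labels.
# A template is "more general" than another if it is a proper subset of it.

def hasse_edges(templates):
    """Return the cover relations (general, specific) of the subset order.

    An edge is kept only if no third template lies strictly in between,
    which is what distinguishes a Hasse diagram from the full partial order.
    """
    edges = []
    for general in templates:
        for specific in templates:
            if general < specific and not any(
                general < middle < specific for middle in templates
            ):
                edges.append((general, specific))
    return edges


def most_specific_match(query, templates):
    """Return the largest template fully contained in the query, or None."""
    matches = [t for t in templates if t <= query]
    return max(matches, key=len) if matches else None


# Hypothetical templates, from most general to most specific.
root = frozenset({"carbonyl"})
with_alpha_h = frozenset({"carbonyl", "alpha_H"})
with_aryl = frozenset({"carbonyl", "aryl"})
full = frozenset({"carbonyl", "alpha_H", "aryl"})
templates = [root, with_alpha_h, with_aryl, full]

edges = hasse_edges(templates)

# A query substrate that carries an extra, unseen feature.
query = frozenset({"carbonyl", "aryl", "nitro"})
best = most_specific_match(query, templates)
```

In this toy setting, the depth at which a query stops matching gives a crude measure of how closely it resembles the known substrates; the scoring described above combines several further contributions that this sketch does not attempt to model.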

To score electrostatic effects in EHreact, we furthermore developed the Python package ESPsim, an open-source software to calculate electrostatic similarities between molecules. ESPsim allows for a constrained embedding of the coordinates of a query and a reference molecule that share a common substructure, and subsequently calculates the overlap integrals of the electrostatic potentials of the two molecules. Electrostatic potentials are calculated from Gasteiger, Merck Molecular Force Field, or custom partial charges (including an option to use machine-learned partial charges), and are integrated either analytically via fitting to Gaussian functions, numerically on a grid of scaled van der Waals surfaces, or via Monte Carlo integration. Electrostatic potentials and their similarities can furthermore be visualized easily. ESPsim thus provides a versatile and flexible tool to inspect electrostatic similarities between a query and a set of reference molecules, and was shown to improve enzymatic activity predictions within EHreact.
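
The Monte Carlo route to an electrostatic overlap can be sketched in a few lines. This is a toy illustration under stated assumptions, not ESPsim code: molecules are plain lists of point charges `(x, y, z, q)` with made-up values, the potential is a bare Coulomb sum in arbitrary units, points closer than a fixed exclusion radius to any atom are rejected to avoid the singularity, and the overlap is normalized as a Carbo-type similarity (a choice made here for illustration). The function names are hypothetical.

```python
import math
import random


def esp(point, atoms):
    """Coulomb potential (arbitrary units) at `point` from (x, y, z, q) atoms."""
    return sum(q / math.dist(point, (x, y, z)) for x, y, z, q in atoms)


def sample_points(atoms_a, atoms_b, n, margin=4.0, exclusion=1.0, seed=0):
    """Draw n uniform points in a box around both molecules.

    Points within `exclusion` of any atom are rejected (assumed cutoff).
    """
    rng = random.Random(seed)
    coords = [atom[:3] for atom in atoms_a + atoms_b]
    lo = [min(c[i] for c in coords) - margin for i in range(3)]
    hi = [max(c[i] for c in coords) + margin for i in range(3)]
    points = []
    while len(points) < n:
        p = tuple(rng.uniform(lo[i], hi[i]) for i in range(3))
        if all(math.dist(p, c) >= exclusion for c in coords):
            points.append(p)
    return points


def carbo_similarity(atoms_a, atoms_b, n=2000, seed=0):
    """Monte Carlo estimate of <A|B> / sqrt(<A|A> <B|B>).

    The box volume and sample count cancel in the ratio, so plain sums suffice.
    """
    ab = aa = bb = 0.0
    for p in sample_points(atoms_a, atoms_b, n, seed=seed):
        va, vb = esp(p, atoms_a), esp(p, atoms_b)
        ab += va * vb
        aa += va * va
        bb += vb * vb
    return ab / math.sqrt(aa * bb)


# Made-up, pre-aligned toy "molecules": two dipoles and a charge-inverted copy.
dipole = [(0.0, 0.0, 0.0, -0.4), (1.2, 0.0, 0.0, 0.4)]
weaker = [(0.0, 0.0, 0.0, -0.2), (1.2, 0.3, 0.0, 0.3)]
inverted = [(x, y, z, -q) for x, y, z, q in dipole]

sim_self = carbo_similarity(dipole, dipole)
sim_other = carbo_similarity(dipole, weaker)
sim_inverted = carbo_similarity(dipole, inverted)
```

With this normalization, a molecule compared with itself scores 1 and a charge-inverted copy scores -1, with other pairs falling in between. The sketch assumes the two molecules are already aligned, which in ESPsim is the role of the constrained embedding on the common substructure.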

Further efforts to predict enzymatic reaction properties will also be reported. These include progress towards incorporating imaginary transition states into machine-learning models of reaction properties beyond EHreact, and the incorporation of EHreact and ESPsim scores into computer-aided design strategies for multi-enzyme cascade reactions.