Mpath: Computationally-Aided Design of Synthetic Metabolic Pathways | AIChE


Robert Sidney Cox III¹, Masahiko Nakatsui¹, Hiroki Makiguchi², Teppei Ogawa², Akihiko Kondo¹, Michihiro Araki¹

We introduce Mpath, a computational platform for exploring synthetic metabolic pathways, including putative enzyme and compound information. Mpath randomly samples combinations of known chemical reaction steps to identify potential metabolic routes between target compounds. We demonstrate this heuristic algorithm by designing putative pathways for the in vivo production of novel synthetic amino acids.

Keywords: metabolic engineering, database, chemoinformatics

I. BACKGROUND
THE growing catalogue of known enzymes promises new biological products by combining transgenes in vivo and engineering enzymes with novel catalytic capabilities. Amino acid production is part of the core metabolism of every free-living organism, but derivatives of the standard 20 amino acids are often produced as part of the "secondary metabolism" [1]. For example, many neurotransmitters are derivatives of standard amino acids, and their derivatives are rich targets for drug discovery.
In this study we have developed an efficient method for handling comprehensive enzymatic reaction data to design extensive metabolic pathways, including putative compounds and enzymatic reactions. We first developed a simple method that represents chemical structures as feature vectors and enzymatic reactions as the feature differences between pairs of chemicals. We then developed an algorithm, based on linear programming, that finds possible combinations of chemicals and enzymatic reactions leading from start to target compounds.
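The feature-vector representation described above can be illustrated with a minimal sketch. The abstract does not specify which structural features are used, so the feature set below (counts of C, N and O atoms) and the helper names `reaction_feature` and `apply_reaction` are assumptions made purely for illustration. The example encodes the known decarboxylation of L-glutamate to GABA as a difference vector.

```python
from typing import Dict, Tuple

Vector = Tuple[int, ...]

# Hypothetical feature set: heavy-atom counts. The real feature
# definition used by the authors is not given in the abstract.
FEATURES: Tuple[str, ...] = ("C", "N", "O")

COMPOUNDS: Dict[str, Vector] = {
    "L-glutamate": (5, 1, 4),  # C5H9NO4
    "GABA": (4, 1, 2),         # C4H9NO2
}


def reaction_feature(substrate: Vector, product: Vector) -> Vector:
    """Represent a reaction as the feature difference product - substrate."""
    return tuple(p - s for s, p in zip(substrate, product))


def apply_reaction(compound: Vector, reaction: Vector) -> Vector:
    """Add a reaction feature to a compound feature vector."""
    return tuple(c + r for c, r in zip(compound, reaction))


# Decarboxylation removes one carbon and two oxygens (loss of CO2).
decarboxylation = reaction_feature(COMPOUNDS["L-glutamate"], COMPOUNDS["GABA"])
print(dict(zip(FEATURES, decarboxylation)))
```

Because a reaction is stored only as a difference, the same vector can be applied to any compound, which is what allows putative reactions on compounds that have no annotated enzyme.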
Designing a metabolic pathway amounts to finding combinations of reaction features that account for the difference between the feature vectors of two chemicals. The reaction features are then arranged in sequence to yield a series of intermediate chemical features, which are used to assign compounds from our chemical database by similarity. Our method significantly reduces the computational time needed to find extensive metabolic pathways. The resulting metabolic pathways, including putative compounds and enzymatic reactions, are ranked by feasibility criteria based on chemical similarity and stored in a pathway database. A web user interface has also been developed for inspecting pathway candidates.
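The search for reaction-feature combinations and their ordering can be sketched as follows. This is not the authors' implementation: a small exhaustive enumeration stands in for the linear program, and the reaction table, the (C, N, O) feature set and the function names are hypothetical. The sketch looks for routes from L-glutamate (5, 1, 4) to N-acetyl-GABA (6, 1, 3).

```python
from itertools import combinations_with_replacement, permutations

# Hypothetical reaction features over (C, N, O) atom counts.
REACTIONS = {
    "decarboxylation": (-1, 0, -2),
    "acetylation": (2, 0, 1),
    "hydroxylation": (0, 0, 1),
    "methylation": (1, 0, 0),
}


def add(u, v):
    """Element-wise sum of two feature vectors."""
    return tuple(a + b for a, b in zip(u, v))


def find_combinations(start, target, reactions, max_steps=3):
    """Enumerate reaction multisets whose features sum to target - start.

    Brute force is used here in place of linear programming.
    """
    diff = tuple(t - s for s, t in zip(start, target))
    zero = (0,) * len(diff)
    hits = []
    for n in range(1, max_steps + 1):
        for combo in combinations_with_replacement(sorted(reactions), n):
            total = zero
            for name in combo:
                total = add(total, reactions[name])
            if total == diff:
                hits.append(combo)
    return hits


def orderings(start, combo, reactions):
    """Arrange a combination in sequence and list intermediate features.

    Orderings that would produce a negative feature count are dropped.
    """
    routes = []
    for order in sorted(set(permutations(combo))):
        state, states = start, [start]
        for name in order:
            state = add(state, reactions[name])
            states.append(state)
        if all(x >= 0 for s in states for x in s):
            routes.append((order, states))
    return routes


glutamate, n_acetyl_gaba = (5, 1, 4), (6, 1, 3)
combos = find_combinations(glutamate, n_acetyl_gaba, REACTIONS)
routes = orderings(glutamate, combos[0], REACTIONS)
for order, states in routes:
    print(" -> ".join(order), states)
```

Each intermediate vector in a route would then be matched against the compound database; here the two orderings pass through (7, 1, 5), the composition of N-acetylglutamate, or (4, 1, 2), the composition of GABA.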

¹Organization of Science and Technology, Kobe University, 1-1 Rokkodai Nada, Kobe 657-8501, Japan

²Mitsui Knowledge Industry Co., Osaka Mitsui-Bussan Bldg. 6F, 2-3-33 Nakanoshima, Kita-ku, Osaka 530-0005, Japan

*Presenter E-mail: Sidney@dna.caltech.edu

II. RESULTS
A. We began with the rich set of more than 11,000 L-amino acid-like compounds in the PubChem database [3] and, using these as target molecules, calculated acceptable metabolic pathways for their synthesis from glucose. For this we used the annotated enzymes from the KEGG database [2] and identified reaction steps between compounds that are currently present in nature, along with enzymatic steps that might be easily engineered because of high chemical similarity between known and target compounds. We scored each reaction step by chemical similarity. Mpath correctly reconstructed pathways for 50 amino acid derivatives that are contained in KEGG but are not part of the core reference pathway.
B. From the 1,987 putative synthetic amino acid pathways, we analyzed 100 and chose the 50 most feasible for classification. Most commonly, the side chains of lysine, glutamic acid, cysteine, and serine were found to participate in several 'side-chain linking' derivative reactions. Several small molecules were found to attach to these amino acids, including carboxylic acids such as acetic acid, formic acid, and succinic acid, as well as small primary amines. Other pathway classifications included amino transfer reactions, catabolic degradation pathways, and aromatic ring substitutions.
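The similarity scoring of reaction steps mentioned in part A could look like the sketch below. The abstract does not name the similarity measure, so the Tanimoto coefficient on count vectors is an assumption, as are the (C, N, O) feature set, the function names, and the choice of scoring a whole pathway by its weakest step. A putative step is scored by how closely the proposed substrate resembles the enzyme's known substrate.

```python
def tanimoto(u, v):
    """Tanimoto similarity for non-negative count vectors (assumed measure)."""
    shared = sum(min(a, b) for a, b in zip(u, v))
    total = sum(max(a, b) for a, b in zip(u, v))
    return shared / total if total else 1.0


def score_step(known_substrate, putative_substrate):
    """Score a putative step by substrate similarity to the known reaction."""
    return tanimoto(known_substrate, putative_substrate)


def score_pathway(step_scores):
    """Hypothetical pathway score: the least similar step limits feasibility."""
    return min(step_scores)


# Known substrate of glutamate decarboxylase, as (C, N, O) counts.
glutamate = (5, 1, 4)            # C5H9NO4

# Candidate substrates for an engineered version of the same step.
candidates = {
    "L-aspartate": (4, 1, 4),    # C4H7NO4
    "glycine": (2, 1, 2),        # C2H5NO2
}

ranked = sorted(
    candidates,
    key=lambda name: score_step(glutamate, candidates[name]),
    reverse=True,
)
for name in ranked:
    print(name, round(score_step(glutamate, candidates[name]), 2))
```

In practice a structural fingerprint would replace the atom counts used here; the ranking logic stays the same.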
III. CONCLUSION
The Mpath algorithm has several useful properties compared with other methods, notably its speed when calculating multi-step synthetic pathways from large reaction databases. We found many putative pathways for making new amino acid derivatives, which might be useful for pharmaceutical screening and materials engineering applications. In particular, we found a wide variety of compounds linked through the glutamic acid side chain, and we present these for possible applications.
REFERENCES

[1] Akashi, H. and Gojobori, T. 2002. Metabolic efficiency and amino acid composition in the proteomes of Escherichia coli and Bacillus subtilis. Proceedings of the National Academy of Sciences of the United States of America. 99, 6 (Mar. 2002), 3695–3700.

[2] Kanehisa, M. et al. 2010. KEGG for representation and analysis of molecular networks involving diseases and drugs. Nucleic Acids Research. 38, Database issue (Jan. 2010), D355–60.

[3] Wang, Y. et al. 2009. PubChem: a public information system for analyzing bioactivities of small molecules. Nucleic Acids Research. 37, Web Server issue (Jul. 2009), W623–33.