(281g) Automated Metabolic Network Reconstruction: Method, Application and Web Tool
Genome-wide metabolic network reconstruction has proved to be a powerful tool for system-level study of whole-cell metabolism. A great deal of effort has been spent reconstructing the metabolic networks of a few tens of organisms, including bacteria, archaea and eukaryotes. However, generating a complete and accurate metabolic network reconstruction that is ready for further applications remains a time-consuming and labor-intensive task. Meanwhile, a rapidly increasing number of organisms have had or are having their genomes sequenced, which provides first-hand data for whole-cell metabolic network reconstruction. Here, we present a bioinformatics pipeline that can automatically reconstruct the metabolic networks of organisms from their genome sequences and annotations. The pipeline consists of three stages. First, a preliminary metabolic network reconstruction is derived by matching the genes in the genome with the enzymes catalyzing metabolic reactions collected in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Second, potential network gaps are predicted by searching for putative metabolic reactions that are required to achieve specific cellular functions, e.g. biomass synthesis. Finally, the probabilities of all the metabolic network gaps are evaluated according to sequence alignment results, and the most probable metabolic network is reconstructed by a mixed integer linear programming (MILP) model. In addition, the pipeline fills the network gaps with putative gene candidates in the genome. We first tested the pipeline with Escherichia coli K12 MG1655, the metabolic network of which has been extensively studied. The results from our pipeline are comparable, in terms of the scale of the network, to metabolic network models that were carefully curated manually.
Furthermore, we evaluated the phenotype phase plane (PPP) under two conditions using our reconstructed metabolic network, and the results were also in agreement with previous work. The pipeline was then applied to twelve strains of Prochlorococcus marinus, the numerically most abundant photosynthetic organism on Earth. By comparing the metabolic reconstructions of different strains of one genus, we can predict the pan and core metabolic networks of Prochlorococcus marinus, which represent the properties of the genus rather than those of individual strains. A publicly available web tool for this bioinformatics pipeline is currently under active development. Once established, it will provide automated whole-cell metabolic network reconstruction for individual organisms based on their genome sequences and annotations.
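
The comparison of strain-level reconstructions described above can be illustrated with a minimal sketch. It assumes the usual convention that the pan network is the union of the reactions found in any strain and the core network is the intersection of the reactions found in every strain; the abstract does not state the exact definition used. The function name, strain names and reaction identifiers below are hypothetical placeholders, not output of the pipeline.

```python
def pan_and_core(strain_reactions):
    """Return (pan, core) reaction sets from per-strain reaction sets.

    Assumption: pan = union over all strains, core = intersection over
    all strains. `strain_reactions` maps a strain name to an iterable of
    reaction identifiers.
    """
    if not strain_reactions:
        raise ValueError("at least one strain reconstruction is required")
    sets = [set(reactions) for reactions in strain_reactions.values()]
    pan = set().union(*sets)          # reactions present in any strain
    core = set.intersection(*sets)    # reactions present in every strain
    return pan, core


# Hypothetical toy data: three strains with placeholder reaction IDs.
strains = {
    "strain_A": {"R1", "R2", "R3"},
    "strain_B": {"R1", "R2", "R4"},
    "strain_C": {"R1", "R3", "R4", "R5"},
}

pan, core = pan_and_core(strains)
accessory = pan - core  # reactions missing from at least one strain

print("pan:", sorted(pan))
print("core:", sorted(core))
print("accessory:", sorted(accessory))
```

Under these assumptions, the reactions in the pan network but not in the core network are the strain-variable part of the metabolism, which is where differences between individual strains would show up.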