(475c) High-Throughput Reconstruction and Optimization of 130 New Genome-Scale Metabolic Models | AIChE


Henry, C. - Presenter, Argonne National Laboratory
DeJongh, M. - Presenter, Hope College
Best, A. - Presenter, Hope College
Stevens, R. - Presenter, Argonne National Laboratory

Genome-scale metabolic models have emerged as a crucial resource for translating detailed knowledge of thousands of distinct enzymatic processes into global predictions of organism behavior [1]. These models provide a means of predicting essential genes, organism phenotypes, gene expression patterns, and organism response to mutation, and of designing metabolic engineering strategies [2]. Despite the great demand for genome-scale metabolic models, the rate of model development has lagged far behind the rate of genome sequencing. To accelerate the development of new models, we have integrated a biochemical database, a set of thermodynamic estimations [3], a gap filling algorithm [4], and a model optimization algorithm [5] with the SEED framework for updating, correcting, and propagating annotations across hundreds of genomes simultaneously [6]. Within this framework we have implemented an automated pipeline for the high-throughput reconstruction and optimization of genome-scale metabolic models of prokaryotes [7], and we have applied this pipeline to produce functioning models for a diverse set of 130 organisms across 14 bacterial subdivisions. On average, these models comprise 957 reactions associated with 678 genes, covering 21% of the organism's genome. Application of the gap filling algorithm added an average of 53 reactions with no known corresponding genes. Whenever gene essentiality or phenotyping data were available, the model optimization algorithm was applied, producing models with prediction accuracies that exceed 90%.
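To make the idea of gap filling concrete, the following is a minimal sketch and not the optimization-based algorithm of [4]. It assumes a toy reachability criterion (a biomass component counts as producible if it can be reached from the growth medium through the available reactions) and finds the smallest set of database reactions that restores biomass production by brute force. All reaction and metabolite names, and the functions `producible` and `gap_fill`, are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def producible(reactions, seeds):
    """Return every metabolite reachable from the seed metabolites.

    `reactions` maps a reaction id to a (substrates, products) pair.
    This is a simple network expansion, not a flux balance calculation.
    """
    have = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if set(substrates) <= have and not set(products) <= have:
                have |= set(products)
                changed = True
    return have


def gap_fill(model, database, seeds, biomass):
    """Return the smallest set of database reactions that, added to the
    model, makes every biomass component producible (None if impossible).

    Brute force over subsets: fine for a toy example, not for real models.
    """
    candidates = [r for r in database if r not in model]
    for k in range(len(candidates) + 1):
        for added in combinations(candidates, k):
            trial = dict(model)
            trial.update({r: database[r] for r in added})
            if set(biomass) <= producible(trial, seeds):
                return sorted(added)
    return None


# Hypothetical toy data: the draft model lacks the step from g6p to pyr.
model = {
    "R1": (("glc",), ("g6p",)),
    "R3": (("pyr",), ("ala",)),
}
database = dict(model)
database.update({
    "R2": (("g6p",), ("pyr",)),
    "R4": (("glc",), ("xyl",)),
    "R5": (("g6p", "xyl"), ("pyr",)),
})
seeds = {"glc"}      # growth medium
biomass = {"ala"}    # required biomass component

added = gap_fill(model, database, seeds, biomass)
print(added)
```

In this toy case the draft model cannot produce the biomass component until one reaction with no associated gene is borrowed from the database, which mirrors the kind of addition counted in the averages above. A production pipeline would instead pose the problem as a constrained optimization over reaction fluxes.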
Analysis of the reactions added by the gap filling process yielded three key findings: (i) by identifying the biomass components that caused a reaction to be added by the gap filling, we were able to refine our biomass definitions for every organism modeled; (ii) by identifying reactions that the gap filling algorithm added to many different models, we were able to determine the portions of the metabolism of various genera for which additional annotation, curation, and experimental work is required; and (iii) by applying the completed models to the prediction of essential gene sets, we were able to identify the metabolic functions that were consistently essential in every organism as well as those functions that were essential in only a small subset of organisms.
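The cross-model comparisons in (ii) and (iii) amount to counting how many models share a given reaction or essential function. The sketch below illustrates that tally under stated assumptions: the organism names, function labels, the helper `split_by_frequency`, and the 25% cutoff for "a small subset" are all hypothetical and are not taken from the study.

```python
from collections import Counter


def tally(sets_by_organism):
    """Count, for each item, the number of organisms whose set contains it."""
    counts = Counter()
    for items in sets_by_organism.values():
        counts.update(set(items))
    return counts


def split_by_frequency(sets_by_organism, rare_fraction=0.25):
    """Return (universal, rare) items.

    universal: present in every organism.
    rare: present in at most `rare_fraction` of the organisms
          (the cutoff is an illustrative assumption).
    """
    n = len(sets_by_organism)
    counts = tally(sets_by_organism)
    universal = sorted(x for x, c in counts.items() if c == n)
    rare = sorted(x for x, c in counts.items() if c / n <= rare_fraction)
    return universal, rare


# Hypothetical predicted essential functions for four organisms.
essential = {
    "org_A": {"f1", "f2"},
    "org_B": {"f1", "f3"},
    "org_C": {"f1"},
    "org_D": {"f1", "f2"},
}

universal, rare = split_by_frequency(essential)
print(universal, rare)
```

The same tally applied to sets of gap-filled reactions, one set per model, would rank the reactions added to the most models, which is the signal used in (ii) to flag areas of metabolism needing further annotation.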


1. Feist AM, Herrgard MJ, Thiele I, Reed JL, Palsson BØ: Reconstruction of Biochemical Networks in Microbial Organisms. Nat Rev Microbiol 2009, 7(2):129-143.

2. Feist AM, Palsson BØ: The growing scope of applications of genome-scale metabolic reconstructions using Escherichia coli. Nat Biotechnol 2008, 26(6):659-667.

3. Jankowski MD, Henry CS, Broadbelt LJ, Hatzimanikatis V: Group contribution method for thermodynamic analysis of complex metabolic networks. Biophys J 2008, 95(3):1487-1499.

4. Satish Kumar V, Dasika MS, Maranas CD: Optimization based automated curation of metabolic reconstructions. BMC Bioinformatics 2007, 8:212.

5. Henry CS, Zinner J, Cohoon M, Stevens R: iBsu1103: an improved genome scale metabolic model of B. subtilis based on SEED annotations. Genome Biol 2009:submitted.

6. Overbeek R, Disz T, Stevens R: The SEED: A peer-to-peer environment for genome annotation. Communications of the ACM 2004, 47(11):46-51.

7. DeJongh M, Formsma K, Boillot P, Gould J, Rycenga M, Best A: Toward the automated generation of genome-scale metabolic networks in the SEED. BMC Bioinformatics 2007, 8:-.