(692b) Insights From High-Throughput Reconstruction and Analysis of 3500 Genome-Scale Metabolic Models | AIChE

Authors 

Henry, C. - Presenter, Argonne National Laboratory
Xia, F. - Presenter, Argonne National Laboratory
Devoid, S. - Presenter, Argonne National Laboratory
DeJongh, M. - Presenter, Hope College
Best, A. - Presenter, Hope College
Vonstein, V. - Presenter, Fellowship for Interpretation of Genomes
Stevens, R. - Presenter, Argonne National Laboratory


Next-generation sequencing technology is creating new opportunities to apply model-based comparative approaches to the high-throughput analysis of genomic data, and thereby to improve our understanding of biological systems. Here we present our work applying the Model SEED framework (http://seed-viewer.theseed.org/models/) [1] to construct draft genome-scale metabolic models for approximately 3500 complete genome sequences. In reconstructing these models, we develop new optimization-based algorithms that identify and fill gaps in the metabolic network more accurately than previous methods; we propose a new approach for predicting essential biomass precursors in microbes based on genomic evidence; and we explore how metabolic modeling may be applied to assign confidence to gene annotations and to identify incorrect annotations that should be reassigned.
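To make the idea of gap filling concrete, the following is a minimal sketch under strong simplifying assumptions. It is not the optimization-based algorithm described in this work: it replaces flux-based optimization with a brute-force search over a candidate reaction pool and uses simple reachability (network expansion) to decide whether a target metabolite can be produced. The toy reactions, the names `producible` and `gapfill`, and the metabolite labels are all hypothetical.

```python
from itertools import combinations


def producible(reactions, seeds):
    """Network expansion: return every metabolite reachable from the seed set.

    Each reaction is a (substrates, products) pair; a reaction fires once all
    of its substrates are available.
    """
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available


def gapfill(draft, candidates, seeds, targets):
    """Return a smallest list of candidate reaction names that makes every
    target producible, or None if no subset of the candidates suffices.

    Brute force over subsets in order of increasing size, so it is only
    practical for toy networks.
    """
    names = sorted(candidates)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            network = list(draft.values()) + [candidates[n] for n in subset]
            if set(targets) <= producible(network, seeds):
                return list(subset)
    return None


# Hypothetical draft model with a missing step between g6p and f6p.
draft = {
    "R1": (["glucose"], ["g6p"]),
    "R3": (["f6p"], ["pyruvate"]),
    "R4": (["pyruvate"], ["biomass"]),
}
# Hypothetical pool of reactions that gap filling may add.
candidates = {
    "R2": (["g6p"], ["f6p"]),
    "R5": (["g6p"], ["ribose5p"]),
    "R6": (["ribose5p"], ["nucleotides"]),
}

if __name__ == "__main__":
    print(gapfill(draft, candidates, seeds=["glucose"], targets=["biomass"]))
```

In this toy case only `R2` is needed to connect glucose to biomass. A realistic implementation would instead pose the selection as a linear or mixed-integer program over reaction fluxes, which is what allows it to scale to genome-scale networks.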

In the process of generating and gap-filling our 3500 draft models, we apply comparative genomics approaches to identify candidate genes that may be associated with the gap-filled reactions in each model. We use chromosomal clustering and functional co-occurrence, in addition to BLAST scores, to identify new high-confidence annotations that may be experimentally validated. This work has resulted in improved, consistent subsystems-based annotations for all 3500 genomes analyzed. All models and annotations are available for download from the SEED (http://pubseed.theseed.org/) and Model SEED sites.
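As an illustration of how several lines of evidence might be combined when ranking candidate genes for a gap-filled reaction, here is a small sketch. The scoring scheme is an assumption made for this example only: the weighted sum, the weights, the field names, and the gene labels are hypothetical and do not reproduce the scoring used in this work.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    gene: str
    blast_identity: float  # sequence similarity to a known enzyme, scaled to 0..1
    clustered: bool        # lies near genes of the same pathway on the chromosome
    cooccurrence: float    # fraction of genomes where it co-occurs with the pathway, 0..1


def evidence_score(candidate, weights=(0.5, 0.3, 0.2)):
    """Combine the three evidence types into one score (hypothetical weights)."""
    w_blast, w_cluster, w_cooc = weights
    return (
        w_blast * candidate.blast_identity
        + w_cluster * float(candidate.clustered)
        + w_cooc * candidate.cooccurrence
    )


def rank_candidates(candidates):
    """Return candidates ordered from strongest to weakest combined evidence."""
    return sorted(candidates, key=evidence_score, reverse=True)


# Hypothetical candidates for a single gap-filled reaction.
pool = [
    Candidate("geneA", blast_identity=0.9, clustered=False, cooccurrence=0.1),
    Candidate("geneB", blast_identity=0.6, clustered=True, cooccurrence=0.8),
    Candidate("geneC", blast_identity=0.3, clustered=False, cooccurrence=0.2),
]

if __name__ == "__main__":
    for c in rank_candidates(pool):
        print(c.gene, round(evidence_score(c), 2))
```

With these made-up numbers, `geneB` outranks `geneA` despite a weaker BLAST match, because clustering and co-occurrence add supporting evidence. This shows why genomic context can surface candidates that similarity alone would miss.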

We analyze how the properties and behavior of our 3500 draft models are conserved across the diverse set of microbial genomes included in the study. We identify the least and most variable metabolic pathways; we find significant variability in the predicted essential metabolic genes and in the redundancy of essential metabolic functions; and we explore which metabolic properties are most conserved among phylogenetically close organisms. This work reveals insights into the diversity of microbial genomes, the completeness of our knowledge of these genomes, and the areas where the largest gaps in that knowledge remain.
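One simple way to compare pathway variability across a collection of models is sketched below. It assumes each model has been summarized as a pathway completeness fraction (the share of a pathway's reactions present in that model) and ranks pathways by the population standard deviation of that fraction. The metric, the function name, and the three toy models are assumptions for illustration, not the analysis performed in this work.

```python
from statistics import pstdev


def rank_pathways_by_variability(completeness):
    """Rank pathways from least to most variable across models.

    completeness maps model name -> {pathway name -> fraction of the pathway's
    reactions present in that model}. A pathway missing from a model counts
    as 0.0. Returns a list of (pathway, standard deviation) pairs.
    """
    pathways = sorted({p for model in completeness.values() for p in model})
    spread = {
        p: pstdev([model.get(p, 0.0) for model in completeness.values()])
        for p in pathways
    }
    return sorted(spread.items(), key=lambda item: item[1])


# Hypothetical completeness values for three draft models.
models = {
    "model_A": {"glycolysis": 1.0, "histidine_biosynthesis": 1.0, "tca_cycle": 1.0},
    "model_B": {"glycolysis": 1.0, "histidine_biosynthesis": 0.9, "tca_cycle": 0.5},
    "model_C": {"glycolysis": 1.0, "histidine_biosynthesis": 1.0},
}

if __name__ == "__main__":
    for pathway, sd in rank_pathways_by_variability(models):
        print(f"{pathway}: {sd:.3f}")
```

In this toy data glycolysis is fully conserved and the TCA cycle is the most variable pathway. For thousands of genomes the same ranking could be computed within phylogenetic groups to examine conservation among closely related organisms.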

[1] Henry, C.S., et al., High-throughput generation, optimization, and analysis of genome-scale metabolic models. Nature Biotechnology, 2010. doi:10.1038/nbt.1672, p. 1-6.