(262c) Mixture Model On Graph: A Bayesian Approach to Understanding Metabolic Effects of Bioengineering Processes

Noirel, J. - Presenter, The University of Sheffield
Ow, S. - Presenter, The University of Sheffield
Pandhal, J. - Presenter, The University of Sheffield
Sanguinetti, G. - Presenter, The University of Sheffield
Wright, P. C. - Presenter, The University of Sheffield

Systems-level understanding in systems biology chiefly rests upon the
generation of high-throughput data and the system-scale network
description of elements connected by functional relationships
(protein-protein interactions, reaction chains, regulation). A
particularly useful layer in the systems biology framework is the
proteome, the expressed protein content of the cell.

Proteomics, the study of the proteome, has come to maturity as a
source of high-throughput data. Interpreting and organising such
colossal amounts of data requires specially devised techniques.
Although mass-spectrometry-based proteomics
has greatly improved over recent years, it usually generates sparse
data (typically 10-20% of the theoretical proteome vs near 100% for
the transcriptome), which are therefore more complex to analyse. iTRAQ
(isobaric tags for relative and absolute quantitation) uses isotopic
labelling whose reporter ions are observed in the low-mass region of a
tandem mass spectrum (mass/charge ratio of 113-121) to
quantify the peptides from trypsin-digested proteomes extracted from
cells grown in different experimental conditions. The development of
enhanced fragmentation modes on 3-dimensional quadrupole ion trap
mass spectrometers (MS) allows an ion trap to be used in MS/MS mode
for relative quantitation of the peptides, albeit with more missing
peaks than one would typically expect from time-of-flight tandem mass
spectrometers. Ion trap MS is typically not the method of choice for
examining low-mass ions in MS/MS spectra because of the "one-third
cut-off rule", which normally means that fragment ions below several
hundred Da are not measured.
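
As a rough illustration of the cut-off rule, the Python sketch below
estimates the low-mass cut-off for a given precursor and lists which
values in the 113-121 reporter region would fall below it. The function
names, the fixed one-third fraction and the example precursor values are
assumptions made for illustration only; on a real instrument the cut-off
depends on the trapping parameters.

```python
# Nominal mass/charge window quoted in the text (not exact reporter masses).
REPORTER_MZ = range(113, 122)


def low_mass_cutoff(precursor_mz, fraction=1.0 / 3.0):
    """Rule-of-thumb cut-off: a fixed fraction of the precursor m/z (assumed)."""
    return precursor_mz * fraction


def reporters_lost(precursor_mz, fraction=1.0 / 3.0):
    """Return the nominal reporter m/z values that lie below the cut-off."""
    cutoff = low_mass_cutoff(precursor_mz, fraction)
    return [mz for mz in REPORTER_MZ if mz < cutoff]


# Hypothetical precursors: one well above and one well below the reporter region
# once scaled by one third.
lost_at_600 = reporters_lost(600.0)
lost_at_300 = reporters_lost(300.0)
```

Under these assumptions a precursor at 600 gives a cut-off near 200, so
the whole reporter region is lost, whereas a precursor at 300 gives a
cut-off near 100 and the region is retained.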

The interpretation of the quantitations at the protein level calls for
a least-squares minimisation procedure that can handle missing peaks in
the ion trap spectra. At the systems level, network-based
techniques have only recently started to attract attention from the
proteomics community and are still under development. Our approach,
"Mixture model on graphs" (MMG), is an attempt to tackle this problem
and to help the integration of the typically sparse proteomic datasets
with biological-network information, such as that provided by KEGG or
MetaCyc. MMG is based on a Bayesian model of down- and up-regulation
that is informed by the topology of biological networks through a
conditional prior. This conditional prior relies on the coherent
behaviour of enzyme expression along metabolic pathways that has
resulted from natural selection. The coherent response of the enzymes
along a pathway can be seen as a solution to the problem of optimising
fluxes and responses. We shall explain the details of the method and
the assumptions upon which it relies, and how it can help to devise
hypotheses in the context of quantitative proteomics. An experiment
carried out on an Escherichia coli synthetic biology construct
expressing a light-responsive circuit allows us to show that the
systems approach manages to extract meaningful information from the
proteomic data that cannot be recovered by naive thresholding of the
data. We also present a validation of MMG through bootstrapping on
Saccharomyces cerevisiae.
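
The idea of a mixture model whose prior is informed by network topology
can be sketched as follows. This is not the MMG implementation, whose
details are not given here; it is a minimal mean-field approximation
under assumed choices: three Gaussian classes (down-regulated,
unchanged, up-regulated) on log2 ratios, and a prior that favours the
classes of neighbouring nodes. The names CLASS_MEANS, SIGMA, BETA and
mean_field, their values, and the toy network are all hypothetical.

```python
import math

# Assumed class means for log2 ratios: down-regulated, unchanged, up-regulated.
CLASS_MEANS = (-1.0, 0.0, 1.0)
SIGMA = 0.5  # assumed common standard deviation of each class
BETA = 1.0   # assumed strength of the coupling between neighbours


def log_gauss(x, mu, sigma):
    """Log density of a normal distribution."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def normalise(log_w):
    """Turn unnormalised log weights into probabilities."""
    top = max(log_w)
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def mean_field(log_ratios, edges, n_iter=50):
    """Approximate class probabilities for every node.

    log_ratios: dict node -> log2 ratio, or None if the protein was not observed.
    edges: iterable of (node, node) pairs from the biological network.
    Returns dict node -> [p_down, p_unchanged, p_up].
    """
    nbrs = {n: set() for n in log_ratios}
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)
    k = len(CLASS_MEANS)
    q = {n: [1.0 / k] * k for n in log_ratios}
    for _ in range(n_iter):
        for n, x in log_ratios.items():
            log_w = []
            for c, mu in enumerate(CLASS_MEANS):
                # Unobserved proteins contribute no likelihood term.
                lik = log_gauss(x, mu, SIGMA) if x is not None else 0.0
                # Conditional prior: neighbours vote for their own class.
                prior = BETA * sum(q[m][c] for m in nbrs[n])
                log_w.append(lik + prior)
            q[n] = normalise(log_w)
    return q


# Toy pathway A - B - C with B unobserved; D is unobserved and unconnected.
ratios = {"A": 1.2, "B": None, "C": 0.9, "D": None}
posterior = mean_field(ratios, [("A", "B"), ("B", "C")])
```

In this toy example the unobserved node B inherits a preference for the
up-regulated class from its two neighbours, while the isolated node D
stays at the uniform prior. This is the sense in which network
information can compensate for sparse proteomic data.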