Modeling Suboptimal Community Growth and Enhanced Species Diversity of the Gut Microbiota

Henson, M. A., University of Massachusetts Amherst
Phalak, P., University of Massachusetts Amherst
Underlying cellular responses is a transcriptional regulatory network (TRN) that modulates gene expression. Current descriptions of the TRN cannot quantitatively deconvolute the transcriptome into the relative contributions of individual transcriptional regulators. Here, we applied unsupervised learning to a compendium of high-quality Escherichia coli RNA-seq datasets (median R² = 0.99 between biological replicates) to identify 71 statistically independent regulatory signals. Summation of the 71 signals explained over 80% of expression variation across 115 unique experimental conditions. Of these, 50 signals were directly linked to characterized transcriptional regulators. Condition-specific signal strengths were validated by exposure to new environmental conditions, confirming 76% of predicted signal activations. The resulting decomposition of the transcriptome provided: (1) a quantitative, mechanistic explanation of responses to environmental and genetic perturbations, (2) a guide to gene function discovery, (3) characterization of a novel pyruvate-responsive transcription factor, and (4) a basis for comparing transcriptional regulation across closely related strains. Thus, we find that signal summation forms an underlying principle that can describe the composition of the prokaryotic transcriptome.
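
To make the idea of signal summation concrete, the following is a minimal sketch, not the authors' pipeline. It assumes each regulatory signal can be written as a vector of gene weights scaled by a condition-specific strength, so that the expression of every gene is approximated by the sum of its signal contributions. The function names (`reconstruct`, `explained_variance`), the variable names, and all numbers are hypothetical toy values chosen for illustration. They are not data from the study, and the abstract does not specify which unsupervised method was used.

```python
def reconstruct(weights, activities):
    """Sum per-signal contributions.

    weights[g][k]    : hypothetical weight of gene g in signal k
    activities[k][c] : hypothetical strength of signal k in condition c
    Returns X_hat[g][c] = sum over k of weights[g][k] * activities[k][c].
    """
    n_signals = len(activities)
    n_conditions = len(activities[0])
    return [
        [sum(row[k] * activities[k][c] for k in range(n_signals))
         for c in range(n_conditions)]
        for row in weights
    ]


def explained_variance(expression, reconstruction):
    """Fraction of expression variation captured: 1 - SS_residual / SS_total."""
    values = [v for row in expression for v in row]
    mean = sum(values) / len(values)
    ss_total = sum((v - mean) ** 2 for v in values)
    ss_resid = sum(
        (x - y) ** 2
        for row_x, row_y in zip(expression, reconstruction)
        for x, y in zip(row_x, row_y)
    )
    return 1.0 - ss_resid / ss_total


# Toy example (assumed values): 3 genes, 2 signals, 4 conditions.
weights = [[1.0, 0.0], [0.5, 0.0], [0.0, 2.0]]
activities = [[1.0, -1.0, 0.0, 2.0], [0.0, 1.0, -1.0, 0.5]]
noise = [[0.1, -0.1, 0.0, 0.1], [0.0, 0.1, -0.1, 0.0], [-0.1, 0.0, 0.1, 0.0]]

signal_sum = reconstruct(weights, activities)
# "Measured" expression = summed signals plus a small unexplained residual.
expression = [
    [s + e for s, e in zip(row_s, row_e)]
    for row_s, row_e in zip(signal_sum, noise)
]
fraction = explained_variance(expression, signal_sum)
```

Under these assumptions, `fraction` plays the role of the "over 80% of expression variation" figure quoted above: it measures how much of the total variation across genes and conditions is recovered by summing the signals, with the remainder left as unexplained residual.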