(306c) Uncovering “Hidden” Variability and Dynamic Patterns: Strategies for Analyzing High-Dimensional Data Sets | AIChE

Authors 

Staehle, M. M. - Presenter, Rowan University
Ogunnaike, B. A. - Presenter, University of Delaware


In systems biology, new technologies have enabled the acquisition of large data sets from high-dimensional experimental designs. These data sets provide an opportunity to examine multiple experimental variables simultaneously and to uncover “hidden,” physiologically relevant sources of variability.

To introduce the analytical strategies we have developed for high-dimensional data sets, we use a specific gene expression data set that was developed for systematic comparisons of cellular responses to chronic alcoholism and alcohol withdrawal. The data set includes 195 rat samples from two brain regions, three alcohol states, and five experimental time points, with at least five replicates per condition. For each of the 195 samples, the expression of 145 transcripts was measured with Fluidigm’s BioMark™ high-throughput quantitative reverse transcription PCR platform.

Our analysis of this data set revealed “hidden” changes in gene expression across the three times of day at which the samples were collected, indicating that gene expression varies in dynamic, diurnal patterns. We developed methods for characterizing these patterns both qualitatively and quantitatively, and we found that, in some cases, the patterns are brain-region-specific. Furthermore, principal component analysis (PCA) of the transcriptomic data revealed that the two brain regions have distinct molecular phenotypes and therefore must be analyzed separately in order to capture perturbation-induced changes in expression. The “hidden” sources of variability in gene expression data provide additional insight into the overall functionality of these brain regions, but they also obscure the desired dynamics. When the variability due to diurnal patterns and differences between brain regions is removed, PCA of the data illuminates dynamic patterns of gene expression that evolve during alcohol withdrawal. Translating these analytical strategies to characterize “hidden” variability and analyze gene expression patterns in other high-dimensional data sets could lead to systems-level insight into biological systems.
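
As a rough illustration of removing a known source of variability before PCA, the sketch below handles each brain region separately, subtracts the per-transcript mean of each time-of-day group, and then runs PCA on the adjusted data. It is not the authors' method, which the abstract does not specify. Mean-centering is only one assumed way to remove a diurnal effect, the data are synthetic, and the names `center_within_groups`, `pca`, `region`, and `time_of_day` are hypothetical.

```python
import numpy as np

def center_within_groups(X, groups):
    """Subtract each group's per-transcript mean from the samples in that group."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    out = X.copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= X[mask].mean(axis=0)
    return out

def pca(X, n_components=2):
    """PCA via SVD of the column-centered matrix.

    Returns sample scores, loadings, and the explained variance ratio
    of the retained components.
    """
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    var_ratio = s**2 / np.sum(s**2)
    return scores, Vt[:n_components], var_ratio[:n_components]

# Synthetic stand-in for the expression matrix (samples x transcripts).
# Only the dimensions (195 x 145) come from the abstract; the values are random.
rng = np.random.default_rng(0)
n_samples, n_transcripts = 195, 145
region = rng.integers(0, 2, n_samples)       # two brain regions (hypothetical labels)
time_of_day = rng.integers(0, 3, n_samples)  # three collection times (hypothetical labels)
X = rng.normal(size=(n_samples, n_transcripts))
X += 3.0 * region[:, None]        # assumed region offset
X += 1.0 * time_of_day[:, None]   # assumed diurnal offset

# Analyze each region separately, removing the time-of-day means first.
results = {}
for r in np.unique(region):
    in_region = region == r
    adjusted = center_within_groups(X[in_region], time_of_day[in_region])
    results[int(r)] = pca(adjusted, n_components=2)
```

In this sketch, the scores in `results[r][0]` could then be plotted by alcohol state and experimental time point to look for withdrawal-related structure. With real data, a regression-based or mixed-model adjustment may be more appropriate than simple group centering.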
