(410f) Identifying Tightly Regulated and Variably Expressed Pathways by Differential Rank Conservation (DIRAC)

Eddy, J. A., University of Illinois
Price, N. D., University of Illinois at Urbana-Champaign
Geman, D., Johns Hopkins University

Methods for analyzing high-throughput microarray data are shifting toward an increased focus on biologically meaningful pathways. Pathway-based gene expression analysis typically focuses on identifying related sets of genes that are individually or collectively over- or under-expressed in different phenotypes. Differential Rank Conservation (DIRAC) is a novel approach for studying gene ordering within pathways, based on the relative expression ranks of participating genes. DIRAC provides quantitative measures of how pathway rankings differ both within and between phenotypes. The first form of DIRAC, which compares pathways within a selected phenotype, contrasts two scenarios: (i) the genes of a pathway are ranked similarly in all samples (high rank conservation); or (ii) the ordering of pathway genes is highly varied across samples (low rank conservation). We examined a number of disease phenotypes, including cancer subtypes and neurological disorders, and identified pathways that appear to be tightly regulated based on high conservation of gene ordering. Tightness of regulation for a selected pathway can best be viewed as the amount of variation in gene expression levels that the cell allows. Whereas studying up- or down-regulation of a pathway provides a specific measure of how a process may be modified, identifying pathways that are tightly or loosely regulated indicates the level of control across samples in a population. Pathways under tight control in a phenotype may be important for maintaining a specific cellular function and would therefore be important targets of study in different diseases. Interestingly, we observed a strong overall trend of greater shuffling (loose regulation of ordering across many pathways) in more malignant phenotypes and later stages of disease.
This global pattern of increased disorder with malignancy highlights the utility of studying gene ordering within pathways and reveals a striking phenomenon that will drive future investigation and may lead to a new understanding of gene expression in disease. The second form of DIRAC manifests as a change in ranking (i.e., shuffling) between phenotypes for a selected pathway. Variably expressed pathways identified by DIRAC serve as signatures for molecular classification and were used to classify expression profiles with accuracies above 95% in many cases. The ability to accurately classify microarray samples provides strong validation for the pathway-level expression differences identified by DIRAC. The DIRAC approach for classifying expression profiles is noteworthy because it (i) is independent of data normalization; (ii) results in an elegant classifier in which binary phenotype (e.g., disease) diagnosis depends simply on whether a score, bounded by -1 and 1, is above or below zero; and (iii) appears thus far to be comparably accurate to state-of-the-art classification methods. By learning how the ranking of genes within pathways is regulated in different states, we can identify molecular signatures to distinguish between phenotypes and to aid in the identification of potential therapeutic targets.
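The two forms of DIRAC described above can be illustrated with a minimal Python sketch. The abstract does not spell out the computations, so the formalization here is an assumption: gene ordering is encoded as pairwise comparisons between pathway genes, a phenotype's "template" is the majority ordering of each gene pair, a sample's matching score is the fraction of pairs that agree with a template, and the conservation measure is the mean matching score across samples. The function names and the toy expression values are hypothetical and chosen only for illustration.

```python
import numpy as np

def pairwise_orderings(expr):
    """expr: (n_genes, n_samples) expression values for one pathway.
    Returns an (n_pairs, n_samples) boolean array that is True where
    gene i is expressed below gene j, for every pair i < j."""
    expr = np.asarray(expr, dtype=float)
    i, j = np.triu_indices(expr.shape[0], k=1)
    return expr[i] < expr[j]

def rank_template(expr):
    """Majority ordering of each gene pair across a phenotype's samples
    (assumption: an exact 50/50 split counts as False)."""
    return pairwise_orderings(expr).mean(axis=1) > 0.5

def matching_scores(expr, template):
    """Per-sample fraction of gene pairs whose ordering matches the template."""
    return (pairwise_orderings(expr) == template[:, None]).mean(axis=0)

def conservation_index(expr):
    """Mean matching score of a phenotype's samples against its own template.
    Values near 1 suggest tight regulation; lower values suggest shuffling."""
    return float(matching_scores(expr, rank_template(expr)).mean())

def rank_difference(sample, template_a, template_b):
    """Matching score against phenotype A minus that against phenotype B.
    The result lies in [-1, 1]; a positive value favors A, a negative value B."""
    column = np.asarray(sample, dtype=float)[:, None]
    score_a = matching_scores(column, template_a)[0]
    score_b = matching_scores(column, template_b)[0]
    return float(score_a - score_b)

# Toy pathway with 4 genes (rows) and 3 samples (columns) per phenotype.
tight = np.array([[1.0, 2.0, 1.5],
                  [3.0, 4.0, 2.5],
                  [5.0, 6.0, 3.5],
                  [7.0, 8.0, 9.0]])   # same gene ordering in every sample
loose = np.array([[1.0, 4.0, 2.0],
                  [2.0, 3.0, 1.0],
                  [3.0, 2.0, 4.0],
                  [4.0, 1.0, 3.0]])   # ordering shuffled between samples

print("conservation (tight):", conservation_index(tight))
print("conservation (loose):", conservation_index(loose))

# Classify new samples by the sign of the rank difference score.
t_a, t_b = rank_template(tight), rank_template(loose)
print("score for increasing sample:", rank_difference([1, 2, 3, 4], t_a, t_b))
print("score for decreasing sample:", rank_difference([4, 3, 2, 1], t_a, t_b))
```

Because only within-sample comparisons between genes enter the calculation, any transformation that preserves the ordering of genes within a sample leaves the scores in this sketch unchanged, which is consistent with the normalization independence noted above.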