(191dg) Comparative Transcriptomics Analysis Pipeline for a Customized CHO Microarray Platform

Authors: 
Chen, C., Amgen Inc.
Le, H., University of Minnesota
Goudar, C., Amgen Inc.
Comparative transcriptomics analysis, enabled by microarray and emerging RNA-Seq profiling technologies, provides a powerful tool for understanding different phenotypes in bioprocessing-related cell lines at the gene expression level. Translating large transcriptomics data sets into biologically relevant information that can be leveraged for cell line development and process engineering requires multiple tools spanning raw data processing, gene expression analysis, pathway analysis, and data visualization. However, a seamless integration of these tools into an automated pipeline specifically designed for analyzing transcriptomics data from CHO cells, the leading cell line for commercial recombinant protein production, is lacking. Moreover, commercial software is usually not structured to conveniently enable CHO pathway analysis and interoperation. To address this gap, we have developed an automated pipeline in R that leverages public-domain statistical methods to analyze gene expression data from a customized microarray platform for proprietary CHO cell lines. The pipeline not only tests the significance of the difference between two conditions, but also identifies differentially expressed genes and differentially expressed pathways, and outputs multiple intuitive data visualizations. The utility of this pipeline was demonstrated by comparing the transcriptome profiles of recombinant-protein-expressing CHO cells before and after a bioreactor process temperature shift. In addition, we used this pipeline to analyze microarray data comparing CHO cell lines with different recombinant protein expression capacities. Finally, to justify the utility of the customized microarray-based comparative analysis pipeline for CHO cells, we compared the outcome of the microarray-based analysis with the results of an RNA-Seq-based analysis.
The two platforms yielded similar differential pathway analysis results and interpretations, and the majority of differentially expressed genes identified in the microarray analysis were also identified as differentially expressed in the RNA-Seq analysis. We hope that an automated microarray-based comparative transcriptomics analysis pipeline such as the one presented in this study will shorten the time from data acquisition to biological interpretation and help accelerate the introduction of mechanism-driven approaches for cell line and bioprocess optimization.
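
The abstract does not name the statistical methods the pipeline uses. As a rough illustration of the two steps it describes (calling differentially expressed genes, then flagging differentially expressed pathways), the sketch below assumes Benjamini-Hochberg false discovery rate adjustment followed by a hypergeometric over-representation test. Both are common public-domain choices, but they are assumptions here and not necessarily what the authors' R pipeline does. Python is used only for illustration, and the gene names, p-values, and counts are invented.

```python
from math import comb


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def hypergeom_enrichment_p(n_universe, n_pathway, n_de, n_overlap):
    """P(X >= n_overlap) when n_de genes are drawn without replacement
    from n_universe genes, n_pathway of which belong to the pathway."""
    upper = min(n_pathway, n_de)
    total = comb(n_universe, n_de)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical per-gene p-values from a two-condition comparison
# (for example, before vs. after a temperature shift).
raw_p = {"geneA": 0.001, "geneB": 0.004, "geneC": 0.02, "geneD": 0.2, "geneE": 0.5}
adj = dict(zip(raw_p, benjamini_hochberg(list(raw_p.values()))))

# Genes called differentially expressed at an assumed FDR cutoff of 0.05.
de_genes = sorted(g for g, q in adj.items() if q < 0.05)

# Hypothetical pathway: 5 of 20 measured genes, 3 of the 4 DE genes fall in it.
pathway_p = hypergeom_enrichment_p(n_universe=20, n_pathway=5, n_de=4, n_overlap=3)

print(de_genes, round(pathway_p, 4))
```

In practice the per-gene p-values would come from a moderated test on normalized probe intensities, and the gene universe would be restricted to genes actually represented on the array, since the enrichment p-value depends on that background.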