(2eg) Big Data Analytics for Biopharmaceutical Production Platform Development | AIChE


Gopalakrishnan, S. - Presenter, University of California San Diego
Research Interests

Therapeutic biologics are the primary modality for treating chronic, non-communicable, and infectious illnesses including autoimmune disorders, Alzheimer’s disease, cancer, and diabetes. Because biologics must be produced by a living organism, the development of a stable and efficient producer cell line remains the primary bottleneck in the design of a production process. Currently, producer cell lines are selected by manually screening a large library of transfected host cell lines, which is time-intensive. Predictive models of metabolism capture the interplay between complex biological processes, predict cellular responses to environmental and genetic perturbations, and inform metabolic engineering strategies for de-bottlenecking and cell line engineering. Construction of quantitative and predictive models for biological networks relies on efficient analysis and mining of large-scale biological data to correlate phenotypic and metabolic features with desired process characteristics such as high product titer and quality. The challenge of designing novel supervised learning algorithms based on transcriptomic, proteomic, metabolomic, and fluxomic datasets involves simultaneous parameterization and simulation of biological processes at various time scales. Such hybrid models combine structural details with empirical statistical models generated using machine learning to provide valuable insights into the mechanisms driving cell-state transitions and to accelerate design-build-test cycles for the development of industrial producer cell lines.

Research Experience

My research career has focused on integrating various types of omics data with genome-scale models of metabolism to characterize the physiological state of organisms and their response to genetic and environmental perturbations. My doctoral work at the Maranas Lab (Penn State University) focused on developing tools and resources for generating genome-scale fluxomic datasets and then using those datasets to construct predictive kinetic models of metabolism. My postdoctoral work at the Lewis Lab (UCSD) expands on my doctoral work by integrating transcriptomics, CRISPR screening data, and process data to extract context-specific models that accurately emulate the cell’s physiological state and simulate a multi-scale process for therapeutic production using CHO cells. This poster covers the following work.

Doctoral Projects

The overarching goal of my doctoral work was to establish a platform to construct and simulate predictive models of metabolism to inform metabolic engineering strategies. This task was limited by the high computational cost associated with model training as well as a shortage of large-scale datasets to improve the predictive capabilities of the metabolic models, and was addressed using a two-pronged strategy:

  1. Development of tools and algorithms for genome-scale 13C-Fluxomics: Computational challenges associated with inferring the in vivo fluxome from stable-isotope tracers limited the scope of models used for data analysis. This work expanded the scope of 13C metabolic flux analysis to genome-scale models for the first time while providing insights into energy allocation in E. coli and uncovering a novel bifurcated topology for carbon conservation in Synechocystis. In addition, guidelines were proposed for the analysis of stable-isotope labeling data, and previously unavailable large-scale fluxomic datasets were generated for the construction of kinetic models of metabolism.
  2. Accelerated parameterization of near-genome-scale metabolic models: This work involved the development of K-FIT, a robust and scalable platform for constructing predictive models of metabolism for E. coli using large-scale metabolomic and fluxomic datasets.
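
To illustrate the basic idea behind inferring fluxes from stable-isotope labeling (item 1 above), here is a minimal sketch for a single branch point. It is a toy example, not the genome-scale method described above: it assumes a product made by two converging pathways, A and B, where each pathway alone would give a known labeled fraction of the product under a given tracer. The function name `estimate_split` and all numbers are hypothetical.

```python
def estimate_split(measured, frac_a, frac_b):
    """Least-squares estimate of the fraction f of product made via pathway A.

    Assumed model for each tracer experiment i:
        measured[i] = f * frac_a[i] + (1 - f) * frac_b[i]
    where frac_a[i] and frac_b[i] are the labeled fractions the product
    would show if it came only from pathway A or only from pathway B.
    """
    num = 0.0
    den = 0.0
    for m, a, b in zip(measured, frac_a, frac_b):
        d = a - b              # how strongly this tracer separates A from B
        num += d * (m - b)
        den += d * d
    if den == 0.0:
        raise ValueError("tracers cannot distinguish the two pathways")
    f = num / den              # closed-form minimizer of the squared error
    return min(1.0, max(0.0, f))  # a flux fraction must lie in [0, 1]


# Two hypothetical tracer experiments generated with a true split of 0.7.
frac_a = [0.9, 0.2]
frac_b = [0.1, 0.6]
measured = [0.7 * a + 0.3 * b for a, b in zip(frac_a, frac_b)]
split_estimate = estimate_split(measured, frac_a, frac_b)
```

A genome-scale analysis replaces this single linear balance with nonlinear labeling balances over the whole network, which is where the computational cost mentioned above comes from.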

Postdoctoral Projects (ongoing work)

  1. Identification of biologically relevant models of metabolism using transcriptomics and CRISPR screen data: Although various algorithms exist for building cell-type-specific genome-scale metabolic models using transcriptomics data, these extracted models fail to emulate gene dispensability effects, leading to over- or under-estimation of pathway usage by the organism. CRISPR screens report the sensitivity of the cell’s physiological state to genetic perturbations and provide a high-confidence list of inactive biological processes to improve the quality of constructed metabolic models.
  2. Biological characterization of producer CHO cell lines: Understanding changes in cell state and metabolism in response to reactor conditions is critical to optimizing and controlling a bioprocess. In this ongoing work, we compare and contrast the metabolic and transcriptomic characteristics of selected drug-producing clones with their parent pool cell lines to elucidate key differences in gene expression and pathway usage that contribute to increased antibody production in producer clones.
  3. Development of a multiscale bioreactor model for CHO bioprocessing: This ongoing work involves the construction of a bioreactor model for antibody production using CHO cells that interfaces reactor conditions with cell metabolism using parameterized boundary conditions. The model itself will be applied to process optimization and predictive control.
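
As a rough illustration of how reactor conditions can be interfaced with cell metabolism through a parameterized boundary condition (item 3 above), the following sketch simulates a batch culture with forward-Euler integration. It is not the model described above: the cell is reduced to a single yield coefficient rather than a metabolic network, the saturating uptake law and growth-independent productivity are assumptions, and every name and parameter value is hypothetical and unitless for illustration.

```python
def glucose_uptake_bound(s, q_max, k_s):
    """Assumed boundary condition: specific glucose uptake saturates with reactor glucose s."""
    return q_max * s / (k_s + s)


def simulate_batch(x0=0.2, s0=30.0, q_max=0.4, k_s=1.0,
                   y_xs=0.1, q_p=0.5, t_end=150.0, dt=0.05):
    """Forward-Euler simulation of biomass x, glucose s, and product p in a batch reactor."""
    x, s, p = x0, s0, 0.0
    history = [(0.0, x, s, p)]
    n_steps = int(round(t_end / dt))
    for i in range(1, n_steps + 1):
        # Reactor scale -> cell scale: glucose level sets the uptake rate.
        q_s = glucose_uptake_bound(s, q_max, k_s)
        # Lumped stand-in for the cell model: growth proportional to uptake.
        mu = y_xs * q_s
        # Cell scale -> reactor scale: update the reactor balances.
        dx = mu * x * dt
        ds = -q_s * x * dt
        dp = q_p * x * dt      # assumed constant specific productivity
        x, s, p = x + dx, max(s + ds, 0.0), p + dp
        history.append((i * dt, x, s, p))
    return history


history = simulate_batch()
t_final, x_final, s_final, p_final = history[-1]
```

In a fuller multiscale model, the value returned by `glucose_uptake_bound` would constrain the exchange reaction of a metabolic model at each time step instead of feeding a fixed yield.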

Select Publications

  1. Gopalakrishnan, S., & Maranas, C. D. (2015). 13C metabolic flux analysis at a genome-scale. Metab Eng, 32, 12-22. doi:10.1016/j.ymben.2015.08.006
  2. Gopalakrishnan, S., Pakrasi, H. B., & Maranas, C. D. (2018). Elucidation of photoautotrophic carbon flux topology in Synechocystis PCC 6803 using genome-scale carbon mapping models. Metab Eng, 47, 190-199. doi:10.1016/j.ymben.2018.03.008
  3. Gopalakrishnan, S., Dash, S., & Maranas, C. D. (2020). K-FIT: An accelerated kinetic parameterization algorithm using steady-state fluxomic data. Metab Eng. doi:10.1016/j.ymben.2020.03.001

Future Research Plan

E. coli is a preferred platform for producing non-glycosylated biologics due to easy cultivation and the availability of robust expression systems. However, the natural cellular objective of E. coli strains is to maximally channel resources towards cell growth, which goes against the process objective of titer maximization. Accurate predictive models capable of navigating the complexities of biological systems can identify cellular states conducive to product titer maximization and greatly accelerate the development of therapeutic-specific production platforms. As a well-studied model organism with available genome-scale models, large-scale kinetic models, and a large library of multi-omics datasets, E. coli is a strong candidate organism for the development of large-scale modeling frameworks. Motivated by this, I would like to expand on my previous work with multi-omics data integration (K-FIT algorithm) and incorporate intracellular signaling, gene regulation, and the protein secretory pathway to construct a detailed mechanistic model for a producer E. coli. This involves a four-phase plan consisting of: (i) Integrating the protein secretory pathway with the current genome-scale model for E. coli, (ii) Reconstructing and overlaying the signaling and gene regulatory networks with the genome-scale metabolic model for E. coli, (iii) Comparing and contrasting the transcriptomic, fluxomic, and metabolomic characteristics of producer and wild-type E. coli cells in the context of the integrated model, and (iv) Identifying appropriate environmental triggers to induce a state shift from growth to production in a bioreactor.

Teaching Interests

I served as a Teaching Assistant for the core Chemical Engineering course “Process Heat Transfer” in the Fall semester of 2016. My responsibilities included organizing recitation sessions and grading homework and exams for a class of 135 students. As a future faculty member, I am interested in teaching a specialized graduate-level course in Applied Systems Biology and Metabolic Modeling that is tied to my research program, a more general course in Mathematical Modeling Techniques at both the undergraduate and graduate levels, and an introductory undergraduate course in Material and Energy Balances.