(100c) Dynamic Modelling As a Tool for Increasing Single-Chain Antibody Fragment Specific Productivity in Pichia Pastoris

Royle, K. - Presenter, Imperial College London
Leak, D., University of Bath

Pichia pastoris is a commonly used expression host for heterologous protein production, predominantly because it is amenable to genetic manipulation and can grow to high cell densities in cheap culture media. While considerable yields can be achieved in this way, the specific productivity is relatively low. Consequently, the full impact of this host on industrial biotechnology has not yet been realised.

Previous studies to increase productivity have focused either on bioengineering, which targets gene and strain characteristics, or on fermentation conditions. Whereas the latter have been subject to multiple systematic multivariate optimisations with computational modelling techniques, the former has not. The majority of bioengineering studies have targeted one factor in isolation and, despite using comparable strategies, have reported variable outcomes. Here, a dynamic computational modelling approach has been taken to understand how the factors interact and to develop a global optimisation strategy.

Initially, a deterministic single-cell model was derived recapitulating the essential features of protein production: transcription and translation were modelled with mass action kinetics, and the folding of nascent protein into its three-dimensional conformation with Michaelis-Menten kinetics based on the foldase Pdi. Additionally, an endoplasmic reticulum (ER) stress response, the unfolded protein response (UPR), was included, as it is well documented that this is activated during heterologous protein production. In essence, too much unfolded protein in the ER is detected by the receptor Ire1, which triggers a cascade of reactions concluding in an upregulation of proteins to counteract stress. Although this pathway has been modelled in yeast once before, our work has expanded both the breadth and the kinetics. Specifically, stress occurs when the chaperone Kar2 dissociates from Ire1 to bind unfolded protein, allowing excess unfolded protein to bind Ire1. This activates the endonuclease domain of Ire1, which subsequently splices HAC1 mRNA, allowing translation of this transcription factor. The protein-protein interactions have been modelled with equilibrium kinetics, and the splicing of HAC1 mRNA with Michaelis-Menten kinetics. The transcription factor Hac1 translocates to the nucleus, and the upregulation of KAR2, PDI, HAC1 and E3 has been modelled with Michaelis-Menten kinetics. Finally, the ER-associated degradation pathway (ERAD), which acts to remove misfolded protein from the ER and is intertwined with the UPR, has been incorporated, a unique feature that makes this model more comprehensive than those previously published. Specifically, the binding of Kar2 to misfolded protein has been modelled with equilibrium kinetics and the subsequent degradation with Michaelis-Menten kinetics based on E3, the rate-limiting step in the ERAD pathway.
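As an illustration of the kinetic forms named above, the core of such a model can be sketched in a few lines of Python. This is a minimal sketch only, not the authors' model: it covers mass-action transcription and translation plus Michaelis-Menten folding by Pdi, and omits the UPR and ERAD entirely. All parameter names and values (`k_tx`, `d_m`, `k_tl`, `k_cat`, `K_m`, `pdi`) are hypothetical placeholders in arbitrary units, and the fixed-step Runge-Kutta integrator is an assumption chosen to keep the example dependency-free.

```python
# Minimal sketch: mass-action transcription/translation and Michaelis-Menten
# folding by Pdi. Parameter names and values are hypothetical placeholders.

def derivatives(state, p):
    """Return d/dt of (mRNA, unfolded protein, folded protein)."""
    mrna, unfolded, folded = state
    # Michaelis-Menten folding, with Vmax proportional to the Pdi level
    fold_rate = p["k_cat"] * p["pdi"] * unfolded / (p["K_m"] + unfolded)
    d_mrna = p["k_tx"] * p["gene_copies"] - p["d_m"] * mrna  # mass action
    d_unfolded = p["k_tl"] * mrna - fold_rate                # mass action in
    d_folded = fold_rate
    return (d_mrna, d_unfolded, d_folded)


def rk4_step(state, dt, p):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(x + h * dx for x, dx in zip(s, k))

    k1 = derivatives(state, p)
    k2 = derivatives(shifted(state, k1, dt / 2), p)
    k3 = derivatives(shifted(state, k2, dt / 2), p)
    k4 = derivatives(shifted(state, k3, dt), p)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(p, t_end=200.0, dt=0.01):
    """Integrate from an empty cell and return the final state."""
    state = (0.0, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, p)
    return state


# Hypothetical parameter set (arbitrary units); single gene copy
PARAMS = {
    "gene_copies": 1.0,
    "k_tx": 2.0,
    "d_m": 0.1,
    "k_tl": 0.5,
    "k_cat": 1.0,
    "pdi": 20.0,
    "K_m": 5.0,
}

if __name__ == "__main__":
    mrna, unfolded, folded = simulate(PARAMS)
    print(f"mRNA={mrna:.3f} unfolded={unfolded:.3f} folded={folded:.1f}")
```

With these placeholder values the translation flux stays below the maximum folding rate, so unfolded protein settles to a steady level; raising `pdi` lowers that level, which is the kind of interaction the full model is meant to explore.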

Whilst simulations of the model proved qualitatively accurate, a lack of literature values hindered quantitative accuracy. Consequently, we sought to construct and characterise strains of P. pastoris producing a heterologous protein, and single-chain antibody fragments (scFvs) were chosen for this purpose. Although monoclonal antibodies are currently the dominant biopharmaceutical in manufacture, smaller antibody fragments such as scFvs are expected to feature more heavily in the future. While mammalian cells are most frequently used to produce full antibodies, the smaller fragments are better suited to a microbial host.

Therefore, novel P. pastoris strains have been constructed producing two scFvs: BC1, which targets new blood vessels of tumours; and MFE23, which targets gastrointestinal cancers. These genes were cloned into the commercial vector pPICZαA with C-terminal polyhistidine and myc tags, and subsequently integrated into the AOX1 locus. Expression was driven by the AOX1 promoter, and secretion conferred by the N-terminal α-factor secretion signal from Saccharomyces cerevisiae. As there is biological variation in specific productivity, different clones of the same strain can produce varying amounts of scFv. To include this variation in the model and help identify relevant factors, both low and high yield clones were isolated. As multiple copies of the heterologous gene can increase productivity, the low and high yield clones were subjected to Southern blot analysis to confirm that they were single copy integrants, allowing the gene copy number in the model to be set to one.

A major source of quantitative inaccuracy in the model was the concentration of two proteins in the ER, the chaperone Kar2 and the foldase Pdi. These proteins were identified as having a large influence over yield; however, literature values varied more than three-fold. We therefore sought to quantify these accurately and developed an LC-MS/MS method to do so. Initially, an in silico tryptic digestion of the proteins was conducted and the peptides ranked according to their enhanced signature peptide prediction score, a computational method which evaluates their physicochemical properties. Multiple reaction monitoring (MRM) methods were written to target the highest-ranked peptides, and these were experimentally validated with analysis of a tryptic digest of Escherichia coli BL21(DE3) strains overexpressing the proteins. Isotopically labelled versions were subsequently expressed, purified and used to optimise the MRM in terms of collision energy and collision cell exit potential. As the proteins are difficult to express at high yields, however, isotopically labelled peptides were purchased from JPT Peptide Technologies for the experiment.
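The abstract does not spell out how the labelled peptides were used for quantification. A common approach with isotopically labelled standards is stable isotope dilution, in which the endogenous (light) amount is estimated from the ratio of light to heavy MRM peak areas and the known amount of heavy peptide spiked in. The sketch below assumes that approach; the function name, argument names and units are hypothetical.

```python
# Sketch of stable isotope dilution quantification. This assumes the light
# and heavy peptides respond identically in the mass spectrometer; the
# function and its argument names are illustrative, not from the abstract.

def endogenous_amount_fmol(light_area, heavy_area, heavy_spike_fmol):
    """Estimate endogenous peptide (fmol) from the light/heavy area ratio."""
    if heavy_area <= 0:
        raise ValueError("heavy_area must be positive")
    return light_area / heavy_area * heavy_spike_fmol


if __name__ == "__main__":
    # Made-up peak areas: light signal is twice the heavy signal,
    # with 50 fmol of heavy standard spiked into the digest.
    print(endogenous_amount_fmol(3.0e5, 1.5e5, 50.0))
```

In practice several transitions per peptide are usually monitored and averaged, and a calibration curve may replace the single-point ratio used here.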

As the UPR upregulates Kar2 and Pdi, the model required experimental data for the baseline level in unstressed cells, the higher level in stressed cells and the time over which this occurs. As transcription of the SCFV genes is specifically induced by changing the carbon source, we have devised an experimental system to generate this profile. Specifically, uninduced strains at an OD600 of 1 were incubated at 30°C, 250 rpm for 24 hours. Samples were taken at 22 and 23 hours, at which point the culture is in stationary phase, allowing evaluation of the baseline Kar2 and Pdi values. At 24 hours, scFv expression was induced and further samples taken at 0, 2, 4 and 6 hours after induction. These data provide a complete picture of the unstressed baseline, the activation of stress and the final maximum the system can achieve.

Simultaneously, the samples were analysed for the magnitude of the UPR and ERAD pathways. Traditionally, activation of the UPR has been measured using reverse transcription quantitative PCR (RT-qPCR) targeting KAR2, PDI and also HAC1 mRNA, and this strategy was implemented here. Indeed, for Kar2 and Pdi it also allowed for interesting comparisons between transcriptional and protein upregulation, as it is widely appreciated that they do not always equate. Therefore, primers were designed and optimised to target KAR2, PDI and HAC1 in conjunction with those against actin, an internal control for the experiment. Upregulation was calculated with the Pfaffl method, relative to the wild-type GS115. Activation of ERAD was assessed in a similar way; however, this pathway is poorly described compared to the UPR. Primers were designed to target major contributors, specifically a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (UBC7), two ubiquitin protein ligases (HRD1 and DOA10) and an ATPase (CDC48). Interestingly, some studies have suggested that the ubiquitin protein ligases may be substrate specific; consequently, analysing both HRD1 and DOA10 allowed an insight into whether this is the case for scFvs in P. pastoris.
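For readers unfamiliar with the Pfaffl method, it expresses relative expression as the efficiency-corrected ratio of a target gene to a reference gene, here actin, between a sample and a control such as the wild-type GS115. The sketch below shows the calculation; the function and argument names are illustrative and the Ct values are made up. Efficiencies are given as fold amplification per cycle, so 2.0 corresponds to perfect doubling.

```python
# Sketch of the Pfaffl relative expression ratio:
#   ratio = E_target ** (Ct_target_control - Ct_target_sample)
#           / E_ref ** (Ct_ref_control - Ct_ref_sample)
# Names and example Ct values are illustrative only.

def pfaffl_ratio(e_target, ct_target_control, ct_target_sample,
                 e_ref, ct_ref_control, ct_ref_sample):
    """Efficiency-corrected expression of target relative to reference."""
    delta_target = ct_target_control - ct_target_sample
    delta_ref = ct_ref_control - ct_ref_sample
    return (e_target ** delta_target) / (e_ref ** delta_ref)


if __name__ == "__main__":
    # Target crosses threshold 3 cycles earlier in the sample than in the
    # control, while the reference gene is unchanged: 2**3 / 2**0 = 8-fold.
    print(pfaffl_ratio(2.0, 25.0, 22.0, 2.0, 18.0, 18.0))
```

When both efficiencies equal 2.0 the expression reduces to the familiar 2^(-ΔΔCt) calculation; the Pfaffl form matters when primer efficiencies differ between target and reference.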

Whilst the above data refined model accuracy, validation was required. Many studies of the UPR chemically induce stress to test hypotheses. This, however, produces undesirable pleiotropic effects. Here, we constructed strains which overexpress Kar2 and Pdi both individually and together. The increase in availability of these proteins provides the necessary perturbation to the system, yet they are also expected to have beneficial effects on yield, imparting added industrial relevance. Specifically, each gene was cloned into the pIB2 vector under the control of the GAP promoter, with an AOX1 transcription terminator and homology to the his4 locus for integration. This approach resulted in a constitutive upregulation of the proteins for incorporation into the model. Simultaneous upregulation required a more developed strategy. Ideally, the same promoter would drive both genes; however, ongoing research suggests that P. pastoris frequently recombines between large regions of homology, preventing the use of two copies of the promoter. To circumvent this, both genes were expressed as a single bicistronic mRNA with an internal ribosome entry site (IRES) sequence joining the two. IRES sequences allow cap-independent translation such that the second gene in the bicistronic mRNA can be translated in addition to the first. Interestingly, there is some evidence that P. pastoris may be subject to translational downregulation during ER stress. If this is the case, cap-independent translation will not be hindered. Therefore, KAR2 was strategically cloned into pIB2 after PDI and under control of an IRES sequence from Saccharomyces cerevisiae, because Kar2 is better able to relieve ER stress. Further RT-qPCR and mass spectrometry analysis confirmed the perturbation and provided a more thorough analysis of the use of IRES sequences in P. pastoris, a technique used only once previously.

In conclusion, we have developed a deterministic model which mimics scFv production and the initiation of ER stress pathways. This model has been tested with constructs that over-express Kar2 and Pdi, both validating its predictive capabilities and generating industrially relevant constructs. Finally, it has been simulated to predict the optimal strategy for increasing specific productivity in P. pastoris, constituting an integrated modelling and experimental approach to solving a life sciences problem.