Genome-Scale Strain Designs Based on Regulatory Minimal Cut Sets | AIChE

Authors 

Klamt, S., Max Planck Institute for Dynamics of Complex Technical Systems
von Kamp, A., Max Planck Institute for Dynamics of Complex Technical Systems




Genome-scale strain designs based on regulatory minimal cut sets
R. Mahadevan, A. von Kamp, S. Klamt
Recent advances in the genome-wide characterization of cellular systems have made it possible to catalog a significant fraction of the metabolic reactions in the cell. These advances, coupled with the development of metabolic modeling methods, have enabled the construction of detailed models for many industrially relevant microbial hosts, including Escherichia coli and Saccharomyces cerevisiae. In parallel, the increasing price and volatility of petroleum-based feedstocks have stimulated the use of biological processes for renewable chemical synthesis. Together, these two factors have further motivated the development of computational algorithms that facilitate the engineering of metabolism for renewable chemical synthesis.
Several algorithms have been developed for computational strain design, including a series of bilevel optimization algorithms such as OptKnock, OptStrain, OptGene, OptReg, CosMos and OptORF [1]. In these methods, the inner optimization problem is a genome-scale metabolic model with growth rate maximization as the objective, whereas the outer optimization problem identifies the specific gene deletions or additions (represented as binary variables) that maximize the flux to the target chemical. Typically, duality theory is used to convert the bilevel optimization problem into a single-level mixed-integer linear optimization problem containing both the primal and dual versions of the growth maximization linear program. However, solving the bilevel problem for genome-scale models can become computationally prohibitive for more than about six modifications, and even more so if alternate intervention strategies are to be enumerated. We have also previously developed an integer-free optimization approach based on successive linear programming that was shown to be highly efficient (EMILiO [2]). However, most of these algorithms use growth rate maximization as the cellular objective and rely on integer variables, which limits the number of simultaneous modifications because of the combinatorial explosion of the search space. Model-based strain designs have been experimentally validated for lactate, butanediol, malonyl-CoA and fatty acid production in E. coli and for vanillin production in S. cerevisiae, highlighting the value of these in silico strain designs for metabolic engineering.
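The bilevel structure described above can be written compactly. The following is a generic OptKnock-style sketch, not the exact formulation of any one of the cited methods; the symbols are assumptions introduced here for illustration: S is the stoichiometric matrix, v the flux vector, y_j a binary variable that is 0 when reaction j is deleted, v_j^min and v_j^max the flux bounds, and K the maximum number of deletions allowed.

```latex
\begin{aligned}
\max_{y} \quad & v_{\text{chemical}} \\
\text{s.t.} \quad & v \in \arg\max_{v} \Big\{\, v_{\text{biomass}} \;:\; S v = 0,\;\;
      v_j^{\min}\, y_j \le v_j \le v_j^{\max}\, y_j \;\; \forall j \,\Big\}, \\
& \sum_{j} (1 - y_j) \le K, \qquad y_j \in \{0, 1\} \;\; \forall j
\end{aligned}
```

Replacing the inner maximization by its primal and dual constraints together with a strong-duality equality yields the single-level mixed-integer linear program mentioned in the text.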
An alternative approach that does not rely on a cellular objective function is the use of minimal cut sets: minimal combinations of reaction or gene deletions that eliminate all flux distributions with a yield lower than the desired value. Previously, such minimal cut sets were identified after calculating the elementary modes of a given metabolic network [3]. However, a more recent study showed that the minimal cut sets of the primal network are equivalent to the elementary modes of a dual network [4]. This advance enabled the direct identification of minimal cut sets using only the network stoichiometry and constraints. Recently, this method, coupled with binary indicator variables that represent whether a continuous variable is non-zero, was used to identify thousands of minimal cut sets for intervention problems in genome-scale metabolic networks [5]. Once the minimal cut sets have been identified, additional constraints that require, for example, biomass synthesis with a minimum yield can be used to filter the set and identify constrained minimal cut sets (cMCS) fulfilling the desired behavior. The findings in [5] highlighted the ability of the cMCS approach to identify novel deletion strategies that are not typically found by growth-coupled bilevel optimization. However, one current restriction of the cMCS approach is that it identifies only deletions.
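The relationship between unwanted modes and cut sets can be illustrated with a small brute-force sketch. It assumes a hypothetical toy network whose elementary modes are already known and are given as sets of reaction names (R1 to R5 are made up for illustration). It follows the older, elementary-mode-based route rather than the dual-network enumeration of [4, 5], which is what makes genome-scale problems tractable. A cut set must hit every target (low-yield) mode, and a constrained cut set must additionally leave at least one desired mode untouched.

```python
from itertools import combinations


def minimal_cut_sets(target_modes, reactions, max_size=None):
    """Enumerate minimal hitting sets of the target modes by brute force."""
    max_size = max_size or len(reactions)
    found = []
    # Candidates are visited by increasing size, so every smaller
    # minimal cut set is already in `found` when a candidate is tested.
    for size in range(1, max_size + 1):
        for candidate in combinations(sorted(reactions), size):
            cut = frozenset(candidate)
            # A superset of a known cut set is not minimal.
            if any(prev <= cut for prev in found):
                continue
            # A cut set must remove at least one reaction from every target mode.
            if all(cut & mode for mode in target_modes):
                found.append(cut)
    return found


def constrained_mcs(cut_sets, desired_modes):
    """Keep only cut sets that leave at least one desired mode intact."""
    return [c for c in cut_sets if any(not (c & m) for m in desired_modes)]


# Hypothetical toy data: each mode is the set of reactions it uses.
reactions = {"R1", "R2", "R3", "R4", "R5"}
target_modes = [
    frozenset({"R1", "R2"}),
    frozenset({"R1", "R3"}),
    frozenset({"R2", "R4"}),
]
desired_modes = [frozenset({"R1", "R5"})]

mcs = minimal_cut_sets(target_modes, reactions)
cmcs = constrained_mcs(mcs, desired_modes)
```

In this toy case three minimal cut sets exist, but only the one that avoids R1 and R5 survives the filtering step, because the single desired mode needs both reactions.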
In this work, we have developed a new approach (cRegMCS) that extends the cMCS method to consider up- and downregulation of metabolic reactions along with gene deletions. To search for up- and downregulations, we modify the regulated reactions so that they produce pseudo-metabolites, which are then forced to be consumed either above a specific threshold (for upregulation) or below a specific threshold (for downregulation). We then add slack reactions that produce the pseudo-metabolite (for upregulation) or consume it (for downregulation) and use the cMCS method to identify deletions as before. If regulating a specific reaction helps eliminate the unwanted metabolic behaviors, the corresponding slack reaction is deleted. However, this method is limited to cases in which the level of flux regulation is known. Hence, we use a pre-processing step based on flux variability analysis to identify the ten reactions whose flux ranges are most reduced in the desired space after blocking the unwanted flux vectors. We then specify three levels of regulation for each of these reactions and identify the three reactions that are most frequently regulated. Subsequently, for each of these reactions, we specify much finer regulation levels, compute the cMCS and identify the regulation modifications that lead to the fewest interventions. We illustrate this approach by identifying strain designs for ethanol production; by combining regulation and deletions (cRegMCS), we find strategies that are significantly smaller than those obtained with cMCS alone. For example, in the case of E. coli, using cRegMCS and a threshold growth rate of 0.05 h⁻¹, we were able to identify several strategies with only three or four modifications and 333 strategies with five or fewer modifications, whereas cMCS requires at least seven deletions. By providing these minimal intervention sets, the cut-set-based methods expand the range and diversity of strain design strategies. Hence, cRegMCS, along with cMCS and other previously described strain design methods, is valuable for generating alternative strain designs and thereby increasing the choices available to metabolic engineers for experimental implementation. In addition, we have developed metrics for measuring the ability of strain designs to sustain synthesis of the desired product in the presence of perturbations in the regulated reactions. Combining such metrics with a range of strain designs can help prioritize the designs and maximize the chances of success during experimental implementation.
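The pseudo-metabolite construction can be sketched on a toy network. This is one possible reading of the description above, not the authors' implementation: the function names, the dense list-of-lists matrix, and the two-reaction network are all assumptions made for illustration. Each regulated reaction gains a pseudo-metabolite row, a drain column whose bounds carry the threshold, and a slack column. While the slack reaction exists, the regulated flux is unconstrained; once a cut set deletes the slack, the regulated flux must equal the drain flux and therefore obey the threshold.

```python
def add_regulation(S, bounds, rxn, threshold, kind):
    """Extend (S, bounds) with one pseudo-metabolite row and two columns.

    S      : list of rows (metabolites), each a list over reactions
    bounds : list of (lower, upper) per reaction; None means unbounded
    rxn    : column index of the regulated reaction
    kind   : "up" or "down"
    The new columns are appended in the order [drain, slack].
    """
    if kind not in ("up", "down"):
        raise ValueError("kind must be 'up' or 'down'")
    # Existing metabolites do not take part in the two new reactions.
    new_S = [row + [0, 0] for row in S]
    # Pseudo-metabolite: produced by the regulated reaction, consumed by the drain.
    pseudo = [0] * len(S[0]) + [-1, 0]
    pseudo[rxn] = 1
    if kind == "up":
        pseudo[-1] = 1                    # slack produces the pseudo-metabolite
        drain_bounds = (threshold, None)  # drain flux >= threshold
    else:
        pseudo[-1] = -1                   # slack consumes the pseudo-metabolite
        drain_bounds = (0, threshold)     # drain flux <= threshold
    new_S.append(pseudo)
    return new_S, list(bounds) + [drain_bounds, (0, None)]


def delete_reaction(bounds, idx):
    """Model a deletion by fixing the flux of reaction idx to zero."""
    out = list(bounds)
    out[idx] = (0, 0)
    return out


def is_feasible(S, bounds, v, tol=1e-9):
    """Check steady state (S v = 0) and flux bounds for one flux vector."""
    balanced = all(abs(sum(a * x for a, x in zip(row, v))) <= tol for row in S)
    in_bounds = all(
        (lo is None or x >= lo - tol) and (hi is None or x <= hi + tol)
        for x, (lo, hi) in zip(v, bounds)
    )
    return balanced and in_bounds


# Hypothetical toy network: R0 produces A, R1 consumes A.
S = [[1, -1]]
bounds = [(0, 10), (0, 10)]

# Upregulate R1 to at least 5; flux vectors are [R0, R1, drain, slack].
S_up, b_up = add_regulation(S, bounds, rxn=1, threshold=5, kind="up")
b_up_cut = delete_reaction(b_up, 3)  # a cut set that contains the slack

low_flux_with_slack = is_feasible(S_up, b_up, [2, 2, 5, 3])
low_flux_after_cut = is_feasible(S_up, b_up_cut, [2, 2, 2, 0])
high_flux_after_cut = is_feasible(S_up, b_up_cut, [6, 6, 6, 0])
```

With the slack present, a flux of 2 through R1 is still admissible because the slack tops up the pseudo-metabolite. After the slack is deleted, only flux vectors with R1 at or above the threshold remain, which is the upregulation constraint. The check only tests individual flux vectors; a real workflow would pass the augmented network to a cut set enumeration.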
References
1. Zomorrodi, A. R., Suthers, P. F., Ranganathan, S. & Maranas, C. D. Mathematical optimization applications in metabolic networks. Metab. Eng. 14, 672-686 (2012).
2. Yang, L., Cluett, W. R. & Mahadevan, R. EMILiO: a fast algorithm for genome-scale strain design. Metab. Eng. 13, 272 (2011).
3. Hädicke, O. & Klamt, S. Computing complex metabolic intervention strategies using constrained minimal cut sets. Metab. Eng. 13, 204-213 (2011).
4. Ballerstein, K., von Kamp, A., Klamt, S. & Haus, U. U. Minimal cut sets in a metabolic network are elementary modes in a dual network. Bioinformatics 28, 381-387 (2012).
5. von Kamp, A. & Klamt, S. Enumeration of smallest intervention strategies in genome-scale metabolic networks. PLoS Comput. Biol. 10, e1003378 (2014).