(314a) On the Systematic Use of Metabolomics for the Development of Large-Scale Kinetics for Industrial Bioreactors | AIChE

Authors 

Kokossis, A. - Presenter, National Technical University of Athens
Mexis, K., National Technical University of Athens
Xenios, S., National Technical University of Athens
The paper addresses major challenges in the development of Digital Twins for Design-Build-Test-Learn (DBTL) cycles as they are launched to produce biocatalysts for industrial biotechnology applications. Conventional approaches continue to regress century-old models (e.g. Michaelis-Menten and Monod) that disregard biocatalyst dynamics and quite often fail to scale up bioreactors reliably. Digital Twins are instead challenged to capture a wide range of heterogeneous data available from bioreactor operation, metabolomics, and other omics. Once systematically captured, such data can be used to produce custom-made large-scale kinetics with direct reference to the biocatalyst reactions. They can also be used to reverse-engineer decisions for the better design of biocatalysts.
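
For reference, the two classical rate laws named above take the following textbook forms. The symbols are the conventional ones and are not defined in this abstract: $v$ is the reaction rate, $[S]$ or $S$ the substrate concentration, $V_{\max}$ and $\mu_{\max}$ the maximum rate and maximum specific growth rate, and $K_M$ and $K_S$ the half-saturation constants.

```latex
% Michaelis-Menten rate law for an enzyme-catalysed reaction
v = \frac{V_{\max}\,[S]}{K_M + [S]}

% Monod law for the specific growth rate of a culture
\mu = \frac{\mu_{\max}\,S}{K_S + S}
```

Both expressions lump the behaviour of the whole biocatalyst into two fitted parameters, which is why they carry no direct reference to the underlying reaction pathways.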

This study proposes a systematic and generic approach, part of a collective effort (Bioindustry 4.0) supported by IBISBA (https://www.ibisba.eu/), the European Infrastructure on Industrial Biotechnology that brings together researchers from process systems engineering, synthetic biology, and computer science. Our approach introduces a systematic and efficient method for finding biosynthetic pathways for the metabolic engineering of organisms to produce valuable chemicals. Designing biosynthetic pathways involves extracting novel metabolic pathways from complex metabolic networks that integrate known biological processes and predicted biotransformations. The objective of pathway search tools is to generate, in an automated manner, biologically relevant metabolic pathways that are both informative and easily comprehensible. To facilitate this task, we propose constructing searchable graph representations of metabolic networks.

These pathways will subsequently be employed to generate large-scale kinetic models and so produce self-regulated digital twins for the small-scale, autonomous, on-site industrial biotechnology reactors envisaged in Bioindustry 4.0. The development of digital twins will support the autonomous operation of biochemical reactors coupled with process control and online optimization (closed systems), as well as process and strain engineering. The purpose is to compute process efficiency and to guide strain engineering so as to improve the efficiency and performance of the biochemical process and to tailor product portfolios. AI technology will be explored to support advanced, model-based control and online process optimization and to connect process and strain engineering. All processes will be simulated using the SPSE bioreactor modelling software for multiple industrial use cases.
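
As a minimal sketch of what a searchable graph representation of a metabolic network could look like, the Python example below stores a toy network as an adjacency list and extracts one shortest pathway by breadth-first search. The network, the metabolite names, and the function `shortest_pathway` are illustrative assumptions and heavily simplified; they are not the authors' tool or data.

```python
from collections import deque

# Hypothetical toy network (assumption): each metabolite maps to the
# metabolites reachable from it. Several real reaction steps are lumped
# into single edges for brevity.
TOY_NETWORK = {
    "glucose": ["g6p"],
    "g6p": ["e4p", "pep"],
    "e4p": ["dahp"],
    "pep": ["dahp", "pyruvate"],
    "dahp": ["dhs"],
    "dhs": ["shikimate", "pca"],
    "pca": ["catechol"],
    "catechol": ["muconate"],
    "shikimate": [],
    "pyruvate": [],
    "muconate": [],
}


def shortest_pathway(network, source, target):
    """Breadth-first search: return one shortest metabolite path, or None."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in network.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    route = shortest_pathway(TOY_NETWORK, "glucose", "muconate")
    print(" -> ".join(route))
```

Breadth-first search is used here only because it is the simplest way to return a shortest route; a practical pathway search tool would also need to rank candidate routes by biological relevance.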
By integrating multi-omics data and established frameworks, our approach constructs large-scale kinetic models that capture cellular dynamics and provide more flexibility during bioprocess optimization. These models can simulate cell behavior and describe experimental kinetics, making them a valuable tool for constructing dataspace communities of in-silico curated kinetic models for different metabolic systems. These dataspaces will bring together relevant data and frameworks to facilitate data pooling and sharing, enabling the development of more robust large-scale kinetic models for digital twins in industrial biotechnology applications. Digital twinning of cellular processes (translation, metabolism) will be developed to capture the key features of biocatalyst function and will be embedded and tested within the bioreactor process twin. By incorporating diverse data sources and following established guidelines, our aim is to improve the accuracy and reliability of the generated kinetic models and accelerate the development of robust strains with novel metabolic pathways. This process will produce in-silico curated kinetic data that can integrate different stages of the Design-Build-Test-Learn (DBTL) cycle, enabling the simultaneous design of biocatalysts and process engineering.

Engineering kinetics are currently derived by regression on experimental data, with limited reference to the underlying reaction pathways. The production of large-scale metabolic kinetic models is further hindered by uncertainty in the kinetic parameters and by the difficulty of producing physiologically relevant and robust kinetic models. The ORACLE framework [1] instead offers an attractive environment for generating populations of large-scale (curated) kinetic models, and a platform for checking in-silico kinetic models for physiological relevance and stability. Such models are valuable in scale-up studies, in the design and optimization of bioreactors, and in manipulating pathways as required to increase yields and selectivity.

Using the ORACLE framework, we are able to generate large populations of kinetic models, but most of them are not stable. This paper presents a systematic approach that combines deterministic methods, data analytics and machine learning to accelerate the generation of realizable kinetics that could serve as a basis for connecting the dynamics of the cell with the dynamics of the process. Because only a small percentage of the models are stable, we deployed several ML methods (black-box models) trained on the saturation and thermodynamic displacement parameters of curated kinetic models to predict whether an inferred kinetic model is stable. A CART decision tree (surrogate model) was then trained to approximate the predictions of the black-box model and to extract rules that constrain the kinetic parameters of some critical enzymes, reducing the initial sampling space of the ORACLE framework and, as a result, the uncertainty in the model analysis.
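
The surrogate step can be illustrated with a deliberately small sketch: a depth-one CART-style split (a decision stump chosen by Gini impurity) fitted to the predictions of a black-box classifier. Everything here is an assumption made for illustration: `black_box_is_stable` is a made-up stand-in for a trained classifier, the two features stand for one enzyme's saturation and thermodynamic displacement on a 0 to 1 scale, and the samples are random. It is not the authors' model or data.

```python
import random


def black_box_is_stable(saturation, displacement):
    """Hypothetical stand-in for a trained black-box classifier (assumption)."""
    return 0.6 * saturation + 0.1 * displacement < 0.35


def gini(labels):
    """Gini impurity of a list of boolean labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best single split."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        # Candidate thresholds are midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[j] <= thr]
            right = [y for r, y in zip(rows, labels) if r[j] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, thr, score)
    return best


# Sample the (hypothetical) parameter space and label it with the black box.
rng = random.Random(0)
samples = [(rng.random(), rng.random()) for _ in range(200)]
predictions = [black_box_is_stable(s, d) for s, d in samples]

# Fit the surrogate to the black-box predictions, not to ground-truth labels.
feature, threshold, impurity = best_split(samples, predictions)
names = ("saturation", "displacement")
print(f"surrogate rule: {names[feature]} <= {threshold:.3f} "
      f"(weighted Gini {impurity:.3f})")
```

A rule of this form is what allows the sampling space to be narrowed: instead of drawing the parameter from its full range, later sampling can be restricted to the side of the threshold that the surrogate associates with stable models. A full CART tree would apply the same split search recursively to each branch.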

The work is demonstrated with the production of muconic acid in S. cerevisiae, which is achieved by shunting the shikimate pathway. The ORACLE framework consists of two reduction stages: the first ensures that the generated kinetic models are physiologically relevant, and the second checks each model's stability. The conventional method produced 370 physiologically relevant models, of which 70 were stable (19.2%), whereas our approach increased the share of acceptable solutions to 97.7%.
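
The abstract does not state how stability is tested. A common criterion for kinetic models, assumed here, is local stability of the steady state: every eigenvalue of the Jacobian must have a negative real part. The sketch below applies that criterion to a small population of hypothetical 2x2 Jacobians and reports the stable fraction; the matrices and function names are illustrative, not taken from the study.

```python
import cmath


def eigenvalues_2x2(jac):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = jac
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return ((trace + root) / 2.0, (trace - root) / 2.0)


def is_locally_stable(jac):
    """True if every eigenvalue has a strictly negative real part."""
    return all(ev.real < 0.0 for ev in eigenvalues_2x2(jac))


def stable_fraction(jacobians):
    """Share of models in the population that pass the stability check."""
    flags = [is_locally_stable(j) for j in jacobians]
    return sum(flags) / len(flags)


# Hypothetical population of model Jacobians (assumption, for illustration).
population = [
    [[-1.0, 0.5], [0.2, -2.0]],   # two negative real eigenvalues
    [[0.3, 1.0], [0.0, -1.0]],    # one positive eigenvalue
    [[-0.5, -2.0], [2.0, -0.5]],  # complex pair with negative real part
]

if __name__ == "__main__":
    print(f"stable fraction: {stable_fraction(population):.3f}")
```

For genome-scale models the Jacobian is far larger and a numerical eigenvalue routine would replace the closed-form 2x2 formula, but the acceptance test is the same.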

The uncertainty in the model analysis is reduced through the use of machine learning. Using machine learning classification and explainability techniques, we were able to raise the stability index of the generated models, which leads us to postulate that constraining such systems as much as possible yields more feasible results.

Acknowledgments

The authors acknowledge support from the EPFL team of Prof. Hatzimanikatis, especially Ljubisa Miskovic, who assisted in applying the ORACLE methodology. Financial support from the project DEBONAIR (K.A. 61511600), funded by the Foundation for Research & Innovation, is gratefully acknowledged.

References

[1] Miskovic, L., & Hatzimanikatis, V. (2010). Production of biofuels and biochemicals: in need of an ORACLE. Trends in Biotechnology, 28(8), 391–397.

[2] Xenios, S., Weilandt, D., Hatzimanikatis, V., Miskovic, L., & Kokossis, A. (2022). On the integration of process engineering with metabolomics for the production of muconic acid: the case for Saccharomyces cerevisiae. Computer Aided Chemical Engineering, 541–546.

[3] Andreozzi, S., Miskovic, L., & Hatzimanikatis, V. (2016). iSCHRUNK – In Silico Approach to Characterization and Reduction of Uncertainty in the Kinetic Models of Genome-scale Metabolic Networks. Metabolic Engineering, 33, 158–168.