(665d) mRNA Half-life Predictor: An in silico tool for Metabolic Engineers

Gupta, S. T. P. - Presenter, Great Lakes Bioenergy Research Center
Gordon, G. C., University of Wisconsin Madison
Ramanathan, P., University of Wisconsin Madison
Pfleger, B., University of Wisconsin-Madison
Reed, J. L., University of Wisconsin-Madison

Half-life predictor: an in silico tool for metabolic engineers

Sanjan TP Gupta1,2, Gina C Gordon1,3, Parmeswaran Ramanathan4,
Brian F Pfleger1,2,3, and Jennifer L Reed1,2

and Biological Engineering, University of Wisconsin-Madison; 2Great
Lakes Bioenergy Research Center, Madison, WI; 3Microbial Doctoral
Training Program, University of Wisconsin-Madison; 4Electrical and
Computer Engineering, University of Wisconsin-Madison

The central dogma of biology describes how transcription and translation
convert the information encoded in DNA (genes) into functional proteins.
Quantifying and controlling the amount of mRNA, a key intermediate, would
therefore provide a valuable tool for controlling in vivo expression levels
of these proteins.

Here, we describe a machine-learning-based approach for predicting mRNA
half-lives in cyanobacteria, photosynthetic microbes that can convert CO2
into a variety of chemicals. A set of 28 sequence- and structure-based
features (such as GC content, predicted RBS strength, and minimum free energy
of RNA folding) was constructed for the 3,238 genes found in Synechococcus sp.
PCC 7002. Half-lives of the corresponding mRNA transcripts were measured using
a rifampicin-based transcription arrest assay and used as the target variable
to be predicted from the feature values. Analyzing the importance of the
features used to build the model revealed that stable transcripts have higher
normalized expression levels and higher translation rates, and are less likely
to be found in an operon. Thresholds of 1 min and 3 min were used to classify
an mRNA transcript as unstable (half-life below 1 min), moderately stable
(between 1 and 3 min), or highly stable (above 3 min). Decision-tree-based
models built to predict the transcript stability level (Fig. 1) achieved up to
55.32% accuracy (p-value < 0.001 compared with a random classifier, per a
chi-squared test).
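
The feature construction and class labeling described above can be
illustrated with a minimal Python sketch. It is not the authors' pipeline: it
computes only one of the 28 features (GC content) and applies the 1 min and
3 min thresholds from the abstract. The function names are hypothetical, and
the handling of half-lives exactly at a threshold is an assumption, since the
abstract does not say which class a boundary value falls into.

```python
# Thresholds stated in the abstract, in minutes.
UNSTABLE_MAX_MIN = 1.0
MODERATE_MAX_MIN = 3.0


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA or RNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def stability_class(half_life_min: float) -> str:
    """Map a measured half-life (minutes) to one of three stability classes.

    Assumption: a half-life exactly equal to a threshold is assigned to the
    more stable class. The abstract does not specify boundary handling.
    """
    if half_life_min < 0:
        raise ValueError("half-life must be non-negative")
    if half_life_min < UNSTABLE_MAX_MIN:
        return "unstable"
    if half_life_min < MODERATE_MAX_MIN:
        return "moderately stable"
    return "highly stable"
```

For example, gc_content("ATGC") evaluates to 0.5, and a transcript with a
2 min half-life would be labeled "moderately stable" under these assumptions.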

Such predictive models can help metabolic engineers design mRNA transcripts
with a desired level of stability (high or low), as applicable when
engineering metabolic pathways. This machine-learning-based in silico tool
adds to the growing repertoire of computational tools for engineering
cyanobacteria and other microbes for various synthetic biology and metabolic
engineering applications.
Keywords: machine learning, mRNA, sequence-to-stability

Figure 1 Confusion matrix showing the performance of the machine-learning-based
model for classifying a transcript as unstable, moderately stable, or highly
stable
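
As a rough illustration of how an overall accuracy figure is read off a
three-class confusion matrix like the one in Figure 1, the sketch below
computes accuracy and per-class recall. The counts are made up for
illustration and are not the values from the figure. The layout (rows are
true classes, columns are predicted classes) is also an assumption.

```python
LABELS = ["unstable", "moderately stable", "highly stable"]


def accuracy(matrix):
    """Fraction of all transcripts that lie on the diagonal (correct calls)."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_recall(matrix):
    """For each true class, the fraction of its transcripts predicted correctly.

    Classes with no transcripts are reported as None.
    """
    recalls = {}
    for i, label in enumerate(LABELS):
        row_total = sum(matrix[i])
        recalls[label] = matrix[i][i] / row_total if row_total else None
    return recalls


# Hypothetical counts, NOT taken from Figure 1.
# Rows: true class; columns: predicted class (assumed layout).
example = [
    [30, 15, 5],
    [10, 40, 10],
    [5, 15, 20],
]

overall = accuracy(example)
recalls = per_class_recall(example)
```

With three classes, a classifier that guesses uniformly at random would be
expected to reach about 33% accuracy, which is the kind of baseline a reported
accuracy is compared against.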