(627d) Using Graph Neural Networks with Genome Scale Metabolic Models to Predict Antimicrobial Resistance in E. coli | AIChE

Authors 

Daoutidis, P., University of Minnesota-Twin Cities
Azarin, S., University of Minnesota
As antimicrobial resistance (AMR) continues to grow exponentially among clinically relevant human pathogens, efforts to identify the exploitable biological mechanisms underpinning such resistance must keep pace. Identifying these mechanisms will aid the biologically informed design or selection of novel antimicrobial agents, and the abundance of both molecular and metabolic data lends itself to machine learning approaches. Graph neural networks (GNNs) have already been applied to drug discovery, where molecules that potentially possess antimicrobial properties are represented as graphs. We have instead focused on applying GNNs to the metabolic networks themselves.

A dataset consisting of 3,616 draft genome-scale metabolic models (GEMs) for various strains of E. coli was obtained and gap-filled via ModelSEED. For each GEM, the stoichiometric matrix was transformed into either a reaction adjacency graph (reactions as nodes, with edges between reactions that share metabolites) or a metabolite graph (metabolites as nodes, with edges given by the reactions that produce or consume them); each graph serves as a network representation of the strain's full metabolism. The metadata for each strain contain an experimentally verified antimicrobial resistance profile for up to 12 antimicrobial agents. Using the subset of the dataset corresponding to each individual antibiotic as input to the GNN, we performed whole-graph classification. Pairing this classification with approximation-based (e.g., sensitivity analysis, GraphLIME), relevance propagation-based (e.g., GNN-LRP), and perturbation-based (e.g., GNNExplainer) explanation methods has enabled us to identify key metabolic network features that contribute to AMR. Additional aspects of our work included identifying potential biases within the data, investigating the effects of network representation and pruning (either topological or biochemical) on GNN performance, and optimizing the GNN parameters. Preliminary classification accuracy averaged 65% across all antimicrobials on the test dataset, with minimal variation in model performance between graph representations.
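The two graph representations described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `reaction_adjacency` and `metabolite_adjacency`, the toy stoichiometric matrix, and the choice of unweighted, undirected edges are all assumptions made for illustration. Rows of `S` are metabolites and columns are reactions.

```python
import numpy as np


def reaction_adjacency(S):
    """Reactions are nodes; two reactions are linked if they share a metabolite.

    Hypothetical helper: edges are unweighted and undirected (an assumption).
    """
    B = (np.asarray(S) != 0).astype(int)  # participation matrix, metabolites x reactions
    A = B.T @ B                           # entry (i, j) counts metabolites shared by reactions i and j
    np.fill_diagonal(A, 0)                # drop self-loops
    return (A > 0).astype(int)


def metabolite_adjacency(S):
    """Metabolites are nodes; two metabolites are linked if they co-occur in a reaction.

    Hypothetical helper: edges are unweighted and undirected (an assumption).
    """
    B = (np.asarray(S) != 0).astype(int)
    A = B @ B.T                           # entry (i, j) counts reactions involving both metabolites
    np.fill_diagonal(A, 0)
    return (A > 0).astype(int)


# Toy linear pathway (hypothetical): R1: A -> B, R2: B -> C, R3: C -> D
S = np.array([
    [-1,  0,  0],   # A
    [ 1, -1,  0],   # B
    [ 0,  1, -1],   # C
    [ 0,  0,  1],   # D
])

A_rxn = reaction_adjacency(S)    # 3 x 3, one node per reaction
A_met = metabolite_adjacency(S)  # 4 x 4, one node per metabolite
```

In this toy pathway, R1 and R2 are connected through metabolite B, and R2 and R3 through C, while R1 and R3 share nothing and stay unconnected. In a real GEM, highly connected currency metabolites (for example, ATP or water) would link many otherwise unrelated reactions, which is one reason the pruning mentioned above may matter.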