(361a) Efficient Data-Driven Discovery of Novel Innate Immunomodulators Using Machine Learning-Guided High Throughput Screening | AIChE

Authors 

Kim, J., University of Chicago
Ka Man Ip, C., University of Chicago
Bahmani, A., University of Chicago
Chen, Q., University of Chicago
Rosenberger, M., University of Chicago
Esser-Kahn, A., University of Chicago
Ferguson, A., University of Chicago
The success of prophylactic vaccines and immunotherapies relies on the innate immune response. Small molecules can modulate this response, either to inhibit unfavorable systemic inflammation in prophylactic vaccination or to improve immune stimulation in immunotherapy and so mitigate suppression by the tumor microenvironment. Traditional screening methods for discovering small molecules with specific pharmaceutical uses are time-consuming and resource-intensive, and often fail to identify novel compounds with high efficacy. In addition, although human intuition and experience provide valuable heuristics to guide this search, the relative infancy of immunomodulator discovery efforts, the absence of mechanistic understanding, and the vast size of molecular space can make these heuristics limiting and subject to human preconceptions, bias, and potential blind spots.


We have developed a machine learning-enabled active learning pipeline to guide in vitro experimental screening and discovery of small molecule immunomodulators that alter the signaling activity of innate immune responses stimulated by traditional pattern recognition receptor agonists. Molecules were tested by in vitro high throughput screening (HTS), in which we measured modulation of the activation of two major effectors of human immunity: the nuclear factor κ-light-chain-enhancer of activated B-cells (NF-κB) pathway and the interferon regulatory factor (IRF) pathway. These data were used to train data-driven predictive models linking molecular structure to immunomodulatory activity using deep representational learning, Gaussian process regression (GPR), and Bayesian optimization (BO). The deep representational learning model was trained to convert molecular structures to continuous embeddings that could be converted back to the original structures with 97% fidelity. The GPR and BO converged after four rounds of the discovery loop between computation and experimentation, with stabilized predictions and minimized error.
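The loop between a GPR surrogate and batch selection can be illustrated with a minimal sketch. This is not the authors' implementation: the random embeddings, the `toy_assay` function standing in for the HTS readout, the RBF kernel, the upper-confidence-bound acquisition rule, and all batch sizes are assumptions chosen only to make the idea concrete.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_posterior(x_train, y_train, x_query, length_scale=1.0, noise=1e-2):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k_tt = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query, length_scale)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_tq.T @ alpha
    v = np.linalg.solve(chol, k_tq)
    var = 1.0 - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


def select_batch(mean, std, batch_size, kappa=2.0):
    """Indices of the candidates with the highest upper confidence bound."""
    ucb = mean + kappa * std
    return np.argsort(-ucb)[:batch_size]


def active_learning(embeddings, assay, n_init=20, n_rounds=4, batch_size=10, seed=0):
    """Alternate between fitting the GP and 'screening' the top-ranked unmeasured molecules."""
    rng = np.random.default_rng(seed)
    measured = rng.choice(len(embeddings), size=n_init, replace=False).tolist()
    for _ in range(n_rounds):
        pool = np.setdiff1d(np.arange(len(embeddings)), measured)
        y = assay(embeddings[measured])
        # Center the targets so the zero-mean prior is a reasonable assumption.
        mean, std = gp_posterior(embeddings[measured], y - y.mean(), embeddings[pool])
        picks = pool[select_batch(mean, std, batch_size)]
        measured.extend(picks.tolist())
    return np.array(measured)


# Hypothetical stand-ins: random 4-D "embeddings" and a synthetic activity function.
rng = np.random.default_rng(1)
embeddings = rng.normal(size=(300, 4))
toy_assay = lambda z: np.sin(z[:, 0]) + 0.5 * z[:, 1]

measured = active_learning(embeddings, toy_assay)
print(len(measured), "of", len(embeddings), "molecules screened")
```

The `kappa` parameter trades off exploitation (high predicted activity) against exploration (high predictive uncertainty); other acquisition functions, such as expected improvement, would slot into `select_batch` in the same way.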


By interleaving successive rounds of model training and in vitro HTS, we performed an active learning-guided traversal of a 139,998-molecule library using a fraction of the time and material costs associated with exhaustive screening. After experimentally evaluating only around 2% of the library, we discovered molecules with unprecedented immunomodulatory capacity, including those capable of suppressing NF-κB activity by up to 15-fold, elevating NF-κB activity by up to 5-fold, and elevating IRF activity by up to 3-fold. Seventeen of our top-performing candidates were further tested to validate their immunomodulatory effect by measuring cytokine release profiles. One of these molecules demonstrated a 40-fold enhancement in IFN-β production. In addition, we rationalized the correlation between chemical structure and immunomodulatory capacity using a linear regression model over interpretable molecular features, and derived design rules for immunomodulators.
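As a rough illustration of how a linear model over interpretable features can suggest design rules, the sketch below fits ordinary least squares to standardized features and ranks them by coefficient magnitude. The feature names (`logP`, `aromatic_rings`, `h_bond_donors`) and the synthetic data are hypothetical; the abstract does not state which descriptors were used.

```python
import numpy as np


def fit_linear_design_rules(features, activity, names):
    """Ordinary least squares on standardized features.

    Returns (name, coefficient) pairs sorted by decreasing |coefficient|.
    """
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    design = np.column_stack([np.ones(len(z)), z])  # intercept + features
    coef, *_ = np.linalg.lstsq(design, activity, rcond=None)
    ranked = sorted(zip(names, coef[1:]), key=lambda item: -abs(item[1]))
    return [(name, float(c)) for name, c in ranked]


# Synthetic example: activity depends strongly on the first feature,
# weakly and negatively on the third, and not at all on the second.
rng = np.random.default_rng(0)
names = ["logP", "aromatic_rings", "h_bond_donors"]
features = rng.normal(size=(200, 3))
activity = 2.0 * features[:, 0] - 0.5 * features[:, 2] + 0.1 * rng.normal(size=200)

rules = fit_linear_design_rules(features, activity, names)
for name, c in rules:
    print(f"{name:>15s}: {c:+.2f}")
```

Standardizing the features first makes the coefficients comparable across descriptors, so the sign and magnitude of each one can be read as a candidate rule (for example, "increase this property to raise activity").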


Our machine learning-enabled screening approach presents an efficient immunomodulator discovery pipeline that has furnished a library of novel small molecules with unprecedented capacity to enhance or suppress innate immune signaling pathways. These molecules have the potential to improve prophylactic vaccination by minimizing side effects and addressing vaccine hesitancy, and to enhance the potency of immunotherapies. This collection of new small molecule immunomodulators may progress to subsequent screening, in vivo studies, and clinical trials and, if successful, eventually yield a pharmaceutical product. Furthermore, this machine learning-based screening strategy can be employed in drug discovery and development, especially in pipelines that lack well-defined mechanistic insight and require expensive experimental assessment.