(203d) Giving Attention to Generative Models for De Novo Molecular Design | AIChE


Joshi, N., University of Washington
Beck, D., University of Washington
Pfaendtner, J., University of Washington
Attention mechanisms have led to many recent breakthroughs in sequential data modeling tasks, including machine translation, text generation and protein structure prediction [1–3]. While they have also been used in the molecular domain for tasks such as graph-based analysis of chemical structures [4] and atom-mapping of organic reactions [5], attention has yet to be incorporated into any generative algorithms for molecular design.

Here we explore the impact of adding self-attention layers to generative β-VAE models and show that models with attention are able to learn a complex “molecular grammar” while improving performance on downstream tasks such as accurately sampling from the latent space (“model memory”) or exploring novel chemistries not present in the training data. A model’s architecture, the structure of its latent memory and its performance during inference are notably related. For instance, we find an unavoidable tradeoff between model exploration and validity that is a function of the complexity of the latent memory. However, novel sampling schemes may be used to optimize this tradeoff.
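
To make the two ingredients named above concrete, the following is a minimal pure-Python sketch of scaled dot-product self-attention over a short token sequence, together with a β-VAE objective (a reconstruction term plus a β-weighted KL divergence for a diagonal Gaussian posterior against a standard normal prior). It is an illustration under stated assumptions, not the models used in this work: the identity query/key/value projections, the function names and the example values are all hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention over a list of token vectors.

    Assumption: queries, keys and values are the inputs themselves
    (identity projections); trained models learn separate projections.
    Returns (outputs, attention_weights).
    """
    d = len(x[0])
    scale = math.sqrt(d)
    weights = []
    for q in x:
        # Similarity of this token's query with every token's key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in x]
        weights.append(softmax(scores))
    # Each output is an attention-weighted average of the value vectors.
    outputs = [
        [sum(w * v[j] for w, v in zip(w_row, x)) for j in range(d)]
        for w_row in weights
    ]
    return outputs, weights


def gaussian_kl(mu, logvar):
    """KL( N(mu, exp(logvar)) || N(0, I) ) for a diagonal Gaussian."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar))


def beta_vae_loss(recon_nll, mu, logvar, beta=1.0):
    """Beta-VAE objective: reconstruction loss plus beta times the KL term."""
    return recon_nll + beta * gaussian_kl(mu, logvar)


# Hypothetical example: two 2-dimensional token embeddings.
outputs, weights = self_attention([[1.0, 0.0], [0.0, 1.0]])
loss = beta_vae_loss(recon_nll=2.0, mu=[1.0], logvar=[0.0], beta=4.0)
```

In this sketch, β only scales the KL term relative to the reconstruction term; how that weighting interacts with the structure of the latent memory is the subject of the study itself and is not captured here.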

We also demonstrate the ability of the transformer VAE to construct a set of complex, human-interpretable molecular substructural features in an unsupervised fashion. We compare these learned features across different input representations, including SMILES and SELFIES [6] strings, as well as those extracted from traditional cheminformatics software packages. Finally, we discuss how these models may eventually be used in tandem with natural language models, high-throughput molecular dynamics simulations and reinforcement learning algorithms to present a unified AI-based framework for molecular discovery and optimization.
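
String-based generative models consume molecules as token sequences, so the choice of input representation determines what an attention layer can attend over. The sketch below shows one simplified way to split such strings into tokens. The regular expression and the function name `tokenize_molecule_string` are hypothetical and cover only bracket atoms, two-letter halogens and two-digit ring closures; this is not the tokenization used in this work.

```python
import re

# Hypothetical, simplified token pattern (an assumption for illustration):
# bracketed symbols, two-letter halogens, %NN ring closures, then any
# remaining single character.
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize_molecule_string(s):
    """Split a SMILES-like string into a list of tokens."""
    return TOKEN_PATTERN.findall(s)


# SMILES for acetyl chloride: 'Cl' stays a single token.
smiles_tokens = tokenize_molecule_string("CC(=O)Cl")

# A SELFIES-style string is already a sequence of bracketed symbols,
# so the same pattern yields one token per bracket.
selfies_tokens = tokenize_molecule_string("[C][C][=O]")
```

Because every character is matched by some alternative, joining the tokens reproduces the input string, which makes the split easy to sanity-check.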

  1. Bahdanau, D., Cho, K. & Bengio, Y. Neural Machine Translation by Jointly Learning to Align and Translate. arXiv:1409.0473 [cs.CL] (2014).
  2. Brown, T. B. et al. Language Models are Few-Shot Learners. in 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada (arXiv, 2020).
  3. Service, R. F. ‘The game has changed.’ AI triumphs at protein folding. Science 370, 1144–1145 (2020).
  4. Payne, J., Srouji, M., Yap, D. A. & Kosaraju, V. BERT Learns (and Teaches) Chemistry. arXiv:2007.16012 [q-bio.BM] (2020).
  5. Schwaller, P., Hoover, B., Reymond, J.-L., Strobelt, H. & Laino, T. Extraction of organic chemistry grammar from unsupervised learning of chemical reactions. Science Advances 7, eabe4166 (2021).
  6. Krenn, M., Häse, F., AkshatKumar, N., Friederich, P. & Aspuru-Guzik, A. Self-referencing embedded strings (SELFIES): A 100% robust molecular string representation. Machine Learning: Science and Technology 1, (2020).