(687e) Chemprop: Machine Learning for Molecular Property Prediction
AIChE Annual Meeting
2022 Annual Meeting
Computational Molecular Science and Engineering Forum
Software Engineering in and for the Molecular Sciences
Friday, November 18, 2022 - 9:21am to 9:38am
We will discuss the architecture used by Chemprop, some notable examples of recent applications, and several of the software's significant features. Inputs to Chemprop models are provided as SMILES strings, from which the software constructs 2D connectivity graphs of the molecules. Chemprop's network architecture uses a directed message passing neural network (d-MPNN) to learn molecular encodings, as implemented and benchmarked against other architectures by Yang et al. [3]. The end-to-end learning enabled by this architecture allows the software to extract from the molecular graph the information most relevant to the property being modeled. The software has been applied to a variety of property targets, demonstrating the versatility of learned encodings: enthalpy of formation, activation energy, solubility, antibiotic activity, reaction regioselectivity, UV-Vis absorption, and infrared spectra.
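The directed message passing idea can be illustrated with a minimal sketch. It is not Chemprop's implementation: the graph for ethanol (SMILES "CCO") is written out by hand rather than parsed, the single-number atom feature is a placeholder, and the learned weight matrices and nonlinearities of a real d-MPNN are omitted. All function names here are hypothetical. The sketch shows only the defining step, in which the state on a directed bond v -> w is updated from bonds u -> v entering v, excluding the reverse bond w -> v.

```python
from collections import defaultdict

# Hand-written heavy-atom graph for ethanol (SMILES "CCO"). A real pipeline
# would obtain atoms and bonds from a cheminformatics parser instead.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

# Placeholder one-number atom feature (atomic number), purely illustrative.
ATOMIC_NUMBER = {"C": 6.0, "O": 8.0}


def directed_edges(bonds):
    """Turn each undirected bond into two directed edges."""
    edges = []
    for a, b in bonds:
        edges.append((a, b))
        edges.append((b, a))
    return edges


def init_hidden(atoms, edges):
    """Initial state of edge (v -> w) is the feature of its source atom v."""
    return {(v, w): ATOMIC_NUMBER[atoms[v]] for v, w in edges}


def message_pass(hidden, edges):
    """One directed step: edge (v -> w) sums states of edges (u -> v), u != w."""
    incoming = defaultdict(list)
    for u, v in edges:
        incoming[v].append((u, v))
    new_hidden = {}
    for v, w in edges:
        # Skip the reverse edge (w -> v) so a message does not echo straight back.
        msg = sum(hidden[(u, x)] for u, x in incoming[v] if u != w)
        new_hidden[(v, w)] = hidden[(v, w)] + msg
    return new_hidden


def readout(atoms, hidden, edges):
    """Sum incoming edge states per atom, then sum atoms into one molecule value."""
    atom_state = [0.0] * len(atoms)
    for v, w in edges:
        atom_state[w] += hidden[(v, w)]
    return sum(atom_state)


edges = directed_edges(bonds)
h0 = init_hidden(atoms, edges)
h1 = message_pass(h0, edges)
encoding = readout(atoms, h1, edges)
```

In a trained model the sums above would pass through learned linear layers and activations, and the molecule-level encoding would feed a feed-forward network that predicts the property.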
Chemprop incorporates a number of features and functions to fit the needs of its users. Model training and prediction can run on GPUs. Additional functions support hyperparameter optimization, the extraction of latent molecular representations, and the estimation and calibration of model uncertainty. Additional molecule- and atom-level features can be supplied to bring in information from outside sources such as experimental measurements or quantum mechanical calculations. Chemprop supports inputs consisting of reactions or of multiple molecules (e.g., solvent and solute). Tools for transfer learning and weighted multitask models enable the model to infer useful relationships across distinct datasets. The self-contained workflow allows users to perform all of Chemprop's main functions with minimal coding.
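As a rough illustration of what a weighted multitask objective can look like, the sketch below computes a per-task weighted mean squared error. It is an assumption-laden example, not Chemprop's loss function: the name weighted_multitask_mse is hypothetical, and the use of None for a missing target reflects the assumption that datasets combined into one multitask model do not label every molecule for every task.

```python
def weighted_multitask_mse(preds, targets, task_weights):
    """Weighted mean squared error over the (molecule, task) pairs that have labels.

    preds and targets are lists of per-molecule rows, one entry per task.
    A target of None marks a missing label and is skipped (an assumption
    of this sketch).
    """
    total = 0.0
    weight_sum = 0.0
    for pred_row, target_row in zip(preds, targets):
        for pred, target, weight in zip(pred_row, target_row, task_weights):
            if target is None:
                continue  # no label for this molecule on this task
            total += weight * (pred - target) ** 2
            weight_sum += weight
    # Normalize by the total weight of the labeled entries.
    return total / weight_sum if weight_sum else 0.0


# Two molecules, two tasks; the first molecule has no label for task 2.
preds = [[1.0, 2.0], [3.0, 4.0]]
targets = [[1.5, None], [3.0, 5.0]]
task_weights = [1.0, 2.0]  # task 2 counts twice as much as task 1

loss = weighted_multitask_mse(preds, targets, task_weights)
```

Raising a task's weight makes errors on that task dominate the shared encoding, which is one way to prioritize a scarce or high-value property when training on several datasets at once.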
[1] Chemprop: Molecular Property Prediction. https://github.com/chemprop/chemprop
[2] Machine Learning for Pharmaceutical Discovery and Synthesis Consortium. https://mlpds.mit.edu
[3] Yang, K.; Swanson, K.; Jin, W.; Coley, C.; Eiden, P.; Gao, H.; Guzman-Perez, A.; Hopper, T.; Kelley, B.; Mathea, M.; et al. Analyzing Learned Molecular Representations for Property Prediction. J. Chem. Inf. Model. 2019, 59 (8), 3370–3388. https://doi.org/10.1021/acs.jcim.9b00237.