(11b) Rational Solvent Selection Guided By Machine Learning and Molecular Descriptors in Asymmetric Catalytic Reactions

Cao, L., Cambridge Centre for Advanced Research and Education in Singapore (CARES) Ltd
Amar, Y., University of Cambridge
Schweidtmann, A. M., RWTH Aachen University
Deutsch, P., UCB Pharma
Lapkin, A. A., Cambridge Centre for Advanced Research and Education in Singapore Ltd

Yehia Amar,a Artur M. Schweidtmann,b Liwei Cao,a,d Paul Deutsch,c and Alexei Lapkin,a,d

a Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge, CB3 0AS, United Kingdom

b Aachener Verfahrenstechnik – Process Systems Engineering, RWTH Aachen University, Aachen, Germany

c UCB Pharma S.A., Allée de la Recherche 60, 1070 Brussels, Belgium

d Cambridge Centre for Advanced Research and Education in Singapore Ltd, 1 Create Way, CREATE Tower #05-05, 138602, Singapore

Email: aal35@cam.ac.uk


Rational solvent selection remains a significant challenge in process development, especially in pharmaceutical applications. A hybrid mechanistic/machine-learning approach, geared towards automated process-development workflows, was therefore developed and successfully applied to the Rh(CO)2(acac)/Josiphos (R1 = cyclohexyl, R2 = 4-methoxy-3,5-dimethylphenyl) catalyzed asymmetric hydrogenation of a chiral α,β-unsaturated γ-lactam. The mechanistic part of the model is based on molecular descriptors of the physico-chemical properties of solvents, including the reaction-specific descriptors of substrate and hydrogen solubility. A library of 400 solvents, each characterized by 17 molecular descriptors, was used.

The algorithm, which is based on a Gaussian process surrogate model, was trained to model and simultaneously optimize both conversion and diastereomeric excess, and it ultimately identified better-performing solvents.
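Optimizing conversion and diastereomeric excess at the same time generally means looking for trade-off (Pareto-optimal) solvents rather than a single best one. The abstract does not say how the two objectives were combined, so the following is only a minimal sketch of a Pareto-dominance filter, assuming both objectives are to be maximized. The function name `pareto_front`, the solvent labels, and the numbers are hypothetical placeholders, not data or code from the study.

```python
from typing import Dict, List, Tuple


def pareto_front(results: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the solvents whose (conversion, d.e.) pair is not dominated.

    A solvent is dominated if another solvent is at least as good in both
    objectives and strictly better in at least one. Both objectives are
    assumed to be maximized.
    """
    front = []
    for name, (conv, de) in results.items():
        dominated = any(
            (c >= conv and d >= de) and (c > conv or d > de)
            for other, (c, d) in results.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


# Hypothetical placeholder values in percent, NOT results from the study.
example = {
    "solvent_A": (95.0, 60.0),  # high conversion, modest d.e.
    "solvent_B": (80.0, 85.0),  # lower conversion, high d.e.
    "solvent_C": (70.0, 55.0),  # worse than solvent_A in both objectives
    "solvent_D": (90.0, 80.0),  # balanced trade-off
}

print(pareto_front(example))
```

In a surrogate-based workflow, a filter of this kind could be applied either to measured results or to model predictions over the untested part of a solvent library in order to shortlist candidates for the next experiments.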

In addition to being a powerful design-of-experiments methodology, the approach yields a statistical surrogate model that is predictive, with a cross-validation correlation coefficient of 0.83. Furthermore, a solvent-mixing strategy based on the black-box approach was also investigated. These methods open the door for process chemists to use enhanced process development workflows for optimization and discovery.
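A cross-validation correlation coefficient compares held-out predictions with the measured values. The abstract does not state which cross-validation scheme or correlation measure was used, so the sketch below assumes leave-one-out cross-validation and the Pearson coefficient. To stay self-contained it uses a simple nearest-neighbour predictor as a stand-in for the Gaussian process; all function names and the toy numbers are hypothetical and are not taken from the study.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]
Predictor = Callable[[List[Vector], List[float], Vector], float]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def nearest_neighbour_predict(
    train_x: List[Vector], train_y: List[float], query: Vector
) -> float:
    """Stand-in model (NOT a Gaussian process): copy the closest point's value."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def loo_cv_correlation(
    xs: Sequence[Vector], ys: Sequence[float], predict: Predictor
) -> float:
    """Leave each point out in turn, predict it, and correlate with the truth."""
    xs, ys = list(xs), list(ys)
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predictions.append(predict(train_x, train_y, xs[i]))
    return pearson(predictions, ys)


# Toy one-descriptor data set, purely illustrative.
descriptors = [[0.0], [1.0], [2.0], [3.0], [4.0]]
responses = [0.1, 0.9, 2.2, 2.8, 4.1]

r = loo_cv_correlation(descriptors, responses, nearest_neighbour_predict)
print(f"leave-one-out correlation: {r:.2f}")
```

Because `loo_cv_correlation` only needs a callable with the `predict(train_x, train_y, query)` shape, any surrogate model, including a Gaussian process from a library of choice, could be substituted for the stand-in predictor.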


This paper has an Extended Abstract file available; you must purchase the conference proceedings to access it.

