Being able to predict the polymorph that is likely to crystallize from solution under given conditions is of utmost importance to many industries, as the resulting polymorph determines key physical properties of the end product. In the pharmaceutical industry, for example, the bioavailability of a drug molecule depends on its crystal structure, which directly affects the solubility and dissolution rate of the drug. Moreover, obtaining appropriate models of industrial crystallization units requires insight into the mechanism of nucleation and knowledge of polymorph-specific kinetic properties such as nucleation rates. However, polymorph prediction remains a challenge both experimentally, due to the small time and length scales associated with the formation of critical nuclei, and computationally, due to the rare-event nature of nucleation. Classical nucleation theory (CNT) has been the most prominent theoretical model for studying and understanding nucleation of a new thermodynamic phase. While it has been shown to work well for simple systems, the one-dimensional free energy description in CNT makes it less applicable to complex systems such as crystallization from solution. CNT assumes that the free energy change associated with the formation of a nucleus depends only on the size of the nucleus, and it provides no information about the structure of the critical nucleus. However, recent studies have revealed that changes in the structure of the nucleus also play an important role in the nucleation mechanism. Especially when multiple polymorphs are possible, the free energy description should include structural information as well as the nucleus size.
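For reference, the one-dimensional CNT description mentioned above is commonly written in the following textbook form for a spherical nucleus. The symbols are assumptions introduced here for illustration and are not notation taken from this work: $r$ is the nucleus radius, $\gamma$ the interfacial free energy, $\rho_s$ the number density of the crystal, and $\Delta\mu > 0$ the chemical potential difference between solution and crystal.

```latex
\Delta G(r) = 4\pi r^{2}\gamma - \frac{4}{3}\pi r^{3}\rho_s\,\Delta\mu,
\qquad
r^{*} = \frac{2\gamma}{\rho_s\,\Delta\mu},
\qquad
\Delta G^{*} = \Delta G(r^{*}) = \frac{16\pi\gamma^{3}}{3\,\rho_s^{2}\,\Delta\mu^{2}}
```

The only coordinate in this expression is the size $r$, so it cannot distinguish between nuclei of equal size but different internal structure.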
In this work, we describe a methodology that uses molecular simulations to predict the polymorph of glycine that will crystallize from aqueous solution. Based on the discussion above, we aim to increase the dimensionality of the free energy description by introducing polymorph-specific nucleus size coordinates. A template matching algorithm is used to differentiate between the three known polymorphs of glycine and to calculate polymorph-specific nucleus sizes. These collective variables provide information about both the size and the structure of the nucleus. The "string method in collective variables" can be used to obtain the minimum free energy paths on this multidimensional free energy surface, connecting the solution phase to each of glycine's polymorphs in the crystallized phase. This makes it possible to determine the free energy barrier for nucleation of each polymorph. We describe how hybrid MD/MC simulations can be used to evaluate the gradient of the free energy at each point along the "strings", allowing the use of discrete collective variables. The described methodology allows polymorph selection to be predicted for nucleation from solution without assuming the nucleus shape or structure.
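The idea of relaxing a string of images toward a minimum free energy path can be illustrated with a minimal sketch. This is not the implementation used in this work: it is a simplified, zero-temperature string method on a hypothetical analytic two-dimensional surface, where the coordinates `x` and `y` merely stand in for two collective variables. The analytic `gradient` function replaces the free energy gradient that the described methodology would estimate from hybrid MD/MC simulations, and all function names and parameters are assumptions chosen for illustration.

```python
import numpy as np

def free_energy(q):
    """Toy surface (an assumption): minima at (-1, 0) and (1, 0), saddle at (0, 1)."""
    x, y = q[..., 0], q[..., 1]
    return (1 - x**2) ** 2 + 2 * (y - (1 - x**2)) ** 2

def gradient(q):
    """Analytic gradient of the toy surface; a stand-in for sampled mean forces."""
    x, y = q[..., 0], q[..., 1]
    d = y - (1 - x**2)
    gx = -4 * x * (1 - x**2) + 8 * x * d
    gy = 4 * d
    return np.stack([gx, gy], axis=-1)

def reparametrize(string):
    """Redistribute the images at equal arc length along the current string."""
    seg = np.linalg.norm(np.diff(string, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])
    s /= s[-1]
    s_new = np.linspace(0.0, 1.0, len(string))
    return np.column_stack(
        [np.interp(s_new, s, string[:, k]) for k in range(string.shape[1])]
    )

def string_method(start, end, n_images=21, dt=0.01, n_steps=3000):
    """Relax a straight-line initial string between two states."""
    alphas = np.linspace(0.0, 1.0, n_images)[:, None]
    string = (1 - alphas) * start + alphas * end
    for _ in range(n_steps):
        string = string - dt * gradient(string)  # move each image downhill
        string = reparametrize(string)           # keep images evenly spaced
    return string

path = string_method(np.array([-1.0, 0.0]), np.array([1.0, 0.0]))
profile = free_energy(path)
barrier = profile.max() - profile[0]
print(f"estimated barrier along the string: {barrier:.3f}")
```

On this toy surface the saddle point lies at a free energy of 1 above the minima, so the printed barrier should be close to 1. In a polymorph study, one such string per polymorph would connect the solution state to the corresponding crystalline state, and the barriers along the converged strings could then be compared.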