(419e) Utilization of Multi-Dimensional Characterization in Virtual High-Throughput Screening
The design and selection of chemicals with targeted properties and activities have greatly benefited from the application of quantitative structure-activity relationships (QSARs). This approach involves regressing a property or activity of interest against various molecular descriptors, so that it is accurately modeled as a function of the underlying molecular features. The advent of powerful variable selection techniques has made these models highly predictive, which can be exploited in the search for optimal candidates meeting predetermined property/activity requirements. It has been shown that each property of interest is best characterized by its own unique combination of molecular descriptors. These descriptors capture widely varying aspects of a molecule’s features and can be differentiated, among other ways, by the dimensionality of the information they capture. Simple, low-dimensional descriptors such as molecular weight or number of aromatic rings, while easy to calculate, cannot explain more complicated phenomena such as ligand-receptor interactions. On the other hand, descriptors containing information such as spatial electrostatic character can explain these interactions, but are much more computationally intensive and more vulnerable to statistical noise. The predictive use of these QSARs to identify promising candidate molecules in a large region of chemical space is known as virtual high-throughput screening (VHTS). Used alongside conventional laboratory-based HTS methods, VHTS can significantly reduce the time and cost of developing novel chemicals and drugs that meet given property and activity requirements. The inclusion of descriptors with higher dimensionality, specifically 3D descriptors, has proven very successful but still has its limitations.
Most often, these descriptors are calculated from the local energy-minimum conformation of each molecule in a data set, which is not highly representative of the actual conformations found in solution. In response, techniques have been developed to calculate these spatial descriptors from a conformational ensemble, which can include information such as a molecule's flexibility; this approach is commonly known as 4D-QSAR. Using these models predictively can become quite computationally intensive, especially when searching large regions of chemical space, which necessitates methods to expedite the search. This contribution details an algorithm for VHTS in large regions of chemical space with descriptors of varying dimensionality (0D-4D). The approach uses concepts from graph theory and employs a genetic algorithm to enable the consideration of a large search space subject to multiple property constraints. The signature molecular descriptor is used to compress the molecular information into a canonical form, which reduces memory storage requirements and facilitates an efficient combinatorial search. A case study will be presented to exemplify the applicability of the approach.
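To illustrate how a signature-style descriptor compresses a molecular graph into a canonical form, the following is a minimal sketch and not the implementation used in this work. It assumes a simplified height-1 signature (each atom's element plus the sorted elements of its bonded neighbors), ignores bond orders and hydrogens, and uses hypothetical helper names (`atomic_signature`, `molecular_signature`) and a hand-written ethanol graph chosen purely for illustration.

```python
from collections import Counter

def atomic_signature(atom, elements, bonds):
    """Simplified height-1 signature: element, then sorted neighbor elements."""
    neighbors = sorted(elements[n] for n in bonds[atom])
    return f"{elements[atom]}({','.join(neighbors)})"

def molecular_signature(elements, bonds):
    """Count of each atomic signature; independent of atom numbering."""
    return Counter(atomic_signature(a, elements, bonds) for a in elements)

# Illustrative heavy-atom graph of ethanol (C-C-O), hydrogens omitted.
elements = {0: "C", 1: "C", 2: "O"}
bonds = {0: [1], 1: [0, 2], 2: [1]}
ethanol = molecular_signature(elements, bonds)

# The same molecule with the atoms numbered in a different order (O-C-C).
elements_relabeled = {0: "O", 1: "C", 2: "C"}
bonds_relabeled = {0: [1], 1: [0, 2], 2: [1]}
ethanol_relabeled = molecular_signature(elements_relabeled, bonds_relabeled)

print(ethanol == ethanol_relabeled)
```

Because the result is a count of atom-environment strings rather than a labeled graph, two numberings of the same molecule give the same object, and a candidate can be stored or compared as a short vector of counts. A search procedure such as a genetic algorithm could then operate on these counts, although how the search is coupled to the descriptor here is not specified beyond the description above.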