(216d) Model-Based Clustering Applied to Combinatorial Libraries
AIChE Annual Meeting
Tuesday, October 18, 2011 - 9:30am to 9:50am
Peptide and organic combinatorial libraries are essential tools for screening drug candidates and for discovering molecules with other desired properties. False positives, however, are often difficult to handle effectively, particularly when searching for specific peptide sequences: some sequences bind proteins non-specifically, while others bind specifically to whole classes of proteins. One approach to this problem is to cluster the data on a structure-activity relationship (SAR) surface in order to discover the classes of molecules or peptides. Model-based clustering allows the SAR surface and the clusters to be discovered simultaneously. We have shown that this semi-parametric technique works well on peptide libraries and, when combined with other techniques, can handle over-specified QSAR models. An inherent advantage of the approach is that the Bayesian formulation of model-based clustering provides confidence intervals on the model parameters; for example, it can provide a marginal likelihood for each SAR dimension. This offers a new way of analyzing imperfect, complex combinatorial libraries. We applied the technique to previously published results and to peptide library data from our own group. The model identified a number of clusters, and further testing of sequences from those clusters revealed some clusters to be non-specific and others to be specific.
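As a rough illustration of the general idea behind model-based clustering, the sketch below fits a two-component Gaussian mixture to a one-dimensional descriptor by expectation-maximization. It is not the method used in this work: it is a maximum-likelihood simplification rather than the Bayesian formulation, it does not fit an SAR surface, and the function name `fit_two_component_mixture`, the fixed choice of two clusters, and the `scores` values are all hypothetical, chosen only to make the example self-contained.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_mixture(values, n_iter=100, var_floor=1e-6):
    """Fit a two-component 1-D Gaussian mixture by EM (illustrative only)."""
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n

    # Simple deterministic initialization: one mean at each extreme of the data.
    means = [min(values), max(values)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    resp = []

    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each cluster.
        resp = []
        for v in values:
            dens = [weights[k] * normal_pdf(v, means[k], variances[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])

        # M-step: re-estimate weights, means, and variances from the posteriors.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var_k, var_floor)  # floor guards against collapse

    return weights, means, variances, resp


# Hypothetical descriptor values for ten library members (not real data).
scores = [0.9, 1.0, 1.1, 1.2, 0.8, 4.9, 5.0, 5.1, 5.2, 4.8]

weights, means, variances, resp = fit_two_component_mixture(scores)

# Hard cluster labels taken from the larger posterior membership probability.
labels = [0 if r[0] >= r[1] else 1 for r in resp]
print(means, labels)
```

The posterior membership probabilities in `resp` are what distinguish this from a hard partitioning method: each library member carries a degree of belonging to each cluster. A fuller treatment would replace the per-cluster mean with a regression on SAR descriptors and place priors on the parameters, which this sketch does not attempt.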