(243k) Opportunities in Spectroscopic Analysis: Chemometrics++

Authors: 
Albrecht, J., Bristol-Myers Squibb

Shortly after the proliferation of spectrometers in the 1970s, multivariate analysis techniques such as partial least squares (PLS) quickly gained recognition for their ability to create predictive models from large data sets.  While these factor analysis approaches will always be an important tool for dimensionality reduction, statisticians and computer scientists continue to develop complementary tools with ever-increasing sensitivity and specificity.  Neural networks, random forests, support vector machines, lasso, and ridge regression are examples of regression algorithms that can outperform factor analysis approaches alone.

The widespread availability of these tools, coupled with easy scripting and cloud computing setup, allows for the development of workflows that can rapidly formulate and test hundreds of potential regression models and automatically identify the best balance of complexity, sensitivity, and specificity in a quantitative model.  Such a workflow will be demonstrated using Python with an example set of lab NIR data, starting from the raw interferograms.  This presentation aims to extend the discussion of chemometric quantitation beyond conventional multivariate regression of processed absorbance spectra and to reframe it as an application of modern applied predictive modeling.