(364a) Analytical Methods to Improve Diagnostic Protocols Using Infrared Spectroscopic Imaging | AIChE


Mittal, S. - Presenter, University of Illinois at Urbana-Champaign
Kim, J., Carle Illinois College of Medicine
Bhargava, R., University of Illinois at Urbana-Champaign
Spectrometry coupled with data mining approaches can identify spectral signatures indicative of disease state and progression. Both structural and biochemical changes accompany cancer development and its subsequent progression. Current histologic characterization is morphology-based: thin (5 µm) tissue sections are stained, and cells are visually recognized by a pathologist using an optical microscope. However, the basis of the disease is well known to be molecular. Molecular analysis for pathology is complicated by the spatial diversity of cells and acellular materials, necessitating an analytical technique that involves imaging. Recent advances in infrared (IR) spectroscopic imaging have generated large data sets, which call for matching analytical advances if spectroscopic imaging is to be used for digital histopathology. Biological classes often show only subtle variations in molecular imaging, genetic, proteomic, or other systems biology data. However, new advances in feature extraction and computational analysis have enabled the differentiation of intricate disease categories. In addition to measurement and computational factors, sample size estimation for the desired statistical significance and the generation of annotated data also influence the development of robust models based on spectroscopic data. For example, underestimating the sample size can lead to statistically deficient diagnostic tests, while overestimating it can significantly increase experimental costs and the time needed to develop protocols. Additionally, the transfer of annotations from clinical to spectroscopic data can be time-consuming and cumbersome, and is subject to interobserver variability. For accurate classification models downstream, an accurate mapping of ground truth data to the images of interest (IR data) is crucial. Multimodal image registration is typically used to map data from one imaging modality to another, and has been used to correlate multiple biomarkers in conventional staining data [1]. Previous studies have also reported the use of image registration approaches for aligning mass spectrometric images with Raman imaging data [2] and optical images [3]. However, challenges remain in accurately transferring annotations from stained images (typically used in clinics) to IR data in a user-friendly manner.
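
To make the cost of under- or overestimating sample size concrete, the sketch below uses the standard normal-approximation formula for comparing the means of two groups on a single feature, n = 2((z_{1-α/2} + z_{1-β}) / d)^2 per group. This is only an illustration and not the estimation procedure used in this work: the function name `samples_per_group`, the two-group univariate setting, and the standardized effect size `d` are all assumptions introduced here.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate samples needed per group for a two-sided, two-group
    comparison of means (normal approximation, equal group sizes).

    effect_size is the standardized mean difference (Cohen's d);
    this univariate setting is an illustrative assumption.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    # Smaller spectral differences between classes demand many more samples.
    for d in (1.0, 0.5, 0.25):
        print(d, samples_per_group(d))
```

Because the required count scales with 1/d², halving the expected separation between two diagnostic classes roughly quadruples the number of samples needed, which is why a poor estimate in either direction is expensive.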

In this study, we carry out multivariate analysis of variance (MANOVA) to estimate the discrimination potential of IR spectroscopic features for building accurate machine learning models to separate different diagnostic categories. Next, we build a control point registration-based automated annotation tool that can generate training data for building new models with large-scale validation. Using a graphical user interface, the user precisely annotates three points in the IR image and three points in the clinical image that correspond to the same spatial architecture. This overcomes the limitations of sparse ground truth data with current manual approaches by providing a tool to transfer pathologist annotations from stained images to IR images across diagnostic categories. Spectral features for disease classification are typically selected using feature selection approaches and subsequently used in artificial intelligence (AI) algorithms for diagnosis. Deep learning offers a new approach that combines spatial-spectral features with automatic feature identification by the AI algorithm. However, deep learning approaches require an extensive database of labeled training data. Our annotation tool would significantly reduce the time needed to generate such labeled training data, paving the way for accurate, fully automated deep learning-based spectroscopic analysis of histopathological samples. We also utilize simple machine learning models to further increase the accuracy of our registration tool. Finally, we develop a combinatorial data mining approach (supervised + unsupervised) to identify diagnostic patterns and select pure chemical pixels for each cell type.
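
Three matched control points are exactly enough to determine a 2D affine transform (six unknowns, six equations), so the annotation transfer can be sketched as follows. This is a minimal illustration under the assumption that the mapping between the stained image and the IR image is affine; the abstract does not state which transform model the tool uses, and the names `affine_from_three_points` and `transfer_points` are hypothetical.

```python
import numpy as np


def affine_from_three_points(src, dst):
    """Return the 2x3 affine matrix A with dst = A @ [x, y, 1] for each pair.

    src, dst: three (x, y) control points picked in the stained image and
    the IR image, respectively (an affine model is assumed here).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape != (3, 2) or dst.shape != (3, 2):
        raise ValueError("expected exactly three (x, y) control points per image")
    # Homogeneous source coordinates: one row [x, y, 1] per control point.
    src_h = np.hstack([src, np.ones((3, 1))])
    # Collinear control points make the linear system singular.
    if abs(np.linalg.det(src_h)) < 1e-9:
        raise ValueError("control points must not be collinear")
    # Solve src_h @ A.T = dst for A.T (3x2), then transpose to 2x3.
    return np.linalg.solve(src_h, dst).T


def transfer_points(affine, points):
    """Map (x, y) annotation coordinates through the 2x3 affine matrix."""
    points = np.asarray(points, dtype=float)
    points_h = np.hstack([points, np.ones((len(points), 1))])
    return points_h @ affine.T


if __name__ == "__main__":
    stained = [(0, 0), (100, 0), (0, 100)]   # control points, stained image
    ir = [(10, 20), (60, 20), (10, 70)]      # same landmarks, IR image
    A = affine_from_three_points(stained, ir)
    # An annotation at (50, 50) in the stained image lands at (35, 45) in IR.
    print(transfer_points(A, [(50, 50)]))
```

With more than three control points the same system could be solved in a least-squares sense, which would average out small clicking errors; whether the tool does this is not stated in the abstract.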


  1. G. Lippolis, A. Edsjö, L. Helczynski, A. Bjartell, N.C. Overgaard, A. Jemal, et al. “Automatic registration of multi-modal microscopy images for integrative analysis of prostate tissue sections”. BMC Cancer. 2013. 13(1): 408.
  2. T. Bocklitz, K. Bräutigam, A. Urbanek, F. Hoffmann, F. von Eggeling, G. Ernst, et al. “Novel workflow for combining Raman spectroscopy and MALDI-MSI for tissue based studies”. Anal. Bioanal. Chem. 2015. 407(26): 7865–7873.
  3. T.G. Schaaff, J.M. McMahon, P.J. Todd. “Semiautomated analytical image correlation”. Anal. Chem. 2002. 74(17): 4361–4369.