(23c) OpenCrystalData: An Open-Access Crystal Image Database for Enabling the Image-Based Analysis of Crystallization Systems | AIChE


Salami, H., Georgia Institute of Technology
Boyle, C., CMAC/University of Strathclyde
Bommarius, A., Georgia Institute of Technology
Nagy, Z., Purdue University
Cardona, J., University of Strathclyde
Rousseau, R., Georgia Institute of Technology
Grover, M., Georgia Institute of Technology
Imaging-based process analytical technologies (PAT) have become critical tools for rapid crystallization process development and design because they provide a wide range of insights, including tracking of different crystallization mechanisms, real-time process monitoring, and in-situ particle size and shape characterization. As a result, there is growing interest in developing and implementing image analysis methods, including machine learning/artificial intelligence (ML/AI)-based approaches, to extract qualitative and quantitative information from image data. However, a major roadblock to developing and evaluating novel image analysis models for crystallization processes is the lack of diverse, high-quality, publicly available particle image datasets.

In this work, we present OpenCrystalData, an initiative to create an open-access database of annotated image datasets from different crystallization systems, covering a range of particle sizes and shapes captured under various crystallization conditions. In its current version, the database consists of three image datasets that address different applications in crystallization processes, including estimation of particle size distribution from in-situ images for different categories of particles and detection of anomalous particles for process monitoring. These applications cover several image analysis tasks, including object detection, object classification, and instance segmentation. The datasets enable the development and testing of new algorithms, the comparison of different image analysis approaches, and the validation of existing models. The images are collected using in-situ particle video microscopy, followed by case-specific processing steps such as ground-truth labeling and particle size characterization using offline microscopy.
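
To illustrate the particle size distribution task described above, the following is a minimal sketch, not the method used in this work. It assumes that an instance segmentation model has already produced an integer label mask (0 for background, 1..N for individual particles) and that particle size is summarized by the equivalent circular diameter, which is only one of several possible size measures. The function names, the `microns_per_pixel` calibration parameter, and the toy mask are hypothetical and are not taken from the OpenCrystalData datasets.

```python
import numpy as np


def equivalent_diameters(label_mask, microns_per_pixel=1.0):
    """Equivalent circular diameter of each labeled particle.

    label_mask: 2-D integer array, 0 = background, 1..N = particle ids
    (assumed output format of an instance segmentation step).
    microns_per_pixel: hypothetical calibration factor of the imaging probe.
    """
    labels = np.asarray(label_mask)
    # Pixel count per label; drop index 0, which is the background.
    areas_px = np.bincount(labels.ravel())[1:]
    # Ignore label ids that do not occur in the mask.
    areas_px = areas_px[areas_px > 0]
    # Diameter of the circle whose area equals the particle area.
    return 2.0 * np.sqrt(areas_px / np.pi) * microns_per_pixel


def size_distribution(diameters, bin_edges):
    """Number-based size distribution as the fraction of particles per bin."""
    counts, _ = np.histogram(diameters, bins=bin_edges)
    total = counts.sum()
    return counts / total if total > 0 else counts.astype(float)


# Toy mask with two square "particles" (4 px and 16 px in area).
mask = np.zeros((8, 8), dtype=int)
mask[1:3, 1:3] = 1
mask[4:8, 4:8] = 2

diameters = equivalent_diameters(mask, microns_per_pixel=2.0)
psd = size_distribution(diameters, bin_edges=[0, 5, 10])
print(diameters)  # one diameter per particle, in microns
print(psd)        # fraction of particles in each size bin
```

In practice the bin edges and the size measure (for example, Feret diameter instead of equivalent circular diameter) would be chosen to match the offline reference measurement being compared against.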

The open-access database will be released on Kaggle, a user-friendly online collaborative platform, along with problem-specific guidelines for each dataset. We hope that OpenCrystalData will facilitate and inspire new developments in imaging-based PAT for crystallization processes.