(642g) A Big Data Analytics Workflow for Pharmaceutical Manufacturing Industry | AIChE


Zhang, S. - Presenter, Zhejiang University
Qu, H., Zhejiang University
Xie, X., Zhejiang University
Pharmaceutical manufacturing is a complex process consisting of blending, granulation, tableting and other unit operations. Its aim is to reliably produce high-quality drugs with high efficiency and at low cost. However, because process understanding and control strategies are often lacking, drug quality defects and declines in production efficiency are hard to avoid. Many pharmaceutical manufacturing unit operations are now equipped with numerous sensors, controllers and even process analytical instruments, which has driven rapid growth in the volume of pharmaceutical manufacturing data. Information about the process and its characteristics is hidden in this large data resource, so mining the knowledge behind the manufacturing data is of vital importance.

In this study, we proposed a big data analytics workflow for pharmaceutical manufacturing that mines process knowledge and thereby helps to improve drug quality and production efficiency. The workflow contains five steps: problem definition, data acquisition, data preprocessing, data modelling and model application. It offers pharmaceutical enterprises a systematic way to collect, organize and analyze manufacturing data. A data analytics case study of a double-effect evaporation process for herbal medicines was used to illustrate the workflow. In this case, the production efficiency of a double-effect evaporator was observed to decrease over a long period. To address this problem, sensor, valve and instruction data from 172 batches were collected. After data preprocessing, the data modelling task was carried out. Correlation analysis was first applied to study the relationships between individual process variables. Next, multivariate data analysis methods were used to evaluate the profiles of all the batches: different process phases were identified by hierarchical clustering analysis according to the dissimilarities among the batches, and the root causes of the transitions between phases were investigated through principal component analysis. The models above could be applied to on-line process monitoring, helping to enhance process understanding and to support manufacturing decisions.
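
The modelling steps named above (correlation analysis, hierarchical clustering of batches, principal component analysis) can be illustrated with a minimal, standard-library-only Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the study: the eight batch records are made-up numbers, the three variables and all function names are hypothetical, average linkage on Euclidean distance is one possible dissimilarity choice, and power iteration stands in for a full PCA by returning only the first principal component.

```python
import math
from statistics import mean, pstdev

# Hypothetical per-batch summaries (made-up numbers, NOT data from the study).
# Columns: steam pressure (bar), vacuum (kPa), evaporation time (h).
BATCHES = [
    [2.00, 80.0, 5.0], [2.05, 79.5, 5.1], [1.98, 80.4, 4.9], [2.02, 80.1, 5.0],
    [1.70, 74.0, 6.4], [1.66, 73.2, 6.6], [1.72, 74.5, 6.3], [1.68, 73.8, 6.5],
]


def standardize(rows):
    """Scale every column to zero mean and unit (population) variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def correlation_matrix(z_rows):
    """Pearson correlations between columns of already standardized data."""
    n, p = len(z_rows), len(z_rows[0])
    return [[sum(r[a] * r[b] for r in z_rows) / n for b in range(p)]
            for a in range(p)]


def hierarchical_clusters(rows, n_clusters):
    """Agglomerative clustering, average linkage on Euclidean distance."""
    clusters = [[i] for i in range(len(rows))]

    def linkage(a, b):
        return mean(math.dist(rows[i], rows[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest average dissimilarity.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


def first_principal_component(matrix, iterations=200):
    """Leading eigenvector of a symmetric matrix via power iteration."""
    p = len(matrix)
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


z = standardize(BATCHES)
corr = correlation_matrix(z)                      # variable-to-variable relationships
phases = hierarchical_clusters(z, n_clusters=2)   # groups of similar batches
loadings = first_principal_component(corr)        # variables driving the main variation

print("correlation matrix:", corr)
print("batch groups:", phases)
print("PC1 loadings:", loadings)
```

In this toy setting the batch groups play the role of process phases, and the signs and sizes of the PC1 loadings indicate which variables change together between them. On real plant data, established libraries such as SciPy or scikit-learn would normally replace these hand-written routines.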