(584c) The Use of Hybrid Modeling Schemes in the Development of a Probabilistic Condition Monitoring System for a Continuous Drug Product Manufacturing Process | AIChE

Authors 

Lagare, R. - Presenter, Purdue University
Nagy, Z., Purdue University
Reklaitis, G., Purdue University
Sheriff, M. Z., Purdue University
Industry 4.0 concepts have the potential to address problems in manufacturing operations and development. By seamlessly integrating separate manufacturing systems, Industry 4.0 enables a “smart factory” that is lean, agile, and flexible (Steinwandter, Borchert, and Herwig 2019; Barenji et al. 2019). Such a factory features a centralized data collection system, paving the way for more holistic data analysis. Machine learning models have played a significant role in taking advantage of this trend, allowing industries to derive information and knowledge from massive amounts of data (Qin 2014).

While the same can be said about Pharma 4.0 (Industry 4.0 for the pharmaceutical industry), unique challenges in pharmaceutical manufacturing, especially in drug development, make these trends difficult to adopt. The drug development period is regarded as the critical factor in the success of a drug product (Sugiyama et al. 2019). It is also during this period that data scarcity becomes an issue: producing the relevant drug substance and its derivatives in large amounts is typically very expensive, which limits the material available for data generation. At the same time, models for advanced process control and condition monitoring must be developed and shown to be effective for quality control and assurance. This is especially true for continuous manufacturing systems, whose practicability depends on the timely detection of faults and application of intervention measures (Schenkendorf 2016).

This presentation features a case study in which a probabilistic condition monitoring model for dry granulation was developed under scarce-data conditions. A limited amount of data was used to estimate the parameters of a flowsheet model, which serves as a digital twin of the process. This virtual representation of the process is then used to generate data for developing the probabilistic condition monitoring model, a task that can be more data-intensive.
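The two-stage workflow described above can be illustrated with a deliberately small sketch. It is not the flowsheet model used in the study: a straight-line surrogate stands in for the digital twin, and the data values, noise level, fault shift, and the function names `fit_line` and `simulate` are all hypothetical choices made for illustration.

```python
import random

def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a + b * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def simulate(a, b, n, noise_sd, fault_shift, rng):
    """Generate n labeled synthetic runs from the fitted surrogate."""
    rows = []
    for _ in range(n):
        x = rng.uniform(2.0, 8.0)            # hypothetical process input
        faulty = rng.random() < 0.5          # hypothetical fault frequency
        y = a + b * x + rng.gauss(0.0, noise_sd)
        if faulty:
            y += fault_shift                 # hypothetical fault signature
        rows.append((x, y, faulty))
    return rows

# Stage 1: estimate parameters from a few (made-up) experimental points.
xs = [2.0, 4.0, 6.0, 8.0]
ys = [1.1, 1.9, 3.1, 3.9]
a, b = fit_line(xs, ys)

# Stage 2: use the fitted surrogate to generate a larger training set.
rng = random.Random(0)
synthetic = simulate(a, b, 1000, noise_sd=0.1, fault_shift=0.5, rng=rng)
```

In this sketch four measurements are enough to fix the two surrogate parameters, while the monitoring model downstream can be trained on the 1000 labeled synthetic runs.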

Digital twins have been shown to simulate particulate systems effectively. Figure 1 demonstrates the close match between experimental data from an impact pin mill and the predictions of its digital twin, which was developed using a multiscale modeling approach (Wang et al. 2021).

Figure 1. Particle Size Measurements from an Impact Pin Mill versus the Predictions of its Digital Twin

Models like these are suitable for process prediction because they are based on first-principles approaches such as population balances (Chaudhury et al. 2016) and are fitted with actual operating data. For the dry granulation method of tablet manufacturing, estimating the particle size distribution is of particular interest because it is directly related to the flowability and compressibility of the particles. These two properties are important because they are Critical Material Attributes (CMA) (Maguire and Peng 2015) for maintaining product quality in the Tablet Press (TP). They are also, by default, considered the critical-to-quality attributes of the Roller Compactor (RC), since the RC directly precedes the TP in the dry granulation line.
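To give a sense of how a population balance tracks a particle size distribution, the following is a minimal sketch of a discretized breakage balance. It is not the multiscale model of Wang et al. (2021): the four size classes, the breakage rates, the equal split of fragments among finer classes, and the function name `breakage_step` are all assumptions made for illustration.

```python
def breakage_step(mass, rate, dt):
    """One explicit Euler step of a discretized breakage balance.

    mass[i] is the mass in size class i (index 0 is the finest class).
    Class j loses mass at rate[j]; the broken mass is split equally among
    all finer classes (an assumed, simplistic breakage distribution).
    """
    new = list(mass)
    for j in range(1, len(mass)):
        broken = rate[j] * mass[j] * dt
        new[j] -= broken
        share = broken / j
        for i in range(j):
            new[i] += share
    return new

# All mass starts in the coarsest class; rates are hypothetical (1/s).
mass = [0.0, 0.0, 0.0, 1.0]
rate = [0.0, 0.1, 0.2, 0.4]
for _ in range(100):
    mass = breakage_step(mass, rate, dt=0.05)
```

Because every unit of mass removed from a class is redistributed to finer classes, total mass is conserved at each step, which is a useful sanity check for any such model before it is fitted to milling or compaction data.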

The identification of Quality-by-Design (QbD) parameters such as (Intermediate) Critical Quality Attributes (CQA) and Critical Process Parameters (CPP) proved advantageous because it lends itself to Model-based Machine Learning methods (Bishop 2013). These methods use graphical models based on process understanding to develop bespoke machine learning algorithms. An example is shown in Figure 2, which represents the relationships among the condition of the process (variable A), the conditions of the two sensors (variables B and E), and the readings from the two sensors. Such graphs are called directed acyclic graphs (Koller and Friedman 2009), and their nodes are random variables, each with its own probability distribution.

Figure 2. Directed Acyclic Graph (Bayesian Network) of a Process with Two Sensors
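A network of this shape is small enough to evaluate by brute-force enumeration, which may help make the idea concrete. The sketch below assumes binary variables, calls the two sensor readings C and D (names not given in the abstract), and uses made-up probabilities; it is plain Python rather than a probabilistic programming framework.

```python
from itertools import product

P_FAULT = 0.10    # assumed prior probability that the process (A) is faulty
P_BROKEN = 0.05   # assumed prior probability that a sensor (B or E) is broken

def prior(flag, p_true):
    """Probability of a binary variable taking the value `flag`."""
    return p_true if flag else 1.0 - p_true

def p_reading(alarm, fault, broken):
    """Assumed conditional probability of a sensor reading.

    A healthy sensor mostly reports the true process state; a broken
    sensor alarms at random.
    """
    if broken:
        p_alarm = 0.5
    else:
        p_alarm = 0.95 if fault else 0.05
    return p_alarm if alarm else 1.0 - p_alarm

def posterior_fault(c_alarm, d_alarm):
    """P(A = faulty | readings C and D), summing out sensor health B, E."""
    weights = {True: 0.0, False: 0.0}
    for a, b, e in product([True, False], repeat=3):
        joint = (prior(a, P_FAULT) * prior(b, P_BROKEN) * prior(e, P_BROKEN)
                 * p_reading(c_alarm, a, b) * p_reading(d_alarm, a, e))
        weights[a] += joint
    return weights[True] / (weights[True] + weights[False])

both_alarm = posterior_fault(True, True)
conflicting = posterior_fault(True, False)
```

With these assumed numbers, two agreeing alarms raise the fault probability well above the 10% prior, while conflicting readings from two identical sensors cancel out and leave it at the prior.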

This graphical model can be converted to a factor graph, which adds factor nodes between connected random variables to represent their mathematical relationships. Factor graphs can be readily programmed using a probabilistic programming framework such as Infer.NET (Minka et al. 2018), where machine learning is implemented using a Bayesian approach. The random variables of interest (i.e., the parameters to be predicted) are assigned distributions called priors, and these are updated using data observed for other variables (i.e., the input variables, a.k.a. features). Using message-passing algorithms, the distributions of the unobserved random variables of interest are “inferred”, producing posterior distributions that update the priors (Bishop 2013). This is a different paradigm from traditional machine learning, where parameters are assigned point estimates whose values are optimized by minimizing a loss function that reflects the difference between predicted and observed values (Bishop 2013; Martin 2018). An example of a machine learning algorithm developed using probabilistic programming is shown in Figure 3, where process information from the Roller Compactor was used to predict faults.

Figure 3. Probabilistic Graphical Model of a Roller Compactor Fault Detection Tool
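The contrast between updating a prior and fitting a point estimate can be seen in the simplest conjugate case. The sketch below is an illustration only, not the Infer.NET model from this work: it assumes a single unknown fault probability with a Beta prior and hypothetical counts of faulty and normal runs.

```python
def beta_update(alpha, beta, faults, normals):
    """Posterior Beta parameters after observing binary outcomes."""
    return alpha + faults, beta + normals

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Bayesian route: start from a uniform Beta(1, 1) prior and update it.
prior_alpha, prior_beta = 1.0, 1.0
faults, normals = 2, 8                      # hypothetical observations
post_alpha, post_beta = beta_update(prior_alpha, prior_beta, faults, normals)
posterior_mean = beta_mean(post_alpha, post_beta)

# Point-estimate route: the maximum-likelihood fault fraction.
point_estimate = faults / (faults + normals)
```

The point estimate is a single number, whereas the Bayesian route returns a full Beta(3, 9) distribution whose spread can be inspected and which can serve as the prior for the next batch of data.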

The key advantage of the probabilistic approach is the natural quantification of uncertainty. Each prediction made by the model has an associated probability that reflects the certainty of that prediction. From an application perspective, this gives the operator the power to make informed decisions on whether to trust the predictions made by the system. From a model development perspective, this facilitates model comparison and helps the modeler ascertain whether the training data are sufficient for creating a reliable fault detection tool. Decisions about designing experiments to produce training data can now be based on quantifiable properties, placing process engineers in a position to create a better business case for justifying the cost of acquiring (or not acquiring) additional training data.
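One way such a quantifiable property might feed a data-acquisition decision is sketched below. It assumes a single fault probability with a uniform Beta(1, 1) prior, an expected fault fraction, and a target posterior standard deviation; these quantities and the function name `runs_needed` are hypothetical and do not come from the study.

```python
def beta_sd(alpha, beta):
    """Standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return (alpha * beta / (total ** 2 * (total + 1))) ** 0.5

def runs_needed(fault_fraction, target_sd, max_runs=10_000):
    """Smallest number of runs whose posterior sd is at most target_sd.

    Assumes a uniform Beta(1, 1) prior and that the stated fraction of
    runs turns out faulty. Returns None if max_runs is not enough.
    """
    for n in range(1, max_runs + 1):
        faults = fault_fraction * n
        sd = beta_sd(1.0 + faults, 1.0 + n - faults)
        if sd <= target_sd:
            return n
    return None

loose = runs_needed(0.2, target_sd=0.05)
tight = runs_needed(0.2, target_sd=0.02)
```

Comparing `loose` and `tight` shows how many additional runs a tighter uncertainty target would cost, which is the kind of figure that can be weighed against the expense of producing more material.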


References

Barenji, Reza Vatankhah, Yagmur Akdag, Barbaros Yet, and Levent Oner. 2019. “Cyber-Physical-Based PAT (CPbPAT) Framework for Pharma 4.0.” International Journal of Pharmaceutics 567 (August): 118445.

Bishop, Christopher M. 2013. “Model-Based Machine Learning.” Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences 371 (1984): 20120222.

Chaudhury, Anwesha, Maitraye Sen, Dana Barrasso, and Rohit Ramachandran. 2016. “Population Balance Models for Pharmaceutical Processes.” In Process Simulation and Data Modeling in Solid Oral Drug Development and Manufacture, edited by Marianthi G. Ierapetritou and Rohit Ramachandran, 43–83. New York, NY: Springer New York.

Koller, Daphne, and Nir Friedman. 2009. Probabilistic Graphical Models: Principles and Techniques. MIT Press.

Maguire, Jennifer, and Daniel Peng. 2015. “How to Identify Critical Quality Attributes and Critical Process Parameters.” Presented at the 2nd FDA/PQRI Conference, North Bethesda, Maryland. https://pqri.org/wp-content/uploads/2015/10/01-How-to-identify-CQA-CPP-CMA-Final.pdf.

Martin, Osvaldo. 2018. Bayesian Analysis with Python: Introduction to Statistical Modeling and Probabilistic Programming Using PyMC3 and ArviZ, 2nd Edition. Packt Publishing Ltd.

Minka, T., J. Winn, J. Guiver, Y. Zaykov, D. Fabian, and J. Bronskill. 2018. “Infer.NET 0.3.” Microsoft Research Cambridge.

Qin, S. Joe. 2014. “Process Data Analytics in the Era of Big Data.” AIChE Journal 60 (9): 3092–3100.

Schenkendorf, R. 2016. “Supporting the Shift towards Continuous Pharmaceutical Manufacturing by Condition Monitoring.” In 2016 3rd Conference on Control and Fault-Tolerant Systems (SysTol), 593–98.

Steinwandter, Valentin, Daniel Borchert, and Christoph Herwig. 2019. “Data Science Tools and Applications on the Way to Pharma 4.0.” Drug Discovery Today 24 (9): 1795–1805.

Sugiyama, Hirokazu, Yusuke Morikawa, Mai Matsuura, and Menghe Xu. 2019. “Relevance of Regulatory Constraints in Designing Pharmaceutical Manufacturing Processes: A Case Study on Waste Solvent Recovery.” Sustainable Production and Consumption 17 (January): 136–47.

Wang, Li Ge, Ruihuan Ge, Xizhong Chen, Rongxin Zhou, and Han-Mei Chen. 2021. “Multiscale Digital Twin for Particle Breakage in Milling: From Nanoindentation to Population Balance Model.” Powder Technology. https://doi.org/10.1016/j.powtec.2021.03.005.
