(697e) A Framework for in-Silico Formulation Design and Optimization Using Multivariate Latent Variable Regression Methods | AIChE



Polizzi, M. - Presenter, Pfizer Worldwide Research and Development

Quality by Design is a technical exercise that should be performed upstream in the design of the production train for a given product. Although each processing step has its own inputs and outputs that affect product quality, raw material properties are clearly important for the development of any process. This work addresses the first step in the propagation of raw material effects by estimating the properties of the initial blend of powders for an oral dosage form.

Many methods for mixture modeling and prediction can be found in the literature. However, these are typically "black-box" approaches (e.g. neural networks) that are not interpretable: they provide no fundamental understanding of the relationships between the properties of the raw materials and the quality attributes of the end product (Takayama et al. 2003). These fitted models are also known to be unstable upon inversion (Turner & Guiver 2005). These limitations can be overcome with a multivariate latent variable method. The parameters of the fitted model show the mechanisms through which predictions are made, so the model can be validated by its adherence to fundamental laws of physics and chemistry rather than solely on its statistical properties. These models can also be readily inverted (Jaeckle & MacGregor 1998), which enables their use within an optimization framework where blends are designed from a set of desired formulation properties.
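To make the idea of inversion concrete, the following is a minimal sketch, not the fitted model from this work. It assumes a linear latent variable model in which both the inputs and the quality attributes are reconstructed from the same scores, x ≈ P t and y ≈ Q t. The loading matrices `P` and `Q`, their values, and the function names are invented for illustration, and centering and scaling are omitted.

```python
import numpy as np

# Hypothetical loadings of an already fitted latent variable model (invented values).
# P maps 2 latent scores to 3 input properties; Q maps them to 2 quality attributes.
P = np.array([[0.8, 0.1],
              [0.5, -0.6],
              [0.3, 0.8]])
Q = np.array([[1.2, 0.4],
              [-0.3, 0.9]])


def predict_quality(t):
    """Forward direction: quality attributes implied by latent scores t."""
    return Q @ t


def invert_model(y_desired):
    """Inverse direction: scores, and then inputs, that reproduce y_desired.

    The scores are the least-squares solution of Q t = y_desired; the inputs
    are reconstructed from those scores through the x-loadings.
    """
    t, *_ = np.linalg.lstsq(Q, y_desired, rcond=None)
    return t, P @ t


y_desired = np.array([1.0, 0.5])
t_design, x_design = invert_model(y_desired)
```

When the model has more latent variables than quality attributes, the solution for the scores is not unique; `np.linalg.lstsq` then returns the minimum-norm solution, which is only one of many candidates that an optimization step could choose among.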

The use of multivariate regression models to handle the properties of mixtures has been widely addressed in the literature. The most recent work (Muteki & MacGregor 2007) proposes the LPLS model to handle such data structures. Although LPLS is currently the only method that provides loading coefficients for interpreting the effect of raw material properties on product quality, it cannot use data on a completely new material that has not previously been used in a formulation. This work addresses this limitation by introducing a new method in which the properties of the raw materials are reduced by PCA before forming the L-shaped data block to be analyzed. The predictive performance of the proposed method is compared with that of LPLS and the ideal mixture approach using data collected throughout 10 years of oral dosage development. After model cross-validation, the framework was further validated by comparing its predictions against data acquired in the laboratory. The performance of the blend across multiple compact mechanical and powder attributes was correctly classified within previously established performance categories.
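The reason a PCA step lets a previously unused material enter the model can be sketched as follows. This is an illustration of the general idea only, not the method as implemented in this work: the property table, the new material, and the mass fractions are invented, and combining material scores by mass-fraction weighting is an assumption borrowed from the ideal mixture rule.

```python
import numpy as np

# Hypothetical raw-material property table (rows: materials, columns: properties).
X = np.array([[0.45, 120.0, 1.8],
              [0.60, 80.0, 2.5],
              [0.30, 200.0, 1.2],
              [0.52, 150.0, 2.0]])


def fit_pca(X, n_components):
    """Autoscale X and return the mean, scale, and loadings of a PCA model."""
    mean = X.mean(axis=0)
    scale = X.std(axis=0, ddof=1)
    Z = (X - mean) / scale
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T  # shape: properties x components
    return mean, scale, loadings


def project(x, mean, scale, loadings):
    """Scores of one or more materials in the fitted PCA space."""
    return ((x - mean) / scale) @ loadings


def blend_scores(fractions, material_scores):
    """Mass-fraction-weighted scores of a blend (assumed ideal-mixture weighting)."""
    fractions = np.asarray(fractions, dtype=float)
    if not np.isclose(fractions.sum(), 1.0):
        raise ValueError("mass fractions must sum to 1")
    return fractions @ material_scores


mean, scale, loadings = fit_pca(X, n_components=2)
T = project(X, mean, scale, loadings)

# A material never used in a formulation needs only its measured properties:
x_new = np.array([0.40, 170.0, 1.5])
t_new = project(x_new, mean, scale, loadings)

T_all = np.vstack([T, t_new])
blend = blend_scores([0.2, 0.3, 0.0, 0.1, 0.4], T_all)
```

Because the new material is described by its scores rather than by its own column in the formulation history, a blend containing it can be represented in the same low-dimensional space as the blends used to fit the model.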

This predictive tool has been used for pre-screening potential formulations that lie within the range of the data used to fit the models. It has the potential to reduce the number of experiments needed for process development by pre-selecting formulations that have desirable properties. The use of the framework is illustrated by estimating the blend properties of a new API. Due to the high dose requirement (600 mgA), the formulation needed to contain the highest API loading possible in order to produce a tablet of an acceptable size. The proposed model was used to predict blend properties at several different API loadings to determine the upper limit of API loading, beyond which the blend no longer exhibited acceptable mechanical and flow characteristics.
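The loading scan described above can be sketched as a simple loop over candidate loadings. The predictor below is a toy stand-in for the fitted model, and the attribute names, coefficients, and acceptance limits are all invented for illustration; none of them come from this work.

```python
def max_acceptable_loading(predict, loadings, is_acceptable):
    """Return the highest loading reached before the first unacceptable blend.

    Loadings are scanned in increasing order and the scan stops at the first
    failure. Returns None if even the lowest loading is unacceptable.
    """
    best = None
    for loading in sorted(loadings):
        if not is_acceptable(predict(loading)):
            break
        best = loading
    return best


def toy_predict(api_fraction):
    """Hypothetical stand-in for the fitted model (invented linear trends)."""
    return {
        "tensile_strength": 2.5 - 2.0 * api_fraction,
        "flow_index": 10.0 - 8.0 * api_fraction,
    }


def toy_is_acceptable(props):
    """Hypothetical acceptance limits for the two invented attributes."""
    return props["tensile_strength"] >= 1.6 and props["flow_index"] >= 4.0


candidate_loadings = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
upper_limit = max_acceptable_loading(toy_predict, candidate_loadings, toy_is_acceptable)
```

Stopping at the first failure assumes that blend performance degrades monotonically as loading increases; if that does not hold, every candidate loading would need to be evaluated.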


Jaeckle, C. M. & MacGregor, J. F. 1998, "Product Design Through Multivariate Statistical Analysis of Process Data", AIChE Journal, vol. 44, no. 5, pp. 1105-1118.

Muteki, K. & MacGregor, J. F. 2007, "Multi-block PLS modeling for L-shape data structures with applications to mixture modeling", Chemometrics and Intelligent Laboratory Systems, vol. 85, pp. 186-194.

Takayama, K., Fujikawa, M., Obata, Y., & Morishita, M. 2003, "Neural network based optimization of drug formulations", Advanced Drug Delivery Reviews, vol. 55, no. 9, pp. 1217-1231.

Turner, P. & Guiver, J. 2005, "Introducing the bounded derivative network--superceding the application of neural networks in control", Journal of Process Control, vol. 15, no. 4, pp. 407-415.