(192ad) QSARs for Predicting Adipose:Blood Partitioning of Industrial Chemicals

Papadaki, K., Aristotle University of Thessaloniki
Karakitsios, S., Aristotle University of Thessaloniki
Sarigiannis, D., Aristotle University of Thessaloniki
In recent years, there has been increasing interest in the development of Physiologically Based Toxicokinetic (PBTK) models, which provide quantitative descriptions of the Absorption, Distribution, Metabolism and Excretion (ADME) of environmental or pharmaceutical chemicals. However, their application in toxicity testing and health risk assessment is limited by the lack of the input parameters they require, especially when the models comprise comprehensive mathematical descriptions of human physiology. Proper parameterization of PBTK models can be achieved using advanced Quantitative Structure-Activity Relationships (QSARs). QSARs are widely used to estimate the physicochemical and biochemical properties and the biological effects of “data poor” chemical compounds, as well as to understand the physicochemical features governing a biological response. These relationships take the form of regression or classification models that connect the biological effects of a chemical compound with its chemistry. Each comprises three elements: the activity data to be modeled, the data with which to model them, and a method to formulate the model.

Several approaches incorporating QSARs have been proposed for the prediction of partition coefficients for PBTK modeling, including

(a) an algorithm based on the fractional content of cells and interstitial fluid in tissue, of plasma and erythrocytes in blood, and of tissue lipids, together with the lipophilicity of the chemical compounds, and

(b) the Linear Free Energy Relationship (LFER) proposed by Abraham and co-workers for estimating biological properties, which takes into account excess molar refractivity, polarizability, solute effective or summation hydrogen-bond acidity and basicity, and the McGowan characteristic volume. The latter is a measure of the lipophilicity of chemical compounds.
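For orientation, the Abraham LFER in approach (b) is commonly written in the general form below. The symbols follow the usual Abraham notation and are not defined in this abstract, so they should be read as an assumption about the notation rather than as the fitted model of this study. The lowercase coefficients are regression constants specific to the system being modeled.

```latex
\log SP = c + e\,E + s\,S + a\,A + b\,B + v\,V
```

Here SP is the modeled property (in this context the adipose/blood partition coefficient), E is the excess molar refractivity, S the dipolarity/polarizability, A and B the summation hydrogen-bond acidity and basicity, and V the McGowan characteristic volume.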

The adipose/blood partition coefficient is considered one of the most significant input parameters of PBTK models. The partitioning of chemical compounds between adipose tissue and blood provides information on the distribution and toxicological effects of these substances.

The methodological approach presented in this study is based on the development of QSAR models to predict the adipose/blood partition coefficient of environmental chemical compounds. The input data used to train the models consisted of the experimental values of the adipose/blood partition coefficient and two sets of molecular descriptors for the 67 environmental chemicals of the initial set: (a) the descriptors from the Linear Free Energy Relationship (LFER) and (b) the PaDEL descriptors. The PaDEL dataset included 1D, 2D and 3D descriptors, which relate to the molecular structure of the chemicals and are characterized as constitutional, topological, geometrical or electronic. The derived molecular descriptors were then reduced by removing the semi-constant (>80%) and intercorrelated (>95%) ones. Principal Component Analysis (PCA) was used for further reduction of the PaDEL inputs, as well as for the categorization of the chemical compounds.
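The descriptor reduction described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' procedure: it assumes that "semi-constant (>80%)" means one value is shared by more than 80% of the chemicals, that "intercorrelated (>95%)" means an absolute pairwise Pearson correlation above 0.95, and that PCA is run on standardized descriptors. The function names and the random data are hypothetical.

```python
import numpy as np

def drop_semi_constant(X, max_share=0.80):
    """Indices of columns whose most frequent value covers at most max_share of rows."""
    keep = []
    for j in range(X.shape[1]):
        _, counts = np.unique(X[:, j], return_counts=True)
        if counts.max() / X.shape[0] <= max_share:
            keep.append(j)
    return keep

def drop_intercorrelated(X, max_abs_r=0.95):
    """Greedy filter: keep a column only if |r| with every kept column is <= max_abs_r."""
    corr = np.abs(np.atleast_2d(np.corrcoef(X, rowvar=False)))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= max_abs_r for k in keep):
            keep.append(j)
    return keep

def pca_scores(X, n_components=2):
    """PCA via SVD of the standardized matrix; returns scores and explained-variance ratios."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return Z @ Vt[:n_components].T, explained[:n_components]

# Synthetic stand-in for a 67-chemical descriptor table (hypothetical values).
rng = np.random.default_rng(0)
X = rng.normal(size=(67, 5))
X[:, 1] = 2.0 * X[:, 0] + 0.01 * rng.normal(size=67)  # nearly duplicates column 0
X[:, 2] = 0.0
X[:7, 2] = rng.normal(size=7)                         # zero for 60 of 67 rows

keep1 = drop_semi_constant(X)                # indices into X
keep2 = drop_intercorrelated(X[:, keep1])    # indices into the filtered matrix
selected = [keep1[k] for k in keep2]         # map back to original column indices

scores, explained = pca_scores(X[:, selected], n_components=2)
print("retained descriptor columns:", selected)
print("PCA score matrix shape:", scores.shape)
```

The greedy correlation filter keeps the first descriptor of each correlated group, so its result depends on column order; other tie-breaking rules (for example keeping the descriptor most correlated with the response) are equally plausible.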

The datasets were randomly divided into a training set containing 70% of the total data, and a validation set and a test set, each representing 15% of the total. They were then analysed using two statistical methods: Genetic Algorithm based Multiple Linear Regression (GA-MLR) and Artificial Neural Networks (ANN). The GA was used to select the optimal set of descriptors for the models. The GA-MLR technique was implemented in the QSARINS software (Gramatica, Chirico et al. 2013; Gramatica, Cassani et al. 2014), while the ANN technique was implemented in MATLAB® (version R2016a, MathWorks Inc.) using the Neural Network Toolbox.
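A random 70/15/15 partition of this kind can be sketched as below. The study itself used QSARINS and MATLAB, so this Python version is only an illustration; the rounding rule, the seed and the function name `split_indices` are assumptions.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle sample indices and cut them into training, validation and test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_frac * n_samples))
    n_val = int(round(val_frac * n_samples))
    # Whatever remains after the training and validation cuts becomes the test set.
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(67)
print(len(train_idx), len(val_idx), len(test_idx))
```

With 67 chemicals and this rounding rule the three sets contain 47, 10 and 10 compounds. Because a single random split of such a small set can be unrepresentative, repeating the split with several seeds is a common sanity check.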

The models developed with the LFER and PaDEL descriptors, coupled with ANN, performed very well. The fitting performance (R²) of the models using LFER and PaDEL descriptors was 0.94 and 0.96, respectively. Cross Validation (CV) indicated that the predictive performance of the models (Q²cv) was 0.96 and 0.95, while the external validation value (R²ext) was 0.98 and 0.94 for the LFER and PaDEL models, respectively. The Applicability Domain (AD), which defines the limitations regarding the structural and target domains of the best-performing models, was determined using several approaches: bounding box, bounding box on PCs, convex hull, leverage, distance to centroid, k Nearest Neighbors (kNN) with fixed k, kNN with variable k, and Probability Density Function (PDF) based methods. Finally, the developed models were applied to a large number of chemical compounds with unknown values of the adipose/blood partition coefficient.
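Of the applicability domain approaches listed above, the leverage method is the simplest to illustrate. The sketch below assumes the conventional formulation used in QSAR practice: leverage taken from the hat matrix of the training descriptors with an intercept column, and a warning threshold h* = 3(p + 1)/n for p descriptors and n training compounds. The abstract does not state which threshold was used, and the data and function names here are hypothetical.

```python
import numpy as np

def leverages(X_train, X_query):
    """h_i = x_i^T (X^T X)^-1 x_i, with an intercept column prepended to both matrices."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    Q = np.column_stack([np.ones(len(X_query)), X_query])
    XtX_inv = np.linalg.pinv(A.T @ A)
    # Row-wise quadratic form q_i^T (A^T A)^-1 q_i for every query compound.
    return np.einsum("ij,jk,ik->i", Q, XtX_inv, Q)

def warning_leverage(n_train, n_descriptors):
    """Conventional cutoff h* = 3 (p + 1) / n (assumed, not taken from the abstract)."""
    return 3.0 * (n_descriptors + 1) / n_train

# Synthetic training descriptors: 47 compounds, 5 descriptors (hypothetical values).
rng = np.random.default_rng(1)
X_train = rng.normal(size=(47, 5))

# Two query compounds: one at the training centroid, one far from all training data.
X_query = np.vstack([X_train.mean(axis=0), np.full(5, 10.0)])

h = leverages(X_train, X_query)
h_star = warning_leverage(*X_train.shape)
inside_domain = h <= h_star
print("leverages:", h, "threshold:", h_star, "inside AD:", inside_domain)
```

A query compound whose leverage exceeds h* lies far from the descriptor space covered by the training set, so its prediction is an extrapolation and should be treated with caution.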

The proposed models for the estimation of adipose/blood partition coefficients were checked for their fitting performance, validity and applicability. They were found to be stable, reliable and capable of predicting physicochemical parameters of “data poor” chemical compounds that fall within the applicability domain. The developed predictive models could serve as a tool to fill data gaps for environmental chemicals with unknown values of adipose/blood partitioning. In this way, animal testing could be reduced significantly and the wider use of PBTK models could be reinforced. Finally, the “safe-by-design” concept for environmental chemicals is supported: predicting toxicokinetic behavior from molecular parameters promotes green chemistry and reduces the cost of product development.