(370f) Quantitative Evaluation of Grouping Complex Substances and Sorbent Material Design: Delivering Data-Informed Insights | AIChE


Onel, M., Texas A&M Energy Institute, Texas A&M University
Zhou, L., Texas A&M University
Wright, F. A., North Carolina State University
Phillips, T. D., Texas A&M University
Rusyn, I., Texas A&M University
Pistikopoulos, E., Texas A&M Energy Institute, Texas A&M University
The ultimate goal of the Texas A&M Superfund program is to develop comprehensive tools and models for addressing exposure to chemical mixtures during environmental emergency-related contamination events [1]. Rapid and accurate analysis of exposure to complex mixtures is critical during such events (e.g., hurricanes), yet it remains challenging. The inherent complexity of chemical composition and the dynamic exposure pathways hinder detailed characterization of these substances, and thus the identification of the risks they pose to environmental health. There is therefore an urgent need to determine the optimal grouping of chemical mixtures based on their chemical characteristics, both to facilitate comparative assessment of their human health impacts through read-across [2] and to guide the selection of sorption materials so that the adverse health effects of each group are mitigated.

In this work, we design a framework to (i) optimally group complex chemical substances based on their chemical characteristics in order to facilitate decision-making by read-across [3], and (ii) predict the sorption activity of broad-acting materials for different chemical groups via regression techniques [4]. For grouping, we apply hierarchical clustering with Pearson correlation as the similarity metric and build classification models with the Random Forest algorithm. We use analytical chemistry data for 60 Standard Reference Materials (SRMs) provided by the National Institute of Standards and Technology (NIST) [5] and for 15 complex chemical substances of Unknown or Variable composition, Complex reaction products, and Biological materials (UVCBs), where Gas Chromatography – Mass Spectrometry (GC-MS), two-dimensional gas chromatography with flame ionization detector (GCxGC-FID), and Ion Mobility – Mass Spectrometry (IM-MS) techniques [6] are adopted. Dimensionality reduction techniques are incorporated to select the most informative features and further improve the grouping results, which are quantified by the Fowlkes-Mallows (FM) index [7] and classification accuracy. Selecting the optimal sorption material for a given chemical mixture is itself a challenging task, in which the chemical-sorbent property space must be explored iteratively to fine-tune and guide the experimental designs. We therefore perform predictive modeling of the sorption activity of materials via advanced regression techniques [8-9]. Our results demonstrate that modeling and data-driven optimization analysis greatly facilitate the communication of complex substance groupings for read-across, and thus decision-making in designing solutions for the community during environmental emergency-related contamination events.
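
The grouping step and its evaluation can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes average linkage, a distance of 1 minus the Pearson correlation, and a small hypothetical set of four-feature "spectra" with known group labels, and all function names are illustrative. It uses only the Python standard library; in practice, SciPy and scikit-learn provide equivalent routines. The Random Forest classification and regression steps are not covered here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def hierarchical_clusters(samples, n_clusters):
    """Average-linkage agglomerative clustering with distance 1 - Pearson r.

    Returns one integer cluster label per sample.
    """
    n = len(samples)
    dist = [[1.0 - pearson(samples[i], samples[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]  # start with one cluster per sample
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest mean pairwise distance.
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
            d /= len(clusters[a]) * len(clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a (a < b)
        del clusters[b]
    labels = [0] * n
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels


def fowlkes_mallows(true_labels, pred_labels):
    """FM index: TP / sqrt((TP + FP) * (TP + FN)), counted over sample pairs."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        tp += same_true and same_pred
        fp += same_pred and not same_true
        fn += same_true and not same_pred
    if tp == 0:
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))


# Hypothetical toy data: two groups with distinct peak patterns.
samples = [
    [9.0, 1.0, 0.5, 0.2],
    [8.5, 1.2, 0.4, 0.3],
    [9.2, 0.8, 0.6, 0.1],
    [0.3, 0.5, 8.8, 1.1],
    [0.2, 0.6, 9.1, 0.9],
    [0.4, 0.4, 8.6, 1.3],
]
known_groups = [0, 0, 0, 1, 1, 1]

predicted = hierarchical_clusters(samples, n_clusters=2)
score = fowlkes_mallows(known_groups, predicted)
print(predicted, score)
```

An FM index of 1 means the clustering reproduces the known groups exactly, and values closer to 0 indicate poorer agreement. Because the index is computed over pairs of samples, it does not depend on how the cluster labels are numbered.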


[1] TAMU Superfund Research Center (2019). https://superfund.tamu.edu/

[2] Schultz, T. W., Amcoff, P., Berggren, E., Gautier, F., Klaric, M., Knight, D. J., ... & Cronin, M. T. D. (2015). A strategy for structuring and reporting a read-across prediction of toxicity. Regulatory Toxicology and Pharmacology, 72(3), 586-601.

[3] Onel, M., Beykal, B., Ferguson, K., Chiu, W. A., McDonald, T. J., Zhou, L., House, J. S., Wright, F. A., Sheen, D. A., Rusyn, I., & Pistikopoulos, E. N. (2019). Quantitative Evaluation of Grouping Complex Substances using Analytical Chemistry Data: Delivering Data-Informed Insights. (In preparation).

[4] Onel, M., Beykal, B., Wang, M., Grimm, F. A., Zhou, L., Wright, F. A., Phillips, T. D., Rusyn, I., & Pistikopoulos, E. N. (2018). Optimal Chemical Grouping and Sorbent Material Design by Data Analysis, Modeling and Dimensionality Reduction Techniques. Computer Aided Chemical Engineering, Elsevier, 43, 421-426.

[5] de Carvalho Rocha, W. F., Schantz, M. M., Sheen, D. A., Chu, P. M., & Lippa, K. A. (2017). Unsupervised classification of petroleum Certified Reference Materials and other fuels by chemometric analysis of gas chromatography-mass spectrometry data. Fuel, 197, 248-258.

[6] Grimm, F. A., Russell, W. K., Luo, Y. S., Iwata, Y., Chiu, W. A., Roy, T., ... & Rusyn, I. (2017). Grouping of petroleum substances as example UVCBs by ion mobility-mass spectrometry to enable chemical composition-based read-across. Environmental Science & Technology, 51(12), 7197-7207.

[7] Fowlkes, E. B., & Mallows, C. L. (1983). A method for comparing two hierarchical clusterings. Journal of the American Statistical Association, 78(383), 553-569.

[8] Boukouvala, F., & Floudas, C. A. (2017). ARGONAUT: AlgoRithms for Global Optimization of coNstrAined grey-box compUTational problems. Optimization Letters, 11(5), 895-913.

[9] Beykal, B., Boukouvala, F., Floudas, C. A., Sorek, N., Zalavadia, H., & Gildin, E. (2018). Global optimization of grey-box computational systems using surrogate functions and application to highly constrained oil-field operations. Computers & Chemical Engineering, 114, 99-110.