(218j) Developing a Defeatured Atom-Additive Model to Predict Single Component Partition Coefficients with FT-ICR MS Data | AIChE

Authors 

Kenney, D. - Presenter, Worcester Polytechnic Institute
LeClerc, H., Worcester Polytechnic Institute
Timko, M. T., Worcester Polytechnic Institute
Teixeira, A. R., Worcester Polytechnic Institute
Paffenroth, R., Worcester Polytechnic Institute
Octanol-water partition coefficients (K_ow) are useful for characterizing solute-solvent partitioning behavior because they indicate the lipophilic or hydrophilic nature of a compound. Single-component values can be measured experimentally, computed from first-principles (ab initio) thermodynamics or, more commonly, approximated by regression algorithms. However, current methods fall short for complex systems containing thousands of unique compounds, such as those present in oil spills and bio-oil production. Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) is a powerful analytical tool that identifies the molecular formulas and relative abundances of ions based on their excitation in a magnetic field.

In this work, we developed several machine-learned models (linear regression, random forest, gradient boosting, etc.) that predict single-component partition coefficients from the data available through FT-ICR MS. Using web scraping, we collected a database of 25,970 data points, covering 5,514 unique molecular formulas, along with their experimental partition coefficient values. We regressed the data using multiple techniques and found that partition coefficients can be estimated from minimal information. On an independent validation set of nearly 4,000 compounds, our model achieves a mean absolute error of 0.37. Combining this new regression algorithm with FT-ICR MS of complex oil-water systems provides insights into the molecular makeup and partitioning signatures of complex oils.
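
The abstract does not specify the model form, so the following is only a minimal sketch of what an atom-additive linear regression on molecular formulas could look like, not the authors' implementation. It assumes the only inputs are element counts parsed from a formula (the information FT-ICR MS provides), restricts the feature set to C, H, N, and O, and fits an intercept plus one contribution per element by ordinary least squares. The helper names (`formula_to_counts`, `featurize`, `fit_least_squares`) and the values in `TOY_CONTRIB` are hypothetical placeholders; the targets are synthetic and noise-free, generated only to show that the fit and the mean-absolute-error calculation behave as expected.

```python
import re
from collections import Counter

# Assumed feature set: counts of these elements, plus an intercept term.
ELEMENTS = ["C", "H", "N", "O"]

def formula_to_counts(formula):
    """Parse a simple formula such as 'C6H6O' into element counts.
    Brackets, charges, and isotopes are not handled in this sketch."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts

def featurize(formula):
    """Feature vector: [1 (intercept), nC, nH, nN, nO]."""
    counts = formula_to_counts(formula)
    return [1.0] + [float(counts.get(e, 0)) for e in ELEMENTS]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_least_squares(X, y):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y,
    solved by Gaussian elimination with partial pivoting."""
    n = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, n))) / A[i][i]
    return w

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Arbitrary placeholder contributions [intercept, C, H, N, O].
# These are NOT fitted or literature values; they only generate toy targets.
TOY_CONTRIB = [0.3, 0.4, 0.1, -1.0, -1.2]

train_formulas = ["C6H6", "CH4", "C6H14", "C2H6O", "C5H5N", "C6H6O", "C2H7N"]
valid_formulas = ["C3H8O", "C4H11N", "C7H8"]

# Synthetic, noise-free targets computed from the placeholder contributions.
X_train = [featurize(f) for f in train_formulas]
y_train = [dot(x, TOY_CONTRIB) for x in X_train]
X_valid = [featurize(f) for f in valid_formulas]
y_valid = [dot(x, TOY_CONTRIB) for x in X_valid]

weights = fit_least_squares(X_train, y_train)
mae = mean_absolute_error(y_valid, [dot(x, weights) for x in X_valid])
print("fitted contributions:", [round(w, 3) for w in weights])
print("validation MAE:", round(mae, 6))
```

Because the toy targets are exactly linear in the element counts, the fit recovers the placeholder contributions and the validation error is essentially zero; with real experimental values the same pipeline would yield a nonzero error, and nonlinear learners such as the random forest or gradient boosting models named above could be swapped in on the same feature vectors.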