Evaluating Polymer Stabilizer Performance Using Molecular Descriptors and Machine Learning on a Small Dataset | AIChE

Type: Conference Presentation
Conference Type: AIChE Annual Meeting
Presentation Date: November 8, 2021
Duration: 15 minutes
Skill Level: Intermediate
PDHs: 0.50

Mining experimental data from the literature is an important exercise for informing future experimental studies, even when the available data are sparse. However, extracting insights from small materials datasets, such as those found in research papers and patents, is inherently challenging because of their high complexity, high dimensionality, and heterogeneous reporting across sources. This case study shows how judicious molecular representation, feature importance analysis, and physicochemical interpretation were integrated to extract machine learning insights from a small dataset. Experimental data from a single patent were analyzed to learn which small-molecule additives were most effective in mitigating the degradation of poly(ethylene terephthalate) (PET). MACCS-166 and alvaDesc molecular descriptors were calculated for the dataset of 39 additive candidates, yielding feature sets of 166 and 1875 features, respectively. Performing k-means clustering with these molecular descriptors revealed evidence that performance differences were sensitive to variations in molecular structure. To pinpoint the features responsible for improved performance, a supervised reduced design region approach was applied to analyze descriptors both individually and in multiple dimensions, and to determine their effectiveness in a binary classification of high and low performance. Not only were the most influential descriptors justifiable with respect to degradation chemistry, but the selected features also trained random forest models with good cross-validated performance. In comparing molecular descriptor approaches, we find that judicious interpretation of the underlying physicochemical behavior is indispensable for validating the effectiveness of small-data machine learning, especially when prioritizing experimental work toward a richer dataset.
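As a rough illustration of the clustering step described in the abstract, the sketch below runs a plain k-means (Lloyd's algorithm) on binary fingerprints and then tallies how many high-performing candidates fall in each cluster. It is not the presenters' code. The fingerprints and the high/low labels are synthetic placeholders generated at random, the choice of two clusters is an assumption, and the helper names (`make_synthetic_fingerprints`, `kmeans`) are hypothetical. Only the dataset size (39 candidates) and the fingerprint length (166 bits, as in MACCS-166) come from the abstract.

```python
import random

N_CANDIDATES = 39   # dataset size quoted in the abstract
N_BITS = 166        # length of a MACCS-166 fingerprint


def make_synthetic_fingerprints(seed=0):
    """Build placeholder binary fingerprints (NOT real MACCS keys).

    Assumption for illustration: two structural families that differ in
    20 bits; the first family is arbitrarily labelled "high performance".
    """
    rng = random.Random(seed)
    fingerprints, is_high = [], []
    for i in range(N_CANDIDATES):
        family_a = i < 20
        bits = []
        for j in range(N_BITS):
            if j < 10:
                p = 0.9 if family_a else 0.1
            elif j < 20:
                p = 0.1 if family_a else 0.9
            else:
                p = 0.2  # background bits shared by both families
            bits.append(1 if rng.random() < p else 0)
        fingerprints.append(bits)
        is_high.append(family_a)
    return fingerprints, is_high


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


fingerprints, is_high = make_synthetic_fingerprints()
labels, centroids = kmeans(fingerprints, k=2)

# Compare cluster membership against the (synthetic) performance label.
for c in range(2):
    members = [h for h, lab in zip(is_high, labels) if lab == c]
    print(f"cluster {c}: {len(members)} candidates, {sum(members)} labelled high")
```

With real data, the placeholder generator would be replaced by descriptors computed from the additive structures, and the number of clusters would need to be chosen and justified rather than fixed at two.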
