(305f) Data Science Applications for Process Improvement at DuPont | AIChE

DuPont is a Fortune 500 company headquartered in Wilmington, DE (USA) that develops and manufactures materials for protective garments, construction, water purification, imaging, printing, microchip fabrication, and electronic devices. The company seeks to continually improve its manufacturing processes for greater efficiency and quality. One element of this effort has been growing a team to identify, develop, and deploy data science solutions at DuPont plants. This abstract summarizes lessons we've learned over the past five years and 15 projects.

The most foundational results of our experience are a deep familiarity with the data sources at our plants and a profile of valuable data science applications in manufacturing. A modern manufacturing plant has numerous software systems: a process historian (PH), an enterprise resource planning (ERP) system, a manufacturing execution system (MES), a distributed control system (DCS), and a laboratory information management system (LIMS). Our projects almost always begin with large-scale data queries, aggregations, and joins to compile a master data table. The project objective is often to train models for one or more product properties or process yields of relevance.
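As an illustration, compiling a master data table from two such systems often reduces to a keyed join. The sketch below uses pandas with hypothetical batch-level extracts; the column names and values are illustrative, not actual plant schemas.

```python
import pandas as pd

# Hypothetical extract from a process historian (PH): one row per batch.
historian = pd.DataFrame({
    "batch_id": ["B001", "B002", "B003"],
    "avg_reactor_temp_C": [182.5, 185.1, 179.8],
    "mix_speed_rpm": [120, 118, 125],
})

# Hypothetical extract from a LIMS: the measured product property.
lims = pd.DataFrame({
    "batch_id": ["B001", "B002", "B003"],
    "viscosity_cP": [420.0, 455.0, 398.0],
})

# Join on the shared batch identifier to build the master table that
# downstream property or yield models train on.
master = historian.merge(lims, on="batch_id", how="inner")
print(master.shape)  # (3, 4)
```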

The first class of opportunities we've identified is predictive modeling. Predictive modeling is best applied to a process step that requires a human decision. In this application, a process operator can view or trigger a prediction from a model to guide this process decision. She interfaces with the model through a web application with a graphical interface and a code-based back end that accesses live data from the manufacturing process and performs the model prediction. An example of our predictive modeling applications is a model to predict the weight of monomer required to produce the desired polymer viscosity in an imbalanced stoichiometric polymerization process. The weight of monomer required varies due to batch-to-batch variations in raw material quality and operator behavior. Process variables such as current viscosity, viscosity history within the batch, reactor temperature, and mixing speed all have predictive value for the viscosity response. Our models in this application have decreased the number of required monomer additions by 20% and batch times by 20%.
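A minimal sketch of the back end behind such an application, assuming synthetic data and invented feature names; the production web app would call a function like the hypothetical `predict_monomer_weight` below against live process readings.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

# Synthetic training data: columns stand in for current viscosity,
# reactor temperature, and mixing speed (values are not plant data).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = 10.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)

model = ElasticNet(alpha=0.01).fit(X, y)

def predict_monomer_weight(live_reading):
    """Back-end call a web app would make against live process data."""
    return float(model.predict(np.asarray(live_reading).reshape(1, -1))[0])

print(predict_monomer_weight([0.2, -0.1, 0.5]))
```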

The second class of opportunities we've identified is root-cause analysis. Root-cause analysis is best paired with a specific product quality issue or equipment failure mode. In this application, we compile a large set of historical data, identify a set of plausible predictor variables in collaboration with process experts, train models, gather descriptive information from the models, and organize the findings in a report. We train the model in this application specifically for the descriptive information it can provide, so we choose methods that supply coefficients or SHAP values. One root-cause analysis of a key product defect at DuPont suggested interventions that reduced the defect incidence by 65%.
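A sketch of the coefficient-based variant of this workflow, on synthetic data with illustrative variable names; in this toy setup only the first candidate variable actually drives the defect, so its standardized coefficient should dominate the ranking in the report.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.preprocessing import StandardScaler

# Synthetic historical data: three candidate root-cause variables
# nominated by process experts (names are invented for illustration).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
# Only the first variable truly influences the defect response here.
y = 0.3 * X[:, 0] + rng.normal(scale=0.05, size=200)

# Standardize so coefficient magnitudes are comparable across variables.
X_std = StandardScaler().fit_transform(X)
model = ElasticNet(alpha=0.01).fit(X_std, y)

# Standardized coefficients rank the candidate causes.
for name, coef in zip(
        ["raw_material_moisture", "line_speed", "oven_temp"], model.coef_):
    print(f"{name}: {coef:+.3f}")
```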

Other opportunities we've pursued are anomaly detection, fingerprinting from resonance or chromatography spectra, and quality control charts. We rely on traditional data-driven methods for these applications instead of machine learning: principal component analysis for the first two and statistical process control for the charts. However, we have had success implementing these solutions using the modern data science toolbox of cloud tools, database queries, dashboards, and web applications.
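A minimal sketch of PCA-based anomaly detection using the squared prediction error (Q statistic); the data and the 99th-percentile control limit are synthetic choices for illustration, not a plant configuration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic normal operating data: three strongly correlated process
# variables plus measurement noise.
rng = np.random.default_rng(2)
base = rng.normal(size=(300, 1))
X_train = np.hstack([base, 2 * base, -base]) \
    + rng.normal(scale=0.1, size=(300, 3))

# One component captures the normal correlation structure.
pca = PCA(n_components=1).fit(X_train)

def spe(x):
    """Squared prediction error (Q statistic) against the PCA model."""
    recon = pca.inverse_transform(pca.transform(x.reshape(1, -1)))
    return float(np.sum((x - recon) ** 2))

# Empirical control limit from the training data.
limit = np.percentile([spe(row) for row in X_train], 99)

# A sample that breaks the learned correlation should exceed the limit.
anomalous = np.array([5.0, -5.0, 5.0])
print(spe(anomalous) > limit)
```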

The suite of machine learning methods we evaluate for our projects can vary, but it always includes two preferred methods. The first is elastic net [1]. Elastic net is a straightforward method to train, and the data scientist can easily confirm training success with a simple plot. Because elastic net is a linear model, it produces a set of model coefficients with clear descriptive value. The second is XGBoost [2], a popular boosted regression tree model. Although more difficult to train than elastic net, it can fit more sophisticated nonlinear and interaction effects with good prediction accuracy, offers a moderate amount of descriptive information, and is less expensive to train than deep learning models.
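The trade-off can be sketched on synthetic data containing a deliberate interaction effect that a linear model cannot capture. Note that scikit-learn's `GradientBoostingRegressor` stands in for xgboost here to keep the example dependency-free; it is a similar boosted-tree method, not the library named in the text.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split

# Synthetic data: the response includes an interaction term (x0 * x1)
# that the linear elastic net cannot represent.
rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, size=(400, 2))
y = X[:, 0] + X[:, 0] * X[:, 1] + rng.normal(scale=0.05, size=400)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

enet = ElasticNet(alpha=0.01).fit(X_tr, y_tr)
gbt = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)

# The boosted trees should score higher by capturing the interaction.
print(f"elastic net R^2:   {enet.score(X_te, y_te):.2f}")
print(f"boosted trees R^2: {gbt.score(X_te, y_te):.2f}")
```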

Data preparation and feature engineering are important steps for developing machine learning models, and both benefit from domain expertise. We've developed a variety of guidelines and strategies for these steps, informed by statistics and chemical engineering. One of the greatest challenges is modeling data produced by batch processes with data-driven methods. Because the state of a batch process depends on its history, we must either engineer features as integrals of process variables or of a proposed rate equation, or employ a machine learning method that includes a numerical integration, like neural ODEs [3].
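For example, a time-integral feature can summarize a within-batch trace as a single scalar for a batch-level model. The trace below is a toy temperature ramp, and the trapezoidal sum is written out explicitly to keep the sketch dependency-light.

```python
import numpy as np

# Synthetic within-batch time series: a reactor temperature ramp
# sampled once per minute (toy values, not plant data).
t = np.arange(0, 60, 1.0)   # minutes
temp = 150 + 0.5 * t        # degrees C

# Engineer an integral feature: the time-integral of temperature, a
# proxy for total heat exposure over the batch, via the trapezoid rule.
heat_exposure = float(np.sum((temp[1:] + temp[:-1]) / 2 * np.diff(t)))
print(heat_exposure)  # 9720.25
```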

We conclude by describing how cloud tools for data pipelining, data storage, compute, dashboards, and web applications can comprise a complete data science solution for manufacturing applications.

References

  1. Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67, 301–320.
  2. Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
  3. Chen, R. T. Q. (2021). torchdiffeq (Version 0.2.2) [Computer software]. https://github.com/rtqichen/torchdiffeq