(491g) Advancing Digital Twins in Chemical Systems: A Novel Time-Series Transformer (TST)-Based Hybrid Modeling Approach

Authors 

Kwon, J. - Presenter, Texas A&M University
Sitapure, N., Texas A&M University
Recent years have seen a tremendous push toward the development of digital twins for different chemical systems that facilitate better online process monitoring, control, and real-time process optimization [1, 2]. Most of these digital twins rely on fully data-driven approaches, such as deep neural networks (DNNs) and, in a few cases, recurrent neural networks (RNNs), to account for the time-series dynamics of chemical systems. However, hesitance to deploy such black-box tools in practice, owing to safety and operational concerns, has hindered their widespread adoption.

To tackle this conundrum, hybrid models combining first-principles, physics-based dynamics with data-driven components have gained popularity as a ‘best of both worlds’ approach [3]. The first-principles module typically includes mass and energy balance equations, rate kinetics, and population dynamics (if applicable), while the data-driven component estimates system-specific parameters (e.g., kinetic rate constants, selectivity, separation coefficients) that serve as inputs to the first-principles module [4, 5]. However, existing hybrid models, predominantly based on DNNs, require a priori knowledge of the kinetic equations to be combined with the estimated parameters. This presents a major implementation hurdle, as obtaining precise kinetic information for complex systems (e.g., crystallization, fermentation, heterogeneous catalysis) is often difficult. Furthermore, DNN-based hybrid models struggle to make accurate time-series predictions, a crucial aspect of an accurate digital twin for process monitoring or control applications. To resolve these two challenges, an alternative hybrid modeling paradigm is needed that can (a) utilize process data to approximate the underlying kinetic function, and (b) accurately capture short- and long-term system state evolution.
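To make this conventional structure concrete, the following is a minimal sketch of a serial DNN-based hybrid model in the spirit of [4, 5]: a small neural network estimates a kinetic rate constant from the current state, and a hard-coded first-principles mass balance consumes that estimate. The first-order rate law, network size, and units are illustrative assumptions, not the formulation of any cited work.

```python
import torch
import torch.nn as nn

class ParameterNet(nn.Module):
    """Data-driven component: estimates a kinetic rate constant k(C, T).
    A hypothetical two-layer network; real applications would scale inputs."""
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.Tanh(),
            nn.Linear(hidden, 1), nn.Softplus(),  # Softplus keeps k > 0
        )

    def forward(self, state):
        return self.net(state)

def hybrid_step(C, T, param_net, dt):
    """First-principles component: explicit-Euler step of an assumed
    first-order mass balance dC/dt = -k(C, T) * C."""
    k = param_net(torch.stack([C, T], dim=-1)).squeeze(-1)
    return C - dt * k * C

# Usage: roll the hybrid model forward from a hypothetical initial state.
param_net = ParameterNet()
C = torch.full((4,), 1.0)    # concentration, batch of 4 trajectories
T = torch.full((4,), 300.0)  # temperature (K)
for _ in range(10):
    C = hybrid_step(C, T, param_net, dt=0.1)
```

Note that the rate law dC/dt = -kC must be supplied a priori here; the network only fills in its parameter. Avoiding exactly this hard-coded kinetic form is requirement (a) above.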

In recent years, there has been significant progress in the development and online deployment of transformer-based large language models (LLMs) such as ChatGPT, CodeGPT, Jarvis, and others for various applications. These models, powered by a multi-head attention mechanism (MAM), excel at capturing contextual information and short- and long-term dependencies in natural language processing (NLP) tasks, owing to their remarkable ability to learn a language’s underlying syntax [6-8]. However, directly applying these approaches to hybrid modeling of chemical systems is a non-trivial task.
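For reference, the core computation behind the MAM is scaled dot-product attention [8]. The NumPy sketch below (with random placeholder weights) shows how every time step attends to every other step in parallel, which is what lets transformers capture both short- and long-range dependencies; it is a generic textbook implementation, not code specific to this work.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """Multi-head scaled dot-product attention (Vaswani et al. [8]).
    X: (seq_len, d_model); all weight matrices: (d_model, d_model)."""
    seq_len, d_model = X.shape
    d_head = d_model // n_heads

    def split(W):  # project, then split into (n_heads, seq_len, d_head)
        return (X @ W).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    Q, K, V = split(Wq), split(Wk), split(Wv)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)  # pairwise similarity
    out = softmax(scores) @ V                            # weighted values
    out = out.transpose(1, 0, 2).reshape(seq_len, d_model)  # re-merge heads
    return out @ Wo

# Usage: 8 time steps, model width 16, 4 heads, random weights
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 16))
Wq, Wk, Wv, Wo = (0.1 * rng.standard_normal((16, 16)) for _ in range(4))
Y = multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads=4)  # -> (8, 16)
```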

To address these challenges, we propose a first-of-its-kind hybrid time-series transformer (TST) model for chemical systems, with a focus on crystallization. Specifically, the first-principles module incorporates system-agnostic dynamics (i.e., mass and energy balances and population balance equations), while a data-driven transformer model approximates the system-specific functional form of the growth and nucleation dynamics, enabling dynamic coupling with the first-principles module. Since the existing vanilla transformer architecture uses a DNN to approximate the nonlinearities between static input and output variables [8], it is not well suited to multivariate time-series prediction. Therefore, we integrate long short-term memory (LSTM) networks with the TST architecture to create a novel TST-LSTM framework that combines the transformer’s remarkable ability to learn underlying system dynamics and functionality with the LSTM’s superior performance in time-series prediction. The resulting hybrid TST-LSTM model consists of a first-principles module containing generalized mass, energy, and population balance equations (PBEs), and a TST-LSTM model that uses state information (i.e., concentration, temperature, crystal moments, etc.) to estimate the functional form of growth and nucleation. The developed hybrid model is trained, tested, and validated on a batch crystallization system, using a high-fidelity model and experimental observations. Furthermore, the hybrid TST-LSTM model is integrated within a model predictive controller (MPC) for set-point tracking of the crystal size distribution.
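A hedged sketch of this coupling is given below: a transformer encoder with an LSTM head maps a window of past states to positive growth (G) and nucleation (B) rates, which then drive the method-of-moments form of the PBE. All layer sizes, the four-state input, and the moment closure (size-independent growth) are our illustrative assumptions, not the exact architecture reported here.

```python
import torch
import torch.nn as nn

class TSTLSTM(nn.Module):
    """Illustrative TST-LSTM: transformer encoder (MAM) for contextual
    features, LSTM for temporal modeling, Softplus head so G, B > 0."""
    def __init__(self, n_states=4, d_model=32, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(n_states, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=64,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.lstm = nn.LSTM(d_model, d_model, batch_first=True)
        self.head = nn.Sequential(nn.Linear(d_model, 2), nn.Softplus())

    def forward(self, window):               # window: (batch, time, n_states)
        h = self.encoder(self.embed(window))
        h, _ = self.lstm(h)
        G, B = self.head(h[:, -1]).unbind(-1)
        return G, B                          # growth and nucleation rates

def pbe_moment_step(mu, G, B, dt):
    """First-principles module: method-of-moments PBE assuming
    size-independent growth: dmu0/dt = B, dmu_j/dt = j * G * mu_{j-1}."""
    mu0, mu1, mu2, mu3 = mu.unbind(-1)
    return torch.stack([mu0 + dt * B,
                        mu1 + dt * 1 * G * mu0,
                        mu2 + dt * 2 * G * mu1,
                        mu3 + dt * 3 * G * mu2], dim=-1)

# Usage: states could be [concentration, temperature, mu0, mu1] (hypothetical)
model = TSTLSTM()
window = torch.randn(1, 20, 4)          # 20 past time steps
mu = torch.tensor([[1.0, 0.1, 0.01, 0.001]])
G, B = model(window)
mu = pbe_moment_step(mu, G, B, dt=0.1)  # advance the crystal moments
```

Rolling this step forward and feeding the updated moments back into the input window yields the multi-step predictions that the MPC would use when screening candidate set-point trajectories.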

Overall, the combination of the transformer’s unique ability to learn complex system dynamics and functions, the LSTM’s superior temporal prediction capabilities, and first-principles modeling offers exciting opportunities for the development of accurate hybrid model-based digital twins for various chemical systems. Moreover, the developed framework can be extended to other chemical systems, including fermentation, reaction engineering, and catalysis. Given these promising prospects, the future is indeed bright, and we look forward to seeing the transformative impact of transformer-based hybrid models within the chemical industry.

Literature Cited:

  1. Ogumerem, Gerald S., and Efstratios N. Pistikopoulos. "Parametric optimization and control toward the design of a smart metal hydride refueling system." AIChE Journal 65.10 (2019): e16680.
  2. Chen, Yingjie, and Marianthi Ierapetritou. "A framework of hybrid model development with identification of plant-model mismatch." AIChE Journal 66.10 (2020): e16996.
  3. Wang, Chang, et al. "Deeppipe: A hybrid model for multi-product pipeline condition recognition based on process and data coupling." Computers & Chemical Engineering 160 (2022): 107733.
  4. Shah, Parth, et al. "Deep neural network-based hybrid modeling and experimental validation for an industry-scale fermentation process: Identification of time-varying dependencies among parameters." Chemical Engineering Journal 441 (2022): 135643.
  5. Bangi, Mohammed Saad Faizan, and Joseph Sang-Il Kwon. "Deep hybrid modeling of a chemical process: Application to hydraulic fracturing." Computers & Chemical Engineering 134 (2020): 106696.
  6. Brown, Tom, et al. "Language models are few-shot learners." Advances in neural information processing systems 33 (2020): 1877-1901.
  7. Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).
  8. Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems 30 (2017).