(273e) Deep Knowledge Versus Deep Learning

Authors 

Rollins, D. Sr. - Presenter, Iowa State University
Ghasemi, P., Iowa State University
Developing mathematical models that causally (deterministically) map measured process inputs to the critical response variables (i.e., outputs) that depend on them is crucial to optimal process operations. For the purposes of this work, this objective is termed "process modeling." Due to advancements in sensor technology, the number of measured process variables and their sampling frequencies have increased considerably in the last decade. Big data advancements in process (i.e., plant) data have created renewed interest in empirically based process modeling, i.e., the use of model structures that have no theoretical basis but a form that allows accurate mapping of the measured inputs into the response space. Real applications of empirical modeling have established the high risk of modeling inaccuracy outside the input region used to develop the fitted response behavior, i.e., extrapolation. One empirical approach that has regained popularity is the artificial neural network (ANN). Researchers have contended that big data have potentially expanded the successful applications of "Deep Structured Learning," also known as "Deep Learning" (DL) (Schmidhuber, 2015; LeCun et al., 2015). The basic hypothesis of DL is that, with very large data sets, an ANN structure can be greatly expanded (i.e., by increasing the number of nodes and hidden layers), and that this expansion leads to more successful process modeling. This hypothesis has been tested on simulated processes, which are quite limited in representing critical conditions of real processes. These limitations include highly correlated inputs, unmeasured disturbances, and measured inputs that do not affect the response but are significantly correlated with inputs that do, among other characteristics of real process inputs. Thus, to establish the effectiveness and claims of DL ANN, its modeling must be demonstrated on real chemical processes with many inputs and complex dynamic response behavior, and the model must be shown to be effective over a prolonged period of time for a particular process.
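
As a minimal, purely illustrative sketch of this hypothesis, widening and deepening a feedforward ANN amounts to adding nodes and hidden layers; the data, architectures, and settings below are hypothetical assumptions and not those evaluated in this work.

    # Sketch of the DL hypothesis: widen/deepen a feedforward ANN as more data
    # become available. Illustrative only; data and settings are hypothetical.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5000, 9))                  # nine measured inputs
    y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=5000)

    shallow = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
    deep = MLPRegressor(hidden_layer_sizes=(64, 64, 64), max_iter=2000, random_state=0)

    for name, model in [("shallow", shallow), ("deep", deep)]:
        model.fit(X[:4000], y[:4000])
        print(name, "test R^2:", round(model.score(X[4000:], y[4000:]), 3))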

In Rollins et al. (2015), a "deep knowledge (DK)" modeling approach was applied to a pilot distillation column with nine (9) measured inputs. This work defines DK as a highly structured mapping of inputs to the response space that uses differential equations with physically based theoretical structures, evidenced by model parameters that have physical interpretations and that can be constrained mathematically based on well-established first-principles dynamic modeling. In that work, the response is the top tray temperature, and a Wiener modeling approach gives excellent test results on eight (8) "freely existing" data sets collected over a three-year period. A "freely existing" data set means that no effort was made to intelligently change the input variables based on an optimal experimental design methodology. In fact, Rollins et al. arbitrarily selected all of the data sets from the column's historical database, which contains data generated by undergraduate chemical engineering students learning how to run the column or collecting data for a unit operations laboratory course. The DK methodology of Rollins et al. also embraces a deliberate approach for maximizing information content and input causality, based on a comprehensive Jacobian matrix analysis that identifies model weaknesses in the nature of the input relationships. The Wiener results of Rollins et al. on these eight test cases represent the challenge for DL ANN.
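
For context, a generic Wiener structure of this kind passes each measured input through a linear dynamic block and then maps the resulting intermediate states through a static nonlinearity. A first-order-plus-dead-time form is sketched below purely as an illustration; the dynamic orders and nonlinear form used in Rollins et al. (2015) may differ.

    \tau_i \frac{dv_i(t)}{dt} + v_i(t) = u_i(t - \theta_i), \qquad i = 1, \dots, 9
    y(t) = g\big(v_1(t), \dots, v_9(t)\big)

Here u_i are the measured inputs, v_i are intermediate dynamic states with physically interpretable time constants and dead times, and g is a static nonlinear function mapping the states to the top tray temperature y(t); the physical meaning of these parameters is what permits the DK constraints described above.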

This work applies DL ANN to the eight test cases, and the results are forthcoming. In addition, for comparison and to improve ANN modeling, this work applies a principal component neural network (PCNN) methodology that the authors developed and applied successfully in real-data modeling studies of critical physical properties of asphalt core samples (Ghasemi et al., 2018a, b). Ghasemi et al. developed this methodology to fit ANN structures using orthogonal inputs in a DK approach that maximizes information in a Jacobian matrix sense. To the authors' knowledge, this is the first application of PCNN to dynamic process modeling.
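
A minimal sketch of the general PCNN idea, i.e., using principal component analysis to obtain orthogonal pseudo-inputs and then fitting an ANN on the retained components, is given below. The data, component selection, and network settings are illustrative assumptions, not the published implementation of Ghasemi et al.

    # Sketch of a principal component neural network (PCNN): decorrelate the
    # measured inputs with PCA, then fit an ANN on the orthogonal scores.
    # Hypothetical data and settings; not the implementation of Ghasemi et al.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.neural_network import MLPRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(1)
    base = rng.normal(size=(2000, 3))
    X = np.hstack([base, base @ rng.normal(size=(3, 6))])   # nine correlated inputs
    y = base[:, 0] - 2.0 * base[:, 1] + rng.normal(scale=0.1, size=2000)

    pcnn = make_pipeline(
        StandardScaler(),
        PCA(n_components=3),                                 # orthogonal pseudo-inputs
        MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=1),
    )
    pcnn.fit(X[:1500], y[:1500])
    print("test R^2:", round(pcnn.score(X[1500:], y[1500:]), 3))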

References

  1. Schmidhuber, J., "Deep Learning in Neural Networks: An Overview," Neural Networks, 2015, 61, 85-117.
  2. LeCun, Y., Y. Bengio and G. Hinton, "Deep Learning," Nature, 2015, 521(7553), 436-444.
  3. Rollins, D. K., A. K. Roggendorf, Y. Khor, Y. Mei, P. Lee and S. Loveland, "Dynamic Modeling With Correlated Inputs: Theory, Method and Experimental Demonstration," Ind. Eng. Chem. Res., 2015, 54(7), 2136-2144.
  4. Ghasemi, P., M. Aslani, D. K. Rollins, R. C. Williams and V. R. Schaefer, "Modeling Rutting Susceptibility of Asphalt Pavement Using Principal Component Pseudo Inputs in Regression and Neural Networks," International Journal of Pavement Research and Technology, 2018, https://doi.org/10.1016/j.ijprt.2018.01.003.
  5. Ghasemi, P., M. Aslani, D. K. Rollins and R. C. Williams, "Principal Component Analysis-Based Predictive Modeling and Optimization of Permanent Deformation in Asphalt Pavement: Elimination of Correlated Inputs and Extrapolation in Modeling," Structural and Multidisciplinary Optimization, 2018, 1-19.