(367g) Constrained Subset Selection for the Regression of Multi-Component Helmholtz Energy Equations | AIChE


Engle, M. - Presenter, Carnegie Mellon University
Sahinidis, N., Carnegie Mellon University
Accurate thermodynamic properties are essential for developing new processes and technologies and for simulating optimized flowsheets. Unfortunately, commonly used equations of state, such as Peng-Robinson and Soave-Redlich-Kwong, are inaccurate in regions critical to new technologies. A more recent class of equations of state uses first principles to determine a single unifying equation, explicit in Helmholtz energy, that overcomes these inaccuracies. Such equations have been developed for over 100 pure substances and have been extended to mixtures [1, 2, 3, 4].
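
The abstract does not state the functional form of the unifying equation. As a point of orientation, the form commonly used in the multiparameter equation-of-state literature [2, 3] for a pure fluid is sketched here; the index limits I_pol and I_exp and the exponents d_i, t_i, c_i are generic placeholders, not values taken from this work.

```latex
\begin{align}
\alpha(\delta,\tau) &= \frac{a(\rho,T)}{RT}
  = \alpha^{0}(\delta,\tau) + \alpha^{r}(\delta,\tau),
  \qquad \delta = \frac{\rho}{\rho_c},\quad \tau = \frac{T_c}{T}, \\
\alpha^{r}(\delta,\tau) &= \sum_{i=1}^{I_{\mathrm{pol}}} n_i\,\delta^{d_i}\tau^{t_i}
  + \sum_{i=I_{\mathrm{pol}}+1}^{I_{\mathrm{pol}}+I_{\mathrm{exp}}}
    n_i\,\delta^{d_i}\tau^{t_i}\exp\!\left(-\delta^{c_i}\right), \\
p &= \rho R T\left(1 + \delta\,\alpha^{r}_{\delta}\right),
  \qquad \alpha^{r}_{\delta} = \left(\frac{\partial \alpha^{r}}{\partial \delta}\right)_{\tau}.
\end{align}
```

Here a is the molar Helmholtz energy, \alpha^{0} is the ideal-gas part, and \alpha^{r} is the residual part whose coefficients n_i are regressed. Every other property follows from derivatives of \alpha, as the pressure relation illustrates, which is why a single equation can be fitted to several property datasets at once.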

A major challenge in developing Helmholtz energy equations is fitting the unifying equation to multiple datasets that contain correlated data [3]. Current fitting procedures cycle between linear and nonlinear regression techniques that are restricted to equality constraints. They yield only local solutions, so they require either multiple starts or an experienced user who manually selects the starting point [4]. Advancing these techniques to eliminate the need for multistart heuristics and to enforce inequality constraints on the resulting models would make it possible to control thermodynamic behavior, improve extrapolation, and keep the regressed equation within thermodynamically feasible limits while simultaneously fitting the data [7].
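
The abstract does not list which inequality constraints are imposed. Two standard thermodynamic conditions of the kind it describes are written out below as an illustration, assuming the usual reduced variables \delta = \rho/\rho_c and \tau = T_c/T and a reduced Helmholtz energy split into an ideal part \alpha^{0} and a residual part \alpha^{r}; subscripts denote partial derivatives.

```latex
\begin{align}
\frac{1}{RT}\left(\frac{\partial p}{\partial \rho}\right)_{T}
  &= 1 + 2\delta\,\alpha^{r}_{\delta} + \delta^{2}\alpha^{r}_{\delta\delta} \;\ge\; 0
  && \text{(mechanical stability, single-phase region)}, \\
\frac{c_v}{R}
  &= -\tau^{2}\left(\alpha^{0}_{\tau\tau} + \alpha^{r}_{\tau\tau}\right) \;\ge\; 0
  && \text{(positive isochoric heat capacity)}.
\end{align}
```

When \alpha^{r} is a sum of terms with fixed exponents and adjustable coefficients n_i, each condition evaluated at a chosen state point (\delta, \tau) is linear in the n_i, so a slope requirement of this kind can be added to the regression as an ordinary linear inequality.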

To address these challenges, we have developed a procedure for building models from thermodynamic data and have extended it to mixtures. The main idea is to apply best subset selection, with inequality constraints that control thermodynamic slopes, while fitting all the data simultaneously. A bank of candidate terms is chosen to represent the different phase regions and thermodynamic behavior. To avoid overfitting, the procedure systematically selects a subset of these terms that optimally fits the multiple thermodynamic property datasets. We rely on a global optimization solver [7] to find a solution that balances goodness of fit against the number of terms in the model according to an information criterion. The proposed approach was used to fit a new Helmholtz energy equation for carbon dioxide binary mixtures. The resulting models are compared with established models such as cubic equations of state, EOS-CG [5], and GERG-2008 [6].
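
The selection step can be illustrated with a small sketch. It is not the authors' implementation: it replaces the global optimization solver with exhaustive enumeration, which is only practical for a tiny term bank, it omits the inequality constraints, and it assumes the Bayesian information criterion (BIC) as the information criterion. The term bank `TERM_BANK`, the synthetic data, and all function names are hypothetical.

```python
import itertools
import math


def least_squares(X, y):
    """Solve min ||X c - y||^2 via the normal equations (Gauss-Jordan, partial pivoting)."""
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def fit_subset(X, y, subset):
    """Fit only the columns in `subset`; return (coefficients, residual sum of squares)."""
    Xs = [[row[j] for j in subset] for row in X]
    coef = least_squares(Xs, y)
    rss = sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, yi in zip(Xs, y))
    return coef, rss


def bic(rss, n_obs, n_terms):
    """BIC for Gaussian errors; the floor on rss guards against log(0)."""
    return n_obs * math.log(max(rss, 1e-300) / n_obs) + n_terms * math.log(n_obs)


def best_subset(X, y, max_terms):
    """Enumerate every subset of up to `max_terms` columns and keep the lowest BIC."""
    best = None
    for k in range(1, max_terms + 1):
        for subset in itertools.combinations(range(len(X[0])), k):
            coef, rss = fit_subset(X, y, subset)
            score = bic(rss, len(y), k)
            if best is None or score < best[0]:
                best = (score, subset, coef)
    return best


# Hypothetical bank of candidate terms delta**d * tau**t, given as (d, t) pairs.
TERM_BANK = [(1, 0.25), (1, 1.25), (2, 0.5), (3, 0.25), (2, 1.5), (4, 1.0)]

# Synthetic state points on a small (delta, tau) grid (assumed, not real data).
points = [(0.2 + 0.2 * i, 0.8 + 0.1 * j) for i in range(8) for j in range(5)]
X = [[delta ** d * tau ** t for d, t in TERM_BANK] for delta, tau in points]

# Synthetic "measurements" built from two bank terms plus a small deterministic perturbation.
y = [0.8 * row[0] - 0.5 * row[2] + 0.01 * math.sin(37 * i)
     for i, row in enumerate(X)]

score, subset, coef = best_subset(X, y, max_terms=3)
print("selected terms:", [TERM_BANK[j] for j in subset])
print("coefficients:", coef)
```

Enumeration grows combinatorially with the size of the bank, which is one reason a mixed-integer formulation solved by a global solver, as described in the abstract, is attractive for realistic term banks.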

References cited

[1] Lemmon, E. W.; Huber, M. L.; McLinden, M. O. NIST Standard Reference Database 23: Reference Fluid Thermodynamic and Transport Properties-REFPROP, Version 9.1, National Institute of Standards and Technology. 2013; https://www.nist.gov/srd/refprop.

[2] Span, R. Multiparameter equations of state: An accurate source of thermodynamic property data; Springer-Verlag, 2000.

[3] Span, R.; Wagner, W.; Lemmon, E. W.; Jacobsen, R. T. Multiparameter equations of state - recent trends and future challenges. Fluid Phase Equilibria 2001, 183-184, 1-20.

[4] Lemmon, E. W.; Tillner-Roth, R. A Helmholtz energy equation of state for calculating the thermodynamic properties of fluid mixtures. Fluid Phase Equilibria 1999, 165, 1-21.

[5] Gernert, J.; Span, R. EOS-CG: A Helmholtz energy mixture model for humid gases and CCS mixtures. J. Chem. Thermodyn. 2016, 93, 274-293.

[6] Kunz, O.; Wagner, W. The GERG-2008 Wide-Range Equation of State for Natural Gases and Other Mixtures: An Expansion of GERG-2004. J. Chem. Eng. Data 2012, 57 (11), 3032-3091.

[7] Tawarmalani, M.; Sahinidis, N. V. Global optimization of mixed-integer nonlinear programs: A theoretical and computational study. Mathematical Programming 2004, 99, 563-591.