(380d) Process Superstructure Optimization Using Surrogate Models

Authors: 
Maravelias, C. T. - Presenter, University of Wisconsin - Madison
Henao, C. A. - Presenter, University of Wisconsin - Madison


Introduction and Background

Current chemical process synthesis methodologies fall into two categories: the more traditional sequential-conceptual methods and the more systematic superstructure optimization-based methods. Sequential methods rely on a natural hierarchy among the engineering decisions needed to obtain a fully defined process structure. This approach is popular because it reduces the complexity of the synthesis problem: the main subsystems of the plant are designed one at a time, disregarding the two-way interaction between decisions made at different stages. In superstructure optimization-based methods, by contrast, a process structure composed of all potentially useful unit operations and all relevant interconnections between them is considered initially. A mathematical model of this "superstructure" (including unit models, interconnection equations, and thermodynamic property calculation equations) is then incorporated into an optimization problem whose solution indicates which of the initial units and interconnections are to be kept, as well as the values of the operating conditions. From a theoretical point of view, the second kind of method is more powerful, since it determines the optimal structure and operating conditions simultaneously and thus accounts for the complex interactions between design decisions. This power comes at a price: the mathematical complexity of the resulting optimization models, which are generally large-scale non-convex mixed-integer nonlinear programs (MINLPs).

Since much of the complexity of these MINLPs comes from the equations describing the behavior of the process units included in the superstructure (e.g. reaction kinetic expressions, thermodynamic property calculation equations, etc.), it is advantageous to replace complex unit models, and even entire plant subsystems, with surrogate models. This approach follows a general trend in engineering design, motivated by the existence of highly accurate but computationally expensive computer programs that can simulate the behavior of particular engineering systems. In chemical process engineering, a commercial process simulator can be used to generate sets of simulation cases for particular process units. These results can then be fitted with general-purpose multivariable mappings, which can be used in place of the original complex unit models in generic process optimization problems.

Even though some authors have used different surrogates to approximate detailed unit operation models, the application of such techniques to the solution of superstructure optimization problems has not yet been explored in depth. Furthermore, there is no systematic treatment of aspects such as the selection of independent-dependent surrogate variables, nor a systematic method for the reformulation of unit surrogate models so they can be incorporated into the associated superstructure MINLP model. These aspects are considered in this work.

Surrogate model variable selection

Replacing unit models with surrogates can reduce the mathematical complexity of a superstructure MINLP by reducing the total number of variables and equations and the non-linearities in them. In general, a unit model establishes an implicit relation f(X, Y)=0 between a set of independent variables X and the remaining variables Y. Here, X can be seen as a variable specification pattern which transforms the model into a square and solvable equation system f(X=fixed, Y)=0. For a surrogate model to enforce the same relations that the original model enforces among the MINLP variables, it only has to include the subset of "linking" variables C connecting the unit model to the rest of the optimization problem (e.g. unit variables appearing in other unit models, other constraints or the objective function). In general, the selection of variables in X is not unique, and the way to reduce the number of surrogate variables (and hence the number of variables in the final MINLP reformulation) is to maximize the intersection between the set of independent variables X and the set of "linking" variables C. This holds because the independent surrogate variables Xs are the same as X, and since the only requirement for the surrogate variable set is to include all variables in C, the dependent surrogate variables Ys are just the variables in C not included in Xs.
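The counting argument above can be illustrated with a small Python sketch. The variable names and the two candidate specification sets are hypothetical and are not taken from this work; the sketch only shows that the candidate X with the largest overlap with C gives the fewest surrogate variables, and that Ys is then C minus Xs.

```python
# Hypothetical linking variables C and two hypothetical valid choices of X.
C = {"F_in", "T_in", "F_out", "T_out"}
candidates = [
    {"F_in", "T_in", "P"},   # shares two variables with C
    {"F_in", "Q", "P"},      # shares one variable with C
]

def surrogate_size(X, C):
    # The surrogate involves Xs = X plus every linking variable not in X.
    return len(X | C)

# Maximizing |X ∩ C| is equivalent to minimizing |X ∪ C| when |X| is fixed.
best = max(candidates, key=lambda X: len(X & C))
Xs = best
Ys = C - Xs  # dependent surrogate variables

print(sorted(Xs), sorted(Ys), surrogate_size(Xs, C))
```

With these assumed sets, the first candidate leads to five surrogate variables and the second to six.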

As previously indicated, the selection of variables in X is not unique, but in all valid cases it has to lead to a structurally non-singular equation system. The literature presents methodologies to check the structural non-singularity of equation systems based on the bipartite graph between the set of equations E={1, 2, ... , e, ..., m} and the set of non-specified variables V={1, 2, ... , v, ..., m}, where an edge (e, v) exists if equation "e" contains variable "v". In this work we present a method based on a maximum matching optimization problem to guide the selection of the variables in X = Xs. It is based on the fact that the existence of a perfect matching between equations and variables is both necessary and sufficient for structural non-singularity. Advantages of this alternative selection strategy include the possibility of favoring the selection of certain variables over others by using selection preferences, and the possibility of enforcing the selection of variables already specified by the synthesis problem statement (e.g. product purity specifications, capacity, etc.). With this in mind, it is possible to favor the selection of unit independent variables according to the standard options in some process simulators, while at the same time checking the feasibility of the synthesis problem specifications.
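The perfect-matching test can be sketched in a few lines of Python. This is not the maximum matching optimization formulation proposed in this work (which also handles selection preferences); it is a plain augmenting-path matching on the equation-variable bipartite graph, applied to a hypothetical three-equation unit whose variable names are invented for illustration.

```python
def max_matching(incidence):
    """incidence[e] is the set of non-specified variables in equation e.
    Returns a dict mapping each matched variable to its equation."""
    match = {}  # variable -> equation

    def try_assign(e, seen):
        # Look for an augmenting path starting at equation e.
        for v in sorted(incidence[e]):
            if v in seen:
                continue
            seen.add(v)
            if v not in match or try_assign(match[v], seen):
                match[v] = e
                return True
        return False

    for e in range(len(incidence)):
        try_assign(e, set())
    return match

def is_valid_specification(equations, specified):
    """True if fixing `specified` leaves a square system with a perfect matching."""
    specified = set(specified)
    free = set().union(*equations) - specified
    incidence = [eq - specified for eq in equations]
    if len(free) != len(equations):
        return False
    return len(max_matching(incidence)) == len(equations)

# Hypothetical unit: 3 equations, 5 variables, so 2 variables must be specified.
equations = [{"F", "T", "Q"}, {"F", "P"}, {"P", "x"}]
print(is_valid_specification(equations, {"F", "T"}))  # each equation keeps a free variable
print(is_valid_specification(equations, {"F", "P"}))  # second equation is fully specified
```

In the second call, the equation containing only F and P has no free variable left, so no perfect matching exists and the specification is rejected.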

Surrogate model construction

Once the independent and dependent variable sets Xs and Ys are identified for a particular unit, the construction of its surrogate model starts by sampling the Xs space. These sample points are used to specify different simulation cases, whose results are then fitted using a general-purpose multivariable mapping. In this work, we have used Multi-Layer Perceptron (MLP) mappings because of their excellent fitting characteristics and low complexity, and because they can be easily reformulated for inclusion in superstructure MINLPs. However, the proposed methodology can be combined with any other multivariable mapping approach. The original samples are generated using a variance reduction technique, such as Latin Hypercube sampling, followed by scaling and principal component analysis. This is meant to reduce the total number of samples required as well as the MLP training burden. In addition, Bayesian regularization and early stopping were implemented within the training procedure to avoid over-fitting and to enhance the generalization capabilities of the resulting network. This is important when dealing with noisy data and small data sets.
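As an illustration of the sampling step only, a basic Latin Hypercube sampler can be written with NumPy as below. This is a generic textbook version, not the MATLAB code developed for this project, and the bounds in the usage lines (a temperature and a pressure range) are hypothetical.

```python
import numpy as np

def latin_hypercube(n_samples, lower, upper, seed=0):
    """Draw n_samples points so that each dimension has exactly one
    point in each of n_samples equal-width strata."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    # One uniform draw inside each stratum [i/n, (i+1)/n), per dimension.
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, dim))) / n_samples
    # Shuffle the strata independently in each dimension.
    for j in range(dim):
        u[:, j] = rng.permutation(u[:, j])
    # Scale from the unit hypercube to the requested bounds.
    return lower + u * (upper - lower)

# Hypothetical Xs bounds: temperature in [300, 400], pressure in [1, 10].
samples = latin_hypercube(5, lower=[300.0, 1.0], upper=[400.0, 10.0])
print(samples.shape)
```

Each row of `samples` would define one simulation case; scaling and principal component analysis would follow as separate steps.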

For this project, a MATLAB code including a MATLAB-Aspen Plus interface has been developed to automatically generate samples of the independent variable space, execute the necessary Aspen Plus simulation cases, retrieve the results, and train the MLPs.

Surrogate model reformulation

Any unit model (surrogate or not) to be included as part of an MINLP formulation has to be reformulated to allow activation-deactivation through binary variables. In the case of MLPs, the reformulation that allows this activation-deactivation is very simple. Consider a two-layer perceptron surrogate mapping the Xs space into the Ys space according to: Ys = w2*tanh(w1*Xs + b1) + b2. Here w1, w2 are the layer weight matrices and b1, b2 the layer bias vectors. Taking advantage of the behavior of the sigmoid function, particularly the fact that tanh(0) = 0, it is possible to reformulate the surrogate model as follows: Ys = w2*tanh(w1*Xs + b1*s) + b2*s, 0 <= Xs <= U*s, where U is a vector containing the upper bounds on the components of Xs (considered here to be non-negative for simplicity), and "s" is a binary selection variable. When s=1 the original surrogate relation is enforced. When s=0 the bound constraints force Xs=0, so the argument of tanh vanishes and the model is deactivated (i.e. Xs=0, Ys=0).
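The activation-deactivation behavior can be checked numerically with a short NumPy sketch. The weights, biases and bounds below are random placeholders rather than a trained surrogate, and the bound constraint is taken to be 0 <= Xs <= U*s, which is how the definition of U is read here.

```python
import numpy as np

def switched_mlp(xs, s, w1, b1, w2, b2):
    """Two-layer perceptron with the binary switch s applied to both biases."""
    return w2 @ np.tanh(w1 @ xs + b1 * s) + b2 * s

# Placeholder network: 2 inputs, 4 hidden nodes, 3 outputs.
rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(4, 2)), rng.normal(size=4)
w2, b2 = rng.normal(size=(3, 4)), rng.normal(size=3)
U = np.array([10.0, 5.0])   # assumed upper bounds on Xs
xs = np.array([2.0, 1.0])   # a point satisfying 0 <= xs <= U

# s = 1: the switched model coincides with the original surrogate.
y_on = switched_mlp(xs, 1, w1, b1, w2, b2)
y_ref = w2 @ np.tanh(w1 @ xs + b1) + b2

# s = 0: the bound 0 <= Xs <= U*0 forces Xs = 0, and tanh(0) = 0 gives Ys = 0.
y_off = switched_mlp(np.zeros(2), 0, w1, b1, w2, b2)

print(np.allclose(y_on, y_ref), np.allclose(y_off, 0.0))
```

In an actual MINLP the switch s would be a decision variable and the bound would be a constraint; here both cases are simply evaluated by hand.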

The simplicity of this reformulation is another reason to use MLP surrogate models in this project. Furthermore, since in this framework the multiple types of non-linearities in the original unit models are replaced by a single type of non-linearity (i.e. the MLP tanh function), we also discuss alternative reformulations to deal efficiently with this type of non-convexity in the final superstructure MINLP. Finally, a couple of examples are presented to illustrate the application of this framework to the synthesis of a chemical process for the production of maleic anhydride.