# (575a) Sampling Domain Reduction for Surrogate Model Generation – Applied to Hydrogen Production with Carbon Capture

#### AIChE Annual Meeting

#### 2019

#### Computing and Systems Technology Division

#### Data Driven Optimization

#### Wednesday, November 13, 2019 - 3:30pm to 3:49pm

The definition of the sampling domain affects both the performance of a surrogate model and the number of sampling points required to obtain a satisfactory fit. The most common way to limit the sampling domain is to apply simple box constraints to each independent variable. This forces a choice between two problems: 1. the chosen box constraints are tight, which limits the applicability of the surrogate model, or 2. the chosen box constraints are wide, which gives weak bounds on the relevant sampling domain and therefore leads to sampling in operating regions that are never encountered when the surrogate model is applied. The latter is a particular problem in chemical engineering, where the component flow rates normally depend on each other. Here, combining box constraints with an adaptive sampling algorithm may result in extensive sampling in regions far outside the nominal operating conditions. This, in turn, may cause sampling in regions that exhibit highly nonlinear characteristics that are not relevant or prevalent near the nominal operating conditions [1].

Structured sampling domain reduction through incorporation of constraints from known physical relations between the chosen independent variables may significantly improve the numerical efficiency of adaptive surrogate model generation.

If an inlet stream to the surrogate model is the overall feed to the system, it is straightforward to implement proportional or inversely proportional dependencies between the component flow rates [2]. This already limits the sampling domain to relevant regions, as compositions far from the nominal operating conditions are rarely encountered. However, this approach fails if the feed stream to the surrogate model is the product of a chemical reaction or, to a lesser extent, the product of a separation. As a motivating example, consider the product stream of a steam methane reformer (SMR), which is fed to the water-gas shift (WGS) reactors. Two contradictory conclusions may be drawn for the dependency between methane and hydrogen:

1. the more methane there is in the feed to the water-gas shift reactors, the more hydrogen there is in the feed, due to a larger inlet flow rate of methane to the steam methane reformer at a similar extent of reaction (proportional dependency);

2. the more methane there is in the feed to the water-gas shift reactors, the less hydrogen there is in the feed, due to a reduced extent of reaction in the steam methane reformer (inversely proportional dependency).
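The two opposing trends can be reproduced with a small stoichiometric calculation. The sketch below is an idealization introduced for illustration only: it considers just the reforming reaction CH4 + H2O -> CO + 3 H2, ignores the shift reaction inside the reformer, and uses made-up feed rates and conversions. The function name `smr_outlet` is hypothetical.

```python
# Idealized SMR outlet: only CH4 + H2O -> CO + 3 H2 is considered
# (assumption for illustration; all numbers are made up).

def smr_outlet(ch4_in, conversion):
    """Return (CH4_out, H2_out) molar flows for a methane feed and conversion."""
    extent = ch4_in * conversion
    return ch4_in - extent, 3.0 * extent

# Case 1: feed flow varies, conversion fixed -> CH4 and H2 rise together.
case_feed = [smr_outlet(f, 0.8) for f in (80.0, 100.0, 120.0)]

# Case 2: conversion varies, feed flow fixed -> CH4 rises while H2 falls.
case_conv = [smr_outlet(100.0, x) for x in (0.9, 0.8, 0.7)]

for label, rows in (("varying feed", case_feed), ("varying conversion", case_conv)):
    for ch4, h2 in rows:
        print(f"{label:>18}: CH4 = {ch4:6.1f}, H2 = {h2:6.1f}")
```

Which trend dominates depends on how feed rate and extent of reaction vary together upstream, which is why a fixed proportional or inversely proportional rule cannot be imposed a priori.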

However, it is not possible to draw a conclusion on the exact nature of the dependency. Hence, we propose a data-driven approach to identify a constrained sampling domain of relevance, that is, the region that can be achieved in practice based on the outlet conditions of the preceding unit operations. In the first step of the approach, it is therefore necessary to sample points for the feed composition of the surrogate model. Here, the preceding unit operations are used to create the outlet points. Based on the points sampled in this way (denoted by superscript *cal*), it is possible to calculate three sets of inequality constraints, Eqs. (1)-(3). The first set corresponds to box constraints and defines upper and lower bounds on each component flow rate. The second set limits the ratio of two component flow rates, whereas the third set limits the sum of two component flow rates. In the context of the issues outlined above, the second set bounds proportional dependencies between the component molar flow rates, whereas the third set bounds inversely proportional dependencies. Note that these inequalities always define a convex set and thus polytopic constraints in the space of the component flow rates.
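A possible way to write the three sets of constraints is sketched below. The notation is an assumption made for illustration: $F_i$ is the molar flow rate of component $i$, $n_c$ the number of components, $k$ indexes the sampled points, and each bound is taken as the minimum or maximum over the sampled points.

```latex
\begin{gather}
\min_k F_i^{cal,k} \;\le\; F_i \;\le\; \max_k F_i^{cal,k},
  \qquad i = 1,\dots,n_c \tag{1} \\
\min_k \frac{F_i^{cal,k}}{F_j^{cal,k}} \;\le\; \frac{F_i}{F_j} \;\le\;
  \max_k \frac{F_i^{cal,k}}{F_j^{cal,k}},
  \qquad i < j \tag{2} \\
\min_k \left( F_i^{cal,k} + F_j^{cal,k} \right) \;\le\; F_i + F_j \;\le\;
  \max_k \left( F_i^{cal,k} + F_j^{cal,k} \right),
  \qquad i < j \tag{3}
\end{gather}
```

For positive flow rates, each ratio bound can be multiplied by $F_j$, for example $F_i - r_{ij}^{\max} F_j \le 0$ with $r_{ij}^{\max}$ the upper ratio bound, so all constraints are linear in the flow rates. If each unordered pair is bounded once, this form gives $2 n_c + 2 n_c (n_c - 1) = 2 n_c^2$ inequalities.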

Sampling these points requires evaluating the preceding sections in the detailed model, *e.g.* the SMR when sampling for a WGS reactor. This can be a showstopper for the reduction of the sampling domain. However, if a surrogate model has already been fitted to the preceding section, this surrogate model can be used to calculate the inequality constraints at limited computational expense. This is, for example, the case in the procedure outlined in [3] and is also part of the philosophy of the ALAMO approach [4].
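Once outlet points of the preceding section are available, whether from the detailed model or from its surrogate, the bounds can be computed with a few lines of code. The following is a minimal sketch, assuming the bounds are the minima and maxima over the sampled points and that all flow rates are strictly positive. The function names and the calibration data are hypothetical.

```python
from itertools import combinations

def constraint_bounds(points):
    """Return box, ratio, and sum bounds taken over the sampled points."""
    n = len(points[0])
    box = [(min(p[i] for p in points), max(p[i] for p in points)) for i in range(n)]
    ratio, total = {}, {}
    for i, j in combinations(range(n), 2):
        r = [p[i] / p[j] for p in points]  # assumes strictly positive flow rates
        s = [p[i] + p[j] for p in points]
        ratio[i, j] = (min(r), max(r))
        total[i, j] = (min(s), max(s))
    return box, ratio, total

def is_feasible(x, box, ratio, total, tol=1e-9):
    """Check a candidate point against all three constraint sets."""
    for i, (lo, hi) in enumerate(box):
        if not lo - tol <= x[i] <= hi + tol:
            return False
    for (i, j), (lo, hi) in ratio.items():
        # Ratio bounds in linear form: lo * x_j <= x_i <= hi * x_j
        if x[i] < lo * x[j] - tol or x[i] > hi * x[j] + tol:
            return False
    for (i, j), (lo, hi) in total.items():
        if not lo - tol <= x[i] + x[j] <= hi + tol:
            return False
    return True

# Hypothetical calibration points: (CH4, H2O, H2) molar flows, made-up numbers.
cal_points = [(20.0, 150.0, 240.0), (16.0, 120.0, 192.0),
              (30.0, 160.0, 210.0), (10.0, 140.0, 270.0)]
box, ratio, total = constraint_bounds(cal_points)

print(is_feasible((20.0, 145.0, 235.0), box, ratio, total))  # inside all bounds
print(is_feasible((30.0, 120.0, 270.0), box, ratio, total))  # inside the box only
```

The second candidate lies inside the box but violates the CH4/H2O ratio bound, which is the kind of point that box constraints alone would still sample.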

While implementing inequality constraints in the sampling is in general relatively straightforward, the complexity depends on the chosen sampling approach. Adaptive sampling algorithms frequently use a black-box solver to find the best regions for sampling the simulator model, because the simulator code is not accessible. This approach allows inequality constraints for sampling domain reduction to be added, provided that the black-box solver admits general constraints. Static (predefined) sampling approaches, however, require that points are placed within the bounds directly. One possibility is to use only the box constraints to define a set of sampling points, discard all infeasible points, and then select an optimal subset of the feasible points. This approach is used in the ARGONAUT algorithm [5], but may result in a large fraction of discarded points. We implement an iterative surrogate model generation approach that uses a LASSO-based method [6] with polynomial basis functions for fitting and complexity reduction of the surrogate model, together with an adaptive sampling technique with linear constraints for reducing the sampling domain as described above.
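To illustrate how an L1 penalty reduces the complexity of a polynomial surrogate, the sketch below fits a plain LASSO by cyclic coordinate descent on a single input. It is an assumption-laden toy, not the authors' implementation and not the adaptive variant of [6]: there is no intercept, the basis is $x, x^2, \dots, x^d$, and the data and penalty value are made up.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; return 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_polynomial(x, y, degree, lam, sweeps=200):
    """Minimize (1/2n)*||y - Phi w||^2 + lam*||w||_1, Phi = [x, x^2, ..., x^degree]."""
    n = len(x)
    phi = [[xi ** d for d in range(1, degree + 1)] for xi in x]
    w = [0.0] * degree
    for _ in range(sweeps):
        for j in range(degree):
            # Correlation of basis function j with the residual excluding term j
            rho = sum(
                phi[i][j] * (y[i] - sum(w[k] * phi[i][k] for k in range(degree) if k != j))
                for i in range(n)
            ) / n
            z = sum(phi[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, lam) / z
    return w

# Made-up data: a purely linear response sampled on a symmetric grid.
x = [k / 10 for k in range(-10, 11)]
y = [2.0 * xi for xi in x]
w = lasso_polynomial(x, y, degree=3, lam=0.01)
print(w)
```

For this toy data set the quadratic and cubic coefficients are set exactly to zero and the linear coefficient is shrunk slightly below 2, which is the complexity-reducing behaviour the penalty is chosen for.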

The sampling domain reduction is applied to a model of an SMR in Aspen HYSYS. A WGS reactor is located downstream of the SMR and is to be modelled as a new surrogate model. Five chemical components in the outlet of the SMR are identified as having dependencies: CH₄, H₂O, CO, CO₂, and H₂. If proportional dependencies are incorporated in the feed to the SMR, the sampling domain can be reduced to 1 % of the domain defined by the box constraints, whereas without proportional dependencies in the feed to the SMR, it can be reduced to 14 % of the total sampling domain. Figure 1 illustrates the dependencies between steam, methane, and hydrogen based on the sampled points and the inequality constraints when proportional dependencies are incorporated in the feed to the SMR. These dependencies are especially pronounced between hydrogen and steam, which shows the need for structured sampling domain reduction to improve the sampling for surrogate model generation.
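A reduction figure of this kind can be estimated by drawing uniform points in the box and counting how many satisfy the additional linear constraints, which is also the discard step of a static sampling scheme. The sketch below uses a hypothetical two-variable example, not the SMR case study. Constraints are given as rows `(a, b)` meaning `a . x <= b`, and the function name is an assumption.

```python
import random

def retained_fraction(box, linear_constraints, n_samples=20000, seed=0):
    """Estimate the share of the box volume that satisfies all rows a . x <= b."""
    rng = random.Random(seed)
    kept = 0
    for _ in range(n_samples):
        x = [rng.uniform(lo, hi) for lo, hi in box]
        if all(sum(a_i * x_i for a_i, x_i in zip(a, x)) <= b
               for a, b in linear_constraints):
            kept += 1
    return kept / n_samples

# Toy example: unit box with the ratio bounds 0.5 <= x0/x1 <= 2 in linear form.
box = [(0.0, 1.0), (0.0, 1.0)]
constraints = [
    ((1.0, -2.0), 0.0),   # x0 - 2*x1 <= 0
    ((-1.0, 0.5), 0.0),   # 0.5*x1 - x0 <= 0
]
fraction = retained_fraction(box, constraints)
print(f"retained fraction of the box: {fraction:.3f}")
```

For this toy geometry the exact retained area is 0.5, so the estimate should land close to that value. In higher dimensions the retained fraction typically shrinks quickly, which makes discarding box samples increasingly wasteful.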

Figure 1: Illustration of the dependencies between a) methane and steam, b) methane and hydrogen, and c) steam and hydrogen, including the bounds given by Eqs. (1)-(3).

**References:**

[1] J. Straus and S. Skogestad, “Surrogate model generation using self-optimizing variables,” *Comput. Chem. Eng.*, vol. 119, pp. 143–151, Nov. 2018.

[2] J. Straus and S. Skogestad, “Use of Latent Variables to Reduce the Dimension of Surrogate Models,” in *Computer Aided Chemical Engineering*, 2017, vol. 40, pp. 445–450.

[3] J. Straus and S. Skogestad, “Minimizing the complexity of surrogate models for optimization,” in *Computer Aided Chemical Engineering*, 2016, vol. 38, pp. 289–294.

[4] A. Cozad, N. V. Sahinidis, and D. C. Miller, “Learning surrogate models for simulation-based optimization,” *AIChE J.*, vol. 60, no. 6, pp. 2211–2227, Jun. 2014.

[5] F. Boukouvala and C. A. Floudas, “ARGONAUT: AlgoRithms for Global Optimization of coNstrAined grey-box compUTational problems,” *Optim. Lett.*, vol. 11, no. 5, pp. 895–913, Jun. 2017.

[6] H. Zou, “The Adaptive Lasso and Its Oracle Properties,” *J. Am. Stat. Assoc.*, vol. 101, no. 476, pp. 1418–1429, Dec. 2006.
