(409b) A Parallel Algorithm for Efficient Large-Scale Dynamic Optimization | AIChE

Authors 

Hartwich, A. - Presenter, Bayer Technology Services GmbH


Under today's market conditions in the chemical industry, multi-product facilities, whether multi-purpose batch processes or continuous processes operated in production campaigns, are becoming increasingly common. The operation of these processes therefore requires not only optimal stationary but also optimal dynamic modes of operation. Dynamic optimization is a powerful tool for determining optimal control trajectories. However, its application is hampered by very high computational times when realistic models of complete chemical plants are considered. Therefore, besides continued algorithmic improvement, parallel computing strategies have to be exploited. Such parallelization strategies are presented in this contribution. Three examples of very different problem sizes are discussed. For the largest example, the speedup even reaches a value close to the theoretical limit of the parallelization approach.

Dynamic optimization is a powerful tool, and most of its applications are more or less time-critical. In particular, the applicability of real-time implementations strongly depends on the computational time required to obtain the solution of the dynamic optimization problem. In the sequential approach, the infinite-dimensional optimization problem is transformed into a finite-dimensional one by means of an appropriate discretization of the controls. The computation of the corresponding gradients, also termed sensitivities, accounts for the largest part of the computational time; thus, the numerical integration of the combined state and sensitivity equation system represents the largest portion of the computational effort. Current developments in computer hardware are directed not towards faster serial performance but towards parallel computers. Hence, further reductions in computational time can only be accomplished if parallel computing is exploited. Keeping and Pantelides (1998) and Zhu and Petzold (1999) have adapted BDF-type numerical integration routines to allow for distributed-memory parallelization. Schlegel et al. (2004) have extended the linear-implicit extrapolation routine LIMEX to allow for the solution of sensitivity equation systems. The parallel implementation of this sensitivity integration algorithm is described in this work. Since the greatest part of the computational effort is due to the sensitivity equation system, the parallelization suggested here focuses on this additional effort.
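To make the combined state and sensitivity system concrete, the following minimal sketch integrates a toy scalar model x' = f(x, p) = -p·x together with its forward sensitivity s = dx/dp, which obeys s' = (∂f/∂x)·s + ∂f/∂p. This is only an illustration of the structure of the combined system; the model, the use of SciPy's general-purpose `solve_ivp`, and all variable names are assumptions for the sketch and have nothing to do with the DAE models or the LIMEX integrator used in the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy state model x' = f(x, p) with f = -p * x.  The exact sensitivity
# dx/dp is known analytically, so the combined integration can be checked.
def combined_rhs(t, z, p):
    x, s = z            # state and its sensitivity s = dx/dp
    dfdx = -p           # Jacobian of f w.r.t. x
    dfdp = -x           # partial derivative of f w.r.t. p
    return [-p * x, dfdx * s + dfdp]   # [state equation, sensitivity equation]

p, x0 = 0.7, 2.0
sol = solve_ivp(combined_rhs, (0.0, 1.0), [x0, 0.0], args=(p,),
                rtol=1e-9, atol=1e-12)
x_end, s_end = sol.y[:, -1]
# Analytic solution: x(t) = x0*exp(-p*t), dx/dp = -t*x0*exp(-p*t)
```

The combined system has one extra block of equations per discretized control parameter, which is why the sensitivity part dominates the integration cost for finely discretized controls.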

Three different parallelization strategies are assessed, which differ in the extent to which information is communicated between processes.

Distributed sensitivities (DS): The simplest approach to parallelization is the distributed computation of sensitivities, since the sensitivity equations are independent of each other and depend only on the state system. All processes are assigned an equal number of sensitivity equation systems. This approach is very simple and requires communication only once per sensitivity integration. However, every process has to perform the state integration, the computation of the Jacobians, and the LU decompositions.
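The DS idea can be sketched as follows: the sensitivity columns are split across workers, and each worker redundantly integrates the state system together with only its assigned columns. This is a minimal sketch on a made-up linear test model x' = Ax + Bp (so each sensitivity column obeys s_j' = A·s_j + b_j); a thread pool stands in for the distributed-memory processes of the actual implementation, and the matrices and function names are assumptions for illustration only.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor
from scipy.integrate import solve_ivp

# Made-up linear test model x' = A x + B p (evaluated at p = 0): each
# sensitivity column s_j = dx/dp_j obeys s_j' = A s_j + b_j, and the
# columns are mutually independent, so they can be split across workers.
A = np.array([[-1.0, 0.2], [0.1, -0.5]])
B = np.eye(2)                    # one sensitivity column per parameter
x0 = np.array([1.0, 0.0])

def integrate_columns(cols):
    """Each worker repeats the state integration and augments it with
    only its assigned sensitivity columns (strategy DS)."""
    n = x0.size
    def rhs(t, z):
        x, S = z[:n], z[n:].reshape(n, -1)
        dS = A @ S + B[:, cols]          # sensitivity equations
        return np.concatenate([A @ x, dS.ravel()])
    z0 = np.concatenate([x0, np.zeros(n * len(cols))])
    sol = solve_ivp(rhs, (0.0, 1.0), z0, rtol=1e-8, atol=1e-10)
    return cols, sol.y[n:, -1].reshape(n, -1)

# Distribute the two columns over two workers and reassemble S = dx/dp.
with ThreadPoolExecutor(max_workers=2) as ex:
    parts = list(ex.map(integrate_columns, [[0], [1]]))
S = np.zeros((2, 2))
for cols, block in parts:
    S[:, cols] = block
```

Note the redundancy the text points out: every worker re-integrates the state system and recomputes the Jacobians, which is exactly the overhead the CJ and CJLU strategies below try to remove.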

Distributed sensitivities with centralized Jacobian (CJ): The fundamental idea of this approach is to decouple the state integration and the sensitivity integration. The state integration requires one Jacobian per integration step, whereas the sensitivity integration additionally requires several Jacobians for the construction of the right-hand sides of the sensitivity equations. However, the states, from which the Jacobians can be computed, are independent of the sensitivity system and can be obtained by a pure state integration. Hence, a preceding state integration can deliver the states, and the Jacobians can be computed in the time span until the sensitivity integration reaches the corresponding point on the time horizon. Thus, the computational burden of obtaining Jacobian matrices for the sensitivity integration can be reduced to the communication overhead.

Distributed sensitivities with centralized Jacobian and communicated LU decomposition (CJLU): This approach follows the idea of (CJ). In addition, the LU decompositions, which are computed by the state integrator, are communicated to the sensitivity integrator.
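The reuse exploited by CJ and CJLU rests on the fact that, in a linearly implicit step, the state update and the sensitivity update solve linear systems with the same iteration matrix M = I - h·J, so the LU factors computed once by the state integrator can simply be handed over instead of being refactorized. The sketch below shows this sharing for a simplified linearly implicit Euler step on a made-up linear problem; it is not LIMEX's extrapolation scheme, and the step function and test matrices are assumptions for illustration.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# One linearly implicit Euler step for state + sensitivities of x' = f(x),
# S' = J S with J = df/dx.  Both updates solve systems with the same
# matrix M = I - h*J, so the LU factors computed for the state step can
# be communicated to the sensitivity processes instead of being
# refactorized there (the CJLU idea).
def lin_implicit_step(f, J, x, S, h):
    n = x.size
    M = np.eye(n) - h * J(x)
    lu = lu_factor(M)                            # factorized once, by the state integrator
    x_new = x + lu_solve(lu, h * f(x))           # state update
    S_new = S + lu_solve(lu, h * (J(x) @ S))     # reuses the same LU factors
    return x_new, S_new

# Made-up linear test problem x' = A x, so J(x) = A and S' = A S.
A = np.array([[-2.0, 1.0], [0.0, -1.0]])
x, S = np.array([1.0, 1.0]), np.eye(2)
for _ in range(100):
    x, S = lin_implicit_step(lambda x: A @ x, lambda x: A, x, S, 0.01)
# For a linear problem both x and S follow the same discrete propagator,
# so S @ x(0) reproduces x exactly (up to roundoff).
```

In the sketch the saving is the repeated `lu_factor` call per process; in the paper's setting the factorization of the large DAE iteration matrix is far more expensive than communicating the factors, which is what CJLU trades it for.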

The following results were obtained employing 8 CPUs.

Example 1: polymerization process (236 DAEs): The controls are discretized by 68 parameters. The model is considered very small, as can be seen from the very small computational time for a single sensitivity integration. Only strategy DS has been successfully applied to this example, since the other approaches resulted in higher computational times than the serial code because of the large communication overhead. Even though this example is very small and features only a small number of sensitivities, a speedup of 2.02 has still been achieved: the serial sensitivity integration required 2.54 seconds, whereas strategy DS required 1.26 seconds.

Example 2: polymerization process (2104 DAEs): The controls are discretized by 197 parameters. For this example, strategy CJ turns out to be the fastest, DS gives slightly worse results, and CJLU is significantly worse than the other approaches. The serial sensitivity integration required 33.27 seconds, DS 11.28 seconds (speedup: 2.95), CJ 10.74 seconds (speedup: 3.10), and CJLU 17.55 seconds (speedup: 1.90).

Example 3: large-scale intermediate chemicals process (13956 DAEs): The controls are discretized by 75 parameters. This example is considered the most important one for the studied parallelization strategies. In contrast to the other examples, the parallelization performance almost reaches its theoretical limit, which is the state integration without sensitivities. The communication overhead that dominated the smaller examples is comparatively small here, since the overall computational times are large enough; furthermore, the frequency of communication decreases. The serial sensitivity integration required 3319 seconds, DS 2957 seconds (speedup: 1.12), CJ 811 seconds (speedup: 4.09), and CJLU 892 seconds (speedup: 3.72).
