(386d) End-to-End Reinforcement Learning of Koopman Models for Economic Model Predictive Control

Authors 

Mitsos, A., RWTH Aachen University
Dahmen, M., FZ Jülich

Data-driven surrogate models of dynamic process models are a promising way to make economic nonlinear model predictive control (eNMPC) computationally viable by reducing the burden of the underlying optimal control problems [1]. System identification (SI) is the most common approach to training such data-driven dynamic surrogate models, but it focuses narrowly on maximizing the average prediction accuracy on a set of simulation samples. In contrast, dynamic surrogate models trained directly for optimal performance in a control application using reinforcement learning (RL) were recently shown to outperform SI-trained models [2-6]. These findings, however, were restricted to the learning of linear models [2], to applications without state constraints [2-5], or to approaches that primarily adapt bounds and the cost function rather than the dynamic MPC model itself [6].

The vast majority of RL research focuses on learning model-free control policies, i.e., policies that do not use predictions of the system states to determine the control actions. In contrast, learning a dynamic model and using its predictions to obtain a control law, as in eNMPC, offers several advantages: First, no retraining is required if constraints or objective functions change, as long as the system dynamics remain unchanged. Second, learning the system dynamics may be more sample-efficient than learning a sensible control policy directly [2,3]. Third, MPC has a rich theory regarding performance and stability guarantees, especially for linear models, and recent publications aim to extend this established theory to (learned) eNMPC [6,7].

We present a framework for end-to-end learning of nonlinear dynamic surrogate models for optimal performance in eNMPC applications with hard constraints on states. Specifically, we use applied Koopman theory and its extension to controlled systems [8] to obtain a model structure that can capture nonlinear dynamics yet gives rise to a convex MPC problem. We use post-optimal sensitivity analysis [9,10] to construct MPC policies whose control outputs can be differentiated with respect to the parameters of the surrogate models. This enables us to use state-of-the-art model-free RL algorithms such as Proximal Policy Optimization (PPO) [11] to train the dynamic surrogate models for optimal performance in eNMPC. We test our approach on two case studies derived from a well-studied continuous stirred-tank reactor model [12,13]: (i) an NMPC case study in which the controller must stabilize the product concentration given a fluctuating product flow rate, and (ii) a demand response case study in which an eNMPC must minimize electricity cost subject to hard constraints on state variables. We compare the resulting control performance to that of dynamic models trained solely using SI and of model-free policies trained using RL. We show that end-to-end trained dynamic surrogate models, like model-free policies, consistently outperform models trained by SI. Additionally, we find that, unlike model-free policies, MPCs employing an end-to-end trained dynamic surrogate model can successfully adapt to constraint changes without retraining.
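
The core of such a framework can be sketched as a Koopman surrogate with a learned lifting, combined with a convex MPC problem built as a differentiable optimization layer (here via cvxpylayers, following the differentiable convex optimization layers of [10]), so that gradients of the controller outputs with respect to the surrogate parameters become available to the RL algorithm. This is a minimal illustrative sketch: all names, dimensions, bounds, and the economic cost structure below are assumptions, not the exact formulation used in the work.

```python
import cvxpy as cp
import torch
import torch.nn as nn
from cvxpylayers.torch import CvxpyLayer

NX, NU, NZ, HORIZON = 2, 1, 8, 10  # assumed state/input/lifted dimensions and horizon


class KoopmanSurrogate(nn.Module):
    """Koopman surrogate with control: z_{k+1} = A z_k + B u_k, x_hat = C z."""

    def __init__(self):
        super().__init__()
        # learned lifting psi(x) -> z (architecture is an illustrative choice)
        self.encoder = nn.Sequential(nn.Linear(NX, 32), nn.Tanh(), nn.Linear(32, NZ))
        self.A = nn.Parameter(0.1 * torch.randn(NZ, NZ))
        self.B = nn.Parameter(0.1 * torch.randn(NZ, NU))
        self.C = nn.Parameter(0.1 * torch.randn(NX, NZ))


def build_mpc_layer():
    """Convex MPC over the lifted linear dynamics; A, B, C enter as parameters,
    so the optimal inputs are differentiable w.r.t. the surrogate model."""
    A = cp.Parameter((NZ, NZ))
    B = cp.Parameter((NZ, NU))
    C = cp.Parameter((NX, NZ))
    z0 = cp.Parameter(NZ)          # lifted initial state psi(x_t)
    price = cp.Parameter(HORIZON)  # electricity price forecast (economic objective)
    x_lb = cp.Parameter(NX)        # state bounds (can be changed without retraining)
    x_ub = cp.Parameter(NX)

    z = cp.Variable((HORIZON + 1, NZ))
    u = cp.Variable((HORIZON, NU))
    s = cp.Variable((HORIZON, NX), nonneg=True)  # slacks keep the layer feasible

    constraints = [z[0] == z0]
    cost = 0
    for k in range(HORIZON):
        constraints += [z[k + 1] == A @ z[k] + B @ u[k]]        # lifted dynamics
        constraints += [C @ z[k + 1] >= x_lb - s[k],
                        C @ z[k + 1] <= x_ub + s[k]]            # state bounds
        constraints += [u[k] >= 0.0, u[k] <= 1.0]               # normalized inputs
        cost += price[k] * cp.sum(u[k]) + 1e3 * cp.sum(s[k])    # cost + slack penalty

    problem = cp.Problem(cp.Minimize(cost), constraints)
    return CvxpyLayer(problem,
                      parameters=[A, B, C, z0, price, x_lb, x_ub],
                      variables=[u, z, s])
```

In this sketch, a control step would evaluate the layer at the current lifted state, e.g. mpc_layer(model.A, model.B, model.C, model.encoder(x), price, x_lb, x_ub), and apply the first element of the returned input trajectory to the plant; because the layer is differentiable, a policy-gradient update such as PPO can backpropagate through it into A, B, C, and the encoder weights.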

[1] McBride, K., & Sundmacher, K. (2019). Overview of surrogate modeling in chemical process engineering. Chemie Ingenieur Technik, 91(3), 228-239.

[2] Chen, B., Cai, Z., & Bergés, M. (2019). GNU-RL: A precocial reinforcement learning solution for building HVAC control using a differentiable MPC policy. In Proceedings of the 6th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation (pp. 316-325).

[3] Amos, B., Jimenez, I., Sacks, J., Boots, B., & Kolter, J. Z. (2018). Differentiable MPC for end-to-end planning and control. Advances in Neural Information Processing Systems, 31.

[4] Yin, H., Welle, M. C., & Kragic, D. (2022). Embedding Koopman Optimal Control in Robot Policy Learning. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 13392-13399).

[5] Iwata, T., & Kawahara, Y. (2022). Data-driven End-to-end Learning of Pole Placement Control for Nonlinear Dynamics via Koopman Invariant Subspaces. arXiv preprint arXiv:2208.08883.

[6] Gros, S., & Zanon, M. (2020). Data-driven economic NMPC using reinforcement learning. IEEE Transactions on Automatic Control, 65(2), 636-648.

[7] Angeli, D., Amrit, R., & Rawlings, J. B. (2012). On average performance and stability of economic model predictive control. IEEE Transactions on Automatic Control, 57(7), 1615-1626.

[8] Korda, M., & Mezić, I. (2018). Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control. Automatica, 93, 149-160.

[9] Fiacco, A. V., & Ishizuka, Y. (1990). Sensitivity and stability analysis for nonlinear programming. Annals of Operations Research, 27(1), 215-235.

[10] Agrawal, A., Amos, B., Barratt, S., Boyd, S., Diamond, S., & Kolter, J. Z. (2019). Differentiable convex optimization layers. Advances in Neural Information Processing Systems, 32.

[11] Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347.

[12] Petersen, D., Beal, L. D., Prestwich, D., Warnick, S., & Hedengren, J. D. (2017). Combined noncyclic scheduling and advanced control for continuous chemical processes. Processes, 5(4), 83.

[13] Baader, F. J., Bardow, A., & Dahmen, M. (2022). Simultaneous mixed-integer dynamic scheduling of processes and their energy systems. AIChE Journal, 68(8), e17741.