(188o) Approximate Dynamic Programming for Nonlinear Process Control Under Uncertainty

Authors: 
Yang, Y., California State University Long Beach
Although model predictive control (MPC) has become a dominant methodology for process control in both industry and academia, it still suffers from several drawbacks. First, it is difficult to find an optimal solution for a large-scale nonlinear process with a long control horizon. Second, control performance cannot be guaranteed if uncertainties or model mismatches are not properly taken into account. To overcome these shortcomings, several strategies have been proposed to handle online computational demands and uncertainties, such as fast MPC, robust MPC, and stochastic MPC.

In this work, we focus on approximate dynamic programming (ADP) [1] to improve MPC and develop a simple control strategy with lower computational complexity and better performance. ADP, known as reinforcement learning in the computer science community, is a data-driven method for solving multistage scheduling problems that overcomes the curse of dimensionality for optimization in high-dimensional spaces. The essential idea is to extract and generalize useful information from sampled state trajectories to build a value function, which represents the control cost of the current control policy. A better control action can then be selected by minimizing the value function of the process state at the next step, based on the model prediction. The key advantage of ADP is that only a one-step prediction and online optimization are required, so the computational burden is reduced significantly. Moreover, model uncertainty can be compensated for if the value function is constructed from real operational data. Unlike conventional optimal control theory, ADP offers a reliable and practical way to improve the current control performance.
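The one-step selection rule described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from this work: the scalar model `model`, the quadratic `stage_cost`, the placeholder value function `value`, and the grid search over candidate inputs. The objective used here adds a stage cost to the value of the predicted next state, which is one common way to pose the one-step problem.

```python
def model(x, u):
    """Hypothetical one-step nonlinear process model (illustrative only)."""
    return 0.8 * x + 0.1 * x ** 2 + u


def stage_cost(x, u):
    """Hypothetical quadratic cost of being at state x and applying input u."""
    return x ** 2 + 0.1 * u ** 2


def value(x):
    """Placeholder approximate cost-to-go; in practice it would be fitted to data."""
    return 2.0 * x ** 2


def one_step_control(x, candidates):
    """Pick the input minimizing stage cost plus the value of the predicted next state."""
    return min(candidates, key=lambda u: stage_cost(x, u) + value(model(x, u)))


# Candidate inputs on a grid over the assumed input range [-1, 1].
candidates = [i / 100.0 for i in range(-100, 101)]

# Closed-loop simulation from an assumed initial state, without noise.
x = 1.0
trajectory = [x]
for _ in range(20):
    u = one_step_control(x, candidates)
    x = model(x, u)
    trajectory.append(x)
```

Only a single model evaluation per candidate input is needed at each time step, in contrast to the multi-step horizon optimization of MPC. A grid search is used here for simplicity; a continuous optimizer could replace it.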

Clearly, the quality of the proposed control approach depends strongly on the value function, which is built on the post-state space to avoid considering uncertainty explicitly in the formulation. In this study, we use MPC to generate a number of state trajectories and then employ three different strategies to approximate the value function for MPC. The first and easiest is to fit the data with a polynomial function, which requires solving a least-squares problem. Second, convex regression is employed to build the value function such that the desired steady state becomes its global minimum. Third, hinge functions, which are piecewise linear, are used to approximate the operational data with arbitrary accuracy. This results in a mixed-integer linear programming (MILP) problem, and an efficient computational strategy is proposed to obtain a suboptimal solution.
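As a minimal sketch of the first strategy only, the following fits a quadratic polynomial value function to sampled data by least squares. The data are synthetic and hypothetical: the two-dimensional states, the "true" cost-to-go coefficients, and the noise level are assumptions chosen for illustration, standing in for the cost-to-go values that would be computed from MPC-generated trajectories. The helper names `quadratic_features` and `value` are likewise illustrative.

```python
import numpy as np


def quadratic_features(X):
    """Monomials up to degree 2 for 2-D states: [1, x1, x2, x1^2, x1*x2, x2^2]."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 ** 2, x1 * x2, x2 ** 2])


# Synthetic stand-in for sampled states and their observed cost-to-go values.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
J = 2.0 * X[:, 0] ** 2 + 0.5 * X[:, 0] * X[:, 1] + 1.0 * X[:, 1] ** 2
J = J + 0.01 * rng.standard_normal(200)  # small additive noise

# Least-squares fit of the polynomial coefficients.
Phi = quadratic_features(X)
theta, *_ = np.linalg.lstsq(Phi, J, rcond=None)


def value(x):
    """Evaluate the fitted polynomial value function at a single state x."""
    return float((quadratic_features(np.atleast_2d(x)) @ theta)[0])
```

Calling `value([0.5, -0.5])` evaluates the fitted approximation at a new state. A plain polynomial fit does not by itself guarantee convexity or a minimum at the desired steady state, which is the motivation given above for the convex regression strategy.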

Each of the strategies above has pros and cons. In the numerical study, we design and compare three controllers based on these strategies for two nonlinear processes with additive noise.

Reference

[1] Lee JM, Lee JH. Approximate dynamic programming-based approaches for input–output data-driven control of nonlinear processes. Automatica, 2005; 1281-1288.