Continuous Control of a Batch Crystallization System with Deep Reinforcement Learning

Source: AIChE
  • Type:
    Conference Presentation
  • Pricing (Individuals):
    AIChE Member Credits: 0.5
    AIChE Members: $19.00
    AIChE Graduate Student Members: Free
    AIChE Undergraduate Student Members: Free
    Computing and Systems Technology Division Members: Free
    Non-Members: $29.00
  • Conference Type:
    AIChE Spring Meeting and Global Congress on Process Safety
  • Presentation Date:
    August 19, 2020
  • Duration:
    20 minutes
  • Skill Level:
    Intermediate
  • PDHs:
    0.40


The development of controllers for chemical process models requires careful examination of the process dynamics, development of mathematical models, tuning of parameters, and formulation of objective functions suited to the target application. Reinforcement Learning (RL) offers an appealing alternative. Once the problem has been cast into the RL framework, the algorithm learns automatically by interacting with the system and gathering data. Although RL has been applied to process control before, it has been met with skepticism owing to problems such as the sparsity of learning signals. However, recent breakthroughs in deep learning have made RL algorithms more flexible and capable of handling a wider range of problems.
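To make the "cast into the RL framework" step concrete, the sketch below shows one plausible way to frame a batch crystallization run as an episodic RL environment with a state, an action, and a reward. Everything here is an illustrative assumption: the class name, the histogram-based state, the shift-based placeholder dynamics, and the sparse terminal reward are not taken from the presentation, which uses a Fokker-Planck model instead.

```python
# Hypothetical sketch: casting batch crystallization control as an RL problem.
# The dynamics below are a crude placeholder, NOT the Fokker-Planck model
# from the presentation; only the state/action/reward framing is the point.
import numpy as np

class BatchCrystallizerEnv:
    """Episodic environment: one episode corresponds to one batch."""

    def __init__(self, batch_steps=50, n_bins=20):
        self.batch_steps = batch_steps
        self.n_bins = n_bins  # crystal-size histogram bins
        # Assumed target: a Gaussian-shaped crystal size distribution (CSD).
        self.target_csd = np.exp(-0.5 * ((np.arange(n_bins) - 12) / 3.0) ** 2)
        self.target_csd /= self.target_csd.sum()

    def reset(self):
        self.t = 0
        self.csd = np.zeros(self.n_bins)
        self.csd[0] = 1.0  # all mass starts as nuclei in the smallest bin
        return self._state()

    def _state(self):
        # Observation: current CSD plus normalized batch time.
        return np.concatenate([self.csd, [self.t / self.batch_steps]])

    def step(self, action):
        # action[0]: cooling rate in [-1, 1]; action[1]: anti-solvent flow in [0, 1]
        cooling, flow = np.clip(action, [-1.0, 0.0], [1.0, 1.0])
        # Placeholder growth kinetics: mass shifts toward larger bins at a
        # rate set by the two manipulated variables.
        growth = 0.05 + 0.1 * flow + 0.05 * max(cooling, 0.0)
        shifted = np.roll(self.csd, 1)
        shifted[0] = 0.0
        self.csd = (1.0 - growth) * self.csd + growth * shifted
        self.csd /= self.csd.sum()
        self.t += 1
        done = self.t >= self.batch_steps
        # Sparse terminal reward: negative L1 distance to the target CSD.
        reward = -np.abs(self.csd - self.target_csd).sum() if done else 0.0
        return self._state(), reward, done
```

The sparse terminal reward illustrates the learning-signal problem mentioned above: the agent receives feedback only at the end of the batch.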

In this work, we explore the application of a deep reinforcement learning controller to batch crystallization in the sodium chloride–water–ethanol anti-solvent system. The objective of the controller is to reach a target crystal size distribution at the end of the batch by manipulating the system temperature and the anti-solvent flow rate. To do this, we implement a deep neural network controller with an actor-critic structure that is trained to maximize the long-term reward at each sampling time. The network is trained on a Fokker-Planck model whose parameters are estimated by dynamic optimization, and it learns in this environment through exploration. We evaluate the controller's performance against baselines established by classical control techniques, and finally analyze the advantages and disadvantages of this framework.
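As a rough illustration of what an actor-critic controller for this problem could look like, here is a minimal one-step advantage actor-critic sketch in PyTorch that pairs with the environment sketched above. The presentation does not specify the algorithm variant, architecture, or hyperparameters; the layer sizes, Gaussian policy, and Monte Carlo return estimate are all assumptions. The actor outputs a distribution over the two manipulated variables (cooling rate and anti-solvent flow), and the critic estimates the state value used as a baseline.

```python
# Hypothetical actor-critic sketch (PyTorch), not the authors' implementation.
import torch
import torch.nn as nn

class ActorCritic(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh(),
                                  nn.Linear(hidden, hidden), nn.Tanh())
        self.mu = nn.Linear(hidden, act_dim)        # actor: action mean
        self.log_std = nn.Parameter(torch.zeros(act_dim))
        self.value = nn.Linear(hidden, 1)           # critic: state value

    def forward(self, obs):
        h = self.body(obs)
        dist = torch.distributions.Normal(self.mu(h), self.log_std.exp())
        return dist, self.value(h).squeeze(-1)

def train(env, epochs=200, gamma=0.99, lr=3e-4):
    net = ActorCritic(obs_dim=env.n_bins + 1, act_dim=2)
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        obs, done = env.reset(), False
        log_probs, values, rewards = [], [], []
        while not done:  # roll out one full batch (episode)
            dist, v = net(torch.as_tensor(obs, dtype=torch.float32))
            a = dist.sample()
            obs, r, done = env.step(a.numpy())
            log_probs.append(dist.log_prob(a).sum())
            values.append(v)
            rewards.append(r)
        # Discounted returns, propagated back from the sparse terminal reward.
        returns, G = [], 0.0
        for r in reversed(rewards):
            G = r + gamma * G
            returns.append(G)
        returns = torch.tensor(list(reversed(returns)), dtype=torch.float32)
        values = torch.stack(values)
        advantage = returns - values.detach()
        # Policy-gradient loss with a value-function baseline, plus critic loss.
        loss = (-torch.stack(log_probs) * advantage).mean() \
               + 0.5 * ((returns - values) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return net

# Usage with the environment sketched earlier:
# net = train(BatchCrystallizerEnv())
```

A stochastic Gaussian policy is one simple way to realize the "learning by exploration" described above; the learned standard deviation controls how aggressively the agent probes the temperature and flow-rate inputs.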

