(177b) Utilizing Deep Reinforcement Learning for Supply Chain Materials Planning | AIChE

Authors 

Hubbs, C. D., The Dow Chemical Company
Wassick, J. M., The Dow Chemical Company
Amaran, S., The Dow Chemical Company
This paper presents a novel approach to supply chain planning optimization using deep reinforcement learning and compares it to modern mathematical optimization techniques. The supply chain is modeled as a factory with multiple products and stochastic demand for each product. The deep reinforcement learning agent interacts with this environment and learns from experience, captured in neural networks, which product to produce and when to produce it in order to satisfy customer demand. The neural networks approximate the value of each state with respect to a value function that accounts for on-time delivery, the level of inventory on hand, and the customer service level. The agent learns through simulations of the manufacturing environment and demand patterns so as to maximize the reward received through the value function. In parallel, we formulate a mixed-integer program that must make the same decisions under the same constraints. The mixed-integer program can see orders out to a certain horizon and does not use any information from historical order patterns. We compare the performance of the two methodologies on several scheduling test examples and suggest how to choose between them, as well as how they may be used in concert.
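
To make the setup described above concrete, the following is a minimal sketch of a simulated factory that an agent could interact with. It is not the authors' model: the class name `FactoryEnv`, the uniform demand distribution, the lost-sales treatment of unmet demand, and the reward weights are all assumptions chosen for illustration, and a simple rule-based policy stands in for the trained agent.

```python
import random


class FactoryEnv:
    """Toy single-line, multi-product factory with stochastic demand.

    Illustrative assumptions only: uniform integer demand, unmet demand
    is lost rather than backlogged, and the reward weights are arbitrary.
    """

    def __init__(self, n_products=3, batch_size=10, max_demand=6, seed=0):
        self.n_products = n_products
        self.batch_size = batch_size
        self.max_demand = max_demand
        self.rng = random.Random(seed)
        self.inventory = [0] * n_products

    def reset(self):
        """Empty the warehouse and return the initial state."""
        self.inventory = [0] * self.n_products
        return tuple(self.inventory)

    def step(self, action):
        """Produce one batch of product `action`, then serve random demand."""
        self.inventory[action] += self.batch_size
        demand = [self.rng.randint(0, self.max_demand)
                  for _ in range(self.n_products)]
        shipped = [min(i, d) for i, d in zip(self.inventory, demand)]
        missed = sum(d - s for d, s in zip(demand, shipped))
        self.inventory = [i - s for i, s in zip(self.inventory, shipped)]
        # Reward shipments; penalize held inventory and missed orders.
        reward = 1.0 * sum(shipped) - 0.1 * sum(self.inventory) - 2.0 * missed
        return tuple(self.inventory), reward


def lowest_stock_policy(state):
    """Placeholder policy: make whichever product has the least inventory."""
    return min(range(len(state)), key=lambda i: state[i])


def run_episode(env, policy, horizon=30):
    """Roll the policy forward for `horizon` periods and sum the reward."""
    state = env.reset()
    total = 0.0
    for _ in range(horizon):
        state, reward = env.step(policy(state))
        total += reward
    return total


if __name__ == "__main__":
    print(run_episode(FactoryEnv(seed=1), lowest_stock_policy))
```

In a learning setting, `lowest_stock_policy` would be replaced by a policy derived from the neural network's value estimates, and many simulated episodes would be used to train it.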