(516h) Smart Perfusion Machines | AIChE

Authors 

Lucia, A. - Presenter, University of Rhode Island
Machine perfusion (MP) is an engineered system used to preserve organs for transplant and has been tested in clinical trials. At first glance, one would think it has nothing to do with chemical engineering. A closer look, however, shows that MP contains many fundamental aspects of chemical engineering – mass balances, reactions/reaction networks, thermodynamics, heat transfer, and recycle. For example, liver preservation by machine perfusion involves an interconnected reaction network (i.e., the liver), mass conservation, flow in and out, solubilities of nutrients and oxygen in the perfusate (i.e., thermodynamics), and recirculation of the perfusate (i.e., heat transfer and recycle). The primary goals of machine perfusion are to expand the organ donor pool and to maximize preservation time while maintaining organ viability. The system variables that can be adjusted at any time are the amounts of nutrients added, the chamber temperature, and the O2 flow. MP has the following problem attributes: (1) decision-making, (2) combinatorial complexity, and (3) uncertainty (i.e., noisy data, partial domain coverage). Moreover, there is no general agreement on MP protocols, and the field is moving toward dynamic intervention; thus, we can add (4) ad hoc/sub-optimal protocols to the list. The objective of this work is to use machine learning to develop ‘smart’ policies for MP that maximize machine perfusion (or preservation) time.
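The adjustable variables above suggest a natural state/action encoding for the control problem. The following sketch shows one such encoding; all variable names, units, and ranges are illustrative assumptions, not part of any published protocol.

```python
from dataclasses import dataclass

# Hypothetical encoding of the machine-perfusion control problem.
# Names, units, and bounds are illustrative assumptions only.

@dataclass
class MPState:
    """Observable (noisy, partially known) state of the perfused liver."""
    metabolite_conc: float   # lumped metabolite concentration (mmol/L)
    perfusate_pH: float      # perfusate pH
    o2_saturation: float     # dissolved O2 fraction in the perfusate
    temperature_C: float     # chamber temperature (C)

@dataclass
class MPAction:
    """Controls that can be adjusted at any time during perfusion."""
    nutrient_feed: float     # nutrient addition rate (mmol/h)
    temperature_C: float     # chamber temperature setpoint (C)
    o2_flow: float           # O2 flow rate (L/min)

def clip_action(a: MPAction) -> MPAction:
    """Project an action onto assumed physically allowed ranges."""
    return MPAction(
        nutrient_feed=min(max(a.nutrient_feed, 0.0), 10.0),
        temperature_C=min(max(a.temperature_C, 4.0), 38.0),
        o2_flow=min(max(a.o2_flow, 0.0), 2.0),
    )
```

A policy in this setting is simply a map from `MPState` to `MPAction`, applied at each decision point during perfusion.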

Current derivative-based methods for policy improvement in machine learning attempt to maximize expected return using gradient information (i.e., policy gradient methods) and are preferred in applications with uncertainty and complex continuous states and actions, as is the case here. However, policy gradient methods are local methods, are often slow to converge, and can become trapped in local maxima. Therefore, in this work, we use global optimization methods to generate ‘smart’ policies and, as a result, provide globally optimal protocols for MP. To do this, we use the terrain/funneling methods of Lucia and coworkers (2001, 2004, 2008) to maximize expected return as a function of states and actions at each time step in policy-based reinforcement learning.
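The local-maximum pitfall can be illustrated on a toy one-parameter policy. The sketch below (not the terrain/funneling method itself, and with an assumed return surface) shows plain gradient ascent stalling at a local maximum of the expected return, while a crude multi-start search, standing in for true global optimization, recovers the global one.

```python
import math

def expected_return(theta: float) -> float:
    # Assumed toy return surface: a global maximum near theta = 1.5
    # and a smaller local maximum near theta = -1.6.
    return math.exp(-(theta - 1.5) ** 2) + 0.6 * math.exp(-(theta + 1.6) ** 2)

def gradient_ascent(theta0: float, lr: float = 0.1, steps: int = 200,
                    h: float = 1e-5) -> float:
    """Plain ascent on J(theta) with a finite-difference gradient."""
    theta = theta0
    for _ in range(steps):
        grad = (expected_return(theta + h) - expected_return(theta - h)) / (2 * h)
        theta += lr * grad
    return theta

# Started on the wrong side, gradient ascent climbs only the local bump.
local = gradient_ascent(-2.0)   # converges near theta = -1.6

# A crude multi-start search samples many starting points and keeps
# the best converged result, finding the global maximum near 1.5.
starts = [-4.0 + 0.5 * i for i in range(17)]
best = max((gradient_ascent(t0) for t0 in starts), key=expected_return)
```

The same failure mode carries over to high-dimensional policy parameters, which is why the abstract argues for global optimization of the return.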

The specific example studied in this work is the resuscitation of ischemic livers using machine perfusion. A key challenge in applying machine learning to MP is that the initial state of each liver is different – each contains varying amounts of metabolites, cofactors, and enzymes, and much of this information is unknown. Therefore, a technician overseeing machine perfusion faces a new challenge each time. Standard MP protocols usually involve placing the liver in static cold storage (SCS) at 4 C and then performing MP at a specified temperature for a designated period of time constrained between 4 and 24 hours. Current research favors MP temperatures in the range 21–37 C, and the average perfusion time is 9 hours.
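The quoted bounds make the standard protocol easy to encode as a small parameter vector. The sketch below does so; the SCS duration is an assumed placeholder, since the abstract does not fix it.

```python
# Hypothetical encoding of the standard SCS-then-MP protocol using the
# ranges quoted above: MP temperature 21-37 C, perfusion time 4-24 h.

def valid_protocol(mp_temp_C: float, mp_hours: float) -> bool:
    """Check a candidate (temperature, duration) pair against the quoted ranges."""
    return 21.0 <= mp_temp_C <= 37.0 and 4.0 <= mp_hours <= 24.0

def protocol_schedule(mp_temp_C: float, mp_hours: float, scs_hours: float = 6.0):
    """Return the protocol as (phase, temperature_C, hours) tuples.

    The 6 h SCS duration is an assumed placeholder, not a published value.
    """
    if not valid_protocol(mp_temp_C, mp_hours):
        raise ValueError("protocol outside allowed ranges")
    return [("SCS", 4.0, scs_hours), ("MP", mp_temp_C, mp_hours)]
```

A learned policy would then search over `(mp_temp_C, mp_hours)` (or a time-varying generalization) rather than fixing them in advance.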

Using the currently accepted protocol of SCS followed by MP as a starting point, we present numerical results that provide proof of concept that policy-based reinforcement learning can be used to create ‘smart’ policies for MP. It is shown that policies can have multiple maxima and that global optimization is the only way to find truly optimal policies. For example, the policy optimization approach developed in this work shows that temperatures in the sub-normothermic range (21–34 C) are preferred over temperatures in the normothermic range (35–38 C). In addition, we show that liver viability constraints, such as maintaining pH > 7.3, can have an impact on policy optimization and perfusion times. Several other numerical illustrations are presented to elucidate key ideas and show improvement in MP performance.
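One common way a viability constraint such as pH > 7.3 can enter policy optimization is as a penalty on the expected return. The sketch below shows this device; the return model and penalty weight are illustrative assumptions, not the formulation used in the work.

```python
# Penalty-based handling of a viability constraint (illustrative only).

def penalized_return(base_return: float, min_pH: float,
                     pH_floor: float = 7.3, weight: float = 100.0) -> float:
    """Subtract a quadratic penalty when the trajectory's minimum pH
    falls below the assumed viability floor of 7.3."""
    violation = max(0.0, pH_floor - min_pH)
    return base_return - weight * violation ** 2
```

Trajectories that keep the perfusate above the floor are unaffected, while violating trajectories are penalized in proportion to the squared excursion, steering the optimizer toward viable policies.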