(761c) Multi-Stage Stochastic Programming Models for Pharmaceutical Clinical Trial Planning | AIChE

Cremaschi, S., Auburn University
The pharmaceutical industry is a global business with a market of over one trillion U.S. dollars per year and extensive supply chains throughout the world [1]. A potential drug identified at the discovery stage goes through pre-clinical testing. The goal of these laboratory and animal-model studies is to understand how the drug works and to assess its safety. Clinical trials then aim to demonstrate the safety and efficacy of the potential drug; they are designed and carried out under the strict guidelines and supervision of regulatory bodies. If a drug successfully completes clinical trials and is approved by the regulatory bodies, it is manufactured and distributed to the market. Pharmaceutical manufacturers are under pressure to improve the efficiency of the R&D pipeline, partly because patent protection for a number of significant brand-name drugs will soon expire [2].

Careful scheduling and planning of clinical trials is one effective way to reduce the cost of developing new drugs. Clinical trials have three phases. Phase I trials assess the safety of the drug and how it is metabolized in the body. Phase II trials evaluate the drug’s effectiveness and short-term side effects on a limited number of target patient volunteers. Phase III trials assess the benefit-risk ratio of the drug using a large number of target patient volunteers (on the order of thousands) [3]. Clinical trial planning is complicated by its highly stochastic nature: pharmaceutical companies do not know a priori which drugs will successfully complete clinical trials. The outcomes of clinical trials significantly influence the drug development plan, the investments, and the overall profits. An effective approach to the clinical trial planning problem is multi-stage stochastic programming (MSSP).

Colvin and Maravelias [4] developed an MSSP model for clinical trial planning. They explicitly incorporated endogenous uncertainty [5] into their model, i.e., the outcome of a clinical trial is revealed only once that trial is completed. In a later study, they exploited the structure of the problem to reduce the number of scenarios and extended their model to account for resource planning by introducing outsourcing decisions [6]. Solak et al. [7] developed a sample average approximation (SAA) based algorithm to generate candidate solutions for the MSSP. Colvin and Maravelias [8] introduced a number of theoretical properties that reduce the problem size and improve the tightness of the formulation, and then developed a novel branch and cut algorithm to solve the resulting problem efficiently.

Motivated by the above, we propose two new MSSP formulations for the pharmaceutical clinical trial planning problem. The givens are: (1) a set of candidate drugs (i ∈ I) that should go through a set of clinical trials (j ∈ J), (2) the length of the planning horizon, which is discretized into equal time periods t = 1, 2, 3, …, T (period t starts at time t-1 and ends at time t), (3) the length, resource requirements, and cost of each clinical trial for each drug, and (4) the potential revenue of each drug if it successfully completes all clinical trials. The decisions are which clinical trials of which drugs to carry out, and when to start them. The objective is to maximize the expected net present value (ENPV) [4]. The clinical-trial outcome uncertainty of each drug is defined by a parameter whose outcome space is {I-fail, II-fail, III-fail, III-pass}. The scenarios for the MSSP model are constructed as the Cartesian product of the uncertain parameter outcomes, so the total number of scenarios for a problem with |I| drugs is |S| = 4^{|I|}.
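The scenario construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the drug labels and the helper names `build_scenarios` and `trials_passed` are hypothetical and do not come from the original models.

```python
from itertools import product

# Outcome space of the uncertain parameter for a single drug, as given in the text.
OUTCOMES = ("I-fail", "II-fail", "III-fail", "III-pass")


def build_scenarios(drugs):
    """Return every scenario as a dict mapping drug -> outcome.

    The scenario set is the Cartesian product of the per-drug outcome
    spaces, so it contains 4 ** len(drugs) entries.
    """
    return [dict(zip(drugs, combo)) for combo in product(OUTCOMES, repeat=len(drugs))]


def trials_passed(outcome):
    """Number of trials a drug passes under an outcome (0 for I-fail, 3 for III-pass)."""
    return OUTCOMES.index(outcome)


drugs = ["D1", "D2", "D3"]  # hypothetical drug labels
scenarios = build_scenarios(drugs)
print(len(scenarios), 4 ** len(drugs))
```

Because the scenario count grows as 4^{|I|}, even a six-drug instance already has 4096 scenarios, which is why formulation tightness matters for solution time.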

The first formulation, referred to as CM1, uses two binary decision variables, X_{i,j,p,s} and Y_{i,j,t,s}. The first, X_{i,j,p,s}, tracks when a drug starts a clinical trial: it is equal to 1 if drug i starts clinical trial j at time period p in scenario s. The second, Y_{i,j,t,s}, tracks when a clinical trial is completed: it is equal to 1 if drug i completes clinical trial j at time period t in scenario s. The constraints ensure that (1) drug i starts clinical trial j only once, (2) drug i finishes clinical trial j only once, (3) clinical trial j+1 of drug i cannot be completed before its previous trial j, and (4) the resources utilized in any given time period remain within the available resources. The second formulation, referred to as CM2, introduces one more binary decision variable, W_{i,j,t,p,s}, which tracks both the start and end time periods of clinical trial j of drug i. This variable is equal to 1 if the (drug, clinical trial) pair (i, j) starts at time period p and ends at time period t in scenario s. The new binary variable satisfies the logical relation W_{i,j,t,p,s} <=> X_{i,j,p,s} ∧ Y_{i,j,t,s}.
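The abstract does not state how the logical relation between W, X, and Y is written as linear constraints. One standard linearization of a binary conjunction, shown here purely as an assumption, is W <= X, W <= Y, and W >= X + Y - 1. The sketch below enumerates all binary combinations and checks that these three inequalities leave exactly one feasible value of W, namely X AND Y. The function name `linearization_holds` is hypothetical.

```python
from itertools import product


def linearization_holds(x, y, w):
    """True if (x, y, w) satisfies w <= x, w <= y, and w >= x + y - 1."""
    return w <= x and w <= y and w >= x + y - 1


# For every binary (x, y), list the values of w that the inequalities allow.
feasible_w = {
    (x, y): [w for w in (0, 1) if linearization_holds(x, y, w)]
    for x, y in product((0, 1), repeat=2)
}

for (x, y), ws in feasible_w.items():
    print(f"x={x}, y={y} -> feasible w: {ws}")
```

In a model, one such triple of constraints would be written for each index combination (i, j, t, p, s); whether the original CM2 model uses this form or an equivalent one is not specified in the text.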

We applied both formulations to 60 different instances [9] of the clinical trial planning problem. For comparison, all instances were also solved using the MSSP formulation of [4], referred to here as CM3. All models were implemented in Pyomo [10] and solved using CPLEX 12.6.3 on the Auburn University Hopper Cluster. All problems were solved to a 0.1% optimality gap with all three formulations. In general, the solution times of CM3 were the longest and those of CM2 the shortest. For example, CM2 took only 8114 CPU s to solve the six-product case, while CM1 took 12991 CPU s and CM3 took 108746 CPU s. Similar improvements were observed for all problems solved, and the differences in solution times were more pronounced for larger instances of the problem. Furthermore, most problems were solved at the root node with all formulations. For instances where CPLEX branched, CM2 consistently required fewer branches than CM1 and CM3 to close the optimality gap, suggesting that the additional binary variable provided CPLEX with a more efficient branching variable.
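The relative solution times for the six-product case follow directly from the three CPU times reported above. The short calculation below is only arithmetic on those figures; it assumes all three values are in the same unit, and the variable names are hypothetical.

```python
# CPU times reported for the six-product case (same unit assumed for all three).
cpu_time = {"CM1": 12991, "CM2": 8114, "CM3": 108746}

# Ratio of each formulation's time to the fastest one (CM2).
ratio_to_cm2 = {name: t / cpu_time["CM2"] for name, t in cpu_time.items()}

for name, ratio in sorted(ratio_to_cm2.items()):
    print(f"{name}: {ratio:.2f}x the CM2 time")
```

On these figures CM1 takes roughly 1.6 times and CM3 roughly 13.4 times as long as CM2.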


This work was completed in part with resources provided by the Auburn University Hopper Cluster. The authors would like to acknowledge financial support from the NSF CAREER Award #1623417.


1. PhRMA, 2016 Biopharmaceutical Research Industry Profile, Pharmaceutical Research and Manufacturers of America. 2016, PhRMA: Washington, DC.

2. IMS Institute, Global use of medicines: outlook through 2016. Future Prescr, 2013. 14(1): p. 2-10.

3. FDA, FDA Drug Approval Process - US Food and Drug Administration. 2017 [cited 30 January 2017]; Available from: http://www.fda.gov/downloads/Drugs/ResourcesForYou/Consumers/UCM284393.pdf.

4. Colvin, M. and C.T. Maravelias, A stochastic programming approach for clinical trial planning in new drug development. Computers & Chemical Engineering, 2008. 32(11): p. 2626-2642.

5. Goel, V. and I.E. Grossmann, A class of stochastic programs with decision dependent uncertainty. Mathematical Programming, 2006. 108(2/3): p. 355-394.

6. Colvin, M. and C.T. Maravelias, Scheduling of testing tasks and resource planning in new product development using stochastic programming. Computers & Chemical Engineering, 2009. 33(5): p. 964-976.

7. Solak, S., et al., Optimization of R&D project portfolios under endogenous uncertainty. European Journal of Operational Research, 2010. 207(1): p. 420-433.

8. Colvin, M. and C.T. Maravelias, Modeling methods and a branch and cut algorithm for pharmaceutical clinical trial planning using stochastic programming. European Journal of Operational Research, 2010. 203(1): p. 205-215.

9. Christian, B. and S. Cremaschi, Variants to a knapsack decomposition heuristic for solving R&D pipeline management problems. Computers & Chemical Engineering, 2017. 96: p. 18-32.

10. Hart, W.E., et al., Pyomo – Optimization Modeling in Python. Vol. 67. 2012: Springer Science & Business Media.