# (183g) Stochastic Programming Methods for Problems Under Endogenous Observation of Uncertainty

#### AIChE Annual Meeting

#### 2009

#### 2009 Annual Meeting

#### Computing and Systems Technology Division

#### Advances in Optimization II

#### Tuesday, November 10, 2009 - 10:30am to 10:50am

Introduction

A multi-stage stochastic programming (MSSP) formulation requires two key elements: scenarios, s in S, representing all possible realizations of uncertainty and stages, t in T, often representing planning periods. Each scenario is formulated as a deterministic multi-stage mathematical programming problem with non-anticipativity constraints (NACs) added to ensure that the solver cannot use future information regarding uncertainty in its decision making. For any pair of scenarios (s,s'), there exists a stage at which they become distinguishable, i.e. there is a difference in uncertainty realizations. Until this stage (i.e. while scenarios are indistinguishable), optimization decisions in scenarios s and s' have to be identical. In problems where the decision-maker affects the timing of uncertainty observation (i.e. problems under endogenous observation of uncertainty), NACs have to enforced using a double inequality over all decisions and all scenario pairs. As a result, the number of NACs grow quadratically relative to the remainder of the formulation and make all but the smallest problems computationally intractable.

The goal of this work is the derivation of theoretical results that enable the formulation of substantially smaller yet tighter MSSP formulations, and the development of a novel branch-and-cut algorithm for the solution of formulations that cannot be generated using commercial solvers.

While applicable to multiple MSSP formulations with endogenous observation of uncertainty, the techniques in this paper were developed for the modeling of the pharmaceutical research and development (R&D) pipeline. In this problem, a pharmaceutical company has a number of drugs under development in its R&D pipeline. These drugs must undergo a number of clinical trials which can either succeed or fail; if they fail all development is stopped. All information regarding resources, durations, costs, revenues and probabilities is assumed to be given.

Background

Colvin and Maravelias (2009a) presented four properties leading to the reduction in the number of NACs:

1) It is sufficient to express NACs only for pairs of scenarios (s, s') that differ in the outcome of a single source of uncertainty.

2) If the uncertainty from a single source is revealed gradually by decisions that must be made sequentially, it is sufficient to express NACs only for pairs of scenarios for which the difference is associated with a single decision.

3) The logic condition that links scenarios (s,s') can be expressed in terms of existing binary variables if the timing scenarios are distinguished depends on a single optimization decision.

4) NACs associated with the variables that are prerequisites for the optimization decision that distinguishes scenario pair (s,s') can be expressed as equalities. NACs associated with variables for which the optimization decision distinguishing scenario pair (s,s') is a prerequisite need not be expressed.

By implementing these properties, the number of NACs for a formulation with 3 drugs and 64 scenarios was reduced from over 290,000 to less than 20,000. For a problem with 6 drugs and 4,096 scenarios, the number of NACs was reduced from 1.2x10^9 to less than 2.6x10^6. While significant, further improvements must be made before industrial sized problems containing up to ten drugs can be solved.

New Theoretical Results

The number of NACs after the implementation of the aforementioned properties represents more than 75% of all constraints in problems containing seven or eight drugs. To further reduce the formulation size we derive the following two results (Colvin and Maravelias, 2009b):

5) In stages where the optimization decision distinguishing two scenarios cannot be made, NACs can be expressed as equalities. Further, these equalities can be aggregated in a manner such that they must be expressed once for each scenario rather than once for each scenario pair.

6) If the probability of all scenarios is strictly positive and all decision variables are binary, then NACs expressed as equalities can be expressed once for each set of indistinguishable scenarios, rather than once for each scenario within the set.

In addition to reducing the number of NACs, properties 5 and 6 lead to tighter mixed-integer programming formulations. Furthermore, property 6 is especially useful as it can be applied to all stochastic programming formulations and not just those with endogenous observations of uncertainty.

Branch and Cut Algorithm

Even with all of the reductions above, NACs represent the majority of total constraints. However, most of inequality NACs are not active in any feasible solution. This observations led us to explore the development of a branch and cut (BaC) algorithm in which the initial formulation does not include any NACs represented by inequalities, though equalities (see Properties 4-6) are included. When inequality NACs are found to be violated, they are added into the formulation and kept in all descendant nodes. Note that unlike typical branch and cut algorithms, we remove ?essential?, not tightening constraints.

Removing essential constraints from the formulation allows for integer solutions that are infeasible to the full model to be found feasible. To handle this, a number of modifications to the standard BaC algorithm were made. First, heuristics were turned off to prevent better than optimal, but infeasible, solutions to be found and used as lower bound for pruning. Second, bound updating was modified because the lower bound cannot be updated immediately upon finding an integer solution (the solution must first be checked for NACs feasibility). Finally, if an integer solution is found to violate removed NACs, it must be resolved after the addition of violated NACs (possibly leading to a fractional solution), and subsequently partitioned using standard branching.

We also improved the performance of our algorithm in two ways: node selection rules and selection of nodes where NACs are checked. Note that in our algorithm, local search (which is the default node selection method) loses two key advantages. First, the advantage of having the basis of the previous node is diminished because hundreds of violated NACs are typically added, especially in early nodes. Second, with heuristics turned off, a lower bound is not available to quickly identify unproductive areas of the tree. In later nodes where fewer NACs are added and a lower bound is available, the advantages of local search are important. We found that using a best first search for a fixed number of nodes before using local search provided for the fastest solution times.

Related to the node selection rules is the decision to test for violated NACs at all nodes or at integer nodes only. Checking for infeasibilities at all nodes increases the computational cost per node substantially (because the solution is checked against a large number of NACs), but reduces the total cut additions because a cut added at an early fractional node is carried to all descendent nodes reducing the number of cuts that are added multiple times at different nodes. Testing for infeasibilities at every node outperformed testing only at integer solutions; however, when used in conjunction with our hybrid node selection rule, adding cuts in all early nodes, but only in integer nodes after a fixed depth seemed to perform better than either pure approach.

Finally, we present computational results for a series of examples containing 3-8 drugs. We show how the new theoretical results and the branch and cut algorithm allow us to generate and solve instances with thousands of scenarios, which were intractable using standard solvers.

References

Colvin, M., Maravelias, C. T. (2009a). Scheduling of testing tasks and resource planning in new product development using stochastic programming. Computers & Chemical Engineering, 33, 964-976.

Colvin, M.; Maravelias, C. T. (2009b). Modeling methods and a branch and cut algorithm for pharmaceutical clinical trial planning using stochastic programming. Submitted for publication, European Journal of Operational Research.