(280e) Data-Driven Two-Stage Conic Optimization with Rare High-Impact Zero-One Uncertainties | AIChE

Authors 

Subramanyam, A. - Presenter, Argonne National Laboratory
El Tonbari, M., Georgia Institute of Technology
Kim, K., Argonne National Laboratory
We study two-stage conic optimization problems arising in applications that are affected by an extremely large, yet finite, number of rare, high-impact random events, in particular those consisting of high-dimensional binary outcomes. Such applications are ubiquitous in network optimization, where the rare events amount to random failures of underlying network components such as nodes or edges. For example, in electric power networks, random node and edge failures have been used to model losses of physical components such as substations, transmission lines, generators and transformers [1]. Similarly, in natural gas, wireless communication and transportation networks, they can model failures of compressors and gas pipelines, antennas, and road links, respectively.

The challenges in modeling and solving such uncertainty-affected optimization problems are fourfold. First, the number of possible failure states grows exponentially with the size of the input; for example, a network with n failure-prone nodes admits 2^n failure states. Second, network failures are extremely rare but critical, and historical records are often not rich enough to include observations for every possible failure state. Third, failures of individual network elements are unlikely to be independent of each other. Fourth, and finally, generating additional data (e.g., via Monte Carlo computer simulations) is often costly or impossible in such applications. For these reasons, the true distribution governing the random parameters is unknown and difficult to estimate from a small number of historical observations.
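To make this combinatorial growth concrete, the following minimal sketch (illustrative only; the function name and encoding are ours) enumerates the failure states of a small network as zero-one vectors, where 1 marks a failed component:

```python
from itertools import product

def failure_states(n):
    """Enumerate all 2**n zero-one failure states of a network
    with n failure-prone components (1 = failed, 0 = operational)."""
    return list(product((0, 1), repeat=n))

states = failure_states(4)
print(len(states))  # 2**4 = 16 states for just 4 components
```

Even a modest network with 30 failure-prone components already has over a billion possible failure states, far more than any historical record could cover.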

To address these challenges, we adopt a distributionally robust approach (e.g., see [2, 3]). Specifically, given a sparse empirical dataset of high-dimensional zero-one random parameters, we seek to minimize the worst-case expectation of the random cost (which is calculated as the optimal value of a convex conic program), over all distributions that lie in an ambiguity set centered at the empirical distribution. The size of the ambiguity set is governed by the Wasserstein metric, which itself is defined using any norm-induced metric on the zero-one hypercube in the high-dimensional space of parameters.
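In generic notation (the symbols below are ours, chosen for illustration rather than taken from the paper), the resulting distributionally robust problem can be written as

```latex
\min_{x \in X} \; \sup_{\mathbb{Q} \in \mathcal{B}_{\epsilon}(\hat{\mathbb{P}}_N)} \; \mathbb{E}_{\mathbb{Q}}\bigl[ Q(x, \xi) \bigr],
\qquad
\mathcal{B}_{\epsilon}(\hat{\mathbb{P}}_N) = \bigl\{ \mathbb{Q} : W(\mathbb{Q}, \hat{\mathbb{P}}_N) \le \epsilon \bigr\},
```

where x is the first-stage decision, Q(x, \xi) is the optimal value of the second-stage convex conic program under the realization \xi \in \{0,1\}^n, \hat{\mathbb{P}}_N is the empirical distribution of the N observed samples, W is the Wasserstein metric induced by a norm on the hypercube, and \epsilon \ge 0 is the radius of the ambiguity set.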

Our approach is well justified for several reasons. First, the high dimensionality and sparsity of the training data in the case of rare events prevent us from obtaining reliable estimates of moments, ruling out several popular moment-based ambiguity sets. Second, ambiguity sets defined using other metrics, such as the Kullback–Leibler divergence or the total variation distance, effectively contain only those distributions that assign non-zero probability to the empirically observed samples and the worst-case realization (e.g., complete network failure); this is highly undesirable for tackling rare events, since the majority of possible uncertain states are unlikely to have been observed empirically. In contrast, the Wasserstein ambiguity set of any non-zero radius contains distributions that assign non-zero probability to arbitrary realizations. Indeed, it is precisely this property that ensures the robustness of its solutions to the occurrence of rare events.
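As a small, self-contained illustration (the function and data below are our own, not the paper's), the type-1 Wasserstein distance between two distributions supported on the zero-one hypercube, with the Hamming distance as ground metric, can be computed by solving the classical optimal transport linear program:

```python
import numpy as np
from scipy.optimize import linprog

def wasserstein_hamming(supp_p, p, supp_q, q):
    """1-Wasserstein distance between distributions (supp_p, p) and
    (supp_q, q) on {0,1}^n, with the Hamming distance as ground metric,
    computed via the optimal transport linear program."""
    supp_p, supp_q = np.asarray(supp_p), np.asarray(supp_q)
    m, k = len(p), len(q)
    # Cost matrix: Hamming distance between every pair of support points.
    C = np.abs(supp_p[:, None, :] - supp_q[None, :, :]).sum(axis=2)
    # Marginal constraints: row sums of the transport plan equal p,
    # column sums equal q.
    A_eq = []
    for i in range(m):
        row = np.zeros((m, k)); row[i, :] = 1.0; A_eq.append(row.ravel())
    for j in range(k):
        col = np.zeros((m, k)); col[:, j] = 1.0; A_eq.append(col.ravel())
    b_eq = np.concatenate([p, q])
    res = linprog(C.ravel(), A_eq=np.array(A_eq), b_eq=b_eq, bounds=(0, None))
    return res.fun

# Empirical distribution on two observed states vs. a point mass
# on an unobserved state.
d = wasserstein_hamming([[0, 0, 0], [1, 0, 0]], [0.5, 0.5],
                        [[0, 0, 1]], [1.0])
print(d)  # → 1.5
```

In this toy example, half of the empirical mass travels Hamming distance 1 and the other half distance 2, giving a distance of 1.5; unlike divergence-based metrics, this value remains finite even though the supports are disjoint.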

We extend the state of the art in data-driven distributionally robust optimization to two-stage conic problems, with a particular focus on high-dimensional zero-one uncertainties [4]. By exploiting ideas from penalty methods and bilinear programming, we prove that solving these problems reduces to optimizing over the convex hulls of mixed-integer conic representable sets, and we provide sufficient conditions under which these convex hull formulations are tractable. However, we also show that they are generically NP-hard, even when there are no first-stage decisions and the second-stage loss function is the optimal value of a two-dimensional linear program. This is crucially different from the existing literature on distributionally robust optimization with finitely supported distributions (e.g., see [5]): almost all existing reformulations and algorithms scale with the size of the support set, which limits their applicability to zero-one uncertainties, where the support can grow exponentially large.

We circumvent this exponential growth by means of tractable conservative approximations inspired by convexification techniques in global optimization. Specifically, by using lift-and-project hierarchies to approximate the aforementioned convex hulls of mixed-integer conic representable sets, we derive tractable conservative approximations of distributionally robust two-stage conic problems, and we provide practical guidelines to compute them efficiently. Notably, our approximations become exact as the radius of the Wasserstein ambiguity set shrinks to zero, and they can also be extended to conditional value-at-risk and classical two-stage robust formulations.
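As a textbook instance of such convexification (shown only to convey the flavor of these relaxations, not the paper's exact hierarchy), a bilinear product z = xy with x \in [0,1] and binary y relaxed to y \in [0,1] is replaced by its McCormick envelope:

```latex
z \ge 0, \qquad z \ge x + y - 1, \qquad z \le x, \qquad z \le y,
```

which describes the convex hull of the set \{(x, y, z) : z = xy,\ x \in [0,1],\ y \in [0,1]\}, and is exact whenever y \in \{0,1\}. Lift-and-project hierarchies systematically strengthen relaxations of this kind by introducing additional lifted variables and valid inequalities.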

We demonstrate the practical viability of our method and its out-of-sample performance on challenging nonlinear optimal power flow problems affected by rare, high-impact transmission line outages. We observe that, for fixed sample size and decreasing Wasserstein radius, as well as for fixed Wasserstein radius and increasing sample size, the optimality gaps of our approximations uniformly decrease to zero. Moreover, our approximations are 3-10 times faster than an exact Benders scheme, with the relative difference in computation times being significantly higher for large sample sizes. For larger networks, the Benders scheme does not converge within 10 minutes for any reasonable sample size, whereas our approximations are solved within a few minutes. Finally, out-of-sample tests indicate that our method consistently outperforms the classical sample average approximation by about 20%, particularly when the sample size is small, and the relative benefits of our approach appear to increase with the rarity and economic impact of the random transmission line outages.

References:

[1] North American Electric Reliability Corporation (2017). Transmission System Planning Performance Requirements. TPL-001-4.

[2] Esfahani, P. M., & Kuhn, D. (2018). Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations. Mathematical Programming, 171(1-2), 115-166.

[3] Rahimian, H. & Mehrotra, S. (2019). Distributionally robust optimization: A review. arXiv preprint arXiv:1908.05659.

[4] Subramanyam, A., El Tonbari, M., & Kim, K. (2020). Data-driven two-stage conic optimization with rare high-impact zero-one uncertainties. arXiv preprint arXiv:2001.04934.

[5] Ben-Tal, A., Den Hertog, D., De Waegenaere, A., Melenberg, B., & Rennen, G. (2013). Robust solutions of optimization problems affected by uncertain probabilities. Management Science, 59(2), 341-357.