(7iu) High-Performance Computing Approaches to Large-Scale Stochastic Programming and Data Analysis

Cao, Y., University of Wisconsin-Madison


Research Interests: High-performance computing, stochastic
programming, machine learning, energy systems, robust/stochastic MPC

This poster highlights my recent work on algorithms
and software implementations to solve nonlinear stochastic programming problems
of unprecedented complexity to local and global optimality. SchurIpopt and
PIPS-NLP are powerful local solvers for continuous problems, which exploit the
arrowhead block representation of the linear algebra system using a
distributed-memory Schur decomposition strategy. However, the scalability of
the Schur decomposition approach is hindered by the number of first-stage
variables. IPCluster is a solver that overcomes this fundamental bottleneck by performing
adaptive clustering of scenarios to create effective preconditioners for the
linear algebra system. SNGO is a global optimization solver that uses a
specialized branch and bound strategy to enable parallelism and convergence for
both continuous and mixed-integer problems. We also present PLASMO, a
Julia-based open-source modeling framework that uses a graph-based approach to
facilitate the expression of complex stochastic programming problems and
scenario trees. We have applied these tools in the three application areas described below.
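
As a rough illustration of the Schur decomposition idea mentioned above, the following sketch solves a small arrowhead system by eliminating each scenario block independently and then solving a dense system in the first-stage variables. It is a minimal NumPy example under stated assumptions, not the SchurIpopt or PIPS-NLP implementation; the function name `solve_arrowhead`, the block names, and the use of dense solves are hypothetical choices made for illustration.

```python
import numpy as np

def solve_arrowhead(K_blocks, B_blocks, K0, r_blocks, r0):
    """Solve the arrowhead system

        [K_1            B_1] [x_1]   [r_1]
        [      ...      ...] [...] = [...]
        [           K_S B_S] [x_S]   [r_S]
        [B_1' ...  B_S' K_0] [x_0]   [r_0]

    by Schur complement elimination of the scenario blocks.
    All names here are illustrative assumptions.
    """
    S = np.array(K0, dtype=float)     # Schur complement, starts as K_0
    rhs = np.array(r0, dtype=float)   # reduced right-hand side
    # Each iteration touches one scenario only, so in a distributed-memory
    # setting these contributions could be computed on separate processes.
    for K, B, r in zip(K_blocks, B_blocks, r_blocks):
        S -= B.T @ np.linalg.solve(K, B)
        rhs -= B.T @ np.linalg.solve(K, r)
    # Dense solve whose size equals the number of first-stage variables.
    x0 = np.linalg.solve(S, rhs)
    # Back-substitution, again independent per scenario.
    xs = [np.linalg.solve(K, r - B @ x0)
          for K, B, r in zip(K_blocks, B_blocks, r_blocks)]
    return xs, x0
```

The matrix `S` has one row and one column per first-stage variable, which is why this dense step becomes the bottleneck as the first stage grows.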

Robust/Stochastic Control

Robust/stochastic control is an approach
to controller design that explicitly addresses uncertainty arising from
uncertain parameters or disturbances. A robust/stochastic nonlinear model
predictive control (NMPC) strategy optimizes the worst-case/expected-value
performance at each sampling instant. The resulting problem is often too large
to be solved in real time. In a collaboration with the experimental
research group of Prof. Zoltan Nagy at Purdue, we have found that unseeded
batch crystallization processes lead to optimization formulations with over
half a million variables and constraints. The serial solver Ipopt takes 7 min
to solve the problem; by using the parallel implementation in SchurIpopt we are
able to reduce the time to 30 seconds, allowing for real-time applications.
We have also found that a robust NMPC strategy guarantees satisfaction of all
state and input constraints for a set of uncertainty realizations, and also
provides better robust performance. In particular, the standard deviation of
crystal size and the worst-case mismatch from the desired crystal size under
robust NMPC are 30% smaller than under nominal NMPC and 50% better than under
open-loop optimal control.
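
To make the distinction between worst-case and expected-value objectives concrete, the sketch below evaluates both over a small set of uncertainty scenarios for a toy scalar system and picks the best constant input from a coarse grid. Everything in it (the linear model, the uncertain `gain`, the cost weights, and the grid search) is a hypothetical stand-in chosen for brevity; an actual NMPC problem would optimize a full input trajectory of a nonlinear model with a nonlinear programming solver.

```python
def scenario_cost(gain, u_seq, x0, target):
    """Simulate x_{k+1} = gain * x_k + u_k for one scenario (toy model)
    and accumulate a tracking cost with a small input penalty."""
    x, cost = x0, 0.0
    for u in u_seq:
        x = gain * x + u
        cost += (x - target) ** 2 + 0.01 * u ** 2
    return cost

def expected_cost(gains, u_seq, x0, target):
    """Stochastic objective: average cost over equally likely scenarios."""
    costs = [scenario_cost(g, u_seq, x0, target) for g in gains]
    return sum(costs) / len(costs)

def worst_case_cost(gains, u_seq, x0, target):
    """Robust objective: largest cost over all scenarios."""
    return max(scenario_cost(g, u_seq, x0, target) for g in gains)

def best_constant_input(objective, gains, candidates, horizon, x0, target):
    """Pick the constant input from `candidates` that minimizes `objective`."""
    return min(candidates,
               key=lambda u: objective(gains, [u] * horizon, x0, target))

if __name__ == "__main__":
    gains = [0.8, 1.0, 1.2]                        # assumed scenario set
    grid = [0.05 * i for i in range(-40, 41)]      # candidate inputs
    u_exp = best_constant_input(expected_cost, gains, grid, 3, 1.0, 2.0)
    u_rob = best_constant_input(worst_case_cost, gains, grid, 3, 1.0, 2.0)
    print("expected-value input:", u_exp, "worst-case input:", u_rob)
```

Because every scenario is simulated for every candidate input, the problem size grows with the number of scenarios, which is the source of the large formulations discussed above.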

It is also possible to use
robust/stochastic NMPC formulations to optimize the parameters of control
laws. In a collaboration with General Electric, we have recently proposed a stochastic
optimization formulation to identify optimal parameters for pitch and torque controllers
in wind turbines based on historical wind speed data. This approach seeks
to extract maximum power and to satisfy extreme load requirements by design. Compared
with nominal controller parameters, the optimal settings increase the revenue
of a wind turbine by 18%, reaching revenue improvements of hundreds of
thousands of USD per year. The optimization problem involves up to 7.5 million
variables and constraints and can be solved in less than 1.3 hours using
PIPS-NLP. The serial solver Ipopt takes over one day to solve the

Optimization of Energy Systems

Many energy systems suffer from significant
uncertainty arising from climate conditions and electricity demands. Moreover,
conflicting objectives must often be taken into account. We have recently
developed a multi-objective stochastic optimization framework to solve these
types of problems. We have applied this framework to the design of a
combined heat and power plant that must take cost, emissions, and water usage
into consideration. The design problem has more than 1 million variables and
can be solved with PIPS-NLP in less than 6 minutes.

We have also applied our techniques to
stochastic market clearing models in large power grids. A clearing problem for
the Illinois power grid system has more than 64,000 first-stage variables and
more than 1.2 million total variables. The challenge in this problem is the
large dimensionality of the first stage. While Ipopt and PIPS-NLP both take
more than 8 hours to solve the problem, a preconditioning strategy implemented
in IPCluster takes only 11 minutes to solve the problem.
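
The general idea of building a preconditioner from clustered scenarios can be sketched as follows: the first-stage Schur complement is approximated using one representative scenario per cluster, weighted by cluster size, and this approximation is used to precondition an iterative solve. This is a minimal sketch under loud assumptions and is not the IPCluster algorithm: the cluster labels are given and fixed rather than adaptive, the representative is simply the first member of each cluster, and a basic preconditioned Richardson iteration stands in for whatever iterative method a real solver would use. All function names are hypothetical.

```python
import numpy as np

def schur_complement(K0, K_blocks, B_blocks, weights=None):
    """Return K0 - sum_i w_i * B_i' K_i^{-1} B_i (weights default to 1)."""
    if weights is None:
        weights = [1.0] * len(K_blocks)
    S = np.array(K0, dtype=float)
    for w, K, B in zip(weights, K_blocks, B_blocks):
        S -= w * (B.T @ np.linalg.solve(K, B))
    return S

def cluster_preconditioner(K0, K_blocks, B_blocks, labels):
    """Approximate the Schur complement with one representative scenario
    per cluster (assumed: the first member), weighted by cluster size."""
    labels = list(labels)
    reps, weights = [], []
    for c in sorted(set(labels)):
        members = [i for i, lab in enumerate(labels) if lab == c]
        reps.append(members[0])
        weights.append(float(len(members)))
    return schur_complement(K0,
                            [K_blocks[i] for i in reps],
                            [B_blocks[i] for i in reps],
                            weights)

def preconditioned_richardson(S, rhs, M, tol=1e-10, max_iter=100):
    """Iterate x <- x + M^{-1} (rhs - S x) until the residual is small.
    Returns the iterate and the number of iterations performed."""
    x = np.zeros_like(rhs, dtype=float)
    for it in range(max_iter):
        residual = rhs - S @ x
        if np.linalg.norm(residual) <= tol * max(1.0, np.linalg.norm(rhs)):
            return x, it
        x = x + np.linalg.solve(M, residual)
    return x, max_iter
```

The sketch forms `S` explicitly only to stay short; an iterative method needs only products with `S`, so the preconditioner `M` is the only matrix that has to be factorized, and it is assembled from far fewer scenarios when the clusters are tight.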

Experimental Data Analysis

High-performance computing and
optimization algorithms can also be used to break existing limitations in
estimation and learning problems. In a collaboration with the experimental
group of Prof. Ophelia Venturelli at UW-Madison, we have demonstrated
that we can estimate parameters in highly nonlinear microbial growth models to
global optimality. Our solver SNGO solved this problem in less than 9 minutes
to an optimality gap of 1%, while the state-of-the-art global solver SCIP cannot
solve the problem (it takes one day to close the optimality gap to 60%). The
advantage of using a global solver is that we can provide a certificate to
experimental researchers that no better set of parameters can be found.
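
The notions of optimality gap and certificate can be illustrated with a generic branch-and-bound sketch on a one-dimensional problem. This is not SNGO's specialized strategy: it assumes a known Lipschitz constant to obtain lower bounds on each interval, and the objective `example_objective` is a hypothetical multimodal function rather than a microbial growth model.

```python
import heapq
import math

def example_objective(x):
    """Hypothetical multimodal objective; |f'(x)| <= 3.5 everywhere."""
    return math.sin(3.0 * x) + 0.5 * x + 3.0

def branch_and_bound(f, lo, hi, lipschitz, rel_gap=0.01, max_nodes=100000):
    """Minimize f on [lo, hi] until the relative gap between the best value
    found (upper bound) and a valid global lower bound is below rel_gap.
    Returns (best_x, upper_bound, lower_bound)."""
    mid = 0.5 * (lo + hi)
    best_x, upper = mid, f(mid)
    # On an interval of width w, f >= f(midpoint) - lipschitz * w / 2.
    heap = [(upper - lipschitz * 0.5 * (hi - lo), lo, hi)]
    nodes = 0
    while heap and nodes < max_nodes:
        lower = heap[0][0]  # smallest bound among open intervals
        if upper - lower <= rel_gap * max(abs(upper), 1e-12):
            break
        _, a, b = heapq.heappop(heap)
        nodes += 1
        m = 0.5 * (a + b)
        for c, d in ((a, m), (m, b)):
            centre = 0.5 * (c + d)
            value = f(centre)
            if value < upper:
                best_x, upper = centre, value
            bound = value - lipschitz * 0.5 * (d - c)
            if bound < upper:  # otherwise the interval cannot improve on upper
                heapq.heappush(heap, (bound, c, d))
    lower = min(heap[0][0], upper) if heap else upper
    return best_x, upper, lower

if __name__ == "__main__":
    x, ub, lb = branch_and_bound(example_objective, -3.0, 3.0, lipschitz=3.5)
    print("best x:", x, "upper bound:", ub, "lower bound:", lb)
```

The returned lower bound is what serves as a certificate: no point in the search interval can have an objective value below it, so the reported solution is within the stated gap of the global optimum.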

Machine learning problems arising from
support vector machines (SVMs) or neural networks can also be solved using the
proposed algorithms. In a collaboration with the group of Prof. Nicholas Abbott at
UW-Madison, we have applied SVMs to classify the response of liquid crystals (LCs) to analytes such as DMMP (similar
in structure to sarin nerve gas) and water. We have used our algorithms to classify
more than 70,000 images from LC experiments and have achieved accuracies
of 99%.
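
For readers unfamiliar with the SVM training problem as an optimization problem, the sketch below fits a linear SVM by minimizing the regularized hinge loss. It uses plain full-batch subgradient descent on toy two-dimensional points, which is a swapped-in technique chosen for brevity and not the solver used in this work; feature extraction from LC images is not shown, and the function names and hyperparameters are hypothetical.

```python
def train_linear_svm(points, labels, reg=0.01, step=0.1, epochs=200):
    """Minimize reg/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b)))
    by full-batch subgradient descent. Labels must be +1 or -1."""
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [reg * wj for wj in w]   # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:              # point violates the margin
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - step * gj for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b

def predict(w, b, x):
    """Classify a point by the sign of the decision function."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1

if __name__ == "__main__":
    # Toy, linearly separable data standing in for image features.
    positives = [(2.0, 2.0), (1.5, 2.5), (2.5, 1.0), (1.0, 2.0)]
    negatives = [(-x1, -x2) for x1, x2 in positives]
    data = positives + negatives
    targets = [1] * len(positives) + [-1] * len(negatives)
    w, b = train_linear_svm(data, targets)
    correct = sum(predict(w, b, x) == y for x, y in zip(data, targets))
    print("training accuracy:", correct / len(data))
```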

Teaching Interests:

I gained extensive teaching experience
during my PhD as a teaching assistant in modeling, high-performance computing,
optimization, and control. I am qualified to teach undergraduate chemical
engineering core courses such as design, control, and modeling. I am also
interested in developing new courses related to applications of
high-performance computing, optimization, and machine learning in chemical