(255d) Determination of the Optimal Number of Eigenfunctions in Proper Orthogonal Decomposition Based on Machine Learning

Authors: 
Sidhu, H. S., Texas A&M University
Narasingam, A., Texas A&M University
Siddhamshetty, P., Texas A&M Energy Institute, Texas A&M University

Harwinder Singh Sidhu[1],[2], Abhinav Narasingam[1],[2],
Prashanth Siddhamshetty[1],[2], Joseph Sang-Il Kwon[1],[2]

[1] Artie McFerrin Department of
Chemical Engineering, Texas A&M University, College Station, TX 77845 USA

[2] Texas A&M Energy Institute,
Texas A&M University, College Station, TX 77845 USA

Spatiotemporal data, whether captured through remote sensors or generated by
large-scale simulations, has always been ‘big’. Recent advances in
next-generation smart manufacturing, however, have dramatically intensified
data generation through networked, information-based technologies throughout
the chemical industry and other manufacturing enterprises, making
spatiotemporal data even bigger. We therefore need an algorithm that can
extract the underlying process dynamics from large datasets to support
decisions for safe and efficient operation.

Proper Orthogonal Decomposition (POD) [1] is a widely used model-reduction
technique that extracts dominant spatial patterns from spatiotemporal data
obtained via experiments or large-scale simulations of distributed parameter
systems (DPSs). In POD, the essential information of a large-scale complex
system consisting of m variables is captured by ‘d’ retained eigenfunctions.
The technique rests on a simple observation: very often the underlying signal
of high-dimensional data belongs to a low-dimensional space spanned by ‘d’
eigenfunctions, so that d is much smaller than m. Nevertheless, selecting the
optimal number of eigenfunctions remains a key challenge. Although several
guidelines are available in the literature [2] for selecting these
eigenfunctions, they are not necessarily optimal. For example, one may retain
the eigenfunctions whose eigenvalues are larger than a prescribed value. Such
a selection may discard certain low-energy states that can have a large
influence on the accuracy of the resulting low-dimensional system [3].
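
As a rough illustration of the eigenvalue-threshold guideline mentioned above, the sketch below computes a POD basis from a snapshot matrix with NumPy's SVD and counts the eigenvalues that exceed a prescribed value. The synthetic data, the function names pod_basis and count_by_eigenvalue_threshold, and the threshold of 1.0 are assumptions made for illustration; none of them come from the study.

```python
import numpy as np


def pod_basis(snapshots):
    """Return POD modes (as columns) and eigenvalues for an (m, n) snapshot matrix.

    Each column of `snapshots` is one snapshot of the m variables.
    The eigenvalues of snapshots @ snapshots.T are the squared singular values.
    """
    modes, singular_values, _ = np.linalg.svd(snapshots, full_matrices=False)
    return modes, singular_values**2


def count_by_eigenvalue_threshold(eigvals, threshold):
    """Number of eigenfunctions whose eigenvalue exceeds `threshold`."""
    return int(np.sum(np.asarray(eigvals) > threshold))


# Synthetic snapshots (illustrative only): three spatial patterns plus small noise.
rng = np.random.default_rng(0)
m, n = 50, 200
x = np.linspace(0.0, 1.0, m)
patterns = np.stack([np.sin((k + 1) * np.pi * x) for k in range(3)], axis=1)
amplitudes = rng.standard_normal((3, n)) * np.array([[10.0], [3.0], [1.0]])
X = patterns @ amplitudes + 1e-3 * rng.standard_normal((m, n))

modes, eigvals = pod_basis(X)
d = count_by_eigenvalue_threshold(eigvals, threshold=1.0)
print("retained eigenfunctions:", d)
```

The result depends entirely on the chosen threshold, which is the weakness the abstract points out: a mode with a small eigenvalue is dropped regardless of how much it matters for the reduced model.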

Motivated by penalized regression methods for simultaneous variable selection
and coefficient estimation, such as the elastic net, we have formulated the
following regularization problem to determine the optimal number of
eigenfunctions, ‘d’, as the minimizer of an energy functional consisting of
two terms:

                                                              
                                               (1)

The first term is the square of the distance from a snapshot x_j to the
d-dimensional subspace V spanned by the orthogonal eigenfunctions. This term
measures how well the subspace captures the information embedded in the
original spatiotemporal data. The function β(d), satisfying β(0) = 0, acts as
a penalty that measures the complexity of the reduced model. The
regularization parameter α ≥ 0 quantifies the relative trade-off between the
complexity and the error of the reduced model. Techniques such as
cross-validation can be employed to select an appropriate α. For a given α, a
larger d reduces the first term but increases the model complexity, and vice
versa. The overall energy functional is therefore a convex function, and its
minimum gives the optimal number of eigenfunctions.
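
The trade-off described above can be sketched numerically. The example below is a minimal illustration under two assumptions that the abstract does not state: the squared distance is summed over all snapshots (in which case it equals the sum of the discarded POD eigenvalues), and the penalty is linear, β(d) = d. The function name select_num_modes and the eigenvalues are hypothetical.

```python
import numpy as np


def select_num_modes(eigvals, alpha, beta=lambda d: d):
    """Pick d in {0, ..., r} minimizing residual(d) + alpha * beta(d).

    residual(d) is the sum of the eigenvalues discarded when d modes are kept
    (assumption: projection error summed over all snapshots).
    beta defaults to the linear penalty beta(d) = d (assumption), so beta(0) = 0.
    """
    eigvals = np.asarray(eigvals, dtype=float)
    kept_energy = np.concatenate(([0.0], np.cumsum(eigvals)))
    residual = eigvals.sum() - kept_energy
    d_values = np.arange(len(eigvals) + 1)
    penalty = alpha * np.array([beta(d) for d in d_values], dtype=float)
    objective = residual + penalty
    return int(np.argmin(objective)), objective


# Hypothetical eigenvalues, sorted in decreasing order.
eigvals = [100.0, 20.0, 5.0, 0.5, 0.1]
d_opt, objective = select_num_modes(eigvals, alpha=1.0)
print("selected d:", d_opt)
```

With the linear penalty assumed here, the rule reduces to keeping the eigenvalues larger than α; a different choice of β(d) would change which d is selected. Setting α = 0 keeps every mode, and a very large α keeps none.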

We implement the proposed methodology
for order reduction of a hydraulic fracturing process described by multiple
nonlinear parabolic PDEs with time-dependent spatial domains [4]. First, a
representative ensemble of solutions is constructed by solving a high-order
discretization of the PDEs. Then, the proposed method is applied to derive an
optimal set of empirical eigenfunctions, which are subsequently used as basis
functions within a model reduction framework, such as Galerkin’s method, to derive
low-order ODE systems that accurately describe the dominant dynamics of the
hydraulic fracturing process. These ODE systems are then used to compute
approximate solutions (i.e., spatiotemporal profiles) to the original system.
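
The workflow in this paragraph (snapshot ensemble, empirical eigenfunctions, Galerkin projection, low-order ODE system, reconstruction) can be sketched on a much simpler problem. The example below uses a linear 1D heat equation on a fixed domain as a stand-in; it is not the hydraulic fracturing model, which is nonlinear and has a moving boundary. The grid size, time step, initial condition, and the hand-picked d = 2 are all assumptions for illustration.

```python
import numpy as np

# Stand-in full-order model (assumption): u_t = u_xx on (0, 1), u = 0 at both ends,
# discretized with second-order finite differences on m interior points.
m, n_steps, dt = 100, 200, 1e-4
x = np.linspace(0.0, 1.0, m + 2)[1:-1]
h = x[1] - x[0]
A = (np.diag(-2.0 * np.ones(m))
     + np.diag(np.ones(m - 1), 1)
     + np.diag(np.ones(m - 1), -1)) / h**2


def simulate(A, u0, dt, n_steps):
    """Integrate du/dt = A u with implicit Euler; states are returned as columns."""
    M = np.eye(A.shape[0]) - dt * A
    states = [u0]
    for _ in range(n_steps):
        states.append(np.linalg.solve(M, states[-1]))
    return np.stack(states, axis=1)


# Step 1: representative ensemble of solutions (snapshots).
u0 = np.sin(np.pi * x) + 0.5 * np.sin(3.0 * np.pi * x)
X = simulate(A, u0, dt, n_steps)

# Step 2: empirical eigenfunctions from the SVD; d is fixed by hand here.
U, _, _ = np.linalg.svd(X, full_matrices=False)
d = 2
Phi = U[:, :d]

# Step 3: Galerkin projection gives a d-dimensional ODE system da/dt = A_r a.
A_r = Phi.T @ A @ Phi
a0 = Phi.T @ u0

# Step 4: solve the low-order system and reconstruct the spatiotemporal profile.
a = simulate(A_r, a0, dt, n_steps)
X_rom = Phi @ a
rel_error = np.linalg.norm(X - X_rom) / np.linalg.norm(X)
print("relative reconstruction error:", rel_error)
```

Because the initial condition here combines only two eigenvectors of the discrete operator, two modes reproduce the full solution almost exactly; a nonlinear system would generally need more modes and a projected nonlinear term.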

References:

[1] Berkooz, G., Holmes, P., Lumley, J. The proper orthogonal decomposition
in the analysis of turbulent flows. Annu. Rev. Fluid Mech. 1993;25:539-575.

[2] Jolliffe, I.T. Principal Component Analysis. New York: Springer; 2002.

[3] Noack, B.R., Schlegel, M., Ahlborn, B., Mutschke, B., Morzynski, M.,
Comte, P., Tadmor, G. A finite-time thermodynamics formalism for unsteady
flows. J. Non-Equilib. Thermodyn. 2008;33:103-148.

[4] Yang, S., Siddhamshetty, P., Kwon, J.S. Optimal pumping schedule design
to achieve a uniform proppant concentration level in hydraulic fracturing.
Comput. Chem. Eng. 2017;101:138-147.