(352c) A Data-Mining Framework for Uncertainty Analysis in Pipeline Erosion Modeling | AIChE



Dai, W. - Presenter, Auburn University
Cremaschi, S., Auburn University


Wei Dai, Selen Cremaschi

Department of Chemical Engineering, Auburn University, AL 36849

selen-cremaschi@auburn.edu

Abstract

In many industrial operations, solid particles move at high speeds in the fluid
system and cause severe wear. This type of wear is called erosion.
Erosion in pipelines is defined as the removal of material from the solid surface
due to solid particle impingement. This phenomenon, especially in multiphase
flow systems, is very complex and depends on many factors, including fluid and
solid characteristics, the pipeline material properties, and the geometry of the
flow lines. The safe and efficient design and operation of these pipelines require
reliable estimates of erosion rates.

Given the complexity, most of the modeling work
focuses on developing empirical or semi-mechanistic models to predict erosion
rates. For example, Oka et al. (2005) developed their erosion model using
particle impingement in air with empirical constants based on particle
properties and hardness of the target materials. Their model is one of the most
commonly cited in the literature. Another semi-mechanistic model called 1-D
SPPS (Zhang, 2007), which is widely used for predicting erosion rates by the
oil and gas industry, was developed with several empirically estimated
parameters, like the sharpness factor of particles, Brinell hardness and the
empirical constants in the impact angle function. These empirical parameters
are calculated using experimental observations. However, the experimental data
used in these calculations and also for model validation and uncertainty
quantification are, for the most part, collected in small pipe diameters (from
2 to 4 inches). These small pipe sizes do not match field
conditions, where the pipe diameters generally exceed 8 inches. Hence, the
predictions of erosion models are routinely extrapolated to conditions where
neither experimental data nor operating experience is available, and
estimating the erosion-rate prediction uncertainty becomes crucial, especially
for systems that are too costly to fail. The quantification of this uncertainty
is especially important during the design phase for subsea applications, as the
erosion rate allowance, which is set using the erosion rate predictions and their
uncertainty, directly impacts the integrity of the facility.

The uncertainty in model predictions can stem from
three sources in general: (1) uncertainty in experimental measurements of input
conditions, (2) model form uncertainty (i.e., incomplete representation of the
actual system due to lack of knowledge or imprecise experimental observations),
and (3) model parameter uncertainty. The experimental data uncertainty usually
consists of measurement errors due to both instrumental and human errors. The
reliability of the models largely depends on the ability of the model form to
capture the erosion process in sufficient detail. The uncertainty in the
model parameters results from an inability to accurately quantify the
parameters of a model (Shrestha, 2009).

In this talk, a systematic framework is introduced to
quantify erosion-rate prediction uncertainty for operating conditions where
experimental data are not available, and for a set of newly collected
experimental data points. The framework incorporates the impacts of model form
and parameter uncertainties to estimate prediction uncertainties, and combines
data clustering and Gaussian Process Modeling approaches with Monte Carlo
simulation.

To estimate erosion-rate prediction uncertainty, we
compiled an experimental database of erosion rate measurements from the literature.
The database contains 586 data points in single or multiphase carrier flows.
Eighty percent of the data in the database were collected for gas-dominated
flows (i.e., gas only, annular, mist and churn flow). The experimental database
covers a wide range of input conditions, resulting in significantly different
erosion rate measurements. The dataset encompasses data collected from six
different flow regimes, with a wide range of material properties and production

Data clustering is used to capture the similar
characteristics of operating conditions and to identify internal data
structures present within the database. Among the data clustering approaches
available in the literature, k-prototype (Cheung, 2013) is selected as the most
appropriate for our dataset because it contains categorical variables. It calculates
the similarity based on both categorical attributes and numerical attributes,
and assigns the given data points to several clusters such that the
similarities between objects in the same group are high while the similarities
between objects in different groups are low.
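
The idea of clustering on mixed numerical and categorical attributes can be
sketched in a few lines of Python. The sketch below is not the method of
Cheung and Jia (2013) cited above; as a simpler stand-in, it uses the classic
k-prototypes dissimilarity (squared Euclidean distance on numerical attributes
plus a weight `gamma` times the number of categorical mismatches). The weight
`gamma`, the toy attribute values, and the initial prototypes are all
hypothetical and chosen only for illustration.

```python
from collections import Counter

def mixed_dissimilarity(num_a, cat_a, num_b, cat_b, gamma=1.0):
    """Squared Euclidean distance on numeric parts plus gamma * categorical mismatches."""
    numeric = sum((a - b) ** 2 for a, b in zip(num_a, num_b))
    mismatches = sum(1 for a, b in zip(cat_a, cat_b) if a != b)
    return numeric + gamma * mismatches

def assign(points, prototypes, gamma):
    """Return the index of the nearest prototype for every point."""
    return [
        min(range(len(prototypes)),
            key=lambda j: mixed_dissimilarity(p[0], p[1], *prototypes[j], gamma=gamma))
        for p in points
    ]

def update_prototypes(points, labels, prototypes):
    """New prototype per cluster: numeric mean and categorical mode of its members."""
    updated = []
    for j, old in enumerate(prototypes):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:  # keep the old prototype if a cluster becomes empty
            updated.append(old)
            continue
        means = tuple(sum(col) / len(members) for col in zip(*(m[0] for m in members)))
        modes = tuple(Counter(col).most_common(1)[0][0] for col in zip(*(m[1] for m in members)))
        updated.append((means, modes))
    return updated

def k_prototypes(points, init_prototypes, gamma=1.0, max_iter=20):
    """Alternate assignment and prototype update until the labels stop changing."""
    prototypes = list(init_prototypes)
    labels = assign(points, prototypes, gamma)
    for _ in range(max_iter):
        prototypes = update_prototypes(points, labels, prototypes)
        new_labels = assign(points, prototypes, gamma)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, prototypes

# Hypothetical points: (scaled numeric attributes, categorical attributes such as flow regime)
points = [
    ((0.10, 0.20), ("annular",)),
    ((0.15, 0.25), ("annular",)),
    ((0.20, 0.10), ("mist",)),
    ((0.90, 0.80), ("slug",)),
    ((0.85, 0.90), ("slug",)),
    ((0.95, 0.70), ("churn",)),
]
labels, prototypes = k_prototypes(points, [points[0], points[3]], gamma=0.5)
print(labels)
```

Numerical attributes are assumed to be scaled to comparable ranges beforehand;
otherwise a single attribute with large magnitude would dominate the distance.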

Gaussian Process Modeling (GPM, Rasmussen, 2006) is used
to estimate the prediction uncertainty stemming from both model form and model
parameter uncertainties. The GPM models the erosion-rate model discrepancy, which
is defined as the difference between experimental erosion rates and the corresponding
erosion rate predictions, as a Gaussian random process. This process is
represented by mean and covariance functions, assuming a multivariate normal
distribution. The most likely values of the mean and covariance function parameters
are determined by Maximum Likelihood Estimation (MLE) using experimental data.
Once the GPM is trained on the available data set, a set of hyper-parameters
is obtained and can be used for future model interpolation or extrapolation
analysis (Jiang, 2013). A GPM is built for each cluster identified by the data
clustering step.
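
In standard Gaussian process notation (following Rasmussen and Williams,
2006), the quantities described above can be written as follows. The symbols
are introduced here for illustration and are not taken from the talk:
ER^exp and ER^model are the measured and predicted erosion rates, m and k are
the mean and covariance functions, K_theta is the n-by-n covariance matrix of
the n training inputs X (including any noise term), and k_* is the vector of
covariances between a new input x_* and the training inputs.

```latex
% Model discrepancy at input x (experimental minus predicted erosion rate)
\delta(\mathbf{x}) = ER^{\mathrm{exp}}(\mathbf{x}) - ER^{\mathrm{model}}(\mathbf{x}),
\qquad
\delta(\mathbf{x}) \sim \mathcal{GP}\big(m(\mathbf{x}),\, k(\mathbf{x}, \mathbf{x}')\big)

% Log marginal likelihood maximized over the hyper-parameters \theta (MLE)
\log p(\boldsymbol{\delta} \mid X, \theta)
  = -\tfrac{1}{2} (\boldsymbol{\delta} - \mathbf{m})^{\top} K_{\theta}^{-1} (\boldsymbol{\delta} - \mathbf{m})
    - \tfrac{1}{2} \log \lvert K_{\theta} \rvert
    - \tfrac{n}{2} \log 2\pi

% Predictive mean and variance of the discrepancy at a new input x_*
\mu_* = m(\mathbf{x}_*) + \mathbf{k}_*^{\top} K_{\theta}^{-1} (\boldsymbol{\delta} - \mathbf{m}),
\qquad
\sigma_*^2 = k(\mathbf{x}_*, \mathbf{x}_*) - \mathbf{k}_*^{\top} K_{\theta}^{-1} \mathbf{k}_*
```

Under this formulation, the predictive variance grows as x_* moves away from
the training inputs, which is what makes the approach informative for
extrapolated operating conditions.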

Finally, Monte Carlo
simulation is used to study the influence of data uncertainty due to limited
repetition in the experiments. We previously developed a novel approach to
estimate experimental data uncertainties in the absence of repeated
experiments, and obtained kernel density estimates of experimental uncertainty for
four different measurement approaches (Dai, 2016). The impact of experimental data
uncertainty on GPM predictions is assessed in a Monte Carlo framework where the
training of the GPM is repeated 1000 times with randomized initializations drawn from the
kernel density estimates. After 1000 replications, the distribution of model prediction uncertainty
in each cluster is obtained. A box plot is used to show the spread of model
prediction uncertainties.
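
The replication loop described above can be sketched as follows. This is a
minimal illustration under stated assumptions, not the authors' procedure: the
kernel density is assumed to be a Gaussian KDE over relative measurement
errors, the perturbation is assumed to be multiplicative, and
`fit_and_predict` is a hypothetical placeholder that stands in for retraining
the GPM (it simply returns the mean discrepancy). All numerical values are
invented for illustration.

```python
import random
import statistics

def sample_kde(samples, bandwidth, rng):
    """Draw one value from a Gaussian KDE: pick a kernel centre, then add Gaussian noise."""
    return rng.gauss(rng.choice(samples), bandwidth)

def fit_and_predict(measured, predicted):
    """Placeholder for GPM training (assumption): returns the mean model discrepancy."""
    return statistics.fmean([m - p for m, p in zip(measured, predicted)])

def monte_carlo(measured, predicted, rel_errors, bandwidth, n_rep=1000, seed=0):
    """Repeat the fit n_rep times on measurements perturbed by KDE-sampled relative errors."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_rep):
        perturbed = [m * (1.0 + sample_kde(rel_errors, bandwidth, rng)) for m in measured]
        results.append(fit_and_predict(perturbed, predicted))
    return results

# Hypothetical erosion rates and relative-error samples (illustration only)
measured = [2.0e-4, 3.5e-4, 1.2e-4, 5.0e-4]
predicted = [1.8e-4, 4.0e-4, 1.0e-4, 4.2e-4]
rel_errors = [-0.15, -0.05, 0.0, 0.04, 0.12]

spread = monte_carlo(measured, predicted, rel_errors, bandwidth=0.03)
# Quartiles are the statistics a box plot would display
q1, median, q3 = statistics.quantiles(spread, n=4)
print(f"Q1={q1:.3e}, median={median:.3e}, Q3={q3:.3e}")
```

Fixing the seed makes the replications reproducible, which helps when
comparing the resulting spreads across clusters.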

The application of the developed framework is
demonstrated on one of the well-known erosion models, 1-Dimensional Sand
Production Pipe Saver (1-D SPPS, ECRC), which is used extensively in the oil and
gas industry for erosion predictions. The data clustering approach divided
the database into seven clusters. In previous studies, we clustered the
data based on flow regimes and built GPMs for these clusters (Dai, 2015). A
comparison of data clustering based on the k-prototype approach and on flow
regimes is given in support of the data clustering approach. The
mean square error (MSE) and area metric (AM) (Ferson, 2008) of the GPM
predictions, obtained using four-fold cross-validation for both approaches,
are compared. The smallest MSE and
AM are 6.79×10^-9 and 4.39×10^-5 based on the k-prototype
approach, and 1.48×10^-8 and 6.44×10^-5
based on flow regimes. The results favor data clustering based on the
k-prototype approach, which yields the smaller MSE and AM.
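
The two comparison metrics can be illustrated with a short Python sketch. The
area metric is shown here in its simplest form, the area between the empirical
CDFs of the observations and of the predictions, which may differ from the
exact variant used in the study. The straight-line fit `fit_line` is a
hypothetical stand-in for the per-cluster GPM, and the data are invented.

```python
from bisect import bisect_right

def mse(observed, predicted):
    """Mean square error between paired observations and predictions."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def area_metric(observed, predicted):
    """Area between the empirical CDFs of the two samples."""
    a, b = sorted(observed), sorted(predicted)
    grid = sorted(set(a) | set(b))
    area = 0.0
    for left, right in zip(grid, grid[1:]):
        # Both step CDFs are constant on [left, right)
        f_a = bisect_right(a, left) / len(a)
        f_b = bisect_right(b, left) / len(b)
        area += abs(f_a - f_b) * (right - left)
    return area

def fit_line(xs, ys):
    """Least-squares straight line; a stand-in for training a GPM (assumption)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return lambda x: slope * x + (my - slope * mx)

def cross_validate(xs, ys, fit, k=4):
    """k-fold cross-validation with interleaved folds; returns out-of-fold predictions."""
    n = len(xs)
    preds = [None] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(fold, n, k):
            preds[i] = model(xs[i])
    return preds

# Hypothetical data (illustration only)
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 16.9]

preds = cross_validate(xs, ys, fit_line, k=4)
cv_mse = mse(ys, preds)
cv_am = area_metric(ys, preds)
print(f"MSE={cv_mse:.4f}, AM={cv_am:.4f}")
```

MSE penalizes pointwise errors, whereas the area metric compares the two
distributions as a whole, so reporting both gives complementary views of
prediction quality.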


This work is supported by the Chevron Energy Technology Company. Discussions and
comments from Haijing Gao, Gene Kouba and Janakiram Hariprasad of Chevron
and Brenton McLaury and Siamack Shirazi of E/CRC at the University of Tulsa were highly
appreciated.

Cheung, Y.M. and Jia, H., 2013, Categorical-and-numerical-attribute data clustering
based on a unified similarity metric without knowing cluster number. Pattern
Recognition, 46, 2228-2238.

Dai, W. and Cremaschi,
S., 2015. Quantifying Model Uncertainty in Scarce Data Regions – A Case Study
of Particle Erosion in Pipelines. The 12th International Symposium on Process
Systems Engineering and 25th European Symposium on Computer Aided Process
Engineering, Copenhagen, Denmark.

Dai, W., Cremaschi, S.,
Islam, M.A., Nukala, R.T., Subramani, H.J., Kouba, G.E. and Gao, H.J., 2016. Uncertainty
analysis of multiphase flow – Case studies from erosion, sand transport, liquid
entrainment models. 10th North American Multiphase Conference, Banff, Canada.

Ferson, S., Oberkampf,
W. L., and Ginzburg, L., 2008, Model Validation and Predictive Capability for
the Thermal Challenge Problem, Computer Methods in Applied Mechanics and
Engineering, Vol. 197, No. 29-32, pp 2408-2430.

Jiang, Z.,
Chen, W., Fu, Y., and Yang, R., 2013, Reliability-Based Design Optimization
with Model Bias and Data Uncertainty, SAE International.

Oka, Y.I.,
Okamura, K., and Yoshida, T., 2005, Practical estimation of erosion damage
caused by solid particle impact: Part 1: Effects of impact parameters on a
predictive equation, Wear, 259(1-6), pp. 95-101.

C. D. (2013). Correlation of porosity uncertainty to productive reservoir
volume. Society of Petroleum Engineers.

C. E., & Yeung, H. (2001). Uncertainty estimation and Monte Carlo
simulation method. Flow Measurement and Instrumentation, 12(4), 291–298.

Rasmussen, C.E. and Williams, C.K.I., 2006, Gaussian Processes for Machine Learning, The
MIT Press.

Shrestha, D.L., 2009, Uncertainty Analysis in Rainfall-Runoff Modelling: Application of
Machine Learning Techniques, PhD Dissertation, UNESCO-IHE, the Netherlands.

Zhang, Y.,
Reuterfors, E.P., McLaury, B.S., Shirazi, S.A., and Rybicki, E.F., 2007,
Comparison of Computed and Measured Particle Velocities and Erosion in Water
and Air Flows, Wear, 263.