(189b) Big Data Process Modelling with Parallel Graphics
- Conference: AIChE Spring Meeting and Global Congress on Process Safety
- Year: 2017
- Proceeding: 2017 Spring Meeting and 13th Global Congress on Process Safety
- Group: 3rd Big Data Analytics
- Time: Wednesday, March 29, 2017 - 4:15pm-5:00pm
Process engineers are skilled at data reduction. When investigating a problem, a typical approach is to draw as small a boundary as possible around the suspected equipment, select a few variables known to be key and a few more that might be related to the problem at hand, limiting the analysis to 10 to 20 of the hundreds of variables that may be available, and then focus on just a few time periods believed to be important. In this way, only a very small amount of the applicable data is ever used and very little new understanding can be generated. The reason for this reduction has been that the complexity of multi-way interactions grows exponentially with the number of variables, so it can take prohibitively long to investigate and understand them and to discover new relations that aren't just the known and expected physical correlations.
The parallel coordinate graph removes this data analysis limitation. By displaying continuous data across hundreds of variables simultaneously, it lets engineering analysis and visualization proceed roughly linearly with the number of variables considered, dramatically increasing the amount of data available to the engineer and bringing the practical limit closer to the physical memory of the computer system used. Parallel graphs with queries can be used to link and compare operating envelopes from final product quality variables across hundreds of process operating values, allowing discovery and utilization of data not previously considered.
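The idea of querying one axis and comparing operating envelopes can be illustrated with a minimal Python sketch. It is not the tool described in this abstract: the records, the variable names (`feed_rate`, `reactor_temp`, `pressure`, `purity`) and the helpers `query` and `envelope` are all hypothetical, and the "envelope" is simply the minimum and maximum of each variable over the selected samples.

```python
# Hypothetical historical records: each dict is one time-stamped sample.
# Variable names and values are invented for illustration only.
records = [
    {"feed_rate": 100.0, "reactor_temp": 350.0, "pressure": 12.0, "purity": 99.2},
    {"feed_rate": 110.0, "reactor_temp": 355.0, "pressure": 12.5, "purity": 99.6},
    {"feed_rate": 120.0, "reactor_temp": 365.0, "pressure": 13.5, "purity": 98.1},
    {"feed_rate": 105.0, "reactor_temp": 352.0, "pressure": 12.2, "purity": 99.5},
    {"feed_rate": 130.0, "reactor_temp": 370.0, "pressure": 14.0, "purity": 97.8},
]


def query(rows, variable, low, high):
    """Keep the rows whose value on one axis lies inside [low, high]."""
    return [r for r in rows if low <= r[variable] <= high]


def envelope(rows):
    """Return {variable: (min, max)} over the given (non-empty) rows."""
    names = list(rows[0].keys())
    return {n: (min(r[n] for r in rows), max(r[n] for r in rows)) for n in names}


# Select on the product quality axis, then compare envelopes on every axis.
good = query(records, "purity", 99.4, 100.0)
full_env = envelope(records)
good_env = envelope(good)

for name in full_env:
    print(name, "all data:", full_env[name], "on-spec:", good_env[name])
```

Because each variable is processed once, the cost of this comparison grows linearly with the number of variables, which is the property the parallel coordinate approach relies on.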
Extending this approach to geometric modelling, over one hundred variables can be used in a single process operating model built from the same historical data. The use of the parallel coordinate plot allows operators to monitor the process and detect changes in the relationships between the variables in real time. Unlike traditional models, which are practically limited to 10 to 20 variables, this model takes account of far more historical data and variable relationships.
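As a rough illustration of monitoring against a model built from historical data, the sketch below learns two things from past samples: the range of each variable, and the range of the line slope between adjacent axes once every variable is scaled to [0, 1]. This is a deliberately crude stand-in and an assumption of this example, not the geometric model the abstract refers to. The variable names (`flow`, `head`, `speed`) and the functions `fit_envelope` and `check` are hypothetical.

```python
def fit_envelope(history, names):
    """Learn per-variable ranges and adjacent-axis slope ranges from history."""
    lo = {n: min(r[n] for r in history) for n in names}
    hi = {n: max(r[n] for r in history) for n in names}

    def scale(row):
        # Map each variable to [0, 1] over its historical range.
        return {
            n: (row[n] - lo[n]) / (hi[n] - lo[n]) if hi[n] > lo[n] else 0.0
            for n in names
        }

    scaled = [scale(r) for r in history]
    slopes = {}
    for a, b in zip(names, names[1:]):
        # Slope of the plotted line between two neighbouring axes.
        diffs = [s[b] - s[a] for s in scaled]
        slopes[(a, b)] = (min(diffs), max(diffs))
    return {"names": names, "slopes": slopes, "scale": scale}


def check(model, sample, tol=1e-9):
    """Return the variables or axis pairs where a sample leaves the envelope."""
    alerts = []
    s = model["scale"](sample)
    for n in model["names"]:
        if not (-tol <= s[n] <= 1.0 + tol):
            alerts.append(n)
    for (a, b), (dmin, dmax) in model["slopes"].items():
        d = s[b] - s[a]
        if not (dmin - tol <= d <= dmax + tol):
            alerts.append(a + "->" + b)
    return alerts


# Invented history in which the three variables move together.
history = [
    {"flow": 100, "head": 50, "speed": 3000},
    {"flow": 120, "head": 60, "speed": 3300},
    {"flow": 140, "head": 70, "speed": 3600},
    {"flow": 110, "head": 56, "speed": 3150},
]
model = fit_envelope(history, ["flow", "head", "speed"])

normal = check(model, {"flow": 120, "head": 60, "speed": 3300})
broken = check(model, {"flow": 135, "head": 52, "speed": 3500})
print("normal sample alerts:", normal)
print("broken relationship alerts:", broken)
```

The second sample keeps every variable inside its own historical range, yet it is flagged because the relationship between neighbouring variables has changed. That is the kind of deviation a check on individual variable limits would miss.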
As an example, an ethylene refrigeration system will be considered. Here one of the operating issues is keeping the system far enough from compressor surge that the anti-surge system doesn't operate. By including many more variables, a much more sensitive event prediction model can be built that allows more optimal operation while still giving warning of required operator action.