# (584a) System Decomposition for Distributed Multivariate Statistical Process Monitoring

- Conference: AIChE Annual Meeting
- Year: 2018
- Proceeding: 2018 AIChE Annual Meeting
- Group: Computing and Systems Technology Division
- Session:
- Time: Wednesday, October 31, 2018 - 3:30pm-3:45pm

The system decomposition (i.e., the partitioning of a system's measured variables into subsystems) has a significant impact on the performance of a distributed multivariate statistical process monitoring (MSPM) method, referred to here as a DMSPM method. Ideally, a DMSPM method is implemented with the decomposition that optimizes its performance in monitoring a given set of faults, so that the faults are detected faster and more accurately. The optimal decomposition depends on the set of faults that tend to affect the system. In an optimal decomposition, the measured variables within a subsystem tend to be affected by similar faults. Most measured variables in the subsystem then contribute significantly to the subsystem test statistic when a fault occurs, which makes the statistical hypothesis tests of the DMSPM method more sensitive to the faults. The optimal decomposition for a DMSPM method also depends on other factors, such as:

- The number of subsystems that the system is partitioned into.
- The MSPM method that is implemented in a distributed configuration.
- The consensus strategy between the subsystems.

Therefore, finding the optimal decomposition for a DMSPM method is a difficult task. An effective strategy is simulation optimization, in which the performance of the DMSPM method is simulated for a set of candidate decompositions and the decomposition with the best performance is taken as optimal.
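The simulation optimization idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the subsystem test statistic (a sum of squared standardized values, standing in for a real MSPM statistic), the empirical control limits with a Bonferroni-style split of the significance level `alpha`, the synthetic data, and the hypothetical names `missed_detection_rate` and `best_decomposition`.

```python
import random
from statistics import mean, pstdev


def missed_detection_rate(decomp, normal, faulty, alpha=0.01):
    """Fraction of faulty samples that no subsystem test flags.

    decomp: list of subsystems, each a list of variable indices.
    normal, faulty: lists of samples (one list of floats per sample).
    """
    n_vars = len(normal[0])
    # Scaling parameters are estimated from normal operation data only.
    mu = [mean(row[j] for row in normal) for j in range(n_vars)]
    sd = [pstdev([row[j] for row in normal]) for j in range(n_vars)]

    def stat(row, block):
        # Stand-in subsystem statistic: sum of squared z-scores.
        return sum(((row[j] - mu[j]) / sd[j]) ** 2 for j in block)

    # Split alpha across subsystems so decompositions with different
    # numbers of subsystems have comparable false alarm rates (assumption).
    a = alpha / len(decomp)
    limits = []
    for block in decomp:
        ref = sorted(stat(row, block) for row in normal)
        limits.append(ref[min(len(ref) - 1, int((1 - a) * len(ref)))])

    # A faulty sample is missed if every subsystem stays within its limit.
    missed = sum(
        all(stat(row, b) <= lim for b, lim in zip(decomp, limits))
        for row in faulty
    )
    return missed / len(faulty)


def best_decomposition(candidates, normal, faulty):
    """Simulate every candidate and return the one with the lowest MDR."""
    scores = [missed_detection_rate(d, normal, faulty) for d in candidates]
    best = min(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores


# Synthetic example: 4 variables, two hypothetical faults.
rng = random.Random(0)
normal = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(500)]
# Fault A shifts variables 0 and 1; fault B shifts variables 2 and 3.
faulty = [[rng.gauss(0, 1) + (2.0 if j < 2 else 0.0) for j in range(4)]
          for _ in range(200)]
faulty += [[rng.gauss(0, 1) + (2.0 if j >= 2 else 0.0) for j in range(4)]
           for _ in range(200)]

candidates = [
    [[0, 1], [2, 3]],        # groups variables hit by the same fault
    [[0, 2], [1, 3]],        # mixes variables hit by different faults
    [[0, 1, 2, 3]],          # centralized monitoring
    [[0], [1], [2], [3]],    # one variable per subsystem
]
best, scores = best_decomposition(candidates, normal, faulty)
print("MDR per candidate:", scores)
print("selected decomposition:", best)
```

Enumerating candidates by hand only works for very small systems, because the number of possible partitions grows extremely quickly with the number of measured variables. This is why a guided search over candidate decompositions is needed.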

In this work, we present a novel simulation optimization method, called the Performance Driven Agglomerative Clustering (PDAC) method, which finds a near-optimal system decomposition for a DMSPM method. The PDAC method uses the greedy search of Ward's agglomerative clustering algorithm to generate a set of candidate decompositions (the decision variables). Normal operation data and faulty data, supplied by the user, are then used to simulate the performance of the DMSPM method for each candidate decomposition and to calculate its missed detection rate (MDR), which is the objective function. The MDRs of candidate decompositions with the same number of subsystems are compared. The agglomerative clustering procedure thus generates a near-optimal decomposition for every possible number of subsystems, which can range from one to the number of measured variables in the system. A fine-tuning procedure then slightly modifies some of the decompositions output by the agglomerative clustering procedure to further reduce their MDRs. Finally, the monitoring performance of the decompositions output by the agglomerative clustering and fine-tuning procedures is compared to find the optimal number of subsystems and hence the optimal system decomposition for the DMSPM method.

The PDAC method is a completely data-driven system decomposition method and can be automated. It can also, in principle, be applied to most DMSPM methods, since it only requires simulation of the DMSPM method using process data. To illustrate its effectiveness, PDAC is used to find the decomposition of the benchmark Tennessee Eastman Process case study for which the monitoring performance of a distributed Principal Component Analysis based monitoring scheme is optimal.
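The search structure described above (greedy agglomerative merging, fine-tuning, then selection of the number of subsystems) can be sketched as follows. This is a hedged illustration under stated assumptions, not the PDAC implementation: the abstract does not specify the fine-tuning moves, so single-variable moves between subsystems are assumed here. The evaluator built by the hypothetical `make_mdr` is also a simple stand-in (sum of squared standardized values per subsystem with empirical control limits) rather than distributed PCA, and the data are synthetic.

```python
import random
from itertools import combinations, permutations
from statistics import mean, pstdev


def make_mdr(normal, faulty, alpha=0.01):
    """Build an MDR evaluator from data (stand-in monitoring statistic)."""
    cols = list(zip(*normal))
    mu, sd = [mean(c) for c in cols], [pstdev(c) for c in cols]

    def scale(rows):
        return [[(x - m) / s for x, m, s in zip(r, mu, sd)] for r in rows]

    zn, zf = scale(normal), scale(faulty)

    def mdr(decomp):
        a = alpha / len(decomp)  # assumed Bonferroni-style split of alpha
        missed = [True] * len(zf)
        for block in decomp:
            ref = sorted(sum(r[j] ** 2 for j in block) for r in zn)
            limit = ref[min(len(ref) - 1, int((1 - a) * len(ref)))]
            missed = [m and sum(r[j] ** 2 for j in block) <= limit
                      for m, r in zip(missed, zf)]
        return sum(missed) / len(zf)

    return mdr


def greedy_agglomerate(n_vars, evaluate):
    """Merge the pair of subsystems giving the lowest MDR at every level.

    Returns {k: (decomposition, mdr)} for every number of subsystems k.
    """
    current = [(j,) for j in range(n_vars)]
    path = {n_vars: (current, evaluate(current))}
    while len(current) > 1:
        best = None
        for a, b in combinations(range(len(current)), 2):
            merged = [blk for i, blk in enumerate(current) if i not in (a, b)]
            merged.append(tuple(sorted(current[a] + current[b])))
            score = evaluate(merged)
            if best is None or score < best[1]:
                best = (merged, score)
        current = best[0]
        path[len(current)] = best
    return path


def neighbors(decomp):
    """Decompositions reachable by moving one variable (assumed move set)."""
    for src, dst in permutations(range(len(decomp)), 2):
        if len(decomp[src]) == 1:
            continue  # keep the number of subsystems unchanged
        for var in decomp[src]:
            trial = list(decomp)
            trial[src] = tuple(v for v in decomp[src] if v != var)
            trial[dst] = tuple(sorted(decomp[dst] + (var,)))
            yield trial


def fine_tune(decomp, evaluate):
    """Apply the best single-variable move until the MDR stops improving."""
    best, best_score = list(decomp), evaluate(decomp)
    while True:
        scored = [(evaluate(t), t) for t in neighbors(best)]
        if not scored:
            return best, best_score
        score, trial = min(scored, key=lambda st: st[0])
        if score >= best_score:
            return best, best_score
        best, best_score = trial, score


# Synthetic data: 6 variables and two hypothetical faults.
rng = random.Random(1)
n_vars = 6
normal = [[rng.gauss(0, 1) for _ in range(n_vars)] for _ in range(300)]
# Even-indexed samples shift variables 0-2, odd-indexed shift variables 3-5.
faulty = [[rng.gauss(0, 1) + (1.5 if (j < 3) == (i % 2 == 0) else 0.0)
           for j in range(n_vars)] for i in range(300)]

mdr = make_mdr(normal, faulty)
path = greedy_agglomerate(n_vars, mdr)
tuned = {k: fine_tune(d, mdr) for k, (d, _) in path.items()}
best_k = min(tuned, key=lambda k: tuned[k][1])
print("selected number of subsystems:", best_k)
print("decomposition and MDR:", tuned[best_k])
```

Because the search only calls `evaluate` on candidate decompositions, any simulation of a DMSPM method that returns an MDR could be substituted for the stand-in evaluator without changing the search code.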