(620c) Dimensionality Reduction in Sustainability Assessment: A Combined Use of Mixed-Integer Programming and Data Envelopment Analysis

Galán Martín, Á. - Presenter, Imperial College of Science, Technology and Medicine
Limleamthong, P., Imperial College London
Guillén-Gosálbez, G., Imperial College of Science, Technology and Medicine
The sustainability assessment of industrial systems often requires the consideration of a wide range of indicators. Among the tools available for this task, Data Envelopment Analysis (DEA) has recently emerged as an effective technique for measuring the comparative efficiency of a set of homogeneous units against multiple indicators simultaneously (i.e., inputs and outputs in DEA notation), whilst providing clear guidelines on how to improve inefficient alternatives. However, including too many indicators increases the dimensionality of the problem, which makes the results difficult to visualize. It can also weaken the discriminatory power of DEA and ultimately produce results that are less meaningful and harder to interpret. Hence, there is a clear need to retain only the most essential indicators and to eliminate redundant ones, that is, those that can be excluded from the analysis without changing the results.

In this paper, we propose a systematic MIP-DEA model, combining mixed-integer programming (MIP) with DEA, to enhance the application of DEA in sustainability assessments where many indicators need to be considered. Our method formulates the task of identifying metrics that can be omitted with minimum information loss as a bi-level programming model. The outer problem seeks to minimize the difference between the efficiency scores calculated with all inputs and outputs and those obtained with a reduced subset of them. The inner problem provides the efficiency scores that would be obtained for any combination of inputs and outputs proposed by the outer problem. Binary variables model the selection of inputs and outputs in the outer (master) problem, while continuous variables represent the DEA weights in the inner problem.
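To make the outer/inner structure concrete, the following is a minimal sketch and not the bi-level MIP itself. It assumes an input-oriented, constant-returns-to-scale (CCR) DEA model in multiplier form for the inner problem, measures information loss as the sum of absolute differences in efficiency scores, selects inputs only, and replaces the binary selection variables with plain enumeration of subsets. The function names `ccr_efficiency` and `best_input_subset` and the toy data are hypothetical.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog


def ccr_efficiency(X, Y):
    """Efficiency score of every unit (assumed CCR multiplier model).

    X has shape (n_units, n_inputs), Y has shape (n_units, n_outputs).
    For unit o: maximize u.y_o subject to v.x_o = 1 and
    u.y_j - v.x_j <= 0 for every unit j, with u, v >= 0.
    """
    n_units, n_inputs = X.shape
    n_outputs = Y.shape[1]
    # Decision vector is [u (output weights), v (input weights)].
    A_ub = np.hstack([Y, -X])
    b_ub = np.zeros(n_units)
    scores = np.empty(n_units)
    for o in range(n_units):
        c = np.concatenate([-Y[o], np.zeros(n_inputs)])  # linprog minimizes
        A_eq = np.concatenate([np.zeros(n_outputs), X[o]])[None, :]
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      method="highs")  # default bounds keep weights >= 0
        scores[o] = -res.fun
    return scores


def best_input_subset(X, Y, k):
    """Enumerate all subsets of k inputs and keep the one whose scores
    deviate least from the scores obtained with all inputs."""
    full = ccr_efficiency(X, Y)
    best_cols, best_loss = None, np.inf
    for cols in combinations(range(X.shape[1]), k):
        reduced = ccr_efficiency(X[:, list(cols)], Y)
        loss = np.abs(full - reduced).sum()
        if loss < best_loss:
            best_cols, best_loss = cols, loss
    return best_cols, best_loss


# Toy data (hypothetical): 4 units, 3 inputs, 1 output.
# The third input is twice the first, so one of the two is redundant.
X = np.array([[2.0, 5.0, 4.0],
              [4.0, 2.0, 8.0],
              [3.0, 3.0, 6.0],
              [6.0, 6.0, 12.0]])
Y = np.ones((4, 1))
print(best_input_subset(X, Y, k=2))
```

Enumeration is only practical for a handful of indicators, since the number of subsets grows combinatorially; this is the kind of search that binary selection variables in a mixed-integer model are meant to handle systematically.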

We explored the capabilities of our approach by assessing several electricity generation technologies according to multiple criteria, some of which are based on life cycle metrics that are modelled as inputs to be minimized. The results show that our approach can reduce the number of indicators from 10 to 5 without information loss, which evidences significant redundancies among the sustainability indicators. The same approach can be used to reduce the number of efficient units, thereby improving the discriminatory power of DEA. However, the latter use would likely modify the efficiency scores, so a compromise must be struck between the number of indicators included in the analysis and the quality of the final results.

Furthermore, to understand why our approach retains specific inputs and how the inputs correlate with each other, we calculated the Spearman correlation matrix of the inputs and applied a k-means clustering algorithm to identify groups of variables that behave similarly. The optimal combination of inputs selected by the MIP-DEA model is consistent with the clustering results. This suggests that the proposed approach reduces the error by selecting a proxy for each set of correlated variables, and it opens up new research directions on combining these two tools for dimensionality reduction. Overall, our approach has the potential to greatly simplify sustainability studies from the viewpoints of visualization and interpretation of results.
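The correlation-and-clustering check can be sketched as follows. The abstract does not state how k-means was applied, so this example assumes that each input is represented by its row of the Spearman correlation matrix and that these rows are clustered. The small k-means routine (Lloyd's algorithm with a deterministic farthest-point start) and the function name `cluster_indicators` are hypothetical, as is the toy data.

```python
import numpy as np
from scipy.stats import spearmanr


def kmeans_labels(points, k, n_iter=100):
    """Plain Lloyd's k-means with farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers],
                   axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :],
                              axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j)
            else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def cluster_indicators(X, k):
    """Group the columns of X (one column per indicator, more than two
    columns) by the similarity of their Spearman correlation profiles."""
    rho, _ = spearmanr(X)  # square matrix when X has more than 2 columns
    return rho, kmeans_labels(np.asarray(rho), k)


# Toy data (hypothetical): columns 0-1 and columns 2-3 are monotone
# transforms of each other, so they should form two groups.
a = np.arange(10.0)
b = np.array([3.0, 7.0, 1.0, 9.0, 0.0, 5.0, 8.0, 2.0, 6.0, 4.0])
X = np.column_stack([a, np.exp(a), b, b ** 3])
rho, labels = cluster_indicators(X, k=2)
print(labels)
```

Under these assumptions, picking one indicator per cluster gives a candidate reduced set that can be compared with the subset chosen by the optimization model.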