Machine learning clustering algorithms based on Data Envelopment Analysis in the presence of uncertainty

Document Type: Research Paper


Department of Mathematics, Science and Research Branch, Islamic Azad University, Tehran, Iran


This study combines Data Envelopment Analysis (DEA) with machine learning clustering methods from data mining to identify the most efficient Decision Making Unit (DMU) and the best-performing clustering algorithm, respectively. Assessing units with DEA may not be straightforward when the data are uncertain, and several scholars have developed methods that incorporate uncertainty in the input/output values into DEA. In many real-world applications, moreover, the data are reported as intervals, meaning that each input/output value is selected from a symmetric box. In the DEA literature, this type of uncertainty is addressed by Interval DEA approaches. The main goal of this study is to evaluate the efficiency of banks under data uncertainty using the cross-efficiency method from the DEA literature. For this purpose, we consider the BCC-CCR and CCR-BCC models in the presence of uncertain data to determine which model is superior. After the optimization models are solved, a clustering method is applied in the machine learning step. Clustering is a procedure for grouping similar items together; each such group is called a cluster. Different clustering algorithms can be used depending on the behavior of the data. In this study, we apply the farthest first and expectation maximization algorithms and show that, in the case of data uncertainty, the BCC-CCR model and the farthest first algorithm are the superior optimization model and machine learning algorithm, respectively.
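To illustrate the clustering step, the following is a minimal sketch of farthest first traversal, assuming each DMU is represented by a (lower, upper) pair of interval efficiency scores. The abstract does not specify the implementation used in the study; the function name `farthest_first`, the `start` parameter, and the sample scores are hypothetical and serve only as an illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def farthest_first(points, k, start=0):
    """Greedy farthest first traversal: return (center indices, labels)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centers = [start]
    # Distance from every point to its nearest chosen center so far.
    nearest = [dist(p, points[start]) for p in points]
    while len(centers) < k:
        # Next center: the point farthest from all current centers.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        centers.append(nxt)
        nearest = [min(d, dist(p, points[nxt])) for d, p in zip(nearest, points)]
    # Assign each point to its closest center.
    labels = [
        min(range(k), key=lambda c: dist(p, points[centers[c]]))
        for p in points
    ]
    return centers, labels


# Hypothetical (lower, upper) interval efficiency scores, one pair per DMU.
scores = [(0.95, 1.00), (0.90, 0.98), (0.40, 0.55), (0.45, 0.60), (0.70, 0.80)]
centers, labels = farthest_first(scores, k=2)
```

The first center is fixed here so that the result is reproducible; implementations commonly choose it at random, which can change the resulting clusters.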