A Polynomial Algorithm for Balanced Clustering via Graph Partitioning
The objective of clustering is to discover natural groups in a dataset and to identify any geometrical structure it may contain, without assuming prior knowledge of the characteristics of the data. The problem can be seen as detecting the inherent separations between groups of a given point set in a metric space governed by a similarity function. The pairwise similarities between all data objects form the adjacency matrix of a weighted graph, which contains all the information needed for clustering; the task can therefore be formulated as a graph partitioning problem. In this context, we propose a new cluster quality measure based on the maximum spanning tree, which allows us to compute the optimal clustering under the min-max principle in polynomial time. Our algorithm can be applied when a load-balanced clustering is required.
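To make the graph view concrete, the following is a minimal sketch of a generic spanning-tree clustering, not the algorithm proposed in the paper. The abstract does not define the quality measure or the balancing step, so the sketch substitutes a well-known technique: build the maximum spanning tree of the similarity graph with Kruskal's algorithm, then remove the k - 1 weakest tree edges to obtain k clusters. The function names `maximum_spanning_tree` and `cluster_by_cutting`, the parameter `k`, and the example matrix are all illustrative assumptions.

```python
from itertools import combinations


def _find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def maximum_spanning_tree(similarity):
    """Return the edges (weight, i, j) of a maximum spanning tree.

    `similarity` is a symmetric n x n matrix (list of lists); the graph is
    assumed to be complete. Uses Kruskal's algorithm on edges sorted by
    decreasing similarity.
    """
    n = len(similarity)
    parent = list(range(n))
    edges = sorted(
        ((similarity[i][j], i, j) for i, j in combinations(range(n), 2)),
        reverse=True,
    )
    tree = []
    for w, i, j in edges:
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:  # the edge joins two components, so keep it
            parent[ri] = rj
            tree.append((w, i, j))
    return tree


def cluster_by_cutting(similarity, k):
    """Illustrative only: split into k clusters by dropping the k - 1
    weakest edges of the maximum spanning tree. This does NOT enforce
    balanced cluster sizes and is not the paper's quality measure."""
    n = len(similarity)
    tree = maximum_spanning_tree(similarity)
    kept = sorted(tree, reverse=True)[: n - k]  # the n - k strongest edges
    parent = list(range(n))
    for _, i, j in kept:
        parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for v in range(n):
        groups.setdefault(_find(parent, v), []).append(v)
    return sorted(groups.values())


# Hypothetical similarity matrix: points 0-2 are similar, as are points 3-4.
S = [
    [0.0, 0.9, 0.8, 0.1, 0.2],
    [0.9, 0.0, 0.7, 0.1, 0.1],
    [0.8, 0.7, 0.0, 0.3, 0.1],
    [0.1, 0.1, 0.3, 0.0, 0.95],
    [0.2, 0.1, 0.1, 0.95, 0.0],
]
clusters = cluster_by_cutting(S, k=2)
```

Cutting the weakest tree edges ignores cluster sizes entirely, so a load-balanced variant would need an additional criterion for choosing which edges to cut; the abstract does not say how the paper handles this.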