Almost 3-Approximate Correlation Clustering in Constant Rounds
We study parallel algorithms for correlation clustering. Each pair among n objects is labeled as either "similar" or "dissimilar". The goal is to partition the objects into arbitrarily many clusters while minimizing the number of disagreements with the labels, that is, similar pairs split across clusters plus dissimilar pairs placed in the same cluster. Our main result is an algorithm that, for any ϵ > 0, obtains a (3+ϵ)-approximation in O(1/ϵ) rounds (of models such as massively parallel computation, local, and semi-streaming). This is a culminating point for the rich literature on parallel correlation clustering. On the one hand, the approximation (almost) matches a natural barrier of 3 for combinatorial algorithms. On the other hand, the algorithm's round complexity is essentially constant. To achieve this result, we introduce a simple O(1/ϵ)-round parallel algorithm. Our main technical contribution is an analysis of this algorithm, showing that it achieves a (3+ϵ)-approximation. Our analysis draws on new connections to sublinear-time algorithms. Specifically, it builds on the work of Yoshida, Yamamoto, and Ito [STOC'09] on bounding the "query complexity" of greedy maximal independent set. To our knowledge, this is the first application of this method in analyzing the approximation ratio of any algorithm.
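To make the objective concrete, the following is a minimal Python sketch of how disagreements can be counted, together with the classic sequential Pivot procedure of Ailon, Charikar, and Newman, which is known to give a 3-approximation in expectation. This is offered as an illustration only: the abstract does not specify the paper's O(1/ϵ)-round parallel algorithm, and this sketch is not that algorithm. The function names `disagreements` and `pivot`, and the representation of labels as a set of similar pairs (all other pairs being dissimilar), are assumptions made for the example.

```python
import random
from itertools import combinations


def disagreements(n, similar, cluster):
    """Count pairs whose label disagrees with the clustering.

    n       -- number of objects, identified as 0..n-1
    similar -- set of frozenset({u, v}) pairs labeled "similar";
               every other pair is treated as "dissimilar" (assumption)
    cluster -- list where cluster[v] is the cluster id of object v
    """
    cost = 0
    for u, v in combinations(range(n), 2):
        together = cluster[u] == cluster[v]
        is_similar = frozenset((u, v)) in similar
        # A disagreement is a similar pair that is split,
        # or a dissimilar pair that is placed together.
        if together != is_similar:
            cost += 1
    return cost


def pivot(n, similar, rng):
    """Sequential Pivot (illustrative baseline, not the paper's algorithm).

    Processes objects in a random order. Each still-unclustered object
    becomes a pivot and absorbs all unclustered objects similar to it.
    """
    order = list(range(n))
    rng.shuffle(order)
    cluster = [None] * n
    for p in order:
        if cluster[p] is not None:
            continue  # already absorbed by an earlier pivot
        cluster[p] = p
        for v in range(n):
            if cluster[v] is None and frozenset((p, v)) in similar:
                cluster[v] = p
    return cluster


# Hypothetical instance: objects 0-1 and 1-2 are similar, 0-2 is dissimilar.
path_labels = {frozenset((0, 1)), frozenset((1, 2))}
path_clusters = pivot(3, path_labels, random.Random(0))
path_cost = disagreements(3, path_labels, path_clusters)
```

On this three-object instance no clustering is free of disagreements, and every pivot order yields exactly one. The set of pivots chosen this way is a greedy maximal independent set in the graph of similar pairs under the random order, which is the object the cited work of Yoshida, Yamamoto, and Ito studies.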