Improved Coresets for Clustering with Capacity and Fairness Constraints
We study coresets for clustering with capacity and fairness constraints. Our main result is a near-linear time algorithm to construct Õ(k^2ε^-2z-2)-sized ε-coresets for capacitated (k,z)-clustering which improves a recent Õ(k^3ε^-3z-2) bound by [BCAJ+22, HJLW23]. As a corollary, we also save a factor of k ε^-z on the coreset size for fair (k,z)-clustering compared to them. We fundamentally improve the hierarchical uniform sampling framework of [BCAJ+22] by adaptively selecting sample size on each ring instance, proportional to its clustering cost to an optimal solution. Our analysis relies on a key geometric observation that reduces the number of total “effective centers" from [BCAJ+22]'s Õ(k^2ε^-z) to merely O(klogε^-1) by being able to “ignore” all center points that are too far or too close to the ring center.
READ FULL TEXT