Constant Approximation for k-Median and k-Means with Outliers via Iterative Rounding
In this paper, we present a novel iterative rounding framework for many clustering problems. This leads to an (α_1 + ϵ≈ 7.08 + ϵ)-approximation algorithm for k-median with outliers, greatly improving upon the large implicit constant approximation ratio of Chen [Chen, SODA 08]. For k-means with outliers, we give an (α_2+ϵ≈ 53 + ϵ)-approximation, which is the first O(1)-approximation for this problem. The iterative algorithm framework is very versatile; we show how it can be used to give α_1- and (α_1 + ϵ)-approximation algorithms for matroid median and knapsack median problems respectively, improving upon the previous best approximations ratios of 8 [Swamy, ACM Trans. Algorithms] and 17.46 [Byrka et al, ESA 2015]. The natural LP relaxation for the k-median/k-means with outliers problem has an unbounded integrality gap. In spite of this negative result, our iterative rounding framework shows that we can round an LP solution to an almost-integral solution of small cost, in which we have at most two fractionally open facilities. Thus, the LP integrality gap arises due to the gap between almost-integral and fully-integral solutions. Then, using a pre-processing procedure, we show how to convert an almost-integral solution to a fully-integral solution losing only a constant-factor in the approximation ratio. By further using a sparsification technique, the additive factor loss incurred by the conversion can be reduced to any ϵ > 0.
READ FULL TEXT