Streaming Euclidean MST to a Constant Factor
We study streaming algorithms for the fundamental geometric problem of computing the cost of the Euclidean Minimum Spanning Tree (MST) on an n-point set X ⊂ℝ^d. In the streaming model, the points in X can be added and removed arbitrarily, and the goal is to maintain an approximation in small space. In low dimensions, (1+ϵ) approximations are possible in sublinear space [Frahling, Indyk, Sohler, SoCG '05]. However, for high dimensional spaces the best known approximation for this problem was Õ(log n), due to [Chen, Jayaram, Levi, Waingarten, STOC '22], improving on the prior O(log^2 n) bound due to [Indyk, STOC '04] and [Andoni, Indyk, Krauthgamer, SODA '08]. In this paper, we break the logarithmic barrier, and give the first constant factor sublinear space approximation to Euclidean MST. For any ϵ≥ 1, our algorithm achieves an Õ(ϵ^-2) approximation in n^O(ϵ) space. We complement this by proving that any single pass algorithm which obtains a better than 1.10-approximation must use Ω(√(n)) space, demonstrating that (1+ϵ) approximations are not possible in high-dimensions, and that our algorithm is tight up to a constant. Nevertheless, we demonstrate that (1+ϵ) approximations are possible in sublinear space with O(1/ϵ) passes over the stream. More generally, for any α≥ 2, we give a α-pass streaming algorithm which achieves a (1+O(logα + 1/αϵ)) approximation in n^O(ϵ) d^O(1) space. Our streaming algorithms are linear sketches, and therefore extend to the massively-parallel computation model (MPC). Thus, our results imply the first (1+ϵ)-approximation to Euclidean MST in a constant number of rounds in the MPC model.
READ FULL TEXT