Optimal Round and Sample-Size Complexity for Partitioning in Parallel Sorting

04/10/2022
by   Wentao Yang, et al.

State-of-the-art parallel sorting algorithms for distributed-memory architectures are based on computing a balanced partitioning via sampling and histogramming. By finding samples that partition the sorted keys into evenly-sized chunks, these algorithms minimize the number of communication rounds required. Histogramming (computing the positions of samples) guides sampling, enabling a decrease in the overall number of samples collected. We derive lower and upper bounds on the number of sampling/histogramming rounds required to compute a balanced partitioning. We improve on prior results to demonstrate that when using p processors/parts, O(log^* p) rounds with O(p/log^* p) samples per round suffice. We match that with a lower bound showing that any algorithm with O(p) samples per round requires at least Ω(log^* p) rounds. Additionally, we prove an Ω(p log p) lower bound on the number of samples for one round, showing the optimality of sample sort in this case. To derive the lower bound, we propose a hard randomized input distribution and apply classical results from the distribution theory of runs.
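To illustrate the sampling-and-histogramming idea described above, here is a minimal single-round sketch in the spirit of sample sort. It is not the paper's algorithm: the function name choose_splitters, the oversample parameter, and the use of one in-memory array per "processor" are illustrative assumptions; in a real distributed setting the sample gathering and rank summation would be collective communication steps.

```python
import numpy as np

def choose_splitters(local_keys, p, oversample):
    """One sampling/histogramming round (illustrative sketch).

    local_keys: list of p sorted 1-D arrays, one per simulated processor
    oversample: samples contributed per processor (Theta(log p) in sample sort)
    Returns p-1 candidate splitters and their global ranks (the histogram).
    """
    rng = np.random.default_rng(0)
    # Sampling: each processor contributes `oversample` random local keys.
    sample = np.concatenate(
        [rng.choice(keys, size=oversample) for keys in local_keys]
    )
    sample.sort()
    # Candidate splitters: evenly spaced keys within the pooled sample.
    splitters = sample[oversample // 2 :: oversample][: p - 1]
    # Histogramming: global rank of each splitter, obtained by summing the
    # local ranks (binary searches) over all processors.
    hist = sum(np.searchsorted(keys, splitters) for keys in local_keys)
    return splitters, hist

# Example with 4 simulated processors holding 1000 keys each.
p = 4
parts = [np.sort(np.random.default_rng(i).integers(0, 10**6, 1000)) for i in range(p)]
splitters, hist = choose_splitters(parts, p, oversample=32)
n = sum(len(keys) for keys in parts)
bucket_sizes = np.diff(np.concatenate(([0], hist, [n])))
print("global ranks of splitters:", hist)
print("load imbalance factor:", round(float(bucket_sizes.max() / (n / p)), 3))
```

In a multi-round scheme, the histogram would be used to keep only the splitters whose global ranks are close enough to the ideal cut points and to resample more finely around the remaining gaps, which is what allows the sample size per round to shrink relative to one-shot sample sort.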
