Massively Parallel Computation via Remote Memory Access
We introduce the Adaptive Massively Parallel Computation (AMPC) model, which is an extension of the Massively Parallel Computation (MPC) model. At a high level, the AMPC model strengthens the MPC model by storing all messages sent within a round in a distributed data store. In the following round, all machines are provided with random read access to the data store, subject to the same constraints on the total amount of communication as in the MPC model. Our model is inspired by the previous empirical studies of distributed graph algorithms using MapReduce and a distributed hash table service. This extension allows us to give new graph algorithms with much lower round complexities compared to the best known solutions in the MPC model. In particular, in the AMPC model we show how to solve maximal independent set in O(1) rounds and connectivity/minimum spanning tree in O(_m/n n) rounds both using O(n^δ) space per machine for constant δ < 1. In the same memory regime for MPC, the best known algorithms for these problems require polylog n rounds. Our results imply that the 2-Cycle conjecture, which is widely believed to hold in the MPC model, does not hold in the AMPC model.
READ FULL TEXT