Optimal Load Allocation for Coded Distributed Computation in Heterogeneous Clusters
Coding has recently emerged as a useful technique for mitigating the effect of stragglers in distributed computing. However, coding in this context has mainly been explored under the assumption of homogeneous workers, although real-world computing clusters are often composed of heterogeneous workers with different computing capabilities. Uniform load allocation that ignores this heterogeneity can cause a significant loss in latency. In this paper, we propose the optimal load allocation for coded distributed computing with heterogeneous workers. Specifically, we focus on the scenario in which workers with the same computing capability can be regarded as a group for analysis. We rely on a lower bound on the expected latency and obtain the optimal load allocation by showing that our proposed allocation achieves the minimum of this lower bound for a sufficiently large number of workers. In numerical simulations under group heterogeneity, our load allocation reduces the expected latency by orders of magnitude compared with the existing load allocation scheme.
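To make the latency cost of uniform allocation concrete, the following is a minimal Monte Carlo sketch and not the allocation derived in the paper. It assumes a shifted-exponential runtime model in which a worker with rate `rate` needs `load * (1 + X) / rate` time units for `load` coded rows, with `X ~ Exp(1)`, and it assumes the master can decode as soon as the finished workers together hold at least `k` coded rows. The function names, the two-group cluster, and the proportional-to-rate heuristic are all illustrative assumptions.

```python
import random


def simulate_latency(loads, rates, k, trials=2000, seed=0):
    """Estimate the expected time until finished workers hold >= k coded rows."""
    if sum(loads) < k:
        raise ValueError("total coded rows must be at least k")
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Assumed model: a worker's finish time is load * (1 + Exp(1)) / rate.
        finish = sorted(
            (load * (1.0 + rng.expovariate(1.0)) / rate, load)
            for load, rate in zip(loads, rates)
        )
        received = 0
        for t, load in finish:
            received += load
            if received >= k:
                total += t  # latency of this trial
                break
    return total / trials


def uniform_loads(n_workers, total_rows):
    """Heterogeneity-unaware baseline: every worker gets the same load."""
    return [total_rows // n_workers] * n_workers


def proportional_loads(rates, total_rows):
    """Illustrative heuristic (an assumption): load proportional to worker rate."""
    s = sum(rates)
    return [total_rows * r // s for r in rates]


# Hypothetical cluster: one group of 10 slow workers, one group of 10 fast workers.
rates = [1] * 10 + [10] * 10
k, total_rows = 1000, 2200  # k rows needed, 2200 coded rows distributed

uniform = simulate_latency(uniform_loads(len(rates), total_rows), rates, k)
proportional = simulate_latency(proportional_loads(rates, total_rows), rates, k)
print(f"uniform: {uniform:.1f}, proportional: {proportional:.1f}")
```

Under these assumptions, the uniform scheme must wait for nearly every fast worker to finish, whereas the rate-aware heuristic can decode after only some of them return. The size of the gap depends entirely on the assumed rates and redundancy, so this sketch should not be read as reproducing the paper's reported gains.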