A New Design Framework for Heterogeneous Uncoded Storage Elastic Computing
Elasticity is one important feature in modern cloud computing systems and can result in computation failure or significantly increase computing time. Such elasticity means that virtual machines over the cloud can be preempted under a short notice (e.g., hours or minutes) if a high-priority job appears; on the other hand, new virtual machines may become available over time to compensate the computing resources. Coded Storage Elastic Computing (CSEC) introduced by Yang et al. in 2018 is an effective and efficient approach to overcome the elasticity and it costs relatively less storage and computation load. However, one of the limitations of the CSEC is that it may only be applied to certain types of computations (e.g., linear) and may be challenging to be applied to more involved computations because the coded data storage and approximation are often needed. Hence, it may be preferred to use uncoded storage by directly copying data into the virtual machines. In addition, based on our own measurement, virtual machines on Amazon EC2 clusters often have heterogeneous computation speed even if they have exactly the same configurations (e.g., CPU, RAM, I/O cost). In this paper, we introduce a new optimization framework on Uncoded Storage Elastic Computing (USEC) systems with heterogeneous computing speed to minimize the overall computation time. Under this framework, we propose optimal solutions of USEC systems with or without straggler tolerance using different storage placements. Our proposed algorithms are evaluated using power iteration applications on Amazon EC2.
READ FULL TEXT