Storage Codes with Flexible Number of Nodes
This paper presents flexible storage codes, a class of error-correcting codes that can recover information from a flexible number of storage nodes. As a result, one can make better use of the available storage nodes in the presence of unpredictable node failures and reduce data access latency. Assume a storage system encodes kℓ information symbols over a finite field 𝔽 into n nodes, each of size ℓ symbols. The code is parameterized by a set of tuples {(R_j, k_j, ℓ_j) : 1 ≤ j ≤ a} satisfying k_1ℓ_1 = k_2ℓ_2 = ... = k_aℓ_a, k_1 > k_2 > ... > k_a = k, and ℓ_a = ℓ, such that the information symbols can be reconstructed from any R_j nodes by accessing ℓ_j symbols in each node. In other words, the code allows a flexible number of nodes for decoding, accommodating the variance in the data access time of the nodes. Code constructions are presented for different storage scenarios, including LRC (locally recoverable) codes, PMDS (partial MDS) codes, and MSR (minimum storage regenerating) codes. We analyze the latency of accessing information and perform simulations on Amazon clusters to show the efficiency of the presented codes.
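The parameter constraints stated in the abstract can be checked mechanically. The following is a minimal sketch, not the paper's construction: the function name `check_flexible_params`, the bound R_j ≤ n, and the example values (including the choice R_j = k_j, which corresponds to an MDS-like setting) are assumptions made purely for illustration.

```python
from typing import List, Tuple


def check_flexible_params(
    params: List[Tuple[int, int, int]], n: int, ell: int
) -> int:
    """Validate a list of (R_j, k_j, ell_j) tuples, ordered j = 1..a.

    Returns the common product k_j * ell_j, i.e. the number of
    information symbols k * ell. Hypothetical helper for illustration.
    """
    if not params:
        raise ValueError("at least one tuple is required")

    # k_1 * ell_1 = k_2 * ell_2 = ... = k_a * ell_a
    products = {k_j * ell_j for _, k_j, ell_j in params}
    if len(products) != 1:
        raise ValueError("k_j * ell_j must be the same for every j")

    # k_1 > k_2 > ... > k_a
    ks = [k_j for _, k_j, _ in params]
    if any(ks[i] <= ks[i + 1] for i in range(len(ks) - 1)):
        raise ValueError("k_j must be strictly decreasing in j")

    # ell_a = ell, the full node size
    if params[-1][2] != ell:
        raise ValueError("the last tuple must use ell_a = ell")

    # Sanity check (assumed): decoding cannot use more than n nodes.
    if any(r_j > n for r_j, _, _ in params):
        raise ValueError("R_j cannot exceed the number of nodes n")

    return products.pop()


# Illustrative values only: n = 5 nodes of size ell = 6, with 12
# information symbols recoverable from 4, 3, or 2 nodes.
example = [(4, 4, 3), (3, 3, 4), (2, 2, 6)]
num_info_symbols = check_flexible_params(example, n=5, ell=6)
```

In this example, a reader that hears back from four nodes needs only 3 symbols from each, while a reader that reaches only two nodes must read all 6 symbols from each; in every case 12 symbols are accessed in total.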