Run Compressed Rank/Select for Large Alphabets
Given a string of length n that is composed of r runs of letters from the alphabet {0,1,...,σ-1} such that 2 <σ< r, we describe a data structure that, provided r < n / ^ω(1) n, stores the string in rnσ/r + o(rnσ/r) bits and supports select and access queries in O((n/r)/ n) time and rank queries in O((nσ/r)/ n) time. We show that rn(σ-1)/r bits are necessary for any such data structure and, thus, our solution is succinct. We also describe a data structure that uses (1 + ϵ)rnσ/r + O(r) bits, where ϵ > 0 is an arbitrary constant, with the same query times but without the restriction r < n / ^ω(1) n. By simple reductions to the colored predecessor problem, we show that the query times are optimal in the important case r > 2^^δ n, for an arbitrary constant δ > 0. We implement our solution and compare it with the state of the art, showing that the closest competitors consume 31-46
READ FULL TEXT 
  
  
     share
 share