Re-Pair In-Place

08/14/2019
by   Dominik Köppl, et al.
0

Re-Pair is a grammar compression scheme with favorably good compression rates. The computation of Re-Pair comes with the cost of maintaining large frequency tables, which makes it hard to compute Re-Pair on large scale data sets. As a solution for this problem we present, given a text of length n whose characters are drawn from an integer alphabet, an O(n^2) time algorithm computing Re-Pair in n τ bits of space including the text space, where τ is the maximum of n and the number of terminals and non-terminals. The algorithm works in the restore model, supporting the recovery of the original input in O(n^2) time with O( n) additional bits of working space. We give variants of our solution working in parallel or in the external memory model.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset