Approximate Trace Reconstruction from a Single Trace
The well-known trace reconstruction problem is the problem of inferring an unknown source string x ∈{0,1}^n from independent "traces", i.e. copies of x that have been corrupted by a δ-deletion channel which independently deletes each bit of x with probability δ and concatenates the surviving bits. The current paper considers the extreme data-limited regime in which only a single trace is provided to the reconstruction algorithm. In this setting exact reconstruction is of course impossible, and the question is to what accuracy the source string x can be approximately reconstructed. We give a detailed study of this question, providing algorithms and lower bounds for the high, intermediate, and low deletion rate regimes in both the worst-case (x is arbitrary) and average-case (x is drawn uniformly from {0,1}^n) models. In several cases the lower bounds we establish are matched by computationally efficient algorithms that we provide. We highlight our results for the high deletion rate regime: roughly speaking, they show that - Having access to a single trace is already quite useful for worst-case trace reconstruction: an efficient algorithm can perform much more accurate reconstruction, given one trace that is even only a few bits long, than it could given no traces at all. But in contrast, - in the average-case setting, having access to a single trace is provably not very useful: no algorithm, computationally efficient or otherwise, can achieve significantly higher accuracy given one trace that is o(n) bits long than it could with no traces.
READ FULL TEXT