Codes with Biochemical Constraints and Single Error Correction for DNA-Based Data Storage
In DNA-based data storage, DNA codes that combine biochemical constraints with error correction are designed to ensure data reliability. Secondary structure avoidance (SSA) in single-stranded DNA sequences prevents undesirable secondary structures, which may cause chemical inactivity. Homopolymer run-length and GC-balance constraints also help to reduce the error probability of DNA sequences during synthesis and sequencing. In this letter, based on a recent work <cit.>, we construct DNA codes that are free of secondary structures of stem length ≥ m and have homopolymer run-length ≤ ℓ, for odd m ≤ 11 and ℓ ≥ 3, with rate 1 + log_2 ρ_m - 3/(2^{ℓ-1} + ℓ + 1), where ρ_m is given in Table <ref>. In particular, when m = 3 and ℓ = 4, the rate tends to 1.3206 bits/nt, improving on a previous result by Benerjee et al. We also construct DNA codes that satisfy all three of the above constraints and correct a single error. Finally, codes with a GC-locally balanced constraint are presented.
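To make the three constraints concrete, the following is a minimal Python sketch of checkers for them; it is not the construction from the letter. It assumes the common combinatorial reading of SSA, in which a sequence contains a stem of length m when two non-overlapping length-m substrings are reverse complements of each other. All function names are hypothetical and chosen for illustration.

```python
from itertools import groupby

# Watson-Crick complement of each nucleotide.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def max_homopolymer_run(seq):
    """Length of the longest run of identical nucleotides (0 if seq is empty)."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)


def gc_content(seq):
    """Fraction of G and C nucleotides in a non-empty sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq):
    """Reverse the sequence and complement every nucleotide."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def has_stem(seq, m):
    """True if two non-overlapping length-m substrings are reverse complements.

    Assumed definition of a secondary structure with stem length m;
    a sequence avoids stems of length >= m exactly when has_stem(seq, m)
    is False, since any longer stem contains a stem of length m.
    """
    for i in range(len(seq) - 2 * m + 1):
        target = reverse_complement(seq[i:i + m])
        # Search only from index i + m onward so the two substrings do not overlap.
        if seq.find(target, i + m) != -1:
            return True
    return False
```

For example, `has_stem("AACGTTTCGTT", 4)` is true because the prefix `AACG` and the suffix `CGTT` are reverse complements, while a codeword of the kind described above would need `has_stem(seq, m)` to be false and `max_homopolymer_run(seq) <= ℓ`.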