The Taub Faculty of Computer Science Events and Talks
Daniella Bar-Lev (CS, Technion)
Sunday, 27.11.2022, 15:45
New families of reconstruction codes motivated by DNA data storage and sequencing applications will be discussed. In such applications, DNA strands are sequenced by reading some subset of their substrings. The discussion will start with the extreme case of the torn-paper channel in which substrings are read with no overlap. Our model extends the previously researched probabilistic setting to the worst-case. We will construct asymptotically optimal codes, with efficient encoding and decoding algorithms, for any parameters of the torn-paper channel, for which a non vanishing asymptotic rate is possible. In contrast to no overlaps between the read substrings, we consider the other extreme case in which all substrings of a few pre-defined lengths are read. Previously researched models are extended by studying the setup where multiple strings are reconstructed together. Two constructions of such multi-strand reconstruction codes, with rates which approach 1, will be presented. Finally, an extension for the model that considers the setup, in which consecutive substrings are read with some given minimum overlap, will be discussed. An upper bound on the attainable rates of codes that guarantee unique reconstruction will be given. Efficient constructions of codes that asymptotically meet this bound will be presented.