TY - GEN

T1 - Pattern-matching for strings with short descriptions

AU - Karpinski, Marek

AU - Rytter, Wojciech

AU - Shinohara, Ayumi

N1 - Publisher Copyright: © Springer-Verlag Berlin Heidelberg 1995.

PY - 1995

Y1 - 1995

N2 - We consider strings which are succinctly described. The description is in terms of straight-line programs in which the constants are symbols and the only operation is concatenation. Such descriptions correspond to systems of recurrences or to context-free grammars generating single words. The descriptive size of a string is the length n of a straight-line program (or the size of a grammar) which defines this string. Usually the strings of descriptive size n are of exponential length. Fibonacci and Thue-Morse words are examples of such strings. We show that for a pattern P and text T of descriptive sizes m, n, an occurrence of P in T can be found (if there is any) in time polynomial with respect to n. This is nontrivial, since the actual lengths of P and T could be exponential, and none of the known string-matching algorithms is directly applicable. Our first tool is the periodicity lemma, which allows us to represent some sets of exponentially many positions in terms of feasibly many arithmetic progressions. The second tool is arithmetic: a simple application of Euclid's algorithm. Hence a textual problem for exponentially long strings is reduced here to simple arithmetic on integers with (only) linearly many bits. We also present an NP-complete version of pattern-matching for shortly described strings.

UR - http://www.scopus.com/inward/record.url?scp=84957879924&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84957879924&partnerID=8YFLogxK

U2 - 10.1007/3-540-60044-2_44

DO - 10.1007/3-540-60044-2_44

M3 - Conference contribution

AN - SCOPUS:84957879924

SN - 3540600442

SN - 9783540600442

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 205

EP - 214

BT - Combinatorial Pattern Matching - 6th Annual Symposium, CPM 1995, Proceedings

A2 - Galil, Zvi

A2 - Ukkonen, Esko

PB - Springer Verlag

T2 - 6th Annual Symposium on Combinatorial Pattern Matching, CPM 1995

Y2 - 5 July 1995 through 7 July 1995

ER -