TY - GEN

T1 - Fast full permuted pattern matching algorithms on multi-track strings

AU - Diptarama

AU - Yoshinaka, Ryo

AU - Shinohara, Ayumi

N1 - Funding Information:
This work is supported by Tohoku University Division for Interdisciplinary Advance Research and Education, JSPS KAKENHI Grant Numbers JP15H05706, JP24106010, and ImPACT Program of Council for Science, Technology and Innovation (Cabinet Office, Government of Japan).
Publisher Copyright:
© Czech Technical University in Prague.

PY - 2016

Y1 - 2016

N2 - A multi-track string is a tuple of strings of the same length. The full permuted pattern matching problem is, given two multi-track strings T = (t_1, t_2, ..., t_N) and P = (p_1, p_2, ..., p_N) such that |p_1| = ... = |p_N| ≤ |t_1| = ... = |t_N|, to find all positions i such that P = (t_{r_1}[i : i+m-1], ..., t_{r_N}[i : i+m-1]) for some permutation (r_1, ..., r_N) of (1, ..., N), where m = |p_1| and t[i : j] denotes the substring of t from position i to j. We propose new algorithms that perform full permuted pattern matching fast in practice. The first and second algorithms are based on the Boyer-Moore algorithm and the Horspool algorithm, respectively. The third algorithm is based on the Aho-Corasick algorithm, where we use a multi-track character instead of a single character in the so-called goto function. The fourth algorithm is an improvement of the multi-track Knuth-Morris-Pratt algorithm that uses an automaton instead of the failure function of the original algorithm. Our experimental results demonstrate that these algorithms perform permuted pattern matching faster than existing algorithms.

AB - A multi-track string is a tuple of strings of the same length. The full permuted pattern matching problem is, given two multi-track strings T = (t_1, t_2, ..., t_N) and P = (p_1, p_2, ..., p_N) such that |p_1| = ... = |p_N| ≤ |t_1| = ... = |t_N|, to find all positions i such that P = (t_{r_1}[i : i+m-1], ..., t_{r_N}[i : i+m-1]) for some permutation (r_1, ..., r_N) of (1, ..., N), where m = |p_1| and t[i : j] denotes the substring of t from position i to j. We propose new algorithms that perform full permuted pattern matching fast in practice. The first and second algorithms are based on the Boyer-Moore algorithm and the Horspool algorithm, respectively. The third algorithm is based on the Aho-Corasick algorithm, where we use a multi-track character instead of a single character in the so-called goto function. The fourth algorithm is an improvement of the multi-track Knuth-Morris-Pratt algorithm that uses an automaton instead of the failure function of the original algorithm. Our experimental results demonstrate that these algorithms perform permuted pattern matching faster than existing algorithms.

KW - AC-automaton

KW - Boyer-Moore algorithm

KW - Horspool algorithm

KW - Multi-track string

KW - Permuted pattern matching

UR - http://www.scopus.com/inward/record.url?scp=85064207536&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85064207536&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85064207536

T3 - Proceedings of the Prague Stringology Conference, PSC 2016

SP - 7

EP - 21

BT - Proceedings of the Prague Stringology Conference, PSC 2016

A2 - Holub, Jan

A2 - Zdarek, Jan

PB - Prague Stringology Club

T2 - 20th Prague Stringology Conference, PSC 2016

Y2 - 29 August 2016 through 31 August 2016

ER -