TY - GEN

T1 - Generalized dictionary matching under substring consistent equivalence relations

AU - Hendrian, Diptarama

PY - 2020/1/1

Y1 - 2020/1/1

N2 - Given a set of patterns called a dictionary and a text, the dictionary matching problem is the task of finding all occurrence positions of all patterns in the text. The dictionary matching problem can be solved efficiently by using the Aho-Corasick algorithm. Recently, Matsuoka et al. [TCS, 2016] proposed a generalization of the pattern matching problem under substring consistent equivalence relations and presented a generalization of the Knuth-Morris-Pratt algorithm to solve this problem. An equivalence relation ≈ is a substring consistent equivalence relation (SCER) if for two strings X, Y, X ≈ Y implies |X| = |Y| and X[i: j] ≈ Y[i: j] for all 1 ≤ i ≤ j ≤ |X|. In this paper, we propose a generalization of the dictionary matching problem and present a generalization of the Aho-Corasick algorithm for dictionary matching under SCER. We present an algorithm that constructs SCER automata and an algorithm that performs dictionary matching under SCER by using the automata. Moreover, we show the time and space complexity of our algorithms with respect to the size of input strings.

KW - Aho-Corasick algorithm

KW - Dictionary matching

KW - Substring consistent equivalence relation

UR - http://www.scopus.com/inward/record.url?scp=85080970564&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85080970564&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-39881-1_11

DO - 10.1007/978-3-030-39881-1_11

M3 - Conference contribution

AN - SCOPUS:85080970564

SN - 9783030398804

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 120

EP - 132

BT - WALCOM

A2 - Rahman, M. Sohel

A2 - Sadakane, Kunihiko

A2 - Sung, Wing-Kin

PB - Springer

T2 - 14th International Conference and Workshops on Algorithms and Computation, WALCOM 2020

Y2 - 31 March 2020 through 2 April 2020

ER -