Fast and Linear-Time String Matching Algorithms Based on the Distances of q-Gram Occurrences

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Given a text T of length n and a pattern P of length m, the string matching problem is the task of finding all occurrences of P in T. In this study, we propose an algorithm that solves this problem in O((n + m)q) time by considering the distance between two adjacent occurrences of the same q-gram contained in P. We also propose a theoretical improvement of it that runs in O(n + m) time, though it is not necessarily faster in practice. We compare the execution times of our algorithms and existing ones on various kinds of real and artificial datasets, such as an English text, a genome sequence, and a Fibonacci string. The experimental results show that our algorithm is as fast as the state-of-the-art algorithms in many cases, particularly when a pattern frequently appears in a text.

2012 ACM Subject Classification: Theory of computation → Pattern matching.
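To make the terms in the abstract concrete, the following is a minimal sketch and not the authors' algorithm. It shows the problem statement as a naive baseline, the quantity the abstract refers to (gaps between adjacent occurrences of the same q-gram in P), and one common convention for Fibonacci strings. The function names `find_occurrences`, `qgram_distances`, and `fibonacci_string`, as well as the convention F(1) = "b", F(2) = "a", are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def find_occurrences(text: str, pattern: str) -> List[int]:
    """Naive baseline: every start index i where text[i:i+m] equals pattern."""
    n, m = len(text), len(pattern)
    return [i for i in range(n - m + 1) if text[i:i + m] == pattern]


def qgram_distances(pattern: str, q: int) -> Dict[str, List[int]]:
    """For each q-gram of the pattern, the gaps between adjacent occurrences."""
    positions = defaultdict(list)
    for i in range(len(pattern) - q + 1):
        positions[pattern[i:i + q]].append(i)
    return {
        gram: [b - a for a, b in zip(pos, pos[1:])]
        for gram, pos in positions.items()
    }


def fibonacci_string(k: int) -> str:
    """Assumed convention: F(1) = "b", F(2) = "a", F(k) = F(k-1) + F(k-2)."""
    prev, cur = "b", "a"
    if k == 1:
        return prev
    for _ in range(k - 2):
        prev, cur = cur, cur + prev
    return cur


if __name__ == "__main__":
    text = fibonacci_string(6)
    print(text)
    print(find_occurrences(text, "aba"))
    print(qgram_distances(text, 2))
```

The naive baseline runs in O(nm) time in the worst case; it only serves as a reference point for what "all occurrences of P in T" means. Highly repetitive inputs such as Fibonacci strings are a case where a pattern appears frequently in a text.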

Original language: English
Title of host publication: 18th International Symposium on Experimental Algorithms, SEA 2020
Editors: Simone Faro, Domenico Cantone
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
ISBN (Electronic): 9783959771481
Publication status: Published - 2020 Jun 1
Event: 18th International Symposium on Experimental Algorithms, SEA 2020 - Catania, Italy
Duration: 2020 Jun 16 → 2020 Jun 18

Publication series

Name: Leibniz International Proceedings in Informatics, LIPIcs
Volume: 160
ISSN (Print): 1868-8969

Conference

Conference: 18th International Symposium on Experimental Algorithms, SEA 2020
Country: Italy
City: Catania
Period: 20/6/16 → 20/6/18

Keywords

  • String matching algorithm
  • Text processing

ASJC Scopus subject areas

  • Software
