Truncated DAWGs and their application to minimal absent word problem

Yuta Fujishige, Takuya Takagi, Diptarama Hendrian

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The directed acyclic word graph (DAWG) of a string y of length n is the smallest (partial) DFA that recognizes all suffixes of y, and it has O(n) nodes and edges. Na et al. [11] proposed the k-truncated suffix tree, a compressed trie that represents the substrings of a string whose length is at most k. In this paper, we present a new data structure called the k-truncated DAWG, which can be obtained by pruning the DAWG. We show that the size complexity of the k-truncated DAWG of a string y of length n is O(min{n, kz}), which equals that of the truncated suffix tree, where z is the size of the LZ77 factorization of y. We also present an O(n log σ) time and O(min{n, kz}) space algorithm for constructing the k-truncated DAWG of y, where σ is the alphabet size. As an application of truncated DAWGs, we show that the set MAW_k(y) of all minimal absent words of y whose length is at most k can be computed using the k-truncated DAWG of y in O(min{n, kz} + |MAW_k(y)|) time and O(min{n, kz}) working space.
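
To make the object MAW_k(y) concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm of the paper and does not build a DAWG; it only applies the usual definition, under which a word w is a minimal absent word of y if w does not occur in y while every proper substring of w does. The function names and the optional `alphabet` parameter are assumptions made for illustration, and the running time is far from the bounds stated in the abstract.

```python
def substrings_up_to(y, k):
    """Return the set of all non-empty substrings of y with length at most k."""
    n = len(y)
    return {y[i:j] for i in range(n) for j in range(i + 1, min(i + k, n) + 1)}


def maw_k_bruteforce(y, k, alphabet=None):
    """Return the minimal absent words of y with length at most k.

    A word w is reported when w does not occur in y, but both
    w[:-1] (its longest proper prefix) and w[1:] (its longest proper
    suffix) do occur. The empty word is treated as always occurring,
    so a letter of the alphabet that is missing from y counts as a
    minimal absent word of length 1.
    """
    sigma = sorted(set(y) if alphabet is None else set(alphabet))
    occurring = substrings_up_to(y, k)
    occurring.add("")
    result = set()
    for x in occurring:
        if len(x) >= k:
            continue  # extending x by one letter would exceed length k
        for b in sigma:
            w = x + b  # w[:-1] == x occurs by construction
            if w not in occurring and w[1:] in occurring:
                result.add(w)
    return result


if __name__ == "__main__":
    print(sorted(maw_k_bruteforce("abaab", 3, alphabet="ab")))
```

For y = abaab over the alphabet {a, b}, this reports aaa, bab and bb for k = 3. The sketch enumerates every substring of length at most k explicitly, which is the cost that a compact index such as the k-truncated DAWG is meant to avoid.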

Original language: English
Title of host publication: String Processing and Information Retrieval - 25th International Symposium, SPIRE 2018, Proceedings
Editors: Travis Gagie, Alistair Moffat, Gonzalo Navarro, Ernesto Cuadros-Vargas
Publisher: Springer Verlag
Pages: 139-152
Number of pages: 14
ISBN (Print): 9783030004781
DOI: https://doi.org/10.1007/978-3-030-00479-8_12
Publication status: Published - 2018 Jan 1
Event: 25th International Symposium on String Processing and Information Retrieval, SPIRE 2018 - Lima, Peru
Duration: 2018 Oct 9 to 2018 Oct 11

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11147 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 25th International Symposium on String Processing and Information Retrieval, SPIRE 2018
Country: Peru
City: Lima
Period: 18/10/9 to 18/10/11

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

    Fujishige, Y., Takagi, T., & Hendrian, D. (2018). Truncated DAWGs and their application to minimal absent word problem. In T. Gagie, A. Moffat, G. Navarro, & E. Cuadros-Vargas (Eds.), String Processing and Information Retrieval - 25th International Symposium, SPIRE 2018, Proceedings (pp. 139-152). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11147 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-00479-8_12