Linear-Time text compression by longest-first substitution

Ryosuke Nakamura, Shunsuke Inenaga, Hideo Bannai, Takashi Funamoto, Masayuki Takeda, Ayumi Shinohara

Research output: Contribution to journalArticlepeer-review

15 Citations (Scopus)

Abstract

We consider grammar-based text compression with longest first substitution (LFS), where non-overlapping occurrences of a longest repeating factor of the input text are replaced by a new non-terminal symbol. We present the first linear-time algorithm for LFS. Our algorithm employs a new data structure called sparse lazy suffix trees. We also deal with a more sophisticated version of LFS, called LFS2, that allows better compression. The first linear-time algorithm for LFS2 is also presented.

Original languageEnglish
Pages (from-to)1429-1448
Number of pages20
JournalAlgorithms
Volume2
Issue number4
DOIs
Publication statusPublished - 2009 Dec

Keywords

  • Grammar-based text compression
  • Linear-time algorithms
  • Suffix trees

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Numerical Analysis
  • Computational Theory and Mathematics
  • Computational Mathematics

Fingerprint Dive into the research topics of 'Linear-Time text compression by longest-first substitution'. Together they form a unique fingerprint.

Cite this