Linear-time off-line text compression by longest-first substitution

Shunsuke Inenaga, Takashi Funamoto, Masayuki Takeda, Ayumi Shinohara

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)

Abstract

Given a text, grammar-based compression constructs a grammar that generates the text. There are many text compression techniques of this type. Each compression scheme is categorized as either off-line or on-line, according to how the text is processed. One representative tactic for off-line compression is to substitute the longest repeated factors of a text with a production rule. In this paper, we present an algorithm that compresses a text based on this longest-first principle in linear time. The algorithm employs a suitable index structure for the text and involves technically efficient operations on that structure.
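To make the longest-first principle concrete, the following is a minimal, naive sketch. It is not the linear-time algorithm of the paper, which relies on an index structure that the abstract does not detail. The function names (`longest_nonoverlapping_repeat`, `replace_all`, `compress`, `expand`) are hypothetical, and the sketch assumes that a factor qualifies only if it has length at least 2 and occurs at least twice without overlap, and that substitution is applied only to the working sequence, not to the right-hand sides of earlier rules.

```python
def longest_nonoverlapping_repeat(seq):
    """Return the longest factor (as a tuple) that occurs at least twice
    without overlap, or None if no such factor of length >= 2 exists."""
    n = len(seq)
    for length in range(n // 2, 1, -1):  # try the longest candidates first
        first_seen = {}
        for i in range(n - length + 1):
            factor = tuple(seq[i:i + length])
            j = first_seen.setdefault(factor, i)  # earliest occurrence
            if i - j >= length:  # the two occurrences do not overlap
                return factor
    return None


def replace_all(seq, factor, symbol):
    """Replace occurrences of factor with symbol, greedily left to right."""
    out, i, k = [], 0, len(factor)
    while i < len(seq):
        if tuple(seq[i:i + k]) == factor:
            out.append(symbol)
            i += k
        else:
            out.append(seq[i])
            i += 1
    return out


def compress(text):
    """Return (start sequence, rules). Terminals are one-character strings;
    nonterminals are integers (an assumption of this sketch)."""
    seq, rules = list(text), {}
    while True:
        factor = longest_nonoverlapping_repeat(seq)
        if factor is None:
            return seq, rules
        symbol = len(rules)  # fresh nonterminal
        rules[symbol] = list(factor)
        seq = replace_all(seq, factor, symbol)


def expand(seq, rules):
    """Derive the original text back from the grammar."""
    return "".join(
        expand(rules[s], rules) if isinstance(s, int) else s for s in seq
    )
```

For example, `compress("abcabcabc")` yields the start sequence `[0, 0, 0]` with the single rule `0 -> a b c`, and `expand` recovers the original text. Each round of this sketch rescans the whole sequence, so it runs in polynomial rather than linear time; it only illustrates the substitution order.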

Original language: English
Pages (from-to): 137-152
Number of pages: 16
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2857
Publication status: Published - 2003 Dec 1
Externally published: Yes

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

