Multiple pattern matching in LZW compressed text

Takuya Kida, Masayuki Takeda, Ayumi Shinohara, Masamichi Miyazaki, Setsuo Arikawa

Research output: Contribution to journal › Conference article

47 Citations (Scopus)

Abstract

In this paper we address the problem of searching in LZW compressed text directly, and present a new algorithm for finding multiple patterns by simulating the moves of the Aho-Corasick pattern matching machine. The new algorithm finds all occurrences of multiple patterns, whereas the algorithm proposed by Amir, Benson, and Farach finds only the first occurrence of a single pattern. The new algorithm runs in O(n + m^2 + r) time using O(n + m^2) space, where n is the length of the compressed text, m is the total length of the patterns, and r is the number of occurrences of the patterns. We implemented a simple version of the algorithm, and showed that it is approximately twice as fast as decompression followed by a search using the Aho-Corasick machine.
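For orientation, the baseline that the abstract compares against (decompress first, then run the Aho-Corasick machine over the plain text) can be sketched as follows. This is not the algorithm proposed in the paper, which avoids decompression. It is a minimal illustration that assumes textbook LZW over single-byte characters with an unbounded dictionary. All function names (`lzw_compress`, `lzw_decompress`, `build_aho_corasick`, `search_compressed`) are hypothetical and chosen for this example only.

```python
from collections import deque


def lzw_compress(text):
    """Textbook LZW (assumed variant): 256 initial single-character entries."""
    table = {chr(i): i for i in range(256)}
    codes, current = [], ""
    for ch in text:
        if current + ch in table:
            current += ch
        else:
            codes.append(table[current])
            table[current + ch] = len(table)  # new phrase gets the next free code
            current = ch
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Inverse of lzw_compress; rebuilds the dictionary while decoding."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        # A code not yet in the table refers to the phrase being defined right now.
        entry = table[code] if code in table else prev + prev[0]
        out.append(entry)
        table[len(table)] = prev + entry[0]
        prev = entry
    return "".join(out)


def build_aho_corasick(patterns):
    """Build goto, failure, and output functions of the Aho-Corasick machine."""
    goto, fail, output = [{}], [0], [[]]
    for p in patterns:
        state = 0
        for ch in p:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(p)
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            output[t] = output[t] + output[fail[t]]
    return goto, fail, output


def search_compressed(codes, patterns):
    """Baseline: decompress, then report (start_index, pattern) for every match."""
    goto, fail, output = build_aho_corasick(patterns)
    state, hits = 0, []
    for i, ch in enumerate(lzw_decompress(codes)):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for p in output[state]:
            hits.append((i - len(p) + 1, p))
    return hits
```

For example, `search_compressed(lzw_compress("abababcab"), ["ab", "bab", "c"])` reports every occurrence of all three patterns, including overlapping ones. The cost of this baseline grows with the length of the decompressed text, whereas the bounds stated in the abstract are expressed in terms of the compressed length n.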

Original language: English
Pages (from-to): 103-112
Number of pages: 10
Journal: Data Compression Conference Proceedings
Publication status: Published - 1998 Jan 1
Externally published: Yes
Event: Proceedings of the 1998 Data Compression Conference, DCC - Snowbird, UT, USA
Duration: 1998 Mar 30 → 1998 Apr 1

ASJC Scopus subject areas

  • Computer Networks and Communications


Cite this

Kida, T., Takeda, M., Shinohara, A., Miyazaki, M., & Arikawa, S. (1998). Multiple pattern matching in LZW compressed text. Data Compression Conference Proceedings, 103-112.