Translation of regular expression with lookahead into finite state automaton

Akimasa Morihata

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Most conventional implementations of regular expressions are based on backtracking. Such implementations are slow in the worst case, and thus, we would like to develop a better matching algorithm. However, it is nontrivial to provide an efficient matching algorithm that can deal with practical extensions including submatch addressing. This paper studies regular expressions with lookaheads and negative lookaheads, abbreviated to REwLA. First, we propose a transformation from a REwLA of size m to a deterministic finite automaton of O(2^{2m}) states. Next, we consider weighted regular expressions, which enable us to calculate submatch addressing. We propose a transformation from a weighted REwLA of size m to a weighted nondeterministic finite automaton of O(2^{2m}) states.
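To make the notion of a regular expression with lookahead concrete, the following is a minimal Python sketch of REwLA matching semantics. It is not the automaton construction proposed in the paper. It is a plain memoized matcher that computes, for each subexpression and start position, the set of positions where a match can end. All class and function names (`Chr`, `Look`, `matches`, and so on) are hypothetical, and the sketch assumes Perl-style lookahead, where a lookahead succeeds if its body matches some prefix of the remaining input; the paper may define lookahead differently.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Union

# Hypothetical AST for regular expressions with lookahead (REwLA).
@dataclass(frozen=True)
class Eps:              # matches the empty string
    pass

@dataclass(frozen=True)
class Chr:              # matches one literal character
    c: str

@dataclass(frozen=True)
class Cat:              # concatenation: left followed by right
    left: "Regex"
    right: "Regex"

@dataclass(frozen=True)
class Alt:              # alternation: left or right
    left: "Regex"
    right: "Regex"

@dataclass(frozen=True)
class Star:             # Kleene star: zero or more repetitions
    inner: "Regex"

@dataclass(frozen=True)
class Look:             # lookahead (?=inner), or (?!inner) if negative
    inner: "Regex"
    negative: bool = False

Regex = Union[Eps, Chr, Cat, Alt, Star, Look]

def matches(r: Regex, s: str) -> bool:
    """Return True if r matches the whole string s."""

    @lru_cache(maxsize=None)
    def ends(r: Regex, i: int) -> frozenset:
        # Set of positions j such that r matches s[i:j].
        if isinstance(r, Eps):
            return frozenset({i})
        if isinstance(r, Chr):
            ok = i < len(s) and s[i] == r.c
            return frozenset({i + 1}) if ok else frozenset()
        if isinstance(r, Cat):
            return frozenset(k for j in ends(r.left, i)
                               for k in ends(r.right, j))
        if isinstance(r, Alt):
            return ends(r.left, i) | ends(r.right, i)
        if isinstance(r, Star):
            # Closure over repeated matches of the inner expression.
            seen, frontier = {i}, [i]
            while frontier:
                j = frontier.pop()
                for k in ends(r.inner, j):
                    if k not in seen:
                        seen.add(k)
                        frontier.append(k)
            return frozenset(seen)
        if isinstance(r, Look):
            # A lookahead consumes no input: it only tests what follows.
            found = bool(ends(r.inner, i))
            return frozenset({i}) if found != r.negative else frozenset()
        raise TypeError(f"unknown node: {r!r}")

    return len(s) in ends(r, 0)

# (?!ab)(a|b)* : strings over {a, b} that do not start with "ab"
AB_STAR = Star(Alt(Chr("a"), Chr("b")))
NOT_AB = Cat(Look(Cat(Chr("a"), Chr("b")), negative=True), AB_STAR)

print(matches(NOT_AB, "ba"))   # True
print(matches(NOT_AB, "abb"))  # False
```

Because each (subexpression, position) pair is evaluated once, this sketch does not re-explore the same alternatives the way a backtracking matcher can. Unlike an automaton, however, it still needs random access to the whole input string.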

Original language: English
Pages (from-to): 147-158
Number of pages: 12
Journal: Computer Software
Volume: 29
Issue number: 1
Publication status: Published - 2012 Mar 7

ASJC Scopus subject areas

  • Software
