Revisiting state blow-up: Automatically building augmented-FA while preserving functional equivalence

Xiaodong Yu, Bill Lin, Michela Becchi

Research output: Contribution to journalArticlepeer-review

25 Scopus citations

Abstract

Regular expression matching, a central task in deep packet inspection and other networking applications, has been traditionally implemented through finite automata. Thanks to their limited per-character processing and memory bandwidth requirements, deterministic finite automata (DFA) are a natural choice for memory-based implementations. In the presence of large datasets of complex patterns, however, DFA suffer from the well-known state explosion problem. Specifically, state explosion can take place during DFA generation when the considered patterns contain bounded and unbounded repetitions of wildcards or large character sets. Several alternative FA representations have been proposed to address this problem. However, these proposals all suffer from one or more of the following problems: some can avoid state explosion only on datasets of limited size and complexity; some have prohibitive worst-case memory bandwidth requirements or processing time; and some can only guarantee functional equivalence for restricted classes of regular expressions and require the user to manually filter out unsupported patterns. In this work we propose JFA, a finite automation that uses state variables to avoid state explosion, and is functionally equivalent to the corresponding DFA. Functional equivalence is guaranteed by construction without requiring user intervention. We also provide optimization techniques to both limit the amount of state variables required and provide a lower bound for the JFA traversal time.

Original languageEnglish
Article number6901222
Pages (from-to)1822-1833
Number of pages12
JournalIEEE Journal on Selected Areas in Communications
Volume32
Issue number10
DOIs
StatePublished - 1 Oct 2014

Fingerprint

Dive into the research topics of 'Revisiting state blow-up: Automatically building augmented-FA while preserving functional equivalence'. Together they form a unique fingerprint.

Cite this