TY - JOUR
T1 - Revisiting state blow-up
T2 - Automatically building augmented-FA while preserving functional equivalence
AU - Yu, Xiaodong
AU - Lin, Bill
AU - Becchi, Michela
N1 - Publisher Copyright:
© 1983-2012 IEEE.
PY - 2014/10/1
Y1 - 2014/10/1
N2 - Regular expression matching, a central task in deep packet inspection and other networking applications, has been traditionally implemented through finite automata. Thanks to their limited per-character processing and memory bandwidth requirements, deterministic finite automata (DFA) are a natural choice for memory-based implementations. In the presence of large datasets of complex patterns, however, DFA suffer from the well-known state explosion problem. Specifically, state explosion can take place during DFA generation when the considered patterns contain bounded and unbounded repetitions of wildcards or large character sets. Several alternative FA representations have been proposed to address this problem. However, these proposals all suffer from one or more of the following problems: some can avoid state explosion only on datasets of limited size and complexity; some have prohibitive worst-case memory bandwidth requirements or processing time; and some can only guarantee functional equivalence for restricted classes of regular expressions and require the user to manually filter out unsupported patterns. In this work we propose JFA, a finite automation that uses state variables to avoid state explosion, and is functionally equivalent to the corresponding DFA. Functional equivalence is guaranteed by construction without requiring user intervention. We also provide optimization techniques to both limit the amount of state variables required and provide a lower bound for the JFA traversal time.
AB - Regular expression matching, a central task in deep packet inspection and other networking applications, has been traditionally implemented through finite automata. Thanks to their limited per-character processing and memory bandwidth requirements, deterministic finite automata (DFA) are a natural choice for memory-based implementations. In the presence of large datasets of complex patterns, however, DFA suffer from the well-known state explosion problem. Specifically, state explosion can take place during DFA generation when the considered patterns contain bounded and unbounded repetitions of wildcards or large character sets. Several alternative FA representations have been proposed to address this problem. However, these proposals all suffer from one or more of the following problems: some can avoid state explosion only on datasets of limited size and complexity; some have prohibitive worst-case memory bandwidth requirements or processing time; and some can only guarantee functional equivalence for restricted classes of regular expressions and require the user to manually filter out unsupported patterns. In this work we propose JFA, a finite automation that uses state variables to avoid state explosion, and is functionally equivalent to the corresponding DFA. Functional equivalence is guaranteed by construction without requiring user intervention. We also provide optimization techniques to both limit the amount of state variables required and provide a lower bound for the JFA traversal time.
UR - http://www.scopus.com/inward/record.url?scp=84919427781&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84919427781&partnerID=8YFLogxK
U2 - 10.1109/JSAC.2014.2358840
DO - 10.1109/JSAC.2014.2358840
M3 - Article
AN - SCOPUS:84919427781
SN - 0733-8716
VL - 32
SP - 1822
EP - 1833
JO - IEEE Journal on Selected Areas in Communications
JF - IEEE Journal on Selected Areas in Communications
IS - 10
M1 - 6901222
ER -