TY - GEN
T1 - Architectural support for efficient large-scale automata processing
AU - Liu, Hongyuan
AU - Ibrahim, Mohamed
AU - Kayiran, Onur
AU - Pai, Sreepathi
AU - Jog, Adwait
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/12/12
Y1 - 2018/12/12
AB - The Automata Processor (AP) accelerates applications from domains ranging from machine learning to genomics. However, as a spatial architecture, it is unable to handle larger automata programs without repeated reconfiguration and re-execution. To achieve high throughput, this paper proposes, for the first time, architectural support for AP to efficiently execute large-scale applications. We find that a large number of existing and new Non-deterministic Finite Automata (NFA) based applications have states that are never enabled but are still configured on the AP chips, leading to their underutilization. With the help of careful characterization and profiling-based mechanisms, we predict which states are never enabled and hence need not be configured on AP. Furthermore, we develop SparseAP, a new execution mode for AP to efficiently handle the mispredicted NFA states. Our detailed simulations across 26 applications from various domains show that our newly proposed execution model for AP can obtain 2.1x geometric mean speedup (up to 47x) over the baseline AP execution.
KW - Accelerators
KW - Automata
KW - Performance
UR - https://www.scopus.com/pages/publications/85060031937
UR - https://www.scopus.com/inward/citedby.url?scp=85060031937&partnerID=8YFLogxK
U2 - 10.1109/MICRO.2018.00078
DO - 10.1109/MICRO.2018.00078
M3 - Conference contribution
AN - SCOPUS:85060031937
T3 - Proceedings of the Annual International Symposium on Microarchitecture, MICRO
SP - 908
EP - 920
BT - Proceedings - 51st Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2018
T2 - 51st Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2018
Y2 - 20 October 2018 through 24 October 2018
ER -