TY - GEN
T1 - SoK
T2 - 42nd IEEE Symposium on Security and Privacy, SP 2021
AU - Pang, Chengbin
AU - Yu, Ruotong
AU - Chen, Yaohui
AU - Koskinen, Eric
AU - Portokalidis, Georgios
AU - Mao, Bing
AU - Xu, Jun
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/5
Y1 - 2021/5
N2 - Disassembly of binary code is hard, but necessary for improving the security of binary software. Over the past few decades, research in binary disassembly has produced many tools and frameworks, which have been made available to researchers and security professionals. These tools employ a variety of strategies that grant them different characteristics. The lack of systematization, however, impedes new research in the area and makes selecting the right tool hard, as we do not understand the strengths and weaknesses of existing tools. In this paper, we systematize binary disassembly through the study of nine popular, open-source tools. We couple the manual examination of their code bases with the most comprehensive experimental evaluation (thus far) using 3,788 binaries. Our study yields a comprehensive description and organization of strategies for disassembly, classifying them as either algorithms or heuristics. Meanwhile, we measure and report the impact of individual algorithms on the results of each tool. We find that while principled algorithms are used by all tools, they still heavily rely on heuristics to increase code coverage. Depending on the heuristics used, different coverage-vs-correctness trade-offs come into play, leading to tools with different strengths and weaknesses. We envision that these findings will help users pick the right tool and assist researchers in improving binary disassembly.
AB - Disassembly of binary code is hard, but necessary for improving the security of binary software. Over the past few decades, research in binary disassembly has produced many tools and frameworks, which have been made available to researchers and security professionals. These tools employ a variety of strategies that grant them different characteristics. The lack of systematization, however, impedes new research in the area and makes selecting the right tool hard, as we do not understand the strengths and weaknesses of existing tools. In this paper, we systematize binary disassembly through the study of nine popular, open-source tools. We couple the manual examination of their code bases with the most comprehensive experimental evaluation (thus far) using 3,788 binaries. Our study yields a comprehensive description and organization of strategies for disassembly, classifying them as either algorithms or heuristics. Meanwhile, we measure and report the impact of individual algorithms on the results of each tool. We find that while principled algorithms are used by all tools, they still heavily rely on heuristics to increase code coverage. Depending on the heuristics used, different coverage-vs-correctness trade-offs come into play, leading to tools with different strengths and weaknesses. We envision that these findings will help users pick the right tool and assist researchers in improving binary disassembly.
KW - Binary-disassembly
KW - Binary-security
KW - Knowledge-systematization
UR - http://www.scopus.com/inward/record.url?scp=85106724211&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85106724211&partnerID=8YFLogxK
U2 - 10.1109/SP40001.2021.00012
DO - 10.1109/SP40001.2021.00012
M3 - Conference contribution
AN - SCOPUS:85106724211
T3 - Proceedings - IEEE Symposium on Security and Privacy
SP - 833
EP - 851
BT - Proceedings - 2021 IEEE Symposium on Security and Privacy, SP 2021
Y2 - 24 May 2021 through 27 May 2021
ER -