Technical Report PHD-2020-04

Title: Similarity in Binary Executables
Authors: Yaniv David
Supervisors: Eran Yahav
PDF: Currently accessible only within the Technion network
Abstract: We address the problem of binary code search in stripped executables (with no debug information). The main challenge is establishing binary code similarity even when the binary code has been compiled using different compilers, optimization levels, or target architectures. Moreover, the source code being compiled might come from another version of the software package or from another implementation altogether. Overcoming this challenge while avoiding false positives is invaluable for guiding other, more costly tasks in the field of binary code analysis, such as reverse engineering or automated vulnerability detection.

We present an iterative process of analyzing and presenting the different parts of the binary similarity problem. At each step we further refine our similarity method. To this end we incorporate several representations for binary code, each created by statically analyzing the binary code to decompose it into smaller parts carrying semantic meaning. These representations are matched with different concepts and tools from other fields to create a measure of binary similarity between procedures. These fields include model theory, statistical frameworks, SMT solvers, and deep neural networks.

We tested the methods we developed in real-world scenarios by employing them to find vulnerabilities by search and to predict the names of binary procedures. We discovered 373 vulnerabilities affecting publicly available firmware, 147 of them in the latest available firmware version for the device, and successfully predicted procedure names, improving on the state of the art by 20% and improving by 84% over state-of-the-art neural models that do not use any static analysis.

Copyright: The above paper is copyrighted by the Technion, the author(s), or others. Please contact the author(s) for more information.

Remark: Any link to this technical report should be to this page, rather than to the URL of the PDF files directly. The latter URLs may change without notice.


Computer science department, Technion